RE: LeoThread 2024-11-03 06:11

Suppose we have a dataset lying on an S-shaped curve, such as a logistic (sigmoid) curve. As long as the inputs are distinct, it is always possible to fit a high-order polynomial that passes through every known point with zero error. At the other extreme, we can fit a straight line, which leaves a high error on those same points.

The first solution produces an overly complex model that fits the noise in the data as well as the underlying relationship. As a result, we can expect a high error for a new data point on the original S-shaped curve.

Conversely, the second model is far too simple to capture the relationship between the input and output. Hence, it will perform poorly on new data, too:
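
To make the two extremes concrete, here is a minimal sketch in Python with NumPy. Everything specific in it is an assumption for illustration and not from the original post: the logistic function as the S-shaped curve, the 10 training points, the noise level of 0.05, the random seed, and the helper names `logistic` and `fit_and_score`. It fits a straight line and a polynomial of degree n - 1 to the same noisy points, then measures the error on the training points and on fresh points taken from the noise-free curve.

```python
import numpy as np
from numpy.polynomial import Polynomial


def logistic(x):
    """S-shaped curve standing in for the true relationship (an assumption)."""
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical setup: 10 noisy training points, 200 noise-free test points.
rng = np.random.default_rng(0)
x_train = np.linspace(-6.0, 6.0, 10)
y_train = logistic(x_train) + rng.normal(0.0, 0.05, size=x_train.size)
x_test = np.linspace(-6.0, 6.0, 200)
y_test = logistic(x_test)


def fit_and_score(degree):
    """Least-squares polynomial fit; returns (train MSE, test MSE)."""
    model = Polynomial.fit(x_train, y_train, degree)
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return train_mse, test_mse


# Degree 1 is the straight line; degree n - 1 passes through all n points.
line_train, line_test = fit_and_score(1)
poly_train, poly_test = fit_and_score(x_train.size - 1)

print(f"straight line : train MSE = {line_train:.2e}, test MSE = {line_test:.2e}")
print(f"degree-9 poly : train MSE = {poly_train:.2e}, test MSE = {poly_test:.2e}")
```

The high-degree fit should report a training error that is zero up to rounding while its test error is not, and the line should show a clearly nonzero error on both sets. The exact figures depend on the noise level and seed chosen here, so treat them as indicative only.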