Curve Fitting Made Easy


The Industrial Physicist, April/May 2003 © American Institute of Physics

PRIMER by Marko Ledvij

Scientists and engineers often want to represent empirical data using a model based on mathematical equations. With the correct model and calculus, one can determine important characteristics of the data, such as the rate of change anywhere on the curve (first derivative), the local minimum and maximum points of the function (zeros of the first derivative), and the area under the curve (integral). The goal of data (or curve) fitting is to find the parameter values that most closely match the data. The models to which data are fitted depend on adjustable parameters.

Take the formula Y = A*exp(–X/X0)

in which X is the independent variable, Y is the dependent variable, and A and X0 are the parameters. In an experiment, one typically measures variable Y at a discrete set of values of variable X. Curve fitting provides two general uses for this formula:

• A person believes the formula describes the data and wants to know what parameter values of A and X0 correspond to the data.

• A person wants to know which of several formulas best describes the data.

To perform fitting, we define some function—which depends on the parameters—that measures the closeness between the data and the model. This function is then minimized with respect to the parameters. The parameter values that minimize the function are the best-fitting parameters. In some cases, the parameters are the coefficients of the terms of the model, for example, if the model is polynomial or Gaussian. In the simplest case, known as linear regression, a straight line is fitted to the data. In most scientific and engineering models, however, the dependent variable depends on the parameters in a nonlinear way.
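To make this concrete, here is a minimal sketch, in Python with NumPy and SciPy, of fitting the formula Y = A*exp(−X/X0); the model function, synthetic data, noise level, and starting values are our own illustrative choices:

    import numpy as np
    from scipy.optimize import curve_fit

    def model(x, A, x0):
        # The exponential model Y = A*exp(-X/X0)
        return A * np.exp(-x / x0)

    # Synthetic measurements: Y sampled at discrete X values, plus noise
    rng = np.random.default_rng(0)
    x_data = np.linspace(0, 10, 50)
    y_data = model(x_data, 2.5, 3.0) + rng.normal(0, 0.05, x_data.size)

    # curve_fit adjusts A and x0 to minimize the sum of squared deviations
    params, covariance = curve_fit(model, x_data, y_data, p0=(1.0, 1.0))
    print("best-fit A and X0:", params)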

Model selection

One needs a good understanding of the underlying physics, chemistry, electronics, and other properties of a problem to choose the right model. No graphing or analysis software can pick a model for the given data—it can only help in differentiating between models. Once a model is picked, one can do a rough assessment of it by plotting the data. At least some agreement is needed between the data and the model curve. If the model is supposed to represent exponential growth and the data are monotonically decreasing, the model is obviously wrong. Many commercial software programs perform linear and nonlinear regression analysis and curve fitting using the most common models. The better programs also allow users to specify their own fitting functions.

A typical example of exponential decay in physics is radioactive decay. The measurable rate of decay as a function of time—that is, the number of decays per unit of time—is given by

Λ(t) = Λ(0)e^(−λt)

where t is elapsed time and λ is the probability of decay of one nucleus during one time unit. By measuring the decay rate as a function of time and fitting the data to the exponential-decay law, one can infer the constant λ.

As another example, the following Gaussian function describes a normal distribution:

y(x) = [1/(σ√(2π))] exp(−(x − x0)²/(2σ²))

Here, x0 is the mean value and σ is the standard deviation of the distribution. Usually, several measurements are performed of a quantity x. The results are binned—assigned to discrete intervals called bins—as a function of x, such that y(x) represents the number of counts of the values in the bin centered around x. The mean value and standard deviation are then obtained by curve fitting.
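A hedged sketch of this binning-and-fitting procedure, using SciPy's curve_fit; the sample data are invented, and an amplitude parameter is added because raw bin counts are not normalized:

    import numpy as np
    from scipy.optimize import curve_fit

    def gaussian(x, amplitude, x0, sigma):
        # Unnormalized Gaussian; amplitude absorbs sample count and bin width
        return amplitude * np.exp(-(x - x0)**2 / (2 * sigma**2))

    rng = np.random.default_rng(1)
    samples = rng.normal(5.0, 1.5, 2000)            # repeated measurements of x
    counts, edges = np.histogram(samples, bins=40)  # bin the results
    centers = 0.5 * (edges[:-1] + edges[1:])        # bin-center x values

    p0 = (counts.max(), centers[np.argmax(counts)], samples.std())
    params, _ = curve_fit(gaussian, centers, counts, p0=p0)
    print("mean and standard deviation from the fit:", params[1], params[2])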

Sometimes, it is possible to linearize a nonlinear model. For example, consider the first equation. If we take the logarithm of both the left- and the right-hand sides, we get

log(Y) = log(A) − X/X0

Hence, log(Y) is a linear function of X. This means that one could try to fit the logarithm of the Y data to a linear function of the X data to find A and X0. This approach was widely used before the age of computers.

There are problems with it, however, including the fact that the transformed data usually do not satisfy the assumptions of linear regression. Linear regression assumes that the scatter of points around the line follows a Gaussian distribution and that the standard deviation is the same at every value of the independent variable.
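With those caveats in mind, a minimal sketch of the log-transform approach, using numpy.polyfit on noise-free synthetic data purely for illustration (log here is the natural logarithm):

    import numpy as np

    # Synthetic positive data following Y = A*exp(-X/X0); the log transform
    # requires Y > 0
    x_data = np.linspace(0.0, 10.0, 50)
    y_data = 2.5 * np.exp(-x_data / 3.0)

    # Fit log(Y) = log(A) - X/X0 as a straight line in X
    slope, intercept = np.polyfit(x_data, np.log(y_data), 1)
    A = np.exp(intercept)   # intercept = log(A)
    x0 = -1.0 / slope       # slope = -1/X0
    print(f"A = {A:.3f}, X0 = {x0:.3f}")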

To find the values of the model’s parameters that yield the curve closest to the data points, one must define a function that measures the closeness between the data and the model. This function depends on the method used to do the fitting, and it is typically chosen as the sum of the squares of the vertical deviations from each data point to the curve. (The deviations are first squared and then added up to avoid cancellations between positive and negative values.) This method assumes that the measurement errors are independent and normally distributed with constant standard deviation.

Linear regression

The simplest form of modeling is linear regression, which is the prediction of one variable from another when the relationship between them is assumed to be linear. One variable is considered to be independent and the other dependent.

A linear regression line has an equation of the form

y = a + bx

where x is the independent variable and y is the dependent variable. The slope of the line is b, and a is the intercept (the value of y when x = 0). The goal is to find the values of a and b that make the line closest to the data. The chi-square estimator in this case (where σi are the measurement errors; if they are not known, they should all be set to 1) is given by

χ²(a, b) = Σ_(i=1..N) [(yi − a − b·xi)/σi]²

χ² is a function of the parameters of the model; N is the total number of points. We need to find the values of a and b that minimize χ².



Typically, this means differentiating χ² with respect to the parameters a and b and setting the derivatives equal to 0. For a linear model, the minimum can be found and expressed in analytic form:

0 = ∂χ²/∂a = −2 Σ_(i=1..N) (yi − a − b·xi)/σi²

0 = ∂χ²/∂b = −2 Σ_(i=1..N) xi·(yi − a − b·xi)/σi²

The resulting system of two equations with two unknowns can be solved explicitly for a and b, producing a regression line like the one shown in Figure 1.
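A sketch of that closed-form solution with the weighted sums written out; the function name is ours, and the formulas are the standard ones (see, e.g., Bevington in Further reading):

    import numpy as np

    def weighted_linear_fit(x, y, sigma):
        # Closed-form solution of the two normal equations for a and b
        w = 1.0 / sigma**2                  # per-point weights
        S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
        Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
        delta = S * Sxx - Sx**2
        a = (Sxx * Sy - Sx * Sxy) / delta   # intercept
        b = (S * Sxy - Sx * Sy) / delta     # slope
        return a, b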

Nonlinear models

In most cases, the relationship between measured values and measurement variables is nonlinear. Nonlinear curve fitting also seeks to find those parameter values that minimize the deviations between the observed y values and the expected y values. In nonlinear models, however, one usually cannot solve the equation for the parameters. Various iterative procedures are used instead. Nonlinear fitting always starts with an initial guess of parameter values. One usually looks at the general shape of the model curve and compares it with the shape of the data points. For example, look at the sigmoidal function in Figure 2.

Y = Ab + (At − Ab)/(1 + exp[−(x − x0)/w])

This function has four parameters. Ab and At are two asymptotic values—values that the function approaches but never quite reaches—at small and large x, respectively. The curve crosses over between these two asymptotic values in a region of x values whose approximate width is w and which is centered around x0. For the initial estimate of the values of Ab and At, we can use the average of the lower third and the upper third of the Y values of the data points. The initial value of x0 is the average of the X values of the crossover area, and the initial value of w is the width of the region of points between the two asymptotic values.
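One possible encoding of these heuristics; the function name, the 30% threshold that delimits the crossover region, and the fallback values are our own assumptions, not the article's:

    import numpy as np

    def sigmoid_initial_guess(x, y):
        # Heuristic starting values (Ab, At, x0, w) as described above
        order = np.argsort(x)
        x, y = x[order], y[order]
        n = len(y)
        Ab = y[: n // 3].mean()       # average of the lower third (small x)
        At = y[-(n // 3):].mean()     # average of the upper third (large x)
        mid = 0.5 * (Ab + At)
        # Crossover region: points whose Y lies between the two asymptotes
        near_mid = np.abs(y - mid) < 0.3 * abs(At - Ab)
        xc = x[near_mid]
        x0 = xc.mean() if xc.size else x[n // 2]
        w = xc.max() - xc.min() if xc.size > 1 else (x[-1] - x[0]) / 10.0
        return Ab, At, x0, w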

A good understanding of the selected model is important because the initial guess of the parameter values must make sense in the real world. In addition, good fitting software will try to provide an initial guess for the functions built into it. The nonlinear fitting procedure is iterative. The most widely used iterative procedure for nonlinear curve fitting is the Levenberg–Marquardt algorithm, which good curve-fitting software implements. Starting from some initial values of parameters, the method tries to minimize

χ²(a, b) = Σ_(i=1..N) [(yi − f(xi; a, b))/σi]²

by successive small variations of the parameter values and reevaluation of χ² until a minimum is reached. The function f is the one to which we are trying to fit the data, and xi and yi are the X and Y values of the data points. Although the method involves sophisticated mathematics, end users of typical curve-fitting software usually only need to initialize the parameters and press a button to start the iterative fitting procedure.
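As an end-to-end illustration, here is a hedged sketch that fits the sigmoidal model of Figure 2 with SciPy's curve_fit, which defaults to the Levenberg–Marquardt algorithm for unbounded problems; the data and starting values are synthetic:

    import numpy as np
    from scipy.optimize import curve_fit

    def sigmoid(x, Ab, At, x0, w):
        # The sigmoidal model from Figure 2
        return Ab + (At - Ab) / (1.0 + np.exp(-(x - x0) / w))

    # Synthetic data standing in for real measurements
    rng = np.random.default_rng(2)
    x_data = np.linspace(0, 10, 80)
    y_data = sigmoid(x_data, 1.0, 9.0, 5.0, 0.8) \
        + rng.normal(0, 0.2, x_data.size)

    # Rough initial guesses, then iterative Levenberg-Marquardt minimization
    p0 = (y_data.min(), y_data.max(), x_data.mean(), 1.0)
    params, pcov = curve_fit(sigmoid, x_data, y_data, p0=p0)
    print("Ab, At, x0, w:", params)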

Weighting

Often, curve fitting is done without weighting. In the above formulas for χ², the weights σi are all set to 1. If experimental measurement errors are known, they should be used as the σi. They then provide an uneven weighting of the different experimental points: the smaller the measurement error for a particular experimental point, the larger the weight of that point in χ². Proper curve-fitting software allows users to supply measurement errors for use in the fitting process. If the experimental errors are not known exactly, it is sometimes reasonable to make an assumption. For example, if you believe that the deviation of the experimental data from the theoretical curve is proportional to the value of the data, then you can use

σi = yi

Here, the weight of each point, which is the inverse of the square of σi, is inversely proportional to the square of the value of the data point.
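A minimal sketch of such weighting with SciPy's curve_fit, assuming the exponential model from earlier and errors proportional to the data values (σi = yi):

    import numpy as np
    from scipy.optimize import curve_fit

    def model(x, A, x0):
        return A * np.exp(-x / x0)

    rng = np.random.default_rng(3)
    x_data = np.linspace(0.1, 10, 50)
    y_data = model(x_data, 2.5, 3.0) * (1 + rng.normal(0, 0.05, x_data.size))

    # sigma=y_data weights each point by 1/y_i**2 in the chi-square sum
    params, pcov = curve_fit(model, x_data, y_data, p0=(1.0, 1.0),
                             sigma=y_data, absolute_sigma=True)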


Figure 1. In linear regression, the correct values of a and b in the straight-line equation are those that minimize the value of χ².

Figure 2. In nonlinear curve fitting, such as this sigmoidal function, to minimize the deviations between the observed and expected y values, it is necessary to estimate parameter values and use iterative procedures.

(Figure 1 plots the data points and the fitted line, with a = 1.73247 ± 0.12486 and b = 0.78897 ± 0.02052. Figure 2 plots the sigmoidal curve Y against X, marking Ab, At, x0, and w.)


Interpreting results

In assessing modeling results, we must first see whether the program can fit the model to the data—that is, whether the iteration converged to a set of values for the parameters. If it did not, we probably made an error, such as choosing the wrong model or initial values. A look at the graphed data points and fitted curve can show whether the curve is close to the data. If it is not, then the fit has failed, perhaps because the iterative procedure did not find a minimum of χ² or the wrong model was chosen.

The parameter values must also make sense from the standpoint of the model. If, for example, one of the parameters is a temperature in kelvins and the value obtained from fitting is negative, we have obtained a physically meaningless result. This is true even if the curve seemingly fits the data well. In this example, we have likely chosen an incorrect model.

There is a quantitative value that describes how well the model fits the data. The value is defined as χ² divided by the number of degrees of freedom, and it is referred to as the standard deviation of the residuals, which are the vertical distances of the data points from the line. The smaller this value, the better the fit.
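A one-function sketch of this statistic; the function name is our own:

    import numpy as np

    def chi_square_per_dof(y, y_fit, sigma, n_params):
        # Chi-square divided by the degrees of freedom (N - n_params)
        chi2 = np.sum(((y - y_fit) / sigma) ** 2)
        return chi2 / (len(y) - n_params)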

An even better quantitative value is R², known as the goodness of fit. It is computed as the fraction of the total variation of the Y values of the data points that is attributable to the assumed model curve, and its values typically range from 0 to 1. Values close to 1 indicate a good fit.
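A corresponding sketch for R², computed as one minus the ratio of the residual variation to the total variation:

    import numpy as np

    def r_squared(y, y_fit):
        # Fraction of the total Y variation explained by the model curve
        ss_res = np.sum((y - y_fit) ** 2)        # residual sum of squares
        ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
        return 1.0 - ss_res / ss_tot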

To assess the certainty of the best-fit parameter values, we look at the standard errors of the parameters, which indirectly tell us how well the curve fits the data. If the standard error is small, a small variation in the parameter would produce a curve that fits the data much worse; therefore, we say that we know the value of that parameter well. If the standard error is large, a relatively large variation of that parameter would not spoil the fit much; therefore, we do not really know the parameter’s value well.
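With SciPy's curve_fit, for example, parameter standard errors can be estimated from the returned covariance matrix; this sketch reuses the exponential model with invented data:

    import numpy as np
    from scipy.optimize import curve_fit

    def model(x, A, x0):
        return A * np.exp(-x / x0)

    rng = np.random.default_rng(4)
    x_data = np.linspace(0, 10, 50)
    y_data = model(x_data, 2.5, 3.0) + rng.normal(0, 0.05, x_data.size)

    params, pcov = curve_fit(model, x_data, y_data, p0=(1.0, 1.0))
    std_errors = np.sqrt(np.diag(pcov))  # one standard error per parameter
    for name, value, err in zip(("A", "X0"), params, std_errors):
        print(f"{name} = {value:.4f} +/- {err:.4f}")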

Important intervals

Standard errors are related to another quantity, one usually obtained from the fitting and computed by curve-fitting software: the confidence interval. It says how good the obtained estimate of the value of the fitting function is at a particular value of X. The confidence interval corresponds to a desired level of statistical confidence, expressed as a percentage, usually 95% or 99%. For a particular value of X, we can claim with that level of confidence that the correct value of the fitting function lies within the obtained confidence interval. Good curve-fitting software computes the confidence interval and draws the confidence-interval lines on the graph with the fitted curve and the data points. The smaller the confidence interval, the more likely the obtained curve is close to the real curve.

Another useful quantity usually produced by curve-fitting software is the prediction interval. Whereas the confidence interval deals with the estimate of the fitting function, the prediction interval deals with the data. It is computed with respect to a desired confidence level p, whose values are usually chosen to be between 95% and 99%. The prediction interval is the interval of Y values for a given X value within which p% of all experimental points in a series of repeated measurements are expected to fall.

Figure 3. The residual plot (blue) shows the deviations of the data points (black) from the fitted curve (red) and indicates, by its nonrandomness, a possible hidden variable.


The narrower the interval, the better the predictive nature of the model used. The prediction interval is represented by two curves lying on opposite sides of the fitted curve.

The residuals plot provides insight into the quality of the fitted curve. Residuals are the deviations of the data points from the fitted curve. Perhaps the most important information obtained is whether there are any lurking (hidden) variables. In Figure 3, the data points are black squares, the fitted line obtained is red, and the blue line represents the residuals.

Normally, if the curve fits the data well, one expects the data-point deviations to be randomly distributed around the fitted curve. But in this graph, for the data points whose X values are less than about 22, all residuals are positive; that is, the data points are above the fitted curve. However, for data points with X values above 22, all residuals are negative and the data points fall below the fitted curve. This indicates nonrandom behavior of the residuals, which means that the behavior may be driven by a hidden variable.
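A sketch of such a residual plot with Matplotlib; the helper name and two-panel layout are our own, mimicking Figure 3:

    import matplotlib.pyplot as plt

    def residual_plot(x, y, y_fit):
        # Plot data, fitted curve, and residuals in the style of Figure 3
        residuals = y - y_fit
        fig, (top, bottom) = plt.subplots(2, 1, sharex=True)
        top.plot(x, y, "ks", label="Data")
        top.plot(x, y_fit, "r-", label="Fitted curve")
        top.legend()
        bottom.axhline(0.0, color="gray", linewidth=0.5)
        bottom.plot(x, residuals, "b-", label="Residual plot")
        bottom.set_xlabel("X")
        bottom.legend()
        plt.show()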

So curve fitting, although it cannot tellyou how, can certainly tell you when yourchosen model needs improvement.

Further reading

1. www.originlab.com (Origin).
2. http://mathworld.wolfram.com/LeastSquaresFitting.html
3. Press, W. H.; et al. Numerical Recipes in C, 2nd ed.; Cambridge University Press: New York, 1992; 994 pp.; provides a detailed description of the Levenberg–Marquardt algorithm.
4. Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences; McGraw-Hill: New York, 1969.
5. Draper, N. R.; Smith, H. Applied Regression Analysis, 3rd ed.; John Wiley & Sons, 1998.

Marko Ledvij is a senior software developer at OriginLab Corp. in Framingham, Massachusetts ([email protected]).
