Curve Fitting - Least-Squares Regression
Setting the partial derivative of $S_r$ with respect to each coefficient to zero gives
\[
\frac{\partial S_r}{\partial a_j} = -2\sum_i \big( y_i - a_0 - a_1 x_i - \dots - a_n x_i^{\,n} \big)\, x_i^{\,j} = 0, \qquad j = 0,1,\dots,n
\]
Finally, we get n+1 normal equations for the n+1 unknowns (which are the coefficients $a_0, a_1, \dots, a_n$) as follows:
\[
\sum_{k=0}^{n} \Big( \sum_i x_i^{\,j+k} \Big)\, a_k = \sum_i y_i\, x_i^{\,j}, \qquad j = 0,1,\dots,n
\]
where the sums over $i$ run over all data points $(x_i, y_i)$. When written in matrix form, $A\mathbf{a} = \mathbf{b}$, where $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_n]^T$ and A is an $(n+1)\times(n+1)$ symmetric matrix with entries $A_{jk} = \sum_i x_i^{\,j+k}$. The system has a unique solution provided that at least n+1 of the $x_i$ are distinct.
Note that the normal equations tend to be ill-conditioned, especially for higher-order polynomial regression. The computed coefficients then become sensitive to round-off error, which causes some inaccuracy. (Solve the example question in the textbook.)
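As a rough illustration of this procedure (not taken from the textbook), the following Matlab sketch builds and solves the normal equations for an assumed data set and polynomial order; the values of x, y and n below are placeholders.

% Minimal sketch: n-th order polynomial regression via the normal equations.
% x, y and n are assumed example values, not data from these notes.
x = [0 1 2 3 4]'; y = [1.2 1.9 3.1 3.8 5.2]'; n = 1;
A = zeros(n+1);      % A(j+1,k+1) = sum over i of x_i^(j+k)
b = zeros(n+1,1);    % b(j+1)     = sum over i of y_i*x_i^j
for j = 0:n
    for k = 0:n
        A(j+1,k+1) = sum(x.^(j+k));
    end
    b(j+1) = sum(y.*x.^j);
end
a = A\b              % coefficients a0, a1, ..., an (ascending powers of x)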
Example1: Consider the following data where y is the measured variable which depends on x. Fit a 2nd order polynomial $f(x) = a_0 + a_1 x + a_2 x^2$ to the data using least-squares regression.
x=0:5; y=[2.1 7.7 13.6 27.2 40.9 61.1];
Solution1: You can use Matlab's polyfit function. The polyfit function uses the backslash operator \ to solve the least-squares problem (see the section General Linear Least-Squares Approximation for Discrete Data later in this document).
>> p=polyfit(x,y,2)
p =
1.860714285714288
2.359285714285703
2.478571428571445
Therefore, $a_0$ = p(3), $a_1$ = p(2) and $a_2$ = p(1), since polyfit returns the coefficients in descending powers of x.
Alternatively, you can solve this problem using the Basic Fitting tool of Matlab. You just need to plot the data first by typing plot(x,y,'o'). A figure window then opens; in this window, select the Tools menu to access the Basic Fitting tool.
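If you would also like to plot the fitted polynomial on top of the data (not required by the example), one possible way is:

>> xx = (0:0.01:5)';      % fine grid over the data range
>> yy = polyval(p,xx);    % evaluate the fitted polynomial (p is in descending powers)
>> plot(x,y,'o',xx,yy), xlabel('x'), ylabel('y')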
Least-squares approximation can also be applied to a continuous function. Suppose we want to approximate a given function f(x) on an interval [a,b] by an n-th order polynomial
\[
P_n(x) = a_0 + a_1 x + \dots + a_n x^n
\]
The error to be minimised is
\[
E = \int_a^b \big( f(x) - P_n(x) \big)^2 \, dx
\]
and we set $\partial E/\partial a_j = 0$ for each $j = 0,1,\dots,n$. Since
\[
\frac{\partial E}{\partial a_j} = -2\int_a^b x^j f(x)\,dx + 2\sum_{k=0}^{n} a_k \int_a^b x^{\,j+k}\,dx
\]
setting this derivative to zero, the n+1 normal equations to solve for the n+1 coefficients become
\[
\sum_{k=0}^{n} a_k \int_a^b x^{\,j+k}\,dx = \int_a^b x^j f(x)\,dx, \qquad j = 0,1,\dots,n
\]
When written in matrix form, $A\mathbf{a} = \mathbf{b}$, where $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_n]^T$, $A_{jk} = \int_a^b x^{\,j+k}\,dx$ and $b_j = \int_a^b x^j f(x)\,dx$ for $j,k = 0,1,\dots,n$.
Example2: Use a 2nd order polynomial (n = 2) to approximate f(x) = sin(πx) on [0,1] in the least-squares sense.

Solution2: With [a,b] = [0,1], the entries of A are $A_{jk} = \int_0^1 x^{\,j+k}\,dx = 1/(j+k+1)$ for $j,k = 0,1,2$, so the normal equations become
\[
\begin{bmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \int_0^1 \sin(\pi x)\,dx \\ \int_0^1 x\sin(\pi x)\,dx \\ \int_0^1 x^2\sin(\pi x)\,dx \end{bmatrix}
=
\begin{bmatrix} 2/\pi \\ 1/\pi \\ (\pi^2-4)/\pi^3 \end{bmatrix}
\]
Solving this system gives $a_0 = -0.050465$, $a_1 = 4.12251$ and $a_2 = -4.12251$.
The matrix A is known as the Hilbert matrix and it is ill-conditioned, i.e. its condition number is considerably greater than 1.
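As a quick numerical check (assuming, as above, that f(x) = sin(πx) and n = 2), the following sketch reproduces the coefficients using Matlab's hilb and integral functions:

% Sketch: continuous least-squares coefficients for f(x) = sin(pi*x) on [0,1], n = 2.
f = @(x) sin(pi*x);
A = hilb(3);                                  % [1 1/2 1/3; 1/2 1/3 1/4; 1/3 1/4 1/5]
b = zeros(3,1);
for j = 0:2
    b(j+1) = integral(@(x) x.^j.*f(x), 0, 1); % right-hand side integrals
end
a = A\b                                       % approximately [-0.050465; 4.12251; -4.12251]
% cond(A) is about 524, confirming that the Hilbert matrix is ill-conditioned.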
More generally, suppose w(x) ≥ 0 is a weight function on [a,b]. If we want to approximate f(x) by a linear combination of given functions
\[
g(x) = \sum_{j=0}^{n} a_j \phi_j(x)
\]
on [a,b] in a least-squares sense with respect to w, then the error to be minimised is
\[
E = \int_a^b w(x)\,\big[f(x) - g(x)\big]^2 \, dx
\]
(The simplest case is w(x) = 1 and $\phi_j(x) = x^j$, which gives the polynomial approximation above.) Setting $\partial E/\partial a_j = 0$ for $j = 0,1,\dots,n$ gives the normal equations
\[
\sum_{k=0}^{n} a_k \int_a^b w(x)\,\phi_j(x)\,\phi_k(x)\,dx = \int_a^b w(x)\,\phi_j(x)\,f(x)\,dx, \qquad j = 0,1,\dots,n
\]
If the functions $\phi_0, \phi_1, \dots, \phi_n$ are orthogonal on [a,b] with respect to w(x), i.e.
\[
\int_a^b w(x)\,\phi_j(x)\,\phi_k(x)\,dx = 0 \;\text{ for } j \ne k, \qquad \int_a^b w(x)\,[\phi_j(x)]^2\,dx = \alpha_j > 0,
\]
then all the off-diagonal terms in the normal equations vanish and each coefficient can be computed independently:
\[
a_j = \frac{\int_a^b w(x)\,\phi_j(x)\,f(x)\,dx}{\int_a^b w(x)\,[\phi_j(x)]^2\,dx} = \frac{1}{\alpha_j}\int_a^b w(x)\,\phi_j(x)\,f(x)\,dx, \qquad j = 0,1,\dots,n
\]
Note that the procedure of least-squares approximation is considerably simplified when the functions are chosen to be orthogonal. We can summarise the results as follows:
If $\{\phi_0, \phi_1, \dots, \phi_n\}$ is an orthogonal set of functions on an interval [a,b] with respect to the weight function w, then the least-squares approximation to f(x) on [a,b] with respect to w is
\[
g(x) = \sum_{j=0}^{n} a_j \phi_j(x), \qquad \text{where} \quad a_j = \frac{\int_a^b w(x)\,\phi_j(x)\,f(x)\,dx}{\int_a^b w(x)\,[\phi_j(x)]^2\,dx}, \qquad j = 0,1,\dots,n.
\]
Example3: Chebyshev polynomials are orthogonal on (−1,1) with respect to the weight function
\[
w(x) = \frac{1}{\sqrt{1-x^2}}
\]
As shown in the figure below, w(x) places more emphasis at the ends of the interval and less emphasis around the centre of the interval (−1,1).
Some other examples of orthogonal polynomials are Hermite, Jacobi, Legendre and Laguerre orthogonal
polynomials. It is possible to construct orthogonal polynomials on an interval [ , ] with respect to a weight
function using a procedure called Gram-Schmidt orthogonalization.
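As an illustrative sketch of the orthogonality formula above (the function f(x) = e^x here is an assumed example, not taken from these notes), the Chebyshev coefficients can be evaluated numerically. With the substitution x = cos(θ), the weighted integrals over (−1,1) become ordinary integrals over (0,π), and the denominator ∫ w T_j^2 dx equals π for j = 0 and π/2 for j ≥ 1.

% Sketch: Chebyshev least-squares approximation of an assumed f(x) = exp(x) on (-1,1)
% with weight w(x) = 1/sqrt(1-x^2), using a_j = integral(w*T_j*f)/integral(w*T_j^2).
f = @(x) exp(x);             % assumed example function
n = 2;                       % degree of the approximation
a = zeros(n+1,1);
for j = 0:n
    num = integral(@(th) f(cos(th)).*cos(j*th), 0, pi);  % x = cos(theta) substitution
    den = pi/(1 + (j>0));                                % pi if j = 0, pi/2 otherwise
    a(j+1) = num/den;
end
% The approximation is g(x) = a(1)*1 + a(2)*x + a(3)*(2*x.^2 - 1).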
General Linear Least-Squares Approximation for Discrete Data

A general linear least-squares model can be written as
\[
y_i = a_0 z_0(x_i) + a_1 z_1(x_i) + \dots + a_m z_m(x_i) + e_i, \qquad i = 1,2,\dots,n
\]
where n is the number of data points; $(a_0, a_1, \dots, a_m)$ are the coefficients; $(z_0, z_1, \dots, z_m)$ are the (m+1) basis functions; $y_i$ is the measured value of the dependent variable for the i-th data point; and $e_i$ is the discrepancy between $y_i$ and the corresponding approximate value given by the linear least-squares model. (See Section 17.4 in the textbook.)
The term linear refers to the model's dependence on its coefficients; the basis functions themselves can still be nonlinear. For instance, the model $f(x) = a_0\,(1 - e^{-a_1 x})$ is a nonlinear model as it cannot be written as a linear combination of some basis functions.
Some examples of linear least-squares models are:
Linear regression: $y = a_0 + a_1 x$, thus $z_0 = 1$, $z_1 = x$.
Polynomial regression: $y = a_0 + a_1 x + \dots + a_m x^m$, thus $z_0 = 1$, $z_1 = x$, ..., $z_m = x^m$.
Multiple linear regression: $y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_m x_m$, thus $z_0 = 1$, $z_1 = x_1$, $z_2 = x_2$, ..., $z_m = x_m$.
In matrix form, the model can be written as $\mathbf{y} = Z\mathbf{a} + \mathbf{e}$, where $\mathbf{y} = [y_1 \; y_2 \; \dots \; y_n]^T$, $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_m]^T$ and $\mathbf{e} = [e_1 \; e_2 \; \dots \; e_n]^T$. Z is an $(n \times (m+1))$ matrix whose entry in row i and column j+1 is $z_j(x_i)$, the value of the j-th basis function calculated at the given values of the independent variable(s) for the i-th data point. Note that $n > m+1$, which means that the system of linear equations is typically overdetermined, i.e. there are more equations than unknowns.
The matrix Z is also called the design matrix and it is given as
\[
Z = \begin{bmatrix}
z_0(x_1) & z_1(x_1) & \dots & z_m(x_1) \\
z_0(x_2) & z_1(x_2) & \dots & z_m(x_2) \\
\vdots   & \vdots   &       & \vdots   \\
z_0(x_n) & z_1(x_n) & \dots & z_m(x_n)
\end{bmatrix}
\]
The least-squares coefficients are found by setting $\partial S_r/\partial a_j = 0$ for $j = 0,1,\dots,m$. In other words, we are minimising the length of the error vector $\mathbf{e}$, where $\mathbf{e} = \mathbf{y} - Z\mathbf{a}$. This problem is equivalent to minimising the objective function $S$, which we can easily express using the dot product operation between two vectors as follows:
\[
S = \frac{1}{2}\,|\mathbf{e}|^2 = \frac{1}{2}\,\mathbf{e}^T\mathbf{e} = \frac{1}{2}\,\big[\mathbf{y} - Z\mathbf{a}\big]^T\big[\mathbf{y} - Z\mathbf{a}\big]
\]
Setting the derivative of S with respect to the coefficient vector to zero gives the normal equations
\[
Z^T Z\,\mathbf{a} - Z^T\mathbf{y} = \mathbf{0} \qquad \Longleftrightarrow \qquad Z^T Z\,\mathbf{a} = Z^T\mathbf{y}
\]
Note that the vector $\mathbf{0}$ is a zero column-vector of size $(m+1) \times 1$. If the basis functions are independent, then $Z^T Z$ is nonsingular and the coefficients can be found by using the matrix inverse as follows: $\mathbf{a} = (Z^T Z)^{-1} Z^T \mathbf{y}$. But remember that using the matrix inverse is less efficient and less accurate than solving the system by Gauss Elimination.
If the design matrix Z is rank-deficient (i.e. the basis functions are not independent), then $Z^T Z$ is singular and its inverse does not exist. In such a case, we do not have a unique solution for the normal equations.
Note that the normal equations are always worse conditioned than the original overdetermined system. In order to circumvent this problem, orthogonalisation algorithms such as QR factorization (which is also used by Matlab) can be employed.
In Matlab, the least-squares solution to the problem $\mathbf{y} = Z\mathbf{a} + \mathbf{e}$ (i.e. the $\mathbf{a}$ that minimises $|\mathbf{e}|$) is given by a = Z\y, where Z is the design matrix. In Matlab, the backslash operator \ is the same as the function mldivide. See the help page of mldivide for the algorithms used by Matlab.
Matlab avoids the normal equations: the backslash operator uses QR factorization when the coefficient matrix is not square. Recall that the normal equations are always worse conditioned than the original overdetermined system in a typical least-squares problem; QR factorization circumvents this problem.
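A small sketch of the two approaches, assuming the design matrix Z and the measurement vector y are already defined:

a_normal    = (Z'*Z)\(Z'*y);   % solve the normal equations directly
a_backslash = Z\y;             % Matlab's backslash (QR-based for non-square Z)
% Both give the same coefficients in exact arithmetic, but the normal-equation
% route works with Z'*Z, whose condition number is roughly cond(Z)^2.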
Question: If n = m+1, then what happens? In that case, the number of basis functions will be equal to the number of data points and Z becomes a square matrix. Then
\[
\mathbf{a} = (Z^T Z)^{-1} Z^T \mathbf{y} = Z^{-1} (Z^T)^{-1} Z^T \mathbf{y} = Z^{-1}\mathbf{y}
\]
which means $Z\mathbf{a} = \mathbf{y}$ exactly. What happens to $\mathbf{e}$ (and hence $S_r$) then?
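A tiny made-up illustration of this case (the data values below are assumed for demonstration only):

x = [0 1 2]'; y = [1 3 2]';    % 3 data points
Z = [ones(size(x)) x x.^2];    % 3 basis functions, so Z is square
a = Z\y;                       % equivalent to inv(Z)*y here
e = y - Z*a                    % zero up to round-off, so Sr = 0: the model interpolates the data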
Example4: The variable y is a function of two independent variables $x_1$ and $x_2$. The measured values of y for several values of $x_1$ and $x_2$ are given in the table below. Fit the model $y = a_0 + a_1 x_1 + a_2 x_2$ to this data. The problem is then $y = a_0 + a_1 x_1 + a_2 x_2 + e$, where e denotes the error.
x1     x2     y
0      0      5
2      1      10
2.5    2      9
1      3      0
4      6      3
7      2      27
Solution4: Define the design matrix as Z = [ ones(size(x1)) x1 x2 ]. The least-squares solution a = Z\y gives $a_0 = 5$, $a_1 = 4$, $a_2 = -3$. Note that cond(Z'*Z) = 65.466. Solving the normal equations directly, a = (Z'*Z)\(Z'*y), gives the same coefficients $a_0 = 5$, $a_1 = 4$, $a_2 = -3$.
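One way to reproduce the quoted numbers in Matlab (a sketch equivalent to the solution described above):

x1 = [0 2 2.5 1 4 7]'; x2 = [0 1 2 3 6 2]'; y = [5 10 9 0 3 27]';
Z  = [ones(size(x1)) x1 x2];   % design matrix for y = a0 + a1*x1 + a2*x2
a  = Z\y                       % gives a0 = 5, a1 = 4, a2 = -3
cond(Z'*Z)                     % about 65.466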
Example5: Consider a data set where t and y are the independent and dependent variables respectively. It is required to fit a linear model of the form $y = a_0 + a_1 e^{-t} + a_2\, t e^{-t}$ to this data using least-squares regression.
The data is given as t=[0 0.3 0.8 1.1 1.6 2.3]'; y=[0.6 0.67 1.01 1.35 1.47 1.25]';
Solution5: In Matlab, define the design matrix as Z=[ ones(size(t)) exp(-t) t.*exp(-t) ]
The coefficients of the model are then calculated as:
>> a=Z\y
a =
1.398315282466043
-0.885977444768642
0.308457854291529
>> tt=(0:0.01:2.5)'; ym=[ ones(size(tt)) exp(-tt) tt.*exp(-tt) ]*a;
%Define vector tt to plot the fitted curve (i.e. the model).
%ym is the y-values produced by the model.
%Alternatively, you could define ym as ym=a(1)+a(2)*exp(-tt)+a(3)*tt.*exp(-tt);
>> plot(t,y,'o',tt,ym), xlabel('t'), ylabel('y')
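If you also want to quantify how well the model fits the data (not part of the original example), one common follow-up is to compute the coefficient of determination:

>> e  = y - Z*a;                % residuals of the fitted model
>> Sr = e'*e;                   % sum of squared residuals
>> St = sum((y - mean(y)).^2);  % total sum of squares about the mean
>> r2 = 1 - Sr/St               % coefficient of determination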