Chemometrics - Curve Fitting and Linear Regression 01


Curve Fitting and
Linear Regression
KIM 357 – Chemometrics
Department of Chemistry, FMIPA IPB
[email protected]
Curve Fitting

Describes techniques to fit curves (curve fitting) to discrete data in order to obtain intermediate estimates.
There are two general approaches to curve fitting:
 Least squares regression:
Data exhibit a significant degree of scatter. The strategy is to derive a single curve that represents the general trend of the data.
 Interpolation:
Data are very precise. The strategy is to pass a curve or a series of curves through each of the points.
Curve Fitting

In chemistry, two types of applications are encountered:

 Trend analysis: predicting values of the dependent variable, which may include extrapolation beyond the data points or interpolation between data points.

 Hypothesis testing: comparing an existing mathematical model with measured data.
Curve Fitting: Techniques

 Where does a given function Measured Variable = f(Physical Variable) come from in the first place?
 Analytical models of phenomena (e.g., equations from physics)
 Equations created from observed data
 Curve fitting captures the trend in the data by assigning a single function across the entire range.
Curve Fitting: Techniques

A straight line is described generically by

$$f(x) = ax + b$$

The goal is to identify the coefficients a and b such that f(x) 'fits' the data well.
 Given the general form of a straight line, how can we pick the coefficients that best fit the line to the data?
 What makes a particular straight line a 'good' fit?
Linear Regression

Fitting a straight line to a set of paired observations (x1, y1), (x2, y2), …, (xn, yn):

$$y = a_0 + a_1 x + e$$

 a1 - slope
 a0 - intercept
 e - error, or residual, between the model and the observations
Linear Regression: Residual
Linear Regression: Question?

How do we find a0 and a1 so that the error is minimized?
Linear Regression
Assumptions:
 Positive and negative errors of the same magnitude count equally (whether a data point lies above or below the line).
 Larger errors are weighted more heavily.
• Denote the data values as (x, y).
• Denote the corresponding points on the fitted line as (x, f(x)).

The error at each data point is the vertical distance between (x, y) and (x, f(x)); in the accompanying figure it is shown for four data points.
Linear Regression: Least Squares Fit

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_{i,\mathrm{measured}} - y_{i,\mathrm{model}}\right)^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

Minimizing

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

yields a unique line for a given set of data.


Linear Regression: Least Squares Fit

$$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

The coefficients a0 and a1 that minimize Sr must satisfy the following conditions:

$$\frac{\partial S_r}{\partial a_0} = 0, \qquad \frac{\partial S_r}{\partial a_1} = 0$$
Linear Regression: Determination of a0 and a1

$$\frac{\partial S_r}{\partial a_0} = -2\sum \left(y_i - a_0 - a_1 x_i\right) = 0$$

$$\frac{\partial S_r}{\partial a_1} = -2\sum \left[\left(y_i - a_0 - a_1 x_i\right) x_i\right] = 0$$

Expanding, and noting that $\sum a_0 = n a_0$:

$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$

$$0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2$$

Rearranging gives the normal equations, 2 equations with 2 unknowns that can be solved simultaneously:

$$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$
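As an illustrative sketch (not from the slides), the normal equations can be solved directly as a 2x2 linear system; the function name fit_line_normal_eqs is a hypothetical example:

```python
import numpy as np

def fit_line_normal_eqs(x, y):
    """Solve the least-squares normal equations for (a0, a1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Coefficient matrix and right-hand side of the normal equations
    A = np.array([[n, x.sum()],
                  [x.sum(), (x**2).sum()]])
    b = np.array([y.sum(), (x * y).sum()])
    a0, a1 = np.linalg.solve(A, b)
    return a0, a1
```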
Linear Regression: Determination of a0 and a1

$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

$$a_0 = \bar{y} - a_1 \bar{x}$$
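A minimal sketch of the same fit using the closed-form expressions above (the helper name fit_line is an assumption for illustration); it should agree with the normal-equation solution:

```python
import numpy as np

def fit_line(x, y):
    """Return (a0, a1) for the least-squares line y = a0 + a1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Closed-form slope, then intercept from the means
    a1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
    a0 = y.mean() - a1 * x.mean()
    return a0, a1
```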
Error Quantification of Linear Regression

 The total sum of the squares around the mean of the dependent variable y is St:

$$S_t = \sum \left(y_i - \bar{y}\right)^2$$

 The sum of the squares of the residuals around the regression line is Sr:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$
Error Quantification of Linear Regression

 St − Sr quantifies the improvement, or error reduction, gained by describing the data with a straight line rather than with an average value:

$$r^2 = \frac{S_t - S_r}{S_t}$$

r²: coefficient of determination
r: correlation coefficient
Error Quantification of Linear Regression

 For a perfect fit, Sr = 0 and r = r² = 1, signifying that the line explains 100 percent of the variability of the data.
 For r = r² = 0, Sr = St and the fit represents no improvement over simply using the mean.
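A brief sketch of this error measure (the function name r_squared is illustrative, not from the slides):

```python
import numpy as np

def r_squared(x, y, a0, a1):
    """Coefficient of determination r^2 = (St - Sr) / St for a fitted line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    St = np.sum((y - y.mean())**2)         # spread around the mean
    Sr = np.sum((y - a0 - a1 * x)**2)      # spread around the regression line
    return (St - Sr) / St
```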
Least Squares Fit of a Straight Line: Example

Fit a straight line to the x and y values in the following table:

xi      yi      xi*yi   xi^2
1       0.5     0.5     1
2       2.5     5       4
3       2       6       9
4       4       16      16
5       3.5     17.5    25
6       6       36      36
7       5.5     38.5    49
Σ=28    Σ=24.0  Σ=119.5 Σ=140

$$\sum x_i = 28, \quad \sum y_i = 24.0, \quad \sum x_i y_i = 119.5, \quad \sum x_i^2 = 140$$

$$\bar{x} = \frac{28}{7} = 4, \qquad \bar{y} = \frac{24}{7} = 3.4286$$
Least Squares Fit of a Straight Line: Example (cont’d)

$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{7 \times 119.5 - 28 \times 24}{7 \times 140 - 28^2} = 0.8392857$$

$$a_0 = \bar{y} - a_1\bar{x} = 3.428571 - 0.8392857 \times 4 = 0.07142857$$

$$y = 0.07142857 + 0.8392857\,x$$
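As a quick check (a sketch, not part of the original example), the fitted coefficients can be reproduced with numpy.polyfit:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])

# polyfit returns the highest-order coefficient first: [a1, a0]
a1, a0 = np.polyfit(x, y, 1)
print(a0, a1)  # approximately 0.07142857 and 0.8392857
```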
Least Squares Fit of a Straight Line: Example (cont’d)

xi      yi      (yi − ȳ)²   ei² = (yi − ŷi)²
1       0.5     8.5765      0.1687
2       2.5     0.8622      0.5625
3       2.0     2.0408      0.3473
4       4.0     0.3265      0.3265
5       3.5     0.0051      0.5896
6       6.0     6.6122      0.7972
7       5.5     4.2908      0.1993
Σ=28    Σ=24.0  Σ=22.7143   Σ=2.9911

$$S_t = \sum \left(y_i - \bar{y}\right)^2 = 22.7143$$

$$S_r = \sum e_i^2 = 2.9911$$

$$r^2 = \frac{S_t - S_r}{S_t} = \frac{22.7143 - 2.9911}{22.7143} = 0.868$$

$$r = \sqrt{0.868} = 0.932$$
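These columns and sums can be reproduced with a short sketch (variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])
a1, a0 = np.polyfit(x, y, 1)

dev2 = (y - y.mean())**2          # column (yi - ybar)^2
res2 = (y - (a0 + a1 * x))**2     # column ei^2 = (yi - yhat)^2
St, Sr = dev2.sum(), res2.sum()   # ~22.7143 and ~2.9911
r2 = (St - Sr) / St               # ~0.868; r = sqrt(r2) ~ 0.932
```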
Least Squares Fit of a Straight Line: Example (Error Analysis)

• The standard deviation quantifies the spread around the mean:

$$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$$

• The standard error of the estimate quantifies the spread around the regression line:

$$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$

Because $s_{y/x} < s_y$, the linear regression model fits the data well.
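A short sketch of these two measures, using St, Sr, and n from the example above:

```python
import numpy as np

St, Sr, n = 22.7143, 2.9911, 7
s_y = np.sqrt(St / (n - 1))    # standard deviation around the mean, ~1.9457
s_yx = np.sqrt(Sr / (n - 2))   # standard error of the estimate, ~0.7735
print(s_y, s_yx)               # s_yx < s_y, so the line improves on the mean
```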
Linearization of Nonlinear Relationships

• Linear regression requires that the relationship between the dependent and independent variables be linear.
• However, a few types of nonlinear functions can be transformed into linear regression problems (see the sketch after this list):
 The exponential equation
 The power equation
 The saturation-growth-rate equation
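As an illustrative sketch of the first case (the data values here are made up for demonstration): taking logarithms of the exponential model y = α·exp(βx) gives ln y = ln α + βx, which is linear in x.

```python
import numpy as np

# Synthetic data following y = 2.0 * exp(0.5 * x) (illustrative values only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * np.exp(0.5 * x)

# ln(y) = ln(alpha) + beta * x is a straight line, so fit (x, ln y)
beta, ln_alpha = np.polyfit(x, np.log(y), 1)
alpha = np.exp(ln_alpha)
print(alpha, beta)  # recovers ~2.0 and ~0.5
```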
Software
Thank you
