
INTRODUCTION TO

ADJUSTMENT
CALCULUS

P. VANICEK

September 1973

TECHNICAL REPORT
LECTURE NOTES
NO. 35
INTRODUCTION TO
ADJUSTMENT CALCULUS
(Third Corrected Edition)

Petr Vaníček

Department of Geodesy & Geomatics Engineering


University of New Brunswick
P.O. Box 4400
Fredericton, N.B.
Canada
E3B 5A3

February, 1980
Latest Reprinting October 1995
PREFACE

In order to make our extensive series of lecture notes more readily available, we have
scanned the old master copies and produced electronic versions in Portable Document
Format. The quality of the images varies depending on the quality of the originals. The
images have not been converted to searchable text.
FOREWORD

It has long been the author's conviction that most of the existing courses tend to slide over the fundamentals and treat the adjustment purely as a technique, without giving the student a deeper insight and without answering a good many questions beginning with "why".

This course is a result of a humble attempt to present the adjustment

as a discipline in its own right, with a firm basis and internal

structure; simply as an adjustment calculus. Evidently, when one tries

to take an unconventional approach, one is only too liable to make

mistakes. It is hoped that the student will hence display some patience

and understanding.

These notes have evolved from the first rushed edition - termed as preliminary - of the Introduction to Adjustment Calculus, written for

course SE 3101 in 1971. Many people have kindly communicated their

comments and remarks to the author. To all these, the author is heavily

indebted. In particular, Dr. L. Hradílek, Professor at the Charles University in Prague, and Dr. B. Lund, Assistant Professor at the Mathematics Dept., UNB, made very extensive reviews that helped in clarifying

many points. Mr. M. Nassar, a Ph.D. student in this department, carried


most of the burden connected with rewriting the notes on his shoulders.

Many of the improvements in formulations as well as most of the examples

and exercise problems contained herein originated from him.

None of the contributors should, however, be held responsible for any errors and misconceptions still present. Any comment or criticism communicated to the author will be highly appreciated.

P. Vaníček
October 7, 1974
CONTENTS

Introduction . . . 1

1. Fundamentals of the Intuitive Theory of Sets
   1.1  Sets, Elements and Subsets . . . 6
   1.2  Progression and Definition Set . . . 7
   1.3  Cartesian Product of Sets . . . 8
   1.4  Intersection of Sets . . . 9
   1.5  Union of Sets . . . 10
   1.6  Mapping of Sets . . . 12
   1.7  Exercise 1 . . . 13

2. Fundamentals of the Mathematical Theory of Probability
   2.1  Probability Space, Probability Function and Probability . . . 15
   2.2  Conditional Probability . . . 16
   2.3  Combined Probability . . . 16
   2.4  Exercise 2 . . . 18

3. Fundamentals of Statistics
   3.1  Statistics of an Actual Sample
        3.1.1  Definition of a Random Sample . . . 20
        3.1.2  Actual (Experimental) PDF and CDF . . . 22
        3.1.3  Mean of a Sample . . . 26
        3.1.4  Variance of a Sample . . . 29
        3.1.5  Other Characteristics of a Sample . . . 32
        3.1.6  Histograms and Polygons . . . 34
   3.2  Statistics of a Random Variable
        3.2.1  Random Function and Random Variable . . . 47
        3.2.2  PDF and CDF of a Random Variable . . . 47b
        3.2.3  Mean and Variance of a Random Variable . . . 51
        3.2.4  Basic Postulate (Hypothesis) of Statistics, Testing . . . 55
        3.2.5  Two Examples of a Random Variable . . . 56
   3.3  Random Multivariate
        3.3.1  Multivariate, its PDF and CDF . . . 66
        3.3.2  Statistical Dependence and Independence . . . 69
        3.3.3  Mean and Variance of a Multivariate . . . 70
        3.3.4  Covariance and Variance-Covariance Matrix . . . 72
        3.3.5  Random Multisample, its PDF and CDF . . . 76
        3.3.6  Mean and Variance-Covariance Matrix of a Multisample . . . 76
        3.3.7  Correlation . . . 81

4. Fundamentals of the Theory of Errors
   4.1  Basic Definitions . . . 89
   4.2  Random (Accidental) Errors . . . 91
   4.3  Gaussian PDF, Gauss Law of Errors . . . 92
   4.4  Mean and Variance of the Gaussian PDF . . . 94
   4.5  Generalized or Normal Gaussian PDF . . . 97
   4.6  Standard Normal PDF . . . 98
   4.7  Basic Hypothesis (Postulate) of the Theory of Errors, Testing . . . 106
   4.8  Residuals, Corrections and Discrepancies . . . 109
   4.9  Other Possibilities Regarding the Postulated PDF . . . 112
   4.10 Other Measures of Dispersion . . . 113
   4.11 Exercise 4 . . . 118

5. Least-Squares Principle
   5.1  The Sample Mean as "The Least-Squares Estimator" . . . 123
   5.2  The Sample Mean as "The Maximum Probability Estimator" . . . 125
   5.3  Least-Squares Principle . . . 128
   5.4  Least-Squares Principle for Random Multivariate . . . 130
   5.5  Exercise 5 . . . 132

6. Fundamentals of Adjustment Calculus
   6.1  Primary and Derived Random Samples . . . 133
   6.2  Statistical Transformation, Mathematical Model . . . 133
   6.3  Propagation of Errors
        6.3.1  Propagation of Variance-Covariance Matrix, Covariance Law . . . 137
        6.3.2  Propagation of Errors, Uncorrelated Case . . . 145
        6.3.3  Propagation of Non-Random Errors, Propagation of Total Errors . . . 153
        6.3.4  Truncation and Rounding . . . 157
        6.3.5  Tolerance Limits, Specifications and Preanalysis . . . 162
   6.4  Problem of Adjustment
        6.4.1  Formulation of the Problem . . . 166
        6.4.2  Mean of a Sample as an Instructive Adjustment Problem, Weights . . . 167
        6.4.3  Variance of the Sample Mean . . . 170
        6.4.4  Variance-Covariance Matrix of the Mean of a Multisample . . . 174
        6.4.5  The Method of Least-Squares, Weight Matrix . . . 176
        6.4.6  Parametric Adjustment . . . 179
        6.4.7  Variance-Covariance Matrix of the Parametric Adjustment Solution Vector, Variance Factor and Weight Coefficient Matrix . . . 193
        6.4.8  Some Properties of the Parametric Adjustment Solution Vector . . . 201
        6.4.9  Relative Weights, Statistical Significance of a Priori and a Posteriori Variance Factors . . . 202
        6.4.10 Conditional Adjustment . . . 204
        6.4.11 Variance-Covariance Matrix of the Conditional Adjustment Solution . . . 213
   6.5  Exercise 6 . . . 220

Appendix I   Assumptions for and Derivation of the Gaussian PDF . . . 233
Appendix II  Tables . . . 238
Bibliography . . . 241
INTRODUCTION

In technical practice, as well as in all experimental sciences,

one is faced with the following problem: evaluate quantitatively parameters describing properties, features, relations or behaviour of various

objects around us. The parameters can be usually evaluated only on the

basis of the results of some measurements or observations. We may, for

example, be faced with the problem of evaluating the length of a string.

This can be measured directly. Here the only parameter we are trying to

determine is the observed quantity itself and the problem is fairly

simple. A more complicated proposal would be, for instance, to determine the coefficient of expansion of a rod. Then the parameter--the

coefficient of expansion--cannot be measured directly, as in the previous

case, and we have to deduce its value from the results of observations of

length, by performing some computations using the mathematical relationship

connecting the observed quantities and the wanted parameters. The more

complicated the problems get, of course, the more complex is the system

whose parameters we are trying to determine. Obviously, the determination

of the orbital parameters of a satellite from various angles observed on

the surface of the earth would be an example of one such still more

sophisticated task.

The adjustment is a discipline that tries to categorise those

problems and attempts to deal with them systematically. In order to be

able to deal with such problems systematically the adjustment has to use a

language suitable for this purpose, the obvious choice being mathematics.


Hence, the problem to be treated has to be first "translated" into the

language of mathematics, i.e., the problem has to be first mathematically

formulated. The mathematical formulation of the problem would really be

the mathematical formulation of the relation between the observed quan-

tities (observables) and the wanted quantities (parameters). This relation-

ship is called the mathematical model. Denoting the observables by L

(L stands for one, two, or n quantities) and the parameters by X (X stands

for one, two or m quantities) the most general form of the mathematical

model can be written as

    F(X, L) = 0 .

The above equation merely states that there is an (implicit) relation

between the observables and the parameters. The formulation of an actual

mathematical model has to be done taking into account all the physical

and geometrical laws--simply using the accumulated experience. The com-

plexity of the mathematical model reflects the complexity of the problem

itself. Thus the mathematical model of our first problem is practically

trivial:

X = L

where X is the wanted length and L is the observed length.

The mathematical model for the coefficient of expansion α of the rod is more complicated, namely, for instance

    ℓ = ℓ₀ (1 + αt)

where α = X, the observed length ℓ and the observed difference in temperature t create L, and ℓ₀ is another parameter (length of the rod at a fixed temperature) which we happen to know. The mathematical model for

the satellite orbital elements would be more complicated still.

Once the mathematical model has been formulated it can become

a subject of rigorous mathematical treatment, a subject of adjustment

calculus. Hence, the formulation of the mathematical model itself is to

be considered as being beyond the scope of adjustment calculus and only

the various kinds of mathematical models alone constitute the subject of

interest.

There is one particular class of models that are very often encountered in practice and that can be termed overdetermined. By an

overdetermined model we understand a model which does not have a unique

solution for X because there are "unnecessarily many" observations supplied.

This can be the case, say, with our first example, if the length is measured

several times. The model in this case would be formulated as

    X = ℓ₁
    X = ℓ₂
    ⋮
    X = ℓₙ

where ℓ₁, ℓ₂, ..., ℓₙ are all encompassed by the symbol L. Or, in the

second example, we may have

    ℓ₁ = ℓ₀ (1 + αt₁)
    ℓ₂ = ℓ₀ (1 + αt₂)
    ⋮
    ℓₙ = ℓ₀ (1 + αtₙ)

where (ℓ₁, t₁, ℓ₂, t₂, ..., ℓₙ, tₙ) = L.
As we can easily see, these overdetermined models may or may

not have a unique solution. They usually do not. Therefore, in order to produce a unique solution of some kind, we have to assume that the

observations were not quite correct, that there were errors in their

determinations.
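To make the idea concrete, the following minimal Python sketch (the measurement values are invented for illustration) shows that the overdetermined model X = ℓᵢ, written for several measurements of the same length, generally has no exact solution, and that a unique estimate only appears once the observations are allowed to carry errors:

    # Hypothetical repeated measurements of the same length (metres);
    # the values are invented for illustration only.
    lengths = [15.42, 15.44, 15.41, 15.43]

    # The overdetermined model X = l_i cannot be satisfied exactly:
    # no single X reproduces all observations at once.
    exact_solution_exists = all(l == lengths[0] for l in lengths)
    print("exact solution exists:", exact_solution_exists)   # False

    # Admitting observational errors, one candidate for a unique estimate
    # is the value that makes the errors "smallest" in some agreed sense,
    # e.g. the arithmetic mean (anticipating the least-squares principle of section 5).
    x_hat = sum(lengths) / len(lengths)
    print("estimate of X:", x_hat)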

This leads us into the area of the theory of errors with its

prerequisites--the theory of probability and statistics. With the help

of these disciplines we are able to define the most probable unique

solution (if it exists) for the parameters of the mathematical models.

We also are usually able to establish the degree of reliability of the

solution.

The notes are divided into six sections: Fundamentals of the

Intuitive Theory of Sets, Fundamentals of the Mathematical Theory of

Probability, Fundamentals of Statistics, Fundamentals of the Theory of

Errors, Least-Squares Principle, Fundamentals of the Adjustment Calculus.

The first four sections describe the relevant parts of the individual

fields that are necessary to understand what adjustment is all about. They,

by no means, claim any completeness and it is envisaged that an interested

student will supplement his reading from other sources, such as those

listed at the end of these notes.

A separate section (5) is devoted to the philosophical basis

of the adjustment calculus. Although not very extensive, it should be regarded as important, giving the reasons why the least-squares technique

is used in adjustment.

Finally, the last section deals with the basics of the adjust-

ment proper. Here again, only the introductory parts of the adjustment

calculus could be treated with the understanding that only the subsequent

courses will develop the full picture.

Throughout the course emphasis is placed on the parallel develop-

ment of concepts of "discrete statistics", i.e. statistics of random

samples, and "continuous statistics", i.e. statistics of random variables.

While random samples are the quantities we deal with in every-day practice,

the mathematical tools used are predominantly from the continuous domain.

Good understanding of the interplay of the two concepts is indispensable

for anyone who wants to be able to use the adjustment calculus properly.

The bibliography given at the end of these notes lists some

of the useful books dealing with statistics and adjustments. The interested reader is recommended to complement the reading of these notes by turning

to at least some of the listed sources.


1. FUNDAMENTALS OF THE INTUITIVE THEORY OF SETS

1.1. Sets, Elements and Subsets

A set is an ensemble of objects (elements) that can be distin-

guished one from another. The set is defined when all its elements are

defined.

Example 1.1:   A1 ≡ {▽, ❄, 4.18} ,

               A2 ≡ {1, 8, 15, ☆, 0, ❄, 4} ,

               A3 ≡ {0, 1} ,

               A4 ≡ {all the left feet} ,

               A5 ≡ {all the cities with more than one million inhabitants in New Brunswick} ,

               R ≡ {all the real numbers} , and

               I ≡ {all the positive integers} , are all sets.

The text within the brackets { ... } is known as the list of the

set. If an element a exists in the list of a set A, we say that the element

a belongs to the set A, and this is denoted by

a ∈ A

which is read as "a belongs to A". On the other hand, if an element a

does not belong to a set A, we write

a ∉ A

which is read as "a does not belong to A".

Example 1.2: Referring to Example 1.1, we see that:


\I"~
-;0.:-
,, \
e: -A-1 2 t A1 , and a right foot t A4 .


A part of a set G is called a subset of G whether it contains one

or several elements. The fact that a set H is contained in G is hence

written as

H ⊂ G

If H is not contained in G, i.e. if not all the elements of H are at the

same time elements of G, we write

H ⊄ G

Example 1.3: Referring to Example 1.1, we see that:

               {2, 35, 118} ⊂ I ,  {3, 6.2} ⊂ R ,  and  {▽, ❄} ⊂ A1 .

A set which does not contain any element is known as the empty

(void or null) set, and is denoted by ∅ .

Example 1.4: The set A6 = {all people taller than 10 feet}

contains no elements, i.e. A6 = ∅ . Also from Example 1.1, we find that A5 = ∅ .
The sets are called equal if they contain the same and none but

the same elements.

Example 1.5: All the following sets are equal

               {1, 2, 3} ,  {3, 1, 2} ,  {2, 3, 1} ,  ...

1.2. Progression and Definition Set

A progression ξ is an ordered (by ordered we mean that ξ is arranged such that each of its elements has a specific position) ensemble of objects (elements) that may not all be distinguishable one from another. The definition set D of a progression ξ is the set composed from all the distinguishable elements of ξ. In such a case, we shall also say that D belongs to ξ.

Example 1.6:   ξ = (1, 2, ☂, 2, 1, 8, ♦) is a progression, and its definition set D is given by

               D ≡ {1, 2, ☂, 8, ♦} .

At this point, the difference between a progression and a set should be clear in mind. For instance, the progression (♦, 8, 2, 1, 1, 2, ☂) represents a different progression than the one given in Example 1.6. However, the sets {☂, 8, 2, 1, ♦} , {2, 1, 8, ♦, ☂} , ... are all the same as the definition set D in Example 1.6.

1.3. Cartesian Product of Sets


The Cartesian product of two sets A and B is a set, called the

product set and denoted by A × B (reads A cross B), whose elements are all the ordered two-tuples of elements of the component sets A and B. Hence, if a ∈ A and b ∈ B, then the two-tuple (a, b) ∈ A × B. However, if b ∉ A or a ∉ B, the two-tuple (b, a) ∉ A × B.
The above definition can be extended to more than two sets, say

n sets. In such a case, the elements of the product set will be all the

ordered n-tuples of elements of the component sets. Accordingly, we can

define the Cartesian n-power Aⁿ [or An if no danger of confusion with indexed set exists] of a set A as the Cartesian product of the same set A with itself n times.
Example 1.7:   If A ≡ {3, 1, 5} and B ≡ {2, 4} , then the product set A × B is

               A × B ≡ {(3, 2), (3, 4), (1, 2), (1, 4), (5, 2), (5, 4)} .

               Referring to Example 1.1, we can easily see that:

               (▽, ❄) ∈ A1 × A2 ,  (4.18, ☆) ∈ A1 × A2 ,

               (1, ▽) ∉ A1 × A2  but  (1, ▽) ∈ A2 × A1 ,

               (1, 2, 15, 1, 8) ∈ I⁵ ,  and  (5.16, 3.26, 1, 0, 1) ∈ R⁵ .
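As a computational aside, the product set of Example 1.7 can be generated directly; the sketch below (Python, for illustration only) uses itertools.product:

    from itertools import product

    A = {3, 1, 5}
    B = {2, 4}

    # Product set A x B: all ordered two-tuples (a, b) with a in A, b in B.
    AxB = set(product(A, B))
    print(AxB)            # {(3, 2), (3, 4), (1, 2), (1, 4), (5, 2), (5, 4)}

    # Cartesian n-power: the product of a set with itself n times,
    # e.g. the second power B^2 of B.
    B2 = set(product(B, repeat=2))
    print(B2)             # {(2, 2), (2, 4), (4, 2), (4, 4)}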

1.4. Intersection of Sets

The intersection of two sets A and B, denoted by A ∩ B, is a

set which is a subset of both A and B and does not contain any elements

other than the common elements to A and B. The intersection of two sets

can be represented by the shaded area in Figure 1.1. Diagrams of this kind are called "Venn diagrams".

Figure 1.1

From the above diagram we can easily see that

Example 1.8:   Referring to Example 1.1, we find that:

               A1 ∩ A2 = {❄} ,  and  R ∩ I = I .

Note that we can define a subset A of B as such a set whose

intersection with B is A itself. In other words, if A ⊂ B then A ∩ B = A,

or vice versa (see Figure 1.2).



Figure 1.2

If A ∩ B = ∅, then the sets A and B are said to be disjoint sets.

The intersection of n sets A1, A2, ..., An is usually denoted by ∩ᵢ₌₁ⁿ Aᵢ , where

    ∩ᵢ₌₁ⁿ Aᵢ ≡ A1 ∩ A2 ∩ A3 ∩ ... ∩ An .

This is illustrated in Figure 1.3 by the common area to all sets.

Figure 1.3

1.5. Union of Sets

The union of two sets A and B, denoted by A ∪ B, is a set that contains all the elements of A and B and none else. Similar to the intersection, the union of the two sets is represented by the shaded area in

Figure 1.4

Figure 1.4

The union of n sets A1, A2, ..., An is denoted by ∪ᵢ₌₁ⁿ Aᵢ , where

    ∪ᵢ₌₁ⁿ Aᵢ ≡ A1 ∪ A2 ∪ ... ∪ An .

Example 1.9: Referring to Example 1.1, we obtain

               A1 ∪ A2 = {▽, ❄, 4.18, 1, 8, 15, ☆, 0, 4} ,  and  I ∪ R = R .

Thinking of the union as the addition of sets, the subtraction

of two sets is known as the complement of one into the other. Referring

to Figure 1.5, and considering the two sets A ⊂ B, the set of all the

elements contained in B and not contained in A is called the complement

of A in B, and is denoted by B - A.
Figure 1.5

Example 1.10: Referring to Example 1.1, we get:

               The complement of A3 in A2 is

               A2 - A3 = {8, 15, ☆, ❄, 4} ,  and

               R - I = {all real numbers that are not integers} .
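The set operations of the last three sections map directly onto Python's built-in set type; the following minimal sketch (illustrative only, with hypothetical string labels standing in for the pictorial elements of Example 1.1) reproduces the results of Examples 1.8 - 1.10:

    # Stand-ins for the pictorial elements of Example 1.1 (hypothetical labels).
    SNOW, STAR, TRI = "snowflake", "star", "triangle"

    A1 = {TRI, SNOW, 4.18}
    A2 = {1, 8, 15, STAR, 0, SNOW, 4}
    A3 = {0, 1}

    print(A1 & A2)        # intersection, Example 1.8: {'snowflake'}
    print(A1 | A2)        # union, Example 1.9
    print(A2 - A3)        # complement of A3 in A2, Example 1.10
    print(A3 <= A2)       # subset test: A3 is contained in A2 -> True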

1.6. Mapping of Sets

f is called a mapping of A into B if it relates one and only one

element from B to each element of A. This means that for each element

a ∈ A there will be only one corresponding image b ∈ B (see Figure 1.6).

Figure 1.6

Note here that the one-to-one relationship (i.e. each b ∈ B has got one and only one argument a ∈ A) is not required. We shall denote any such mapping

by

    f ∈ {A → B}

and read it as "f is an element of the set of all the mappings of A into B",

or simply "f is a. mapping of A into B", or "f maps A into B".

If the elements of B are all images of the elements of A, then

f is called an onto mapping, or simply we say that "f maps A onto B".

If A and B are numerical sets, then f is called a function

(which gives the mathematical relationship between each a ∈ A and its corresponding image b ∈ B). In this case, the image b of a will be nothing else but the functional value f(a).

Example 1.11:  Given the set A = {a₁, a₂, a₃} = {2, -1, 3} and the mapping f ∈ {A → B} , where f(aᵢ) = aᵢ³ for each aᵢ ∈ A, i = 1, 2, 3, then the images bᵢ ∈ B are computed as the functional values of the corresponding elements aᵢ ∈ A, i.e. bᵢ = f(aᵢ) = aᵢ³, which give

               b₁ = (2)³ = 8 ,  b₂ = (-1)³ = -1 ,  and  b₃ = (3)³ = 27 .

               Generally, f is an into function, hence we write {8, -1, 27} ⊂ B. However, if f is an onto function, then the image set B of this example is given as

               B ≡ {8, -1, 27} .
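A sketch of Example 1.11 in Python (for illustration; the cube mapping is the one used in the example above) shows the distinction between the image set and a codomain that f maps into rather than onto:

    A = {2, -1, 3}

    def f(a):
        # The mapping of Example 1.11: each element is sent to its cube.
        return a ** 3

    # Image set: the set of all functional values f(a), a in A.
    image = {f(a) for a in A}
    print(image)                      # {8, -1, 27}

    # f maps A into any superset of the image (an "into" mapping),
    # and onto B only when B equals the image set itself.
    B_into = {8, -1, 27, 100}
    print(image <= B_into)            # True  (into, not onto)
    print(image == {8, -1, 27})       # True  (onto this particular B)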

1.7. Exercise 1

1. Which of the following sets are equal?


   {t, r, s}, {s, r, t}, {r, s, t}, {t, s, r} .
2. Let A ≡ {d}, B ≡ {c, d}, C ≡ {a, b, c}, D ≡ {a, b} and H ≡ {a, b, d};

   (i)   is B ⊂ D ?            (ii)   is C ⊃ B ?
   (iii) is D ⊂ C ?            (iv)   is B ≠ H ?
   (v)   is A ⊂ H ?            (vi)   is (A ∪ D) ⊂ H ?
   (vii) is (A ∩ B) ⊄ C ?      (viii) is (H ∩ C) = D ?



3. Let U ≡ {1, 2, 3, ..., 8, 9}, A ≡ {1, 2, 3, 4}, B ≡ {2, 4, 6, 8},
   C ≡ {3, 4, 5, 6} and D ≡ {1, 3, 5, 7, 9}; then find the following:

   (i)   B ∪ D ;     (ii) A ∩ C ;
   (iii) A ∪ B ;     (iv) U - A ;
   (v)   a set H, which is a subset of all the sets U, A and D.

4. Considering the following Venn diagram with the sets A, B, C, D and H,
   indicate by shading the suitable areas on separate diagrams, the
   following sets:

   (i)   D ∪ H ;     (ii)  H ∩ C ;
   (iii) C ∩ B ;     (iv)  A - C ;
   (v)   B ∪ C ;     (vi)  (A - B) ∪ (B ∩ C) ;
   (viii) A - (C ∪ B) .

5. Considering the two sets:

   A ≡ {3, 4, 0, -1} and B ≡ {-2, 5}, find the Cartesian products A × B
   and B × A. Also find the second power B² of the set B.

6. Given the set X ≡ {-2, -1, 0, 1, 2}, with f ∈ {X → Y}. If for each
   x ∈ X, f(x) = x² + 1, find the image set Y considering that f is an
   onto function.
2. FUNDAMENTALS OF THE MATHEMATICAL THEORY OF PROBABILITY

2.1 Probability Space, Probability Function and Probability

Let us have a set D ≠ ∅ and let us assume that it can be partitioned into mutually disjoint subsets Dⱼ ⊂ D such that D = ∪ⱼ Dⱼ (by mutually disjoint subsets we mean such subsets that Dᵢ ∩ Dⱼ = ∅ for any pair Dᵢ, Dⱼ, i ≠ j). Such a set D we shall call probability space.

Any mapping P of D onto [0, 1] (that is the set of all positive real numbers "b" satisfying the inequalities 0 ≤ b ≤ 1) that has the following two properties:

(1) If D' ⊂ D, then P(D') = 1 - P(D - D') (note that D - D' is the complement of D' in D; see section 1.5), and

(2) If D1, D2, ..., Dn ⊂ D are mutually disjoint, then P(∪ᵢ₌₁ⁿ Dᵢ) = Σᵢ₌₁ⁿ P(Dᵢ),

is called a probability function. The value P(D') of the probability function P (takes any value from [0, 1]) is called the probability. Note that the difference between the function and the functional value has been mentioned in section 1.6.

The above two properties of the probability function have the

following consequences:

(1) P(D) = 1,

(2) P(∅) = 0,

(3) If D' ⊂ D, then P(D') ≤ 1,

(4) If D'' ⊂ D', then P(D'') ≤ P(D'), and

(5) If A, B ⊂ D, and A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B).


If D is a point set, i.e. its elements can be represented by points, it is always decomposable.

The value Σᵢ P(Dᵢ) ∈ [0, 1] is sometimes called the total or accumulative probability of ∪ᵢ Dᵢ.

2.2 Conditional Probability

If A, B ⊂ D, then the ratio P(A ∩ B)/P(B) = P(A/B) is called the conditional probability. The right hand side, that is P(A/B), is read as "probability of A given B". In other words, the conditional probability P(A/B) can be interpreted as the probability of occurrence of A under the condition that B occurred.

From the above definition of the conditional probability, we

notice that:

(1) If P(B) = 0, then P(A/B) is not defined,

(2) If B ⊂ A, then A ∩ B = B (see section 1.4), and then P(A/B) = 1,

(3) If A ∩ B = ∅, i.e. A and B are disjoint sets, then P(A/B) = 0.

2.3 Combined Probability

If the conditional probability P(A/B) equals P(A), then it is clear that the occurrence of A does not depend on the occurrence of B. In such a case we say that A and B are independent. Using the definition of the conditional probability from the previous section, we can write:

    P(A ∩ B) = P(A) · P(B) .

This can be understood as the probability of simultaneous occurrence of A and B, which is usually denoted by P(A, B) and read as probability of A

and B, and known as the combined (compound) probability of A and B, that is

P(A, B) = P(A) · P(B).

Similarly, we define the combined probability of occurrence of the independent D1, D2, ..., Dn ⊂ D as the product of their individual probabilities, i.e.

    P(Dᵢ, Dⱼ) = P(Dᵢ) P(Dⱼ) ,   i ≠ j ,

    P(Dᵢ, Dⱼ, Dₖ) = P(Dᵢ) P(Dⱼ) P(Dₖ) ,   i ≠ j, j ≠ k, i ≠ k ,

    P(D1, D2, ..., Dn) = Πᵢ₌₁ⁿ P(Dᵢ) .

Example 2.1:   Suppose we have decomposed the probability space D into seven mutually disjoint subsets D1, D2, ..., D7 as shown in Figure 2.1 such that:

               D = ∪ᵢ₌₁⁷ Dᵢ .

Figure 2.1

               Assuming that the probabilities P(Dᵢ) of the individual subsets Dᵢ are found to be:

               P(D1) = 1/28, P(D2) = 2/28, P(D3) = 3/28, P(D4) = 4/28,
               P(D5) = 5/28, P(D6) = 6/28, and P(D7) = 7/28; then we get:

               Total probability of Dᵢ, i = 1, 2, ..., 7, is
               P(D) = P(∪ᵢ Dᵢ) = Σᵢ₌₁⁷ P(Dᵢ) = (1+2+3+4+5+6+7)/28 = 28/28 = 1.0 .

               Combined probability of all Dᵢ = Πᵢ₌₁⁷ P(Dᵢ) ≅ 0 .

Example 2.2:   In this example we assume that our probability space D is decomposed into five elements dⱼ ∈ D, j = 1, 2, ..., 5. If the probabilities P(dⱼ), as represented by the ordinates in Figure 2.2, are given as:

Figure 2.2

               P(d1) = 0.2, P(d2) = 0.3, P(d3) = 0.1, P(d4) = 0.1,
               and P(d5) = 0.3; then we get:

               Total probability P(D) = P(∪ⱼ dⱼ) = Σⱼ₌₁⁵ P(dⱼ) = 0.2+0.3+0.1+0.1+0.3 = 1.0 .

               Combined probability of d1 and d2 (for example) = P(d1, d2)
               = Πⱼ₌₁² P(dⱼ) = 0.2 · 0.3 = 0.06 .

               This combined probability has to be understood as the probability of simultaneous occurrence of d1 and d2 under the assumption of their independence.
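A minimal numerical check of Example 2.2 (a Python sketch, illustrative only):

    # Probabilities of the five elements of D from Example 2.2.
    p = {1: 0.2, 2: 0.3, 3: 0.1, 4: 0.1, 5: 0.3}

    # Total (accumulative) probability of D: the sum over all elements.
    print(sum(p.values()))             # 1.0

    # Combined probability of d1 and d2, assuming independence:
    # the product of the individual probabilities.
    print(p[1] * p[2])                 # 0.06 (up to floating-point rounding)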

2.4. Exercise 2.

We have determined that every number of a die has the probability of appearing, when the die is tossed, proportional to the number itself.

Let us denote: A = {even numbers}, B = {prime numbers}, and C = {odd numbers}; all subsets of the set of numbers appearing on the die.

Required: (1) Construct the probability space D.

(2) Find the probability of each individual element dᵢ ∈ D.

(3) Find P(A), P(B) and P(C).

(4) Find the probability that:


(i) an even or prime number occurs,

(ii) an odd prime number occurs,

(iii) A but not B occurs.


3. FUNDAMENTALS OF STATISTICS

3.1 Statistics of an Actual Sample

3.1.1 Definition of a Random Sample

Any finite (i.e. containing only a finite number n of elements) ordered progression of elements (see section 1.2) ξ = (ξ₁, ξ₂, ..., ξₙ) such that:

(i) its definition set D (see section 1.2) can be declared a probability space (see section 2.1); and

(ii) it has the probability function P defined for every dᵢ ∈ D in such a way that P(dᵢ) = cᵢ/n, where cᵢ is the count (frequency) of the element dᵢ in ξ,

may be called a random sample. The ratio cᵢ/n is known as the relative frequency.
Example 3.1:   Consider the following progression ξ

                    ξ₁  ξ₂  ξ₃  ξ₄  ξ₅  ξ₆  ξ₇
               ξ ≡ ( 1,  ◇,  ☂,  1,  1,  ☆,  ◇ )

               which has seven elements (i.e. n = 7).

               The definition set D of ξ will be

                     d1  d2  d3  d4
               D ≡ { 1,  ◇,  ☂,  ☆ } , which

               consists of four elements (i.e. m = 4), the counts of which are:

               c1 = 3, c2 = 2, c3 = 1, and c4 = 1. The

corresponding probabilities (relative frequencies)

are:

               P(d1) = P(1) = 3/7,  P(d2) = P(◇) = 2/7,
               P(d3) = P(☂) = 1/7,  and  P(d4) = P(☆) = 1/7.

Note here that really both properties required from P to be a

probability function (section 2.1) are satisfied. In particular we have

(from the above example): the total probability

    P(D) = P(∪ᵢ₌₁ᵐ dᵢ) = Σᵢ₌₁⁴ P(dᵢ) = 3/7 + 2/7 + 1/7 + 1/7 = 1 .

Accordingly, any finite ordered progression of elements may be

declared a random sample. This is a very important discovery and has to

be borne in mind throughout the following development. As a result, it

is always possible to construct the probability space and the associated

probabilities "belonging" to the sample (i.e. the probability associated

with each element in the definition set of the sample).

From now on we shall deal with D ⊂ R (recall that R is the set of all real numbers), i.e. with numerical sets and progressions only. Also,

D will be considered ordered in either ascending or descending sense;

usually the former is used.

It has to be noted here that our definition of a random sample

is not standard in the sense that it admits a much larger family of objects to

be called random samples than the standard definition. More will be said

about it in 3.2.4.

Example 3.2: A die is tossed 100 times. The following

table lists the six numbers and the

frequency (count) with which each number

appeared:

               number dᵢ :   1    2    3    4    5    6

               count  cᵢ :  14   17   20   18   15   16

Find the probability that:

(i) a 3 appears ;

(ii) a 5 appears;
(iii) an even number appears;

(iv) a prime number appears.

Solution:
               (i)   P(3) = 20/100 = 0.20 ;
               (ii)  P(5) = 15/100 = 0.15 ;
               (iii) P(2,4,6) = P(2) + P(4) + P(6)
                              = 17/100 + 18/100 + 16/100 = 51/100 = 0.51 ;
               (iv)  P(2,3,5) = P(2) + P(3) + P(5)
                              = 17/100 + 20/100 + 15/100 = 52/100 = 0.52 .
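The arithmetic of Example 3.2 is easily mechanised; the sketch below (Python, illustrative only) derives the experimental probabilities from the counts:

    # Counts observed in 100 tosses of the die (Example 3.2).
    counts = {1: 14, 2: 17, 3: 20, 4: 18, 5: 15, 6: 16}
    n = sum(counts.values())                      # 100

    # Experimental probability of each number = relative frequency.
    P = {d: c / n for d, c in counts.items()}

    print(P[3])                                   # 0.20
    print(P[2] + P[4] + P[6])                     # 0.51, an even number
    print(P[2] + P[3] + P[5])                     # 0.52, a prime number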

3.1.2 Actual (Experimental) Probability Distribution Function (PDF)

and Cumulative Distribution Function (CDF)

If the random sample ξ is a progression of numbers only (and, of course, its definition set D is a numerical set), which we shall from now on always assume, then P is a discrete function mapping D into [0,1].

This function is usually called experimental (actual) probability

distribution function (or experimental frequency function, etc.) of the

sample ξ, and abbreviated by PDF. The values P(dᵢ), dᵢ ∈ D, are known as experimental probabilities of dᵢ, which are equal to the corresponding relative frequencies.

Example 3.3:   Assume that a certain experiment gave us the following random sample:

               ξ ≡ (1, 2, 4, 1, 1, 2, 1, 1, 2) ,  n = 9.

               Then its definition set is:

               D ≡ {1, 2, 4} = {dᵢ, i = 1, 2, 3} ,  m = 3.

               Therefore, the frequencies cᵢ of dᵢ are:

               c1 = 5 , c2 = 3 and c3 = 1.

               The corresponding experimental probabilities are:
               P(1) = 5/9, P(2) = 3/9 and P(4) = 1/9.

               As a check, Σᵢ₌₁³ P(dᵢ) = (1/9)(5+3+1) = 1.

               The discrete PDF of the given ξ in this example is depicted in Figure 3.1 (which is sometimes called a bar diagram), in which the abscissas represent dᵢ and the ordinates represent the corresponding P(dᵢ).

Figure 3.1

Since we are using numerical sets only (and therefore ordered), it makes sense to ask, for instance, what is the actual probability of d being within an interval D' ⊂ D, where D' ≡ [dₖ, dⱼ]. Such probability is denoted by P(D') or P(dₖ ≤ d ≤ dⱼ). To answer this question, we use the actual PDF and get

    P(dₖ ≤ d ≤ dⱼ) = Σᵢ₌ₖʲ P(dᵢ) .          (3.1)

The above expression (equation 3.1) must be understood as giving the actual probability of d ∈ D' ≡ {dₖ, ..., dⱼ} ⊂ D rather than d ∈ [dₖ, dⱼ] (i.e. the probability that d will acquire a specific discrete value equal to dₖ, dₖ₊₁, ..., dⱼ₋₁, dⱼ rather than the probability that d will be anywhere in the continuous interval [dₖ, dⱼ]). This is not always properly understood in practice.

The function C of dᵢ ∈ D given by

    C(dᵢ) = Σ_{j ≤ i} P(dⱼ) ∈ [0,1]          (3.2)

is called experimental (actual) cumulative distribution function (or summation density function, etc.) of the sample ξ, and usually abbreviated by CDF.

Example 3.4:   Using the data and results from example 3.3, we can compute the CDF of the given sample ξ by computing each C(dᵢ) as follows:

               C(d1) = P(d1) = 5/9,

               C(d2) = P(d1) + P(d2) = 5/9 + 3/9 = 8/9, and

               C(d3) = (P(d1) + P(d2)) + P(d3) = C(d2) + P(d3) = 8/9 + 1/9 = 9/9 = 1.


               Figure 3.2 illustrates the discrete CDF of the given sample ξ.

Figure 3.2

From Figure 3.2, we notice the following properties of the

CDF:

(i) the value (ordinate) of the CDF is always positive,

(ii) the CDF is a never decreasing function,

(iii) the cumulative probability C(dₘ), where dₘ is the largest dᵢ ∈ D, is always equal to 1.

Example 3.5: Using the data from example 3.2, we

can construct the CDF of the die tossing

experiment as follows:

               C(1) = P(1) = 0.14,

               C(2) = C(1) + P(2) = 0.14 + 0.17 = 0.31,

               C(3) = C(2) + P(3) = 0.31 + 0.20 = 0.51,

               C(4) = C(3) + P(4) = 0.51 + 0.18 = 0.69,

               C(5) = C(4) + P(5) = 0.69 + 0.15 = 0.84, and

               C(6) = C(5) + P(6) = 0.84 + 0.16 = 1.00.

Note again that the maximum value of the

CDF is one. The graphical representation

of the above CDF can be constructed similar

to Figure 3.2.
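The experimental PDF and CDF of a sample can be generated mechanically; a minimal Python sketch (illustrative only), applied to the sample of Example 3.3:

    from collections import Counter

    sample = [1, 2, 4, 1, 1, 2, 1, 1, 2]          # Example 3.3
    n = len(sample)

    # Experimental PDF: relative frequency of each element of the
    # definition set D, kept in ascending order of d_i.
    counts = Counter(sample)
    D = sorted(counts)
    pdf = {d: counts[d] / n for d in D}

    # Experimental CDF: running sum, C(d_i) = sum over j <= i of P(d_j).
    cdf, running = {}, 0.0
    for d in D:
        running += pdf[d]
        cdf[d] = running

    print(pdf)    # {1: 0.555..., 2: 0.333..., 4: 0.111...}
    print(cdf)    # {1: 0.555..., 2: 0.888..., 4: 1.0}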

3.1.3 Mean of a Sample

Consider the sample ξ ≡ (ξ₁, ξ₂, ..., ξₙ) with its definition set D ≡ {d1, d2, ..., dm}. The real number M defined as:

    M = (1/n) Σᵢ₌₁ⁿ ξᵢ  ∈ [d1, dm] ,          (3.3)

is called the mean (average) of the actual sample.

We can show that M equals also to:

    M = Σᵢ₌₁ᵐ dᵢ P(dᵢ) .          (3.4)

The proof of (3.4) reads as follows:

    R.H.S. = Σᵢ₌₁ᵐ dᵢ (cᵢ/n) = (1/n) Σᵢ₌₁ᵐ dᵢcᵢ = (1/n) Σᵢ₌₁ⁿ ξᵢ = M .

The mean M of a sample can be interpreted as the outcome of applying the summation operator Σ divided by n on ξ, and is often written as:

    M = E(ξ) = mean (ξ) = ave (ξ) = ξ̄ ,          (3.5)

where the symbol E (an abbreviation for the "mathematical Expectation") must be understood as another name for the summation operator Σ operating on P(dᵢ)dᵢ (and not on ξᵢ!).

properties (where k is a constant and ~ is a random sample) :

(i) E(k) = k,

(ii) E (k~) = kE (~),

(iii) E(~+k) = E(~) + k,

(iv) E(E ~j) = E E(~j), where ~j, j = 1, 2, ••. , s, ares random samples
j j
with the same number of elements m in their corresponding definition

sets Dj (Do not confuse~. with ~j~ the former is an element in the
J
latter. In other words, ~. is a single element in a sample, but
J
~j is one sample in a class of samples) ,
(v) If ~ = (~ 1 ), then E~) = ~l'
(vi) E (E(~)) = E (~) •
28

Example 3.6:   Using the random sample ξ given in example 3.3, let us compute its mean from equation (3.3) as follows:

               M = E(ξ) = (1/n) Σᵢ₌₁ⁿ ξᵢ = (1/9)(1+2+4+1+1+2+1+1+2) = 15/9 = 1 2/3 .

               Also, we can use equation (3.4), from which we get:

               M = E(ξ) = Σⱼ₌₁ᵐ dⱼ P(dⱼ) = 1 · (5/9) + 2 · (3/9) + 4 · (1/9)
                        = (1/9)(5+6+4) = 15/9 = 1 2/3 .

               Obviously, both formulae (3.3) and (3.4) give identical answers.
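Both routes to the mean are easy to verify numerically; a small Python sketch (illustrative only) for the sample of Example 3.3:

    from collections import Counter
    from fractions import Fraction

    sample = [1, 2, 4, 1, 1, 2, 1, 1, 2]
    n = len(sample)

    # Equation (3.3): M = (1/n) * sum of the sample elements.
    M_direct = Fraction(sum(sample), n)

    # Equation (3.4): M = sum over D of d_i * P(d_i), the "weighted mean".
    counts = Counter(sample)
    M_weighted = sum(Fraction(c, n) * d for d, c in counts.items())

    print(M_direct, M_weighted)        # 5/3 5/3 - both give 1 2/3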

It is interesting to note that computing the mean of a sample

using equation (3.4) is analogous to computing the centre of balance

in mechanics. This can be simply seen by considering the probabilities

P(dᵢ) or the counts cᵢ as weights, and then taking the Σ moments = 0 about any point, e.g. the origin 0 (see Figure 3.3, which uses the data from example 3.3).

Figure 3.3

The resulting distance of the centre of balance from the point is

nothing else but the sample mean M.

It is worthwhile mentioning here that, based on the above analogy

with mechanics, the mean M computed from equation {3.4) is also called the

weighted mean, in which each element dᵢ ∈ D is weighted (the concept of weights is to be discussed later in detail) by its probability P(dᵢ).

3.1.4 Variance of a Sample

Let us have again an actual sample ξ = (ξ₁, ξ₂, ..., ξₙ) with a mean M. Then, the real number S² defined as

    S² = (1/n) Σᵢ₌₁ⁿ (ξᵢ - M)² ,          (3.6)

is called the variance (dispersion) of the actual sample. The square root of the variance S², i.e. S, is known as the standard deviation of the sample.

Keeping in mind the relationship between the random sample ξ and its definition set D, we can write:

    (1/n) Σᵢ₌₁ⁿ ξᵢ² = Σⱼ₌₁ᵐ dⱼ² P(dⱼ) ,

which will provide another expression for S², namely:

    S² = Σⱼ₌₁ᵐ P(dⱼ)(dⱼ - M)² .          (3.7)

S² can be also interpreted as the outcome of the application of the operator E on (ξ - E(ξ))², meaning really P(dⱼ)(dⱼ - M)², and is often written as

    S² = E((ξ - E(ξ))²) = var (ξ) .          (3.8a)

Carrying out the prescribed operation, we get

    S² = E(ξ² - 2ξE(ξ) + E²(ξ)) .

Applying the calculus with the E operator (as summarized in section 3.1.3), we obtain:

    S² = E(ξ²) - 2E(ξ)E(ξ) + E²(ξ) = E(ξ²) - E²(ξ) .

From equation (3.5) we have E(ξ) = M; then by substituting for E(ξ) we get

    S² = E(ξ²) - M² .          (3.8b)

Consequently, the corresponding expression to equation (3.7) will be:

    S² = Σⱼ₌₁ᵐ dⱼ² P(dⱼ) - M² .          (3.9)

It is worth mentioning that, given the analogy with mechanics (as discussed in the previous section), we can regard the variance of the sample (equation (3.7)) as the moment of inertia of the system of corresponding mass points with respect to M.



Example 3.7:   Let us compute the variance S² of the sample ξ given in example 3.3, by using equation (3.8b).

               First, we compute the first term E(ξ²) as follows:

               E(ξ²) = (1/n) Σᵢ₌₁ⁿ ξᵢ² = (1/9)(1+4+16+1+1+4+1+1+4) = 33/9 .

               Substituting in equation (3.8b), and knowing that M = 1 2/3 from example 3.6, we get:

               S² = var (ξ) = 33/9 - (15/9)² = 297/81 - 225/81 = 72/81 = 8/9 ≅ 0.89 .

               Taking the square root of the computed variance, we obtain the standard deviation of the sample as:

               S = √(8/9) = (2√2)/3 = 2.828/3 ≅ 0.943 .

               The same result is obtained if we use equation (3.9); firstly we have

               Σⱼ₌₁ᵐ dⱼ² P(dⱼ) = 1 · (5/9) + 4 · (3/9) + 16 · (1/9) = (1/9)(5+12+16) = 33/9 ,

               and since M = 1 2/3, we obtain

               S² = 33/9 - (15/9)² = 8/9 ≅ 0.89 .

It should be noted here that the same value for the sample

variance can be obtained from equations (3.6) and (3.7). The verification is left to the student (e.g. using the data from the above example).

However, equation (3.9) is advantageous from the computational point of

view, especially for large samples. A similar statement holds for

computing the sample mean M using equation (3.4).
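A quick numerical confirmation of Example 3.7 (Python sketch, illustrative only):

    from collections import Counter
    from fractions import Fraction

    sample = [1, 2, 4, 1, 1, 2, 1, 1, 2]
    n = len(sample)
    M = Fraction(sum(sample), n)                       # 5/3

    # Equation (3.6): S^2 = (1/n) * sum of (xi_i - M)^2.
    S2_direct = sum((x - M) ** 2 for x in sample) / n

    # Equation (3.9): S^2 = sum of d_j^2 P(d_j) - M^2.
    counts = Counter(sample)
    S2_short = sum(Fraction(c, n) * d * d for d, c in counts.items()) - M ** 2

    print(S2_direct, S2_short, float(S2_short) ** 0.5)   # 8/9 8/9 0.9428...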

3.1.5 Other "Characteristics" of a Sample: Median and Range

The median, Med(ξ), of the sample ξ = (ξ₁, ξ₂, ..., ξₙ) is defined differently for n odd and for n even. For n odd, Med(ξ) equals the ξ that is in the middle of the ordered progression ξ, that is

    Med(ξ) = ξ_{(n+1)/2} .          (3.10)

For n even, Med(ξ) is the mean of ξ_{n/2} and ξ_{n/2+1}, that is:

    Med(ξ) = (1/2) (ξ_{n/2} + ξ_{n/2+1}) .          (3.11)

Example 3.8:   Consider the sample ξ ≡ (5, 3, 6, 4, 1, 2).

               To obtain Med(ξ), we first arrange the sample in either ascending or descending order, for instance: ξ ≡ (1, 2, 3, 4, 5, 6), n = 6. Since we have n even, we get:

               Med(ξ) = (1/2)(ξ₃ + ξ₄) = (1/2)(3 + 4) = 3.5 .

               Similarly, the ascending progression of the sample ξ given in example 3.3 is:

               ξ ≡ (1, 1, 1, 1, 1, 2, 2, 2, 4), n = 9.

               In this case n is odd, and we get:

               Med(ξ) = ξ₅ = 1 .

The range, Ra(ξ), of the sample ξ ≡ (ξᵢ, i = 1, 2, ..., n) is defined as the difference between the largest (ξ_ℓ) and the smallest (ξ_s) elements of ξ, that is:

    Ra(ξ) = ξ_ℓ - ξ_s .          (3.12)

Consequently, for an ascendingly ordered sample ξ, we get

    Ra(ξ) = ξ_n - ξ_1 .          (3.12a)

Note that the range of the sample can be also determined from its definition set D ≡ {dⱼ, j = 1, 2, ..., m}. The corresponding expressions to (3.12) and (3.12a), respectively, are:

    Ra(ξ) = Ra(D) = d_ℓ - d_s ,          (3.13)

and Ra(ξ) = Ra(D) = d_m - d_1 .          (3.13a)

Example 3.9:   From example 3.8, we have the ascendingly ordered sample: (1, 1, 1, 1, 1, 2, 2, 2, 4), n = 9, whose definition set is D ≡ {1, 2, 4}, m = 3.

               To obtain the range, we use either equation (3.12a), i.e.

               Ra(ξ) = ξ_n - ξ_1 = 4 - 1 = 3 ,

               or we use equation (3.13a), i.e.

               Ra(ξ) = Ra(D) = d_m - d_1 = 4 - 1 = 3 .

At this point, we can summarize the different characteristics of the sample ξ originated in example 3.3, as computed in the last three sections, namely: M = 1.6̄ , S² = 0.8̄ , S ≅ 0.94 , Med(ξ) = 1 and Ra(ξ) = 3.

(Note that the "bar" above the last digit means that it is a periodic number.)
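The median and range are one-liners to check; a small Python sketch (illustrative only) using the samples of Examples 3.8 and 3.9:

    from statistics import median

    xi_even = [5, 3, 6, 4, 1, 2]                 # n = 6 (even)
    xi_odd  = [1, 2, 4, 1, 1, 2, 1, 1, 2]        # n = 9 (odd), Example 3.3

    # Median: middle element for odd n, mean of the two middle ones for even n.
    print(median(xi_even))                       # 3.5
    print(median(xi_odd))                        # 1

    # Range: largest element minus smallest element (equation 3.12).
    print(max(xi_odd) - min(xi_odd))             # 3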

3.1.6 Histograms and Polygons

From now on, the number of elements n of a sample ξ will be called the size of the sample. A sample with large size n is often divided into classes (categories). Each class is a group of nᵢ individual elements (nᵢ < n). To achieve this, we usually determine the range of ξ (see section 3.1.5), and then divide the range into k intervals* by (k+1) class-boundaries (class-limits). It is usual to make the intervals equidistant. The difference between the upper and lower boundaries of a class is called the class-width. The number c of elements in each class is called the class-count (class-frequency). This process in statistics is called classification of the sample. The "box" (or rectangular) graphical representation of the classified sample is called the histogram of the sample.

Example 3.10:  Let us have the following random sample ξ:

               ξ ≡ (17, 3, 2, 8, 1, 5, 2, 4, 6, 15, 8, 9, 2, 3, 10, 9,
                    11, 12, 4, 5, 8, 6, 7, 4, 5),  n = 25.

* The interval from a to b is either:
      open,         denoted by  (a, b) = {x : a < x < b} ,
      closed,           "   "   [a, b] = {x : a ≤ x ≤ b} ,
      open-closed,      "   "   (a, b] = {x : a < x ≤ b} ,
      closed-open,      "   "   [a, b) = {x : a ≤ x < b} .
To reconcile this known notation with the terminology of the theory of sets, it has to be understood that any such interval can be regarded as a set. To distinguish such a set from a point set, we shall call it a compact set.

               First, we compute the range of ξ using equation (3.12), i.e.

               Ra(ξ) = ξ_ℓ - ξ_s = 17 - 1 = 16.

               Let us use four intervals:

               [1,5], (5,9], (9,13] and (13,17].

               Hence, the class-counts will be:

               c1([1,5]) = 12,  c2((5,9]) = 8,
               c3((9,13]) = 3  and  c4((13,17]) = 2.

               The histogram of the given sample in this example is shown in Figure 3.4, in which the horizontal axis represents the class boundaries and the ordinates represent the class-frequencies cᵢ (see the left-hand scale).

Figure 3.4

Note in the above figure that a rectangle is drawn over each interval with constant height equal to the corresponding class-count.
It is usually required that the area of, or under, the histogram has to be equal to one. Assume that we have k classes with corresponding class-counts cᵢ such that Σᵢ₌₁ᵏ cᵢ = n. Let us denote the class-width, assumed to be constant, by Δ. Hence, the area a of the histogram is given by:

    a = Δ(c1 + c2 + ... + cₖ) = Δ Σᵢ₌₁ᵏ cᵢ = Δn.

This means that the area under the histogram equals the class-width multiplied by the size of the sample.

Therefore, to make the area of the histogram equal to one, we simply have to divide each ordinate cᵢ by the quantity nΔ. The new (transformed) ordinate c̃ᵢ is also called the relative count (compare this to the relative count mentioned in section 3.1.2, which represents the experimental probability of an individual element; however, here we are dealing with counts in an interval).

Example 3.11:  Using the data from example 3.10, we have:
               n = 25 and Δ = 4. The quantity nΔ = 25 · 4 = 100.

               Hence, to compute the relative counts c̃ᵢ of the classified sample ξ, we divide each ordinate cᵢ (obtained in example 3.10) by 100. This gives us:

               c̃1 = 12/100 = 0.12,  c̃2 = 8/100 = 0.08,
               c̃3 = 3/100 = 0.03  and  c̃4 = 2/100 = 0.02.

               The histogram of the sample in this case will be the same as in example 3.10, with the only difference that the ordinate scale is going to be changed (see Figure 3.4, the right-hand scale).

               Using the relative counts c̃ᵢ, the area "a" under the histogram equals to one, as we can see from the following computation (using Figure 3.4):

               a = 4 · 0.12 + 4 · 0.08 + 4 · 0.03 + 4 · 0.02
                 = 0.48 + 0.32 + 0.12 + 0.08 = 1.0,

               which may be used as a check on the correctness of computing c̃ᵢ.

Let us denote the largest and the smallest abscissas of a histogram by ℓ and s, respectively (e.g. in Figure 3.4, ℓ = 17 and s = 1). Notice that for any subinterval D' = [a, b] of the interval [s, ℓ], we can compute the area a(D') under the histogram. This a(D') will be given as a real number from [0, 1]. Hence, a can be regarded as a function mapping any subinterval of [s, ℓ] onto [0, 1]. Therefore, it is easy to see that a can be considered as a probability function (see section 2.1), more specifically one of the possible probability distribution functions (PDF's) of the sample. Obviously, such a PDF (i.e. a) depends on the particular accepted classification of the sample.

From the above discussion, we find that the probability of any subinterval of [s, ℓ] is represented by the corresponding area under the histogram. On the other hand, the ordinates of the histogram do not represent probabilities (again compare the histogram with the bar diagram given in section 3.1.2).

Example 3.12:  Referring to Figure 3.4, we may ask: what is the probability of D' = [6, 10]; or, what is the probability that the sample element, say x, lies between 6 and 10. This can be written as:

               P(6 ≤ x ≤ 10) = ?

               The answer will be given by the area under the histogram between 6 and 10 (which is shaded in Figure 3.4), i.e.

               P(6 ≤ x ≤ 10) = P(6 ≤ x ≤ 9) + P(9 < x ≤ 10)
                             = (9-6) · 0.08 + (10-9) · 0.03
                             = 3 · 0.08 + 1 · 0.03 = 0.24 + 0.03
                             = 0.27 .

               On the other hand, by inspecting the actual sample ξ originated in example 3.10, we find out that the actual number of elements in the interval [6, 10] is nine. This number represents (9/25) · 100% = 36% of the sample. Or, we say that the actual probability P(6 ≤ x ≤ 10) = 0.36, which does not agree precisely with the result obtained when using the corresponding histogram (i.e. 0.27).
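The classification of Example 3.10 and the interval probability of Example 3.12 can be reproduced with a few lines of Python (a sketch for illustration; the class boundaries are the ones chosen in the text):

    sample = [17, 3, 2, 8, 1, 5, 2, 4, 6, 15, 8, 9, 2, 3, 10, 9,
              11, 12, 4, 5, 8, 6, 7, 4, 5]
    n = len(sample)
    bounds = [1, 5, 9, 13, 17]          # four equidistant classes
    width = 4                           # class-width (Delta)

    # Class-counts: first class closed [1,5], the rest open-closed (lo, hi].
    counts = []
    for i, (lo, hi) in enumerate(zip(bounds[:-1], bounds[1:])):
        if i == 0:
            counts.append(sum(lo <= x <= hi for x in sample))
        else:
            counts.append(sum(lo < x <= hi for x in sample))
    print(counts)                                     # [12, 8, 3, 2]

    # Relative counts c_i / (n * width); the histogram area is then 1.
    rel = [c / (n * width) for c in counts]
    print(rel)                                        # [0.12, 0.08, 0.03, 0.02]
    print(sum(width * r for r in rel))                # 1.0

    # Probability of [6, 10] from the histogram, as in Example 3.12.
    print(3 * rel[1] + 1 * rel[2])                    # about 0.27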



The difference between the actual probability and the computed

probability using the histogram, as experienced in example 3.12, is

largely dependent on the chosen classification of the sample (selection

of the class-intervals). Usually, one gets a smaller difference (better

agreement) by selecting the class-boundaries so as not to coincide with

any of the elements of the given sample. The construction of histograms

can be considered a subject in its own right. We are not going to

venture into this subject any deeper.

Example 3.13:  If we, for instance, use the following classification (for the sample ξ given in example 3.10): [0.7, 4.8], [4.8, 8.9], [8.9, 13] and [13, 17.1], i.e. we have again four equal intervals, for which Δ = 4.1, then we get the class-counts as c1 = 9, c2 = 9, c3 = 5 and c4 = 2. The quantity nΔ = 25 · 4.1 = 102.5. Hence, the relative counts are:

               c̃1 = c̃2 = 9/102.5 ≅ 0.0878,
               c̃3 = 5/102.5 ≅ 0.0488, and
               c̃4 = 2/102.5 ≅ 0.0195.

               In this case, the new histogram of the sample ξ is shown in Figure 3.5.

Figure 3.5

               The probability P(6 ≤ x ≤ 10) is computed as follows (shaded area in Figure 3.5):

               P(6 ≤ x ≤ 10) = 2.9 · 0.0878 + 1.1 · 0.0488
                             ≅ 0.2546 + 0.0537
                             = 0.3083 ≅ 0.31,

               which gives a better agreement with the actual probability than the classification used in example 3.11.

The graphical representation of a histogram, which uses the central point of each box (class-midpoint) and its ordinate (the corresponding relative class-count), is called a polygon.

In order to make the total area under the polygon equal to one we have to add one more class interval on each side (tail) of the corresponding histogram. The midpoints s' and ℓ' of these two, lower and upper, tail intervals are used to close the polygon.

Therefore, it can be easily seen that the area a' under the polygon has again the properties of probability. This means that a' is one of the possible PDF's of the sample. Hence a' can be used for determining the probability of any D' = [a, b] ⊂ [s', ℓ']. Note also here that the ordinates of the polygon do not represent probabilities.

Example 3.14:  The polygon corresponding to the histogram of Figure 3.4 is illustrated in Figure 3.6.

Figure 3.6

Similar to the histogram, the area"a"under

the polygon should be equal to one. To show

that this is the case, we compute"a"using

Figure 3.6 as:

a = '4' (1.2
' • 0.12 + ~
1 ( 0.12 + 0.08 ) ~

+1. (0.08 + 0.03) +


2
~ (0.03 + 0.02) +
1
+ 2 • 0.02}

= 2(0.12 + 0.20 + 0.11 + 0.05 + 0.02)

= 2 (0.50) = 1.00.
Let us compute the probability P(6~x~l0)

using the polygon (the required probability

is represented by the shaded area in Figure

3.6}. To achieve this, we first have to

interpolate the ordinates corresponding to 6

and 10, which are found to be 0. 090 and

0.0425, respectively. Therefore, the

required probability is:

P(6~x~l0) = P(6~x~7) + P(7~x~l0)


1 1
= 1.2(0.09+0.08)+3.~0.08+0.0425)
= 1.0.085 + 3.0.06125
= 0.085 + 0.184; 0.27,
which is the same as the value

obtained when using the corresponding histogram.


43

So far, we have constructed the histogram and the polygon

corresponding to the PDF of a sample. Completely analogously, we may

construct the histogram and the polygon corresponding to the CDF of the

sample which will be respectively called the cumulative histogram and the

cumulative polygon. In this case, we will use a modified form of equation

(3.2), namely

    C(a) = P(x ≤ a) = Σ_{xᵢ ≤ a} P(xᵢ₋₁ < x ≤ xᵢ) .          (3.14)

Example 3.15:  Let us plot the cumulative histogram and cumulative polygon of the sample ξ used in the examples of this section.

               For the cumulative histogram, we get the following by using Figure 3.4:

               C(1) = P(1) = 0 (remember that the probability of individual elements from the histogram or polygon is always zero),

               C(5) = P[1,5] = 4 · 0.12 = 0.48,

               C(9) = C(5) + P(5,9] = 0.48 + 4 · 0.08 = 0.48 + 0.32 = 0.80,

               C(13) = C(9) + P(9,13] = 0.80 + 4 · 0.03 = 0.80 + 0.12 = 0.92,

               C(17) = C(13) + P(13,17] = 0.92 + 4 · 0.02 = 0.92 + 0.08 = 1.00.

               Figure 3.7 is a plot of the above computed cumulative histogram.

Figure 3.7

               For the cumulative polygon, we get the following by using Figure 3.6:

               C(-1) = 0,

               C(3) = P[-1,3] = (1/2)(4 · 0.12) = (1/2)(0.48) = 0.24,

               C(7) = C(3) + P(3,7] = 0.24 + (1/2) · 4(0.12 + 0.08) = 0.24 + 0.40 = 0.64,

               C(11) = C(7) + P(7,11] = 0.64 + (1/2) · 4(0.08 + 0.03) = 0.64 + 0.22 = 0.86,

               C(15) = C(11) + P(11,15] = 0.86 + (1/2) · 4(0.03 + 0.02) = 0.86 + 0.10 = 0.96,

               C(19) = C(15) + P(15,19] = 0.96 + (1/2) · 4 · 0.02 = 0.96 + 0.04 = 1.00.

               Figure 3.8 is a plot of the above computed cumulative polygon (note here, as well as in Figure 3.7, the properties of the CDF mentioned in example 3.4).

Figure 3.8

By examining Figures 3.7 and 3.8, we can see that the cumulative

polygon uses the central point of each class-interval along with its

ordinate from the corresponding cumulative histogram. Therefore, the

relationship between the cumulative polygon and its corresponding

cumulative histogram is exactly the same as the relationship between the

polygon and its corresponding histogram.

Because of the nature of the CDF, we can see that the cumulative

probability - represented by an area under the PDF extending to the left-

most point - is represented just by an ordinate of the cumulative histogram

or the cumulative polygon. Hence the cumulative histogram or the cumulative

polygon can be used to determine the probability P[a,b], a<b, simply by

subtracting the ordinate corresponding to a from the one corresponding to b.



Example 3.16: Let us compute the probability P[6,10] by

using:

(i) the cumulative histogram of Figure 3.7,

(ii) the cumulative polygon of Figure 3.8.

First, we get the following by using Figure 3.7:

The interpolated ordinates corresponding to 6

and 10 are found to be 0.56 and 0.83, respect-

ively. Therefore, P[6,10] = P(6 ≤ x ≤ 10) = 0.83 - 0.56 = 0.27, which is the same value as the one obtained when using the histogram (example 3.12).

Second, we get the following by using Figure 3.8:

The interpolated ordinates corresponding to 6

and 10 are found to be 0.54 and 0.805,

respectively. Therefore:

P[6,10] = P(6 ≤ x ≤ 10) = 0.805 - 0.54 ≅ 0.27 ,

which is again the same value as the one

obtained when using the polygon (example 3.14).
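The subtraction rule of Example 3.16 is straightforward to script; a Python sketch (illustrative only), using the cumulative-histogram ordinates computed in Example 3.15 and linear interpolation between the class boundaries:

    # Cumulative histogram of Example 3.15: C at the class boundaries.
    bounds = [1, 5, 9, 13, 17]
    C = [0.0, 0.48, 0.80, 0.92, 1.00]

    def C_interp(x):
        # Linear interpolation of the cumulative histogram between boundaries.
        for lo, hi, c_lo, c_hi in zip(bounds[:-1], bounds[1:], C[:-1], C[1:]):
            if lo <= x <= hi:
                return c_lo + (x - lo) / (hi - lo) * (c_hi - c_lo)
        raise ValueError("x outside [1, 17]")

    # P[6, 10] = C(10) - C(6), as in Example 3.16.
    print(round(C_interp(6), 2), round(C_interp(10), 2))   # 0.56 0.83
    print(round(C_interp(10) - C_interp(6), 2))            # 0.27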

To close this section, we should point out that both the histograms and the polygons (non-cumulative as well as cumulative) can be refined by refining the classification of the sample. Note that this

refinement makes the diagrams look smoother.



3.2 Statistics of a Random Variable

3.2.1 Random (Stochastic) Function and Random (Stochastic) Variable

In order to be able to solve the problems connected with inter-

val probabilities (see the histograms and polygons of section 3.1.6) more

easily and readily, the science of statistics has developed a more con-

venient approach. This approach is based on the replacement of the

troublesome numerical functions defined on the discrete definition set

of a random sample, by more suitable functions. To do so, we first

define two idealizations of the real world: the random (stochastic)

function and the random (stochastic) variable.

A random or stochastic function is defined as a function x mapping an unknown set U* into R, that is

    x ∈ {U → R} .

(Later on, concepts of multi-valued x ∈ {U → Rᵐ} (where Rᵐ is the Cartesian m-power of R, see section 1.3) are developed.)

This statement is to be understood as follows: For any value of the argument u ∈ U, the stochastic function x assumes a value x(u) ∈ R. But, because the set U is considered unknown, there is no way any formula for x can be written and we have to resort to the following "abstract experiment" to show that the concept of random functions can be used.

* Note that in experimental sciences the set U may be fully or at least partly known. The science of statistics, however, assumes that it is either not known, or works with the unknown part of it only.
Suppose that the function x is realised by a device or a process (see the sketch) that produces a functional value x(u) every time we trigger it.

    u  →  [ x ]  →  x(u)

Knowing nothing about the inner workings of the process, all we can do is to record the outcomes x(u). When a large enough number of values x(u) have been recorded, we can plot a histogram showing the relative count of the x(u) values within any interval [x₀, x₁]. In this abstraction we can imagine that we have collected enough values to be able to compute the relative counts for any arbitrarily small interval dx and thus obtain a "smooth histogram". Denoting the limit of the relative count divided by the width dx of the interval [x, x + dx], for dx going to zero, by φ(x), we end up with a function φ that maps x ∈ R into R.

Going now back to the realm of mathematics, we see that the outcome of the stochastic function can be viewed as a pair (x(u), φ(x)). This pair is known as the random (stochastic) variable. It is usual in literature to refer just to the values x(u) as random variable with the tacit understanding that the function φ is also known.

We note that the function φ is thus defined over the whole set of real numbers R and has either positive or zero values, i.e. φ is non-negative on all R. Further, we shall restrict ourselves to only such φ that are integrable on R in the Riemannian sense, i.e. are at least piece-wise continuous on R.

3.2.2 PDF and CDF of a Random Variable

The function φ described in 3.2.1, belonging to the random variable x, is called the probability distribution function (PDF) of the random variable. It can be regarded as equivalent to the experimental

PDF (see 3.1.2) of a random sample. From our abstract experiment it

can be seen that

    ∫_{-∞}^{∞} φ(x) dx = 1          (3.15)

since the area under the "smooth histogram" must again equal to 1 (see 3.1.6). This is the third property of a PDF, the integrability and non-negativeness being the first two. We note that eq. 3.15 is also the necessary condition for φ(x) dx to be called probability (see 2.1).

Figure 3.9 shows an example of one such PDF, i.e. φ, in which the integral (3.15) is illustrated by the shaded area under the φ.

Figure 3.9

The definite integral of the PDF, φ, over an interval D' ⊂ D is called the probability of D'. So, we have in particular:

    ∫_{-∞}^{x₀} φ(x) dx = P(x ≤ x₀) ∈ [0, 1] ,          (3.16a)

    ∫_{x₀}^{∞} φ(x) dx = P(x ≥ x₀) ∈ [0, 1] ,          (3.16b)

    ∫_{x₁}^{x₂} φ(x) dx = P(x₁ ≤ x ≤ x₂) ∈ [0, 1] .          (3.16c)

Consequently,

    P(x ≥ x₀) = 1 - P(x ≤ x₀) .          (3.17)
The integrals (3.16a), (3.16b) and (3.16c) are represented by

the corresponding shaded areas in Figure 3.10: a, b, and c, respectively.

Figure 3.10

At this point, the difference between discrete and compact probability spaces should be again borne in mind. In the discrete space, the value of the PDF at any point, which is an element of the discrete definition set of the sample, can be interpreted as a probability (section 3.1.2). However, in the compact space, it is only the area under the PDF that has got the properties of probability.* We have already met this problem when dealing with histograms.

* The whole development for the discrete and the compact spaces could be made identical using either Dirac's functions or a more general definition of the integral.
Note further that:

    P(x = x₀) = ∫_{x₀}^{x₀} φ(x) dx = 0 *) .

Analogous to section 3.1.2, the function Ψ defined as

    Ψ(x) = ∫_{−∞}^{x} φ(y) dy   ∈ {R → [0, 1]}                 (3.18)

where y ∈ R is a dummy variable in the integration, is called a CDF
provided that φ is a PDF. Ψ is again a non-negative, never decreasing
function, and determines the probability P(x ≤ x₀) (compare this
with section 3.1.2); namely:

    Ψ(x₀) = P(x ≤ x₀) ∈ [0, 1] .                               (3.19)

Figure 3.11 shows how the CDF Ψ (corresponding to the PDF in
Figure 3.9) would look.

[Figure 3.11: the CDF Ψ(x), rising from 0 towards Ψ(x) = 1 as x → +∞.]

* This may not be the case for a more general definition of the integral,
or for φ being the Dirac function.

If φ is symmetrical, Ψ will be "inversely symmetrical" around the
axis Ψ(x) = 1/2. Figure 3.12 is an example of such a case.

[Figure 3.12: a symmetrical (bell-shaped) PDF φ(x) and its CDF Ψ(x), which approaches Ψ(x) = 1.]

Note that Ψ is the primitive function of φ, since we can write:

    φ(x) = dΨ(x)/dx .

In addition, we can see that φ(x) has to vanish in the infinities in order
to satisfy the basic condition:

    ∫_{−∞}^{+∞} φ(x) dx = 1 .

Hence, we have:

    lim_{x→−∞} Ψ(x) = 0 ,      lim_{x→+∞} Ψ(x) = 1 .
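To make the relation between φ and Ψ concrete, the following is a minimal
numerical sketch: a PDF is integrated cumulatively to obtain its CDF, and a
probability of the type (3.16c) is read off it. The Gaussian shape of φ and the
numerical values are assumed purely for illustration.

```python
import numpy as np

def phi(x, mu=0.0, sigma=1.0):
    # an assumed symmetrical, bell-shaped PDF used only as an illustration
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-8, 8, 20001)
pdf = phi(x)
cdf = np.cumsum(pdf) * (x[1] - x[0])        # Psi(x): running integral of phi

print(cdf[-1])                              # ~1, the condition (3.15)
# P(-1 <= x <= 1) = Psi(1) - Psi(-1), cf. (3.16c) and (3.19)
print(np.interp(1.0, x, cdf) - np.interp(-1.0, x, cdf))
```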

3.2.3 Mean and Variance of a Random Variable

It is conceivable that the concept of a random variable is useless

if we do not know (or assume) its PDF. On the other hand, we do not have the

one-to-one relation between the random variable and its PDF as we had with
52

'
the random samples (section 3.1.1 and 3.1.2). The random variable acts

only as an argument for the PDF.

The random variable can be thus regarded as an argument of the

function called PDF, that runs from minus infinity to plus infinity.

Therefore, strictly speaking, we cannot talk about the "mean" and the

"variance" of a random variable, in the same sense as we have talked

about the "mean" and the "var-iance" of a random sample. On the other

hand, we can talk about the value of the argument of the centre of gravity

of the area under the PDF. Similarly, we can define the variance related

to the PDF. It has to be stated, however, that it is a common practice

to talk about the mean and the variance of the random variable; and

this is what we shall do here as well.

The mean μ of the random variable x is defined as:

    μ = ∫_{−∞}^{+∞} x φ(x) dx .                                (3.20)

Note the analogy of (3.20) with equation (3.4), section 3.1.3.

μ is often written again in terms of an operator E*†; usually
we write

    E*(x) = μ = ∫_{−∞}^{+∞} x φ(x) dx .                        (3.21)

† E* is again an abbreviation for the mathematical expectation, similar
to the operator E mentioned in section 3.1.3. However, we use the
"asterisk" here to distinguish between both summation procedures, namely:
E implies the summation using Σ; and E* implies the summation using ∫.
53

We can see that the argument in the operator E* is x·φ(x) rather than x,
x being just a dummy variable in the integration. However, we shall
again use the customary notation to conform with the existing literature.

We have again the following properties of E*, where k is a
constant:

    (i)   E*(kx) = k E*(x);

    (ii)  E*(x¹ + x² + ... + xʳ) = E*(x¹) + E*(x²) + ... + E*(xʳ), where x¹, x², ..., xʳ
          are r different "random variables", i.e., r random
          variables with appropriate PDF's;

    (iii) and we also define:

          E*(E*(x)) = E*(x) = μ ††).


The variance σ² of a random variable x with mean μ is defined as:

    σ² = ∫_{−∞}^{+∞} (x − μ)² φ(x) dx .                        (3.22)

Note the analogy of (3.22) with equation (3.8), section 3.1.4. The
square root of σ², i.e. σ, is again called the standard deviation of the
random variable.

Carrying out the operation prescribed in (3.22) we get:

    σ² = ∫_{−∞}^{+∞} [x² φ(x) − 2xμ φ(x) + μ² φ(x)] dx

       = ∫_{−∞}^{+∞} x² φ(x) dx − 2μ ∫_{−∞}^{+∞} x φ(x) dx + μ² ∫_{−∞}^{+∞} φ(x) dx .

†† In order to prove this equation, one has to again use the Dirac
function as the PDF of E*(x).
54

In the above equation, we know that the integral in the second
term equals μ (equation (3.20)), and the integral in the last term equals
one (equation (3.15)). Therefore, by substituting we get:

    σ² = ∫_{−∞}^{+∞} x² φ(x) dx − μ² .                         (3.23)

Note the similarity of the first term in equation (3.23) with
E(ξ²) = Σ_{j=1}^{m} d_j² P(d_j) (section 3.1.4). This gives rise to an often used
notation:

    σ² = E*(x²) − μ² .                                         (3.23a)

We shall again accept this notation as used in the literature, bearing in
mind that E* is not operating on the argument, but on the product of the
argument with its PDF.

The expression

    m_r = ∫_{−∞}^{+∞} x^r φ(x) dx                              (3.24)

is usually called the r-th moment of the PDF (random variable); more
precisely, the r-th moment of the PDF about zero. On the other hand,
the r-th central moment of the PDF is given by:

    m'_r = ∫_{−∞}^{+∞} (x − μ)^r φ(x) dx .                     (3.25)

By inspecting the above expressions for m_r and m'_r along with
equations (3.20) and (3.22), we can see that:

    μ = m₁                                                     (3.26a)

and

    σ² = m'₂ = m₂ − μ² = m₂ − m₁² .                            (3.26b)

Compare the above results (3.26a, b) with the analogy to mechanics men-
tioned in sections 3.1.3 and 3.1.4.
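The moments (3.20), (3.22) and (3.24) lend themselves directly to numerical
integration. The following is a minimal sketch, with an assumed Gaussian-shaped
PDF and assumed μ and σ values, verifying (3.26a) and (3.26b).

```python
from scipy.integrate import quad
import numpy as np

def phi(x, mu=2.0, sigma=0.5):           # assumed illustrative PDF
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

moment = lambda r: quad(lambda x: x**r * phi(x), -np.inf, np.inf)[0]

m1 = moment(1)                           # mu = m_1            (3.26a)
m2 = moment(2)
print(m1, m2 - m1**2)                    # sigma^2 = m_2 - m_1^2  (3.26b)
```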

3.2.4 Basic Postulate (Hypothesis) of Statistics, Testing

The basic postulate of statistics is that "any random sample has

got a parent random variable". This parent random variable x~R is usually

called population and is considered to be infinite. It is common in stat-

istics to postulate the PDF of the population for any random sample, and

call it the postulated, or the underlying PDF. Such a postulate may be

hence tested for statistical validity.

In order to be able to test the statistical validity we have to

assume that the sample can be regarded as having been picked out, or drawn

from the . population, each element of the sample independently from the rest.

~his additional property of a sample is required by the standard definition

of a random sample as used in statistical literature. However, since the

present Introduction does not deal with statistical testing we shall keep

using our original, more general definition.

There are infinitely many families of PDF's. Every such family is

defined by one or more independent parameters, whose values characterize the

shape of its PDF. The individual members of a family vary according to the

values of these parameters. It is common to use, if possible, the mean and the
standard deviation as the PDF's parameters. The fewer parameters the family of
PDF's contains, the better: the easier it is to work with.

The usual technique is that we first select the "appropriate"
family of PDF's on the basis of experience and then try to find such values
of its parameters that would fit the actual random sample the best. In
other words, the shape of the postulated φ(x) is chosen first; then, its
parameters are computed using some of the known techniques.

Since we shall be dealing with the samples and the random var-

iables (populations) at the same time, we shall use, throughout these notes,

the latin letters for the sample characteristics, and the corresponding greek

letters for the population characteristics as we have done so far.

3.2.5 Two Examples of a Random Variable

Example 3.17: As the first example, let us investigate a random variable x
with a rectangular (uniform) PDF which is symmetrical
around a value x = k. Let the probability
of x < k−q and x > k+q be zero. Obviously, this PDF has the
following analytical form (see Figure 3.13):

[Figure 3.13: the rectangular PDF, with its axis of symmetry at x = k and
P(k−q ≤ x ≤ k+q) = 1.]

    φ(x) =  h, for (k − q < x < k + q)
            0, for (x < k − q) and (x > k + q).

This can be written in an abbreviated form as:

    φ(x) =  h, for (|x − k| ≤ q)
            0, for (|x − k| > q).

The above φ contains apparently three parameters k, q and h.
However, only two are independent, since one can be eliminated
from the condition (3.15), i.e.:

    ∫_{−∞}^{+∞} φ(x) dx = 1

that must be satisfied for any φ to be a PDF. Let us
eliminate for instance the parameter h. We can write:

    ∫_{−∞}^{+∞} φ(x) dx = ∫_{−∞}^{k−q} φ(x) dx + ∫_{k−q}^{k+q} φ(x) dx + ∫_{k+q}^{+∞} φ(x) dx

                        = 0 + ∫_{k−q}^{k+q} h dx + 0

                        = h ∫_{k−q}^{k+q} dx = h [x]_{k−q}^{k+q} = 2hq = 1.

This means that h = 1/(2q), and therefore:

    φ(x) =  1/(2q), for (|x − k| ≤ q)
            0,      for (|x − k| > q).

The corresponding CDF to the above φ is:

    Ψ(x) =  0, for (x ≤ k − q)

            ∫_{−∞}^{x} φ(x) dx = (1/(2q)) ∫_{k−q}^{x} dx = (1/(2q))(x − k + q), for (|x − k| ≤ q)

            1, for (x ≥ k + q),

and is shown in Figure 3.14.

[Figure 3.14: the CDF Ψ(x) of the rectangular PDF, rising linearly from 0 at
x = k − q through 0.5 at x = k to 1.0 at x = k + q.]

From the above figure we see that the function Ψ is linear in the
interval over which φ ≠ 0, and is constant everywhere else. Note
that:

    φ(x) = dΨ(x)/dx .

The mean of the given PDF is computed from equation (3.20) as
follows:

    μ = ∫_{−∞}^{+∞} x φ(x) dx = (1/(2q)) ∫_{k−q}^{k+q} x dx = (1/(2q)) [x²/2]_{k−q}^{k+q}

      = (1/(4q)) (k² + 2kq + q² − k² + 2kq − q²)

      = 4kq/(4q) = k.
This result satisfies our expectation for a symmetrical function
around k. The variance of the given PDF can be obtained from
equation (3.22), yielding

    σ² = ∫_{−∞}^{+∞} (x − k)² φ(x) dx = (1/(2q)) ∫_{k−q}^{k+q} (x − k)² dx

       = (1/(2q)) ∫_{k−q}^{k+q} x² dx − (2k/(2q)) ∫_{k−q}^{k+q} x dx + (k²/(2q)) ∫_{k−q}^{k+q} dx

       = (1/(2q)) [x³/3]_{k−q}^{k+q} − 2k² + k²

       = k² + q²/3 − k² = q²/3 .

Since k = μ and q = √3 σ, then h = 1/(2q) = 1/(2√3 σ), and we can
express the given rectangular PDF, which we will denote by
R, in terms of its mean μ and its standard deviation σ as
follows:

    R(μ, σ; x) = φ(x) =  1/(2√3 σ), for (|x − μ| ≤ √3 σ)
                         0,         for (|x − μ| > √3 σ).

Similarly, we can express its corresponding CDF, which we will
denote by R_c, in terms of μ and σ, as follows:

    R_c(μ, σ; x) =  0, for (x ≤ μ − √3 σ)

                    (1/(2√3 σ))(x − μ + √3 σ), for (|x − μ| ≤ √3 σ)

                    1, for (x ≥ μ + √3 σ).
Assume that we would like to compute the probability of
x ∈ [μ−σ, μ+σ], where x has the rectangular PDF.
This can be done by using equation (3.16c) and Figure 3.15, as
follows:

[Figure 3.15: the rectangular PDF between (μ − √3σ) and (μ + √3σ), with the
area between (μ − σ) and (μ + σ) shaded.]

    P(μ−σ ≤ x ≤ μ+σ) = ∫_{μ−σ}^{μ+σ} φ(x) dx

                     = ∫_{μ−σ}^{μ+σ} 1/(2√3 σ) dx = (1/(2√3 σ)) [2σ]

                     = 2σ/(2√3 σ) = 1/√3 ≈ 0.577 ≈ 0.58 .

The above probability is given by the shaded area in Figure
3.15.

Similarly, for this particular uniform PDF, we find that:

    P(μ−2σ ≤ x ≤ μ+2σ) = P(μ−3σ ≤ x ≤ μ+3σ) = 1.0 .

In statistical testing, we often need to compute the moments
of the PDF (see section 3.2.3). Let us, for instance,
compute the third moment m₃ about zero of the rectangular
PDF. We will use equation (3.24), i.e.

    m₃ = ∫_{−∞}^{+∞} x³ φ(x) dx = (1/(2√3 σ)) ∫_{μ−√3σ}^{μ+√3σ} x³ dx

       = (1/(2√3 σ)) (2√3 σ μ³ + 6√3 σ³ μ)

       = μ³ + 3σ²μ .
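The results of this example are easy to verify numerically. The following is a
minimal sketch, with assumed values of μ and σ, checking the mean, the variance,
P(μ−σ ≤ x ≤ μ+σ) and the third moment of the rectangular PDF.

```python
from scipy.integrate import quad
import numpy as np

mu, sigma = 2.0, 0.5                      # assumed illustrative values
q = np.sqrt(3.0) * sigma                  # half-width of R(mu, sigma; x)

R = lambda x: 1.0 / (2 * q) if abs(x - mu) <= q else 0.0

print(quad(lambda x: x * R(x), mu - q, mu + q)[0])              # ~mu
print(quad(lambda x: (x - mu)**2 * R(x), mu - q, mu + q)[0])    # ~q**2/3 = sigma**2
print(quad(R, mu - sigma, mu + sigma)[0])                       # ~1/sqrt(3) = 0.577
print(quad(lambda x: x**3 * R(x), mu - q, mu + q)[0],
      mu**3 + 3 * sigma**2 * mu)                                # m3 both ways
```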

Example 3.18: As a second example, let us investigate a random variable with
a triangular PDF, which is symmetrical around x = k. Let us
assume that the probability of x < k − q and x > k + q be
zero. We may write (see Figure 3.16):

[Figure 3.16: the triangular PDF, rising linearly from 0 at x = k − q to its
maximum h at x = k and falling back to 0 at x = k + q.]

    φ(x) =  0,                  for (x < k − q)
            (h/q)(x − k + q),   for (k − q ≤ x ≤ k)
            (h/q)(k − x + q),   for (k ≤ x ≤ k + q)
            0,                  for (x > k + q).

This can be rewritten in the following abbreviated form as:

    φ(x) =  (h/q)(q − |x − k|), for (|x − k| < q)
            0,                  for (|x − k| ≥ q).

From the above, we can see that the triangular PDF has the
same parameters (k, q, h) as the uniform PDF of example 3.17.
Let us again eliminate the parameter h from the condition
∫_{−∞}^{+∞} φ(x) dx = 1. This integral is nothing else but the area
of the triangle, so that we can write: (1/2) · 2q · h = qh = 1.
This gives us: h = 1/q, and hence,

    φ(x) =  1/q − |x − k|/q²,  for (|x − k| < q)
            0,                 for (|x − k| ≥ q).

The computations of the mean and the variance of the triangular
PDF can be performed by following the same procedure as we have
done for the rectangular PDF in example 3.17. We state here
the results without proof, and the verification is left to the
student.

The mean μ of the given triangular PDF equals k, and the
variance σ² comes out as q²/6.


Since k = μ and q = √6 σ, we can again express the tri-
angular PDF, which we will denote by T, in terms of its mean
μ and its standard deviation σ, as follows:

    T(μ, σ; x) = φ(x) =  1/(√6 σ) − |x − μ|/(6σ²),  for (|x − μ| ≤ √6 σ)
                         0,                          for (|x − μ| > √6 σ).

The corresponding CDF is given by:

    Ψ(x) =  0, for (x ≤ μ − q)

            ∫_{μ−q}^{x} (1/q − |x − μ|/q²) dx, for (|x − μ| ≤ q)

            1, for (x ≥ μ + q),

and is shown in Figure 3.17.

[Figure 3.17: the CDF Ψ(x) of the triangular PDF, with Ψ(μ) = 0.5 and Ψ(x) → 1
at x = μ + q.]

The integral in the above equation can be rewritten as:

    ∫_{μ−q}^{x} (1/q − |x − μ|/q²) dx =

        ∫_{μ−q}^{x} (q + x − μ)/q² dx,                                for (x ≤ μ)

        ∫_{μ−q}^{μ} (q + x − μ)/q² dx + ∫_{μ}^{x} (q − x + μ)/q² dx,  for (x ≥ μ),

and we get:

    ∫_{μ−q}^{x} (q + x − μ)/q² dx = (1/q²) { (1/2)(x² − μ² + 2μq − q²) + (q − μ)(x − μ + q) }

                                  = (1/(2q²)) { x² − 2μx + μ² + 2q(x − μ) + q² }

                                  = (x − μ)²/(2q²) + (x − μ)/q + 1/2 .

Similarly,

    ∫_{μ}^{x} (q − x + μ)/q² dx = −(x − μ)²/(2q²) + (x − μ)/q ,

and

    ∫_{μ−q}^{μ} (q + x − μ)/q² dx = 1/2 .

Finally, we can express the CDF, which we are going
to denote by T_c, in terms of the mean μ and the
standard deviation σ, as follows:
    T_c(μ, σ; x) =  0,  for (x ≤ μ − √6 σ)

                    (x − μ)²/(12σ²) + (x − μ)/(√6 σ) + 1/2,   for (μ − √6 σ ≤ x ≤ μ)

                    −(x − μ)²/(12σ²) + (x − μ)/(√6 σ) + 1/2,  for (μ ≤ x ≤ μ + √6 σ)

                    1,  for (x ≥ μ + √6 σ).

By following the same procedure as in example 3.17, we
can compute the probabilities P(μ−σ ≤ x ≤ μ+σ), P(μ−2σ ≤ x ≤ μ+2σ)
and P(μ−3σ ≤ x ≤ μ+3σ), as well as the third moment m₃ about zero, for
the triangular PDF. Again, we give here the results, and the verification
is left to the student:

    P(μ−σ ≤ x ≤ μ+σ) ≈ 0.65 ,

    P(μ−2σ ≤ x ≤ μ+2σ) ≈ 0.97 ,

    P(μ−3σ ≤ x ≤ μ+3σ) = 1 , and

    m₃ = μ³ + 3σ²μ .
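The verification left to the student can also be sketched numerically; the
following uses assumed values of μ and σ and checks the stated probabilities and
the third moment of the triangular PDF.

```python
from scipy.integrate import quad
import numpy as np

mu, sigma = 1.0, 0.4                      # assumed illustrative values
q = np.sqrt(6.0) * sigma                  # half-width of T(mu, sigma; x)

T = lambda x: max(0.0, (1.0 - abs(x - mu) / q) / q)

for n in (1, 2, 3):                       # P(mu - n*sigma <= x <= mu + n*sigma)
    print(n, quad(T, mu - n * sigma, mu + n * sigma)[0])   # ~0.65, ~0.97, 1.0
print(quad(lambda x: x**3 * T(x), mu - q, mu + q)[0],
      mu**3 + 3 * sigma**2 * mu)          # m3 numerically and from the formula
```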

3.3 Random Multivariate

3.3.1 Multivariate, its PDF and CDF

Analogously to the ideas of stochastic function and stochastic
variable given in section 3.2.1, we introduce the concept of a multi-
valued stochastic function

    X ∈ {U → R^s}

in the s-dimensional space.

We note that X is a vector function, i.e., X(u) can be written
as:

    X(u) = (x¹(u), x²(u), ..., x^s(u)) ∈ R^s,   u ∈ U.

The individual components x^j(u) ∈ R, j = 1, 2, ..., s are called components
or constituents of X(u). We also note that each component x^j of the stoch-
astic function X can be regarded as a random variable (univariate) of its
own. One particular value of x^j may be denoted by x^j_i *) and similarly a
particular value of X may be denoted by

    X_i = (x^1_i, ..., x^s_i) .

Note that a specific value of X is a sequence of real numbers (not a
set), or a numerical vector.

The pair (X(u), φ(X)), where

    φ(X) = φ(x¹, x², ..., x^s)                                 (3-27)

is a non-negative, integrable function on R^s, is called a random multi-
variate or simply a multivariate.

* The superscripts and subscripts here are found very useful to distinguish
between the components x^j, j = 1, 2, ..., s of the multivariate X, and
the elements x^j_i, i = 1, 2, ..., n_j of the univariate (random variable) x^j.
We can speak of a probability of X ∈ [X₀, X₁] ⊂ R^s, and define it
as follows:

    P(X₀ ≤ X ≤ X₁) = ∫_{X₀}^{X₁} φ(X) dX ∈ [0, 1] .            (3-28)

Here the integral sign stands for the s-dimensional integration, dX for
an s-dimensional differential, i.e. dX = (dx¹, dx², ..., dx^s), and
X₀ = (x₀¹, x₀², ..., x₀^s), X₁ = (x₁¹, x₁², ..., x₁^s) are assumed to satisfy
the following inequalities:

    x₀^j ≤ x^j ≤ x₁^j ,   j = 1, 2, ..., s .

Note that in order to be able to call the function φ a PDF, the following
condition has to be satisfied:

    ∫_{R^s} φ(X) dX = 1 .                                      (3-29)

A complete analogy to the one-dimensional, or univariate, case
(section 3.2.2) is the definition of the multivariate CDF. It is defined
as follows:

    Ψ(X) = ∫_{−∞}^{X} φ(Y) dY ∈ {R^s → [0, 1]}                 (3-30)

where Y is an s-dimensional dummy variable in the integration.

Example 3.19: Consider the univariate PDF shown in Figure 3.12. This
bell-shaped PDF is known as the normal or Gaussian PDF (to be
discussed later in more detail), and is usually denoted by N;
in terms of its μ and σ we have

    φ(x) = N(μ, σ; x) .

Then the multivariate normal PDF in two-dimensional space, i.e.
φ(X) = φ(x¹, x²), would appear as illustrated in Figure 3-18.

[Figure 3-18: the bell-shaped surface of the bivariate normal PDF over the
(x¹, x²) plane.]

In the two-dimensional space, φ(X) is called a bivariate PDF,
and the bivariate normal PDF illustrated above can be expressed as

    φ(X) = N(μ₁, μ₂, σ₁, σ₂; X)

         = (1/(2π σ₁ σ₂)) exp [ −(1/2) ( ((x¹ − μ₁)/σ₁)² + ((x² − μ₂)/σ₂)² ) ] .
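A minimal sketch of this bivariate PDF, with assumed means and standard
deviations, is given below; it also shows that the joint PDF equals the product
of the two marginal normal PDFs, which anticipates the next section.

```python
import numpy as np

def bivariate_normal(x1, x2, mu1, mu2, s1, s2):
    z = ((x1 - mu1) / s1) ** 2 + ((x2 - mu2) / s2) ** 2
    return np.exp(-0.5 * z) / (2.0 * np.pi * s1 * s2)

# marginal normal PDF
n = lambda x, mu, s: np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# assumed values: mu1 = mu2 = 0, sigma1 = 1, sigma2 = 2
print(bivariate_normal(0.3, -1.0, 0.0, 0.0, 1.0, 2.0))
print(n(0.3, 0.0, 1.0) * n(-1.0, 0.0, 2.0))      # the same value
```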

3.3.2 Statistical Dependence and Independence

The PDF φ of the multivariate X may have a special form,
namely

    φ(X) = φ₁(x¹) · φ₂(x²) · ... · φ_s(x^s) .

In this case, the integral in equation (3-28) can be rewritten as:

    P(X₀ ≤ X ≤ X₁) = ∏_{j=1}^{s} ∫_{x₀^j}^{x₁^j} φ_j(x^j) dx^j .        (3-31)

Remembering that each component x^j of the multivariate X can be regarded
as a univariate, and regarding φ_j as the PDFs of the corresponding
univariates, we can rewrite equation (3-31) as:

    ∏_{j=1}^{s} ∫_{x₀^j}^{x₁^j} φ_j(x^j) dx^j = ∏_{j=1}^{s} P(x₀^j ≤ x^j ≤ x₁^j) .

Comparing this result with equation (3-28), we get the relationship
between the probabilities

    P(X₀ ≤ X ≤ X₁) = ∏_{j=1}^{s} P(x₀^j ≤ x^j ≤ x₁^j) .        (3-32)

This relation can be read as follows: "The combined probability of all
the components satisfying the condition x₀^j ≤ x^j ≤ x₁^j equals the
product of the probabilities for the individual components", and
obviously satisfies the definition of the combined probability of



independent events (section 2.3). Hence, the components x^j of such a
multivariate X are called statistically independent. The PDF from
example 3.19 is statistically independent.

If the PDF of a multivariate cannot be written as a product
of the PDF's of its constituents, then these constituents are known as
statistically dependent. In this case, the probability P(X₀ ≤ X ≤ X₁)
is not equal to the product of the individual probabilities.

It can be shown that for statistically independent components
we have

    ∫_R φ_j(x^j) dx^j = 1 ,   j = 1, 2, ..., s.

3.3.3 Mean and Variance of a Multivariate

The sequence

    μ = (μ₁, μ₂, ..., μ_s) = E*(X) ,                           (3-33)

where

    μ_j = ∫_{R^s} x^j φ(X) dX = E*(x^j) ∈ R ,   j = 1, 2, ..., s ,        (3-34)

is called the mean of the multivariate X. The argument of the operator E*
(i.e. of the s-dimensional integral) is X·φ(X) = (x¹, x², ..., x^s) · φ(x¹, x², ..., x^s).

Similarly, the variance of the multivariate X is given by

    σ² = (σ₁², σ₂², ..., σ_s²) ,                               (3-35)

where

    σ_j² = ∫_{R^s} (x^j − μ_j)² φ(X) dX = E*((x^j − μ_j)²) ∈ R ,   j = 1, 2, ..., s.        (3-36)

Note that we can write again

    E*((X − μ)²) = E*((X − E*(X))²) = E*(X²) − μ² .            (3-37)

The variance of the multivariate does not express the statis-
tical properties in the multi-dimensional space as fully as the variance
of the univariate does in the one-dimensional space. For this reason,
we extend the statistical characteristics of the random multivariate
further and introduce the so-called variance-covariance matrix (see
section 3.3.4).

Let us now turn our attention to what the mean and the variance
of a "statistically independent" multivariate look like. For the stat-
istically independent components x^j, j = 1, 2, ..., s of a multivariate X,
we obtain

    μ_j = ∫_{R^s} [ x^j φ_j(x^j) ( ∏_{ℓ=1, ℓ≠j}^{s} φ_ℓ(x^ℓ) dx^ℓ ) dx^j ] .        (3-39)

Here, according to section 3.3.2, all the integrals in equation (3-39)
after the ∏-sign are equal to one, and thus we have

    μ_j = ∫_R x^j φ_j(x^j) dx^j .                              (3-40)

Similarly,

    σ_j² = ∫_R (x^j − μ_j)² φ_j(x^j) dx^j .                    (3-41)

Thus for the statistically independent X, we can compute the mean and
the variance of each component x^j separately, as we have computed
σ² of the PDF from example 3.19.

3.3.4 Covariance and variance-covariance Matrix

Before we start describing the variance-covariance matrix,
let us define another statistical quantity needed for this matrix. This
quantity is called covariance, and it is defined for any two components
x^j and x^k of a multivariate X as

    σ_jk = cov(x^j, x^k) = ∫_{R^s} (x^j − μ_j)(x^k − μ_k) φ(X) dX = E*((x^j − μ_j)(x^k − μ_k)) .        (3-42)

We note three things in equation (3-42). First, if j = k,
we see that the expressions for the covariances become identical with
those for the variances, namely:

    σ_jj = σ_j² ,   j = 1, 2, ..., s.

Secondly, if the components of the multivariate are statistically
independent, the covariances (j ≠ k) are all equal to zero. To show this,
let us write the covariance in terms of the operator E*. Noting that for
a pair of components of a statistically independent multivariate we have

    σ_jk = ∫_R ∫_R (x^j − μ_j)(x^k − μ_k) φ_j(x^j) φ_k(x^k) dx^j dx^k ,        (3-43)

we can write:

    σ_jk = E*(x^j x^k − x^j μ_k − μ_j x^k + μ_j μ_k)

         = E*(x^j x^k) − μ_k E*(x^j) − μ_j E*(x^k) + μ_j μ_k

         = E*(x^j x^k) − μ_j μ_k = E*(x^j x^k) − E*(x^j) E*(x^k) = 0 .

Hence, for statistically independent components x^j and x^k, we get

    E*(x^j x^k) = E*(x^j) E*(x^k) ,                            (3-44)

or more generally, for r independent components we get

    E*( ∏_{t=1}^{r} x^t ) = ∏_{t=1}^{r} E*(x^t) .              (3-45)

Equation (3-45) completes the list of properties of the E* operator
stated in section 3.2.3.

As we stated in section 3.3.3, the variance (σ²) of a multi-
variate is not enough to fully characterize the statistical properties
of the multivariate on the level of second moments. To get the same
amount of statistical information as given by the variance alone (in the
univariate case), we have to take into account also the covariances.

The variances and covariances can be assembled into one matrix

called the variance-covariance matrix or just the covariance matrix.



The variance-covariance matrix of a multivariate X is usually denoted by
Σ*_X and looks as follows:

    Σ*_X =  [ σ₁²   σ₁₂   σ₁₃  ...  σ₁s ]
            [ σ₂₁   σ₂²   σ₂₃  ...  σ₂s ]                      (3-46)
            [  .     .     .          . ]
            [ σs₁   σs₂   σs₃  ...  σs² ]

It is not difficult to see that the variance-covariance matrix
can also be written in terms of the mathematical expectation as follows:

    Σ*_X = E*[ (X − E*(X)) (X − E*(X))^T ] ,                   (3-47)

which is the expectation of a dyadic product of two vectors. Note
that the superscript T in the above formula stands for the transposition
in matrix operation. The proof of equation (3-47) is left to the
student.

Note that the variance-covariance matrix is always symmetrical,

the diagonal elements are the variances of the components and the off-

diagonal elements are the covariances between the different pairs of

components. The necessary and sufficient condition for the variance-

covariance matrix to be diagonal, i.e. all the covariances to be zeros,

is the statistical independence of the multivariate. The variance-

covariance matrix is one of the most fundamental quantities used in

adjustment calculus. It is positive - definite (with diagonal elements

always positive) and the inverse exists if and only if there is no absolute

correlation between components.
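A minimal numerical sketch of (3-42) and (3-46), with assumed means and standard
deviations, is the following: the elements of Σ*_X are obtained by direct
integration of (x^j − μ_j)(x^k − μ_k) φ(X) for the independent bivariate normal
PDF of Example 3.19, so that the covariances come out (numerically) as zero.

```python
import numpy as np
from scipy.integrate import dblquad

mu, sig = np.array([1.0, -2.0]), np.array([0.5, 2.0])     # assumed values

def phi(x1, x2):                       # independent bivariate normal PDF
    z = ((x1 - mu[0]) / sig[0])**2 + ((x2 - mu[1]) / sig[1])**2
    return np.exp(-0.5 * z) / (2 * np.pi * sig[0] * sig[1])

def element(j, k):                     # sigma_jk = E*((x^j - mu_j)(x^k - mu_k))
    f = lambda x2, x1: ((x1, x2)[j] - mu[j]) * ((x1, x2)[k] - mu[k]) * phi(x1, x2)
    return dblquad(f, -15, 15, -15, 15)[0]

Sigma = np.array([[element(j, k) for k in range(2)] for j in range(2)])
print(Sigma)        # ~diag(sig**2): off-diagonal covariances vanish here
```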



3.3.5 Random Multisample, its PDF and CDF

Like in the univariate case, we can also define here a quan-
tity η corresponding to the random sample ξ defined in section 3.1.1,
as follows:

    η = ( ξ^1 = (ξ^1_1, ξ^1_2, ξ^1_3, ..., ξ^1_{n₁}) ∈ R^{n₁} ,
          ξ^2 = (ξ^2_1, ξ^2_2, ξ^2_3, ..., ξ^2_{n₂}) ∈ R^{n₂} ,        (3-48)
          ...
          ξ^s = (ξ^s_1, ξ^s_2, ξ^s_3, ..., ξ^s_{n_s}) ∈ R^{n_s} ) ,

which is a straightforward generalization of a random sample, and will
be called a random multisample. From the above definition, it is obvious
that η has s components (constituents) ξ^j, each of which is a
random sample on its own. The number of elements n_j in each component
ξ^j may or may not be the same.

We can also define the definition set as well as the actual

(experimental) PDF and CDF of a multisample in very much the same way as

we have done for a random sample. Also, the distribution and cumulative

distribution histograms and polygons can be used for two-dimensional multi-

samples. The development of these concepts, however, is left to the

student.

3.3.6 Mean and Variance-Covariance Matrix of a Multisample

The mean of a multisample (3.48) is defined as

    M̄ = (M₁, M₂, ..., M_s) = Ē(η) ,                            (3-49)

where from equation (3-3) we get

    M_j = (1/n_j) Σ_{i=1}^{n_j} ξ^j_i = E(ξ^j) ∈ R ,   j = 1, 2, ..., s .        (3-50)

Here, the operator Ē is defined as a vector of operators E, which is
obvious from a comparison of (3-49) with (3-50). Similarly,

    S̄² = (S₁², S₂², S₃², ..., S_s²) = Ē((η − M̄)²) ∈ R^s ,      (3-51)

where from equation (3-6), we get

    S_j² = (1/n_j) Σ_{i=1}^{n_j} (ξ^j_i − M_j)² = E((ξ^j − M_j)²) ∈ R ,   j = 1, 2, ..., s.        (3-52)
We can also define the standard deviation S̄ of the multisample η as

    S̄ = (S₁, S₂, ..., S_s) .                                   (3-53)

Example 3.20: Let us determine the mean M̄, the variance S̄² and the
standard deviation S̄ of a multisample η = (ξ¹, ξ², ξ³),
where

    ξ¹ = (2, 3, 4, 7, 4) ,
    ξ² = (6, 4, 0, 3, 2) and
    ξ³ = (5, 2, 5, 5, 8) .

Here we have n₁ = n₂ = n₃ = 5. The mean M̄ is given from
equation (3-49) as

    M̄ = (M₁, M₂, M₃) .

The members M_j, j = 1, 2, 3 are computed from equation (3-50)
as follows:

    M₁ = (1/n₁) Σ_{i=1}^{n₁} ξ¹_i = (1/5) Σ_{i=1}^{5} ξ¹_i

       = (1/5)(2 + 3 + 4 + 7 + 4) = 20/5 = 4 ,

    M₂ = (1/5)(6 + 4 + 0 + 3 + 2) = 15/5 = 3 ,

    M₃ = (1/5)(5 + 2 + 5 + 5 + 8) = 25/5 = 5 ,

and we get

    M̄ = (4, 3, 5) .

The variance S̄² is given from equation (3-51) as

    S̄² = (S₁², S₂², S₃²) .

The members S_j², j = 1, 2, 3 are computed from equation (3-52)
as follows:

    S₁² = (1/n₁) Σ_{i=1}^{n₁} (ξ¹_i − M₁)² = (1/5) Σ_{i=1}^{5} (ξ¹_i − 4)²

        = (1/5)[4 + 1 + 0 + 9 + 0] = 14/5 = 2.8 ,

    S₂² = (1/5)[(3)² + (1)² + (−3)² + (0)² + (−1)²]

        = (1/5)[9 + 1 + 9 + 0 + 1] = 20/5 = 4.0 ,

    S₃² = (1/5)[(0)² + (−3)² + (0)² + (0)² + (3)²]

        = (1/5)[0 + 9 + 0 + 0 + 9] = 18/5 = 3.6 ,

and we get

    S̄² = (2.8, 4.0, 3.6) .

Taking the square root of the individual members S_j², j = 1, 2, 3,
we obtain the standard deviation S̄ as

    S̄ = (S₁, S₂, S₃) = (1.67, 2.0, 1.9) .

If the j-th and k-th components of a multisample have the same
number of elements, say n, we can write the covariance S_jk between these
two components ξ^j and ξ^k as:

    S_jk = (1/n) Σ_{i=1}^{n} (ξ^j_i − M_j)(ξ^k_i − M_k) ,      (3-54)

which can be rewritten as:

    S_jk = (1/n) Σ_{i=1}^{n} ξ^j_i ξ^k_i − M_j M_k .

Note that the covariance S_jk, as defined above, depends on the ordering of
the elements in both components ξ^j and ξ^k, whereas the means M_j and M_k and
the variances S_j² and S_k² do not. Therefore, to obtain a meaningful covariance
S_jk, each of the components ξ^j and ξ^k should be in the same order as it
was acquired. This can be visualized from the following example. Assume
that the elements of ξ^j are observations of one vertical angle, and the
elements of ξ^k are the corresponding times of the observations. Clearly,
to study the relationship (covariance) between the observation time and
the value of the observed vertical angle, the matched pairs must be
respected.

Example 3.21: Let us determine the covariances between the different
pairs of components of the multisample η given in example 3.20.
The covariances S_jk are computed from equation (3-54) as follows:

    S₁₂ = S₂₁ = (1/5) Σ_{i=1}^{5} [(ξ¹_i − 4)(ξ²_i − 3)]

              = (1/5)[(−2)(3) + (−1)(1) + (0)(−3) + (3)(0) + (0)(−1)]

              = (1/5)[−6 − 1 + 0 + 0 + 0] = −7/5 = −1.4 ,

    S₁₃ = S₃₁ = (1/5)[(−2)(0) + (−1)(−3) + (0)(0) + (3)(0) + (0)(3)]

              = (1/5)[0 + 3 + 0 + 0 + 0] = 3/5 = 0.6 , and

    S₂₃ = S₃₂ = (1/5)[(3)(0) + (1)(−3) + (−3)(0) + (0)(0) + (−1)(3)]

              = (1/5)[0 − 3 + 0 + 0 − 3] = −6/5 = −1.2 .

Finally, we can assemble the variance-covariance matrix Σ_η of
the multisample η:

    Σ_η =  [ S₁²   S₁₂   S₁₃  ...  S₁s ]
           [ S₂₁   S₂²   S₂₃  ...  S₂s ]                       (3-54)
           [  .     .     .         . ]
           [ Ss₁   Ss₂   Ss₃  ...  Ss² ]

Having defined the mean and the variance-covariance matrix of
a multisample, let us stop and reflect for a while. We have stated in
3.3.3 that the expansion from one to s dimensions defied a straight-
forward generalisation of the one-dimensional variance. We had to introduce
the variance-covariance matrix to describe the statistical properties

of a multivariate on the second-moment level. Turning to the relationship
sample - univariate, we discover that this is not paralleled in the multi-
dimensional case either. While the formulae for the mean and the variance
of a sample and a univariate were equivalent, those for a multisample
and a multivariate are not. While formulae equivalent to (3-34), (3-35)
and (3-42) can be devised for the multisample, the ones used mostly in
practice ((3-49), (3-51) and (3-54)) correspond really to (3-40), (3-41),
and (3-43), valid only for a statistically independent multivariate.

This, together with the difficulty with the computation of
multisample covariances, i.e., the necessity to have the same number
of elements in any two components, leads often in practice to the
adoption of an assumed variance-covariance matrix. Decisions connected
with the determination of the multisample variance-covariance matrix
are among the trickiest in adjustment calculus.

Example 3.22: Let us determine the variance-covariance matrix of the
multisample η introduced in example 3.20. In this case, we
have the variances computed in example 3.20; the results
were:

    S₁² = 2.8 ,   S₂² = 4.0   and   S₃² = 3.6 .

Also, we have the covariances computed in example 3.21; the
results were:

    S₁₂ = S₂₁ = −1.4 ,   S₁₃ = S₃₁ = 0.6   and   S₂₃ = S₃₂ = −1.2 .

Therefore, the required variance-covariance matrix will be:

    Σ_η =  [  2.8   −1.4    0.6 ]
           [ −1.4    4.0   −1.2 ]
           [  0.6   −1.2    3.6 ]
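A minimal numerical cross-check of Examples 3.20-3.22 is the following sketch;
it uses the 1/n convention of equations (3-50), (3-52) and (3-54).

```python
import numpy as np

eta = np.array([[2, 3, 4, 7, 4],        # xi^1
                [6, 4, 0, 3, 2],        # xi^2
                [5, 2, 5, 5, 8]],       # xi^3
               dtype=float)

M = eta.mean(axis=1)                    # (4, 3, 5)
Sigma = np.cov(eta, bias=True)          # bias=True divides by n, as in (3-54)
print(M)
print(Sigma)    # [[2.8 -1.4 0.6] [-1.4 4.0 -1.2] [0.6 -1.2 3.6]]
```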

3.3.7 Correlation

Although the covariances of a multisample do not play the same

role as the covariances of a multivariate, they still can serve as a

certain measure of statistical dependence. We say that they show

the degree of correlation between the appropriate pairs of components.

The degree of correlation, as a measure of statistical dependence,
may, of course, vary. We can see that the covariance S_jk ∈ R may attain
any value. Hence it is not a very useful measure, because we cannot
predetermine the value of the covariance corresponding to the maximum
or complete correlation. For this reason, we use another measure, the
correlation coefficient, which is usually denoted by ρ, and is
defined as

    ρ_jk = S_jk / (S_j S_k) .                                  (3-57)

It can be shown that ρ_jk varies from −1 to +1.

Based on the use of the correlation coefficient is the correlation
calculus, a separate branch of statistics. It will suffice here to say
that we call two components ξ^j and ξ^k of a multisample η:

    (i)   totally uncorrelated, if ρ_jk = 0 ,
    (ii)  correlated, if |ρ_jk| < 1 ,
    (iii) totally positively correlated, if ρ_jk = 1 ,
    (iv)  totally negatively correlated, if ρ_jk = −1 .

Note that for the multivariate, the expression for ρ_jk is written completely
analogously to equation (3-57).



Example 3.23: Let us discuss the degree of correlation between the
different pairs of components of the multisample η which is
used in examples 3.20 to 3.22 inclusive, and whose variance-
covariance matrix is given in example 3.22.

The correlation coefficients ρ_jk are computed from equation (3-57)
as follows:

    ρ₁₂ = −1.4 / (1.67 · 2.0) = −0.42 ,

    ρ₁₃ = 0.6 / (1.67 · 1.9) = 0.19 ,

    ρ₂₃ = −1.2 / (2.0 · 1.9) = −0.31 .

Note that, since

    |ρ_jk| < 1 ,   j, k = 1, 2, 3 ,   j ≠ k ,

the components ξ¹, ξ² and ξ³ of the given multisample η
are all correlated.
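A short sketch applying (3-57) to the same multisample is given below; the
off-diagonal entries reproduce the coefficients above (up to the rounding of the
standard deviations).

```python
import numpy as np

eta = np.array([[2, 3, 4, 7, 4],
                [6, 4, 0, 3, 2],
                [5, 2, 5, 5, 8]], dtype=float)

Sigma = np.cov(eta, bias=True)
S = np.sqrt(np.diag(Sigma))
rho = Sigma / np.outer(S, S)            # equation (3-57) applied element-wise
print(np.round(rho, 2))                 # off-diagonals ~ -0.42, 0.19, -0.32
```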

Example 3.24: Let us discuss the degree of correlation between the
components ξ¹ and ξ², and between ξ¹ and ξ³, of the multi-
sample η = (ξ¹, ξ², ξ³), where:

    ξ¹ = (2, 1, 3, 5, 4) ,
    ξ² = (4, 2, 6, 10, 8) ,
    ξ³ = (−4, −2, −6, −10, −8) .

By computing the means and variances of ξ^j, j = 1, 2, 3,
similarly to example 3.20, and the covariances S₁₂ and S₁₃
similarly to example 3.21, we get the following results:

    M₁ = 3 ,   M₂ = 6   and   M₃ = −6 ,

    S₁² = 2 ,   S₂² = 8   and   S₃² = 8 ,

    S₁ = √2 ,   S₂ = S₃ = 2√2 ,

    S₁₂ = 4   and   S₁₃ = −4 .

Hence

    ρ₁₂ = S₁₂ / (S₁ · S₂) = 4 / (√2 · 2√2) = +1 ,

which means that ξ¹ and ξ² are totally positively correlated,
and

    ρ₁₃ = S₁₃ / (S₁ · S₃) = −4 / (√2 · 2√2) = −1 ,

which means that ξ¹ and ξ³ are totally negatively correlated.

At this point it is worthwhile mentioning that the computa-
tions of the means, variances, covariances and correlation coefficients
of the constituents of a multisample are always preferably performed in a
tabular form for easier checking. The following table is an example of
such an arrangement using the two constituents ξ¹ and ξ² of the multi-
sample introduced in example 3.20.

    ξ¹_i   (ξ¹_i − M₁)   (ξ¹_i − M₁)²   ξ²_i   (ξ²_i − M₂)   (ξ²_i − M₂)²   (ξ¹_i − M₁)(ξ²_i − M₂)
     2         −2             4          6          3             9                 −6
     3         −1             1          4          1             1                 −1
     4          0             0          0         −3             9                  0
     7          3             9          3          0             0                  0
     4          0             0          2         −1             1                  0
    -------------------------------------------------------------------------------------------
    Σ   20                   14         15                       20                 −7

    M₁ = (1/5)(20) = 4 ,        M₂ = (1/5)(15) = 3 ,

    S₁² = (1/5)(14) = 2.8 ,     S₂² = (1/5)(20) = 4 ,

    S₁ = √2.8 = 1.67 ,          S₂ = √4 = 2 ,

    S₁₂ = (1/5)(−7) = −1.4 ,

and

    ρ₁₂ = −1.4 / (1.67 · 2) = −0.42 .

,61, 70, 102, 107, 113, 114) 117,'119, 120, 126, 120, 129, 129, 13:2, !37,
i3V, 130~ 130, 142, 143, 146, 1~6, 1·17, 1471 148, 149, 149, 150, 150, 153.
153, 156, 157, 158, 150, 159, 159, 159, 162, 162, 164, 166, 166, 166, 167,
16!>, 169, 169, 169, 170, 170, 171, 17~; 172, 172, 173, 173, 175, 175, 176,
176, 1.76, 17.7, 177, 178, 179, 180, 180, 181, 181, 181, 182, 183, 184, 184,
l85, 186, 187, 188, 188, 190. 192, 102, 193, 194, 194, 194:, 195, 195, 195,
· ::JO, 10~I., 10°.
··1nn "'' 1n~ 900, •01
iT1..1 1 - -
~01 , -9 01 , -9 0"'"":t
, -
9 •o•
9 9- 0....-, •O·'·!:': -•o~u 1 -~os 1 9..,.. 09.
... ·-:> 1 - • ,

90!)
-
"09 21.~~ 216 <:11!)
I .- • J J~ 919 I 219I -•)•)1
I • - J
2?9 9"'3 2"'7
IW-1 ,.,_, l .- t -
"33.. J --'*'J
·~-.t. 2'-'6
0
93~, I
t .-.

240, 247, 254, 262, 270

Required: (i) Classify this sample according to your own choice, and
then draw its: distribution histogram, distribution polygon, cumulative
histogram, cumulative polygon.

(ii) Determine the mean, standard deviation, median and range
of the sample; then plot these quantities on your histograms and
polygons.

(iii) Determine the probability of the height being in between
121 and 174 cm, by using your distribution histogram, your distribution
polygon, the cumulative histogram, the cumulative polygon, and the actual
sample. Then compare the results.

(3) Verify the results given in Example 3.18 for the mean, the variance

and the third moment about zero of the triangular PDF.


3.4 Exercise 3

(l) The following table gives the weights as recorded to the nearest

pound for a random sample of 20 high-school students:

    138   150   146   158   150
    146   164   138   164   164
    150   146   158   173   150
    158   130   146   150   164

Required: (i) Compute the mean, the standard deviation, the median

and the range of this random sample using both the original sample

and its definition set.

(ii) Compute the experimental probabilities of the

individual elements and then construct the corresponding discrete

PDF and CDF of the sample.

(iii) Compute the probability that the weight of a high-

school student is less than or equal to 150 pounds.

(iv) Compute the probability of the student weight to be

in the interval [158, 173].

(2) The following table gives the observed heights in em of a random

sample of 125 nine years old pine trees.



(4) Verify the results given in Examples 3.17 and 3.18 for the
probabilities P(μ−σ ≤ x ≤ μ+σ), P(μ−2σ ≤ x ≤ μ+2σ) and
P(μ−3σ ≤ x ≤ μ+3σ), using the rectangular and the triangular CDF's
respectively, rather than the corresponding PDF's.

(5) Let x be a random variable whose PDF is given by:

    φ(x) =  h, for (−3 ≤ x ≤ 7)
            0, everywhere else.

Required: (i) Determine h.

(ii) Compute the mean and the standard deviation of x.

(iii) Construct the CDF of x.

(iv) Use both the PDF and CDF to determine the following
probabilities: P(x ≤ 1.5), P(x ≥ 2.5), P(−1 ≤ x ≤ 4), P(μ−2σ ≤ x ≤ μ+2σ).

(v) Compute the 3-rd and 4-th moments of the PDF about
zero.

(6) Let x be a random variable having the following PDF:

    φ(x) =  k·x , for (0 < x < 2)
            0 ,   everywhere else.

Required: (i) Determine the mean, the variance and the standard
deviation of x.

(ii) Compute the probability P(1 ≤ x ≤ 1.5).


88

(7) Let x be a random variable whose PDF is given as:

    φ(x) =  k + (1/50)x − 3/50 ,  for (3 ≤ x ≤ 8)
            k − (1/50)x + 13/50 , for (8 ≤ x ≤ 13)
            0 ,                   everywhere else.

Required: (i) Determine the mean and the standard deviation of x.

(ii) Compute the probabilities: P(5.5 ≤ x ≤ 10.5), P(x ≥ 9),
P(x ≤ 7), P(μ − σ ≤ x ≤ μ + σ).

(8) Given a multisample η = (ξ¹, ξ², ξ³), where ξ¹ = (4.2, 3.7, 4.1),
ξ² = (26.7, 26.3, 26.6), and ξ³ = (−17.5, −17.0, −18.0).

Required: (i) Compute the mean of η.

(ii) Compute the variance-covariance matrix of η.

(iii) Compute all the correlation coefficients between
the different pairs of components of η.

(9) Given a bivariate X = (x¹, x²) with PDF

    φ(X) =  1/(6√2 st) − |x¹ − q|/(12√3 s²t) , for (|x¹ − q| < s√6 and |x² − r| < t√3)
            0 ,                                everywhere else,

where q, r are some real numbers and s, t are some positive real
numbers.

Required: (i) Compute the mean of X.

(ii) Compute the variance-covariance matrix of X.



4. FUNDAMENTALS OF THE THEORY OF ERRORS

4.1 Basic D:efinitions

In practice we work with observations which are nothing else

but numerical representation of some physical quantities,.e.g. lengths,

angles, weights, etc. These observations are obtained through

measurements of some kind by comparison to predefined standards. In many

cases we obtain several observations for the~ physical quantity, which

are usually postulated to represent this quantity.

There is a different school of thoughts claiming that no

quantity can be measured twice. They say that if a quantity is measured

for the second time, it becomes a different quantity. Philosophically,

the two approaches are very different, however, in practice they coincide.

They vary in assuming different things (hypotheses), but they lead to

the same results.

The observations representing the same quantity may or may not

have some spread or dispersion (by spread we mean that not all the

observations are identical). For instance, when we measure the length of

the side of a rectangle using a graduated ruler, we will have two possi-

bilities (see Figure 4.la, b).

a) b)

Figure 4·.1
90

First, if the length of that side is exactly equivalent to an

integer number of graduations (divisions) on the ruler, the measurement

of it will not produce any spread. This is simply because the beginning

of the side will be at a graduation line of the ruler, and at the same

time the end of the side will be at another graduation line, and hence

we get always the same result. On the other hand, if the end of the

side is located between two division lines on the ruler, there will be

a fraction of the smallest division on the ruler to be estimated. The

estimates(observatiom) will differ, s~ due to different observers, and

hence we shall get a spread.

Usually, the spread and its presence depend on many other

things like: the design of the :experiment, measuring equipment, precision

required, atmospheric conditions, etc. If we know the causes that

influence the spread, we can try to account for them in one way or the

other. In other words, we will apply certain corrections to eliminate

such unwanted influences which are usually called systematic errors.

Examples of systematic errors are numerous like: variation of the length

of a tape with temperature, variation of atmospheric conditions with

time, etc.

In practice, this is possible if we can express such corrections

mathematically as functions of some measurable physical quantities. In

some cases, the systematic errors remain constant in both magnitude and

sign during the time of observations , e.g. most of the instrumental

systematic errors.. In such cases, we can eliminate these systematic

errors by following certain · teclmiques in making the observations. For

example, the error in the rod reading due to the inclination of the line
91

of sight of the level, with respect to the bubble axis, can be eliminated

by taking the backsight and the foresight at equal distances from the

level.

Further, we shall assume that there are no blunders (mistakes)

in the observations. These blunders are usually gross errors due to the

carelessness of the observer and/or the recorder. The elimination of

blunders has to be carried out before starting to work with the observations.

The ways for intercepting blunders are numerous and are as different as

the experiments may be. We are not going to venture into this here.

4.2 Random (Accidental) Errors

Even after eliminating the blunders and applying the appropriate

corrections to eliminate the systematic errors, the observations repre-

senting a single physical quantity usually still have a remaining spread,

i.e. are still not identical, and we begin to blame some unknwon or

partly unknown reasons for it. Such remaining spread is practically

inevitable and we say that the observations contain random or accidental

errors.

The above statement should be understood as follows: given a
finite sequence L of observations of the same physical quantity ℓ', i.e.

    L = (ℓ_i) ,   i = 1, 2, ..., n ,                           (4-1)

we assume that the individual elements ℓ_i, i = 1, 2, ..., n represent the
same quantity, where ℓ' is the unknown value, and can be written as:

    ℓ_i = ℓ' + ε_i ,   i = 1, 2, ..., n .                      (4-2)

The quantities ε_i are the so-called random (accidental) errors*.
The sequence

    ε = (ε_i) ,   i = 1, 2, ..., n                             (4-3)

(or the sequence L, equation (4-l), for this matter) is declared a random

sample as defined earlier in section 3.1.1. This random sample has a

parent random variable, as defined in section 3.1.2.

It should be noted that the term "random error" is used rather

freely in practice.

4_.3 Gaussian PDF. Gauss Law of Errors

The histograms (polygons) of the random samples representing

observations encountered in practice generally show a tendency towards

being bell-shaped, as shown in Figure 4.2 a,b.

Figure 4.2

* It may happen, and as a matter of fact often does happen, that we are
able to spot some depen~ence of e (for whatever this means) on one or
more parameters,.e.g. temperature, pressure, time, etc., that had not
been suspected and eliminated before. Then we s~ that the e's change
systematically or predictably with the parameter in question, or we say
that there is a correlation between the e 's and the parameter. Here, we
may say that the observations still contain systematic errors. In such a
case we may try to eliminate them again, after establishing the law
governing their behaviour.
93

Various people throughout the history have thus tried to

explain this phenomenon and establish a theory describing it. The

commonly accepted explanation is due to Gauss and Laplace independently.

This explanation leads to the derivation of·the well known model - the

Gaussian PDF. The assumptions,,.due to Hagen, necessary to be taken into

account, along with the derivation of the law, due to de Moivre, are

given in Appendix I. Here we state only the result.

The Gaussian PDF G(C; ε) is found to be (equation (I-11),
Appendix I):

    G(C; ε) = √(2/(Cπ)) exp (−2ε²/C) ,                         (4-4)

where its argument ε is the random error, i.e. a special type of random
variable with mean equal to zero, and C is the only parameter of the dis-
tribution. The Gaussian PDF is continuous and is shown in Figure 4.3.

[Figure 4.3: the bell-shaped Gaussian PDF G(C; ε), symmetrical about ε = 0.]

From the above Figure we note the following characteristics of
the Gaussian PDF:

(i) G is symmetrical around 0.

(ii) The maximum ordinate of G is at ε = 0, and equals √(2/(Cπ)), which
varies with the parameter C, see Figure 4.2b.

(iii) G approaches the ε axis asymptotically as ε goes to ± ∞.

(iv) G has two points of inflexion at ε = ± √C/2.

The shape of G reflects what is known as the "Gauss law of a
large sample of errors", which states that:

(i) smaller errors are more probable than the larger errors,

(ii) positive and negative errors have the same probability.*

Note that since G is a PDF it satisfies the following con-
dition:

    ∫_{−∞}^{+∞} G(C; ε) dε = 1 .                               (4.5)

4.4 Mean and Variance of the Gaussian PDF

Since G is symmetrical around zero, it is obvious that its
mean μ_ε equals zero (see section 3.2.5).

The variance σ_ε² of G is again obtained from

    σ_ε² = E*((ε − μ_ε)²) = ∫_{−∞}^{+∞} ε² G(C; ε) dε

         = √(2/(Cπ)) ∫_{−∞}^{+∞} ε² exp (−2ε²/C) dε .          (4.6)

Recalling that

    ∫_{0}^{∞} t² exp (−a²t²) dt = √π/(4a³) ,   (a > 0) ,       (4.7)

we get from equations (4.6) and (4.7)

    σ_ε² = √(2/(Cπ)) · 2 · √π/(4a³) = √(2/(Cπ)) · √π/(2a³) ,

where

    a = √(2/C) .

Hence,

    σ_ε² = (√2 · C√C)/(2√C · 2√2) = C/4 ,

and we get

    C = 4σ_ε² .                                                (4-8)

Consequently, the variance σ_ε², or rather the standard deviation σ_ε, can
be considered the only parameter of G. Substituting equation (4-8) into
equation (4-4) we get:

    G(σ_ε; ε) = (1/(σ_ε √(2π))) exp (−ε²/(2σ_ε²)) .            (4-9)

Note from equation (4-8) that σ_ε = √C/2, which equals the abscissas
of the two points of inflexion of G.

* The same result can be obtained using slightly weaker (more general)
assumptions through the "central limit theorem".
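A minimal numerical sketch of (4-4), (4-8) and (4-9), with an assumed value of C,
is the following: the Gaussian PDF written with the parameter C and with
σ_ε = √C/2 is the same curve, and its variance indeed comes out as C/4.

```python
import numpy as np
from scipy.integrate import quad

C = 1.6                                              # assumed value
G_C     = lambda e: np.sqrt(2.0 / (C * np.pi)) * np.exp(-2.0 * e**2 / C)
sigma   = np.sqrt(C) / 2.0
G_sigma = lambda e: np.exp(-0.5 * (e / sigma)**2) / (sigma * np.sqrt(2 * np.pi))

print(G_C(0.3), G_sigma(0.3))                        # identical ordinates
print(quad(lambda e: e**2 * G_C(e), -np.inf, np.inf)[0], C / 4.0)   # variance = C/4
```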

Example 4.1: Let us compute, approximately, the probability P(−σ_ε ≤ ε ≤ σ_ε),
assuming that ε has a Gaussian PDF. We first expand the function
exp (−ε²/(2σ_ε²)) to be able to integrate equation (4-9). Recall
that:

    exp (y) = e^y = 1 + y + y²/2! + y³/3! + ...

Hence

    exp (−ε²/(2σ_ε²)) = 1 − ε²/(2σ_ε²) + ε⁴/(8σ_ε⁴) − ε⁶/(48σ_ε⁶) + ...

and

    P(−σ_ε ≤ ε ≤ σ_ε) = (1/(σ_ε √(2π))) [ 2σ_ε − (1/(2σ_ε²)) · (2σ_ε³/3) + (1/(8σ_ε⁴)) · (2σ_ε⁵/5)

                        − (1/(48σ_ε⁶)) · (2σ_ε⁷/7) + ... ]

                      = √(2/π) [1 − 0.167 + 0.025 − 0.003]

                      = √(2/π) [0.855] ≈ 0.683 .

Thus:

    P(−σ_ε ≤ ε ≤ σ_ε) ≈ 0.683 .

By following the same procedure, we can find that:

    P(−2σ_ε ≤ ε ≤ 2σ_ε) ≈ 0.954 ,

    P(−3σ_ε ≤ ε ≤ 3σ_ε) ≈ 0.997 .
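The same probabilities can be sketched numerically, both by integrating the
series term by term and by using the library error function; the values below are
assumed to be for the standardized error t = ε/σ_ε.

```python
import math

def p_series(n, terms=20):
    # term-by-term integration of the series of exp(-t**2/2) over [-n, n]
    return (2.0 / math.sqrt(2 * math.pi)) * sum(
        (-0.5) ** k / math.factorial(k) * n ** (2 * k + 1) / (2 * k + 1)
        for k in range(terms))

for n in (1, 2, 3):
    exact = math.erf(n / math.sqrt(2.0))          # P(|eps| <= n*sigma)
    print(n, round(p_series(n), 3), round(exact, 3))   # ~0.683, 0.954, 0.997
```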


97

4.5 Generalized or Normal Gaussian PDF

The Gaussian PDF (equation (4.9)) can be generalized to have an
arbitrary mean μ_y. This is achieved by the transformation

    y = ε + μ_y                                                (4-10)

in equation (4-9), where y is the argument of the new PDF - the generalized
Gaussian. Such a generalized Gaussian PDF is usually called the normal PDF and
is denoted by N, where:

    N(μ_y, σ_y; y) = (1/(σ_y √(2π))) exp (−(y − μ_y)²/(2σ_y²)) .        (4-11)

The name "normal" reflects the trust which people have, or

used to have, in the power of the Gaussian law (also called the "normal

law") which is mentioned in section 4. 3 . If the errors behave according

to this law and display a histogram conforming to the normal PDF, they

are normal. On the other hand, if they do not, they are regarded as

abnormal and strange things are suspected to have happened.

The normal PDF contains only two parameters - the mean μ_y and
the standard deviation σ_y. Hence, it is well suited for computations.

Note here that the family of G(σ; ε) is a subset of the family
of N(μ, σ; y). Also note that the following condition has to be satis-
fied by N:

    ∫_{−∞}^{+∞} N(μ_y, σ_y; y) dy = 1 .

The formula for the normal CDF corresponding to N is given as:

    Ψ_N(y) = (1/(σ_y √(2π))) ∫_{−∞}^{y} exp (−(x − μ_y)²/(2σ_y²)) dx ,        (4-12)

where x is a dummy variable in the integration.
98

For the generalized (normal) Gaussian PDF, it can be again
shown that:

    P(μ_y − σ_y ≤ y ≤ μ_y + σ_y) ≈ 0.683 ,

    P(μ_y − 2σ_y ≤ y ≤ μ_y + 2σ_y) ≈ 0.954 ,

and

    P(μ_y − 3σ_y ≤ y ≤ μ_y + 3σ_y) ≈ 0.997 .

(Compare the values to the corresponding results of the triangular PDF

in example 3.18).

4.6 Standard Normal PDF

The outcome t of the following linear transformation

    t = (x − μ_x)/σ_x                                          (4-13)

is often called the standardized random variable, where x is a random
variable with mean μ_x and standard deviation σ_x. Note that the above
standardization process does not require any specific distribution
for x.

The transformation of the normal variable y (equation (4-10))
to a standardized normal variable t = (y − μ_y)/σ_y results in a new PDF

    (1/√(2π)) exp (−t²/2) = N(0, 1; t) = N(t) ,                (4-14)

whose mean J..lt is zero and whose standard deviation at is one. This

PDF is called the standard normal PDF, a particular member of the family

of all normal distributions.


99

Since both the parameters μ_t = 0 and σ_t = 1 are determined once
for all, the standard normal PDF is particularly suitable for tabulation,
due to the fact that it is a function of t only. An example of such
tabulation is given in Appendix II-A, which gives the ordinates of the
standard normal PDF for different values of t. Note again that

    ∫_{−∞}^{+∞} N(t) dt = 1 .

The CDF corresponding to N(t) is given by

    Ψ_N(t) = (1/√(2π)) ∫_{−∞}^{t} exp (−x²/2) dx ,             (4-15)

or

    Ψ_N(t) = 1/2 + (1/√(2π)) ∫_{0}^{t} exp (−x²/2) dx ,        (4-16)

where x is a dummy variable in the integration. Again, the CDF of
the standard normal PDF is tabulated to facilitate its use in probability
computations. Appendix II-B is an example of tabulated Ψ_N(t) using
equation (4-15), which gives the accumulated areas (probabilities)
under the standard normal PDF for different positive* values of t.
Appendix II-C contains a similar table, but it gives the values of
the second term in equation (4-16) only, for different values of t.
Hence, care must be taken when using different tables for computations.

* For negative values of the argument t the cumulative probability
P(t ≤ −t₀) = Ψ_N(−t₀) is computed from Ψ_N(t₀) through the condition:

    Ψ_N(−t₀) = 1 − Ψ_N(t₀) .

The second term in equation (4-16) is usually known as (1/2) erf (t),
i.e.

    Ψ_N(t) = 1/2 + (1/2) erf (t) ,                             (4-17)

where erf (t) is known as the error function, and is obviously given by

    erf (t) = √(2/π) ∫_{0}^{t} exp (−x²/2) dx .                (4-18)

This erf (t) is also tabulated*.

In order to be able to use the tables of the standard normal
PDF and CDF for computations concerning a given normal random variable
x, we first have to standardize x, i.e. to transform x to t using
equation (4-13), then enter these tables with t. Thus, if we want,
for instance, to determine the probability P(x ≤ x₀) we have to write:

    P(x ≤ x₀) = P( (x − μ_x)/σ_x ≤ (x₀ − μ_x)/σ_x ) .          (4-19)

This is identical to the probability P(t ≤ t₀) that can be obtained
from the standard normal tables.
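The table look-up can be sketched numerically as well; the following uses the
library error function in place of the tables, with the μ_x, σ_x and x₀ values
assumed here to be those of Example 4.2 below.

```python
import math

def psi_N(t):                       # standard normal CDF, cf. (4-15)-(4-17)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

mu_x, sigma_x, x0 = 66.0, 5.0, 68.0     # assumed values (cf. Example 4.2)
t0 = (x0 - mu_x) / sigma_x              # standardization (4-13)
print(t0, psi_N(t0))                    # 0.4, ~0.6554
```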

Example 4.2: Suppose that the height h of a student
is a normally distributed random
variable with mean μ_h = 66 inches and
standard deviation σ_h = 5 inches. Find
the approximate number K out of 1000
students h inches tall:

    (i) h ≤ 68 inches (Figure 4.4-i);

    (ii) h ≤ 61 inches (Figure 4.4-ii);

    (iii) h ≥ 74.6 inches (Figure 4.4-iii);

    (iv) 64.3 ≤ h ≤ 70 inches (Figure 4.4-iv).

[Figures 4.4-i to 4.4-iv: the standard normal PDF N(t) with the corresponding
areas shaded.]

* In most of the computer languages, this error function, erf (t), is a
built-in function. Hence it can be called as any library subroutine
and evaluated more precisely than by using the corresponding tables.

Solution: We are going to use the
table in Appendix II-B.

    (i) P(h ≤ 68) = P(t ≤ (68 − 66)/5)

                  = P(t ≤ 0.40) = 0.6554 .

        Hence, K₁ = (0.6554)(1000) ≈ 655 students.

    (ii) P(h ≤ 61) = P(t ≤ (61 − 66)/5)

                   = P(t ≤ −1) = 1 − P(t ≤ 1)

                   = 1 − 0.8413 = 0.1587 .

        Hence, K₂ = (0.1587)(1000) ≈ 159 students.

    (iii) P(h ≥ 74.6) = P(t ≥ (74.6 − 66)/5)

                      = P(t ≥ 1.72) = 1 − P(t ≤ 1.72)

                      = 1 − 0.9573 = 0.0427 .

        Hence, K₃ = (0.0427)(1000) ≈ 43 students.

    (iv) P(64.3 ≤ h ≤ 70) = P((64.3 − 66)/5 ≤ t ≤ (70 − 66)/5)

                          = P(−0.34 ≤ t ≤ 0.80)

                          = P(t ≤ 0.80) − P(t ≤ −0.34)

                          = P(t ≤ 0.80) − (1 − P(t ≤ 0.34))

                          = 0.7881 − [1 − 0.6331]

                          = 0.7881 − 0.3669 = 0.4212 .

        Hence, K₄ = (0.4212)(1000) ≈ 421 students.
102

For the normal random variable h

given in example 4.2, determine the

student's height H such that:

( i) P(h < H ) = 0.6554 (Figure ir.• 5- i) ) •


- l
(ii) p (h 2. H2 ) = 0.25 (Figure 4. 5-ii) •
. j

(iii) p (h..:_ H3 ) ::: 0.20 (Figure 4·5-iii)·I

(iv) P(H4 _:. h..:_ H 5 ) = 0.95,


where H4 = f.lh -K and H5 = f.lh +K. (Figure 4, 5-iv) •

Solution: Again in this example, we

are going to use the standard normal

CDF table given in Appendix II-B.

( i) P ( h ..:_ H1 ) = P(t ..:_ t l) == 0. 655 4 .

From the above mentioned table, we


N(t)
0.6'554
get t = 0 .l+, that corresponds to
l
probability P = 0.6554. But we know
Hl-f.lh
that t = •
l crh

From example ir.-2 we have f.lh = 66 inches


figUf_e 4. 5-i and. ah =5 inches. Hence,
-66
H1<•.
tl = 5 = 0.4
from which we get

H1 - 66 = 5(0.4) = 2,
i.e.
H1 = 66 + 2 = 68 inches,

which is identical to the first case

in example 4. 2; however, what we are

doing here is nothing else but the

inverse solution.
103

But:

and we get

By interpolation in the above mentioned

table we get t 2 ' 0.675 which

Figure L~. 5-ii


i.e. H2 = 66 + 5 (0.675)

= 66 + 3.375 = 69.375 inches

By examining the above mentioned table

we discover that the smallest probabil-

ity reading is 0.50, since it considers


N(t.) only the positive values oft. Therefore

we have to write:

and we get

P(t <-t )
- 3
=1 - 0.20 = 0.80.
By interpolation in the above mentioned
Figure 4.5-iii
table we get: (-t 3 ) = 0.842, which

corresponds to P = 0.80. Then we have:


H -66
t3 = 35 = -0.842

and, H3 = 66-5(0.842)
= 66-4.210 = 61.79 inches.
101~

(iv) P(H 4 < h < H )


- - 5

N(t)

= P(-t 0 -< t < t


- 0
) = 0.95,
K K
where t =-=-.
-t 0 o crh 5

F,igure 4 • 5-i ~ The above statement means that:

P(t -< t 0 ) - P(t -<-t 0 ) = 0.95.


However, from the symmetry of the

normal PDF we get:

= l. - 0.95 = 0.025
2
and we get:

P(t -< t 0 ) = 0.95 + 0.025 = 0.975,


or P(t -<
-
t 0 ) = 1. - 0.025 = 0.975.

From the above mentioned table we get:

t
0
= 1.96, which corresponds to

P = 0.975, and we have:

t 0 = 5K = -
1.96 ,

i.e. K = 5(1.96) = 9.80. Consequently:

and
105

Example 4.4: Let us solve example 4.1· again by

using the standard normal CDF tables.

Recall that it was required to compute

P(-cr < e: <a ), where e: has a


E: - - E:

Gaussian PDF (i.e. its ~e: = 0). We

can write:

P(-cr < e: < a )


E: - - E:

-0' ...() 0' - 0


= P( ~ .::_ t < e: ))
0' 0'
E: E:

Figure 4.6 = P(-1 .::_ t .::_ 1), see Figure 4.6.

Further we can write:

P(-1 .::_ t .::_ 1) = 2P(O .::_ t .::_ 1).

From the table given in Appendix II-C,

we get:

P(O .::_ t .::_ 1) = 0.3413.


Hence,

= 0.6826 = 0.683,
which is the same result as obtained

in example 4.1.
106

4.7 Basic Hypothesis (Postulate) of the Theory of Errors,


Testing

We have left the random sample of observations L behind in

section 4.2 while we developed the analytic formulae for the PDF's

mostly used in the error theory. Let us get back to it and state the

following basic postulate of the error-theory. A finite sequence of

observations L representing the same physical quantity is declared a

random sample with parent random variable distributed according to the

normal PDF N(~t' crt; t). Other PDF's are used rather seldom. The

validity of this hypothesis may or may not be tested, on which topic

we shall not elaborate here.

The mean ML of the sample L is said to approximate (the word

estimate is often used in this context) the mean ~t of the parent PDF,

Also, the variance s12 of the sample L is said to

estimate the variance cri of the parent PDF.

Considering the original sample

L = ( t . ) = ( t'+e . ) , i = l , 2 , ••• , n ,
~ ~

we get:
l n l n l l n
M1 =-f. £. =- .2:1 (£1+e.) =- (n£') +- .E 1 e.; = £1 + M (4-20)
n i=l ~ n ~= ~ n n ~= .... e

SLnce the random errors c.'s are postulated to have a parent Gaussian

PDF N(O, cre; e),which implies that ~e = 0, then we should expect that

M + 0 and we can write equation (4-20) as:


£

(4-21)
107

keeping in mind that by the unknown value £'we mean the unknown mean

~£ of the parent PDF of £. We say that the mean ~ of the sample L

approximates (estimates) the value of the mean ~£ of the parent PDF of £.

Similarly, we get

2 1 n n m 2
(.ti-~)2 1 2 1 s 2
SL =n i~l =- i~l[ti-(.t'+ ME))
n
= - k .( c:. - ME) =
m.i=l ~ £
(4.22)

The above result indicates that the variance s12 of the sample L is

identical to the variance s 2£ of its corresponding sample of random errors

c:. This is actually why si is sometimes called the mean sguare error of

the sample, and is abbreviated by MSE. Also, s1 is known as the root

~ean square error of the sample, and is abbreviated by RMS. According

to the basic hypothesis of the error-theory we can write equation (4-22)

as:

( 4-23)

which states that si estimates the variance o~ of the parent PDF of


.. . ' £
n
) •

:Sxample 4.5 Assume that the sample L: (2, 7, 6, 4,

2, 7, 4, 8, 6, 4) is postulated to be

normally distributed. Let us tr~~sform

this sample in such a way that the transformed

sample will have:

(i) Gaussian distribution

(ii) Standard normal distribution.


108

Solution: First we compute the mean M_L and the variance S_L² of the given
          sample as follows:

              M_L = (1/10) Σ_{i=1}^{10} ℓ_i = (1/10)(50) = 5 ,

              S_L² = (1/10) Σ_{i=1}^{10} (ℓ_i - M_L)² = 4 ,   i.e.  S_L = 2 .

          According to the basic postulate of the error-theory we can say that:

              μ_ℓ = M_L = 5   and   σ_ℓ = S_L = 2 ,

          where μ_ℓ and σ_ℓ are respectively the mean and the standard deviation
          of the parent normal PDF N(μ_ℓ, σ_ℓ; ℓ) assumed for the given sample.
          The parameters μ_ℓ and σ_ℓ will be used for the required transformations
          as follows:

          (i) The Gaussian distribution G(σ_ε; ε), where σ_ε = σ_ℓ = 2, has an
              argument ε obtained from equation (4-10) as:

                  ε_i = ℓ_i - μ_ℓ = ℓ_i - 5 ,   i = 1, 2, ..., 10 .

              Hence the transformed sample that has a Gaussian PDF is
              E = (ε_i), i = 1, 2, ..., 10, i.e.:

                  E ≐ (-3, 2, 1, -1, -3, 2, -1, 3, 1, -1) .



          (ii) The standard normal distribution N(t) has an argument t obtained
               from equation (4-13) as:

                   t_i = (ℓ_i - μ_ℓ)/σ_ℓ = (ℓ_i - 5)/2 ,   i = 1, 2, ..., 10 .

               Hence, the transformed sample that has a standard normal PDF is
               T = (t_i), i = 1, 2, ..., 10, i.e.:

                   T ≐ (-1.5, 1, 0.5, -0.5, -1.5, 1, -0.5, 1.5, 0.5, -0.5) .
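The two transformations of example 4.5 can be verified in a few lines of code. The
sketch below (again only an illustration, with variable names of our own choosing)
computes M_L and S_L directly from the sample and applies equations (4-10) and (4-13).

    sample = [2, 7, 6, 4, 2, 7, 4, 8, 6, 4]
    n = len(sample)

    mean = sum(sample) / n                              # M_L = 5
    var = sum((x - mean) ** 2 for x in sample) / n      # S_L^2 = 4
    std = var ** 0.5                                    # S_L = 2

    gaussian = [x - mean for x in sample]               # epsilon_i = l_i - mu_l
    standard = [(x - mean) / std for x in sample]       # t_i = (l_i - mu_l)/sigma_l

    print(gaussian)   # [-3.0, 2.0, 1.0, -1.0, -3.0, 2.0, -1.0, 3.0, 1.0, -1.0]
    print(standard)   # [-1.5, 1.0, 0.5, -0.5, -1.5, 1.0, -0.5, 1.5, 0.5, -0.5]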

4.8 Residuals, Corrections and Discrepancies

As we have seen, we are not able to compute the unknown
value ℓ' or μ_ℓ. All we can get is an estimate ℓ̄ for it from the
following equation

    ℓ̄ = M_L = ℓ' + M_ε = ℓ' + ε̄ *)                                           (4-24)

and hope that ε̄, in accordance with the basic postulate of the error-
theory, will really go to zero.

The residual r_i is defined as the difference between the
observation ℓ_i and the sample mean ℓ̄, i.e.

    r_i = ℓ_i - ℓ̄ .                                                          (4-25)

*) From now on, we shall use the symbol ℓ̄ for the mean M_L of the sample
   L. The "bar" above the symbol will indicate the sample mean to make
   the notation simpler.

Residuals with inverted signs are usually called corrections. It should be noted
that a residual, as defined above, is a uniquely determined value and not
a variable. The observed value ℓ_i is fixed and so is the mean ℓ̄ for the
particular sample. In other words, for a given sample, the residuals can
be computed in one way only. Note that the differences (ℓ_i - ℓ̄) = r_i are
called residuals and not errors, because errors are defined as ε_i = (ℓ_i - μ_ℓ)
and μ_ℓ may be different from ℓ̄.

In practice, one often hears talk about "minimized residuals",
"variable residuals", etc., which is not strictly correct. If one wants
to regard the "residuals as variables" the problem has to be stated differ-
ently. The difference v_i between the observed value ℓ_i and any arbitrarily
assumed (or computed) value ℓ⁰, i.e.

    v_i = ℓ_i - ℓ⁰ ,                                                         (4.26)

should be called discrepancy, or misclosure, to distinguish it from the
residual. These discrepancies are obviously linear functions of ℓ⁰; their
values vary with the choice of ℓ⁰. Hence one can talk about "minimization
of discrepancies", "variation of discrepancies", etc. Evidently, residuals
and discrepancies are very often mixed up in practice.

At this point it is worthwhile to mention yet another pair of
formulae for computing the sample mean ℓ̄ and the sample variance S_L². Such
simplified formulae facilitate the computations, especially for large samples

whose elements have large numerical values. The development of these
formulae is done analogically to the formulation of equations (4.20),
(4.22), (4.25) and (4.26). Here we state only the results, and the
elaboration is left to the student.

    ℓ̄ = ℓ⁰ + v̄ ,                                                             (4.27)
and
    S_L² = (1/n) Σ_{i=1}^{n} r_i² ,                                          (4.28)

where ℓ⁰ is an arbitrarily chosen value, usually close to ℓ̄;

    v̄ = (1/n) Σ_{i=1}^{n} v_i ,                                              (4.29)
and
    r_i = ℓ_i - ℓ̄ = v_i - v̄ .                                                (4.30)

Example 4.6: The second column of the following table is a sample of 10
             observations of the same distance. It is required to compute
             the sample mean and variance using the simplified formulae
             given in this section.

             We take ℓ⁰ = 972.00 m,

                 v̄ = (1/10) Σ_{i=1}^{10} v_i = (1/10)(10.50) = 1.05 m ,

                 ℓ̄ = ℓ⁰ + v̄ = 972.00 + 1.05 = 973.05 m ,

                 MSE = S_L² = (1/10) Σ_{i=1}^{10} r_i² = (1/10)(0.5730) = 0.0573 m² ,

             and RMS = S_L = 0.24 m.


One of the checks on the computations is that Σ_{i=1}^{n} r_i = 0,
see the fourth column of the given table.

  No.    ℓ_i (m)    v_i = ℓ_i - ℓ⁰ (m)    r_i = ℓ_i - ℓ̄ = v_i - v̄ (m)    r_i² · 10⁴ (m²)
  -----------------------------------------------------------------------------------
   1     972.89          0.89                     -0.16                        256
   2     973.46          1.46                      0.41                       1681
   3     973.04          1.04                     -0.01                          1
   4     972.73          0.73                     -0.32                       1024
   5     972.63          0.63                     -0.42                       1764
   6     973.01          1.01                     -0.04                         16
   7     973.22          1.22                      0.17                        289
   8     973.10          1.10                      0.05                         25
   9     973.30          1.30                      0.25                        625
  10     973.12          1.12                      0.07                         49
  -----------------------------------------------------------------------------------
   Σ                    10.50              -0.95 + 0.95 = 0.00                 5730
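The computation of example 4.6 with the simplified formulae (4.27)-(4.30) can be
scripted directly; the following Python sketch (added for illustration only)
reproduces the tabulated values.

    observations = [972.89, 973.46, 973.04, 972.73, 972.63,
                    973.01, 973.22, 973.10, 973.30, 973.12]      # metres
    l0 = 972.00                                                  # arbitrarily chosen value

    v = [li - l0 for li in observations]                         # discrepancies v_i
    v_bar = sum(v) / len(v)                                      # 1.05 m   (eq. 4.29)
    l_bar = l0 + v_bar                                           # 973.05 m (eq. 4.27)

    r = [vi - v_bar for vi in v]                                 # residuals r_i (eq. 4.30)
    mse = sum(ri ** 2 for ri in r) / len(r)                      # 0.0573 m^2 (eq. 4.28)
    rms = mse ** 0.5                                             # 0.24 m

    print(round(l_bar, 2), round(mse, 4), round(rms, 2))
    print(round(sum(r), 10))                                     # check: ~0 (sum of residuals)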

4.9 Other Possibilities Regarding the Postulated PDF

The normal PDF (or its relatives) are by no means the only bell-

shaped PDF's that can be postulated. Under different assumptions, one can

derive a whole multitude of bell-shaped curves. Generally, they would

contain more than two parameters, which is an advantage from the point of

view of fitting them to any experimental PDF. In other words the additional

parameters provide more flexibility. On the other hand, the computations


113

with such PDF's are more troublesome. In this context let us just mention

that some recent attempts have been made to design a family of PDF's that

are more peaked than the normal PDF in the middle. Such PDF's are called

"Leptokurtic". This more pronounced peakedness is a feature that quite

a few scholars claim to have spotted in the majority of observational

samples. We shall have to wait for any definite word in this domain for

some time.

Hence, the normal is still the most popular PDF and likely to remain

so because it is relatively simple and contains the least possible number

of parameters - the mean and the standard deviation.

4.10 Other Measures of Dispersion

So far, we have dealt with two measures of dispersion of a sample

namely: the root mean square error (RMS) mentioned in section 4.7, and
the range (Ra) mentioned in section 3.1.5. Besides the RMS and the range

of a sample the following measures of dispersion (spread) are often used.

The average or mean error a_e of the sample L is defined as

    a_e = (1/n) Σ_{i=1}^{n} |ℓ_i - ℓ̄| = (1/n) Σ_{i=1}^{n} |r_i| ,            (4.31)

which is the mean of the absolute values of the residuals.

The most probable error p_e, of the sample L, is defined as the
error for which:

    P(|r| ≤ p_e) = P(|r| > p_e) = 0.50 ,                                     (4.32)

which means that there is 50% probability that the residual is smaller and
50% probability that the residual is larger than p_e.

The most probable error of a random sample can be computed by constructing
the CDF of the corresponding absolute values of the sample residuals, and
taking the value of r which corresponds to the CDF = 0.5 as the value of p_e.

Both a_e and p_e can be defined for the continuous distributions as
well. For instance, by considering the normal PDF, N(μ_x, σ_x; x), we can
write:

    a_e = ∫_{-∞}^{∞} |x - μ_x| φ(x) dx

        = [1/(σ_x √(2π))] ∫_{-∞}^{∞} |x - μ_x| exp(-(x - μ_x)²/(2σ_x²)) dx . (4.33)

Similarly for p_e, by taking the symmetry of the normal curve into account,
we can write:

    P(x ≤ μ_x - p_e) = Ψ_N(μ_x - p_e)

        = [1/(σ_x √(2π))] ∫_{-∞}^{μ_x - p_e} exp(-(x - μ_x)²/(2σ_x²)) dx = 0.25   (4.34)

and

    P(x ≤ μ_x + p_e) = Ψ_N(μ_x + p_e) = 0.75 ,                               (4.35)

where Ψ_N is the normal CDF.

It can be shown for the normal PDF N(μ_x, σ_x; x) that σ_x, a_e and p_e
are related to each other by the following approximate relations:

    a_e ≐ 0.80 σ_x ,   p_e ≐ 0.67 σ_x ,
or
    σ_x : a_e : p_e ≐ 1.0 : 0.80 : 0.67 .                                    (4.36)

The relative or proportional error r_e, of the sample L, is defined
as the ratio between the sample RMS and the sample mean, i.e.

    r_e = S_L / ℓ̄ .                                                          (4.37)

In practice, the relative error is usually used to describe the uncertainty
of the result, i.e. the sample mean. In that case, the relative error is
defined as:

    r_e = S_ℓ̄ / ℓ̄ ,                                                          (4.38)

where S_ℓ̄ is the standard deviation of the mean ℓ̄ and will be derived later
in Chapter 6. In this respect, one often hears expressions like "propor-
tional accuracy 3 ppm (parts per million)", which simply means that the
relative error is 3/10⁶ = 3 · 10⁻⁶. It should be noted that unlike the
other measures of dispersion, the relative error is unitless.

The idea of the confidence intervals is based on the assumption
of normality of the sample, i.e. the postulated parent normal PDF
N(ℓ̄, S_L; ℓ) for the random sample L. It is very common to represent the
sample L by its mean ℓ̄ and its standard deviation S_L as

    [ℓ̄ - S_L ≤ ℓ ≤ ℓ̄ + S_L]   or   [ℓ̄ ± S_L]

and refer to it as the "68% confidence interval" of L. This is based on
the fact that the probability P(μ_ℓ - σ_ℓ ≤ ℓ ≤ μ_ℓ + σ_ℓ) is approximately 0.68 for

the normal PDF (see section 4.5).

Similarly, one can talk about the "95% confidence interval",
the "99% confidence interval", etc. In general, the confidence interval
of ℓ is expressed as:

    [ℓ̄ - K S_L ≤ ℓ ≤ ℓ̄ + K S_L] ,                                            (4.40)

where K is determined in such a way as to make

    P(μ_ℓ - Kσ_ℓ ≤ ℓ ≤ μ_ℓ + Kσ_ℓ)   equal to 0.95, 0.99, etc.

The values (ℓ̄ - K S_L) and (ℓ̄ + K S_L) are called the lower and the
upper confidence limits.

Example 4.7: Let us compute the average error, the relative error and the
             95% confidence interval for the sample of observations L
             given in example 4.6.

             The average error is computed using equation (4.31) and the
             fourth column of the given table in example 4.6 as:

                 a_e = (1/10) Σ_{i=1}^{10} |r_i| = (1/10)(1.90) = 0.19 m .

             The relative error of the sample is computed from equation
             (4.37) and the results obtained in example 4.6 as:

                 r_e = S_L / ℓ̄ = 0.24 / 973.05 ≐ 247 ppm .

             The 95% confidence interval of ℓ is

                 [ℓ̄ - K S_L ≤ ℓ ≤ ℓ̄ + K S_L] ,

             where the number K is computed so that

                 P(μ_ℓ - Kσ_ℓ ≤ ℓ ≤ μ_ℓ + Kσ_ℓ) = 0.95 .

             This is identical to the probability P(-K ≤ t ≤ K) obtained
             from the standard normal tables (see example 4.3, the last
             case). Hence we can write:

                 P(-K ≤ t ≤ K) = P(t ≤ K) - P(t ≤ -K) = 0.95 ,

             from which we get

                 P(t ≤ K) = 0.975 .

             Using the table for the standard normal variable of Appendix
             II-B we get:

                 K = 1.96 .

             (In practice K = 2 is usually used for the 95% confidence
             interval.) The 95% confidence interval of ℓ then becomes

                 [973.05 - 1.96(0.24) ≤ ℓ ≤ 973.05 + 1.96(0.24)] ,

             that is:

                 [972.58 ≤ ℓ ≤ 973.52] m

             or

                 [973.05 ± 0.47] m .
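A short continuation of the earlier sketch (again purely illustrative) reproduces the
three measures of dispersion computed in example 4.7; the factor 1.96 is taken as given.

    observations = [972.89, 973.46, 973.04, 972.73, 972.63,
                    973.01, 973.22, 973.10, 973.30, 973.12]      # metres
    n = len(observations)
    l_bar = sum(observations) / n
    residuals = [li - l_bar for li in observations]
    rms = (sum(ri ** 2 for ri in residuals) / n) ** 0.5

    a_e = sum(abs(ri) for ri in residuals) / n                   # average error, eq. (4.31)
    r_e = rms / l_bar                                            # relative error, eq. (4.37)
    K = 1.96                                                     # 95% quantile factor
    ci = (l_bar - K * rms, l_bar + K * rms)                      # eq. (4.40), S_L for sigma

    print(round(a_e, 2), round(r_e * 1e6), "ppm")   # 0.19 m, ~246 ppm (247 with S_L = 0.24)
    print(tuple(round(c, 2) for c in ci))           # (972.58, 973.52) m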

Example 4.8: Given a random variable x assumed to have a normal distri-
             bution N(35, 4; x), compute the most probable error.

             From the assumed PDF we have: μ_x = 35 and σ_x = 4.

             The most probable error p_e is computed so that

                 P(μ_x - p_e ≤ x ≤ μ_x + p_e) = P(-t_p ≤ t ≤ t_p) = 0.50   (Figure 4.7a),

             where t_p = p_e / σ_x.

             The above probability statement can be rewritten as (equation
             (4.35)):

                 P(x ≤ μ_x + p_e) = P(t ≤ t_p) = 0.75   (Figure 4.7b).

             From the table in Appendix II-B, we obtain t_p = 0.675
             corresponding to P = 0.75. Hence,

                 p_e = σ_x t_p = 4(0.675) = 2.7 .

             Note that in the second case of example 4.3, the value 3.375
             is nothing else but the most probable error of the given random
             variable h.
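The tabulated factor 0.675 can be checked numerically; the sketch below (ours,
illustrative only) locates the 0.75 quantile of the standard normal CDF by bisection
and scales it by σ_x, reproducing the results of examples 4.8 and 4.3.

    import math

    def std_normal_cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    def most_probable_error(sigma):
        """p_e such that P(|x - mu| <= p_e) = 0.5, i.e. sigma * t_p with CDF(t_p) = 0.75."""
        lo, hi = 0.0, 3.0
        for _ in range(60):                    # bisection for the 0.75 quantile
            mid = 0.5 * (lo + hi)
            if std_normal_cdf(mid) < 0.75:
                lo = mid
            else:
                hi = mid
        return sigma * 0.5 * (lo + hi)

    print(round(most_probable_error(4), 2))    # 2.7   (example 4.8)
    print(round(most_probable_error(5), 3))    # 3.372 ~ 3.375 with the tabulated 0.675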

4.11 Exercise 4

1. Prove that the Gaussian PDF given by equation (4.4) has two points of
   inflection at abscissas ± √C/2.

2. For the Gaussian PDF given by equation (4.8), determine approximately
   the probabilities P(-2σ_ε < ε < 2σ_ε) and P(-3σ_ε < ε < 3σ_ε)
   by integrating the PDF, then check your results by using the standard
   normal tables.

3. Prove by direct evaluation that the standard normal PDF has a standard
   deviation equal to one.

4. Show that the standard deviation σ, the average error a_e and the most
   probable error p_e of the normal PDF satisfy the following approximate
   relations:

       σ : a_e : p_e ≐ 1.0 : 0.80 : 0.67 .

5. Determine the average error, the most probable error, the relative
   error and the 90% confidence interval of the random sample given in
   the second problem of exercise 3, section 3.4.

6. Assume that the sample H = (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5) is
   hypothesized (postulated) to have a Gaussian distribution. Transform
   this sample so that the transformed (new) sample will have:

   (i)  Normal distribution with mean equal to 10.
   (ii) Standard normal distribution.

7. Given a random variable x distributed as N(25, 10; x), determine the
   following probabilities:

   (i)   P(x ≤ 28.5),          (ii)  P(x ≥ 22.5),
   (iii) P(x ≤ 21.5),          (iv)  P(16.75 < x < 23.82),
   (v)   P(|x - 25| < 1.25).

8. For the random variable in the previous problem, determine the values
   Z_i such that

   (i)   P(x ≤ Z_1) = 0.65,    (ii)  P(x ≤ Z_2) = 0.025,
   (iii) P(x ≤ Z_3) = 0.33,    (iv)  P(|x - 25| ≤ Z_4) = 0.33,
   (v)   P(|x - 25| ≥ Z_5) = 0.50.
9.  [Figure: tower CD of height h observed from the horizontal baseline AB]

The above figure shows a surveying technique to determine the height h
of a tower CD, which cannot be measured directly. The observed quantities
are:

    ℓ     = the horizontal distance AB,
    α, β  = the horizontal angles at A and B,
    θ     = the vertical angle of D at B.
The field results of these observations are given in the following table:
                          Field Observations

      ℓ (m)        α              β              θ
      45.63    65° 32' 03"    37° 13' 08"    42° 53' 15"
        .55        32  04         13  11         52  30
        .59        31  59         13  10         53  00
        .65        32  01         13  13         51  00
        .58        31  58         13  06         52  15
                                  13  12         52  45
                                                  51  15
                                                  53  00
                                                  51  45
                                                  52  15

Average temperature during the observation time was T = 20° F.

The following information was given to the observer:

(i)  The micrometer of the vertical circle of the used theodolite was not
     adjusted to read 00' 00" when the corresponding bubble axis is
     horizontal; it reads -(00' 30").

(ii) The nominal length of the used tape is 20 m at the calibration temper-
     ature T_0 = 60° F, and the coefficient of expansion of the tape material
     is γ = 5 · 10⁻⁵ per 1° F.
Required:

(i)   Compute the estimated values for the quantities ℓ, α, β and θ.

(ii)  For each of the above observed quantities compute its standard de-
      viation and its average error.

(iii) Compare the precision of these observed quantities (by comparing
      the respective relative errors).

(iv)  Assume that each of these observed quantities has a postulated normal
      parent PDF, and construct the 95% confidence interval for each quantity.

(v)   Compute the estimated value of the tower's height h to the nearest
      centimetre.

5. LEAST-SQUARES PRINCIPLE

5.1 The Sample Mean as

"The Least 89-uares Estimator"

One may now ask oneself a hypothetical question: given the
sample L = (ℓ_i), i = 1, 2, ..., n, what is the value ℓ⁰ that makes the
summation of the squares of the discrepancies

    v_i = ℓ_i - ℓ⁰ ,   i = 1, 2, ..., n,                                     (5-1)

the smallest (i.e. minimum)?

The above question may be stated more precisely as follows:
Defining a "new variance" S*² as

    S*² = (1/n) Σ_{i=1}^{n} (ℓ_i - ℓ⁰)² = (1/n) Σ_{i=1}^{n} v_i² ,           (5-2)

find the value ℓ⁰ that is going to give us the smallest (minimum)
value of S*².

Obviously, such a question can be answered mathematically.
From equation (5-2), we notice that S*² is a function of ℓ⁰, which is
the only free variable here, and can be written as

    S*² = f(ℓ⁰) .                                                            (5-3)

We know that a necessary condition for a minimum of f(ℓ⁰) is that its
derivative with respect to ℓ⁰ vanishes. Hence, by differentiating
equation (5-2) with respect to ℓ⁰ and equating it to zero, we get:

    ∂S*²/∂ℓ⁰ = (1/n) Σ_{i=1}^{n} [2(ℓ_i - ℓ⁰)(-1)] = -(2/n) Σ_{i=1}^{n} (ℓ_i - ℓ⁰) = 0 ,

that is:

    Σ_{i=1}^{n} (ℓ_i - ℓ⁰) = 0 .

The above equation can be rewritten as:

    Σ_{i=1}^{n} ℓ_i = Σ_{i=1}^{n} ℓ⁰ = nℓ⁰ ,

which yields

    ℓ⁰ = (1/n) Σ_{i=1}^{n} ℓ_i = ℓ̄ .                                         (5-4)

The result (5-4) is nothing else but the "sample mean" ℓ̄ again. In
other words, the mean of the sample is the value that minimizes the
sum of the squares of the discrepancies, making them equal to the
residuals (see section 4.8).

This is the reason why the mean ℓ̄ is sometimes called the
least-squares estimate (estimator) of ℓ, i.e. of μ_ℓ; the name being
derived from the process of minimization of the squares of the discre-
pancies. We also notice that ℓ̄ minimizes the variance of the sample if
we want to regard the variance as a function of the mean.

Note that the above property of the mean is completely indep-
endent of the PDF of the sample. This means that the sample mean ℓ̄ is
always "the minimum variance estimator of ℓ" whatever the PDF may be.

5.2 The Sample Mean as
    "The Maximum Probability Estimator"

Let us take our sample L again, and let us postulate an underlying
parent PDF to be normal (see section 4.5) with a mean μ_ℓ = ℓ⁰ and a
variance σ_ℓ² given by:

    σ_ℓ² = S*² = (1/n) Σ_{i=1}^{n} (ℓ_i - ℓ⁰)² .                             (5-5)

We say that the normal PDF, N(μ_ℓ, σ_ℓ; ℓ) ≡ N(ℓ⁰, S*; ℓ), is the most
probable underlying PDF for our sample L (L = (ℓ_i), i = 1, 2, ..., n)
if the combined probability of simultaneous occurrence of n elements,
that have the normal distribution N(ℓ⁰, S*; ℓ), at the same places as L is
maximum. In other words, we ask that:

    P[(ℓ_i ≤ ℓ ≤ ℓ_i + δℓ_i), i = 1, 2, ..., n] = Π_{i=1}^{n} N(ℓ⁰, S*; ℓ_i) δℓ_i   (5.6)

be maximum with respect to the existing free parameters. By examining equation
(5.6), we find that the only free parameter is ℓ⁰ (note that S* is a function
of ℓ⁰), and hence we can write the above combined probability as a function
of ℓ⁰ as follows:

    P[(ℓ_i ≤ ℓ ≤ ℓ_i + δℓ_i), i = 1, 2, ..., n] = λ(ℓ⁰) .                    (5.7)

Note that the δℓ's are some values depending on L and therefore are determined
uniquely by L.

We shall show that the value of ℓ⁰ satisfying the above condition
is (for the postulated normal PDF) again the value rendering the smallest
value of S*. We can write:

    max_{ℓ⁰∈R} [λ(ℓ⁰)] = max_{ℓ⁰∈R} [Π_{i=1}^{n} N(ℓ⁰, S*; ℓ_i) δℓ_i]

                       = max_{ℓ⁰∈R} [Π_{i=1}^{n} (1/(S*√(2π))) exp(-(ℓ_i - ℓ⁰)²/(2S*²)) δℓ_i]

                       = max_{ℓ⁰∈R} [(1/(S*√(2π)))ⁿ Π_{i=1}^{n} exp(-(ℓ_i - ℓ⁰)²/(2S*²)) δℓ_i] .   (5.8)

Here Π_{i=1}^{n} δℓ_i is determined by L, and hence does not lend itself to maximiza-
tion. It thus can be regarded as a constant, i.e.

    max_{ℓ⁰∈R} [λ(ℓ⁰)] = max_{ℓ⁰∈R} [(1/(S*√(2π)))ⁿ Π_{i=1}^{n} exp(-(ℓ_i - ℓ⁰)²/(2S*²))] .        (5.9)

Let us denote the second term in the RHS of equation (5.9) by Q, which can
be expressed as:

    Q = Π_{i=1}^{n} exp(-x_i) ,   where   x_i = (ℓ_i - ℓ⁰)²/(2S*²) .         (5.10)

This implies that:

    ln Q = ln [Π_{i=1}^{n} exp(-x_i)] = Σ_{i=1}^{n} ln(exp(-x_i)) ,
or
    Q = exp(Σ_{i=1}^{n} (-x_i)) .                                            (5.11)

From equations (5-9), (5-10) and (5-11) we get:

    Π_{i=1}^{n} exp(-(ℓ_i - ℓ⁰)²/(2S*²)) = exp[-(1/(2S*²)) Σ_{i=1}^{n} (ℓ_i - ℓ⁰)²] .   (5-12)

The condition (5-9) can then be rewritten as:

    max_{ℓ⁰∈R} [λ(ℓ⁰)] = max_{ℓ⁰∈R} [(1/(S*√(2π)))ⁿ exp(-(1/(2S*²)) Σ_{i=1}^{n} (ℓ_i - ℓ⁰)²)] .   (5-13)

From equation (5-5), we have:

    Σ_{i=1}^{n} (ℓ_i - ℓ⁰)² = n S*² .

Hence by substituting this value into equation (5-13) we get:

    max_{ℓ⁰∈R} [λ(ℓ⁰)] = max_{ℓ⁰∈R} [(1/(S*√(2π)))ⁿ exp(-n/2)] .             (5-14)

Since the only quantity in equation (5-14) that depends on ℓ⁰ is S*, we
can write:

    max_{ℓ⁰∈R} [λ(ℓ⁰)] = max_{ℓ⁰∈R} [(1/S*)ⁿ] = min_{ℓ⁰∈R} [(S*)ⁿ] .         (5-15)

Because S* is a non-negative (quadratic) function of ℓ⁰, the minimum of
(S*)ⁿ will be attained for the same argument as the minimum of S* (see
Figure 5-1).

[Figure 5-1]

Finally, our original condition (equation (5-9)) can be restated as:

    min_{ℓ⁰∈R} [S*] ,                                                        (5-16)

which implies that

    ∂S*/∂ℓ⁰ = 0   or equivalently   ∂S*²/∂ℓ⁰ = 0 ,

that is:

    (∂/∂ℓ⁰) Σ_{i=1}^{n} v_i² = 0 .                                           (5-17)

Obviously, the condition (5-17) is the same condition as that of the
"minimum variance" discussed in the previous section, and again we have
ℓ⁰ = ℓ̄.

We have thus shown that under the postulate for the underlying
PDF, the mean ℓ̄ of the sample L is the maximum probability estimator for
μ_ℓ. As a matter of fact, we would find that the requirement of maximum
probability leads to the condition

    Σ_{i=1}^{n} v_i² = minimum                                               (5.18)

for quite a large family of PDF's, in particular the symmetrical PDF's.
If one assumes the additional properties of the random sample as
mentioned in 3.2.4, then additional features of the sample mean can
be shown. This again is considered beyond the scope of this course.
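A numerical illustration of this section (added by us, not part of the derivation):
evaluating λ(ℓ⁰) of equation (5-8) for trial values of ℓ⁰, with S* recomputed from
equation (5-5) each time, shows the maximum at the sample mean.

    import math

    sample = [2, 7, 6, 4, 2, 7, 4, 8, 6, 4]
    n = len(sample)

    def lam(l0):
        s_star2 = sum((li - l0) ** 2 for li in sample) / n           # eq. (5-5)
        s_star = math.sqrt(s_star2)
        q = math.exp(-sum((li - l0) ** 2 for li in sample) / (2 * s_star2))
        return (1.0 / (s_star * math.sqrt(2 * math.pi))) ** n * q    # eq. (5-8), delta_l dropped

    candidates = [3 + k * 0.25 for k in range(17)]                    # trial values 3.0 ... 7.0
    best = max(candidates, key=lam)
    print(best, sum(sample) / n)                                      # 5.0  5.0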

5.3 Least-Squares Principle


We have shown that the sample mean renders always the minimum sum
of squares of discrepancies and that this property is required, for a large
family of postulated PDF's, to yield the maximum probability for the underlying

PDF. Hence the sample mean ℓ̄, which automatically satisfies the condition
of the least sum of squares of discrepancies, is at the same time the
most probable value of the mean μ_ℓ of the underlying PDF under the con-
dition that the underlying PDF is symmetrical. This is the necessary
and sufficient condition for the sample mean to be both the least squares
and the maximum probability estimator, i.e. for both estimators to be
equivalent.

The whole development we have gone through does not say any-
thing about the most probable value of the standard deviation σ_ℓ of the
underlying PDF*). σ_ℓ has to be postulated according to equation (4.23).

The idea of minimizing the sum of squares of the discrepancies
is known as the least-squares principle, and has got a fundamental
importance in the adjustment calculus. We shall show later how the
same principle is used for all kinds of estimates (not only the mean of
a sample) and how it is developed into the least-squares method. However,
the basic limitations of the least-squares principle should be borne in
mind, namely:

(i)  A normal PDF (or some other symmetrical PDF) is postulated.

(ii) The least-squares principle does not tell anything about the
     best estimator of σ_ℓ with respect to the mean μ_ℓ of the
     postulated PDF.

*) Some properties of the standard deviation S can be revealed if the
   additional properties of the random sample are assumed (see 3.2.4).

5.4 Least-Squares Principle for Random Multivariate

So far, we have shown that the least-squares principle spells
out the equivalence between the sample mean ℓ̄ and the estimate for the
parent population mean μ_ℓ determined from the condition that the sum of
squares of the discrepancies be minimum. We have also shown that ℓ̄ is the
most probable estimate for μ_ℓ providing the parent population is postulated
to have a normal or any other symmetrical PDF. We shall show now that the
same principle is valid even for a random multisample if we postulate the
underlying PDF to be statistically independent (see Section 3.3.2).

Denoting the multisample by L and its components by Lʲ, j = 1,
2, ..., s, and remembering that each Lʲ is a sample on its own, we can
write:

    L = (L¹, L², ..., Lˢ) ,
                                                                             (5-19)
    Lʲ = (ℓ_1ʲ, ℓ_2ʲ, ..., ℓ_{n_j}ʲ) .

Assuming a particular value L_0 for the multisample L, where

    L_0 = (ℓ_0¹, ℓ_0², ..., ℓ_0ˢ) ∈ Rˢ                                       (5-20)

is a numerical vector (sequence of real numbers), the associated dis-
crepancies V, which can be regarded as a multisample as well, are:

    V = (V¹, V², ..., Vˢ) .                                                  (5-21)

Here, each Vʲ, j = 1, 2, ..., s, is a sample of discrepancies on its own,
i.e.

    Vʲ = (v_1ʲ, v_2ʲ, ..., v_{n_j}ʲ) .                                       (5-22)

Making use of formula (3-52), we can write analogically to (5.2):

    S*_j² = (1/n_j) Σ_{k=1}^{n_j} (v_kʲ)² ,   j = 1, 2, ..., s.              (5-23)

The minimization of the variances, i.e. minimization of each E[(Vʲ)²],
is equivalent to the minimization of each S*_j², or as we usually write:

    (min_{L_0∈Rˢ} [E((Vʲ)²)], j = 1, 2, ..., s) ≡ min_{L_0∈Rˢ} [trace Σ_L *)],   (5-24)

where Σ_L is the variance-covariance matrix of the multisample L (see
section 3.3.6). By carrying out this operation, similar to section 5.1,
we will find that the vector

    L_0 = (ℓ̄¹, ℓ̄², ..., ℓ̄ˢ)                                                   (5-25)

satisfies the condition (5-24). On the other hand, the result (5-25)
is nothing else but the mean L̄ of the multisample, i.e.:

    L_0 = L̄ ∈ Rˢ .                                                           (5-26)

Postulating a normal PDF, N(ℓ_0ʲ, S*_j; ℓʲ), for each component Lʲ
of the multisample L, the multivariate PDF of the parent population can
be written as:

    φ(ℓ) = Π_{j=1}^{s} N(ℓ_0ʲ, S*_j; ℓʲ)

         = Π_{j=1}^{s} [1/(S*_j √(2π))] exp[-(ℓʲ - ℓ_0ʲ)²/(2S*_j²)] ,        (5-27)

where ℓʲ is the random variable having mean ℓ_0ʲ and standard deviation S*_j.
Following a similar procedure as in section 5.2, we end up again with
the discovery that the vector

    L_0 = L̄                                                                  (5-28)

*) Trace of a matrix is the sum of its diagonal elements.



maximizes the probability that the members of the parent population will
occur at the same places as the members of the multisample L.

Hence L̄ ∈ Rˢ is, under the above conditions,*) the
maximum probability estimator for the mean μ of the postulated parent
multivariate PDF, where

    μ = (μ_ℓ¹, μ_ℓ², ..., μ_ℓˢ) .                                            (5-29)

5.5 Exercise 5

1. Prove that the mean μ of a continuous PDF, φ(x), defined as:

       μ = ∫_{-∞}^{∞} x φ(x) dx ,

   minimizes the PDF variance σ², defined as:

       σ² = ∫_{-∞}^{∞} (x - μ)² φ(x) dx .

2. Prove that ∂S*/∂ℓ⁰ = 0 is the necessary and sufficient condition for the
   rectangular (uniform) PDF, R(ℓ⁰, S*; ℓ), to be the most probable
   underlying PDF for a sample L with mean ℓ̄ and variance S_L². Note
   that the analytic expression for the uniform PDF is given in example
   3.17, section 3.2.5.

3. Prove that the same holds for the triangular
   PDF, T(ℓ⁰, S*; ℓ), using its analytic expression given in example
   3.18, section 3.2.5.

*) It can be shown that L̄ is the maximum probability estimator of μ even when
   we postulate a statistically dependent multidimensional PDF from a certain
   family of PDF's.

6. FUNDAMENTALS OF ADJUSTMENT CALCULUS

6.1 Primary and Derived Random Samples

So far, we have been dealing with random samples (multisamples)

that had been obtained through some measurement or through any other data

collecting process. These samples may all be regarded as primary or original

random samples (multisamples).

In practice, we are often interested in other samples that would

be derived from the primary samples by means of a computation of some kind.

Such samples may be called derived random samples (multisamples).

From the philosophical point of view, there is not much difference

between these two, since even the "primary" samples may be regarded as

derived from the samples of physical influences or physical happenings.

However, it is necessary to distinguish between them to be able to speak

about the transition from one to the other.

6.2 Statistical Transformation, Mathematical Model

The transition from a primary to a derived sample (multisample),
along with the associated variances and covariances, will be called statistical
transformation. We have already met two examples of such transformation,
although applied to random variables rather than samples (see sections 4.5 and
4.6), namely the transformation of the Gaussian PDF to the normal and to
the standard normal PDF's, respectively.

Such statistical transformation may not always be as simple as

in the above two cases. As a matter of fact, it may not be even possible

to derive the sample at all from the primary sample which is usually

the case with multisamples. In other words, it might not be possible to

express the derived sample explicitly in terms of the primary sample.

Let us consider a primary multisample L = (Li), i = 1, 2J . . . ,s,

that has s constituents. Each constituent Li = (~ki), k = 1, 2,

is a random sample on its own and represents a distinct physical quantity


i
~i (i.e. the observations ~k , k = 1, 2, . . • ,ni are all representing the

same physical quantity t. ). Now, we may be interested in deriving a multi-


~

sample X having n constituents, ie.

X : : : (Xj) ' J. =1 ' 2 >. • • ,n'


from the original multisample L; noting again that each constituent Xj

represents a distinct physical quantity xj, j = 1, 2, . . . ,n. The formulae

(relationships) relating the physical quantities ~ and

x, where

and
(6.1)*
X = (xl ' x2 ' . • . 'xn )
are called the mathematical model for the statistical transformation; and

is usually expressed as:

l F ( .Q. , x) =0 J (6.2)

where F denotes the vector of functions f., i


~
= 1, 2, ... ' r(having r
components)that can be established between ~ and x.

To be able to derive x from ~, the mathematical model (6.2) should

be formulated as:

x =F (t), (6.3)

* Note that ~ and x are nothing else but the multivariates corresponding
to the multisamples L and X respectively.

which gives x as an explicit function of ℓ.

Example 6.1: After having measured the two perpendicular edges a and b
             of a rectangular desk (see Figure 6.1), suppose that we
             are interested in learning something about the length of
             the diagonal d, and about the surface area α of this desk.
             In this case, the mathematical model will be written as:

                 x = F(ℓ) ,   where

                 x = (x_1, x_2) = (d, α)   and   ℓ = (ℓ_1, ℓ_2) = (a, b) .

             To derive the components of x from ℓ we write:

                 d = f_1(a, b) = √(a² + b²) ,
                 α = f_2(a, b) = ab .

             In vector notation, we can write:

                 x = [d, α]ᵀ = [√(a² + b²), ab]ᵀ = F(ℓ) .

             [Figure 6.1: the desk with edges a, b and diagonal d]
The possibility of carrying out the statistical transformation depends
basically on three factors:

(i)   complexity of the mathematical model, i.e., the possibility of expressing
      x explicitly in terms of ℓ (x = F(ℓ));

(ii)  "completeness" of the primary multisample L, i.e. whether all its con-
      stituents have the same number of elements in order to deduce the variance-
      covariance matrix Σ_L;

(iii) our willingness to match the individual s-tuples of elements from the
      primary multisample L with the n-tuples of elements from the derived
      multisample X, which creates much of a problem.



Particularly the last two factors are so troublesome that we
usually do not even try to carry out the transformation and put up with
some statistical estimates, i.e. representative values E(X) and Σ_X, for
the derived multisample instead. To do so, we first evaluate E(L) and
Σ_L for the primary multisample L, from which we then compute the statistical
estimates E(X) and Σ_X for X.

According to the basic postulate of the error theory, and to make
the subsequent development easier, we generally postulate at this stage the
PDF of the parent multivariate to the multisample L and assume

    E(L) = E*(ℓ),  Σ_L = Σ*_ℓ,  E(X) = E*(x),  and  Σ_X = Σ*_x ,             (6.4)

in very much the same way as we postulated

    ℓ̄ ≐ μ_ℓ   and   S_L ≐ σ_ℓ

for the univariate case as discussed in section 4.7. This postulate allows
us to work with continuous variables in the mathematical model and write it
as:

    F(L, X) = 0 ,                                                            (6.5)

understanding tacitly that each value X has its counterpart L.

From now on, we shall write L̄ for E(L), and X̂ for the statistical
estimate of X. Hence the mathematical model (6.5) becomes

    F(L̄, X̂) = 0 ,                                                            (6.6)

which consists of r functional relationships between L̄ and X̂.

From the point of view of the mathematical model F(L̄, X̂) = 0,
the statistical transformation can be either solvable (if s ≥ n) or unsolvable
(if s < n). If it is solvable then we may still have two distinctly different
cases:

(i)  either the model yields only one solution X̂ (when r = s = n) by
     using the usual mathematical tools, i.e., X̂ is uniquely derived from L̄;

(ii) or the mathematical model is overdetermined (when r, s > n) and cannot
     be resolved for X̂ at all by using the ordinary mathematical tools,
     since an infinite number of different solutions for X̂ can be found.

The first case we have met in example (6.1), where the determina-
tion of X̂ from L̄ does not present any problem from the statistical point of
view. The only problem is to obtain Σ_X from L̄ and Σ_L. This problem, known
as propagation of errors, will be the topic of the next section.

If the model is overdetermined, or as we often say, if there are
redundancies (redundant or surplus observations), then the problem of trans-
forming (L̄, Σ_L) → (X̂, Σ_X) constitutes the proper problem of adjustment.*)

6.3 Propagation of Errors

6.3.1 Propagation of Variance-Covariance Matrix, Covariance Law

The relationship between Σ_X and Σ_L for a mathematical model

    F(L̄, X̂) = 0

is known as the propagation of variance-covariance matrix. Such relationship
can be deduced explicitly only for explicit relations

    X̂ = F(L̄) .

To make things easier, let us deduce it first for one particular explicit
relation, namely the linear relation between X̂ and L̄, i.e.
* It has to be mentioned here that in practice we are in both cases working
  with Σ_L̄ and Σ_X̂, the variance-covariance matrices of L̄ and X̂, rather than
  Σ_L, Σ_X belonging to the samples L and X. The expressions for Σ_L̄, Σ_X̂ are
  derived in 6.4.4.

    X = B L + C ,                                                            (6.7)

where B is indeed an n by s matrix composed of known elements*). Note
that X is determined uniquely, as required. We want to establish the
transition

    Σ_L = E((L - L̄)(L - L̄)ᵀ)  →  Σ_X ,   where L̄ = E(L).                     (6.8)

We can write:

    Σ_X = E((X - E(X))(X - E(X))ᵀ) .                                         (6.9)

Here X = BL + C, and according to the postulate introduced in section 6.2
we can write:

    E(X) = E(BL + C) = B E(L) + C = B L̄ + C.

Hence

    Σ_X = E((BL + C - B L̄ - C)(BL + C - B L̄ - C)ᵀ)

        = E(B(L - L̄)(B(L - L̄))ᵀ)

        = B E((L - L̄)(L - L̄)ᵀ) Bᵀ = B Σ_L Bᵀ ,

i.e.
    Σ_X = B Σ_L Bᵀ .                                                         (6.10)

This formula (6.10) is known as the law of propagation of variance-
covariance matrix, or simply the covariance law.

*) This matrix B, which determines the linear relationship between X and
   L, is sometimes called the "design matrix", "the matrix of the coefficients"
   of the constituents of L in the linearized model, or simply the "coef-
   ficients matrix".

Example 6.2: Assume that the variance-covariance matrix of a given
             multisample L = (ℓ_1, ℓ_2, ℓ_3) was found to be

                 Σ_L = | 3  2  0 |
                       | 2  3  1 |
                       | 0  1  4 | .

             If a multisample X = (x_1, x_2) is to be derived from L
             according to the following relationships:

                 x_1 = ℓ_1 - 3ℓ_3 ,
                 x_2 = 2ℓ_1 + ℓ_2 ,

             determine the variance-covariance matrix Σ_X of X.

             It can be seen that the above relationships between the
             components of X and L are linear, and our mathematical
             model can be expressed as:

                   X   =   B    L
                 (2,1)   (2,3)(3,1)

             i.e.

                 | x_1 |   | 1  0  -3 |  | ℓ_1 |
                 |     | = |          |  | ℓ_2 |
                 | x_2 |   | 2  1   0 |  | ℓ_3 | .

             This indicates that the coefficients matrix B is given by:

                 B = | 1  0  -3 |
                     | 2  1   0 | .

             The variance-covariance matrix Σ_X of X is given by equation
             (6.10), i.e., in our case:

                  Σ_X  =   B   Σ_L   Bᵀ
                 (2,2)   (2,3)(3,3)(3,2)

                      = | 1  0  -3 |  | 3  2  0 |  |  1  2 |
                        | 2  1   0 |  | 2  3  1 |  |  0  1 |
                                      | 0  1  4 |  | -3  0 |

                      = | 3  -1  -12 |  |  1  2 |
                        | 8   7    1 |  |  0  1 |
                                        | -3  0 |

                      = | 39   5 |
                        |  5  23 | ,

             i.e.

                 Σ_X = | 39   5 |
                       |  5  23 | .
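The matrix products of example 6.2 can be verified with a few lines of code; the
sketch below (ours, with small helper functions) applies the covariance law (6.10)
directly.

    def mat_mult(A, B):
        """Plain nested-list matrix product."""
        return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    def transpose(A):
        return [list(row) for row in zip(*A)]

    Sigma_L = [[3, 2, 0],
               [2, 3, 1],
               [0, 1, 4]]
    B = [[1, 0, -3],
         [2, 1,  0]]

    Sigma_X = mat_mult(mat_mult(B, Sigma_L), transpose(B))   # covariance law (6.10)
    print(Sigma_X)                                           # [[39, 5], [5, 23]]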

Now we shall show that the propagation of variance-covariance
matrix can be deduced even for a more general case, namely the non-linear
relation between X̂ and L̄, i.e.

    X̂ = F(L̄) ,                                                               (6.11)

when F is a function with at least the first order derivative. Here we
have to adopt yet another approximation. We have to linearize the relation
(6.11) using, for instance, Taylor's series expansion around an approximate
value L⁰ for L:

    X = F(L⁰) + (dF/dL)|_{L = L⁰} (L - L⁰) + higher order terms,

where

    (dF/dL)|_{L = L⁰} (L - L⁰) = Σ_{i=1}^{s} (∂F/∂ℓ_i)|_{ℓ_i = ℓ_i⁰} (ℓ_i - ℓ_i⁰) .

Taking the first two terms only, which is permissible when the values of
the elements in Σ_L are much smaller than the values of the ℓ_i, we can write:

    X ≐ F(L⁰) + B (L - L⁰) ,                                                 (6.12)

where B is again an n by s matrix, but this time composed from all the partial
derivatives (∂x_i/∂ℓ_j)|_{ℓ_j = ℓ_j⁰} *). Applying the expectation operator,
and realizing that E(F(L⁰)) = F(L⁰) and E(L⁰) = L⁰ (because L⁰ is a
selected vector of constant values), we obtain:

    E(X) ≐ F(L⁰) + B (E(L) - L⁰) .                                           (6.13)

Subtracting (6.13) from (6.12) we get:

    X - E(X) ≐ B (L - E(L)) = B (L - L̄) ,                                    (6.14)

and we end up again with

    Σ_X = B Σ_L Bᵀ ,                                                         (6.15)

realizing that Σ_X = E((X - E(X))(X - E(X))ᵀ).

* Explicitly, if we have

      x_1 = x_1(ℓ_1, ℓ_2, ..., ℓ_s)
      x_2 = x_2(ℓ_1, ℓ_2, ..., ℓ_s)
      ...
      x_n = x_n(ℓ_1, ℓ_2, ..., ℓ_s) ,

  then the matrix B will take the form:

            | ∂x_1/∂ℓ_1   ∂x_1/∂ℓ_2   ...   ∂x_1/∂ℓ_s |
      B   = | ∂x_2/∂ℓ_1   ∂x_2/∂ℓ_2   ...   ∂x_2/∂ℓ_s |
    (n,s)   |    ...          ...               ...   |
            | ∂x_n/∂ℓ_1   ∂x_n/∂ℓ_2   ...   ∂x_n/∂ℓ_s | .

Hence the linear case may be regarded as one particular instance (special
case) of the more general explicit relation, yielding therefore the same
law for the propagation of variance-covariance matrix, i.e., the same
covariance law. It should be noted that the physical units of the individual
elements of both matrices B and Σ_L must be considered and selected in
such a way as to give the required units of the matrix Σ_X.

Example 6.3: Let us take again the example 6.1 and form the variance-
             covariance matrix Σ_X for the diagonal d and the area α of
             the desk in question. We have:

                 Σ_L = | S_a²   S_ab |
                       | S_ab   S_b² |

             and the model is non-linear, although explicit, i.e.

                 X = F(L) ,   or   (d, α) = F(a, b) .

             We have to linearize it as follows:

                 X = (d, α) = (d⁰, α⁰) + B [(a, b) - (a⁰, b⁰)] ,

             where (d⁰, α⁰) = F(a⁰, b⁰), and

                 B = | ∂d/∂a   ∂d/∂b |
                     | ∂α/∂a   ∂α/∂b | .

             Hence, the matrix B in this case takes the form:

                   B   = | a/d   b/d |
                 (2,2)   |  b     a  |

             and by applying the covariance law (equation (6.15)) we get:

                 Σ_X = | S_d²   S_dα |  =  B Σ_L Bᵀ
                       | S_αd   S_α² |

                     = | a/d   b/d |  | S_a²   S_ab |  | a/d   b |
                       |  b     a  |  | S_ab   S_b² |  | b/d   a | .

Example 6.4: Let us assume that the primary multisample L = (a, b) which
             we have dealt with in Examples 6.1 and 6.3 is given by:

                 L = {a, b} = {(128.1, 128.1, 128.2, 128.0, 128.1),
                               (62.5, 62.7, 62.6, 62.6, 62.5)} ,   in centimetres.

             Accordingly, the statistical estimate of the derived
             quantities will be

                 X̂ = | d̂ |  =  | [(ā)² + (b̄)²]^(1/2) |
                     | α̂ |     |        ā b̄          | ,

             where ā and b̄ are the estimates (means) of the two measured
             sides of the desk. From the given data we get

                 ā = 128.1 cm   and   b̄ = 62.58 cm.

             Hence

                 X̂ = |  142.57 cm  |
                     | 8016.50 cm² | .

             After computing the variance-covariance matrix Σ_L we get

                 Σ_L = | 0.004    0     |
                       |   0    0.0056  |  cm² ,

             which indicates that the constituents a and b are being taken
             as statistically independent.

             Evaluating the elements of the B matrix (as given in Example
             6.3) we get:

                 B = | ā/d̂   b̄/d̂ |  =  | 0.898   0.439 |
                     |  b̄     ā  |     | 62.58   128.1 | ,

             in which the elements of the first row are unitless, and
             of the second row are in cm.

             Finally Σ_X is computed as follows:

                 Σ_X = B Σ_L Bᵀ

                     = | 0.898   0.439 |  | 0.004    0     |  | 0.898   62.58 |
                       | 62.58   128.1 |  |   0    0.0056  |  | 0.439   128.1 |

                     = | 0.0043    0.5397 |                | cm²   cm³ |
                       | 0.5397  107.5627 | ,  with units  | cm³   cm⁴ | .

             Furthermore

                 S_d = √(0.0043) = 0.066 cm ,

                 S_α = √(107.5627) = 10.37 cm² .

6.3.2 Propagation of Errors, Uncorrelated Case

If X contains one component only, i.e. x, the matrix B in the formulae
(6.10) or (6.15) degenerates into a 1 by s matrix, i.e. into a row vector

    B = [B_1, B_2, ..., B_s] ,

and

    Σ_X = B Σ_L Bᵀ

becomes a quadratic form which has dimensions 1 by 1. Then

    S_X² = B Σ_L Bᵀ .                                                        (6.16)

If, moreover, L is assumed uncorrelated, we have

    Σ_L = diag (S_{ℓ_1}², S_{ℓ_2}², ..., S_{ℓ_s}²) ,                         (6.17)

which is a diagonal matrix, and we can write

    S_X² = Σ_{i=1}^{s} B_i² S_{ℓ_i}² .                                       (6.18)

This formula is known as the law of propagation of MSE's, or simply the
law of propagation of errors. The law of propagation of errors is hence
nothing else but a special case of the propagation of variance-covariance
matrix.

The law of propagation of errors has many applications in
surveying practice as well as in many other experimental sciences.

Example 6.5: In Figure 6.2, we assume a plane triangle in which
             the angles α and β, whose estimated values are:

                 ᾱ = 32° 15' 20",   with S_α = 4" ,
                 β̄ = 75° 43' 32",   with S_β = 3" ,   are observed.

             Also, assume that α and β are independent, i.e. S_αβ = 0.
             Let us estimate the third angle γ, along with its standard
             error S_γ, as follows:

             [Figure 6.2: plane triangle with angles α, β, γ]

                 γ̄ = 180° - (ᾱ + β̄) = 72° 01' 08" ,

                 S_γ² = (∂γ/∂α)² S_α² + (∂γ/∂β)² S_β²
                      = (-1)²(4)² + (-1)²(3)² = 16 + 9 = 25 ,

             that is:  S_γ = 5" .
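In code, the law of propagation of errors (6.18) applied to example 6.5 reads as
follows (an illustrative sketch only).

    # gamma = 180 deg - (alpha + beta); both partial derivatives equal -1
    s_alpha = 4.0                       # arc seconds
    s_beta = 3.0                        # arc seconds
    partials = [-1.0, -1.0]

    s_gamma2 = sum(b ** 2 * s ** 2 for b, s in zip(partials, [s_alpha, s_beta]))
    print(s_gamma2, s_gamma2 ** 0.5)    # 25.0  5.0 arc seconds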

Example 6.6: Figure 6.3 shows a levelling line between two bench marks
             A, C, with observed level differences h_i of the individual
             sections with lengths ℓ_i, i = 1, 2, ..., s. Assume that
             all the h_i's are uncorrelated and the MSE of h_i is propor-
             tional to ℓ_i, i.e. S_{h_i}² = k ℓ_i, where k is a constant.

             Let us deduce the expression for the MSE of the overall level
             difference ΔH between A and C, where:

                 ΔH = H_C - H_A = Σ_{i=1}^{s} h_i .

             [Figure 6.3: levelling line from A to C]

             The mathematical model in this case is

                 ΔH = h_1 + h_2 + ... + h_s .

             Hence:

                 S_{ΔH}² = (∂ΔH/∂h_1)² S_{h_1}² + (∂ΔH/∂h_2)² S_{h_2}² + ...

                         = (1)²(kℓ_1) + (1)²(kℓ_2) + ...

                         = Σ_{i=1}^{s} k ℓ_i = k Σ_{i=1}^{s} ℓ_i ,

             which means that the MSE of ΔH equals the constant of propor-
             tionality k multiplied by the total (overall) length of the
             levelling line A - C.

Let us consider the example 6.3 and assume that the errors in
a, b are uncorrelated, i.e. S_ab = 0, as we did in Example 6.4. Then we can
treat d and α separately (if we are interested in their individual MSE's
alone) and we get, by applying the law of propagation of errors:

    S_d² = (∂d/∂a)² S_a² + (∂d/∂b)² S_b² = (1/d²)(a² S_a² + b² S_b²) ,

    S_α² = (∂α/∂a)² S_a² + (∂α/∂b)² S_b² = b² S_a² + a² S_b² .

Note that the same results can be obtained from Example 6.3 immediately by
putting S_ab = 0.

On the other hand, if we are interested in the covariance S_dα
between the two derived quantities d and α, we have to apply the covariance
law (equation 6.15) and we will end up with

    S_dα = (ab/d)(S_a² + S_b²) ,

that is S_dα ≠ 0, and Σ_X (X = (d, α)) is not a diagonal matrix, even though
the Σ_L of the primary multisample is diagonal, i.e. S_ab = 0 (see the
results obtained in Example 6.4). This is a very important discovery and
should be taken into consideration when using the derived multisample
X = (d, α) for any further treatment, in which case we cannot assume that
d and α are uncorrelated any more and we must take the entire Σ_X into
account.

Example 6.7: Let us solve Example 6.2 again, but this time we will
             consider the primary multisample L = (ℓ_1, ℓ_2, ℓ_3) as
             uncorrelated and its Σ_L is:

"~L = rl3~ 300 0041 = diag (3, 3, 4).

From example 6. 2 we have:

=[~ ~3] X= [=~l


0
B l =d

Hence:

I
X = Bl: LBT )

L:x =[:
0

l
-:] 3

0
0

3
0

0
l

0
2

0 0 4 -3 0

=[3: l:l = r xl
82

8
x2xl
xh]
8

82
x2

which again verifies the fact that even when ~L is diagonal

the !'X is not.

On the other hand we can treat x 1 and x 2 separately by using

the law of propagation of errors (since L is uncorrelated)

to get 82 and 8 2 separately; for instance ,


xl x2
axl 2 2 ax ax
2 + (_1)2 82 + (_1)2 82
8
xl
= (a;-) 8 "R; a.Q.2 ,11_2 at 3 ,11_3
l l

= (1) 2 (3) + (o) 2 (3) + (-3) 2 (4)


=3 + 0 + 36 = 39,
which is the same value as we got by applying the covariance

law above.

Example 6.8: To determine the two sides AC = z and BC = y of the plane
             triangle shown in Figure 6.4, the length AB = x along with
             the two horizontal angles α and β were observed and their
             estimates were found to be:

                 x̄ = 10 m,   with S_x = 3 cm,
                 ᾱ = 90°,    with S_α = 2",
                 β̄ = 45°,    with S_β = 4",
                 S_αβ = -1 arc sec²,   and   S_xα = S_xβ = 0.

             [Figure 6.4: plane triangle ABC]

             It is required to compute the statistical estimates for
             y and z along with their associated variance-covariance
             matrix Σ_X in cm², where X = (y, z).

             First, we establish the mathematical model which relates the
             primary and derived samples, i.e.,

                 X = F(L) ,   where   L = (α, β, x) .

             From the sine law of the given triangle we get:

                 y / sin α = z / sin β = x / sin γ ;

             however, the angle γ is not observed, i.e. it is not an
             element of the primary sample; therefore we have to sub-
             stitute for it in terms of the observed quantities, say
             α and β, by putting

                 sin γ = sin (α + β) ,

             and we get:

                 y = x sin α / sin (α + β) ,

                 z = x sin β / sin (α + β) .

             By substituting for ᾱ, β̄ and x̄, we get

                 ŷ = 10 √2 = 10(1.414) = 14.14 m ,   ẑ = 10 m.

             Our mathematical model then can be written as:

                 X = | y(α, β, x) |  =  | x sin α / sin (α + β) |
                     | z(α, β, x) |     | x sin β / sin (α + β) | .

             To compute Σ_X = B Σ_L Bᵀ, we have to evaluate the matrix
             B, which is of the form

                 B = | ∂y/∂α   ∂y/∂β   ∂y/∂x |
                     | ∂z/∂α   ∂z/∂β   ∂z/∂x |

                   = |  z/sin(α + β)    -y/tan(α + β)    y/x |
                     | -z/tan(α + β)     y/sin(α + β)    z/x | .

             From the given data, the matrix Σ_L takes the form

                 Σ_L = | S_α²   S_αβ   S_αx |     |  4   -1   0 |
                       | S_βα   S_β²   S_βx |  =  | -1   16   0 |
                       | S_xα   S_xβ   S_x² |     |  0    0   9 | .

             (It is very important to maintain the same sequence of the
             elements of the primary sample in both matrices B and Σ_L
             to give a meaningful Σ_X.)

             Now matching the units of the individual elements of B and
             Σ_L, keeping in mind that Σ_X is required in cm², results in
             scaling the B matrix to

                 B = |  z(100)/(ρ" sin(α + β))    -y(100)/(ρ" tan(α + β))    y/x |
                     | -z(100)/(ρ" tan(α + β))     y(100)/(ρ" sin(α + β))    z/x | ,

             where ρ" = 206265 ≐ 2 · 10⁵ arc sec.

             Evaluating the elements of the above B matrix we get:

                 B = | 0.007   0.007   1.414 |
                     | 0.005   0.010   1.000 |

             and consequently

                 Σ_X = | 0.007   0.007   1.414 |  |  4  -1  0 |  | 0.007   0.005 |
                       | 0.005   0.010   1.000 |  | -1  16  0 |  | 0.007   0.010 |
                                                  |  0   0  9 |  | 1.414   1.000 |

                     = | 18.0009   12.7272 |  ≐  | 18   13 |  cm² ,
                       | 12.7272    9.0016 |     | 13    9 |

             and

                 S_y = √18 = 4.2 cm ,

                 S_z = √9  = 3 cm.

The results of the above example show that the high precision

in measuring the angles α and β has insignificant effect on the estimated

standard errors of the derived y and z lengths as compared to the effect of

the precision of the measured length x. Hence, one can use the error

propagation to detect the main deciding factors in the primary sample on

the accuracy of the derived quantities and decide on the needed accuracy

of the observations. This process is usually known as pre-analysis which

is done before taking any actual measurements by using very approximate

values for the observed quantities. This results in accepting specifications

concerning the observations techniques to achieve the required accuracy.

Some more details about it are given in section 6.3.5.
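The numbers of example 6.8 can be reproduced by evaluating the scaled B matrix and
applying the covariance law; the Python sketch below (ours) assumes angles in arc
seconds and distances in centimetres.

    import math

    x, alpha, beta = 1000.0, math.radians(90.0), math.radians(45.0)   # x in cm
    rho = 206265.0                                                     # arc sec per radian
    s = math.sin(alpha + beta)
    t = math.tan(alpha + beta)
    y = x * math.sin(alpha) / s
    z = x * math.sin(beta) / s

    B = [[ z / (rho * s), -y / (rho * t), y / x],      # dy/d(alpha, beta, x)
         [-z / (rho * t),  y / (rho * s), z / x]]      # dz/d(alpha, beta, x)
    Sigma_L = [[4.0, -1.0, 0.0],
               [-1.0, 16.0, 0.0],
               [0.0, 0.0, 9.0]]                        # arc sec^2 and cm^2

    BS = [[sum(B[i][k] * Sigma_L[k][j] for k in range(3)) for j in range(3)] for i in range(2)]
    Sigma_X = [[sum(BS[i][k] * B[j][k] for k in range(3)) for j in range(2)] for i in range(2)]
    print([[round(v, 2) for v in row] for row in Sigma_X])
    # [[18.0, 12.73], [12.73, 9.0]] cm^2, cf. 18.0009, 12.7272, 9.0016 above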



6.3.3 Propagation of Non-Random Errors, Propagation of Total Errors

The idea of being able to foretell the expected magnitude of

the MSE (as a measure of random errors) of a function of observations -

this is essentially what the law of propagation of errors is all about -

is often extended to non-random errors. These non-random errors are

sometimes called systematic errors, for which the law governing their

behaviour is not known. Hence, the values of such non-random errors

used in the subsequent development are rather hypothesized (postulated)

for the analysis and specification purposes.

The problem may be now stated as follows: let us have an

explicit mathematical model

    x = f(L) ,                                                               (6-19)

in which x is a single quantity, f is a single-valued function and
L = (ℓ_1, ℓ_2, ..., ℓ_s) is the vector of the different observed quantities
that are assumed to be uncorrelated. We are seeking to determine the
influence of small, non-random errors δℓ_i in each observation ℓ_i on the
result x. This influence will be denoted by δ_x.


The problem is readily solved using again the truncated
Taylor's series expansion, around the approximate values L⁰ = (ℓ_1⁰, ℓ_2⁰, ..., ℓ_s⁰),
from which we get:

    x = f(L⁰) + (∂f/∂L)|_{L = L⁰} (L - L⁰)

      = x⁰ + Σ_{i=1}^{s} (∂f/∂ℓ_i)|_{ℓ_i = ℓ_i⁰} (ℓ_i - ℓ_i⁰) .              (6-20)

By substituting δℓ_i for (ℓ_i - ℓ_i⁰) and δ_x for (x - x⁰) in equation (6-20)
we get:

    δ_x = Σ_{i=1}^{s} (∂f/∂ℓ_i) δℓ_i *) ,                                    (6-21)

which is the formula for the propagation of non-random errors.

Note that in formula (6-21), the signs of both the partial
derivatives (∂f/∂ℓ_i) and the non-random errors δℓ_i have to be considered.
(Compare this to formula (6-18).)

We may also ask what incertitude we can expect in x if the
observations ℓ_i are burdened with both random and non-random errors.
In such a case we define the total error as:

    T = √(δ² + S²) ,                                                         (6-22)

with δ being the non-random error and S being the MSE. Combining
the two errors in x as given above and using equations (6-18) and (6-21)
we get:

    T_x = √[ (Σ_{i=1}^{s} (∂f/∂ℓ_i) δℓ_i)² + Σ_{i=1}^{s} (∂f/∂ℓ_i)² S_{ℓ_i}² ]

        = √[ Σ_{i=1}^{s} (∂f/∂ℓ_i)² (δℓ_i² + S_{ℓ_i}²) + q ]

or

    T_x = √[ Σ_{i=1}^{s} (∂f/∂ℓ_i)² T_i² + q ] ,                             (6-23)

where q may be regarded as a kind of "covariance" between individual non-
random errors, and T_i is the total error in the observation ℓ_i.

*) For the validity of the Taylor's series expansion, we can see that
   the requirement of δℓ_i being small in comparison to ℓ_i is obviously
   essential.

As we mentioned in section 4.2, the non-random (systematic)
errors may be known or assumed functions of some parameters. In this
case their influence δ_x on x can be also expressed as a function of the
same parameters.

Example 6.9: Let us solve Example 6.2 again considering the primary
             multisample L = (ℓ_1, ℓ_2, ℓ_3) to be uncorrelated with variance-
             covariance matrix:

                 Σ_L = diag (3, 3, 4) ,

             and having also non-random (systematic) errors given as:

                 δℓ_1 = -1.5 ,   δℓ_2 = 2 ,   δℓ_3 = 0.5 ,

             in the same units as the given standard errors.

             It is required to compute the total error in the derived
             quantities x_1 and x_2 according to the mathematical model
             given in Example 6.2.

             The total errors are given by equation (6-22) as:

                 T_{x_1} = √(δ_{x_1}² + S_{x_1}²) ,   T_{x_2} = √(δ_{x_2}² + S_{x_2}²) .

             We have (cf. Example 6.7):

                 S_{x_1}² = Σ_{i=1}^{3} (∂x_1/∂ℓ_i)² S_{ℓ_i}² = 39 ,

                 S_{x_2}² = Σ_{i=1}^{3} (∂x_2/∂ℓ_i)² S_{ℓ_i}² = 15 .

             The influences δ_{x_1} and δ_{x_2} due to the given non-random errors
             in L are computed from equation (6-21) as follows:

                 δ_{x_1} = Σ_{i=1}^{3} (∂x_1/∂ℓ_i) δℓ_i
                         = (1)(-1.5) + (0)(2) + (-3)(0.5)
                         = -1.5 + 0 - 1.5 = -3 ,

                 δ_{x_2} = Σ_{i=1}^{3} (∂x_2/∂ℓ_i) δℓ_i
                         = (2)(-1.5) + (1)(2) + (0)(0.5)
                         = -3 + 2 + 0 = -1 .

             Hence, the required total errors will be:

                 T_{x_1} = √[(-3)² + 39] = √48 ≐ 6.9 ,

                 T_{x_2} = √[(-1)² + 15] = √16 = 4 .
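The random, non-random and total errors of example 6.9 follow from equations (6-18),
(6-21) and (6-22); the sketch below (ours, for illustration) computes all three.

    B = [[1, 0, -3],
         [2, 1,  0]]                  # partial derivatives, from example 6.2
    var_L = [3, 3, 4]                 # uncorrelated variances of l1, l2, l3
    delta_L = [-1.5, 2.0, 0.5]        # postulated non-random errors

    for row in B:
        s2 = sum(b * b * v for b, v in zip(row, var_L))       # random part, eq. (6-18)
        delta = sum(b * d for b, d in zip(row, delta_L))      # non-random part, eq. (6-21)
        total = (delta ** 2 + s2) ** 0.5                      # total error, eq. (6-22)
        print(s2, delta, round(total, 2))
    # 39  -3.0  6.93
    # 15  -1.0  4.0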

Example 6.10: Consider again Example 6.6. In addition to the given
              information, assume that each height difference h_i has got
              a non-random (systematic) error expressed as δ_{h_i} = k'h_i,
              where k' is another constant, a constant of proportionality
              between h_i and δ_{h_i}. Determine the total error in ΔH, where

                  ΔH = H_C - H_A = Σ_{i=1}^{s} h_i = h_1 + h_2 + ... + h_s .

              The total error in ΔH is given by:

                  T_{ΔH} = √(δ_{ΔH}² + S_{ΔH}²) .

              In Example 6.6, we found that:

                  S_{ΔH}² = k ℓ_AC ,

              where k was a constant and ℓ_AC = Σ_{i=1}^{s} ℓ_i is the entire length of
              the levelling line AC.

              We can now compute δ_{ΔH} as follows:

                  δ_{ΔH} = Σ_{i=1}^{s} (∂ΔH/∂h_i) δ_{h_i} ,

              where

                  ∂ΔH/∂h_1 = ∂ΔH/∂h_2 = ... = ∂ΔH/∂h_s = 1 ,
              and
                  δ_{h_i} = k'h_i .

              Then we get

                  δ_{ΔH} = Σ_{i=1}^{s} k'h_i = k' Σ_{i=1}^{s} h_i = k' ΔH .

              Finally, the expression for the total error in ΔH will be:

                  T_{ΔH} = √[(k' ΔH)² + k ℓ_AC] .

6.3.4 Truncation and Rounding

In any computation we have to represent the numbers we work with,
which may be either irrational like π, e, √2, or rational with very
many decimal places like 1/3, 5/11, etc., by rational numbers with a
fixed number of figures.

The representation can be made in basically two different
ways. We either truncate the original number after the required number

of figures or we round off the original number to the required length.
The first process can be mathematically described as:

    a ≐ a_T = Int (a·10ⁿ)/10ⁿ ,                                              (6-24)

where a is the original number assumed normalized*), n is the required
number of decimal places and Int stands for the integer value.

Example 6.11: π = 3.141592..., n = 3 and we get:

                  π ≐ π_T = Int (π·10³) · 10⁻³
                          = Int (3141.592...) · 10⁻³
                          = 3141 · 10⁻³
                          = 3.141 .

The second process, i.e. the rounding-off, can be described
by the formula:

    a ≐ a_R = Int (a·10ⁿ + 0.5)/10ⁿ ,                                        (6-25)

in which all terms are as described above.

Example 6.12: π, n = 3 and we get:

                  π ≐ π_R = Int (π·10³ + 0.5) · 10⁻³
                          = Int (3141.592 + 0.5) · 10⁻³
                          = Int (3142.092...) · 10⁻³
                          = 3142 · 10⁻³
                          = 3.142 .

It can be seen that the errors involved in the above two
alternative processes differ. Denoting the error in "a" due to

*) To normalize the number, say 3456.21, we write it in the form 3.45621 · 10³.

truncation by δ_{a_T} and the error due to rounding by δ_{a_R}, we get:

    δ_{a_T} = a - a_T ∈ [0, 10⁻ⁿ) ,

    δ_{a_R} = a - a_R ∈ [-0.5·10⁻ⁿ, 0.5·10⁻ⁿ) ,

and we may postulate that δ_{a_T} has a parent random variable distributed
according to the rectangular (uniform) PDF (see section 3.2.5):

    R(0.5·10⁻ⁿ, σ; δ_{a_T}) ,                                                (6-26)

while δ_{a_R} has parent PDF:

    R(0, σ; δ_{a_R}) ,                                                       (6-27)

as shown in Figure 6.5.

[Figure 6.5: PDF of rounding errors (centred at 0) and PDF of truncation
 errors (centred at 0.5·10⁻ⁿ)]

From example 3.17, section 3.2.5 we know that σ = q/√3, where q equals
half of the width of the R. In our case, obviously q = 0.5·10⁻ⁿ so
that σ = 0.289·10⁻ⁿ.

Because of their different means, the error in truncation
propagates according to the "total error law" and the errors in rounding
propagate according to the "random error law". Hence, if we have a
number x:

    x = f(L) ,                                                               (6-28)

where

    L = (ℓ_i) ,   i = 1, 2, ..., s,

is a set of s numbers to be either truncated or rounded off individually,
we can write the formulae for the errors in x due to truncation and
rounding errors in the individual ℓ_i's as follows:

    δ_{x_T} = √[ (Σ_{i=1}^{s} (∂f/∂ℓ_i)·0.5·10⁻ⁿ)² + Σ_{i=1}^{s} (∂f/∂ℓ_i)²·(1/12)·10⁻²ⁿ ] ,   (6-29)

    δ_{x_R} = √[ Σ_{i=1}^{s} (∂f/∂ℓ_i)²·(1/12)·10⁻²ⁿ ] .                                       (6-30)

This indicates clearly that the error in x due to the rounding process
is less than the corresponding error due to truncation; and this is
why we always prefer to work with rounding rather than truncation.

Example 6.13: Let us determine the expected error in the sum x of
              a thousand numbers a_i, x = Σ_{i=1}^{1000} a_i, if

              (i)  the individual values a_i were truncated to five decimal
                   places;

              (ii) the individual values a_i were rounded off to five
                   decimal places.

              Solution:

              (i) The error δ_{x_T} due to the truncation of the individual a_i is
                  computed from equation (6-29) as follows:

                      δ_{x_T} = √{[Σ_{i=1}^{1000} (∂x/∂a_i)(0.5·10⁻⁵)]²
                                  + Σ_{i=1}^{1000} (∂x/∂a_i)²·(1/12)·10⁻¹⁰}

                              = √{[0.5·10⁻⁵·10³]² + (1/12)·10⁻¹⁰·10³}

                              = √{(1/4)·10⁻⁴ + (1/12)·10⁻⁷}

                              = √{10⁻⁸ (2500 + 0.833)}

                              ≐ 0.005001 ≐ 0.005 .

              (ii) The error δ_{x_R} due to the rounding of the individual a_i is
                   computed from equation (6-30) as follows:

                      δ_{x_R} = √{Σ_{i=1}^{1000} (∂x/∂a_i)²·(1/12)·10⁻¹⁰}

                              = √{(1000)·(1/12)·10⁻¹⁰}

                              = √{10⁻⁸ (0.833)}

                              ≐ 0.000091 ,

              which is much smaller than the corresponding δ_{x_T}.
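The difference between the two processes can also be seen empirically; the following
sketch (ours, using arbitrarily generated numbers) truncates and rounds a thousand
values to five decimals and compares the resulting errors in the sum with the
predictions of example 6.13.

    import math, random

    random.seed(1)
    n_dec, N = 5, 1000
    a = [random.uniform(0, 1) for _ in range(N)]
    exact = sum(a)

    trunc = sum(math.floor(ai * 10**n_dec) / 10**n_dec for ai in a)          # eq. (6-24)
    rnd   = sum(math.floor(ai * 10**n_dec + 0.5) / 10**n_dec for ai in a)    # eq. (6-25)

    print(round(exact - trunc, 6))   # ~0.005   (predicted delta_xT)
    print(round(exact - rnd, 6))     # typically of the order of 0.0001 (predicted ~0.00009)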

6.3.5 Tolerance Limits, Specifications and Pre-analysis

Another important application of the propagation laws for

errors is the determination of specifications for a certain experiment

when the maximum tolerable errors of the results, which are usually

called tolerance limits, are known beforehand. Such process is known

as pre-analysis. The set-up of the specifications should therefore

result in the proper design of the experiment, i.e. the choice of

observation techniques, instrumentation, etc., to meet the permissible

tolerance limits.

The specifications for the elementary processes should account

for both the random and the inevitable non-random (systematic) errors.

This is, unfortunately, seldom the case in practice. It is usual to

require that the specifications are prescribed in such a way as to meet

the tolerance limits with the probability of approximately 0.99. If

we hence expect the random errors to have the parent Gaussian PDF,

the actual results should not have the total error, composed of the non-
random error δ and 2.5 to 3 times the RMS (which corresponds to a
probability of 99%), larger than the prescribed tolerance limits, i.e.

    T ≐ √(δ² + (3σ)²) .                                                      (6-31)

Example 6.14: Assume that we want to measure a distance D = 1000 m,
              with a relative error (see 4.10) not worse than 10⁻⁴, using a
              20 m tape which had been compared to the "standard" with a
              precision not better than 3σ ≤ 1 mm, i.e. tolerance limits
              of the comparison were ± 1 mm. Assume also that the whole
              length D is divided into 50 segments d_i, i = 1, 2, ..., 50,
              each of which is approximately 20 m. Providing that each
              segment d_i will be measured only twice, forward F_i and
              backward B_i, what differences can we tolerate (accept or
              permit) between the back and forth measurements of each
              segment?

Solution:

The tolerance limit in D, i.e. the permissible total error in D, is given by

    T_D = 1000 m · 10⁻⁴ = 0.10 m = 10 cm.

This total error T_D is given by

    T_D = √(δ_D² + (3σ_D)²) ,

where δ_D is the non-random (systematic) error in D, σ_D is
the random error in D and the factor 3 is used to get probability
> 99% according to the assumed Gaussian PDF. Knowing that

    D = Σ_{i=1}^{50} d_i ,   where   d_i = ½(F_i + B_i) ,

we get:

    δ_D = Σ_{i=1}^{50} (∂D/∂d_i) δ_{d_i} = Σ_{i=1}^{50} δ_{d_i} ,

where δ_{d_i} ≤ 1 mm is the error of each tape length from the comparison. Hence,

    δ_D ≤ Σ_{i=1}^{50} 1 mm = 50 mm = 5 cm.

Thus, we must require that:

    (3σ_D)² ≤ T_D² - δ_D² = (10)² - (5)² = 75 cm² ,
or
    σ_D² ≤ 75/9 = 8.33 cm²

in order to meet the specifications.

Denoting the MSE in the individual segments d_i by
σ_{d_i} = σ_d (all assumed equal) we get

    σ_D² = Σ_{i=1}^{50} σ_{d_i}² = 50 σ_d² ,

from which we obtain

    σ_d² ≤ 8.33/50 = 0.167 cm² .

Remembering that each segment d_i is given by:

    d_i = ½(F_i + B_i) ,

and denoting the MSE in either F_i or B_i (both assumed equal)
by σ we get:

    σ_{d_i}² = σ_d² = (∂d_i/∂F_i)² σ_F² + (∂d_i/∂B_i)² σ_B²

             = (½)² σ² + (½)² σ² = σ²/2 ,

and

    σ² ≤ 2σ_d² = 2(0.167) = 0.33 cm² .

Recalling that we want to know what differences between the
forth and back measurements we can tolerate, and denoting
such differences by Δ_i, we can write:

    Δ_i = F_i - B_i .

Then:

    σ_{Δ_i}² = σ_Δ² = (∂Δ_i/∂F_i)² σ_F² + (∂Δ_i/∂B_i)² σ_B²

             = (1)²σ² + (-1)²σ² = 2σ² .

Thus, we end up with the condition:

    σ_Δ² ≤ 2σ² = 2(0.33) = 0.667 cm²
or
    σ_Δ ≤ 0.816 ≐ 0.8 cm.

This means that if we postulate a parent Gaussian PDF for
the differences Δ, the above σ_Δ is required to be smaller than
or equal to the RMS of the underlying PDF. Consequently, the
specifications will be as follows: we should get 68% of the
differences Δ within ± σ_Δ, i.e. within ± 0.8 cm, and 95% of Δ
within ± 2σ_Δ, i.e. within ± 1.6 cm. These specifications
are looser than a man with experience in practice would expect.
It illustrates the fact that in practice the specifications are
very often unnecessarily too stringent.
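The chain of tolerances in example 6.14 is short enough to script; the sketch below
(ours, for illustration) carries the tolerance limit T_D down to the permissible
forward-backward difference per 20 m segment.

    D_tol_cm = 1000 * 100 * 1e-4        # T_D = 10 cm
    segments = 50
    delta_D = segments * 0.1            # 50 x 1 mm systematic tape error, in cm

    sigma_D2 = (D_tol_cm ** 2 - delta_D ** 2) / 9.0   # (3 sigma_D)^2 <= T_D^2 - delta_D^2
    sigma_d2 = sigma_D2 / segments                    # per-segment MSE
    sigma2 = 2 * sigma_d2                             # since d_i = (F_i + B_i)/2
    sigma_delta = (2 * sigma2) ** 0.5                 # since Delta_i = F_i - B_i

    print(round(sigma_D2, 2), round(sigma_d2, 3), round(sigma_delta, 2))
    # 8.33  0.167  0.82  -> tolerate ~0.8 cm (68%) and ~1.6 cm (95%)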



6.4 Problem of Adjustment

6.4.1 Formulation of the Problem

Let us resume now at the end of section 6.2 where we have

defined the proper problem of adjustment as the transition


"
(E, E1 ) + (X, EX) ( 6. 32)

for an overdetermined mathematical model

F(L, X) =0 • (6.33)

By "overdetermined" we mean that the known E contains too many components

to generally f.it the abbve model for whatever X we choose, i.e. yielding

infinite number of solutions X • The only way to satisfy the model ,

i.e. the prescribed relations,is to allow some of or all the E to change

slightly while solving for X. In other words, we have to regard E as


an approximate value of some other value L which yields a unique solution X

and seek the final value L together with X.

Denoting

L - E =v (6.34)
we may reformulate our mathematical model (6.33) as:

"
F(L, X) = F(L + V, X) =0 (6.35)
where V is called the vector of discrepancies.

Note that V plays here very much the same role as the v's

have played in section 4.8. From the mathematical point of view, there

is not much difference between V and v. However, from the philosophical

viewpoint, there is, because V represents a vector of discrepancies of


s different physical quantities (see also section 5.4), while v was a vector

of discrepancies of n observations of the same physical quantity. To

show the mathematical equivalence of these two we shall, in the next

section, treat the computation of a sample mean as an instructive adjust-

ment problem.

6.4.2 Mean of a Sample as an Instructive Adjustment Problem, Weights

Let us regard a random sample L = (ℓ_1, ℓ_2, ..., ℓ_n) of n observations
representing one physical quantity as an uncorrelated estimate of its mean.
Further, we shall denote by L̄ = (ℓ̄_1, ℓ̄_2, ..., ℓ̄_m) the definition set of L,
consisting of only the m distinctly different values of the ℓ's.  Let us seek
an estimate x̂ satisfying the mathematical model

    x = ℓ ,                                                            (6.36)

representing the identity transformation.  Evidently, the model is
overdetermined because the individual ℓ̄_j, j = 1, 2, ..., m, are different
from each other and cannot therefore all satisfy the model.  So, we
reformulate the model as

    ℓ̄_j + v_j = x ,    j = 1, 2, ..., m,                               (6.37)

where the v's are the discrepancies.  We have to point out that, although we
seek now the same result as we sought in section 4.7, the formulation here is
slightly different, to enable us to use analogies later on.  While we took all
n observations into account in section 4.7, we shall now work only with the m
distinctly different values ℓ̄_j, j = 1, 2, ..., m, that constitute the
sample L̄.*


Thus we shall have to compute the mean x̄ from the second formula introduced in
section 3.1.3 (equation (3.4)), i.e.

    x̄ = Σ_{j=1}^m ℓ̄_j P(ℓ̄_j) = Σ_{j=1}^m ℓ̄_j P_j ,                     (6.38)

rather than the first (equation (3.3)) as used in section 4.7.  Here,
according to section 3.1.3, P_j = c_j/n, with c_j being the count of the same
value ℓ̄_j in the original sample L containing all n observations.  Hence the
P_j are the experimental (actual) probabilities.  In other words, if we wish
x̂ to equal x̄, the model (6.37) yields the following solution:

    x̂ = Σ_{j=1}^m P_j ℓ̄_j ,                                            (6.39)

or

    x̂ = P^T L̄                                                          (6.40)

in vector notation.

The coefficients P_j are called weight coefficients, or simply weights, and x̂
is called the weighted mean, an analogy borrowed from mechanics (see section
3.1.3).  Note that, with the weights being nothing else but the experimental
probabilities, we put "more weight" on the values of which we are more
"certain", i.e. which are repeated more often in the sample, which is
intuitively pleasing.

* L̄ = (ℓ̄_1, ℓ̄_2, ..., ℓ̄_m) can be regarded in this context as a sample of
"grouped" observations, i.e. each constituent ℓ̄_j, j = 1, 2, ..., m, has a
count (frequency) c_j associated with it in the original sample L.

In our slightly different notation even the least-squares principle, as
formulated in section 5.3, sustains a minor change.  While we were seeking
such ℓ° as to make

    (1/n) Σ_{i=1}^n v_i² = (1/n) Σ_{i=1}^n (ℓ_i − ℓ°)²                 (6.41)

a minimum, we now have to write the condition of minimum variance as

    min_{ℓ°∈R} [ Σ_{j=1}^m P_j v_j² ] ,                                (6.42)

where v_j = ℓ̄_j − ℓ°.  In matrix notation, (6.42) becomes

    min_{ℓ°∈R} (V^T P V) ,                                             (6.43)

where P is a diagonal matrix, i.e.

    P = diag (P_1, P_2, ..., P_m) .                                    (6.44)

The latter formulation, i.e. equations (6.42) and (6.43), is more general,
since we can regard the former formulation, i.e. equation (6.41), as a special
case of (6.42) and not vice versa.  We have

    (1/n) Σ_{i=1}^n v_i² = Σ_{i=1}^n P_i v_i² ,

which implies that P_i = 1/n, for i = 1, 2, ..., n, are the equal weights of
all the observations ℓ_i.  Hence we shall use (6.43) exclusively from now on.
The same holds true even for the two formulae for x̄, of which we shall use
equation (6.40).

Note that once we apply the condition (6.43), the discrepancies cease to be
variable quantities and become residuals (see 4.8).  We shall denote these
residuals by v̂.

Equation (3.7) can now obviously be written as

    S²_L = Σ_{j=1}^m P_j v̂_j²                                          (6.45)

or, in matrix notation,

    S²_L = V̂^T P V̂ .                                                   (6.46)

Consequently, we shall restate the least-squares principle as follows: the
value x̂ that makes the value of the quadratic form V^T P V the least ensures
automatically the minimum variance of the sample L.  This property does not
depend on any specific underlying PDF.  If L has got a normal parent PDF (or
any symmetric distribution), x̂ is the most probable estimate of x, which is
sometimes called the maximum likelihood estimate of x.
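The grouped-observations view of equations (6.39), (6.40) and (6.45) can be
illustrated with a short Python sketch (added here for illustration only and
not part of the original notes; the sample values are hypothetical):

```python
from collections import Counter

sample = [5.32, 5.36, 5.32, 5.35, 5.36, 5.32]      # hypothetical observations

n = len(sample)
counts = Counter(sample)                            # counts c_j of the distinct values
values = list(counts)                               # the grouped sample L-bar
weights = [counts[v] / n for v in values]           # P_j = c_j / n, summing to 1

x_hat = sum(p * v for p, v in zip(weights, values))              # (6.39)/(6.40)
s2_L = sum(p * (v - x_hat)**2 for p, v in zip(weights, values))  # (6.45)

print(x_hat, s2_L)   # identical to the ordinary mean and (1/n)-variance of `sample`
```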

6.4.3 Variance of the Sample Mean

We have shown that the simple problem of finding the mean of a sample can be
regarded as a trivial adjustment problem.  Hence we are entitled to ask the
question: what will be the variance-covariance matrix of the result, as
derived from the variance-covariance matrix of the original sample?  In other
words, we may ask what value of variance can be associated with the result,
i.e. with the mean of the sample.

The question is easily answered using the covariance law (section 6.3.1).  We
have established that (equation (6.40))

    x̂ = P^T L̄ .

Hence, by applying the covariance law (equation (6.15)) we obtain

    Σ_x̂ = B Σ_L̄ B^T = P^T Σ_L̄ P = S²_x̂ ,

i.e.

    S²_x̂ = P^T Σ_L̄ P .                                                 (6.47)

Here Σ_L̄ is not yet defined.  All we know is that L̄ = (ℓ̄_1, ℓ̄_2, ..., ℓ̄_m) is
a sample of "grouped" observations ℓ̄_j with different weights (observed
probabilities) P_j associated with them.  Let us hence assume these
observations uncorrelated, and let us also assume that there can be some
"variances" S²_ℓ̄j attributed to these observations.  In such a case, the
variance-covariance matrix of L̄ can be expressed as

    Σ_L̄ = diag (S²_ℓ̄1, S²_ℓ̄2, ..., S²_ℓ̄m) .                            (6.48)

Substituting (6.48) into (6.47), we get

    S²_x̂ = Σ_{j=1}^m P_j² S²_ℓ̄j .*                                     (6.49)

On the other hand, the value of x̂ (i.e. the sample mean) can be computed using
the original sample of observations, L = (ℓ_1, ℓ_2, ..., ℓ_n), i.e. the
ungrouped observations ℓ_i, i = 1, 2, ..., n, which all have equal
experimental probabilities (equal weights) of 1/n, yielding

    x̂ = (1/n) Σ_{i=1}^n ℓ_i .                                          (6.50)

Hence we can compute the variance of the mean, S²_x̂, again by applying the law
of propagation of errors on (6.50), and we get

    S²_x̂ = Σ_{i=1}^n (1/n)² S²_ℓi ,                                    (6.51)

* It should be noted here that since L̄ = (ℓ̄_1, ℓ̄_2, ..., ℓ̄_m) is a sample of
grouped observations, for which a different weight P_j (experimental
probability) is associated with each element ℓ̄_j, j = 1, 2, ..., m, the
individual variances S²_ℓ̄j assigned to the ℓ̄_j are, in general, different from
each other, i.e. they vary with the groups of observations.

in which all the variances S²_ℓi are again assumed to have the same value,
equal to the sample variance S²_L given by

    S²_L = (1/n) Σ_{i=1}^n (ℓ_i − x̂)² .                                (6.52)

Equation (6.51) then gives

    S²_x̂ = S²_L / n ,                                                  (6.53)

which indicates that the variance of the sample mean equals the variance of
the sample, computed from equation (6.52), divided by the total number of
elements of the sample.*

We have thus ended up with two different formulae, (6.49) and (6.53), for the
same value S²_x̂.  In the first approach, we have regarded the individual
observations (really groups of observations having the same value) as having
different variances S²_ℓ̄j associated with them.  The second approach assumes
that all the observations belong to the same sample with variance S²_L.
Numerically, we should get the same value of S²_x̂ from both formulae, hence

    Σ_{j=1}^m (P_j² S²_ℓ̄j) = S²_L / n .                                (6.54)

Let us write the left-hand side of (6.54) in the form

    Σ_{j=1}^m (P_j² S²_ℓ̄j) = Σ_{j=1}^m [P_j (P_j S²_ℓ̄j)]

and the right-hand side in the form

    S²_L / n = Σ_{i=1}^n (1/n)(S²_L / n) .

Using the same manipulation as in section 6.4.2 when dealing with the v's and
ℓ's, and also earlier, in section 3.1.4 when proving equation (3.4), the
right-hand side can be rewritten as

    Σ_{i=1}^n (1/n)(S²_L / n) = Σ_{j=1}^m [P_j (S²_L / n)] ,

in which P_j has the same meaning as in (6.49).  Now, the condition (6.54)
becomes

    Σ_{j=1}^m [P_j (P_j S²_ℓ̄j)] = Σ_{j=1}^m [P_j (S²_L / n)] ,          (6.55)

which can be satisfied by requiring that

    P_j S²_ℓ̄j = K ,    j = 1, 2, ..., m,                               (6.56)

where K is a constant value for a specific sample that equals the variance of
the sample mean.  From (6.56) we get

    S²_ℓ̄j = K / P_j ,    j = 1, 2, ..., m,                             (6.57)

which shows that, in order to get the correct result from (6.49), we have to
assume in the first approach that the individual observations have variances
inversely proportional to their weights.

This result is usually expressed in the form of the following principle: the
weight of an observation is inversely proportional to its variance, i.e.

    P_j = K / S²_ℓ̄j .                                                  (6.58)

Using equation (6.57) we can also write

    P_j S²_ℓ̄j = 1 · S_0² = K ,                                         (6.59)

where S_0², constant for a specific sample, is known as the variance of unit
weight.  It can be interpreted as the variance of an imaginary observation
whose weight equals one.  In the case of the sample mean x̄, S_0² equals S²_x̂.
From equations (6.46) and (6.53) we can write

    S²_x̂ = (1/n) V̂^T P V̂ .                                             (6.60)

This result will often be referred to in the subsequent development.

We have to point out that the whole argument in this section hinges on the
acceptance of the "variances" S²_ℓ̄j and S²_ℓi.  They have been introduced
solely for the purpose of deriving the formulae (6.53) and (6.58) that are
consistent with the rest of the adjustment calculus.  The more rigorous
alternative is to accept the two formulae by definition.

* In terms of our previous notation, we can write the variance of the sample
mean as S²_x̄ = S²_L / n .
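The consistency requirement (6.54), and the resulting rule (6.57) that the
assumed variances must be inversely proportional to the weights, can be
checked numerically with a small sketch (illustrative only; the data are
hypothetical):

```python
from collections import Counter

sample = [10.1, 10.3, 10.1, 10.2, 10.3, 10.1, 10.2, 10.2]   # hypothetical data
n = len(sample)

x_bar = sum(sample) / n
s2_L = sum((x - x_bar)**2 for x in sample) / n     # (6.52), divisor n
s2_mean = s2_L / n                                  # (6.53)

P = {v: c / n for v, c in Counter(sample).items()}  # weights P_j = c_j / n
K = s2_mean                                         # variance of unit weight (6.59)
s2_group = {v: K / P[v] for v in P}                 # (6.57): variances ~ 1/weights

# (6.49): variance of the mean computed from the grouped observations
s2_mean_grouped = sum(P[v]**2 * s2_group[v] for v in P)

print(abs(s2_mean - s2_mean_grouped) < 1e-12)       # True
```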

6.4.4 Variance Covariance Matrix of the Mean of a Multisample

We have seen in section 6.4.3 that the mean x̄ of a sample L has a standard
deviation S_x̄ associated with it.  This standard deviation is √n times smaller
than the standard deviation S_L of the sample itself and can be interpreted as
a measure of the confidence we have in the correctness of the mean x̄.
Evidently, our confidence increases with the number of observations.

We can now ask ourselves the following question: does the mean L̄ of a
multisample L also have a variance-covariance matrix associated with it?  The
answer is that there is nothing to prevent us from defining it by generalising
the discovery from the last section.  We get

          | S²_ℓ̄1     S_ℓ̄1ℓ̄2   ...   S_ℓ̄1ℓ̄s |
    Σ_L̄ = | S_ℓ̄2ℓ̄1   S²_ℓ̄2    ...   S_ℓ̄2ℓ̄s |                           (6.61)
          |   ...                     ...   |
          | S_ℓ̄sℓ̄1   S_ℓ̄sℓ̄2   ...   S²_ℓ̄s  |

where

    S²_ℓ̄i = (1/n_i) S²_ℓi

and

    S_ℓ̄iℓ̄j = (1/n_i) S_ℓiℓj .

Here we have to require again that n_i = n_j, i.e. that both components of the
multisample have the same number of elements (see section 3.3.5).  Obviously,
if this requirement is satisfied for all the pairs of components, we have

    n_1 = n_2 = ... = n_s = n

and

    Σ_L̄ = (1/n) Σ_L .                                                  (6.62)

By analogy, the variance-covariance matrix obtained via the covariance law
(see section 6.3.1) from the variance-covariance matrix of the mean of the
multisample is associated with the mean of the derived multisample, or
statistical estimate X̂.  We say that

    Σ_X̂ = B Σ_L̄ B^T                                                    (6.63)

is the variance-covariance matrix of the statistical estimate X̂, i.e. of the
solution of the uniquely determined mathematical model

    X = F(L) .

Similar statements can be made for the other laws of propagation of errors.
The development of these is left to the student, who should also compare the
results of this section with the solution of Example 6.14.

Example 6.15: Let us take again the experiment described in Examples 6.1, 6.3
and 6.4.  This time we shall be interested in deriving the variance-covariance
matrix Σ_X̂ of the solution vector X̂.

Solution:  First we evaluate Σ_L̄ from equation (6.61).  We obtain

    S²_ā = (1/5) S²_a = 0.004/5 cm²  = 0.0008 cm² ,
    S²_b̄ = (1/5) S²_b = 0.0056/5 cm² = 0.0011 cm² .

Since S_ab = 0 we get

    Σ_L̄ = | 0.0008     0    | cm²  =  (1/5) Σ_L .
          |   0      0.0011 |

Now Σ_X̂ can be evaluated from equation (6.63), and we have

    Σ_X̂ = B Σ_L̄ B^T ,

or

    Σ_X̂ = | 0.00081 cm²    0.01079 cm³  |
          | 0.01079 cm³   21.51254 cm⁴  | .

Thus the standard deviations of the estimates d̂ and â are given by

    √(0.00081 cm²) ≈ 0.028 cm ,
    √(21.51254 cm⁴) ≈ 4.64 cm² .

6.4.5 The Method of Least-Squares, Weight Matrix

The least-squares principle, as applied to the trivial identity
transformation, i.e. the sample mean, can be generalized to other mathematical
models.  Taking the general formulation of the problem of adjustment as
described in section 6.4.1, i.e.

    F(L̄ + V, X) = 0 ,

we can again ask for such X as would make the value of the quadratic form of
the weighted discrepancies, V^T P V, a minimum, i.e.

    min_{X∈R^u} (V^T P V) .                                            (6.64)

The condition (6.64) is, for the majority of mathematical models, enough to
specify such X = X̂ uniquely.  The approach to adjustment using this condition
became known as the method of least-squares.

The question remains how to choose the matrix P.  In the case of the sample
mean we have used

    P = diag (K/S²_ℓ̄1, K/S²_ℓ̄2, ..., K/S²_ℓ̄m) ,

that is

    P = K diag (1/S²_ℓ̄1, 1/S²_ℓ̄2, ..., 1/S²_ℓ̄m) .

Using the notation developed for the multisample, this can be rewritten as

    P = K Σ_L̄⁻¹ ,                                                      (6.65)

which indicates that the matrix P is obtained by multiplying the inverse of
the variance-covariance matrix of the means of the observations by the
constant K (here equal to S²_x̂).  In our case this is a diagonal matrix, since
we have postulated the sample L to be uncorrelated.

We again notice that, mathematically, there is not much difference between a
sample and a multisample; they can hence be treated in much the same way.
Thus, there is no basic difference between the apparently trivial adjustment
of the sample mean and the general problem of adjustment.  The only difference
is that in the first case X is a vector of one component, while generally it
may have many components.

This gives rise to the question of what the role of K would be in the
least-squares method, where X has several constituents (K having been a scalar
equal to S²_x̂ in the adjustment of the mean of a sample).  Let us just say at
this time that we usually compute the weight matrix P, as it is called in the
method of least-squares, as

    P = K Σ_L̄⁻¹ ,                                                      (6.66)

where K is an arbitrarily chosen constant, the meaning of which will be shown
later.  This can be done because, as will also be shown later, the solution X̂
is independent of K, since K does not change the ratios between the weights or
variances of the individual observations.
In this course we shall be dealing with only two particular

mathematical models which are the most frequently encountered in practice.

In these models, we shall use the following notation:

n for the number of constituents of the primary or original multisample L;

u for the number of constituents of the derived, or unknown (to be derived),
multisample X;

r for the number of independent equations (relationships) that can be
formulated between the constituents of L and X.

Moreover, we shall consider these models to be linear.

The first model is

    A X = L ,                                                          (6.67)

in which A is an n by u matrix, X is a u by 1 vector and L is an n by 1 vector
(n = r > u).  The adjustment of this model is usually called parametric
adjustment, adjustment of observation equations, or adjustment of indirect
observations, etc.

The second model is

    B L = C ,                                                          (6.68)

in which B is an r by n matrix, L is an n by 1 vector and C is an r by 1
vector (r < n).

The adjustment of this model is known as conditional adjustment, adjustment

of condition equations, etc.

The two mathematical models are evidently quite special since

they are both linear. Fortunately many problems in practice, although

non-linear by nature, can be linearized. This is the reason why the two

treated models are important.

6.4.6 Parametric Adjustment

In this section we are going to deal with the adjustment of the linear model
(6.67), i.e.

    A X + C = L ,    (n > u),                                          (6.69)

which, for the adjustment, will be reformulated as

    A X − (L̄ + V) = 0 ,

or

    V = A X − L̄ .*                                                     (6.70)

Here A is called the design matrix, X is the vector of unknown parameters, L̄
is the vector of observations (L̄ = L* − C, where L* is the mean of the
observed multisample), and V is the vector of discrepancies, which is also
unknown.  The formulation (6.70) is known as a set of observation equations.

------------------------------------------------------------------
* If we have a non-linear model L = F(X), it can be easily linearized by a
Taylor series expansion, i.e.

    L = F(X°) + (∂F/∂X)|_{X=X°} (X − X°) + ··· ,

in which we neglect the higher order terms.  Putting ΔX for X − X°, ΔL for
L − F(X°), and A (a matrix) for (∂F/∂X)|_{X=X°}, we get

    ΔL ≈ A ΔX .

This is essentially the same form as equation (6.69).  However, in this case
we are solving for the corrections ΔX to the approximate value X° of the
vector X, instead of solving for X itself.
------------------------------------------------------------------

We wish to get such X = X̂ as would minimize the quadratic form V^T P V, in
which P is the assumed weight matrix of the observations L̄ (see the previous
section).  This quadratic form, which is sometimes called the quadratic form
of weighted discrepancies, can be rewritten using the observation equations
(6.70) as

    V^T P V = (A X − L̄)^T P (A X − L̄)
            = ((A X)^T − L̄^T) (P A X − P L̄)
            = X^T A^T P A X − L̄^T P A X − X^T A^T P L̄ + L̄^T P L̄ .      (6.71)

From equation (6.66) we have P = K Σ_L̄⁻¹, where K is a constant scalar and Σ_L̄
is the variance-covariance matrix of L̄.  Since Σ_L̄ is symmetric, the weight
matrix P is symmetric as well, and P^T = P.  We can thus write

    X^T A^T P L̄ = L̄^T P A X ,                                          (6.72)

since it is a scalar quantity.  Substituting (6.72) into (6.71) we get

    V^T P V = X^T A^T P A X − 2 L̄^T P A X + L̄^T P L̄ .                  (6.73)

The quadratic function (6.73), sometimes called the variations function, is to
be minimized with respect to X.  This is accomplished by equating all the
partial derivatives to zero, i.e.

    ∂(V^T P V)/∂X_i = 0 ,    i = 1, 2, ..., u,                         (6.74)

and we obtain, writing ∂/∂X for the whole vector of partial derivatives
∂/∂X_i,*

    2 X^T A^T P A − 2 L̄^T P A = 0 ,

which can be rewritten as

    X^T (A^T P A) = L̄^T P A ,

or, by taking the transpose of both sides,

    (A^T P A) X = A^T P L̄ .+                                           (6.75)

This system of linear equations is called the system of normal equations,
which can be written, as often used in the literature, in the following
abbreviated form:

    N X̂ = U ,                                                          (6.76)

where N = A^T P A is known as the matrix of coefficients of the normal
equations, or simply the normal equation matrix, and U = A^T P L̄ is the vector
of absolute terms of the normal equations.

The system of normal equations (6.76) has a solution X̂ given by

    X̂ = N⁻¹ U = (A^T P A)⁻¹ (A^T P L̄) ,                                (6.77)

if the normal equation matrix N = A^T P A has an inverse.  Note that N is a
symmetric, positive-definite matrix.**

------------------------------------------------------------------
* From matrix algebra we know that if A is a symmetric matrix and X is a
vector, then ∂(X^T A X)/∂X = 2 X^T A .

+ Note that the normal equations can be obtained directly from the
mathematical model by pre-multiplying it by A^T P .

** A matrix, say N, is positive definite if the value of the quadratic form
Y^T N Y is positive for any non-zero vector Y (of the appropriate dimension).
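The whole chain (6.70) through (6.77) can be condensed into a few lines of
numpy; the following routine is a sketch added for illustration (the function
name and arguments are ours, not part of the notes):

```python
import numpy as np

def parametric_adjustment(A, L_bar, Sigma_L, K=1.0):
    """Parametric least squares: returns X_hat, residuals V_hat, adjusted L_hat."""
    P = K * np.linalg.inv(Sigma_L)     # weight matrix (6.66)
    N = A.T @ P @ A                    # normal equation matrix (6.76)
    U = A.T @ P @ L_bar                # vector of absolute terms
    X_hat = np.linalg.solve(N, U)      # solution (6.77)
    V_hat = A @ X_hat - L_bar          # residuals, from (6.70)
    L_hat = L_bar + V_hat              # adjusted observations
    return X_hat, V_hat, L_hat
```

Because P enters both N and U, any rescaling of K cancels out of X_hat, in
agreement with equation (6.79) below.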

To discuss the influence of the weight matrix P on the solution vector X̂, let
us use a different weight matrix, say P', such that

    P' = γ P ,                                                         (6.78)

where γ is an arbitrary constant.  Substituting (6.78) into (6.77) we get

    X̂' = (A^T P' A)⁻¹ (A^T P' L̄)
       = (A^T γ P A)⁻¹ (A^T γ P L̄)                                     (6.79)
       = (1/γ) (A^T P A)⁻¹ γ (A^T P L̄)
       = X̂ .

This result indicates that the factor K in equation (6.66) for computing the
weight matrix P from Σ_L̄ can be chosen arbitrarily without any influence on X̂,
which verifies the statement we made earlier, in section 6.4.5.

It should be noted that the vector of discrepancies V, as defined in (6.70),
becomes after the minimization the vector of residuals (see 4.8) of the
observed quantities.  As such, it should again be denoted by a different
symbol, say R, to show that it is no longer a vector of variables (a function
of X) but a vector of fixed quantities.  Some authors use V̂ for this purpose,
and this is the convention we are going to use (see also 6.4.2).  The values
v̂_i are computed directly from equation (6.70), in the same units as those of
the vector L̄.  The adjusted observations are then given by

    L̂ = L̄ + V̂ .

We should keep in mind that one of the main features of the parametric method
of adjustment is that the estimate of the vector of unknown parameters, i.e.
X̂, is a direct result of this adjustment, as given by equation (6.77).

At this stage, it is worthwhile going back to the trivial problem of
adjustment, the sample mean.  According to equation (6.79), we can choose the
weights of the individual observations to be inversely proportional to their
respective variances, with an arbitrary constant K of proportionality.  This
indicates that the weights do not have to equal the experimental
probabilities, for which Σ_{i=1}^n P_i = 1, as we required in sections 6.4.2
and 6.4.3.  In this case, the observation equations will be

    x̂ = ℓ_1 + v_1 ,    with weight P_1 ,
    x̂ = ℓ_2 + v_2 ,    with weight P_2 ,
     ...
    x̂ = ℓ_n + v_n ,    with weight P_n ,

or, in matrix form,

    A x̂ = L̄ + V ,

where

    A = (1, 1, ..., 1)^T ,    L̄ = (ℓ_1, ℓ_2, ..., ℓ_n)^T ,

with weight matrix P = diag (P_1, P_2, ..., P_n).  Substituting in equation
(6.77) we get the solution, i.e. the weighted mean of the sample, as

    x̂ = ( Σ_{i=1}^n P_i ℓ_i ) / ( Σ_{i=1}^n P_i ) ,                    (6.80)

which agrees with the result in section 6.4.2 when Σ_{i=1}^n P_i equals one.
Formula (6.80) is the general formula used to compute the weighted mean of a
sample of weighted observations.
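As a quick check of (6.80), the parametric_adjustment sketch given after
equation (6.77) can be applied to this trivial model (the numbers below are
hypothetical):

```python
import numpy as np

obs = np.array([12.34, 12.36, 12.33, 12.35])    # hypothetical observations
p   = np.array([1.0, 2.0, 1.0, 4.0])            # relative weights, not summing to 1

A = np.ones((len(obs), 1))                      # design matrix of the model x = l
Sigma_L = np.diag(1.0 / p)                      # variances ~ 1/weights (K = 1)

X_hat, V_hat, L_hat = parametric_adjustment(A, obs, Sigma_L)
print(X_hat.item())                             # equals sum(p*obs)/sum(p), eq. (6.80)
```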

Example 6.16: Let us have a levelling line connecting two junction points, G
and J, the elevations of which, H_G and H_J, are known.  The levelling line is
divided into three sections, d_1, d_2 and d_3 long (Figure 6.6).  Each level
difference h_1, h_2 and h_3 was observed, with results h̄_1, h̄_2 and h̄_3.  The
observations h̄_i are considered uncorrelated, with variances proportional to
the corresponding lengths d_i, i = 1, 2, 3.  It is required to determine the
adjusted values of the elevations of points 1 and 2, i.e. H_1 and H_2
respectively, using the parametric adjustment.

Figure 6.6 (levelling line G - 1 - 2 - J)

Solution:  From the given data we have: number of observations n = 3; number
of unknowns u = 2.  Therefore we have one redundant observation.  The
independent relationships between the observations and the unknowns are
written as follows (each relation corresponds to one observation):

    H_1 = h_1 + H_G ,
    −H_1 + H_2 = h_2 ,
    −H_2 = h_3 − H_J .

The above relations can be rewritten in the general form used in the previous
development,

    A X = L ,    (A is 3 by 2, X is 2 by 1, L is 3 by 1),

where X = (H_1, H_2)^T and

    L_1 = h_1 + H_G ,    L_2 = h_2 ,    L_3 = h_3 − H_J .

Putting this in matrix form, we get

    |  1   0 |  | H_1 |     | L_1 |
    | −1   1 |  | H_2 |  =  | L_2 |
    |  0  −1 |              | L_3 | .

The corresponding set of observation equations is

     H_1         =  H_G + (h̄_1 + v_1) ,
    −H_1 + H_2   =        (h̄_2 + v_2) ,
    −H_2         = −H_J + (h̄_3 + v_3) .

These observation equations can be written in matrix form as

    V = A X − L̄ ,

where

    V = (v_1, v_2, v_3)^T ,    X = (H_1, H_2)^T ,
    L̄ = (h̄_1 + H_G,  h̄_2,  h̄_3 − H_J)^T ,

and the design matrix A is given by

        |  1   0 |
    A = | −1   1 |
        |  0  −1 | .

We assumed that the observed values h̄_1, h̄_2 and h̄_3 are uncorrelated.  We
will also assume that H_G and H_J are errorless.  Hence

    Σ_L̄ = diag (S²_h̄1, S²_h̄2, S²_h̄3) .

But S²_h̄i is proportional to d_i, i = 1, 2, 3; thus

    Σ_L̄ = diag (d_1, d_2, d_3) .

Further, we choose K = 1 and we get

    P = K Σ_L̄⁻¹ = diag (1/d_1, 1/d_2, 1/d_3) .

Applying the method of least-squares, the normal equations are

    N X̂ = U ,    (N is 2 by 2, X̂ and U are 2 by 1),

where

    N = A^T P A = | 1/d_1 + 1/d_2       −1/d_2      |
                  |     −1/d_2      1/d_2 + 1/d_3   |

and

    U = A^T P L̄ = | (h̄_1 + H_G)/d_1 − h̄_2/d_2   |
                  | h̄_2/d_2 − (h̄_3 − H_J)/d_3   | .

The solution X̂ is given by

    X̂ = N⁻¹ U ,

where

    N⁻¹ = (d_1 d_2 d_3)/(d_1 + d_2 + d_3) · | 1/d_2 + 1/d_3       1/d_2      |
                                            |     1/d_2       1/d_1 + 1/d_2  | .

Performing the multiplication N⁻¹ U and denoting ΔH = H_J − H_G, we obtain

    Ĥ_1 = H_G + h̄_1 + (d_1 / Σ_i d_i) (ΔH − Σ_i h̄_i) ,
    Ĥ_2 = H_J − h̄_3 − (d_3 / Σ_i d_i) (ΔH − Σ_i h̄_i) .

Now we compute the residuals v̂_i from the equation V̂ = A X̂ − L̄ and find

    v̂_i = (d_i / Σ_j d_j) (ΔH − Σ_j h̄_j) ,    i = 1, 2, 3.

Finally, we compute the adjusted observations from

    L̂ = L̄ + V̂ .

Remembering that H_G and H_J are assumed errorless, we get

    ĥ_i = h̄_i + v̂_i ,    i = 1, 2, 3.

Example 6.17: A local levelling network composed of 6 sections, shown in
Figure 6.7, was observed.  Note that the arrow heads indicate the direction of
increasing elevation.  The following table summarizes the observed differences
in heights h̄_i along with the corresponding length ℓ_i of each section.

    Section      Stations         h̄_i       length ℓ_i
      No.      from     to        (m)          (km)
       1        a        c        6.16           4
       2        a        d       12.57           2
       3        c        d        6.41           2
       4        a        b        1.09           4
       5        b        d       11.58           2
       6        b        c        5.07           4

Figure 6.7 (levelling network a, b, c, d)

Assume that the variances S²_h̄i, i = 1, 2, ..., 6, are proportional to the
corresponding lengths ℓ_i.  The elevation H_a of station a is considered to be
0 metres.  It is required to adjust this levelling net by the parametric
method of adjustment and to deduce the least-squares estimates Ĥ_b, Ĥ_c and
Ĥ_d for the elevations H_b, H_c and H_d of the points b, c and d.

Solution:  From the given data we have: number of independent observations
n = 6, number of unknowns u = 3.  Hence we have 3 redundant observations, i.e.
3 degrees of freedom.

Our mathematical model in this case is linear, i.e.

    A X = L ,    (A is 6 by 3, X is 3 by 1, L is 6 by 1),

where X = (H_b, H_c, H_d)^T.  The 6 independent observation equations will be
(one equation for each observed quantity):

    h̄_1 + v_1 = H_c − H_a = H_c − 0.0 = H_c ,
    h̄_2 + v_2 = H_d − H_a = H_d − 0.0 = H_d ,
    h̄_3 + v_3 = H_d − H_c ,
    h̄_4 + v_4 = H_b − H_a = H_b − 0.0 = H_b ,
    h̄_5 + v_5 = H_d − H_b ,
    h̄_6 + v_6 = H_c − H_b .

The above set of equations can be rewritten in the following form, after
substituting the values of h̄_i:

    v_1 =         H_c        −  6.16 ,
    v_2 =                H_d − 12.57 ,
    v_3 =       − H_c +  H_d −  6.41 ,
    v_4 =  H_b               −  1.09 ,
    v_5 = −H_b        +  H_d − 11.58 ,
    v_6 = −H_b +  H_c        −  5.07 .

In matrix form we can write

    V = A X − L̄ ,

where

    V = (v_1, ..., v_6)^T ,    X = (H_b, H_c, H_d)^T ,
    L̄ = (6.16, 12.57, 6.41, 1.09, 11.58, 5.07)^T m ,

and the design matrix A is

        |  0   1   0 |
        |  0   0   1 |
    A = |  0  −1   1 |
        |  1   0   0 |
        | −1   0   1 |
        | −1   1   0 | .

Since we have no information about the correlation between the h̄_i, we will
treat them as uncorrelated.  Hence the variance-covariance matrix Σ_L̄ of the
observed quantities will be

    Σ_L̄ = diag (4, 2, 2, 4, 2, 4) ,

understanding that the constant factor K is assumed to be one.  The
corresponding weight matrix is given as

    P = diag (0.25, 0.5, 0.5, 0.25, 0.5, 0.25) .

The normal equations are

    N X̂ = U ,

yielding the solution

    X̂ = N⁻¹ U ,

where

    N = A^T P A = |  1.00   −0.25   −0.50 |
                  | −0.25    1.00   −0.50 |
                  | −0.50   −0.50    1.50 | .

Note that N is a symmetric, positive-definite matrix.  Hence

    N⁻¹ = | 1.6   0.8   0.8 |
          | 0.8   1.6   0.8 |
          | 0.8   0.8   1.2 | .

Computing U = A^T P L̄, we get

    U = ( −6.7850,  −0.3975,  15.2800 )^T .

Performing the multiplication N⁻¹ U, we get X̂ as

    X̂ = N⁻¹ U = ( 1.05,  6.16,  12.59 )^T m .

Therefore, we have obtained the following estimates:

    Ĥ_b = 1.05 m ,    Ĥ_c = 6.16 m ,    Ĥ_d = 12.59 m .

By substituting the values of X̂ we get the residual vector V̂ for the observed
h̄_i from the equation

    V̂ = A X̂ − L̄ ,

namely

    V̂ = ( 0.00,  0.02,  0.02,  −0.04,  −0.04,  0.04 )^T m .

The adjusted observations ĥ are computed from

    ĥ_i = h̄_i + v̂_i ,    i = 1, 2, ..., 6,

and we get

    ĥ = ( 6.16,  12.59,  6.43,  1.05,  11.54,  5.11 )^T m .

The computations can be checked by deriving the heights of points b, c and d
from H_a using the adjusted ĥ_i.  The resulting values must not differ from
the adjusted values Ĥ_b, Ĥ_c and Ĥ_d.
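The numbers of this example (and of Example 6.19 below) can be reproduced with
a short numpy sketch, added here for illustration only:

```python
import numpy as np

A = np.array([[0, 1, 0], [0, 0, 1], [0, -1, 1],
              [1, 0, 0], [-1, 0, 1], [-1, 1, 0]], float)   # design matrix
L = np.array([6.16, 12.57, 6.41, 1.09, 11.58, 5.07])        # observed h-bar [m]
P = np.diag(1.0 / np.array([4, 2, 2, 4, 2, 4], float))      # weights = 1/length

N = A.T @ P @ A
U = A.T @ P @ L
X = np.linalg.solve(N, U)       # -> [ 1.05  6.16 12.59 ]  (Hb, Hc, Hd)
V = A @ X - L                   # -> [ 0.00  0.02  0.02 -0.04 -0.04  0.04 ]

df = A.shape[0] - A.shape[1]                     # degrees of freedom = 3
sigma0_sq = (V @ P @ V) / df                     # ~ 0.00067, cf. (6.98)
Sigma_X = sigma0_sq * np.linalg.inv(N)           # estimated covariance, cf. (6.99)
print(X.round(2), V.round(2), round(sigma0_sq, 5))
```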

6.4.7 Variance-Covariance Matrix of the Parametric Adjustment Solution

Vector, Variance Factor and Weight Coefficient Matrix

The parametric adjustment solution vector X̂ is given by equation (6.77), i.e.

    X̂ = (A^T P A)⁻¹ (A^T P L̄) .

This can be written as

    X̂ = B L̄ ,                                                          (6.81)

where

    B = (A^T P A)⁻¹ A^T P = N⁻¹ A^T P .                                (6.82)

The variance-covariance matrix Σ_X̂ of the solution vector X̂ can be easily
deduced by applying the covariance law (equation (6.15)) to (6.81); we get

    Σ_X̂ = B Σ_L̄ B^T .                                                  (6.83)

From equation (6.66) we have

    P = K Σ_L̄⁻¹ ,

and, inverting both sides, we obtain

    Σ_L̄ = K P⁻¹ .                                                      (6.84)

Substituting (6.82) and (6.84) into (6.83) we get

    Σ_X̂ = (N⁻¹ A^T P) K P⁻¹ (N⁻¹ A^T P)^T .                            (6.85)

Both P and N are symmetric matrices, so that we can write P^T = P, N^T = N and
(N⁻¹)^T = N⁻¹.  Substituting this into (6.85) we get

    Σ_X̂ = K N⁻¹ A^T P P⁻¹ P A N⁻¹ = K N⁻¹ N N⁻¹ ,

that is

    Σ_X̂ = K N⁻¹ = K (A^T P A)⁻¹ .                                      (6.86)

On the other hand, by putting P = K Σ_L̄⁻¹ in (6.86) we get

    Σ_X̂ = (1/K) K (A^T Σ_L̄⁻¹ A)⁻¹ = (A^T Σ_L̄⁻¹ A)⁻¹ ,                  (6.87)

which shows that Σ_X̂ does not depend on the choice of the factor K.  In fact,
this statement is valid only if we know the correct values of the elements of
Σ_L̄.  Unfortunately, however, Σ_L̄ is often known only up to a scale factor,
i.e. we know only the relative variances and covariances of the observations.
This means that we have to work with the weight matrix K Σ_L̄⁻¹ without knowing
the actual value of the factor K.  Therefore Σ_X̂ cannot be computed from
equation (6.87).

If we develop the quadratic form V̂^T P V̂,* considering the observations L̄ to
be influenced by random errors only, we get an estimate K̂ for the assumed
factor K, given by

    V̂^T P V̂ = (n − u) K̂ .                                              (6.88)

The multiplier on the right-hand side is nothing else but the difference
between the number of independent observations and the number of unknown
parameters, i.e. the number of redundant observations, which is sometimes
denoted by df and called the number of degrees of freedom, i.e.

    df = n − u .                                                       (6.89)

df must be greater than zero in order to be able to perform a least-squares
adjustment.  Hence equation (6.88) becomes

    K̂ = V̂^T P V̂ / df .                                                 (6.90)

Usually, in the literature, K is known as the a priori variance factor and K̂
is called the least-squares estimate of the variance factor, or simply the
estimated or a posteriori variance factor.  The estimated variance factor can
now be used instead of the a priori one, yielding an estimate of Σ_X̂:

    Σ̂_X̂ = K̂ N⁻¹ = K̂ (A^T P A)⁻¹ = (V̂^T P V̂ / df) (A^T P A)⁻¹ ,          (6.91)

which is known as the estimated variance-covariance matrix of X̂.

------------------------------------------------------------------
* Here, the vector V̂ is the vector of residuals from the least-squares
adjustment.


To discuss the influence of the chosen variance factor K in
-1 "
the weight matrix P =K Z:L on Z:x, as defined by (6.91), we take another

factor, say K ' • We obtian P' = K'Z:-L -1 = yP. Substituting in equation

(6.91) we get:

"T "
~~ = y (V PV) ~ ( Tp )-1 = ~"
6x df Y A A 6x
The above result indicates that fx given by equation (6.91) is independent

of the choice of the a priori variance factor K. We recall that the same
"
holds true for the estimated solution vector X (equation 6.79).

It often happens in the adjustment calculus that we have to use the estimated
parameters X̂ in subsequent adjustments as "observations".  Then we have to
take into account their respective weights.  We know that the weight matrix of
an observation vector must be proportional to the inverse of its
variance-covariance matrix (equation (6.66)).  Thus we can see that the matrix
of normal equations, N, can be immediately used as the weight matrix of the
vector X̂, since the inverse N⁻¹ is proportional to the variance-covariance
matrix Σ_X̂.  Accordingly, the matrix N⁻¹ is also known as the weight
coefficient matrix, and the square roots of its diagonal elements are called
(Hansen's) weight coefficients.

Note that X̂ is called uncorrelated when N⁻¹ is diagonal, i.e. when N is
diagonal.  In such a case we can solve the normal equations separately for
each component of X̂, which satisfies our intuition.  The correlation of X̂ is
only remotely related to the correlation of L̄: X̂ will be uncorrelated if L̄ is
uncorrelated, i.e. P is diagonal, and if the design matrix A is orthogonal.
On the other hand, N may be diagonal even for some other, general matrices P
and A.

Let us now turn once more back to the "adjustment" of the sample mean (see
6.4.3).  It is left to the student to show that the normal equations
degenerate into a single equation, namely equation (6.40).  On the other hand,
using equation (6.91), we obtain the estimated variance of the mean x̄ as

    Ŝ²_x̄ = V̂^T P V̂ / (n − 1) .                                         (6.92)

Evidently the estimated variance of x̄ differs from the variance S²_x̄ (see
equation (6.60)) in the denominator.  By analogy, we define a new statistical
quantity, the estimated variance Ŝ²_L of a sample L,

    Ŝ²_L = (1/(n−1)) Σ_{i=1}^n (ℓ_i − x̄)²                              (6.93)

(compare with equation (3.6)), which is used in statistics whenever the mean x̄
of the sample L is also being determined.  It is again left to the student to
show that, using the estimated variances for the grouped observations (see
6.4.2), the formula (6.92) (instead of (6.60)) can be derived using the
argumentation of 6.4.2 and 6.4.3.

The estimated variances of the sample L and of its mean x̄ can also be computed
using non-normalized weights, i.e. weights p_i for which Σ_{i=1}^n p_i ≠ 1
(see 6.4.6).  It can be shown that the appropriate formulae are

    Ŝ²_L = (1/(n−1)) Σ_{i=1}^n p_i (ℓ_i − x̄)²                          (6.94)

and

    Ŝ²_x̄ = (1 / ((n−1) Σ_{i=1}^n p_i)) Σ_{i=1}^n p_i (ℓ_i − x̄)² .      (6.95)

To conclude this section, let us try to interpret the meaning of the variance
factor K, introduced for the first time in 6.4.5.  Let us take, for
simplicity, an experiment yielding a unit matrix of normal equations, i.e.
N = I.  What would be the variance-covariance matrix of the solution vector X̂?
It will be a diagonal matrix

    Σ_X̂ = K N⁻¹ = K I .                                                (6.96)

This implies that all the variances S²_x̂i of the components of X̂ equal K.
Since the square roots of the diagonal elements of N (all equal to 1) can be
considered as the weights P_i of the components x̂_i of X̂, we can also write

    P_1 S_1² = P_2 S_2² = ··· = P_n S_n² = K .                         (6.97)

Comparison with equation (6.59) gives some insight into the role the variance
factor K plays.  It can be regarded as the variance of unit weight (see 6.4.3)
and is accordingly usually denoted by either S_0² or σ_0² (in the case of
postulated variances).  This is again intuitively pleasing, since it ties
together formulae (6.66) and (6.65), where K can also be equated to S_0².
Analogically, we denote K̂ by either Ŝ_0² or σ̂_0².

By adopting the notation σ̂_0² for K̂, and further by denoting the weight
coefficient matrix of the estimated parameters X̂, i.e. N⁻¹, by Q, the
equations (6.90) and (6.91) become

    σ̂_0² = V̂^T P V̂ / df                                                (6.98)

and

    Σ̂_X̂ = σ̂_0² Q .                                                     (6.99)

Example 6.18: Let us compute the estimated variance-covariance matrix Σ̂_X̂ of
the adjusted parameters X̂ in Example 6.16.  The Σ̂_X̂ matrix is computed from
equation (6.99).  First, from the above-mentioned example we have

    V̂^T = ((H_J − H_G − Σ_i h̄_i) / Σ_i d_i) · (d_1, d_2, d_3) ,

    P = diag (1/d_1, 1/d_2, 1/d_3)

and

    df = n − u = 3 − 2 = 1 .

Hence

    V̂^T P V̂ = (H_J − H_G − Σ_i h̄_i)² / Σ_i d_i ,

and

    σ̂_0² = V̂^T P V̂ / df = (H_J − H_G − Σ_i h̄_i)² / Σ_i d_i .

As we have seen, N⁻¹ = Q is given by

    Q = N⁻¹ = (d_1 d_2 d_3)/(Σ_i d_i) · | (d_2 + d_3)/(d_2 d_3)        1/d_2         |
                                        |       1/d_2            (d_1 + d_2)/(d_1 d_2) | .

We thus finally obtain

    Σ̂_X̂ = σ̂_0² Q = ((H_J − H_G − Σ_i h̄_i)² / Σ_i d_i) · Q .

Example 6.19: Let us compute the estimated variance-covariance matrix Σ̂_X̂ of
the adjusted parameters X̂ in Example 6.17, using equations (6.98) and (6.99).
First, from the above-mentioned example we have

    V̂^T = (0.00, 0.02, 0.02, −0.04, −0.04, 0.04)   in metres,

    P = diag (0.25, 0.5, 0.5, 0.25, 0.5, 0.25)   in m⁻²,

and

    df = n − u = 6 − 3 = 3 .

Hence

    V̂^T P V̂ = 0.002   (unitless)

and

    σ̂_0² = 0.002 / 3 ≈ 0.00067   (unitless).

Also, from Example 6.17, we have

    Q = N⁻¹ = | 1.6   0.8   0.8 |
              | 0.8   1.6   0.8 |   in m².
              | 0.8   0.8   1.2 |

Finally,

    Σ̂_X̂ = σ̂_0² Q = 10⁻⁴ · | 10.67    5.33    5.33 |
                           |  5.33   10.67    5.33 |   in m²,
                           |  5.33    5.33    8.00 |

or

    Σ̂_X̂ = | 10.67    5.33    5.33 |
           |  5.33   10.67    5.33 |   in cm².
           |  5.33    5.33    8.00 |

6.4.8 Some Properties of the Parametric Adjustment Solution Vector

It can be shown that the choice of the weight matrix P of the observations L̄
(proportional to the inverse of the variance-covariance matrix Σ_L̄) and the
choice of the least-squares method (minimization of V^T P V) to get the
solution X = X̂ ensures that the resulting estimate X̂ has the smallest possible
trace of its variance-covariance matrix Σ_X̂.  In other words, taking
P = σ_0² Σ_L̄⁻¹ and seeking min_{X∈R^u} V^T P V provides such a solution X̂ as
satisfies at the same time the condition

    min_{X∈R^u} trace (Σ_X̂) .                                          (6.100)

This is a result similar to the consequence of the least-squares principle
applied to a random multivariate (section 5.4), and we are not going to prove
it here.

Similarly, it can be shown that for an uncorrelated multisample of
observations L = (L_1, L_2, ..., L_n), which are assumed to be normally
distributed with PDF given by

    φ(L°; S; L) = Π_{i=1}^n [1/(S_i √(2π))] exp [−(ℓ_i − ℓ°_i)²/(2 S_i²)] ,   (6.101)

we get the most probable estimate of L° if the condition min_{X∈R^u} V^T P V
is satisfied.  This can be verified by writing

    φ(L°; S; L) = [1/((2π)^{n/2} Π_{i=1}^n S_i)] exp [−½ Σ_{i=1}^n (ℓ_i − ℓ°_i)²/S_i²]

                = [1/((2π)^{n/2} Π_{i=1}^n S_i)] exp [−½ V^T P V] ,

which is maximum if both V^T P V and trace (Σ_X̂) are minimum.  This is valid
for any fixed K.

6.4.9 Relative Weights, Statistical Significance of A Priori and A Posteriori Variance Factors

We have seen in section 6.4.6 that the choice of the a priori variance factor
σ_0², or K, does not influence the estimated solution vector X̂.  Also, in
section 6.4.7 we have seen that the same holds true even for the estimated
variance-covariance matrix Σ̂_X̂.  Hence, for the purpose of getting the
solution vector X̂ along with its Σ̂_X̂, we can assume any relative weights, i.e.
P = σ_0² Σ_L̄⁻¹ with σ_0² chosen arbitrarily.  On the other hand, the matrix of
normal equations, N = A^T P A, and the estimated variance factor,
σ̂_0² = V̂^T P V̂ / df, are influenced by the selection of σ_0².

These features of σ_0² are used in practice for two different purposes.  The
first is to render the magnitude of the elements of the normal equation matrix
N such as to make the numerical process of its inversion the most precise.
This is accomplished by choosing the value of σ_0² so as to make the average
of the elements of N close to one.

The second purpose is to test the consistency of the mathematical model with
the observations and to test the correctness of the assumed
variance-covariance matrix Σ_L̄.  Usually, if we do not have any idea about the
value of the variance factor σ_0², we assume σ_0² = 1; after performing the
least-squares adjustment, we get σ̂_0² as an estimate of the assumed σ_0².  The
ratio σ̂_0²/σ_0² provides some testimony about the correctness of Σ_L̄ and the
consistency of the model.  This ratio should approach 1.  By assuming, in
particular, σ_0² = 1, we should end up with σ̂_0² ≈ 1 as well.  If this is not
satisfied, we start looking into the assumed Σ_L̄ and use the σ̂_0² obtained
from the adjustment instead of σ_0² in computing the weights.  If the
resulting new variances and covariances of the observations are beyond the
expected range known from experience, we have to start examining the
consistency of the mathematical model with the observations, i.e. whether it
really represents the correct relationship between the observed and the
unknown quantities.

This approach is also used to help detect existing "systematic errors" in the
observations L̄, which manifest themselves as deviations from the mathematical
model.  These deviations cause an "overflow" into the value of the quadratic
form V̂^T P V̂ and, consequently, into σ̂_0².

The theoretical relation between the a priori and a posteriori variance
factors allows us to test statistically the validity of our hypothesis.
However, this particular topic is going to be dealt with elsewhere.  Let us
just comment here on the results of the adjustment of the levelling network
discussed in Examples 6.17 and 6.19.  In computing the weight matrix P, we
assumed σ_0² = 1.  After the adjustment we obtained σ̂_0² ≈ 0.00067.  Thus the
ratio σ_0²/σ̂_0² equals about 1500, which is considerably different from 1.
This suggests that the variance-covariance matrix Σ_L̄ was postulated too
"pessimistically" and that the actual variances of the observations are much
lower.



6.4.10 Conditional Adjustment

In this section we are going to deal with the adjustment of the linear model
(6.68), i.e.

    B L = C ,    (r < n),                                              (6.102)

which represents a set of r independent linear conditions between the n
observations L.  Note that C is an r by 1 vector of constant values arising
from the conditions.

For the adjustment, the above model is reformulated as

    B (L̄ + V) − C = 0 ,

or, as we usually write it,

    B V + W = 0 ,*                                                     (6.103)

where

    W = B L̄ − C .                                                      (6.104)

The system of equations (6.103) is known as the condition equations, in which
B is the coefficient matrix, V is the vector of discrepancies and W is the
vector of constant terms (the misclosures).  We recall that n is the number of
observations and r is the number of independent conditions.  It should also be
noted that no unknown parameters, i.e. no vector X, appear in the condition
equations.  The discrepancies V are the only unknowns.

------------------------------------------------------------------
* If we have a non-linear model F(L) = 0, it can again be linearized by a
Taylor series expansion, yielding

    F(L̂) = F(L°) + (∂F/∂L)|_{L=L°} (L̂ − L°) + ··· ,

in which we again neglect the higher order terms.  Putting V = (L̂ − L°), B for
(∂F/∂L)|_{L=L°} and W = F(L°), we end up with the linearized condition
equations of the form B V + W = 0, which is the same as (6.103).
------------------------------------------------------------------

We wish again to get such an estimate V̂ of V as would minimize the quadratic
form V^T P V, where P = σ_0² Σ_L̄⁻¹ is the assumed weight matrix of the
observations L̄.  The formulation of this condition, i.e. min_{V∈R^n} V^T P V,
is not as straightforward as it is in the parametric case (section 6.4.6).
This is due to the fact that V in equation (6.103) cannot be easily expressed
as an explicit function of B and W.  However, the problem can be solved by
introducing the vector K of r unknowns, called Lagrange's multipliers or
correlates.*  We can write

    min_{V∈R^n} V^T P V = min_{V∈R^n} [V^T P V + 2 K^T (B V + W)] ,    (6.105)

since the second term on the right-hand side equals zero.  Let us denote the
bracketed function by

    φ(V) = V^T P V + 2 K^T (B V + W) .

To minimize the above function, we differentiate it with respect to V and
equate the derivatives to zero.  We get

    ∂φ/∂V = 2 V^T P + 2 K^T B = 0 ,

which, when transposed, gives P V + B^T K = 0.  The last equation can be
solved for V̂, and we obtain

    V̂ = −P⁻¹ B^T K .                                                   (6.106)

This system of equations is known as the correlate equations.

Substituting equation (6.106) into (6.103), we eliminate V:

    B (−P⁻¹ B^T K) + W = 0 ,

or

    (B P⁻¹ B^T) K = W .                                                (6.107)

This is the system of normal equations for the conditional adjustment.  It is
usually written in the following abbreviated form:

    M K = W ,                                                          (6.108)

where

    M = B P⁻¹ B^T .                                                    (6.109)

The solution of the above system of normal equations for K yields

    K = M⁻¹ W = (B P⁻¹ B^T)⁻¹ W .                                      (6.110)

Once we get the correlates K, we can compute the estimated residual vector V̂
from the correlate equations (6.106).  Finally, the adjusted observations L̂
are computed from

    L̂ = L̄ + V̂ .                                                        (6.111)

In fact, if we are not interested in the intermediate steps, the formula for
the adjusted observations L̂ can be written in terms of the original matrices B
and P and the vectors L̄ and C.  We get

    L̂ = L̄ + V̂
      = L̄ − P⁻¹ B^T K
      = L̄ − P⁻¹ B^T (B P⁻¹ B^T)⁻¹ (B L̄ − C) .                          (6.112)

It can also be written in the following form:

    L̂ = (I − T) L̄ + H C ,                                              (6.113)

where I is the identity matrix,

    T = P⁻¹ B^T (B P⁻¹ B^T)⁻¹ B    and    H = P⁻¹ B^T (B P⁻¹ B^T)⁻¹ .  (6.114)

------------------------------------------------------------------
* This is why the conditional adjustment is sometimes called adjustment by
correlates.
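The steps (6.104) through (6.111) translate directly into a short numpy
routine; the sketch below is added for illustration only (the a priori
variance factor is called K0 here to avoid a clash with the correlate vector K
of the text):

```python
import numpy as np

def conditional_adjustment(B, C, L_bar, Sigma_L, K0=1.0):
    """Conditional least squares: returns correlates K, residuals V_hat, adjusted L_hat."""
    P_inv = (1.0 / K0) * Sigma_L           # inverse weight matrix, P = K0 * inv(Sigma_L)
    W = B @ L_bar - C                      # misclosure vector (6.104)
    K = np.linalg.solve(B @ P_inv @ B.T, W)   # correlates (6.107)-(6.110)
    V_hat = -P_inv @ B.T @ K               # correlate equations (6.106)
    L_hat = L_bar + V_hat                  # adjusted observations (6.111)
    return K, V_hat, L_hat
```

A convenient check is that the adjusted observations satisfy the conditions
exactly, i.e. B @ L_hat equals C up to rounding, for any choice of K0.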

Example 6.20: Let us solve Example 6.16 again, using this time the conditional
method of adjustment.  We have only one condition equation between the
observed height differences h̄_i, i = 1, 2, 3, and we thus note that the number
of degrees of freedom is the same as in Example 6.16.  Denoting H_J − H_G by
ΔH, the existing condition can be written as

    Σ_i ĥ_i = ΔH .

After reformulation we get

    v_1 + v_2 + v_3 + (h̄_1 + h̄_2 + h̄_3 − ΔH) = 0 ,

which can easily be written in the matrix form

    B V + W = 0 ,

where

    B = (1, 1, 1) ,    V = (v_1, v_2, v_3)^T

and

    W = h̄_1 + h̄_2 + h̄_3 − ΔH = Σ_i h̄_i − ΔH .

The weight matrix of the observations is given by (see Example 6.16)

    P = diag (1/d_1, 1/d_2, 1/d_3)

and

    P⁻¹ = diag (d_1, d_2, d_3) .

The system of normal equations for the correlates K is given by equation
(6.108) as

    M K = W ,

where

    M = B P⁻¹ B^T = Σ_i d_i .

The solution for K is

    K = M⁻¹ W = (Σ_i h̄_i − ΔH) / Σ_i d_i .

The estimated residuals are then computed from equation (6.106) as

    V̂ = −P⁻¹ B^T K = −(d_1, d_2, d_3)^T (Σ_i h̄_i − ΔH) / Σ_i d_i ,

and we get

    v̂_i = (ΔH − Σ_j h̄_j) d_i / Σ_j d_j ,    i = 1, 2, 3.

This is the same result as obtained in Example 6.16 when using the parametric
adjustment (note that ΔH = H_J − H_G).

In this particular problem, we notice that the adjustment distributes the
misclosure, i.e. (ΔH − Σ_i h̄_i), among the individual observed height
differences proportionally to the corresponding lengths of the levelling
sections, i.e. inversely proportionally to the individual weights.  The
adjusted observations are given by equation (6.111), i.e.

    L̂ = L̄ + V̂ ,

or

    ĥ_i = h̄_i + v̂_i ,    i = 1, 2, 3.

This yields

    ĥ_i = h̄_i + ((ΔH − Σ_j h̄_j) / Σ_j d_j) d_i .

Finally, the estimates of the unknown parameters, i.e. X̂ = (Ĥ_1, Ĥ_2), are
computed from the known elevations and the adjusted observations ĥ_i as
follows:

    Ĥ_1 = H_G + ĥ_1 = H_G + h̄_1 + (d_1 / Σ_i d_i) (ΔH − Σ_i h̄_i)

and

    Ĥ_2 = H_J − ĥ_3 = H_J − h̄_3 − (d_3 / Σ_i d_i) (ΔH − Σ_i h̄_i) .

The results are again identical to the ones obtained from the parametric
adjustment.

Example 6.21: Let us solve Example 6.17 again, but this time using the
conditional adjustment.  The configuration of the levelling network in
question is illustrated again in Figure 6.8 for convenience.

Figure 6.8 (levelling network a, b, c, d)

From the above-mentioned example we have: number of observations n = 6, number
of unknown parameters u = 3.  Then df = 6 − 3 = 3, and we shall see that we
can again formulate only 3 independent condition equations between the given
observations.

By examining Figure 6.8, we see that there are 4 closed loops, namely
(a - c - d - a), (a - d - b - a), (b - c - d - b) and (a - c - b - a).  This
means that we can write 4 condition equations, one for each closed loop.
However, one of them can be deduced from the other 3; e.g. the last mentioned
loop is the summation of the other three loops.  Let us, for instance, choose
the following three loops:

    loop I   = a - c - b - a ,
    loop II  = a - c - d - a ,
    loop III = a - d - b - a .

These loops give the condition equations as follows:

    (h̄_1 + v_1) − (h̄_6 + v_6) − (h̄_4 + v_4) = 0 ,
    (h̄_1 + v_1) + (h̄_3 + v_3) − (h̄_2 + v_2) = 0 ,
    (h̄_2 + v_2) − (h̄_5 + v_5) − (h̄_4 + v_4) = 0 .

Then we get

    v_1 − v_4 − v_6 + (h̄_1 − h̄_4 − h̄_6) = 0 ,
    v_1 − v_2 + v_3 + (h̄_1 − h̄_2 + h̄_3) = 0 ,
    v_2 − v_4 − v_5 + (h̄_2 − h̄_4 − h̄_5) = 0 .

The above set of condition equations can be written in matrix notation as

    B V + W = 0 ,    (B is 3 by 6, V is 6 by 1, W is 3 by 1),

where

        | 1   0   0  −1   0  −1 |
    B = | 1  −1   1   0   0   0 | ,
        | 0   1   0  −1  −1   0 |

    V^T = (v_1, v_2, v_3, v_4, v_5, v_6)

and

        | h̄_1 − h̄_4 − h̄_6 |
    W = | h̄_1 − h̄_2 + h̄_3 | .
        | h̄_2 − h̄_4 − h̄_5 |

Substituting the observed quantities h̄_i, i = 1, 2, ..., 6, into the above
vector we get

    W = (0.0,  0.0,  −0.1)^T   in metres.

The weight matrix P of the observations is formulated as (see Example 6.17)

    P = diag (0.25, 0.5, 0.5, 0.25, 0.5, 0.25)

and

    P⁻¹ = diag (4, 2, 2, 4, 2, 4) .

The normal equations for the correlates K are

    M K = W ,

where

    M = B P⁻¹ B^T = | 12    4    4 |
                    |  4    8   −2 |
                    |  4   −2    8 | .

By inverting M we get

    M⁻¹ = |  0.15   −0.1   −0.1 |
          | −0.1     0.2    0.1 |
          | −0.1     0.1    0.2 | .

The solution for K is given by

    K = M⁻¹ W = ( 0.01,  −0.01,  −0.02 )^T .

The estimated residuals are computed from equation (6.106):

    V̂ = −P⁻¹ B^T K = ( 0.00,  0.02,  0.02,  −0.04,  −0.04,  0.04 )^T m ,

and are again identical with the results of Example 6.17.  The adjusted
observations will be

    L̂ = L̄ + V̂ ,

i.e.

    ĥ = ( 6.16,  12.59,  6.43,  1.05,  11.54,  5.11 )^T   in metres.

Finally, to compute the estimated elevations of points b, c and d, i.e. Ĥ_b,
Ĥ_c and Ĥ_d, we use the given elevation H_a and the adjusted observations ĥ_i.
For instance:

    Ĥ_b = H_a + ĥ_4 = 0.0 + 1.05  =  1.05 m ,
    Ĥ_c = H_a + ĥ_1 = 0.0 + 6.16  =  6.16 m ,
    Ĥ_d = H_a + ĥ_2 = 0.0 + 12.59 = 12.59 m .

These are obviously identical with the corresponding results of the parametric
adjustment.  Note again that when computing the estimates of the unknown
parameters from the adjusted observations we can follow any route; they all
lead to the same answer.
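Numerically, the example can be re-run with the conditional_adjustment sketch
given after equation (6.114) (illustrative only, not part of the notes); the
last lines also evaluate the Σ̂_L̂ of equation (6.121) used in Example 6.23:

```python
import numpy as np

B = np.array([[1, 0, 0, -1, 0, -1],
              [1, -1, 1, 0, 0, 0],
              [0, 1, 0, -1, -1, 0]], float)
C = np.zeros(3)                                     # loop closures must vanish
L = np.array([6.16, 12.57, 6.41, 1.09, 11.58, 5.07])
Sigma_L = np.diag([4.0, 2.0, 2.0, 4.0, 2.0, 4.0])   # variances ~ section lengths

K, V, L_hat = conditional_adjustment(B, C, L, Sigma_L)
print(K.round(2))       # -> [ 0.01 -0.01 -0.02 ]
print(V.round(2))       # -> [ 0.    0.02  0.02 -0.04 -0.04  0.04 ]

P = np.linalg.inv(Sigma_L)
sigma0_sq = (V @ P @ V) / B.shape[0]                # ~ 0.00067
T = Sigma_L @ B.T @ np.linalg.inv(B @ Sigma_L @ B.T) @ B
Sigma_L_hat = sigma0_sq * (np.eye(6) - T) @ Sigma_L   # equation (6.121), with P_inv = Sigma_L
```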

6.4.11 Variance-Covariance Matrix of the Conditional Adjustment Solution

The formula for the variance-covariance matrix Σ_L̂ of the adjusted
observations (the "result" of the conditional least-squares adjustment) can be
developed by applying the law of propagation of the variance-covariance matrix
(equation (6.15)) to equation (6.113).  In this equation, the matrices I, T
and H are obviously fixed.  Similarly, the vector C is considered a vector of
theoretically deduced, and therefore errorless, values, so that Σ_C is zero.
Hence we get

    Σ_L̂ = (∂L̂/∂L̄) Σ_L̄ (∂L̂/∂L̄)^T = (I − T) Σ_L̄ (I − T)^T
        = Σ_L̄ − T Σ_L̄ − Σ_L̄ T^T + T Σ_L̄ T^T .                          (6.115)

It is not difficult to see that both (Σ_L̄ T^T) and (T Σ_L̄) are square
symmetric matrices, hence we can write

    Σ_L̂ = Σ_L̄ − T Σ_L̄ (2 I − T^T) .                                    (6.116)

Recalling, from equation (6.114), that T = P⁻¹ B^T (B P⁻¹ B^T)⁻¹ B, and
knowing that P = σ_0² Σ_L̄⁻¹, i.e. Σ_L̄ = σ_0² P⁻¹, then by substituting these
quantities into equation (6.116) we get

    Σ_L̂ = σ_0² [P⁻¹ − P⁻¹ B^T (B P⁻¹ B^T)⁻¹ B P⁻¹ (2 I − T^T)] .        (6.117)

Noting that

    T Σ_L̄ T^T = σ_0² P⁻¹ B^T (B P⁻¹ B^T)⁻¹ B P⁻¹ = T Σ_L̄ ,

we finally get

    Σ_L̂ = σ_0² P⁻¹ (I − B^T (B P⁻¹ B^T)⁻¹ B P⁻¹) .                      (6.118)

Here, similarly to the parametric adjustment, to obtain the estimated
variance-covariance matrix Σ̂_L̂ we use σ̂_0² instead of σ_0², where

    σ̂_0² = V̂^T P V̂ / r ,                                               (6.119)

and we end up with

    Σ̂_L̂ = σ̂_0² (P⁻¹ − P⁻¹ B^T (B P⁻¹ B^T)⁻¹ B P⁻¹) ,                   (6.120)

or, in an abbreviated form,

    Σ̂_L̂ = σ̂_0² (I − T) P⁻¹ .*                                          (6.121)

Analogously to the parametric adjustment, it can also be shown that the
estimate L̂ ensures the minimum trace of its variance-covariance matrix Σ_L̂.
Under the same assumptions as stated in section 6.4.8, the estimate L̂ is also
the most probable estimate of L.

Regarding the correlation between the adjusted observations L̂, we can see that
L̂ will be uncorrelated if (i) L̄ is uncorrelated and (ii) the coefficient
matrix B is orthogonal.  If these two conditions are satisfied, then T and P⁻¹
will be diagonal matrices.  On the other hand, we may encounter an
uncorrelated L̂ even for some other, general T and P.

Finally, we note that again the choice of the a priori variance factor σ_0²
does not influence the estimated Σ̂_L̂ defined by equation (6.121).

------------------------------------------------------------------
* It can be shown that, similarly, Σ̂_V̂ = σ̂_0² T P⁻¹ .

Example 6.22: Let us determine the variance-covariance matrix Σ̂_L̂ for the
conditional adjustment formulated in Example 6.20.  We have

    v̂_i = (ΔH − Σ_j h̄_j) d_i / Σ_j d_j ,    i = 1, 2, 3,

and

    P = diag (1/d_1, 1/d_2, 1/d_3) .

Thus we get

    σ̂_0² = V̂^T P V̂ / r = (ΔH − Σ_i h̄_i)² / Σ_i d_i .

The required variance-covariance matrix is given by equation (6.121), i.e.

    Σ̂_L̂ = σ̂_0² (I − T) P⁻¹ .

First we compute T = P⁻¹ B^T M⁻¹ B.  We recall from Example 6.20 that
M⁻¹ = 1/Σ_i d_i and B = (1, 1, 1).  Hence

        | d_1 |                        | d_1   d_1   d_1 |
    T = | d_2 | (1/Σ_i d_i) (1,1,1) =  | d_2   d_2   d_2 | / Σ_i d_i .
        | d_3 |                        | d_3   d_3   d_3 |

Further we get

    [(I − T) P⁻¹]_ij = d_j δ_ij − d_i d_j / Σ_k d_k ,    i, j = 1, 2, 3,

i.e. a matrix with diagonal elements d_i (1 − d_i/Σ_k d_k) and off-diagonal
elements −d_i d_j / Σ_k d_k.  Finally we get

    Σ̂_L̂ = σ̂_0² (I − T) P⁻¹
        = ((ΔH − Σ_i h̄_i)² / Σ_i d_i) · [ d_j δ_ij − d_i d_j / Σ_k d_k ] .

Example 6.23: Let us determine the variance-covariance matrix Σ̂_L̂ for the
conditionally adjusted levelling net of Example 6.21.  We have

    V̂^T = (0.00, 0.02, 0.02, −0.04, −0.04, 0.04)   in metres,

    P = diag (0.25, 0.5, 0.5, 0.25, 0.5, 0.25)

and

    r = df = n − u = 6 − 3 = 3 .

Hence

    V̂^T P V̂ = 0.002 ,

    σ̂_0² = V̂^T P V̂ / r = 0.002 / 3 ≈ 0.00067 .

The required Σ̂_L̂ matrix is computed again from equation (6.121) as

    Σ̂_L̂ = σ̂_0² (I − T) P⁻¹ ,

where

    P⁻¹ = diag (4, 2, 2, 4, 2, 4)

and T is computed from

    T = P⁻¹ (B^T M⁻¹ B) .

From Example 6.21 we have

    M⁻¹ = (B P⁻¹ B^T)⁻¹ = |  0.15   −0.1   −0.1 |
                          | −0.1     0.2    0.1 |
                          | −0.1     0.1    0.2 |

and

        | 1   0   0  −1   0  −1 |
    B = | 1  −1   1   0   0   0 | .
        | 0   1   0  −1  −1   0 |

Hence

                 |  0.15  −0.10   0.10  −0.05   0.00  −0.05 |
                 | −0.10   0.20  −0.10  −0.10  −0.10   0.00 |
    B^T M⁻¹ B =  |  0.10  −0.10   0.20   0.00  −0.10   0.10 |
                 | −0.05  −0.10   0.00   0.15   0.10   0.05 |
                 |  0.00  −0.10  −0.10   0.10   0.20  −0.10 |
                 | −0.05   0.00   0.10   0.05  −0.10   0.15 |

and

                           |  0.6  −0.4   0.4  −0.2   0.0  −0.2 |
                           | −0.2   0.4  −0.2  −0.2  −0.2   0.0 |
    T = P⁻¹ (B^T M⁻¹ B) =  |  0.2  −0.2   0.4   0.0  −0.2   0.2 |
                           | −0.2  −0.4   0.0   0.6   0.4   0.2 |
                           |  0.0  −0.2  −0.2   0.2   0.4  −0.2 |
                           | −0.2   0.0   0.4   0.2  −0.4   0.6 | .

Hence

                   |  1.6   0.8  −0.8   0.8   0.0   0.8 |
                   |  0.8   1.2   0.4   0.8   0.4   0.0 |
    (I − T) P⁻¹ =  | −0.8   0.4   1.2   0.0   0.4  −0.8 |
                   |  0.8   0.8   0.0   1.6  −0.8  −0.8 |
                   |  0.0   0.4   0.4  −0.8   1.2   0.8 |
                   |  0.8   0.0  −0.8  −0.8   0.8   1.6 |

and finally

    Σ̂_L̂ = σ̂_0² (I − T) P⁻¹

                  | 10.72   5.36  −5.36   5.36   0.00   5.36 |
                  |  5.36   8.04   2.68   5.36   2.68   0.00 |
         = 10⁻⁴ · | −5.36   2.68   8.04   0.00   2.68  −5.36 |   in metres squared.
                  |  5.36   5.36   0.00  10.72  −5.36  −5.36 |
                  |  0.00   2.68   2.68  −5.36   8.04   5.36 |
                  |  5.36   0.00  −5.36  −5.36   5.36  10.72 |

By dropping the scalar 10⁻⁴ we get the results in cm².

The comments made at the end of section 6.4.9 regarding the value of σ̂_0²
versus the assumed value of 1.0 for σ_0² hold true here as well.



6.5 Exercise 6

1. To determine the height h of a wall shown in the Figure, the horizontal
   distance ℓ and the vertical angle θ were observed and found to be:

       ℓ = 85.34 m ,        with S_ℓ = 2 cm ,
       θ = 12° 37' 30" ,    with S_θ = 10" .

   Required: Compute the statistical estimate for h along with its RMS.

2. To determine the distance P_1P_2 = c, which cannot be measured directly due
   to the existence of an obstacle, as shown in the Figure, the following
   measurements were taken:

       a ,                     with S_a = 3 cm ,
       P_1P_3 = b = 40 m ,     with S_b = 4 cm ,
       γ = 60° ,               with S_γ = 25" .

   Required: Compute the distance P_1P_2 and its standard error to the nearest
   mm.
3. Determine the standard error of the estimated hedght h of the tower

given in Problem number 9, Exercise 4, s~ction 4.11. Consider all the

measured ~uantities, namely t, a, S and 8 to be uncorrelated.

4. From a point P 0 in the x-y coordinate system shown below, a distance

d = 5637.8 mandan azimuth T = 49.9873 grads (100 grads= 90 degrees)

to a second point P1 were measured. The relative error of d is

1.2 · 10- 4 . The RMS ofT is 0.08 centigrads (1 grad= 100 centigrads).
221

Required: Compute the following:


" ·"
( i) The coordinate difference_s ( l:::.x, l:::.y)

between points P0 and P1 .

(ii) The variance-covariance matrix EX,


" " 2
where X = (t::.x, !:::.y), in m ,

(iii) The RMS of l:::.x and !:::.y respectively.


- - ....
b.)l.
(iv) The correlation coefficient

between l:::.x and l:::.y.


~----------------~~x

5. The shown traverse consists of two legs P_0P_1 and P_1P_2.  The coordinates
   (x_0, y_0) of the initial point P_0, as well as the (x, y) coordinates of
   the reference mark R, are considered to be error-free (errorless), i.e.
   fixed quantities.  The measured quantities are the horizontal angles β_1
   and β_2 and the horizontal distances d_1 and d_2 respectively.  The
   available data are:

       x_R = 100.0 m ,    y_R = 200.0 m ,
       x_0 = 150.0 m ,    y_0 = 150.0 m ,
       β_1 = 75° ,        with S_β1 = 3" ,
       β_2 = 210° ,       with S_β2 = 2" ,
       d_1 = 100 m   and   d_2 = 200 m .

   The standard error of an observed distance is to be calculated according to
   the formula

       S_d (cm) = 1.0 (cm) + d (m) · 10⁻² .

   Required: Compute the following:

   (i)   The estimated coordinates (x̂_1, ŷ_1) of point P_1, along with their
         associated variance-covariance matrix Σ_(x̂1, ŷ1).
   (ii)  The estimated coordinates (x̂_2, ŷ_2) of point P_2 and their
         variance-covariance matrix Σ_(x̂2, ŷ2).  Note that the coordinates are
         required to the nearest mm and the variances and covariances in cm².
   (iii) Discuss the correlation among the estimated coordinates x̂_1, ŷ_1, x̂_2
         and ŷ_2.

6. In an intersection problem, as shown in the Figure, the two horizontal
   angles β and α are observed from the two known stations P_1(x_1, y_1) and
   P_2(x_2, y_2) in order to determine the (x, y) coordinates of an unknown
   station P.

   Given the following data:

       x_1 = 200.0 m ,    y_1 = 500.0 m ,    Σ_(x1, y1) = | ·     ·   | cm² ,
                                                          | ·     ·   |

       x_2 = 546.4 m ,    y_2 = 300.0 m ,    Σ_(x2, y2) = |  ·   −0.5 | cm² ,
                                                          | −0.5   3  |

       α = 90° ,    S_α = 3" ,
       β = 60° ,    S_β = 2" ,    and S_αβ = 0 .

   Required: Compute the estimated (x̂, ŷ) coordinates of the unknown station
   P, in metres, and their associated variance-covariance matrix Σ_(x̂, ŷ) in
   cm².

7. Consider problem number 1 of this exercise.  Assume that the observed
   quantities ℓ and θ have also got non-random (systematic) errors of −1 cm
   and 5" respectively.  Compute the expected total error in the derived
   height h in centimetres.

8. Determine the expected error in the sum of a hundred numbers in the


following two cases:

(i) each individual number is to be rounded off to three decimal

places.

(ii) each individual number is to be truncated to three decimal

places.

Then compare and comment on the results.
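A small Monte-Carlo sketch of the two cases, assuming the dropped digits are uniformly distributed (the trial count and the simulation itself are illustrative, not part of the original notes):

```python
import random

# A rounding error of one number lies in [-0.0005, +0.0005] (zero mean),
# while a truncation error lies in [-0.001, 0] (mean -0.0005).
N, TRIALS = 100, 20000

def sum_error(truncate):
    err = 0.0
    for _ in range(N):
        e = random.uniform(-0.0005, 0.0005)
        if truncate:
            e -= 0.0005          # truncation = rounding shifted by half a unit
        err += e
    return err

for label, trunc in (("rounded", False), ("truncated", True)):
    errs = [sum_error(trunc) for _ in range(TRIALS)]
    mean = sum(errs) / TRIALS
    rms = (sum(e * e for e in errs) / TRIALS) ** 0.5
    print(f"{label:9s}: mean error = {mean:+.4f}, RMS = {rms:.4f}")
```

Under these assumptions the rounded sum has zero expected error with an RMS of about 10 · 0.0005/√3 ≈ 0.003, whereas the truncated sum carries a systematic error of about 100 · (-0.0005) = -0.05 in addition to a comparable random part.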

9. To determine the height h of a tower, the configuration shown in the
   Figure was proposed, in which α, β, θ and a are the quantities to be
   measured. The approximate values of these quantities were obtained from
   a preliminary investigation and found to be:

        a = 100 m.

   Providing that the horizontal angles α and β are to be measured with a
   precision of 2" (i.e. S_α = S_β = 2"), what are the required precisions
   in measuring both the horizontal distance a and the vertical angle θ
   (i.e. S_a and S_θ) such that their contributions to the standard error
   S_h of the derived height h, which is specified not to be worse than
   2.45 cm, will be the same?



10. Assume that the horizontal angles in a triangulation network are to be
    measured using two theodolites, "I" and "II", of different quality.
    These two theodolites were tested by measuring one particular angle
    several times, from which it was found that the standard deviation of
    one observation (i.e. the standard deviation of the sample) observed
    with theodolite "I" was s_I = 1".5 and with theodolite "II" was
    s_II = 2".5. If it is specified that all the angles of the network are
    required to have a standard deviation of the mean not worse than 0".5,
    how many times should we measure each angle when using theodolite "I"
    and when using theodolite "II"?
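The computation behind this problem is the relation S = s/√n between the standard deviation of one observation and that of the mean; a tiny illustrative sketch:

```python
import math

# Standard deviation of the mean of n observations is S = s / sqrt(n),
# so the required number of repetitions is n = (s / S)^2.
def repetitions(s_one_obs, s_mean_required):
    return math.ceil((s_one_obs / s_mean_required) ** 2)

print("theodolite I :", repetitions(1.5, 0.5))   # -> 9 measurements
print("theodolite II:", repetitions(2.5, 0.5))   # -> 25 measurements
```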

11. The following observations of the length of an iron bar in metres

are made on a comparator:

3.284, 3.302, 3.253, 3.273, 3.310, 3.321, 3.304,

3.295, 3.263, 3.270.


    Required: (i) the length of the bar (i.e. the mean),
              (ii) the RMS of one observation,
              (iii) the standard deviation of the mean.
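A short illustrative sketch of the three required quantities (the divisor n - 1 is assumed for the sample variance; if the definition with divisor n is preferred, adjust the denominator accordingly):

```python
import math

# Sample mean, RMS of one observation and standard deviation of the mean
# for the ten comparator readings.
obs = [3.284, 3.302, 3.253, 3.273, 3.310, 3.321, 3.304, 3.295, 3.263, 3.270]
n = len(obs)

mean = sum(obs) / n
s_one = math.sqrt(sum((x - mean) ** 2 for x in obs) / (n - 1))
s_mean = s_one / math.sqrt(n)

print(f"mean = {mean:.4f} m, s(one obs.) = {s_one*1000:.1f} mm, "
      f"s(mean) = {s_mean*1000:.1f} mm")
```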

12. The following table shows the means ℓ̄_i of the daily measurements of
    the same distance ℓ during a five day period, along with their
    respective standard errors S_ℓ̄i.

        Day          Mon.     Tues.    Wed.     Thurs.   Fri.
        ℓ̄_i (m)     101.01   100.00   99.97    99.96    100.02
        S_ℓ̄i (cm)    2        1        4        5        3

    Required: Compute the weighted mean of the distance ℓ, say ℓ̄, along
    with its associated RMS, i.e. S_ℓ̄.
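A short sketch of the weighted-mean computation, with the weights taken as p_i = 1/S_i² (the conventional choice; any constant factor in the weights cancels out of the mean):

```python
# Daily means and their standard errors (converted to metres)
means = [101.01, 100.00, 99.97, 99.96, 100.02]   # m
S     = [0.02, 0.01, 0.04, 0.05, 0.03]           # m

p = [1.0 / s**2 for s in S]
weighted_mean = sum(pi * m for pi, m in zip(p, means)) / sum(p)
# RMS of the weighted mean from the a priori standard errors
S_mean = (1.0 / sum(p)) ** 0.5

print(f"weighted mean = {weighted_mean:.4f} m,  RMS = {S_mean*100:.2f} cm")
```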

13. Given a gravimetric network, as shown in the figure below, determine
    the gravity values g1 and g2 at points P1 and P2 respectively, along
    with their variance-covariance matrix. The gravity g0 = 979832.12 mgal
    at the initial point P0 is known.

    The following table gives the observed gravity differences with their
    signs, along with the time needed for each observation.

        Station               Δg (mgal)    ΔT (hr)
        From       To
        P0         P1           -9.82        2.5
        P2         P0          -27.78        1.5
        P1         P2          +38.42        2.0

    Assume that the observed differences Δg are uncorrelated, and that
    their variances are proportional to the corresponding time intervals
    ΔT.
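A minimal parametric least-squares sketch for this loop, taking the weights as p_i = 1/ΔT_i (the notes only state proportionality; any constant factor cancels in the estimates and only scales the covariance matrix):

```python
# Unknowns x = (g1, g2); observation equations written as A x = b + v.
g0 = 979832.12                        # known gravity at P0 [mgal]
dg = [-9.82, -27.78, +38.42]          # observed differences [mgal]
dT = [2.5, 1.5, 2.0]                  # observation times [hr]

A = [[ 1.0, 0.0],                     # P0 -> P1:  g1      = g0 + dg1
     [ 0.0, 1.0],                     # P2 -> P0:       g2 = g0 - dg2
     [-1.0, 1.0]]                     # P1 -> P2:  g2 - g1 = dg3
b = [g0 + dg[0], g0 - dg[1], dg[2]]
p = [1.0 / t for t in dT]

# Normal equations N x = u
N = [[sum(p[k] * A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
u = [sum(p[k] * A[k][i] * b[k] for k in range(3)) for i in range(2)]

det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
g1 = (N[1][1] * u[0] - N[0][1] * u[1]) / det
g2 = (N[0][0] * u[1] - N[1][0] * u[0]) / det
print(f"g1 = {g1:.2f} mgal, g2 = {g2:.2f} mgal")
```

The variance-covariance matrix of (g1, g2) then follows from the estimated variance factor times the inverse of N.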

14. Given a levelling net as shown in the Figure, the elevations HA, HB of
    points A and B are considered as known and errorless:

        HA = 300.000 m,
        HB = 302.245 m.

    The following table gives the observed height differences h_i along
    with the length t_i of each section.

        Section    Station             h_i (m)    t_i (km)
        No.        From       To
        1          P1         B          1.245       1.0
        2          A          P1         0.990       0.5
        3          P1         P2         0.500       1.0
        4          P2         B          0.751       1.0
        5          P3         B          1.486       0.5
        6          P3         P2         0.740       1.5

    Note that the arrows in the given figure indicate the directions of
    increasing elevations. The above observations are considered
    uncorrelated, with the variance of each h_i proportional to the
    corresponding length t_i.
    Required: Perform a parametric least squares adjustment of the above
    levelling net and find out the following:

    (i) The estimated elevations Ĥ1, Ĥ2 and Ĥ3 of points P1, P2 and P3.

    (ii) The adjusted values of the given six height differences.

    (iii) The estimated variance factor ŝ0² and compare it with the assumed
          a priori variance factor s0²; comment on the results.

    (iv) The estimated variance-covariance matrix Σ_X̂ of X̂ = (Ĥ1, Ĥ2, Ĥ3).
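A minimal parametric least-squares sketch for this net (illustrative only; each observation From -> To is modelled as H_To - H_From, and the weights are taken as p_i = 1/t_i):

```python
# Fixed elevations and observations (from, to, h [m], t [km])
HA, HB = 300.000, 302.245
obs = [("P1", "B", 1.245, 1.0), ("A", "P1", 0.990, 0.5),
       ("P1", "P2", 0.500, 1.0), ("P2", "B", 0.751, 1.0),
       ("P3", "B", 1.486, 0.5), ("P3", "P2", 0.740, 1.5)]
fixed = {"A": HA, "B": HB}
idx = {"P1": 0, "P2": 1, "P3": 2}

A, b, p = [], [], []
for frm, to, h, t in obs:
    row, rhs = [0.0, 0.0, 0.0], h
    if to in idx:  row[idx[to]] += 1.0
    else:          rhs -= fixed[to]        # move fixed elevation to the right-hand side
    if frm in idx: row[idx[frm]] -= 1.0
    else:          rhs += fixed[frm]
    A.append(row); b.append(rhs); p.append(1.0 / t)

m, n = len(obs), 3
N = [[sum(p[k]*A[k][i]*A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
u = [sum(p[k]*A[k][i]*b[k] for k in range(m)) for i in range(n)]

# Solve N x = u by Gaussian elimination (N is small and well conditioned)
M = [N[i][:] + [u[i]] for i in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        f = M[j][i] / M[i][i]
        for k in range(i, n + 1):
            M[j][k] -= f * M[i][k]
x = [0.0] * n
for i in range(n - 1, -1, -1):
    x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]

print("H1, H2, H3 =", [round(v, 3) for v in x])
```

From the same quantities the residuals v = Ax̂ - b give the adjusted height differences, ŝ0² = vᵀPv/(6 - 3) estimates the variance factor, and Σ_X̂ = ŝ0²·N⁻¹ is the required covariance matrix.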


15. Adjust the levelling net given in problem no. 14 again by using the
    conditional method of adjustment. Replace requirement no. (iv) by
    computing the estimated variance-covariance matrix Σ_L̂ of the
adjusted height differences. Compare the results of the other three

requirements with the corresponding results from the parametric

adjustment.

16. Two physical quantities Z and Y are assumed to satisfy the following
    linear model

        Z = αY + β,

    where α and β are constants to be determined. The observations Y_i and
    Z_i obtained from an experiment are given in the following table.

        Y    1   3   4   6   8   9   11   14
        Z    1   2   4   4   5   7    8    9

    Assume that the Y's are errorless, and the Z's were observed with equal
    precision.

    Required: Determine α̂ and β̂ which provide the best fitting line
    between Z and Y, in the least squares sense.
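A short sketch of the normal-equation solution for the fitted line (the usual least-squares regression of Z on Y; illustrative only):

```python
# Observations of Problem 16
Y = [1, 3, 4, 6, 8, 9, 11, 14]
Z = [1, 2, 4, 4, 5, 7, 8, 9]
n = len(Y)

Sy, Sz = sum(Y), sum(Z)
Syy = sum(y * y for y in Y)
Syz = sum(y * z for y, z in zip(Y, Z))

# Normal equations for Z = alpha*Y + beta with equal weights
alpha = (n * Syz - Sy * Sz) / (n * Syy - Sy ** 2)
beta = (Sz - alpha * Sy) / n
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```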


17. Solve problem No. 16 again, but this time consider the Z's errorless,
    and the Y's with equal variances. Compare the results for α̂ and β̂
    with the corresponding results from problem No. 16.


18. The given figure shows a triangulation network with fixed base
    AB = 2 km. The eight numbered angles in the Figure are all measured,
    each with a different number n_i of observations, as shown in the
    following table:

        Angle No.   n_i    Mean value of the angle
        1           2       82° 07' 09".50
        2           2       28  22  17.70
        3           5      110  29  25.02
        4           3      125  53  33.67
        5           2       25  44  09.30
        6           2       29  19  17.50
        7           5       55  03  29.32
        8           3       68  33  32.33

    Assume that all the measurements were done with the same instrument and
    under similar circumstances. (Note: the weight of each angle will be
    proportional to the corresponding number of repetitions n_i.)

    Required: (i) Neglecting the spherical excess in this network, compute
    the distance ĈD using the adjusted values of the observed angles.
    (ii) Considering the fixed base AB to be errorless, find the estimated
    relative error of the estimated length ĈD.

19. The given figure shows a braced quadrilateral ABCD in a triangulation
    network, in which all the directions marked by arrowheads were measured
    with the same precision. The base AB = 25 km is assumed errorless. The
    spherical excess ε in the four formed triangles is computed
    approximately and given by:

        ΔABC,  ε = 3".126
        ΔABD,  ε = 1".556
        ΔBCD,  ε = 3".085
        ΔACD,  ε = 1".515

    The results of the direction observations are summarized in the
    following table.

        Occupied    Target     Direction    Observed Direction
        Station     Station    No.
        A           B           7            00° 00' 00".00
                    C           8            91  30  30.35
                    D           9           125  53  33.91
        B           C          30            00  00  00.00
                    D          36            28  22  17.26
                    A          38           110  29  27.13
        C           D          26            00  00  00.00
                    A          27            29  19  17.52
                    B          28            35  03  26.80
        D           A          11            00  00  00.00
                    B          12            35  07  29.00
                    C          13            68  33  32.60

    Required: (i) Perform a conditional adjustment and find out the
    adjusted values of the observed directions, along with their estimated
    variance-covariance matrix Σ_L̂.

    (ii) Using Legendre's theorem for the spherical triangle, i.e. by
    subtracting one third of the spherical excess from each adjusted angle
    and then solving the triangle as if it were a plane triangle, compute
    the side ĈD from the known base AB and the adjusted directions. Then
    check the computed ĈD by following another route in its computation.

    (iii) Compute the estimated relative error of the estimated length ĈD.

20. The given figure shows a triangulation network with two fixed
    (errorless) stations A and E whose x and y coordinates are:

                   x           y
        A:         0.0 m,      0.0 m,
        E:       200.0 m,      0.0 m.

    The 14 marked directions (with arrowheads) were observed with the same
    precision. The results of the observations are tabulated below.

        Occupied    Target     Direction    Observed Direction
        Station     Station    No.
        A           B           1            00° 00' 00".0
                    C           2            60  00  10.0
        B           D           3            00  00  00.0
                    C           4            60  00  05.0
                    A           5           119  59  50.0
        C           A           6            00  00  00.0
                    B           7            59  59  55.0
                    D           8           120  00  00.0
                    E           9           180  00  05.0
        D           E          10            00  00  00.0
                    C          11            59  59  45.0
                    B          12           119  59  55.0
        E           A          13            00  00  00.0
                    D          14            60  00  15.0

    Required: Prepare the input for a computer program performing a
    parametric least squares adjustment using the directions (not the
    angles) to estimate the unknown coordinates of points B, C and D, by
    providing the following:

    (i) The number of unknown parameters and the number of degrees of
        freedom.

    (ii) The non-linear mathematical model.

    (iii) The approximate values for the x, y coordinates of points B, C, D.

    (iv) The linearized form of the mathematical model, i.e. V = A ΔX - ΔL,
         giving the symbolic elements of the vectors V and ΔX and the
         numerical values of the elements of the design matrix A and the
         vector ΔL.

    (v) Construct the variance-covariance matrix Σ_L of the observations,
        assuming the standard errors in directions to equal 2".

APPENDIX I

Assumptions for and Derivation of the

Gaussian PDF

The derivation of the Gaussian PDF presented here is due to G.H.L. Hagen
(1837). The first formulation of the normal law, however, originates with
De Moivre (1733).

(i) Let us assume that m independent physical causes are influencing the
measurement. Let each cause contribute an elementary error of either +Δ or
-Δ towards the overall error ε. Any value of ε can thus be expressed as a
combination (there are n = 2^m such combinations) of m elementary errors ±Δ.
We note first that the span of ε is <-mΔ, mΔ>. Further, we realize that ε
can attain only a value of an integral multiple of Δ. It is not difficult
to see that any two adjacent values of ε differ by 2Δ, since one is obtained
from the other by replacing -Δ by +Δ and vice versa. Dividing the range of ε

        Ra(ε) = mΔ - (-mΔ) = 2mΔ

by the step of ε, i.e. 2Δ, we discover that ε can attain any of the
following m + 1 values

        ε_i = (2i - m)Δ,     i = 0, 1, ..., m,                          (I-1)

corresponding to particular distinguishable combinations of the m elementary
errors.

(ii) Let us regard now the set D of all permissible values of ε,

        D = {ε_0, ε_1, ..., ε_m},

as the probability space of the random sample consisting of all the 2^m
combinations ε. Obviously, many of the 2^m combinations have the same
value, because there are only m + 1 different values available. The counts
c_i of the individual values ε_i (see section 3.1.1) can be computed using
the combined probability (see section 2.3):

        c_i = (m choose i) = [m(m-1)(m-2) ... (m-i+1)] / [i(i-1)(i-2) ... 1]
            = Π_{j=m-i+1}^{m} j / Π_{j=1}^{i} j.                        (I-2)

The actual probability of any value ε_i is then given by

        P(ε_i) = c_i/n = (m choose i) / 2^m                             (I-3)

(see section 3.1.2).

(iii) The above formula describes the actual PDF of our sample ε in the
discrete probability space D. Since our ultimate aim is to derive the
analytic expression for the "corresponding" (we shall see later what is
meant by corresponding) continuous PDF, we want to be able to express P as
a function of ε_i rather than i. The easiest way to do it is to use finite
differences.

Let us define

        δP(ε_i) = P(ε_i) - P(ε_{i-1}),

and from equation (I-3) we get P(ε_{i-1})/P(ε_i) = i/(m-i+1). Obviously,
the ratio δP(ε_i)/P(ε_i) is then given by

        δP(ε_i)/P(ε_i) = 1 - i/(m-i+1).                                 (I-4)

On the other hand, i can be expressed as a function of ε_i from equation
(I-1):

        2i - m = ε_i/Δ,    or    i = ε_i/(2Δ) + m/2.

Denoting δε = 2Δ and substituting for i in equation (I-4) we obtain

        δP(ε)/P(ε) = 1 - (ε/δε + m/2) / (m - ε/δε - m/2 + 1)
                   = (1 + m/2 - ε/δε - ε/δε - m/2) / (1 + m/2 - ε/δε)
                   = (1 - 2ε/δε) / (1 + m/2 - ε/δε)
                   = -(2ε - δε) / [(1 + m/2)δε - ε].

(iv) The next step in the development is to convert the discrete PDF, P(ε),
to a continuous PDF, i.e. to derive the "corresponding" continuous PDF.
The "corresponding" PDF is assumed to be the PDF of such a variable ε which
is defined in the same way as the discrete ε in (i), with the exception
that m is let to grow beyond all limits, i.e. m → ∞. By letting m grow we
would obtain infinitely large values of ε (see equation (I-1)). This would
contradict our experience, which teaches us that the errors are always
finite in value. Hence, we have to adopt another assumption, namely that
as m grows to infinity the absolute value of the elementary error Δ shrinks
to zero, making the product mΔ in equation (I-1) always finite.

Accepting these two assumptions, we can write the finite difference
equation as

        lim         δP(ε)/P(ε) = - lim         (2ε - δε) / [(1 + m/2)δε - ε],   (I-5)
        m→∞, δε→0                  m→∞, δε→0

which is nothing else but an ordinary differential equation for the
continuous PDF P(ε). It can thus be rewritten as

        dP(ε)/P(ε) = -(2ε - dε) / [(1 + m/2)dε - ε].
To simplify the solution of this differential equation, let us multiply
both the numerator and the denominator of the right hand side by dε and
assume that m dε² is constant, equal to C say. We further assume that the
terms vanishing with dε (dε² in the numerator, dε² and ε dε in the
denominator, as compared with m dε²/2 = C/2) may be neglected. Then we can
write

        dP/P = - 2ε dε / (C/2) = - (4/C) ε dε.                          (I-6)

(v) We can now finally solve the differential equation. It is solvable by
direct integration and we get

        ∫ dP/P = - (4/C) ∫ ε dε + const.,

i.e.

        ln P = - (2/C) ε² + const.

Denoting the integration constant by ln K we finally obtain

        P = K exp(-2ε²/C).                                              (I-7)

The question now arises whether we are free to regard both K and C as two
independent parameters of the above PDF. We know that the basic equation
for a PDF, i.e.

        ∫_{-∞}^{∞} P(ε) dε = 1,                                         (I-8)

has to be satisfied. Substituting for P into the basic equation we get

        ∫_{-∞}^{∞} P(ε) dε = ∫_{-∞}^{∞} K exp(-2ε²/C) dε
                           = K ∫_{-∞}^{∞} exp(-2ε²/C) dε = 1,

and

        K = 1 / ∫_{-∞}^{∞} exp(-2ε²/C) dε.

Hence the answer is that K must not be regarded as an independent parameter.
It is a function of C and can be evaluated by solving the integral above.
We obtain

        ∫_{-∞}^{∞} exp(-2ε²/C) dε = √(Cπ/2)                             (I-9)

and

        K = √(2/(Cπ)).                                                  (I-10)

The Gaussian PDF can then be written as

        P(ε) = G(C; ε) = √(2/(Cπ)) exp(-2ε²/C)                          (I-11)

and we can see that it is a one-parametric PDF.
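A numerical illustration (not part of the original derivation): for a large m, with Δ shrinking so that C = m(2Δ)² stays fixed, the discrete PDF (I-3) indeed approaches the Gaussian (I-11). A short sketch with the illustrative choices C = 1 and m = 400:

```python
import math

# Compare the discrete PDF (I-3), converted to a density by dividing by the
# step d_eps, with the limiting Gaussian (I-11).
C, m = 1.0, 400
d_eps = math.sqrt(C / m)                       # step of epsilon (= 2*Delta)

print("  eps    binomial/d_eps    G(C; eps)")
for i in range(m // 2, m // 2 + 51, 10):
    eps = (2 * i - m) * d_eps / 2.0            # epsilon_i = (2i - m)*Delta
    discrete = math.comb(m, i) / 2.0 ** m / d_eps
    gauss = math.sqrt(2.0 / (C * math.pi)) * math.exp(-2.0 * eps ** 2 / C)
    print(f"{eps:6.3f}   {discrete:12.6f}    {gauss:12.6f}")
```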


APPENDIX II - A

ORDINATES OF THE
STANDARD NORMAL CURVE

        y = (1/√(2π)) exp(-t²/2)
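The entries of this table, and of the two area tables that follow, can be reproduced numerically; a short illustrative sketch using only the Python standard library:

```python
import math

# Ordinate of the standard normal curve (this table) and the areas of
# Appendices II-B (from -infinity to t) and II-C (from 0 to t).
def ordinate(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def area_from_minus_inf(t):          # Appendix II-B
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def area_from_zero(t):               # Appendix II-C
    return area_from_minus_inf(t) - 0.5

for t in (0.00, 1.00, 1.96, 3.00):
    print(f"t = {t:4.2f}:  y = {ordinate(t):.4f},  "
          f"area(-inf,t) = {area_from_minus_inf(t):.4f},  "
          f"area(0,t) = {area_from_zero(t):.4f}")
```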

t    0     1     2     3     4     5     6     7     8     9

0.0 .3989 .3989 .3989 .3988 .3986 .3984 .3982 .3980 .3977 .3973
0.1 .3970 .3965 .3961 .3956 .3951 .3945 .3939 .3932 .3925 .3918
0.2 .3910 .3902 .3894 .3885 .3876 .3867 .3857 .3847 .3836 .3825
0.3 .3814 .3802 .3790 .3778 .3765 .3752 .3739 .3725 .3712 .3697
0.4 .3683 .3668 .3653 .3637 .3621 .3605 .3589 .3572 .3555 .3538

0.5 .3521 .3503 .3485 .3467 .3448 .3429 .3410 .3391 .3372 .3352
0.6 .3332 .3312 .3292 .3271 .3251 .3230 .3209 .3187 .3166 .3144
0.7 .3123 .3101 .3079 .3056 .3034 .3011 .2989 .2966 .2943 .2920
0.8 .2897 .2874 .2850 .2827 .2803 .2780 .2756 .2732 .2709 .2685
0.9 .2661 .2637 .2613 .2589 .2565 .2541 .2516 .2492 .2468 .2444

1.0 .2420 .2396 .2371 .2347 .2323 .2299 .2275 .2251 .2227 .2203
1.1 .2179 .2155 .2131 .2107 .2083 .2059 .2036 .2012 .1989 .1965
1.2 .1942 .1919 .1895 .1872 .1849 .1826 .1804 .1781 .1758 .1736
1.3 .1714 .1691 .1669 .1647 .1626 .1604 .1582 .1561 .1539 .1518
1.4 .1497 .1476 .1456 .1435 .1415 .1394 .1374 .1354 .1334 .1315

1.5 .1295 .1276 .1257 .1238 .1219 .1200 .1182 .1163 .1145 .1127
1.6 .1109 .1092 .1074 .1057 .1040 .1023 .1006 .0989 .0973 .0957
1.7 .0940 .0925 .0909 .0893 .0878 .0863 .0848 .0833 .0818 .0804
1.8 .0790 .0775 .0761 .0748 .0734 .0721 .0707 .0694 .0681 .0669
1.9 .0656 .0644 .0632 .0620 .0608 .0596 .0584 .0573 .0562 .0551

2.0 .0540 .0529 .0519 .0508 .0498 .0488 .0478 .0468 .0459 .0449
2.1 .0440 .0431 .0422 .0413 .0404 .0396 .0387 .0379 .0371 .0363
2.2 .0355 .0347 .0339 .0332 .0325 .0317 .0310 .0303 .0297 .0290
2.3 .0283 .0277 .0270 .0264 .0258 .0252 .0246 .0241 .0235 .0229
2.4 .0224 .0219 .0213 .0208 .0203 .0198 .0194 .0189 .0184 .0180

2.5 .0175 .0171 .0167 .0163 .0158 .0154 .0151 .0147 .0143 .0139
2.6 .0136 .0132 .0129 .0126 .0122 .0119 .0116 .0113 .0110 .0107
2.7 .0104 .0101 .0099 .0096 .0093 .0091 .0088 .0086 .0084 .0081
2.8 .0079 .0077 .0075 .0073 .0071 .0069 .0067 .0065 .0063 .0061
2.9 .0060 .0058 .0056 .0055 .0053 .0051 .0050 .0048 .0047 .0046

3.0 .0044 .0043 .0042 .0040 .0039 .0038 .0037 .0036 .0035 .0034
3.1 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026 .0025 .0025
3.2 .0024 .0023 .0022 .0022 .0021 .0020 .0020 .0019 .0018 .0018
3.3 .0017 .0017 .0016 .0016 .0015 .0015 .0014 .0014 .0013 .0013
3.4 .0012 .0012 .0012 .0011 .0011 .0010 .0010 .0010 .0009 .0009

3.5 .0009 .0008 .0008 .0008 .0008 .0007 .0007 .0007 .0007 .0006
3.6 .0006 .0006 .0006 .0005 .0005 .0005 .0005 .0005 .0005 .0004
3.7 .0004 .0004 .0004 .0004 .0004 .0004 .0003 .0003 .0003 .0003
3.8 .0003 .0003 .0003 .0003 .0003 .0002 .0002 .0002 .0002 .0002
3.9 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0001 .0001
APPENDIX II - B

AREAS UNDER THE
STANDARD NORMAL CURVE
from -∞ to t
t 0 1 2 3 4 5 6 7 8 9

0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5754
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879

0.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6 .7258 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7518 .7549
0.7 .7580 .7612 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8 .7881 .7910 .7939 .7967 .7996 .8023 .8051 .8078 .8106 .8133
0.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319

1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767

2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936

2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986

3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998

3.5 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998
3.6 .9998 .9998 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
3.7 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
3.8 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
3.9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
APPENDIX II - C

AREAS UNDER THE
STANDARD NORMAL CURVE
from 0 to t

t 0 1 2 3 4 5 6 7 8 9

0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0754
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879

0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2258 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2518 .2549
0.7 .2580 .2612 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2996 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389

1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319

1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767

2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936

2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
3.1 .4990 .4991 .4991 .4991 .4992 .4992 .4992 .4992 .4993 .4993
3.2 .4993 .4993 .4994 .4994 .4994 .4994 .4994 .4995 .4995 .4995
3.3 .4995 .4995 .4995 .4996 .4996 .4996 .4996 .4996 .4996 .4997
3.4 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4998

3.5 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998
3.6 .4998 .4998 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999
3.7 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999
3.8 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999
3.9 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000
