Error and Uncertainty - Engineering Surveying by W. Schofield
Accuracy can be estimated from residuals, for example, in the two sets of measurements below, which
mean is the more accurate, that of the measurements of line AB or line XY?
Line        AB                      XY
            measure     residuals   measure     residuals
            25.34 m     +0.02 m     25.31 m     −0.01 m
            25.49 m     +0.17 m     25.33 m     +0.01 m
            25.12 m     −0.20 m     25.32 m      0.00 m
            25.61 m     +0.29 m     25.33 m     +0.01 m
            25.04 m     −0.28 m     25.31 m     −0.01 m
Mean        25.32 m                 25.32 m
The residuals in this case are differences between the individual observations and the best estimate of
the distance, that is the arithmetic mean. It is clear from inspection of the two sets of residuals that the
length of line XY appears to be more accurately determined than that of line AB.
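The table's arithmetic is easy to verify; a short sketch (with the distances transcribed from the table) computes each mean and the residuals about it:

```python
# Residuals: each observation minus the arithmetic mean of its set.
ab = [25.34, 25.49, 25.12, 25.61, 25.04]
xy = [25.31, 25.33, 25.32, 25.33, 25.31]

def residuals(obs):
    mean = sum(obs) / len(obs)
    return mean, [round(o - mean, 2) for o in obs]

mean_ab, res_ab = residuals(ab)
mean_xy, res_xy = residuals(xy)
# Both means are 25.32 m, but the XY residuals are an order of
# magnitude smaller, indicating higher precision.
```

As they must about an arithmetic mean, the residuals of each set sum to zero.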
Precision is a measure of repeatability. Small residuals indicate high precision, so the mean of line XY
is more precisely determined than the mean of line AB. High precision does not necessarily indicate high
accuracy. For example, if the tape used to measure line XY was in decimals of a yard and the surveyor
assumed it was in metres, then the computed mean of line XY would be very precise but also very inaccurate.
In general, the accuracy of a quantity can be no better than its precision, but in practice the computed precision is often taken as the assessed accuracy.
Coordinates and their accuracy and precision may be stated as being relative or absolute. Absolute
values are with respect to some previously defined datum. Relative values are those with respect to another
station. For example, the Ordnance Survey (OS) coordinates of a GPS passive network station might
be assumed to be absolute coordinates since they are with respect to the OSTN02 datum of UK. The
coordinates of a new control station on a construction site may have been determined by a series of
observations including some to the GPS station. The precision of the coordinates of the new station may be expressed with respect to the OSTN02 datum, or alternatively with respect to the coordinates of another survey station on site. In the former case they may be considered absolute and in the latter relative. The difference between absolute and relative precision is largely one of local definition and therefore of convenience. In general, accuracy and precision are quoted as a ratio, or as parts per million, e.g. 1:100 000 or 10 ppm, or in units of the quantity measured, e.g. 0.03 m.
Error is the difference between an actual true value and an estimate of that true value. If the estimate is a bad one, then the error will be large.
Of these three concepts, accuracy, precision and error, only precision may be numerically defined
from appropriate computations with the observations. Accuracy and error may be assumed, sometimes
erroneously, from the precision but they will never be known for sure. The best estimate of accuracy is
usually the precision but it will usually be overoptimistic.
The system most commonly used for the definition of units of measurement, for example of distance and angle, is the Système International, abbreviated to SI. The basic units of prime interest are:
Length in metres (m)
from which we have:
1 m = 10³ millimetres (mm)
1 m = 10⁻³ kilometres (km)
Thus a distance measured to the nearest millimetre would be written as, say, 142.356 m.
Similarly for areas we have:
1 m² = 10⁶ mm²
10⁴ m² = 1 hectare (ha)
10⁶ m² = 1 square kilometre (km²)
1° = 60′ (minutes of arc)
1′ = 60″ (seconds of arc)
A radian is that angle subtended at the centre of a circle by an arc on the circumference equal in length to the radius of the circle, i.e. 2π rad = 360°. Thus to transform degrees to radians, multiply by π/180°, and to transform radians to degrees, multiply by 180°/π. It can be seen that:
1 rad = 57.2958° = 206 264.8″
A factor commonly used in surveying to change angles from seconds of arc to radians is:
α rad = α″/206 265
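These conversions are mechanical; a minimal sketch (the function names are illustrative, not from the text):

```python
import math

def deg_to_rad(d):
    # Multiply degrees by pi/180 to obtain radians.
    return d * math.pi / 180.0

def rad_to_deg(r):
    # Multiply radians by 180/pi to obtain degrees.
    return r * 180.0 / math.pi

def sec_to_rad(seconds):
    # One radian contains 180*3600/pi seconds of arc, about 206 264.8".
    return seconds * math.pi / (180.0 * 3600.0)
```

The factor 206 265 quoted above is simply 180 × 3600/π rounded to the nearest second.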
Engineers and surveyors communicate a great deal of their professional information using numbers. It is
important, therefore, that the number of digits used correctly indicates the accuracy with which the field
data were measured. This is particularly important since the advent of pocket calculators, which tend to
present numbers to as many as eight places of decimals, calculated from data containing, at the most, only
three places of decimals, whilst some eliminate all trailing zeros. This latter point is important, as 2.00 m
is an entirely different value to 2.000 m. The latter number implies estimation to the nearest millimetre as
opposed to the nearest 10 mm implied by the former. Thus in the capture of field data, the correct number
of significant figures should be used.
By definition, the number of significant figures in a value is the number of digits one is certain of
plus one, usually the last, which is estimated. The number of significant figures should not be confused
with the number of decimal places. A further rule in significant figures is that in all numbers less than
unity, the zeros directly after the decimal point and up to the first non-zero digit are not counted. For
example:
Two significant figures: 40, 42, 4.2, 0.43, 0.0042, 0.040
Three significant figures: 836, 83.6, 80.6, 0.806, 0.0806, 0.00800
Difficulties can occur with zeros at the end of a number such as 83 600, which may have three, four or
five significant figures. This problem is overcome by expressing the value in powers of ten, i.e. 8.36 × 10⁴ implies three significant figures, 8.360 × 10⁴ implies four significant figures and 8.3600 × 10⁴ implies five significant figures.
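The counting rules can be captured in a small helper that works on the textual form of a number (a sketch; it deliberately ignores the ambiguous trailing-zero case, which needs the powers-of-ten notation above):

```python
def sig_figs(s):
    # Count significant figures in a decimal numeral given as a string.
    # Leading zeros, including those directly after the decimal point,
    # do not count; trailing zeros after the point do.
    digits = s.lstrip('+-').replace('.', '')
    return len(digits.lstrip('0'))
```

For the examples above, `sig_figs('0.00800')` returns 3 while `sig_figs('0.0042')` returns 2.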
It is important to remember that the accuracy of field data cannot be improved by the computational
processes to which it is subjected.
Consider the addition of the following numbers:
155.486
7.08
2183.0
42.0058
If added on a pocket calculator the answer is 2387.5718; however, the correct answer with due regard
to significant figures is 2387.6. It is rounded off to the most extreme right-hand column containing
all the significant figures, which in the example is the column immediately after the decimal point. In
the case of 155.486 + 7.08 + 2183 + 42.0058 the answer should be 2388. This rule also applies to
subtraction.
In multiplication and division, the answer should be rounded off to the number of significant figures
contained in that number having the least number of significant figures in the computational process.
For instance, 214.8432 × 3.05 = 655.27176 when computed on a pocket calculator; however, as 3.05 contains only three significant figures, the correct answer is 655. Consider 428.4 × 621.8 = 266 379.12, which should now be rounded to 266 400 = 2.664 × 10⁵, which has four significant figures. Similarly, 41.8 ÷ 2.1316 = 19.609683 on a pocket calculator and should be rounded to 19.6.
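Rounding to a given number of significant figures is not built into most calculators, but is easily expressed (a sketch; `round_sig` is an illustrative name):

```python
import math

def round_sig(value, n):
    # Round value to n significant figures.
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, n - 1 - exponent)

# The worked examples above:
product = round_sig(214.8432 * 3.05, 3)    # 655.0
quotient = round_sig(41.8 / 2.1316, 3)     # 19.6
```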
When dealing with the powers of numbers the following rule is useful. If x is the value of the first significant figure in a number having n significant figures, its pth power is rounded to:
n − 1 significant figures if p ≤ x
n − 2 significant figures if p ≤ 10x
For example, 1.5831⁴ = 6.28106656 when computed on a pocket calculator. In this case x = 1, p = 4 and p ≤ 10x; therefore, the answer should be quoted to n − 2 = 3 significant figures, i.e. 6.28.
Similarly, with roots of numbers, let x equal the first significant figure and r the root; the answer should be rounded to:
n significant figures when rx ≥ 10
n − 1 significant figures when rx < 10
For example:
36^(1/2) = 6, because r = 2, x = 3, n = 2, thus rx < 10, and the answer is given to n − 1 = 1 significant figure.
415.36^(1/4) = 4.5144637 on a pocket calculator; however, r = 4, x = 4, n = 5, and as rx > 10, the answer is rounded to n = 5 significant figures, giving 4.5145.
As a general rule, when field data are undergoing computational processing which involves several inter-
mediate stages, one extra digit may be carried throughout the process, provided the final answer is rounded
to the correct number of significant figures.
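The power and root rules can be applied mechanically; the sketch below encodes them directly (the function names are illustrative), together with a significant-figure rounding helper:

```python
import math

def round_sig(value, n):
    # Round value to n significant figures.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, n - 1 - exponent)

def power_sig_figs(n, x, p):
    # n sig figs in the base, first significant digit x, power p:
    # quote to n-1 figures if p <= x, to n-2 figures if p <= 10x.
    return n - 1 if p <= x else n - 2

def root_sig_figs(n, x, r):
    # rth root: quote to n figures if rx >= 10, else n-1 figures.
    return n if r * x >= 10 else n - 1

power = round_sig(1.5831 ** 4, power_sig_figs(5, 1, 4))     # 6.28
root = round_sig(415.36 ** 0.25, root_sig_figs(5, 4, 4))    # 4.5145
```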
It is well understood that in rounding numbers, 54.334 would be rounded to 54.33, whilst 54.336 would
become 54.34. However, with 54.335, some individuals always round up, giving 54.34, whilst others
always round down to 54.33. Either process creates a systematic bias and should be avoided. The process
which creates a more random bias, thereby producing a more representative mean value from a set of
data, is to round to the nearest even digit. Using this approach, 54.335 becomes 54.34, whilst 54.345 is
54.34 also.
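Python's decimal module implements exactly this round-half-to-even rule; a sketch (strings are used because binary floats cannot hold 54.335 exactly):

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_even(text, places='0.01'):
    # Round to the nearest even digit on a tie ("banker's rounding").
    return str(Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_EVEN))

# 54.335 -> 54.34 (the 3 is odd, so round up to the even digit 4)
# 54.345 -> 54.34 (the 4 is already even, so round down)
```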
It should now be apparent that position fixing simply involves the measurement of angles and distance.
However, all measurements, no matter how carefully executed, will contain error, and so the true value of
a measurement is never known. It follows from this that if the true value is never known, the true error can
never be known, and the position of a point is known only with a certain level of uncertainty.
The sources of error fall into three broad categories, namely:
(1) Natural errors caused by variation in or adverse weather conditions, refraction, unmodelled gravity
effects, etc.
(2) Instrumental errors caused by imperfect construction and adjustment of the surveying instruments
used.
(3) Personal errors caused by the inability of the individual to make exact observations due to the limitations
of human sight, touch and hearing.
(1) Mistakes are sometimes called gross errors, but should not be classified as errors at all. They are
blunders, often resulting from fatigue or the inexperience of the surveyor. Typical examples are
omitting a whole tape length when measuring distance, sighting the wrong target in a round of angles,
reading 6 on a levelling staff as 9 and vice versa. Mistakes are the largest of the errors likely to
arise, and therefore great care must be taken to obviate them. However, because they are large they
are easy to spot and so deal with.
(2) Systematic errors can be constant or variable throughout an operation and are generally attributable to
known circumstances. The value of these errors may often be calculated and applied as a correction to
the measured quantity. They can be the result of natural conditions, examples of which are: refraction
of light rays, variation in the speed of electromagnetic waves through the atmosphere, expansion or
contraction of steel tapes due to temperature variations. In all these cases, corrections can be applied
to reduce their effect. Such errors may also be produced by instruments, e.g. maladjustment of the
theodolite or level, index error in spring balances, ageing of the crystals in EDM equipment. One
form of systematic error is the constant error, which is always there irrespective of the size of the
measurement or the observing conditions. Examples of constant errors in tape measurement might be
due to a break and join in the tape or tape stretch in the first metre of the tape. In this case the remedy
is to ensure that the tape is undamaged and also not to use the first metre of the tape. Examples of
constant errors in the repeated observations of a horizontal angle with a theodolite to elevated targets
might be miscentring over the station or dislevelment of the theodolite. In this case, the remedy is to
ensure that the theodolite is correctly set up.
There is the personal error of the observer who may have a bias against setting a micrometer or
in bisecting a target, etc. Such errors can frequently be self-compensating; for instance, a person
observing a horizontal angle to a cylindrical target subject to phase, the apparent biased illumination
by the sun when shining on one side will be subject to a similar bias on a similar target nearby and so
the computed angle between will be substantially correct.
Systematic errors, in the main, conform to mathematical and physical laws; thus it is argued that
appropriate corrections can be computed and applied to reduce their effect. It is doubtful, however,
whether the effect of systematic errors is ever entirely eliminated, largely due to the inability to obtain an
exact measurement of the quantities involved. Typical examples are: the difficulty of obtaining group
refractive index throughout the measuring path of EDM distances; and the difficulty of obtaining
the temperature of the steel tape, based on air temperature measurements with thermometers. Thus,
systematic errors are the most difficult to deal with and therefore they require very careful consideration
prior to, during, and after the survey. Careful calibration of all equipment is an essential part of
controlling systematic error.
(3) Random errors are those variates which remain after all other errors have been removed. They are
beyond the control of the observer and result from the human inability of the observer to make exact
measurements, for reasons already indicated above.
Random errors should be small and there is no procedure that will compensate for or reduce any
one single error. The size and sign of any random error is quite unpredictable. Although the behaviour
of any one observation is unpredictable the behaviour of a group of random errors is predictable and
the larger the group the more predictable is its behaviour. This is the basis of much of the quality
assessment of survey products.
Random variates are assumed to have a continuous frequency distribution called the normal distribution and to obey the laws of probability. A random variate x, which is normally distributed with mean μ and standard deviation σ, is written in symbol form as N(μ, σ²). Random errors alone are treated by statistical processes.
The basic concept of errors in the data captured by the surveyor may be likened to target shooting.
In the first instance, let us assume that a skilled marksman used a rifle with a bent sight, which resulted
in his shooting producing a scatter of shots as at A in Figure 2.1.
That the marksman is skilled (or consistent) is evidenced by the very small scatter, which illustrates
excellent precision. However, as the shots are far from the centre, caused by the bent sight (systematic
error), they are completely inaccurate. Such a situation can arise in practice when a piece of EDM equipment
produces a set of measurements all agreeing to within a few millimetres (high precision) but, due to an
operating fault and lack of calibration, the measurements are all incorrect by several centimetres (low
accuracy). If the bent sight is now corrected, i.e. systematic errors are minimized, the result is a scatter of
shots as at B. In this case, the shots are clustered near the centre of the target and thus high precision, due
to the small scatter, can be related directly to accuracy. The scatter is, of course, due to the unavoidable
random errors.
If the target was now placed face down, the surveyor's task would be to locate the most probable position
of the centre based on an analysis of the position of the shots at B. From this analogy several important
facts emerge, as follows.
(1) Scatter is an indicator of precision. The wider the scatter of a set of results about the mean, the less
repeatable the measurements are.
(2) Precision must not be confused with accuracy; the former is a relative grouping without regard to
nearness to the truth, whilst the latter denotes absolute nearness to the truth.
(3) Precision may be regarded as an index of accuracy only when all sources of error, other than random
errors, have been eliminated.
(4) Accuracy may be defined only by specifying the bounds between which the accidental error of a
measured quantity may lie. The reason for defining accuracy thus is that the absolute error of the
quantity is generally not known. If it were, it could simply be applied to the measured quantity to give
its true value. The error bound is usually specified as symmetrical about zero. Thus the accuracy of
measured quantity x is x ± εx, where εx is greater than or equal to the true but unknown error of x.
(5) Position fixing by the surveyor, whether it is the coordinate position of points in a control network, or
the position of topographic detail, is simply an assessment of the most probable position and, as such,
requires a statistical evaluation of its precision.
(1) The true value of a measurement can never be found, even though such a value exists. This is evident
when observing an angle with a one-second theodolite; no matter how many times the angle is read,
a slightly different value will always be obtained.
(2) True error (εx) similarly can never be found, for it consists of the true value (X) minus the observed
value (x), i.e.
X − x = εx
(3) Relative error is a measure of the error in relation to the size of the measurement. For instance, a distance
of 10 m may be measured with an error of 1 mm, whilst a distance of 100 m may also be measured
to an accuracy of 1 mm. Although the error is the same in both cases, the second measurement may
clearly be regarded as more accurate. To allow for this, the term relative error (Rx) may be used, where
Rx = εx/x
Thus, in the first case x = 10 m, εx = 1 mm, and therefore Rx = 1/10 000; in the second case, Rx = 1/100 000, clearly illustrating the distinction. Multiplying the relative error by 100 gives the
percentage error. Relative error is an extremely useful definition, and is commonly used in expressing
the accuracy of linear measurement. For example, the relative closing error of a traverse is usually
expressed in this way. The definition is clearly not applicable to expressing the accuracy to which an
angle is measured, however.
(4) Most probable value (MPV) is the closest approximation to the true value that can be achieved from
a set of data. This value is generally taken as the arithmetic mean of a set, ignoring at this stage the
frequency or weight of the data. For instance, if A is the arithmetic mean, X the true value, and εn the errors of a set of n measurements, then
A = X − (Σεn)/n
where Σεn is the sum of the errors. As the errors are equally as likely to be positive as negative, then for a finite number of observations Σεn/n will be very small and A ≈ X. For an infinite number of measurements, it could be argued that A = X.
(5) The residual is the difference between the MPV of a set, i.e. the arithmetic mean, and the observed
values. Using the same argument as before, it can be shown that for a finite number of measurements,
the residual r is approximately equal to the true error ε.
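The relative error defined in (3) above is a one-line computation; a brief sketch:

```python
def relative_error(error, measurement):
    # R_x = error / measurement; multiply by 100 for a percentage,
    # or by 1e6 for parts per million (ppm).
    return error / measurement

r_short = relative_error(0.001, 10.0)    # 1 mm in 10 m  -> 1/10 000
r_long = relative_error(0.001, 100.0)    # 1 mm in 100 m -> 1/100 000
```

Expressed in ppm, the first case is 100 ppm and the second 10 ppm.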
2.4.4 Probability
Consider a length of 29.42 m measured with a tape and correct to ±0.05 m. The range of these measurements would therefore be from 29.37 m to 29.47 m, giving 11 possibilities to 0.01 m for the answer. If the next bay was measured in the same way, there would again be 11 possibilities. Thus the correct value for the sum of the two bays would lie among 11 × 11 = 121 possibilities, and the range of the sum would be 2 × ±0.05 m, i.e. between −0.10 m and +0.10 m. Now, the error of −0.10 m can occur only once, i.e. when both bays have an error of −0.05 m; similarly with +0.10 m. Consider an error of −0.08 m; this can occur in three ways: (−0.05 and −0.03), (−0.04 and −0.04) and (−0.03 and −0.05). Applying this procedure through the whole range produces Table 2.1, the lower half of which is simply a repeat of
the upper half. If the decimal probabilities are added together they equal 1.0000. If the above results are
plotted as error against probability the histogram of Figure 2.2 is obtained, the errors being represented
by rectangles. Then, in the limit, as the error interval gets smaller, the histogram approximates to the
superimposed curve. This curve is called the normal probability curve. The area under it represents the
probability that the error must lie between ±0.10 m, and is thus equal to 1.0000 (certainty) as shown in
Table 2.1.
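The 121 combinations can be enumerated directly; a sketch reproducing the counts behind Table 2.1 (working in centimetres to keep the arithmetic exact):

```python
from collections import Counter
from fractions import Fraction

# Each bay's error is one of 11 equally likely values, -5 to +5 cm.
steps = range(-5, 6)
counts = Counter(a + b for a in steps for b in steps)

# A sum error of -10 cm occurs once, -8 cm three times, and 0 cm
# eleven times; the probabilities total exactly 1.
prob = {err: Fraction(n, 121) for err, n in counts.items()}
```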
More typical bell-shaped probability curves are shown in Figure 2.3; the tall thin curve indicates small
scatter and thus high precision, whilst the flatter curve represents large scatter and low precision. Inspection
of the curve reveals:
(1) Positive and negative errors are equal in size and frequency; they are equally probable.
(2) Small errors are more frequent than large; they are more probable.
(3) Very large errors seldom occur; they are less probable and may be mistakes or untreated systematic
errors.
The equation of the normal probability distribution curve is:
y = (1/(σ(2π)^(1/2))) e^(−(x−μ)²/2σ²)
where y is the probability of the occurrence of x − μ, i.e. the probability that the variate x deviates this far from the central position of the distribution μ; σ describes the spread of the distribution; and e is the base of natural logarithms. If μ = 0, i.e. the centre of the distribution is at zero, and σ = 1, i.e. the spread is unity, the formula for the probability simplifies to:
y = (1/(2π)^(1/2)) e^(−x²/2)
As already illustrated, the area under the curve represents the limit of relative frequency, i.e. probability,
and is equal to unity. Thus a table of Normal Distribution curve areas (Table 2.2) can be used to calculate
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
How to use the table: If (x − μ)/σ = 1.75, look down the left column to 1.7 and across the row to the element in the column headed 0.05; the value for the probability is 0.9599, i.e. the probability is 95.99%.
probabilities provided that the distribution is the standard normal distribution, i.e. N(0, 1²). If the variable x is N(μ, σ²), then it must be transformed to the standard normal distribution using Z = (x − μ)/σ, i.e. x = μ + Zσ, where Z has a probability density function equal to (2π)^(−1/2) e^(−Z²/2).
For example, the probability that x will fall between 0.5 and 2.4 is represented by area A on the normal
curve (Figure 2.4(a)). This statement can be written as:
P(0.5 < x < 2.4) = area A
Now Area A = Area B − Area C (Figure 2.4(b) and (c))
where Area B represents P(x < 2.4)
and Area C represents P(x < 0.5)
i.e. P(0.5 < x < 2.4) = P(x < 2.4) − P(x < 0.5)
From the table of the Normal Distribution (Table 2.2):
When x = 2.4, Area = 0.9918
When x = 0.5, Area = 0.6915
P(0.5 < x < 2.4) = 0.9918 − 0.6915 = 0.3003
That is, there is a 30.03% probability that x will lie between 0.5 and 2.4.
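The tabulated areas are values of the standard normal cumulative distribution function, which can be reproduced from the error function in the math module; a sketch:

```python
import math

def phi(z):
    # Standard normal CDF: area under N(0, 1) to the left of z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

area = phi(2.4) - phi(0.5)   # 0.9918 - 0.6915 = 0.3003
```

`phi(1.75)` returns 0.9599, matching the how-to-use example beneath Table 2.2.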
If verticals are drawn from the points of inflexion of the normal distribution curve (Figure 2.5) they will cut the base at −σx and +σx, where σx is the standard deviation. The area so enclosed indicates the probability that x will lie between ±σx, and equals 0.683 or 68.3%. This is a very important statement.
Figure 2.4 (a) Area A between 0.5 and 2.4; (b) Area B below 2.4; (c) Area C below 0.5; in each case frequency is plotted against the value of the measurement.
The standard deviation (σx), if used to assess the precision of a set of data, implies that 68% of the time the arithmetic mean (x̄) of that set should lie between (x̄ ± σx). Put another way, if the sample is normally distributed and contains only random variates, then 7 out of 10 should lie between (x̄ ± σx). It is for this reason that two-sigma or three-sigma limits are preferred in statistical analysis:
2σx = 0.955 = 95.5% probability
3σx = 0.997 = 99.7% probability
It is important to be able to assess the precision of a set of observations, and several standards exist for
doing this. The most popular is the standard deviation (σ), a numerical value indicating the amount of variation about a central value.
In order to find out how precision is determined, one must first consider a measure which takes into
account all the values in a set of data. Such a measure is the deviation from the mean (x̄) of each observed value (xi), i.e. (xi − x̄), and one obvious consideration would be the mean of these deviations. However, in a normal distribution the sum of the deviations would be zero because the sum of the positive deviations would equal the sum of the negative deviations. Thus the mean of the squares of the deviations may be used, and this is called the variance (σ²):
σ² = Σ(i=1 to n) (xi − x̄)²/n    (2.1)
Theoretically σ is obtained from an infinite number of variates known as the population. In practice, however, only a sample of variates is available and S is used as an unbiased estimator. Account is taken of the small number of variates in the sample by using (n − 1) as the divisor, which is referred to in statistics as the Bessel correction; hence, variance is:
S² = Σ(i=1 to n) (xi − x̄)²/(n − 1)    (2.2)
As the deviations are squared, the units in which variance is expressed will be the original units squared.
To obtain an index of precision in the same units as the original data, therefore, the square root of the variance
is used, and this is called standard deviation (S), thus:
Standard deviation = S = [Σ(i=1 to n) (xi − x̄)²/(n − 1)]^(1/2)    (2.3)
Standard deviation is represented by the shaded area under the curve in Figure 2.5 and so establishes
the limits of the error bound within which 68.3% of the values of the set should lie, i.e. seven out of a
sample of ten.
Similarly, a measure of the precision of the mean (x̄) of the set is obtained using the standard error (Sx̄), thus:
Standard error = Sx̄ = [Σ(i=1 to n) (xi − x̄)²/n(n − 1)]^(1/2) = S/n^(1/2)    (2.4)
Standard error therefore indicates the limits of the error bound within which the true value of the mean
lies, with a 68.3% certainty of being correct.
It should be noted that S and Sx̄ are entirely different parameters. The value of S will not alter significantly with an increase in the number (n) of observations; the value of Sx̄, however, will alter significantly as the number of observations increases. It is important, therefore, that both values should be used when describing measured data.
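Equations (2.3) and (2.4) correspond to the stdev function of Python's statistics module, which uses the n − 1 divisor; a sketch using the line AB measurements from earlier as assumed sample data:

```python
import math
import statistics

obs = [25.34, 25.49, 25.12, 25.61, 25.04]

S = statistics.stdev(obs)           # sample standard deviation, eq. (2.3)
S_mean = S / math.sqrt(len(obs))    # standard error of the mean, eq. (2.4)
# S is about 0.241 m; S_mean about 0.108 m.
```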
2.6 WEIGHT
Weights are expressed numerically and indicate the relative precision of quantities within a set. The greater
the weight, the greater the precision of the observation to which it relates. Thus an observation with a weight
of two may be regarded as more precise than an observation with a weight of one. Consider two mean measures of the same angle: A = 50° 50′ 50″ of weight one, and B = 50° 50′ 47″ of weight two. This is equivalent to three observations, 50″, 47″, 47″, all of equal weight, having a mean value of 50° 50′ 48″. Weights may be allocated in several ways: (a) by personal judgement of the prevailing conditions at the time of measurement; (b) by direct proportion to the number of measurements of the quantity, i.e. w ∝ n; (c) by the use of variance and co-variance factors. This last method is recommended
and in the case of the variance factor is easily applied as follows. Equation (2.4) shows
Sx̄ = S/n^(1/2)
That is, error is inversely proportional to the square root of the number of measures. However, as w ∝ n, then
w ∝ 1/Sx̄²
i.e. weight is proportional to the inverse of the variance.
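Weights enter the mean as simple multipliers; the angle example above (seconds part 50″ with weight one, 47″ with weight two) can be checked with a short sketch:

```python
def weighted_mean(values, weights):
    # Each observation contributes in proportion to its weight.
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

seconds = weighted_mean([50.0, 47.0], [1, 2])   # (50 + 47 + 47)/3 = 48.0
```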
It is not unusual, when taking repeated measurements of the same quantity, to find at least one which appears
very different from the rest. Such a measurement is called an outlier, which the observer intuitively feels
should be rejected from the sample. However, intuition is hardly a scientific argument for the rejection of
data and a more statistically viable approach is required.
As already indicated, standard deviation represents 68.3% of the area under the normal curve and
is therefore representative of 68.26% confidence limits. This leaves 31.74% of the area under the tails
of the curve, i.e. 0.1587 or 15.87% in each tail. In Table 2.2 the value for z = 1.00 is 0.8413 (0.8413 = 1 − 0.1587), which indicates that the table is concerned only with the tail on one side of μ, not both. Therefore, to calculate confidence limits for a variate both tails must be considered; so if 95% confidence limits are required, each tail will contain 2.5%, and the value to look for in the table is 97.5%, or 0.9750. The value of z = (x − μ)/σ associated with 0.9750 in Table 2.2 is 1.96. This indicates that for a Normal Distribution 95% of the population lies within ±1.96σ of μ.
Thus, any random variate xi , whose residual error (xi x) is greater than 1.96S, must lie in the extreme
tail ends of the normal curve and might therefore be ignored, i.e. rejected from the sample. In the Normal
Distribution the central position of the distribution is derived from the theoretical infinite population.
In practice, in survey, it is derived from a limited data set. For example, the true value of a measurement
of a particular distance could only be found by averaging an infinite number of observations by an infinite
number of observers with an infinite number of measuring devices. The best one could hope for in practice
would be a few observations by a few observers with very few instruments. Therefore the computed mean
value of the observations is an estimate, not the true value of the measurement. This uncertainty is taken
into account by using the t distribution (Table 2.3) rather than the Normal Distribution.
Worked example
Area = probability    0.800 0.900 0.950 0.980 0.990 0.995 0.998 0.999
Degrees of freedom N          values of t
1 1.376 3.078 6.314 15.895 31.821 63.657 159.153 318.309
2 1.061 1.886 2.920 4.849 6.965 9.925 15.764 22.327
3 0.978 1.638 2.353 3.482 4.541 5.841 8.053 10.215
4 0.941 1.533 2.132 2.999 3.747 4.604 5.951 7.173
5 0.920 1.476 2.015 2.757 3.365 4.032 5.030 5.893
6 0.906 1.440 1.943 2.612 3.143 3.707 4.524 5.208
7 0.896 1.415 1.895 2.517 2.998 3.499 4.207 4.785
8 0.889 1.397 1.860 2.449 2.896 3.355 3.991 4.501
9 0.883 1.383 1.833 2.398 2.821 3.250 3.835 4.297
10 0.879 1.372 1.812 2.359 2.764 3.169 3.716 4.144
12 0.873 1.356 1.782 2.303 2.681 3.055 3.550 3.930
14 0.868 1.345 1.761 2.264 2.624 2.977 3.438 3.787
16 0.865 1.337 1.746 2.235 2.583 2.921 3.358 3.686
18 0.862 1.330 1.734 2.214 2.552 2.878 3.298 3.610
20 0.860 1.325 1.725 2.197 2.528 2.845 3.251 3.552
25 0.856 1.316 1.708 2.167 2.485 2.787 3.170 3.450
30 0.854 1.310 1.697 2.147 2.457 2.750 3.118 3.385
40 0.851 1.303 1.684 2.123 2.423 2.704 3.055 3.307
60 0.848 1.296 1.671 2.099 2.390 2.660 2.994 3.232
100 0.845 1.290 1.660 2.081 2.364 2.626 2.946 3.174
1000 0.842 1.282 1.646 2.056 2.330 2.581 2.885 3.098
Find the appropriate value in the row N = 5 in the t table (Table 2.3). At a probability of 0.95 the value of
t is 2.015 therefore the computed value of 2.064 indicates that there is slightly more than a 95% chance
that the last observation contains a non-random error.
It should be noted that successive rejection procedures should not be applied to the sample.
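The rejection test described above can be sketched in a few lines of code. This is an illustrative example, not the one computed in the text: the sample values are invented, and the critical value 2.015 is read from the N = 5 row, 0.950 column of Table 2.3.

```python
import math

# Hedged sketch of the t-based rejection test; the sample is illustrative.
def rejection_statistic(sample):
    """Return |r|/S for the observation with the largest residual."""
    mean = sum(sample) / len(sample)
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (len(sample) - 1))
    return max(abs(x - mean) for x in sample) / s

obs = [25.34, 25.33, 25.32, 25.33, 25.31, 25.38]  # last value looks suspect
stat = rejection_statistic(obs)
t_crit = 2.015  # Table 2.3: N = 5 degrees of freedom, probability 0.950
print(stat, stat > t_crit)
```

If the statistic exceeds the tabulated t value the observation may contain a non-random error; as noted above, the test should not then be re-applied successively to the reduced sample.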
Much data in surveying is obtained indirectly from various combinations of observed data, for instance
the coordinates of the ends of a line are a function of its length and bearing. As each measure-
ment contains an error, it is necessary to consider the combined effect of these errors on the derived
quantity.
The general procedure is to differentiate with respect to each of the observed quantities in turn and sum
them to obtain their total effect. Thus if a = f (x, y, z, . . .), and each independent variable changes by a
small amount (an error) x, y, z, . . . , then a will change by a small amount equal to a, obtained from
the following expression:
a a a
a = x + y + z + (2.6)
x y z
in which a/x is the partial derivative of a with respect to x, etc.
Consider now a set of measurements, and let the residuals in x, y and z for observation i be written as
δxi, δyi and δzi, with the corresponding error in the derived quantity written as δai:

    δa1 = (∂a/∂x)·δx1 + (∂a/∂y)·δy1 + (∂a/∂z)·δz1 + …
    δa2 = (∂a/∂x)·δx2 + (∂a/∂y)·δy2 + (∂a/∂z)·δz2 + …
      ⋮
    δan = (∂a/∂x)·δxn + (∂a/∂y)·δyn + (∂a/∂z)·δzn + …
Now squaring both sides gives

    δa1² = (∂a/∂x)²·δx1² + 2(∂a/∂x)(∂a/∂y)·δx1δy1 + … + (∂a/∂y)²·δy1² + …
    δa2² = (∂a/∂x)²·δx2² + 2(∂a/∂x)(∂a/∂y)·δx2δy2 + … + (∂a/∂y)²·δy2² + …
      ⋮
    δan² = (∂a/∂x)²·δxn² + 2(∂a/∂x)(∂a/∂y)·δxnδyn + … + (∂a/∂y)²·δyn² + …
In the above process many of the square and cross-multiplied terms have been omitted for simplicity.
Summing the results gives

    Σδa² = (∂a/∂x)²·Σδx² + 2(∂a/∂x)(∂a/∂y)·Σδxδy + … + (∂a/∂y)²·Σδy² + …
As the measured quantities may be considered independent and uncorrelated, the cross-products tend to
zero and may be ignored.
Now dividing throughout by (n − 1):

    Σδa²/(n − 1) = (∂a/∂x)²·Σδx²/(n − 1) + (∂a/∂y)²·Σδy²/(n − 1) + (∂a/∂z)²·Σδz²/(n − 1) + …
The sum of the residuals squared divided by (n − 1) is in effect the variance σ², and therefore
    σa² = (∂a/∂x)²·σx² + (∂a/∂y)²·σy² + (∂a/∂z)²·σz² + …     (2.7)
which is the general equation for the variance of any function. This equation is very important and is used
extensively in surveying for error analysis, as illustrated in the following examples.
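Equation (2.7) can be applied numerically. The sketch below is an illustration under stated assumptions (the function, values and standard errors are invented); it estimates the partial derivatives by central differences rather than analytically.

```python
import math

def propagated_sigma(f, values, sigmas, h=1e-6):
    """sigma_a = sqrt(sum((df/dx_i)^2 * sigma_i^2)), equation (2.7),
    with each partial derivative estimated by central differences."""
    total = 0.0
    for i, s in enumerate(sigmas):
        up = list(values); up[i] += h
        dn = list(values); dn[i] -= h
        dfdx = (f(*up) - f(*dn)) / (2 * h)
        total += (dfdx * s) ** 2
    return math.sqrt(total)

# Illustrative case: area of a rectangle a = x*y, sigma_x = sigma_y = 0.01 m
sigma_a = propagated_sigma(lambda x, y: x * y, [20.0, 10.0], [0.01, 0.01])
print(round(sigma_a, 3))  # 0.224
```

For a = x·y this reproduces the analytic result √((10 × 0.01)² + (20 × 0.01)²) ≈ 0.224 m².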
Worked examples
Example 2.2. Three angles of a triangle each have a standard error of ±2″. What is the total error (σT) in
the triangle?

    σT = √(2² + 2² + 2²) = 2√3 = ±3.5″
Example 2.3. In measuring a round of angles at a station, the third angle c closing the horizon is obtained
by subtracting the two measured angles a and b from 360°. If angle a has a standard error of ±2″ and angle b
a standard error of ±3″, what is the standard error of angle c?

Since c = 360° − a − b,

    σc = √(σa² + σb²) = √(2² + 3²) = ±3.6″
Example 2.4. The standard error of a mean angle derived from four measurements is ±3″; how many
measurements would be required, using the same equipment, to halve this uncertainty?

From equation (2.4):

    σm = σs/√n, so σs = 3″ × √4 = ±6″

i.e. the instrument used had a standard error of ±6″ for a single observation; thus for σm = ±1.5″, when
σs = ±6″:

    n = (6/1.5)² = 16
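The arithmetic of Example 2.4 can be checked with a short script; it simply restates equation (2.4), σm = σs/√n, for the figures above.

```python
import math

sigma_mean_4 = 3.0                          # seconds of arc, mean of 4 readings
sigma_single = sigma_mean_4 * math.sqrt(4)  # standard error of one observation: 6"
n_required = (sigma_single / 1.5) ** 2      # observations needed for sigma_m = 1.5"
print(int(n_required))  # 16
```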
Example 2.5. If the standard error of the sum of independently observed angles in a triangle is to be not
greater than ±6.0″, what is the permissible standard error per angle?

From equation (2.9):

    σT = σp·√n

where σT is the triangular error, σp the error per angle, and n the number of angles.

    σp = σT/√n = 6.0/√3 = ±3.5″
For division, A = a·b⁻¹, and the variance is

    σA² = (∂(ab⁻¹)/∂a)²·σa² + (∂(ab⁻¹)/∂b)²·σb² = (σa/b)² + (a·σb/b²)²     (2.12)

    σA = (a/b)·√[(σa/a)² + (σb/b)²] = (a/b)·√(Ra² + Rb²)     (2.13)
The case for the power of a number must not be confused with the case for multiplication; for example,
a³ = a × a × a, with each term being exactly the same.
Thus if A = aⁿ, then the variance

    σA² = (∂aⁿ/∂a)²·σa² = (n·aⁿ⁻¹·σa)², so σA = n·aⁿ⁻¹·σa     (2.14)

Alternatively

    RA = σA/A = n·aⁿ⁻¹·σa/aⁿ = n·σa/a = n·Ra     (2.15)
Similarly for roots, if the function is A = a^(1/n), then the variance

    σA² = (∂a^(1/n)/∂a)²·σa² = ((1/n)·a^(1/n − 1)·σa)² = (a^(1/n)·σa/(n·a))²     (2.16)

    σA = (a^(1/n)/n)·(σa/a)     (2.17)
The same approach is adopted for general forms which are combinations of the above.
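The relative-error rules in equations (2.12) to (2.17) are easy to check numerically; the values of a, b and their errors below are invented for illustration.

```python
import math

# Illustrative values (assumptions, not from the text)
a, b = 50.0, 8.0
sa, sb = a / 2000, b / 1000   # standard errors, so Ra = 1/2000, Rb = 1/1000
Ra, Rb = sa / a, sb / b

# Quotient A = a/b: equation (2.12) directly...
sigma_quot = math.sqrt((sa / b) ** 2 + (a * sb / b ** 2) ** 2)
# ...agrees with the relative-error form of equation (2.13)
R_quot = math.sqrt(Ra ** 2 + Rb ** 2)
assert abs(sigma_quot / (a / b) - R_quot) < 1e-12

# Power A = a**n: R_A = n * Ra, equation (2.15)
n = 3
R_pow = n * Ra

# Root A = a**(1/n): R_A = Ra / n, equation (2.17)
R_root = Ra / n

print(R_quot, R_pow, R_root)
```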
Worked examples
Example 2.6. The same angle was measured by two different observers using the same instrument,
as follows:

    Observer A      Observer B
    86°34′10″       86°34′05″
       33′50″          34′00″
       33′40″          33′55″
       34′00″          33′50″
       33′50″          34′00″
       34′10″          33′55″
       34′00″          34′15″
       34′20″          33′44″
    Observer A      r      r²  |  Observer B      r      r²
    86°34′10″     +10″    100  |  86°34′05″      +7″     49
       33′50″     −10″    100  |     34′00″      +2″      4
       33′40″     −20″    400  |     33′55″      −3″      9
       34′00″       0″      0  |     33′50″      −8″     64
       33′50″     −10″    100  |     34′00″      +2″      4
       34′10″     +10″    100  |     33′55″      −3″      9
       34′00″       0″      0  |     34′15″     +17″    289
       34′20″     +20″    400  |     33′44″     −14″    196
    Mean = 86°34′00″  Σr² = 1200  |  Mean = 86°33′58″  Σr² = 624

(a) (i) Standard deviation SA = √(1200/7) = ±13.1″

(b) (i) Standard error Sx̄A = SA/√n = 13.1/√8 = ±4.6″

(a) (ii) Standard deviation SB = √(624/7) = ±9.4″

(b) (ii) Standard error Sx̄B = 9.4/√8 = ±3.3″
(c) As each arithmetic mean has a different precision exhibited by its Sx value, the arithmetic means must
be weighted accordingly before they can be averaged to give the MPV of the angle:
    Weight of A ∝ 1/Sx̄A² = 1/21.2 = 0.047
    Weight of B ∝ 1/Sx̄B² = 1/10.9 = 0.092

The ratio of the weight of A to the weight of B is 0.047 : 0.092, and thus

    MPV of the angle = (86°34′00″ × 0.047 + 86°33′58″ × 0.092)/(0.047 + 0.092) = 86°33′59″
As a matter of interest, the following point could be made here: any observation whose residual is
greater than 2.998S should be rejected at the 98% level (see Section 2.7). Each data set has 8 observations
and therefore the mean has 7 degrees of freedom. This is a 2-tailed test, therefore the 0.990 column is used.
As 2.998SA = 39.3″ and 2.998SB = 28.2″, all the observations should be included in the set. This test
should normally be carried out at the start of the problem.
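As a cross-check, the statistics of Example 2.6 can be reproduced with a short script. Readings are expressed in seconds above 86°33′, a convenience chosen here for the arithmetic.

```python
import math

A = [70, 50, 40, 60, 50, 70, 60, 80]   # observer A, seconds above 86d33'
B = [65, 60, 55, 50, 60, 55, 75, 44]   # observer B

def stats(obs):
    mean = sum(obs) / len(obs)
    sr2 = sum((x - mean) ** 2 for x in obs)
    s = math.sqrt(sr2 / (len(obs) - 1))   # standard deviation
    se = s / math.sqrt(len(obs))          # standard error of the mean
    return mean, sr2, s, se

mean_a, sr2_a, s_a, se_a = stats(A)
mean_b, sr2_b, s_b, se_b = stats(B)
w_a, w_b = 1 / se_a ** 2, 1 / se_b ** 2   # weights of the two means
mpv = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
print(round(s_a, 1), round(s_b, 1), round(mpv))  # 13.1 9.4 59
```

The weighted mean of 58.7″ rounds to 59″, i.e. 86°33′59″ as in the text.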
Example 2.7. Discuss the classification of errors in surveying operations, giving appropriate examples.
In a triangulation scheme, the three angles of a triangle were measured and their mean values recorded
as 50°48′18″, 64°20′36″ and 64°51′00″. Analysis of each set gave a standard deviation of ±4″ for each of
these means. At a later date, the angles were re-measured under better conditions, yielding mean values of
50°48′20″, 64°20′39″ and 64°50′58″. The standard deviation of each value was ±2″. Calculate the most
probable values of the angles. (KU)

The angles are first adjusted to 180°. Since the angles within each triangle are of equal weight, the
angular adjustment within each triangle is equal.

    50°48′18″ + 2″ = 50°48′20″        50°48′20″ + 1″ = 50°48′21″
    64°20′36″ + 2″ = 64°20′38″        64°20′39″ + 1″ = 64°20′40″
    64°51′00″ + 2″ = 64°51′02″        64°50′58″ + 1″ = 64°50′59″
    179°59′54″ → 180°00′00″           179°59′57″ → 180°00′00″
    Weight of the first set, w1 = 1/4² = 1/16
    Weight of the second set, w2 = 1/2² = 1/4

Thus w1 = 1, when w2 = 4.

    MPV = (50°48′20″ × 1 + 50°48′21″ × 4)/5 = 50°48′20.8″
Similarly, the MPVs of the remaining angles are:

    64°20′39.6″        64°50′59.6″
The values may now be rounded off to single seconds.
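A sketch reproducing Example 2.7's computation for the first angle, working in seconds; the misclosures of 6″ and 3″ come from the triangle sums 179°59′54″ and 179°59′57″ above.

```python
# Equal adjustment of each triangle to 180 degrees:
adj1 = 6 // 3   # first set closes to 179d59'54" -> +2" per angle
adj2 = 3 // 3   # second set closes to 179d59'57" -> +1" per angle

# First angle, in seconds past 50d48'
a1 = 18 + adj1  # 20"
a2 = 20 + adj2  # 21"

w1 = 1 / 4 ** 2  # weight from sigma = 4"
w2 = 1 / 2 ** 2  # weight from sigma = 2", so w1 : w2 = 1 : 4
mpv = (w1 * a1 + w2 * a2) / (w1 + w2)
print(mpv)  # 20.8, i.e. 50d48'20.8"
```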
Example 2.8. A base line of ten bays was measured by a tape resting on measuring heads. One observer
read one end while the other observer read the other, the difference in readings giving the observed length
of the bay. Bays 1, 2 and 5 were measured six times, bays 3, 6 and 9 were measured five times and the
remaining bays were measured four times, the means being calculated in each case. If the standard errors
of single readings by the two observers were known to be 1 mm and 1.2 mm, what will be the standard
error in the whole line due only to reading errors? (LU)
    Standard error in reading a bay: Ss = √(1² + 1.2²) = ±1.6 mm

Consider bay 1. This was measured six times and the mean taken; thus the standard error of the mean is:

    Sx̄ = Ss/√n = 1.6/√6 = ±0.6 mm
This value applies to bays 2 and 5 also. Similarly for bays 3, 6 and 9:

    Sx̄ = 1.6/√5 = ±0.7 mm

For bays 4, 7, 8 and 10:

    Sx̄ = 1.6/√4 = ±0.8 mm
These bays are now summed to obtain the total length. Therefore the standard error of the whole line is

    √(0.6² + 0.6² + 0.6² + 0.7² + 0.7² + 0.7² + 0.8² + 0.8² + 0.8² + 0.8²) = ±2.3 mm
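Example 2.8 can be verified in a few lines; using the unrounded value of Ss rather than the rounded 1.6 mm, the result still rounds to 2.3 mm.

```python
import math

s_single = math.sqrt(1.0 ** 2 + 1.2 ** 2)   # standard error per bay, ~1.56 mm
# Repeats for bays 1..10: bays 1, 2, 5 six times; 3, 6, 9 five; 4, 7, 8, 10 four
repeats = [6, 6, 5, 4, 6, 5, 4, 4, 5, 4]
bay_se = [s_single / math.sqrt(n) for n in repeats]
line_se = math.sqrt(sum(se ** 2 for se in bay_se))   # bays summed for the line
print(round(line_se, 1))  # 2.3
```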
Example 2.9.

(a) A base line was measured using electronic distance-measuring (EDM) equipment and a mean distance
of 6835.417 m recorded. The instrument used has a manufacturer's quoted accuracy of 1/400 000
of the length measured, ±20 mm. As a check the line was re-measured using a different type of EDM
equipment having an accuracy of 1/600 000, ±30 mm; the mean distance obtained was 6835.398 m.
Determine the most probable value of the line.

(b) An angle was measured by three different observers, A, B and C. The mean of each set and its standard
error is shown below.

    A    89°54′36″    ±0.7″
    B    89°54′42″    ±1.2″
    C    89°54′33″    ±1.0″
(a) These values can now be used to weight the lengths and find their weighted mean:

    MPV = 6835 + (1.024/2.5) = 6835.410 m
(b)

    MPV = 89°54′30″ + (34.15/5.41)″ = 89°54′36″
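Example 2.9 can be re-derived from the quoted instrument accuracies. The weighting below is a reconstruction, not the book's elided working: the proportional and constant parts of each quoted accuracy are assumed to combine in quadrature, with weights taken as 1/σ², so the distance agrees with the text's 6835.410 m only to within about a millimetre.

```python
import math

# (a) Two EDM distances, each with sigma from length/k and a constant part
def sigma_mm(length_m, k, const_mm):
    return math.sqrt((length_m * 1000 / k) ** 2 + const_mm ** 2)

d1, d2 = 6835.417, 6835.398
w1 = 1 / sigma_mm(d1, 400_000, 20) ** 2
w2 = 1 / sigma_mm(d2, 600_000, 30) ** 2
mpv_dist = (w1 * d1 + w2 * d2) / (w1 + w2)
print(round(mpv_dist, 3))  # close to the book's 6835.410 m

# (b) Three angle means weighted by 1/sigma^2, in seconds above 89d54'00"
secs = [36, 42, 33]
sig = [0.7, 1.2, 1.0]
w = [1 / s ** 2 for s in sig]
mpv_angle = sum(wi * x for wi, x in zip(w, secs)) / sum(w)
print(round(mpv_angle))  # 36, i.e. 89d54'36"
```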
Example 2.10. In an underground correlation survey, the sides of a Weisbach triangle were measured as
follows:

    W1W2 = 5.435 m    W1W = 2.844 m    W2W = 8.274 m

Using the above measurements in the cosine rule, the calculated angle WW1W2 = 175°48′24″. If the
standard error of each of the measured sides is 1/20 000 of its length, find the standard error of the
calculated angle in seconds of arc. (KU)
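A sketch of the method Example 2.10 calls for: propagate the side errors through the cosine rule using equation (2.7), here with numerically estimated partial derivatives. The finite-difference step and the final figure are this sketch's own; with these inputs the standard error comes out large, a reminder that computing a near-straight angle from side lengths alone is highly sensitive to measurement error.

```python
import math

def angle_W1(b, c, a):
    """Angle at W1 (degrees) opposite side a = W2W, from b = W1W2, c = W1W."""
    return math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))

sides = [5.435, 2.844, 8.274]           # b = W1W2, c = W1W, a = W2W (metres)
sigmas = [s / 20000 for s in sides]     # standard error of each side

# Equation (2.7) with central-difference partial derivatives (h is a choice)
h = 1e-6
var = 0.0
for i in range(3):
    up = list(sides); up[i] += h
    dn = list(sides); dn[i] -= h
    dfdx = (angle_W1(*up) - angle_W1(*dn)) / (2 * h)
    var += (dfdx * sigmas[i]) ** 2

sigma_arcsec = math.sqrt(var) * 3600
print(round(angle_W1(*sides), 4), round(sigma_arcsec))
```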
Exercises
(2.1) Explain the meaning of the terms random error and systematic error, and show by example how each
can occur in normal surveying work.
A certain angle was measured ten times by observer A with the following results, all measurements
being equally precise:
74°38′18″, 20″, 15″, 21″, 24″, 16″, 22″, 17″, 19″, 13″
(The degrees and minutes remained constant for each observation.)
The same angle was measured under the same conditions by observer B with the following results:
74°36′10″, 21″, 25″, 08″, 15″, 20″, 28″, 11″, 18″, 24″
Determine the standard deviation for each observer and relative weightings. (ICE)
(2.2) Derive from first principles an expression for the standard error in the computed angle W1 of a
Weisbach triangle, assuming a standard error of σw in the Weisbach angle W, and equal proportional
standard errors in the measurement of the sides. What facts, relevant to the technique of correlation using
this method, may be deduced from the reduced error equation? (KU)