ASTM E 177

Designation: E 177 – 90a (Reapproved 2002)
Standard Practice for
Use of the Terms Precision and Bias in ASTM Test Methods¹
This standard is issued under the fixed designation E 177; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 The purpose of this practice is to present concepts necessary to the understanding of the terms “precision” and “bias” as used in quantitative test methods. This practice also describes methods of expressing precision and bias and, in a final section, gives examples of how statements on precision and bias may be written for ASTM test methods.
NOTE 1—The term “accuracy”, used in earlier editions of Practice E 177, embraces both precision and bias (see Section 20 and Note 4).
1.2 Informal descriptions of the concepts are introduced in the text as the concepts are developed, and appear in the following sections:

Section
Terminology 3
Significance and Use 4

GENERAL CONCEPTS
Test Method 5
Measurement Terminology 6
Observation 7
Test Determination 8
Test Result 9

SOURCES OF VARIABILITY
Experimental Realization of a Test Method 10
Operator 11
Apparatus 12
Environment 13
Sample 14
Time 15

STATISTICAL CONCEPTS
Accepted Reference Value 16
Statistical Control 17
Precision 18
Bias 19
Accuracy 20
Variation of Precision and Bias with Material 21
Variation of Precision and Bias with Sources of Variability 22

COMBINATIONS OF SOURCES OF VARIABILITY
Repeatability and Laboratory Bias 23
Other Within-a-Single Laboratory Precisions 24
Reproducibility and Bias of the Test Method 25
Range of Materials 26

METHODS OF EXPRESSING PRECISION AND BIAS
Indexes of Precision 27
Preferred Indexes of Precision for ASTM Test Methods 28
Preferred Statements of Bias for ASTM Test Methods 29
Elements of a Statement of Precision and Bias 30

STATEMENTS OF PRECISION AND BIAS
Examples of Statements of Precision and Bias 31

APPENDIX
Alphabetical List of Descriptions of Terms from the Text Appendix X1

1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents
2.1 ASTM Standards:
E 178 Practice for Dealing with Outlying Observations²
E 456 Terminology Relating to Quality and Statistics²
E 691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method²
E 1169 Guide for Conducting Ruggedness Tests²
2.2 ANSI/ASQC Standard:
A1-1978 Definitions, Symbols, Formulas and Tables for Control Charts³
2.3 Other Documents:
TAPPI Collaborative Reference Program, Reports 25 through 51, Aug. 1973 through Jan. 1978⁴

¹ This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control.
Current edition approved June 29, 1990. Published August 1990. Originally published as E 177 – 61. Last previous edition E 177 – 90.
² Annual Book of ASTM Standards, Vol 14.02.
³ Available from American Society for Quality Control, 230 West Wells St., Milwaukee, WI 53203.
⁴ Available from the Technical Association of the Pulp and Paper Industry, Technology Park/Atlanta, P.O. Box 105113, Atlanta, GA 30348.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
E 177 – 90a (2002)
ASQC Glossary and Tables for Statistical Quality Control³

3. Terminology
3.1 The terminology defined in Terminology E 456 applies in all areas affected by this practice, except where modified by this practice.
3.2 This practice is specifically concerned with the development of statements on precision and bias for inclusion as descriptors of the performance of a test method. This application requires refinement of the Terminology E 456 definitions, as discussed herein.
3.3 The informal descriptions of concepts developed in this practice have been collected in Appendix X1, and have been arranged alphabetically for easy reference.

4. Significance and Use
4.1 Part A of the “Blue Book,” Form and Style for ASTM Standards, requires that all test methods include statements of precision and bias. This practice discusses these two concepts and provides guidance for their use in statements about test methods.
4.2 Precision—A statement of precision allows potential users of a test method to assess in general terms the test method’s usefulness with respect to variability in proposed applications. A statement on precision is not intended to contain values that can be exactly duplicated in every user’s laboratory. Instead, the statement provides guidelines as to the kind of variability that can be expected between test results when the method is used in one or more reasonably competent laboratories. For a discussion of precision, see Section 18.
4.3 Bias—A statement on bias furnishes guidelines on the relationship between a set of typical test results produced by the test method under specific test conditions and a related set of accepted reference values (see Section 19).

GENERAL CONCEPTS

5. Test Method
5.1 Section 2 of the ASTM Regulations describes a test method as “a definitive procedure for the identification, measurement, and evaluation of one or more qualities, characteristics, or properties of a material, product, system or service that produces a test result.”
5.2 In this practice only quantitative test methods that produce numerical results are considered. Also, the word “material” is used to mean material, product, system or service; the word “property” is used herein to mean that a quantitative test result can be obtained that describes a characteristic or a quality, or some other aspect of the material; and “test method” refers to both the document and the procedure described therein for obtaining a quantitative test result for one property. For a discussion of test result, see Section 9.
5.3 During its development, a test method should be subjected to a screening procedure and ruggedness test in order to establish the proper degree of control over factors that may affect the test results (see Guide E 1169).
NOTE 2—A screening procedure or ruggedness test is a procedure for investigation of the effects of variations in environmental and other pertinent factors on the test results obtained from a test in order to determine how control of such factors should be specified in the written description of the method. For example, temperature of the laboratory or of a heating device used in the test may have a significant effect in some cases and less in others. In a screening procedure, deliberate variations in temperature would be introduced to establish the limits of significant effect, (1, 2, 3).⁵
⁵ The boldface numbers in parentheses refer to a list of references at the end of this standard.
5.4 A well-written test method specifies control over such factors as the test equipment, the test environment, the qualifications of the operator (explicitly or implicitly), the preparation of test specimens, and the operating procedure for using the equipment in the test environment to measure some property of the test specimens. The test method will also specify the number of test specimens required and how measurements on them are to be combined to provide a test result (Section 9), and might also reference a sampling procedure appropriate for the intended use of the method.
5.5 It is necessary that the writers of the test method provide instructions or requirements for every known outside influence.

6. Measurement Terminology
6.1 The following terms have been used to describe both the measurement process and the partial or complete result of the process: measurement, observation, observed value, test, test determination, test result, and others. These terms have often been used loosely and interchangeably.
6.2 For clarity, it is necessary to select certain of these terms for specific use. However, the word “measurement” will be used in a generic sense to cover observation (or observed value), test determination and test result. The use of the word “test” by itself is discouraged.
6.3 A quantitative test method may have three distinct stages: (1) the direct measurement or observation of dimensions or properties; (2) the arithmetical combination of the observed values to obtain a single determination; and (3) the arithmetical combination of a number of determinations to obtain the test result of the test method. These three stages are explained and illustrated in Sections 7-9.

7. Observation
7.1 For the purposes of this practice, observation or observed value should be interpreted as the most elemental single reading or corrected reading obtained in the process of making a measurement. This statement is a narrower interpretation than is given in Terminology E 456 in that the latter applies to nonquantitative as well as quantitative test methods.
7.2 An observation may involve a direct reading (for example, a zero-adjusted micrometer reading of the thickness of a test strip at one position along the strip) or it may require the interpolation of the reading from a calibration curve.

8. Test Determination
8.1 For a quantitative test method, a test determination may be described as (1) the process of calculating from one or more observations a property of a single test specimen, or as (2) the
value obtained from the process. Thus, the test determination may summarize or combine one or more observations.
8.2 Examples:
8.2.1 The measurement of the density of a test specimen may involve the separate observation of the mass and the volume of the specimen and the calculation of the ratio mass/volume. The density calculated from the ratio of one pair of mass and volume observations made on one specimen is a test determination.
8.2.2 The determination of the thickness of a test specimen strip may involve averaging micrometer caliper observations taken at several points along the strip.

9. Test Result
9.1 A test result is the value obtained by carrying out the complete protocol of the test method once, being either a single test determination or a specified combination of a number of test determinations.
9.2 In general, a test method describes not only the manner in which each test determination is to be made, but also the number of test determinations to be made and how these are to be combined to provide the test result.
9.3 Examples:
9.3.1 The test method on density might require that the mass and volume observations of a specimen be combined to give a test determination of density (8.2.1) and the test determination of each of five specimens be averaged to give a test result.
9.3.2 The test method for paper thickness may require that the determination of strip thickness in 8.2.2 be made on ten strips and that the ten test determinations be averaged to give the test result.
9.3.3 The test method for a tensile strength test of paper may specify that a tensile strength determination be performed on each of ten specimens and that the ten tensile test determinations be averaged to get the test result.
9.3.4 In chemical analyses a variety of situations may occur. Thus, in some cases, the method may call for the preparation of a single solution from a test unit, and measurement on three aliquots (specimens) of the solution made up to a specified volume. The average of the three analytical determinations would then be called the test result. In other cases of chemical analysis, the method may call for two individual test determinations, each one made on a different specimen with recalibration of the measuring instrument for each of the two determinations. The average of the two determinations would then be the test result.
9.3.5 In rubber testing, the method may describe not only the shape of the test specimen to be taken from a sheet of rubber, but also the preparation of the sheet, including compounding and curing. For example, one rubber test method specifies that four sheets be individually compounded and cured and three specimens tested from each sheet. The test result is then defined as the average of the four medians, each median being the middle determination, in the order of magnitude, of the three values obtained from a sheet.
9.3.6 Some test methods, such as those for analytical chemistry, involve calibration with known standard substances. The originally collected test determinations may be subjected to complex computational and statistical treatment prior to being converted into a test result. Such treatment might include separation of the analytical response for the substance of interest from the chromatographic absorption data, elimination or other treatment of outliers (see Practice E 178) in the data for the known standard substances, and preparation of a calibration curve to determine the test result.
9.4 Precision statements for ASTM test methods are applicable to comparisons between test results, not test determinations nor observations, unless specifically and clearly indicated otherwise (see Section 18).

SOURCES OF VARIABILITY

10. Experimental Realization of a Test Method
10.1 A realization of a test method refers to an actual application of the test method to produce a test result as specified by the test method. The realization involves an interpretation of the written document by a specific test operator, who uses a specific unit and version of the specified test apparatus, in the particular environment of his testing laboratory, to evaluate a specified number of test specimens of the material to be tested. Another realization of the test method may involve a change in one or more of the above emphasized experimental factors. The test result obtained by another realization of the test method will usually differ from the test result obtained from the first realization. Even when none of the experimental factors is intentionally changed, small changes usually occur. The outcome of these changes may be seen as variability among the test results.
10.2 Each of the above experimental factors and all others, known and unknown, that can change the realization of a test method, are potential sources of variability in test results. Some of the more common factors are discussed in Sections 11-15.

11. Operator
11.1 Clarity of Test Method—Every effort must be made in preparing an ASTM standard test method to eliminate the possibility of serious differences in interpretation. One way to check clarity is to observe, without comment, a competent laboratory technician, not previously familiar with the method, apply the draft test method. If the technician has any difficulty, the draft most likely needs revision.
11.2 Completeness of Test Method—It is necessary that technicians, who are generally familiar with the test method or similar methods, not read anything into the instructions that is not explicitly stated therein. Therefore, to ensure minimum variability due to interpretation, procedural requirements must be complete.
11.2.1 If requirements are not explicitly stated in the test method (see 5.5), they must be included in the instructions for the interlaboratory study (see Practice E 691).
11.3 Differences in Operator Technique—Even when operators have been trained by the same teacher or supervisor to give practically identical interpretations to the various steps of the test method, different operators (or even the same operator at different times) may still differ in such things as dexterity, reaction time, color sensitivity, interpolation in scale reading, and so forth. Unavoidable operator differences are thus one source of variability between test results. The test method
should be designed and described to minimize the effects of these operator sources of variability.

12. Apparatus
12.1 Tolerances—In order to avoid prohibitive costs, only necessary and reasonable manufacturing and maintenance tolerances can be specified. The variations allowed by these reasonable specification tolerances can be one source of variability between test results from different sets of test equipment.
12.2 Calibration—One of the variables associated with the equipment is its state of calibration, including traceability to national standards. The test method must provide guidance on the frequency of verification and of partial or complete recalibration; that is, for each test determination, each test result, once a day, week, etc, or as required in specified situations.

13. Environment
13.1 The properties of many materials are sensitive to temperature, humidity, atmospheric pressure, atmospheric contaminants, and other environmental factors. The test method usually specifies the standard environmental conditions for testing. However, since these factors cannot be controlled perfectly within and between laboratories, a test method must be able to cope with a reasonable amount of variability that inevitably occurs even though measurement and adjustment for the environmental variation have been used to obtain control (see 17.2). Thus, the method must be both robust to the differences between laboratories and require a sufficient number of test determinations to minimize the effect of within-laboratory variability.

14. Sample (Test Specimens)
14.1 A lot (or shipment) of material must be sampled. Since it is unlikely that the material is perfectly uniform, sampling variability is another source of variability among test results. In some applications, useful interpretation of test results may require the measurement of the sampling error. In interlaboratory evaluation of test methods to determine testing variability, special attention is required in the selection of the material sample (see 18.4 and Practice E 691) in order to obtain test specimens that are as similar as possible. A small residual amount of material variability is almost always an inseparable component of any estimate of testing variability.

15. Time
15.1 Each of the above sources of variability (operator performance, equipment, environment, test specimens) may change with time; for example, during a period when two or more test results are obtained. The longer the period, the less likely changes in these sources will remain random (that is, the more likely systematic effects will enter), thereby increasing the net change and the observed differences in test results. These differences will also depend on the degree of control exercised within the laboratory over the sources of variability. In conducting an interlaboratory evaluation of a test method, the time span over which the measurements are made should be kept as short as reasonably possible (see Sections 23 and 24).

STATISTICAL CONCEPTS

16. Accepted Reference Value
16.1 A measurement process is generated by the application of a test method. Variability can be introduced unintentionally into the measurement process through the impact of many sources, such as heterogeneity of the material, state of maintenance and calibration of equipment, and environmental fluctuations (Sections 10-15). The variability may include systematic as well as random components. The systematic components may be evaluated (Section 19) if an accepted reference value is available. An accepted reference value, according to Terminology E 456, is a value that serves as an agreed-upon reference for comparison. It may be:
(1) a theoretical or established value based on scientific principles;
(2) an assigned value based on experimental work of some national or international organization such as the U.S. National Institute of Standards and Technology;
(3) a consensus value based on collaborative experimental work under the auspices of a scientific or engineering group; or
(4) for a specific application, an agreed upon value obtained using an accepted reference method.
16.2 When the accepted reference value is the theoretical value, it is sometimes referred to as the “true” value, but this usage is not recommended.

17. Statistical Control
17.1 A process is in a state of statistical control if the variations between the observed test results from it can be attributed to a constant system of chance causes. This is a modification of the definition of “a state of statistical control” given in ANSI/ASQC Standard A 1-1978 (or the 1983 ASQC Glossary and Tables for Statistical Quality Control) by using the term “test results” in place of “sampling results”. By “chance causes” is meant unknown factors, generally numerous and individually of small magnitude, that contribute to variation, but that are not readily detectable or identifiable.
17.2 The measurement process is in a state of statistical control when the test results obtained vary in a predictable manner, showing no unassignable trends, cycles, abrupt changes, excess scatter, or other unpredictable variations as determined by application of appropriate statistical methods. The ensurance of a state of statistical control is not a simple matter (4), but may be helped by the use of control charts (see Part 3, STP 15D) (5, 6).
17.2.1 If the set of test results to be considered in terms of statistical control is obtained in different laboratories, it may be possible to view the laboratories as a “sample” of all qualified laboratories that are likely to use the given test method, or as a set comprising a special category of such laboratories, and that the differences between the laboratories represent random variability. “Qualified” may mean, for example, laboratories that have used this test method for a year or more.
17.3 The presence of outliers (Practice E 178) may be evidence of a lack of statistical control in the production process or in the measurement process. It is quite proper to discard outliers for which a physical explanation is known.
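As a rough illustration of how a control chart can help judge statistical control and flag suspect results, the following Python sketch computes limits for an individuals chart from the average moving range and checks new test results against them. It is only a sketch under stated assumptions: the function names, the data, and the choice of an individuals chart with the customary constant 2.66 (3 divided by 1.128) are illustrative and are not prescribed by this practice.

```python
from statistics import mean


def individuals_chart_limits(results):
    """Return (center, lower, upper) for an individuals control chart.

    Limits are center +/- 2.66 * (average moving range); 2.66 is the
    customary individuals-chart constant (3 / 1.128).
    """
    if len(results) < 2:
        raise ValueError("need at least two test results")
    # Moving range: absolute difference between consecutive test results.
    moving_ranges = [abs(b - a) for a, b in zip(results, results[1:])]
    center = mean(results)
    half_width = 2.66 * mean(moving_ranges)
    return center, center - half_width, center + half_width


def out_of_control(results, limits):
    """Return indices of results outside the (center, lower, upper) limits."""
    _, lower, upper = limits
    return [i for i, x in enumerate(results) if x < lower or x > upper]


# Hypothetical baseline test results from a period believed to be stable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
limits = individuals_chart_limits(baseline)

# Hypothetical later test results checked against the baseline limits.
new_results = [10.0, 10.1, 11.5, 9.9]
flagged = out_of_control(new_results, limits)
print(limits, flagged)
```

A flagged result is only a signal to look for an assignable cause; the chart by itself does not supply the physical explanation that would justify discarding the result.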
Discarding outliers in the measurement process on the basis of statistical evidence alone may yield biased results since one can truly measure the value of the property of interest only if the measurement process is in control. The presence of one or more outliers may indicate a weakness in the test method or its documentation.
17.4 The discussion in succeeding sections assumes that the measurement process is in a state of statistical control for some specified set of conditions. If measurements are all to be made in a given laboratory, for example, any systematic deviation from the expected value pertinent to that laboratory will show up as a bias for measurements made under the prescribed conditions (see Section 19).

18. Precision
18.1 The precision of a measurement process, and hence the stated precision of the test method from which the process is generated, is a generic concept related to the closeness of agreement between test results obtained under prescribed like conditions from the measurement process being evaluated. The measurement process must be in a state of statistical control; else the precision of the process has no meaning. The greater the dispersion or scatter of the test results, the poorer the precision. (It is assumed that the least count of the scale of the test apparatus is not so poor as to result in absolute agreement among observations and hence among test results.) Measures of dispersion, usually used in statements about precision, are, in fact, direct measures of imprecision. Although it may be stated quantitatively as the reciprocal of the standard deviation, precision is usually expressed as the standard deviation or some multiple of the standard deviation (see Section 27).
18.2 A measurement process may be described as precise when its test results are in a state of statistical control and their dispersion is small enough to meet the requirements of the testing situations in which the measurement process will be applied. The test results of two different processes expressed in the same units may be statistically compared as to precision, so that one process may be described as more (or less) precise than the other.
18.3 The precision of the measurement process will depend on what sources (Sections 10-15) of variability are purposely included and may also depend on the test level (see Section 21). An estimate of precision can be made and interpreted only if the experimental situation (prescribed like conditions) under which the test results are obtained is carefully described. There is no such thing as the precision of a test method; a separate precision statement will apply to each combination of sources of variability. The precision of a particular individual test result depends on the prescribed conditions for which it is considered a random selection. For example, will it be compared with other results obtained within the laboratory or with results obtained in other laboratories? No valid inferences on the precision of a test method or a test result can be drawn from an individual test result.
18.4 In order to minimize the effect of material variability in evaluating the precision of a test method, it is desirable to select a relatively uniform material for each of several test levels (magnitudes) chosen for the property being tested (see Practice E 691 for further information).

19. Bias
19.1 The bias of a measurement process is a generic concept related to a consistent or systematic difference between a set of test results from the process and an accepted reference value of the property being measured. The measuring process must be in a state of statistical control; otherwise the bias of the process has no meaning. In determining the bias, the effect of the imprecision is averaged out by taking the average of a very large set of test results. This average minus the accepted reference value is an estimate of the bias of the process (test method). Therefore, when an accepted reference value is not available, the bias cannot be established.
19.2 The magnitude of the bias may depend on what sources of variability are included, and may also vary with the test level and the nature of the material (see Section 21).
19.3 When evaluating the bias of a test method, it is usually advisable to minimize the effect of the random component of the measurement error by using at each test level the average of many (30 or more) test results, measured independently, for each of several relatively uniform materials, the reference values for which have been established by one of the alternatives in 16.1 (see 23.3 and 25.3).
19.4 If the bias of a test method is known, an adjustment for the bias may be incorporated in the test method in the section on calculation or in a calibration curve and then the method would be without bias.
19.5 The concept of bias may also be used to describe the systematic difference between two operators, two test sites (see 23.3), two seasons of the year, two test methods, and so forth. Such bias is not a direct property of the test method, unless one of the test sites or test methods provides the accepted reference value. The effect of such bias may be reflected in the measured reproducibility of the test method.

20. Accuracy
20.1 Accuracy is a generic concept of exactness related to the closeness of agreement between the average of one or more test results and an accepted reference value. Unless otherwise qualified, the use of the word “accuracy” by itself is to be interpreted as the accuracy of a test result. The accuracy of a test result is the closeness of agreement between the test result and the accepted reference value. It depends on both the imprecision and the bias of the test method.
20.2 There are two schools of thought on defining the accuracy of a measuring process (5, 7). In either case, the measurement process must be in a state of statistical control, otherwise the accuracy of the process has no meaning:
20.2.1 The closeness of agreement between the accepted reference value and the average of a large set of test results obtained by repeated applications of the test method, preferably in many laboratories.
20.2.2 The closeness of agreement between the accepted reference value and the individual test results (8, 9).
20.3 In 20.2.1 the imprecision is largely eliminated by the use of a large number of measurements and the accuracy of the measuring process depends only on bias. In 20.2.2 the imprecision is not eliminated and the accuracy depends on both bias and imprecision. In order to avoid confusion resulting from use
of the word “accuracy”, only the terms precision and bias of material. The precision estimate for this operator, day, and
should be used as descriptors of ASTM test methods. equipment is determined from the variability of the test results.
In this situation and the other experiments listed below, all
21. Variation of Precision and Bias with Material potential sources of variability must be carefully controlled
21.1 A test method is intended to cover a class of materials. within the tolerances specified in the test method.
Any one material within the class differs from any other in the 23.1.2 Precision from Repeated Experiments Within a
following two basic ways: the level of the property that is being Laboratory—In order to get an expression of precision that
measured; and the matrix of the material. The matrix is the applies to any operator and day with a specific set of equipment
totality of all properties, other than the level of the property to at a given laboratory, the experiment of 23.1.1 must be
be measured, that can have an effect on the measured value. repeated on different days by the same and different operators.
Thus the precision and the bias of the test method may be Then the precision estimates, obtained as in 23.1.1, for each
functions of the property level and of the material matrix. operator-day combination must be suitably combined or pooled
21.2 In some cases, a test method may be intended to be to obtain an estimate of single-operator-day precision that
applied to more than one class of materials. If so, it may be applies to this laboratory and equipment. If the laboratory has
advisable to provide separate statements of precision for each several sets of equipment for this test method, the experiment
class (see 31.3). may be enlarged to include tests on each set of equipment and
the test results pooled in order to obtain an overall single-
22. Variation of Precision and Bias with Sources of operator-day-equipment precision for that laboratory.
Variability
23.1.3 Precision from Within-Laboratory Experiments in
22.1 The precision and bias of test results obtained by Several Laboratories—In order to obtain an estimate of
repeated applications of a test method depend upon what within-laboratory precision that is characteristic of the test
combinations of the sources of variability (Sections 10-15) method and may reasonably be applied to any laboratory, the
affect the variability of the test results. For example, test results whole within-laboratory experiment of 23.1.2 could be re-
obtained by all possible operators within one laboratory using peated in a number of laboratories. Alternatively, this desired
one set of test apparatus would have a bias based in part on that broadly-applicable estimate may be obtained by pooling
laboratory’s apparatus and environment and a precision that within-laboratory information from only one operator-day-
would depend in part on the quality of training and supervision equipment combination carried out in each of a number of
of operators in that laboratory. Many combinations of sources laboratories. Although only one operator, one day, and one set
of variability are possible. Some of the combinations used by of equipment are combined in each laboratory, the use of many
ASTM committees are described in Sections 23-25. laboratories, as in an interlaboratory study such as described in
Practice E 691, provides an evaluation based on many opera-
COMBINATIONS OF SOURCES OF VARIABILITY tors, many days and many units of equipment. This abbreviated
(TYPES OF PRECISION AND BIAS) approach, only one operator-day-equipment combination in
23. Repeatability and Laboratory Bias each laboratory, is based on the assumption that this estimate of
within-laboratory precision does not change, or should not be
23.1 Within-Laboratory Precision—Information about a expected to change, significantly from laboratory to laboratory.
frequently used within-laboratory precision, sometimes called Consequently, this measure of precision can be treated as a
single-operator-day-apparatus precision, can be obtained from characteristic of the test method. This pooled within-laboratory
at least the three experimental situations described in 23.1.1- precision is called the repeatability of the test method.
23.1.3, the last situation being most reliable; that is, the
23.2 Repeatability Conditions—While other conditions
estimate of this precision is improved progressively by pooling
(Section 24) have sometimes been used for obtaining repeated
additional information.
test results in the determination of repeatability, the preferred
NOTE 3—If the test method requires a series of steps, the “single- conditions (illustrated above in 23.1-23.1.3) are those under
operator-equipment” requirement means that for a particular step the same
which test results are obtained with the same test method in the
combination of operator and equipment is used for every test result and on same laboratory, by the same operator with the same equip-
every material. Thus one operator may prepare the test specimens, a
second measure the dimensions and a third measure the breaking force.
ment, in the shortest practical period of time, using test units or
The “single-day” requirement means that the test results, at least for a test specimens (see Practice E 691, 10.3) taken at random, from
particular material are obtained in the shortest practical period of time, a single quantity of material that is as nearly homogeneous as
whether this be a fraction of a day or several days. possible. For meaning of “same operator, same equipment”
23.1.1 Precision From an Experiment Involving One Op- and“ shortest practical period of time,” see Note 3 above.
erator, Day and Apparatus—A single, well-trained operator 23.3 Repeatability—The closeness of agreement between
using one set of equipment obtains two or more test results in test results obtained under repeatability conditions.
a short period of time during which neither the equipment nor 23.4 Bias of a Particular Laboratory, relative to the other
the environment is likely to change appreciably. The variability laboratories may be calculated by averaging test values ob-
is due primarily to small changes in equipment, calibration, tained as described in 23.1.2 for that laboratory and comparing
reagents, environment, and operator’s procedure, and possibly the result with the average of all test values obtained as
to some heterogeneity in the material tested. The last is kept described in 23.1.3. The bias of the test method may be
small by use of test specimens from a reasonably uniform lot calculated by comparing the latter average with the accepted
reference value (Section 16), or it may be determined as described in 25.3. Once the bias is known, the method should be modified to correct for it (see 19.5).

24. Other Within-A-Single Laboratory Precisions

24.1 Single-Operator-Apparatus, Multi-Day Precision—A single operator using one set of equipment obtains replicate test results as in Section 23, but one on each of two or more days. Since the time interval is greater than in Section 23, there is a greater chance that the equipment (including its calibration) and the environment may change, and that the change will depend on the degree of control or supervision maintained by the laboratory over these factors. Therefore, the precision calculated in this between-day within-laboratory situation may vary appreciably from laboratory to laboratory and often cannot be regarded as a universal parameter of the test method. While this multi-day precision has been called “repeatability” by some ASTM committees, it is better to reserve the term for the precision estimate described in 23.1.3, which is more likely to be an estimate of a universal characteristic of the test method. If information on multi-day precision is needed by a laboratory, it should be studied in that laboratory, since the estimate may vary widely from laboratory to laboratory.

24.2 Multi-Operator, Single-Day-Apparatus Precision—Each of several operators in one laboratory using the same set of equipment obtains a test result. Since the operator effect may depend on the degree of training and supervision exercised in the laboratory, the precision among test results (between operators within laboratory) may vary widely from laboratory to laboratory, and therefore may not be regarded as a universal parameter of the test method (see Note in example in 31.7). If information on multi-operator precision is needed by a laboratory, it should be studied by that laboratory.

25. Reproducibility and Bias of the Test Method

25.1 Between-Laboratory Precision—Each of several laboratories, each with its own operator, apparatus, and environmental conditions, obtains a test result on randomly-selected specimens from the same reasonably-uniform sample of material. The variability of the test results may be used to calculate the between-laboratory precision, which, when based on a single test result from each laboratory, is also called the reproducibility of the test method. The laboratories being compared in order to obtain the between-laboratory reproducibility of the test method should be independent of each other. Independent means that the laboratories should not be under the same supervisory control, nor should they have worked together to resolve differences. The value found for the between-laboratory precision will depend on the choice of laboratories and the selection of operators and apparatus within each laboratory.

25.1.1 The precision within a single laboratory or facility having multiple test stations will depend largely on the degree of supervision provided. If information on this precision is required, the laboratory should run its own internal study, possibly using Practice E 691, with each station treated as a laboratory. The precision determined (that is, “between station reproducibility”) can be expected to be somewhat better than the reproducibility of the test method, depending on the degree of common supervision of the test stations.

25.2 Reproducibility, as used in 25.1 and 25.1.1, is a general term for a measure of precision applicable to the variability between single test results obtained in different laboratories using test specimens taken at random from a single sample of material. This use of the word “reproducibility” is narrower than that defined in Terminology E 456 because it assumes the simpler interlaboratory study of 23.1.3 and Practice E 691 where only one operator-day-apparatus combination is involved in each laboratory.

25.3 Bias of Test Method—The bias of the test method, for a specific material, may be calculated by comparing the average of all the test results obtained in 25.1 for that material with the accepted reference value (see Section 16) for that material. If no accepted reference value is available, bias cannot be calculated (however, see 29.2). For a valid determination of bias, the results of the test method must indicate a state of statistical control (see Section 17).

26. Range of Materials

26.1 The estimates of precision and bias described in Sections 23-25 are based on test results from a material at one level of the property of interest. The experiments should be extended to other related materials yielding test results at other test levels. Related materials are materials that may have similar matrixes of other properties (see Section 21) and are likely to be compared by means of the test method.

26.2 Precision and bias may be constants or simple functions of the test level or they may depend so appreciably on the matrix of other properties of the materials that the test method will have to be modified to take into account these other, possibly-interfering, properties before reasonable and consistent values for precision and bias can be obtained.

METHODS OF EXPRESSING PRECISION AND BIAS

27. Indexes of Precision

27.1 General—Precision may be stated in terms of an index consisting of some positive value, a. The index is expressed in the same units as those of the test result, or as a percent of the test result. The numerical value of a will be smaller when the individual test results from repeated applications of the test method are more closely grouped. The larger the index, the less precise the measurement process. A test method has a separate index of precision for each type of precision (see Sections 22-25) and this index may vary in a systematic way with the test level or it may vary from material to material even at the same test level.

27.2 Basis—The usual source of the index of precision is the sample estimate of the standard deviation (denoted by the symbol s) of a random set of test results for that type of precision (for example, from an interlaboratory study such as Practice E 691), where standard deviation has its usual meaning (for example, see Terminology E 456). The number of test results in the set should be sufficiently large (at least 30) so that the sample standard deviation (s) computed from the randomly-selected set be a good approximation to the standard deviation
of the population of all test results (denoted by the symbol σ) that could be obtained for that type of precision. See Practice E 691 for an example of the design of an interlaboratory study to determine within-laboratory and between-laboratory standard deviations, also called repeatability and reproducibility standard deviations.

27.3 Possible Indexes of Precision:

27.3.1 Standard Deviation (s)—See 27.2.

27.3.2 “Two”-Standard Deviation Limits (2s)—Approximately 95 % of individual test results from laboratories similar to those in an interlaboratory study can be expected to differ in absolute value from their average value by less than 1.960 s (about 2.0 s).

27.3.3 Difference “Two”-Standard-Deviation Limit (d2s)—Approximately 95 % of all pairs of test results from laboratories similar to those in the study can be expected to differ in absolute value by less than $1.960\sqrt{2}\,s$ (about $2\sqrt{2}\,s$) $= 2.77\,s$ (or about 2.8 s). This index is also known as the 95 % limit on the difference between two test results. For the two cases described in Sections 23 and 25, these limits are known as the repeatability and reproducibility limits.

27.3.4 Multiplier for 95 % Limit:

27.3.4.1 The multiplier 1.960 or 2.0 used in 27.3.2 and 27.3.3 assumes an underlying normal distribution for the test results being compared. For methods in which the average of several test determinations is reported as a single test result, the assumption of normality is usually reasonable, even for skewed or bimodal distributions. When normality cannot be assumed, it is usually satisfactory to continue to use the multiplier 2.0 but recognize that the actual probability limit will differ somewhat from the nominal 95 % limit.

27.3.4.2 It may be thought that the use of the multiplier 1.960 (or approximately 2.0) in 27.3.2 and 27.3.3 requires that the sample standard deviation (s) be assumed to be equal to the population (or “true”) standard deviation (σ). No within or between-laboratory study will yield a standard deviation (s) exactly equal to the “true” standard deviation (σ), and few will come close unless at least 30 laboratories are included in the study. No multiplier for s will ensure an actual limit of exactly 95 %. The use of the multiplier t (Student’s t) instead of the multiplier 1.960 does not remedy the situation. In order to resolve this problem, a range of probabilities around 95 % must be accepted as defining the “95 % limit”. For appropriate choices of the defining range, the multiplier 1.960 (or 2.0) may still be used. It has been shown that 1.960 is the best choice for achieving the desired (but approximate) 95 % coverage (10). The multiplier is independent of the number of test results in the within-laboratory study or the number of laboratories in the study for between-laboratory precision. However, a within- or between-laboratory study must be of reasonably large size in order to provide reliable information on which to base a precision statement.

27.3.5 Indexes in Percent—In some instances (see 28.5) there may be some advantage in expressing the precision index as a percentage of the average test result; that is, percent coefficient of variation (CV %). The notation may then be (CV %), (2CV %), (d2CV %), etc.

27.3.6 Other Indexes—For some applications, limits based on 95 % probability are not adequate. Basic multipliers other than 1.960 (or about 2.0) may be used, yielding probabilities other than “approximately 0.95”. As discussed below, however, the (d2s) = (2.8 s) and (d2CV %) = (2.8 CV %) indexes are recommended, unless there is a special need.

28. Preferred Indexes of Precision for ASTM Test Methods

28.1 Preferred Types of Precision and Preferred Indexes—The types of precision described in 23.1.3 and 25.1, namely, repeatability and reproducibility, are the preferred types of precision statements for ASTM test methods. The preferred index for each of these types is the 95 % limit on the difference between two test results (see 27.3.3), namely, 2.8 s or 2.8 CV %. Also the corresponding standard deviation (s) or percent coefficient of variation (CV %) shall be indicated.

28.2 Recommended Terminology for Preferred Indexes:

    r = 95 % repeatability limit    (1)
    R = 95 % reproducibility limit    (1)

or, to help prevent confusion between r and R, use:

    r = 95 % repeatability limit (within a laboratory)    (2)
    R = 95 % reproducibility limit (between laboratories)    (2)

Similarly, the recommended terminology for the corresponding standard deviations is:

    $s_r$ = repeatability standard deviation (within a laboratory)    (3)
    $s_R$ = reproducibility standard deviation (between laboratories)    (3)

and for the coefficients of variation:

    $\mathrm{CV\,\%}_r$ = repeatability coefficient of variation in percent (within a laboratory)    (4)
    $\mathrm{CV\,\%}_R$ = reproducibility coefficient of variation in percent (between laboratories)    (4)

where:

    $r = 1.960\sqrt{2}\,s_r = 2.8\,s_r$ or $r = 1.960\sqrt{2}\,\mathrm{CV\,\%}_r = 2.8\,\mathrm{CV\,\%}_r$
    $R = 1.960\sqrt{2}\,s_R = 2.8\,s_R$ or $R = 1.960\sqrt{2}\,\mathrm{CV\,\%}_R = 2.8\,\mathrm{CV\,\%}_R$

depending on how the indexes vary with the test level (see 28.5). For other than the preferred types, the more general terminology “95 % limit” may be used with a description of the sources of variability; for example:

    95 % limit (operator-to-operator, within-laboratory)

and similarly for the corresponding standard deviation:

    operator-to-operator within-laboratory standard deviation.

28.3 Whenever the general terms “repeatability” and “reproducibility” or the more specific terminology “repeatability limit” and “reproducibility limit” are stated with numerical values, users will have to assume that the 95 % limits are intended, unless otherwise specified.

28.4 Quantitative estimates of repeatability and reproducibility may be obtained from an interlaboratory study conducted as directed in Practice E 691.
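The indexes of 27.3 and the limits named in 28.2 reduce to a single multiplication, which the following Python sketch illustrates. Only the multiplier 1.960 √2 (about 2.8) comes from the text; the function names and the numerical inputs are hypothetical and chosen for illustration.

```python
import math

# Multiplier for the difference between two test results (27.3.3):
# 1.960 * sqrt(2), which the practice rounds to about 2.8.
D2S_MULTIPLIER = 1.960 * math.sqrt(2.0)


def limit_95(index: float) -> float:
    """Return the 95 % limit (d2s) for a standard deviation or a CV %."""
    return D2S_MULTIPLIER * index


def cv_percent(std_dev: float, average: float) -> float:
    """Return the percent coefficient of variation (27.3.5)."""
    return 100.0 * std_dev / average


# Hypothetical repeatability and reproducibility standard deviations
# and a hypothetical average test result.
s_r, s_R, average = 0.30, 1.00, 25.0

r = limit_95(s_r)                                # repeatability limit
R = limit_95(s_R)                                # reproducibility limit
r_percent = limit_95(cv_percent(s_r, average))   # limit in percent form

print(f"multiplier = {D2S_MULTIPLIER:.3f}")
print(f"r = {r:.2f}, R = {R:.2f}, r in percent = {r_percent:.1f} %")
```

Whether the limit is better reported in the units of the test result or in percent depends on how the index varies with the test level, as discussed in 28.5.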
28.5 Variation of Index With Test Level—The choice between 2.8 s and 2.8 CV % and the form for the statement of the precision indexes depends upon how the indexes vary with the test level.

28.5.1 If a 2.8 s index is approximately constant throughout the test range, then the 2.8 s index is recommended. Express the index in the units of the measured property.

28.5.2 If a 2.8 s index is approximately proportional to the test level, then use the 2.8 CV % index. Express the index in percentage of the test level.

28.5.3 In either case, express the index as a single average (or pooled) number followed parenthetically by the actual range of the index values (highest and lowest) encountered in the interlaboratory study.

28.5.4 If a 2.8 s index is neither approximately constant nor approximately proportional to the test level, plot the index versus the test level to determine how they are related. If the index varies systematically with the test level, express the index by a combination of 2.8 s and 2.8 CV % (see example 31.3), by a simple formula, or by a plot. If the index varies in no systematic way with the level, but jumps from material to material (perhaps because some materials are inherently more variable than others), express the index by a table (see 31.6) or by a single compromise value selected by judgment. Carefully describe each material in the table. The jumping may be due to interfering properties in the material matrixes (Section 21) and the description may eventually allow identification of the cause.

29. Preferred Statements of Bias for ASTM Test Methods

29.1 Some information may be available concerning the bias or part of the bias of a test method as determined from an interlaboratory study (25.3 and 23.3) or from known effects of environmental or other deviations as determined in ruggedness tests (see 5.3). An adjustment for what is known about the bias can be incorporated in the calculations or calibration curves. The statement on bias should then state how this correction is provided for in the test method.

29.2 If the bias of a test method, or the uncorrected balance of the bias, is not known because there is no accepted reference value (see 25.3), but upper and lower bounds can be estimated by a theoretical analysis of potential systematic errors, credible bounds for this uncorrectable balance of the bias should be given in the bias statement (see example Ex 9 of 31.9). (9)

NOTE 4—No formula for combining the precision and the bias of a test method into a single numerical value of accuracy is likely to be useful. Instead separate statements of precision and bias should be presented. The values may then be used jointly in any specific application of the test method.

30. Elements of a Statement of Precision and Bias

30.1 The precision and bias section of a test method should include, as a minimum, the elements specified in 30.2-30.5 and in 30.7:

30.2 A brief description of the interlaboratory test program on which the statement is based, including (1) what materials were tested, (2) number of laboratories, (3) number of test results per laboratory per material, and the (4) interlaboratory practice (usually Practice E 691) followed in the design of the study and analysis of the data. This section should give the ASTM Research Report number for the interlaboratory data and analysis.

30.3 A description of any deviation from complete adherence to the test method for each test result, such as preparation in one laboratory of the cured test sheets and distribution thereof to the participating laboratories, when curing is a specified part of the test method.

30.4 The number of test determinations and their combination to form a test result, if not clearly defined in the body of the test method.

30.5 A statement of the precision between test results expressed in terms of the 95 % repeatability limit and the 95 % reproducibility limit (see 28.2), including any variation of these statistics with test level or material (see 28.5 and section 28.6). Report the repeatability and reproducibility standard deviations (or percent coefficients of variation) among test results as indicated in 28.2. Finally, state that repeatability and reproducibility are used as directed in ASTM Practice E 177.

30.6 If precision under additional conditions (for example, operator-to-operator or day-to-day) has been determined, report the number of operators or days per laboratory. Include a careful description of the additional conditions, and the precision values obtained, using such terminology as 95 % limit (operator-to-operator within laboratory).

30.7 A statement concerning what is known about bias, including how the method has been modified to adjust for what is known about bias and that it is now without known bias. If the value of the property being measured can be defined only in terms of the test method, state this and whether the method is generally accepted as a reference method. If an estimate of the maximum bias of the method can be made on theoretical grounds (for example, by examining the maximum probable contributions of various steps in the procedure to the total bias), then describe these grounds in this section. Give the ASTM Research Report number on the theoretical or experimental study of bias.

STATEMENTS OF PRECISION AND BIAS

31. Statements of Precision and Bias

31.1 Example Statements of Precision and Bias—In the simplest case, the statement will appear essentially as shown in illustrative example Ex.1. Ex.1 is a simplified example. Normally, at least six laboratories and at least three materials should be included in the study in accordance with Practice E 691. (No general conclusions about the test method can be considered valid from so few materials and laboratories.)

Ex.1 Precision and Bias
Ex.1.1 Interlaboratory Test Program—An interlaboratory study of the permanent deformation of elastomeric yarns was run in 1969. Each of two laboratories tested five randomly drawn test specimens from each of three materials. The design of the experiment, similar to that of Practice E 691, and a within-between analysis of the data are given in ASTM Research Report No. XXXX.
Ex.1.2 Test Result—The precision information given below for average permanent deformation in percentage points at 100-min relaxation time is for the comparison of two test results, each of which is the average of five test determinations.
Ex.1.3 Precision:
95 % repeatability limit (within laboratory)         0.8 %
95 % reproducibility limit (between laboratories)   2.9 %
The above terms (repeatability limit and reproducibility limit) are used as specified in Practice E 177. The respective standard deviations among test results, related to the above numbers by the factor 2.8, are:
repeatability standard deviation = 0.3 %
reproducibility standard deviation = 1.0 %.
Ex.1.4 Bias—This method has no bias because permanent deformation of elastomeric yarns is defined in terms of this method.

31.2 The illustrative example Ex.2 is another simplified example in which only two materials have been used but with the required minimum number (six) of participating laboratories:
Ex.2 Precision and Bias
Ex.2.1 Interlaboratory Test Program—An interlaboratory study was run in which randomly drawn test specimens of two materials (kraft envelope paper and wove envelope paper) were tested for tearing strength in each of six laboratories, with each laboratory testing two sets of five specimens of each material. Except for the use of only two materials, Practice E 691 was followed for the design and analysis of the data; the details are given in ASTM Research Report No. XXXY.
Ex.2.2 Test Result—The precision information given below in the units of measurement (grams) is for the comparison of two test results, each of which is the average of five test determinations:
Ex.2.3 Precision:

                                                     Material A   Material B
Average Test Value                                   45 gf        100 gf
95 % repeatability limit (within laboratory)         3 gf         7 gf
95 % reproducibility limit (between laboratories)    6 gf         12 gf

The above terms (repeatability limit and reproducibility limit) are used as specified in Practice E 177. The respective standard deviations among test results may be obtained by dividing the above limit values by 2.8.
Ex.2.4 Bias—The original draft of this abbreviated method was experimentally compared in one laboratory with the appropriate reference method (ASTM DXXXX) and was found to give results approximately 10 % high, as theoretical considerations would suggest (See ASTM Research Report No. XXXW). An adjustment for this bias is now made in Section XX on calculations, so that the final result is now without known bias.

31.3 If a sufficient number of different materials to cover the test range are included in the interlaboratory study (6 or more in accordance with Practice E 691), then the approximate variation in precision with test level may be determined. Since two distinctly separate classes of material are tested by the method shown in illustrative example Ex.3, two separate interlaboratory studies were made. In the first study, the repeatability was found to be essentially proportional to the test value (with minor variation from material to material as shown), whereas the reproducibility had a more complex linear relationship (that is, a constant as well as a proportional term). In the second study, the repeatability and the reproducibility were each found to be proportional to the test value.

Ex.3 Precision

Coarse-fiber materials
Test range                                           30 to 150 g
95 % repeatability limit (within laboratory)         7 % (6 to 8.5 %) of the test result
95 % reproducibility limit (between laboratories)    2 g + 10 % (8 to 12 %) of the test result

Well beaten (fine-fiber) materials
Test range                                           20 to 75 g
95 % repeatability limit (within laboratory)         4 % (3.5 to 5 %) of the test result
95 % reproducibility limit (between laboratories)    7 % (5 to 8 %) of the test result

Ex.3.1 The values shown above for the limits are the average (and range) in each case as found in separate interlaboratory studies for the coarse and fine-fiber materials. The terms repeatability limit and reproducibility limit are used as specified in Practice E 177. The respective standard deviations among test results may be obtained by dividing the above limit values by 2.8.

31.4 Precision information can often be obtained from studies made for other purposes. The example below illustrates this approach and also illustrates another way of showing variation from material to material.

Ex.4 Precision
Ex.4.1 Interlaboratory Test Program—The information given below is based on data obtained in the TAPPI Collaborative Reference Program for self-evaluation of laboratories, Reports 25 through 51 (Aug. 1973 through Jan. 1978). Each report covers two materials with each of approximately 16 laboratories testing 5 specimens of each material.
Ex.4.2 Test Result—The precision information given below has been calculated for the comparison of two test results, each of which is the average of 10 test determinations.
Ex.4.3 95 % Repeatability Limit (within laboratory)—The repeatability is 5.4 % of the test result. For the different materials the repeatability ranged from 3.7 to 9.6 %. The range of the central 90 percent of the repeatability values was 3.9 to 8.7 %.
Ex.4.4 95 % Reproducibility Limit (between laboratories)—The reproducibility is 19.2 % of the test result. For the different materials the range of all of the calculations of reproducibility was 6.4 to 45.4 %. The range of the central 90 percent of the calculations was 12.2 to 25.5 %.
Ex.4.5 Definitions and Standard Deviations—The above terms repeatability limit and reproducibility limit are used as specified in Practice E 177. The respective percent coefficients of variation among test results may be obtained by dividing the above numbers by 2.8.

31.5 Precision is often constant for low test values and proportional for higher test values, as shown in the following example:

Ex.5 Precision
Test range                                           0.010 to 1200 mm
95 % repeatability limit (within laboratory)         0.002 mm or 2.5 % of the average, whichever is larger
95 % reproducibility limit (between laboratories)    0.005 mm or 4.2 % of the average, whichever is larger

The above terms repeatability limit and reproducibility limit are used as specified in Practice E 177. The respective standard deviations and percent coefficients of variation among test results may be obtained by dividing the above limit values by 2.8.

31.6 A table may be used especially if the precision indexes vary irregularly from material to material. Note in the following example that the materials have been arranged in increasing order of test value:

Ex.6A Precision

            Glucose in   Repeatability   Reproducibility
            Serum,       Standard        Standard          Repeatability   Reproducibility
Material    Average      Deviation       Deviation         Limit           Limit
A            41.518      1.063           1.063              2.98            2.98
B            79.680      1.495           1.580              4.19            4.42
C           134.726      1.543           2.148              4.33            6.02
D           194.717      2.625           3.366              7.35            9.42
E           294.492      3.935           4.192             11.02           11.74

Ex.6A.1 Interlaboratory Test Program—An interlaboratory study of glucose in serum was conducted in accordance with Practice E 691 in eight laboratories with five materials, with each laboratory obtaining three test results for each material. See ASTM Research Report No. XXXX.
Ex.6A.2 The terms repeatability limit and reproducibility limit in Ex.6A are used as specified in Practice E 177.

Ex.6B Precision

            Pentosans    Repeatability   Reproducibility
            in Pulp,     Standard        Standard          Repeatability   Reproducibility
Material    Average      Deviation       Deviation         Limit           Limit
A            0.405       0.015           0.114             0.04            0.32
B            0.884       0.032           0.052             0.09            0.14
C            1.128       0.143           0.196             0.40            0.55
Ex.6B Precision effect in some laboratories is as high as the reproducibility between

Repeat- Reproduc- laboratories, it is possible that reproducibility also may be improved by
Pentosans ability ibility Repeat- Repro- better operator training.
Material in Pulp, Stand- Stand- ability ducibility
Average ard De- ard De- Limit Limit 31.8 . An example of a bias statement when bias has been
viation viation removed through comparison with a reference method is given
D 1.269 0.038 0.074 0.11 0.21 in 31.2 and Ex.2.4. A similar statement would apply for any
E 1.981 0.040 0.063 0.11 0.18 accepted reference value, for example, from an accepted
F 4.181 0.032 0.209 0.09 0.58
G 5.184 0.133 0.243 0.37 0.68
reference material. If bias depends on other properties of the
H 10.401 0.194 0.585 0.54 1.64 material, a statement such as the following might be used:
I 16.361 0.216 1.104 0.60 3.09 Ex.8 Bias
Ex.6B.1 Interlaboratory Test Program—An interlaboratory Ex.8.1. Bias—A ruggedness study (ASTM Research Report
study of pentosans in pulp was conducted in accordance with No. XXXZ) showed that test results are temperature dependent,
Practice E 691 with seven participating laboratories each obtaining with the dependence varying with the type of material. Therefore, if
three test results of each of nine materials. See ASTM Research the test temperature cannot be maintained within the specified
Report No. YYYY. limits, determine the temperature dependence for the specific
Ex.6B.2 The terms repeatability limit and reproducibility limit in material being tested and correct test results accordingly.
Ex.6B are used as specified in Practice E 177.
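The relation between the standard deviations and the limits tabulated in Ex.6B (limit = 2.8 × standard deviation) can be checked numerically. The following Python sketch uses the nine rows transcribed from the table; the function name is hypothetical. Small gaps are to be expected, presumably because the tabulated values are rounded.

```python
# Rows transcribed from Ex.6B:
# (material, repeatability SD, reproducibility SD,
#  repeatability limit, reproducibility limit)
EX_6B = [
    ("A", 0.015, 0.114, 0.04, 0.32),
    ("B", 0.032, 0.052, 0.09, 0.14),
    ("C", 0.143, 0.196, 0.40, 0.55),
    ("D", 0.038, 0.074, 0.11, 0.21),
    ("E", 0.040, 0.063, 0.11, 0.18),
    ("F", 0.032, 0.209, 0.09, 0.58),
    ("G", 0.133, 0.243, 0.37, 0.68),
    ("H", 0.194, 0.585, 0.54, 1.64),
    ("I", 0.216, 1.104, 0.60, 3.09),
]

MULTIPLIER = 2.8  # rounded value of 1.960 * sqrt(2), see 27.3.3


def max_discrepancy(rows, multiplier=MULTIPLIER):
    """Largest absolute gap between a tabulated limit and multiplier * SD."""
    worst = 0.0
    for _material, s_r, s_R, r_limit, R_limit in rows:
        worst = max(
            worst,
            abs(multiplier * s_r - r_limit),
            abs(multiplier * s_R - R_limit),
        )
    return worst


print(f"largest discrepancy: {max_discrepancy(EX_6B):.4f}")
```

The same check works in the other direction: dividing a tabulated limit by 2.8 recovers the corresponding standard deviation to within rounding, which is how the example statements in this section tell the user to obtain it.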
31.9 A maximum value for the bias of a test method may be
31.7 . If multi-operator precision (23.1) as well as repeat- estimated by an analysis of the effect of apparatus and
ability and reproducibility has been evaluated, its variation procedural tolerances on the test results, as illustrated below:
among laboratories may be shown as in illustrative example Ex.9 Bias
Ex.7. Ex.9.1. Bias—Error analysis shows that the absolute value of the
maximum systematic error that could result from instrument and
Ex.7 Precision other tolerances specified in the test method is 3.2 % of the test
Average test value 100 g result.
95 % repeatability limit (within a labo- 7 % (6 to 8 %) of the test result
ratory) 31.10 Even when a quantitative statement on bias is not
95 % reproducibility limit (between lab- 15 % (13 to 16 %) of the test result
oratories)
possible, it is helpful to the user of the method to know that the
95 % limit (operator-to-operator, within 6 % to 15 % of the test result developers of the method have considered the possibility of
laboratory) bias. In such cases, a statement on bias based on one of the
Ex.7.1 The values shown above for the limits are, in each case, following examples may be used:
the average (and range) found in the interlaboratory study. The
terms, repeatability, reproducibility and operator-to-operator limit, Ex.10.1 Bias—This method has no bias because (insert the name
are used as specified in this practice. The respective standard devia- of the property) is defined only in terms of this test method.
ions may be obtained by dividing the above limit values by 2.8. Ex.10.2 Bias—Since there is no accepted reference material,
method, or laboratory suitable for determining the bias for the
procedure in this test method for measuring (insert the name of the
NOTE 5—Since the lower value for the operator-to-operator effect was property), no statement on bias is being made.
obtained in a laboratory that has a continuing training program for its Ex.10.3 Bias—No justifiable statement can be made on the bias
of the procedure in this test method for measuring (insert the name
operators, it appears that the operator-to-operator effect may be reduced by
of the property) because (insert the reason).
training. Furthermore, since the upper value for the operator-to-operator
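
The relation stated in Ex.7.1 (limit = 2.8 × standard deviation) can be checked against the values tabulated in Ex.6B. The following is a minimal sketch, assuming that 2.8 is the rounded form of 1.960·√2, the 95 % multiplier for the difference between two test results. The function names (`limit_from_sd`, `sd_from_limit`, `max_table_discrepancy`) are illustrative and are not part of this practice; the data rows are copied from Ex.6B.

```python
import math

# 95 % multiplier for the difference between two test results:
# 1.960 * sqrt(2) is about 2.77, which the examples round to 2.8.
EXACT_FACTOR = 1.960 * math.sqrt(2)
ROUNDED_FACTOR = 2.8


def limit_from_sd(sd, factor=ROUNDED_FACTOR):
    """Return the 95 % limit that corresponds to a standard deviation."""
    return factor * sd


def sd_from_limit(limit, factor=ROUNDED_FACTOR):
    """Invert the relation, as described in Ex.7.1."""
    return limit / factor


# Rows copied from Ex.6B:
# (material, repeatability SD, reproducibility SD,
#  repeatability limit, reproducibility limit)
EX6B = [
    ("A", 0.015, 0.114, 0.04, 0.32),
    ("B", 0.032, 0.052, 0.09, 0.14),
    ("C", 0.143, 0.196, 0.40, 0.55),
    ("D", 0.038, 0.074, 0.11, 0.21),
    ("E", 0.040, 0.063, 0.11, 0.18),
    ("F", 0.032, 0.209, 0.09, 0.58),
    ("G", 0.133, 0.243, 0.37, 0.68),
    ("H", 0.194, 0.585, 0.54, 1.64),
    ("I", 0.216, 1.104, 0.60, 3.09),
]


def max_table_discrepancy(rows, factor=ROUNDED_FACTOR):
    """Largest |tabulated limit - factor * SD| over both limits and all rows."""
    return max(
        abs(limit - limit_from_sd(sd, factor))
        for _, s_r, s_R, r, R in rows
        for sd, limit in ((s_r, r), (s_R, R))
    )


if __name__ == "__main__":
    for name, s_r, s_R, r, R in EX6B:
        print(f"{name}: r = {limit_from_sd(s_r):.2f} (table {r:.2f}), "
              f"R = {limit_from_sd(s_R):.2f} (table {R:.2f})")
    print(f"largest discrepancy: {max_table_discrepancy(EX6B):.3f}")
```

Because the tabulated standard deviations are themselves rounded, the recomputed limits can differ from the published ones in the last digit. The sketch therefore reports the size of the discrepancy rather than testing for exact equality.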
APPENDIX
(Nonmandatory Information)
X1. DESCRIPTIONS OF TERMS
X1.1 The following brief descriptions have been extracted
from the text. For fuller discussions of the concepts, see the
referenced sections.
X1.1.1 accepted reference value—a value that serves as an
agreed-upon reference for comparison. (Section 16)
X1.1.2 accuracy—a generic concept of exactness related to
the closeness of agreement between the average of one or more
test results and an accepted reference value. (Section 20)
X1.1.3 bias—a generic concept related to a consistent or
systematic difference between a set of test results from the
process and an accepted reference value of the property being
measured. (Section 19)
X1.1.4 observation or observed value—the most elemental
single reading or corrected reading obtained in the process of
making a measurement. (Section 7)
X1.1.5 precision—a generic concept related to the closeness
of agreement between test results obtained under prescribed
like conditions from the measurement process being evaluated.
(Section 18)
X1.1.6 repeatability—the closeness of agreement between
test results obtained under repeatability conditions.
X1.1.7 repeatability conditions—conditions under which
test results are obtained with the same test method in the same
laboratory, by the same operator with the same equipment, in
the shortest practical period of time, using test units or test
specimens taken at random, from a single quantity of material
that is as nearly homogeneous as possible.
X1.1.8 reproducibility—a general term for a measure of
precision applicable to the variability between single test
results obtained in different laboratories using test specimens
taken at random from a single sample of material. (Section 25)
X1.1.9 statistical control—a process is in a state of statistical
control if the variations between the observed test results
from it can be attributed to a constant system of chance causes.
(Section 17)
X1.1.10 test determination—(1) the process of calculating
from one or more observations a property of a single test
specimen, or (2) the value obtained from the process.
(Section 8)
X1.1.11 test method—a definitive procedure for the identification,
measurement, and evaluation of one or more qualities,
characteristics, or properties of a material, product, system, or
service that produces a test result. (Section 5)
X1.1.12 test result—the value obtained by carrying out the
complete protocol of the test method once, being either a single
test determination or a specified combination of a number of
test determinations. (Section 9)
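
The descriptions of repeatability, reproducibility, and bias in X1.1 can be made concrete with a small calculation. The following is a minimal sketch, assuming a balanced interlaboratory design (every laboratory reports the same number of test results on one material) and the usual one-way variance decomposition of the kind used in Practice E 691. The function names and the data are hypothetical and serve only as illustration; they are not taken from this practice.

```python
import math
from statistics import mean, variance


def precision_from_labs(labs):
    """Return (s_r, s_R) for one material.

    labs: list of per-laboratory lists of test results, all of length n >= 2.
    s_r is the repeatability standard deviation, s_R the reproducibility
    standard deviation.
    """
    n = len(labs[0])
    if len(labs) < 2 or n < 2 or any(len(lab) != n for lab in labs):
        raise ValueError("need >= 2 laboratories, each with the same n >= 2")
    # Repeatability variance: average of the within-laboratory variances.
    s_r2 = mean(variance(lab) for lab in labs)
    # Variance of the laboratory averages.
    s_xbar2 = variance([mean(lab) for lab in labs])
    # Between-laboratory component, floored at zero.
    s_L2 = max(0.0, s_xbar2 - s_r2 / n)
    return math.sqrt(s_r2), math.sqrt(s_L2 + s_r2)


def bias_estimate(labs, accepted_reference_value):
    """Average of all test results minus the accepted reference value."""
    all_results = [x for lab in labs for x in lab]
    return mean(all_results) - accepted_reference_value


if __name__ == "__main__":
    # Made-up results: three laboratories, three test results each.
    labs = [
        [10.0, 10.2, 9.8],
        [10.5, 10.7, 10.3],
        [9.5, 9.7, 9.3],
    ]
    s_r, s_R = precision_from_labs(labs)
    print(f"repeatability SD   s_r = {s_r:.3f}")
    print(f"reproducibility SD s_R = {s_R:.3f}")
    print(f"bias vs. reference 9.9 = {bias_estimate(labs, 9.9):+.3f}")
```

Flooring the between-laboratory component at zero keeps the reproducibility standard deviation from falling below the repeatability standard deviation when laboratory averages happen to agree more closely than the within-laboratory scatter would predict.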
ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.
This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.
This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or [email protected] (e-mail); or through the ASTM website
(www.astm.org).
