Analytica Chimica Acta 391 (1999) 127-134

Detection and quantification limits: origins and historical overview*

Lloyd A. Currie
National Institute of Standards and Technology, Gaithersburg, MD 20899, USA
Tel.: +1-301-975-3919; fax: +1-301-216-1134; e-mail: currie@nist.gov

Received 17 February 1998; accepted 18 February 1998

Abstract

Detection and quantification capabilities represent fundamental performance characteristics of measurement processes, yet there have been decades of confusion and miscommunication regarding the underlying concepts and terminology. New, coordinated documents prepared for the International Union of Pure and Applied Chemistry (IUPAC) [L.A. Currie, IUPAC Commission on Analytical Nomenclature, Recommendations in Evaluation of Analytical Methods including Detection and Quantification Capabilities, Pure Appl. Chem. 67 (1995) 1699-1723] and the International Organization for Standardization (ISO) [P. Wilrich, Chairman, ISO/DIS 11843-1,2 (1995), Capability of Detection, ISO/TC69/SC6, ISO Standard, 11843-1, 1997] promise to alleviate this situation by providing, for the first time, a harmonized position on standards and recommendations for adoption by the international scientific community. The text begins with (1) a brief historical summary of detection limits in chemistry, illustrating the critical need for the development of a sound and uniform system of terms and symbols; and (2) a review of the ISO-IUPAC deliberations and the ensuing harmonized position on concepts and nomenclature. In the following text a number of special topics are introduced, including: specification of the measurement process, attention to the meaning and evaluation of “sigma”, special considerations for calibration (or regression)-based detection and quantification limits, the central role of the blank, and finally, some challenges for the future. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Detection; Quantification; International nomenclature; Metrology; IUPAC; ISO

* [...] Meetings (American Statistical Association, 1997). Original title: “Foundations and future of detection and quantification limits”. Contribution of the National Institute of Standards and Technology; not subject to copyright.

0003-2670/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0003-2670(99)00105-1

1. Introduction

The importance of objective measures for detection and quantification capabilities of chemical and physical measurement processes has long been recognized. For some three decades, however, efforts to develop an internationally accepted system of nomenclature and scientifically sound conceptual basis for such capabilities have been impeded by the evolution of a wide range of terms, concepts, and ad hoc rules that are contradictory, and in some cases not even internally self-consistent. The need for a harmonized international position was recognized in 1993, when members of IUPAC and ISO met, with the objective of deriving such a position, based on documents then in draft stage. These efforts culminated in 1995, with the publication of IUPAC recommendations for the international chemical community [9], and the preparation of an ISO document for metrology in general [7]. While not identical, the documents are based on the same fundamental concepts, and compatible formulations and terminology. In the following text, we present these concepts, together with their application to chemical metrology as developed in the IUPAC document. A brief review of the relevant history of the topic, and discussions of critical approaches and open questions in the application of the basic concepts which follow, are drawn, in part, from Currie [2,3]. The literature cited in the foregoing references provides detailed information on the statistical and chemical aspects of the topic.

2. Communication and concepts

2.1. A very brief history

Early papers on chemical detection limits were published by Kaiser [10] and Currie [1]. In the latter, which treated the hypothesis testing approach to detection decisions and limits in chemistry, it was shown that the then current meanings attached to the expression “detection limit” led to numerical values that spanned three orders of magnitude, when applied to the same specific measurement process. Differing concepts are but one side of the problem. A review of the history of detection and quantification limits in chemistry published in 1988 [2] gives a partial compilation of terms, all referring to the same detection limit concept, ranging from “identification limit” to “limiting detectable concentration” to “limit of guarantee for purity”, for example. Perhaps the most serious terminological trap has been the use of the expression “detection limit” (a) by some to indicate the critical value ($L_C$) of the estimated amount or concentration above which the decision “detected” is made; but (b) by others, to indicate the inherent “true” detection capability ($L_D$) of the measurement process in question. The first, “signal/noise” school, explicitly recognizes only the false positive ($\alpha$, Type-1 error), which in effect makes the probability of the false negative ($\beta$, Type-2 error) equal to 50%. The second, “hypothesis testing” school employs independent values for $\alpha$ and $\beta$, commonly each equal to 0.05 or perhaps 0.01. In the most extreme case, the same numerical value of the “detection limit” is employed, e.g., $3.3\sigma_0$, where $\sigma_0$ is the standard deviation of the estimated net signal ($\hat{S}$) when its true value ($S$) is zero. The ratio $\beta/\alpha$ for the first (signal/noise) group, assuming normality, is thus 0.50/0.0005 or 1000, far in excess of the unit ratio of the second group!
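The factor of about 1000 quoted above can be checked numerically. The following is a minimal sketch, assuming a normally distributed net signal with constant standard deviation $\sigma_0$; the function name `error_rates` and the choice to work in units of $\sigma_0$ are illustrative and not taken from the cited documents.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def error_rates(critical_value, true_level, sigma0=1.0):
    """Return (alpha, beta) for a normal net signal with constant sigma0.

    alpha: probability that a blank (true signal 0) exceeds the critical value.
    beta:  probability that a signal at true_level falls at or below it.
    (Assumption: homoscedastic, normal errors; names are hypothetical.)
    """
    alpha = 1.0 - STD_NORMAL.cdf(critical_value / sigma0)
    beta = STD_NORMAL.cdf((critical_value - true_level) / sigma0)
    return alpha, beta


limit = 3.3  # the shared numerical "detection limit", in units of sigma0

# Signal/noise school: 3.3*sigma0 is used as the decision threshold and is
# also quoted as the detection capability.
alpha_sn, beta_sn = error_rates(critical_value=limit, true_level=limit)

# Hypothesis-testing school: 3.3*sigma0 is the detection limit; the decision
# threshold sits halfway, which makes alpha and beta equal.
alpha_ht, beta_ht = error_rates(critical_value=limit / 2, true_level=limit)

print(f"signal/noise:       alpha={alpha_sn:.5f} beta={beta_sn:.2f} "
      f"beta/alpha={beta_sn / alpha_sn:.0f}")
print(f"hypothesis testing: alpha={alpha_ht:.3f} beta={beta_ht:.3f} "
      f"beta/alpha={beta_ht / alpha_ht:.1f}")
```

Under these assumptions the first ratio comes out on the order of $10^3$ and the second is unity, which is the contrast drawn in the text.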
Unfortunately, many are unaware of the above subtle differences in concepts and terminology, or of the care and attention to assumptions required for calculating meaningful detection limits; but these become manifest when difficult, low-level laboratory intercomparisons are made. By way of illustration, in the International Atomic Energy Agency's interlaboratory comparison of arsenic in horse kidney (μg/g level), several laboratories failed to detect the As, yet their reported “detection limits” lay far below quantitative results reported by others; and the range of reported values spanned nearly five orders of magnitude ([2], Chapter 9).

¹ The regrettable confusion between the two types of “detection limits” arose in part from an oversight in Kaiser's first publication on the topic [10], which ignored the idea of the false negative. False negatives nevertheless occur, and when ignored (equivalent to equating $L_C$ and $L_D$) their de facto probability becomes 50%, at least for symmetric distributions. Kaiser corrected his error by inventing the term “limit of guarantee for purity” [11], but this term has scarcely ever appeared in the literature.

2.1.1. Stimuli for the recent IUPAC and ISO efforts

Early guidance on detection limits, strongly influenced by the work of Kaiser, was given by IUPAC. Resulting “official” recommendations appear in the early editions of the Compendium of Analytical Nomenclature, where the “limit of detection ... is derived from the smallest measure that can be detected with reasonable certainty for a given analytical procedure”. This quantity is then related to a multiple k of the standard deviation of the blank “according to the confidence level desired”, with the general recommendation of k=3. Hypothesis testing was not mentioned, nor was the issue of false negatives, nor the quantification limit [8,9]. Deficiencies in the “IUPAC definition” have been recognized by many [12], and are responsible, in part, for the new IUPAC effort [9].

A second important factor in the new IUPAC effort to develop a proper treatment of both detection and quantification limits, and in a parallel effort by ISO, was a set of formal requests by Codex² in 1990 to each of the organizations for guidance on the terms “limit of detection” and “limit of determination”, because of the urgency of endorsing methods of analysis based on their capabilities to reliably measure essential and toxic elements in foods, a problem that Codex had been attempting to resolve since 1982.

2.2. Summary of the 1995 international recommendations

Detection and quantification capabilities are considered by IUPAC as fundamental performance characteristics of the chemical measurement process (CMP). As such, they require a fully defined and controlled measurement process, taking into account such matters as types and levels of interference, and even the data reduction algorithm. Perhaps the most important purposes of these performance characteristics are for planning, i.e., for the selection or development of CMPs to meet a specific scientific or regulatory need. In the new IUPAC and ISO documents, detection limits (minimum detectable amounts) are derived from the theory of hypothesis testing and the probabilities of false positives $\alpha$ and false negatives $\beta$ [7,9]. Quantification limits [9] are defined in terms of a specified value for the relative standard deviation (RSD). Default values for $\alpha$ and $\beta$ are 0.05 each; and the default RSD for quantification is taken as 10%. As CMP performance characteristics, both detection and quantification limits are associated with underlying true values of the quantity of interest; they are not associated with any particular outcome or result.
The detection decision, on the other hand, is result-specific; it is made by comparing the experimental result with the critical value, which is the minimum significant estimated value of the quantity of interest.

For presentation of the defining relations, L is used as the generic symbol for the quantity of interest. This is replaced by S when treating net analyte signals; and x, when treating analyte concentrations or amounts. The symbol A is used by IUPAC to represent the sensitivity, or slope of the calibration curve; it is noted that A is not necessarily independent of x, nor even a simple function of x. Subscripts C, D, and Q are used to denote the critical value, detection limit, and quantification limit, respectively. The defining relations, with default parameter values in parentheses, are given as follows:

Detection decision (critical value) ($L_C$, $\alpha = 0.05$):

$$\Pr(\hat{L} > L_C \mid L = 0) \le \alpha$$

[...]

$$\Pr(\hat{L} > L_C \mid L = L_0) \le \alpha \qquad (10)$$

Further discussion of such a non-zero null is given in Chapter 1 of [2] in connection with discrimination limits (Section 3.3.1).

² Codex Alimentarius Commission, Committee on Methods of Analysis and Sampling, Food and Agricultural Organization, World Health Organization.

2.2.4. Signal and concentration domains

Formulation of the generic, defining “L” expressions for signal and concentration detection and quantification limits is given in [9], and extended in [3]. Treatment of $S_C$, $S_D$, and $S_Q$ is relatively straightforward; but some interesting problems arise in making the transition to the concentration domain, particularly when there is non-negligible uncertainty in the “sensitivity” A (slope of the calibration curve).
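The contrast between the two domains can be sketched numerically. The example below is a simplified illustration and not the procedure of [9] or [3]: it assumes a normal net signal with known, constant $\sigma_0$, takes $S_C = z_{1-\alpha}\sigma_0$, $S_D = S_C + z_{1-\beta}\sigma_0$ and $S_Q = 10\sigma_0$ (the 10% RSD default), and then divides $S_D$ by a slope drawn at random for each simulated calibration. All function names, the slope value and its standard deviation are hypothetical.

```python
import random
from statistics import NormalDist, median


def signal_limits(sigma0, alpha=0.05, beta=0.05, k_q=10.0):
    """Return (S_C, S_D, S_Q) for a normal net signal with constant sigma0.

    Assumptions (not from the source): sigma0 is known exactly and does
    not change with signal level.
    """
    z = NormalDist()
    s_c = z.inv_cdf(1.0 - alpha) * sigma0        # critical value
    s_d = s_c + z.inv_cdf(1.0 - beta) * sigma0   # detection limit
    s_q = k_q * sigma0                           # RSD = 1/k_q = 10 %
    return s_c, s_d, s_q


def concentration_limits(s_d, slope, slope_sd, n_curves, seed=0):
    """Convert S_D to concentration once per simulated calibration curve.

    Each curve gets its own estimated slope, drawn from N(slope, slope_sd),
    so the resulting "detection limits" differ from curve to curve.
    """
    rng = random.Random(seed)
    return [s_d / rng.gauss(slope, slope_sd) for _ in range(n_curves)]


s_c, s_d, s_q = signal_limits(sigma0=1.0)
x_d_values = concentration_limits(s_d, slope=2.0, slope_sd=0.2, n_curves=20001)

print(f"S_C={s_c:.3f}  S_D={s_d:.3f}  S_Q={s_q:.1f}  (units of sigma0)")
print(f"x_D with exact slope: {s_d / 2.0:.3f}")
print(f"x_D over simulated curves: min={min(x_d_values):.3f} "
      f"median={median(x_d_values):.3f} max={max(x_d_values):.3f}")
```

In the signal domain the three limits are fixed multiples of $\sigma_0$; once an estimated slope enters the denominator, the concentration-domain value becomes a quantity that varies from one calibration to the next, even though the simulated measurement process itself does not change.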
For the simplest (single component, straight line) calibration function the estimated concentration $\hat{x}$ is given by

$$\hat{x} = (y - \hat{B})/\hat{A} \qquad (11)$$

Although this is the simplest case, it nonetheless offers challenges, for example, in the application of error propagation for the estimation of the $\hat{x}$ distribution if $e_y$ is non-normal, and for the non-linear parts of the transformation (denominator of Eq. (11)). In [3], which treats a normal response only, three cases for error in the estimated slope of the calibration curve are considered: (1) negligible ($A$ known), (2) systematic ($e_A$ fixed), and (3) random ($e_A \sim N(0, V_A)$). The last of these is commonly employed to obtain “calibration-based” detection limits, using a technique introduced by Hubaux and Vos [6]. This method, however, has a drawback in that it yields “detection limits” that are random variables, as acknowledged in the original publication of the method ([6], p. 850); different results are obtained for each realization of the calibration curve, even though the underlying measurement process is fixed. IUPAC [9] suggests dealing with this problem by considering the median of the distribution of such “detection limits”. (The expected value is infinity.) An alternative, suggested in [9] and developed in [3], treats the exact, non-normal distribution of $\hat{x}$ in developing concentration detection limits.

3. Trans-definitional issues

3.1. Specification of the measurement (and evaluation) process

This is absolutely essential.
Quoting from Section 3.7.2 of IUPAC [9]: “Just as with other performance characteristics, $L_D$ and $L_Q$ cannot be specified in the absence of a fully defined measurement process, including such matters as types and levels of interference as well as the data reduction algorithm. ‘Interference free detection limits’ and ‘instrument detection limits’, for example, are perfectly valid within their respective domains; but if detection or quantification characteristics are sought for a more complex chemical measurement process, involving, for example, sampling, analyte separation and purification, and interference and matrix effects, then it is mandatory that all these factors be considered in deriving values for $L_D$ and $L_Q$ for that process. Otherwise the actual performance of the CMP (detection, quantification capabilities) may fall far short of the requisite performance”.

3.2. What is sigma?

Beyond the conceptual framework for detection and quantification limits, there is probably no more difficult or controversial issue than deciding just what to use for $\sigma_0$ or its estimate $s_0$. Though often ignored, the matter of heteroscedasticity must obviously be taken into account. But it is crucial also to take into account the complete error budget of the specified measurement process. If this is done successfully, and if all assumptions are satisfied, the apparent dichotomy between the intralaboratory approach and the interlaboratory approach to detection and quantification limits essentially vanishes.³ Some perspective on this issue is given in Section 4.1 of [9] (quoted below), in terms of sampled and target populations.

³ In practical application, the complexity of defining and estimating $\sigma$ goes considerably deeper. For a recent informative and provocative discussion of this and related regulatory-detection issues, the reader may consult [4].

Fig. 1. Sampled [S] and target [T] populations.

Sampled population [S] vs. target population [T]

The above concept has been captured by Natrella [14] with reference to two populations which represent, respectively, the population (of potential measurements) actually sampled, and that which would be sampled in the ideal, bias-free case. The corresponding S and T populations are shown schematically in Fig. 1, for a two-step measurement process. When only the S population is randomly sampled (left-hand side of the figure), the error $e_1$ from the first step is systematic while $e_2$ is random. In this case, the estimated uncertainty is likely to be wrong, because (a) the apparent imprecision ($\sigma_S$) is too small, and (b) an unmeasured bias ($e_1$) has been introduced. Realization of the T-population (right-hand side of the figure) requires that all steps of the CMP be random, i.e., $e_1$ and $e_2$ in the figure behave as random, independent errors; T thus represents a compound probability distribution. If the contributing errors combine linearly and are themselves normal, then the T-distribution also is normal. The concept of the S and T populations is absolutely central to all hierarchical measurement processes (compound CMPs), whether intralaboratory or interlaboratory. Strict attention to the concept is essential if one is to obtain consistent uncertainty estimates for compound CMPs involving different samples, different instruments, different operators, or even different methods. In the context of (material) sampling, an excellent exposition of the nature and importance of the hierarchical structure has been presented by Horwitz [5].

From the interlaboratory perspective, the first population in Fig. 1 ($e_1$) would represent the distribution of errors among laboratories; the second [S] would reflect intralaboratory variation (“repeatability”); and the third [T], overall variation (“reproducibility”). If the sample of collaborating laboratories can be taken as unbiased, representative, and homogeneous, then the interlaboratory “process” can be treated as a compound CMP. In this fortunate (at best, asymptotic) situation, results from individual laboratories are considered random, independent variates from the compound CMP population. For parameter estimation (means, variances) in the interlaboratory environment it may be appropriate to use weights, for example, when member laboratories employ different numbers of replicates [13].

4. Central role of the blank ([9], Section 3.8.8)

4.1. Blank (B)

The blank is one of the most crucial quantities in trace analysis, especially in the region of the detection limit. In fact, as shown above, the distribution and standard deviation of the blank are intrinsic to calculating the detection limit of any CMP. Standard deviations are difficult to estimate with any precision (ca. 50 observations required for 10% RSD for the standard deviation). Distributions (cdfs) are harder! It follows that extreme care must be given to the minimization and estimation of realistic blanks for the overall CMP, and that an adequate number of full scale blanks must be assayed, to generate some confidence in the nature of the blank distribution and some precision in the estimate of the blank RSD.

Note: An imprecise estimate for the blank standard deviation is taken into account without difficulty in detection decisions, through the use of Student's t. Detection limits, however, are themselves rendered imprecise if $\sigma_B$ is not well known (see [9]).
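The figure of roughly 50 observations for a 10% RSD of the standard deviation can be checked with a short calculation. This is a sketch assuming independent, normally distributed blanks; it uses the large-sample approximation RSD(s) ≈ 1/sqrt(2ν) for ν degrees of freedom, alongside the exact ratio based on the expected value of s for normal data. The function names are illustrative only.

```python
from math import exp, lgamma, sqrt


def rsd_of_s_approx(dof):
    """Large-sample relative standard deviation of s (normal data)."""
    return 1.0 / sqrt(2.0 * dof)


def rsd_of_s_exact(dof):
    """Exact relative standard deviation of s for normal data.

    c4 = E[s]/sigma = sqrt(2/dof) * Gamma((dof+1)/2) / Gamma(dof/2);
    lgamma is used so that large dof do not overflow.
    """
    c4 = sqrt(2.0 / dof) * exp(lgamma((dof + 1) / 2.0) - lgamma(dof / 2.0))
    return sqrt(1.0 - c4 * c4) / c4


def dof_for_target_rsd(target):
    """Smallest number of degrees of freedom with exact RSD(s) <= target."""
    dof = 1
    while rsd_of_s_exact(dof) > target:
        dof += 1
    return dof


for dof in (5, 10, 20, 50, 100):
    print(f"dof={dof:4d}  approx RSD={rsd_of_s_approx(dof):.3f}  "
          f"exact RSD={rsd_of_s_exact(dof):.3f}")
print("dof needed for 10% RSD:", dof_for_target_rsd(0.10))
```

With only a handful of blanks the estimated standard deviation is itself uncertain by tens of percent under these assumptions, which is why detection limits built on it inherit that imprecision.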
Blanks or null effects may be described by three different terms, depending upon their origin: the instrumental background is the null signal (which for certain instruments may be set to zero, on the average) obtained in the absence of any analyte- or interference-derived signal; the (spectrum or chromatogram) baseline comprises the summation of the instrumental background plus signals in the analyte (peak) region of interest due to interfering species; the chemical (or analyte) blank is that which arises from contamination from the reagents, sampling procedure, or sample preparation steps, which corresponds to the very analyte being sought.

Assessment of the blank (and its variability) may be approached by an “external” or “internal” route, in close analogy to the assessment of random and systematic error components [2]. The “external” approach consists of the generation and direct evaluation of a series of ideal or surrogate blanks for the overall measurement process, using samples that are identical or closely similar to those being taken for analysis, but containing none of the analyte of interest. The CMP and matrix and interfering species should be unchanged. (The surrogate is simply the best available approximation to the ideal blank, i.e., one having a similar matrix and similar levels of interferants.) The “internal” approach has been described as “propagation of the blank”. This means that each step of the CMP is examined with respect to contamination and interference contributions, and the whole is then estimated as the sum of its parts, with due attention to differential recoveries and variance propagation. This is an important point: that the blank introduced at one stage of the CMP will be attenuated by subsequent partial recoveries.
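A minimal sketch of “propagation of the blank” follows. It assumes that the blank contributions of the individual steps are independent, that each step's recovery is known exactly, and that a contribution introduced at one step is attenuated only by the recoveries of the steps after it. The step names, amounts and recoveries are invented purely for illustration.

```python
from dataclasses import dataclass
from math import prod, sqrt


@dataclass
class Step:
    name: str
    blank_mean: float  # analyte introduced at this step (arbitrary units)
    blank_sd: float    # standard deviation of that contribution
    recovery: float    # fraction of incoming analyte surviving this step


def propagate_blank(steps):
    """Return (mean, sd) of the overall blank at the end of the process.

    Each contribution is scaled by the product of the recoveries of all
    later steps; variances of the scaled contributions are summed
    (assumption: independent contributions, exactly known recoveries).
    """
    total_mean = 0.0
    total_var = 0.0
    for i, step in enumerate(steps):
        downstream = prod(s.recovery for s in steps[i + 1:])
        total_mean += step.blank_mean * downstream
        total_var += (step.blank_sd * downstream) ** 2
    return total_mean, sqrt(total_var)


# Hypothetical three-step process.
process = [
    Step("sampling", blank_mean=2.0, blank_sd=0.5, recovery=1.0),
    Step("separation", blank_mean=1.0, blank_sd=0.3, recovery=0.8),
    Step("purification", blank_mean=0.5, blank_sd=0.2, recovery=0.5),
]

overall_mean, overall_sd = propagate_blank(process)
print(f"overall blank = {overall_mean:.3f} +/- {overall_sd:.3f}")
```

In this sketch the early sampling blank, although the largest contribution when introduced, is reduced by the later separation and purification recoveries before it reaches the final measurement.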
Neither the internal nor the external approach to blank assessment is easy, but one or the other is mandatory for accurate low-level measurements; and consistency (internal, external) is requisite for quality. Both approaches require expert chemical knowledge concerning the CMP in question.

5. Some future challenges

The “real world” holds many complexities and challenges beyond the preceding, relatively straightforward univariate, linear, and monotonic perspective. Very real detection and quantification capability issues in the environmental and physical sciences, for example, include: (1) multiple univariate and multivariate detection decisions and limits as found, for example, in the analysis of multicomponent nuclear spectra; (2) multivariate environmental blanks, especially in the measurement and source apportionment of multiisotopic chemical species, such as atmospheric carbon monoxide; and (3) the appropriate treatment of detection and quantification limits for archaeological or geophysical artifacts or events that are governed by discrete, non-linear, and non-monotonic calibration curves, as in the case of radiocarbon ($^{14}$C) dating. An introduction and references to some of these problems in the physical and environmental sciences may be found in [3].

Acknowledgements

The author gratefully acknowledges the work of Dr. David E. Coleman of Alcoa Laboratories, who served as Voluntary Peer Reviewer for the original version of this paper.

Appendix A. Detection of earthquake precursors⁴

A graphical representation of these concepts is given in Fig. 2, where the “driving force” in this hypothetical example is the ability to detect the release of specific chemical precursors of earthquakes (e.g. radon) at levels corresponding to earthquakes of magnitude $L_R$ and above. Thus $L_R$ is the “requisite limit” or maximum acceptable limit for undetected earthquakes; this is driven, in turn, by a maximum acceptable loss to society.
(Derivation of $L_R$ values for sociotechnical problems, of course, is far more complex than the subject of this report!) The lower part of the figure shows the minimum detectable value for the chemical precursor $L_D$, that must not exceed $L_R$, and its relation to the probability density functions (pdfs) at $L=0$ and at $L=L_D$ together with $\alpha$ and $\beta$, and the decision point (critical value) $L_C$. The figure has been purposely constructed to illustrate heteroscedasticity (in this case, variance increasing with signal level) and unequal $\alpha$ and $\beta$. The point of the latter construct is that, although 0.05 is the recommended default value for these parameters, particular circumstances may dictate more stringent control of the one or the other.

⁴ This example is adapted directly from Section 3.7.3 of [9].

Fig. 2. Detection: needs and capabilities. Top portion shows the requisite limit $L_R$, bottom shows detection capability $L_D$. (Axis labels: Loss, Richter Scale, Net Signal.)

Instructive implicit issues in this example are that (1) a major factor governing the detection capability could be the natural variation of the radon background (blank variance) in the environment sampled, and (2) a calibration factor or function is needed in order to couple the two abscissae in the diagram. In principle, the response of a sensing instrument could be calibrated directly to the Richter scale (earthquake magnitude); alternatively, there could be a two-stage calibration: instrument response to radon concentration, and radon concentration to Richter scale.

References

[1] L.A. Currie, Limits for qualitative detection and quantitative determination, Anal. Chem. 40 (1968) 586.
[2] L.A. Currie (Ed.), Detection in Analytical Chemistry: Importance, Theory, and Practice, ACS Symposium Series, vol. 361, American Chemical Society, Washington, DC, 1988.
[3] L.A. Currie, in: International Conference on Environmetrics and Chemometrics, Las Vegas, NV, 1995, Detection: international update, and some emerging di-lemmas involving calibration, the blank, and multiple detection decisions, Chemom. Intell. Lab. Syst. 37 (1997) 151-181.
[4] R.D. Gibbons, Some statistical and conceptual issues in the detection of low-level environmental pollutants, Environ. Ecol. Statist. 2 (1995) 125-167; [Discussants: K. Cambell, W.A. Huber, C.E. White, H.D. Kahn, B.K. Sinha, W.P. Smith, H. Lacayo, W.A. Telliard, C.B. Davis].
[5] W. Horwitz, IUPAC Commission on Analytical Nomenclature, Nomenclature for Sampling in Analytical Chemistry, Pure Appl. Chem. 62 (1990) 1193.
[6] A. Hubaux, G. Vos, Decision and detection limits for linear calibration curves, Anal. Chem. 42 (1970) 849-855.
[7] P. Wilrich, Chairman, ISO/DIS 11843-1,2 (1995), Capability of Detection, ISO/TC69/SC6, ISO Standard, 11843-1, 1997.
[8] IUPAC, Orange book, in: H. Freiser, G.H. Nancollas (Eds.), Compendium of Analytical Nomenclature, 2nd ed., Blackwell, Oxford; 1st ed., 1978.
[9] IUPAC, prepared by L.A. Currie, IUPAC Commission on Analytical Nomenclature, Recommendations in Evaluation of Analytical Methods including Detection and Quantification Capabilities, Pure Appl. Chem. 67 (1995) 1699-1723.
[10] H. Kaiser, Die Berechnung der Nachweisempfindlichkeit, Spectrochim. Acta 3 (1947) 40.
[11] H. Kaiser, Zum Problem der Nachweisgrenze, Z. Anal. Chem. 209 (1968) 1.
[12] G.L. Long, J.D. Winefordner, Limit of detection: a closer look at the IUPAC definition, Anal. Chem. 55 (7) (1983) 713A.
[13] J. Mandel, R.C. Paule, Interlaboratory evaluation of a material with unequal numbers of replicates, Anal. Chem. 42 (1970) 1196.
[14] M. Natrella, Experimental Statistics, NBS Handbook 91, US Government Printing Office, Washington, 1963.
[15] IUPAC, J. Inczédy, T. Lengyel, A.M. Ure, A. Gelencsér, A. Hulanicki (Eds.), Compendium of Analytical Nomenclature, 3rd ed., Blackwell, Oxford, 1998.

Note added in proof

Shortly after the submission of this article to Analytica Chimica Acta a new edition of the IUPAC Compendium of Analytical Nomenclature was published [15]. In the new edition the deficiencies in the [old] “IUPAC Definition” of the Detection Limit, as discussed in Section 2.1.1, were remedied by incorporation of the new IUPAC Recommendations of 1995 [9].

