Analytical Method Equivalency



An Acceptable Analytical Practice
Don Chambers, Gregg Kelly, Giselle Limentani, Ashley Lister, K. Rick Lung, and Ed Warner

During pharmaceutical development and postapproval, it is often necessary to change analytical methods to ensure they remain stability-indicating, take advantage of improved analytical technology, monitor new related substances as a result of changes in synthetic or formulation processes, and improve analytical efficiency (e.g., through automation). When changes are made, pharmaceutical manufacturers should demonstrate that the new method produces results equivalent to those produced by the previous method.

Complicating this task are questions that often arise about the types of changes that will require equivalency assessment as well as the manner in which equivalency must be determined.
Participants in a 2003 PhRMA workshop present the industry's current thinking on developing analytical method equivalency, including the importance of sample selection, acceptance criteria, data evaluation, and documentation.

At the 2003 Pharmaceutical Research and Manufacturers of America (PhRMA) Workshop on Acceptable Analytical Practices with regard to analytical method equivalency, participants generally agreed that equivalency and comparability are not the same and that equivalency is a subset of comparability. Method comparability could be used to assess similarities as well as differences and typically involves evaluating an entire profile of method performance and not just one specific result. The goal of method equivalency is to demonstrate acceptable method performance by comparison of a specific set of results (e.g., an assay).
The scope of this acceptable analytical practice on analytical method equivalency applies only to small-molecule pharmaceutical products. It may not be applicable to biotechnology products, changes in method parameters that are already within validated ranges, modifications of methods during early development, or method improvements that are intended to have different characteristics.

Don Chambers, PhD,* is the senior director of analytical development for Schering-Plough Research Institute, 2000 Galloping Hill Road, Kenilworth, NJ 07033, tel. 908.740.2318, fax 908.740.2107, donald.[email protected]. Gregg Kelly, PhD, is a senior principal scientist for Pfizer, Inc. (Groton, CT). Giselle Limentani, PhD, is a director of product development for GlaxoSmithKline (Research Triangle Park, NC). Ashley Lister, PhD, is an associate director for Purdue Pharma LP (Ardsley, NY). K. Rick Lung, PhD, is a senior scientist for AstraZeneca LP (Wilmington, DE). Ed Warner is a director of statistical services, Global Quality, for Schering-Plough (Union, NJ). *To whom all correspondence should be addressed. Submitted: Oct. 1, 2004. Accepted: Feb. 28, 2005.

Validation versus equivalency
The nature of validation testing is different from that of equivalency testing. According to the International Conference on Harmonization (ICH) guideline Q2A, "The objective of validation of an analytical procedure is to demonstrate that it is suitable for its intended purpose" (1). In practice, validation typically determines the quality of a single analytical method. Equivalency demonstrates the sameness of two analytical methods. Two independently validated methods are not inherently equivalent. Two methods may be validated in the same manner, with the same criteria, and still yield nonequivalent data, because validation
experiments are not designed to detect differences in methods.

The distinction between validation and equivalency is recognized in the guidelines for equivalency testing proposed in the Pharmacopeial Forum (2). The US Pharmacopeia (USP) maintains a well-established chapter about method validation, and the compendial proposal is to treat the two subjects separately. That the new proposal appears in a compendial publication is evidence that the regulatory landscape may be changing toward a more structured and widespread application of equivalency testing. Equivalency testing also may be gaining momentum from manufacturing units and quality control laboratories. These units more frequently require demonstration of analytical equivalency before inaugurating a new method in an established product line to avoid a shift in results. Whenever trending is an important consideration in the evaluation of data, such as in product stability assessments, method equivalency may offer an advantage over validation alone. Equivalency can be an imperative consideration if a discontinuity in stability trending would indicate a different product expiry. It is also important to know whether a new method could produce results crossing acceptance criteria limits compared with results generated by an existing method. If a new method requires a change outside the validated range of an existing method, then it will require additional validation. Depending on the degree of change, equivalency testing may be appropriate. For example, if a new purity method describes a temperature change for a high-performance liquid chromatography (HPLC) column, then the effect on peak shape for low-level impurities may be important. Nonequivalent results can even be obtained when harmonizing two compendial methods. Equivalency testing is appropriate in these cases. Application of equivalency testing is typically more frequent and more formal in Phase III (or postregistration), when it is important to carefully understand how changes to analytical methods affect results. The sidebar, "Summary of method equivalency application," summarizes appropriate times to apply equivalency testing.

Summary of method equivalency application
Equivalence testing is recommended when:
• trending is important;
• in the late-development stage (Phase III) or postregistration;
• there is potential impact on results caused by a method change.
It is less important to apply equivalence testing when:
• trending is unimportant;
• there is no expectation of equivalent results (e.g., a change to the formulation or test method that causes an intentionally different performance and/or analytical result);
• method changes do not affect results;
• in an early development stage (Phase I or II), when one is making deliberate process changes and optimizing methods.

Nonetheless, many researchers believe that validating a new analytical method is
sufficient and that equivalency testing is
never needed. Currently, there are no reg-
ulatory requirements or guidance for
equivalency testing. Some feel that adding
“voluntary” work to heavy work demands
can be overly burdensome. Others apply
validation of a new analytical method as
a means of equivalency testing, especially
when the same acceptance criteria are used
as in the original method validation.
Sometimes equivalency testing adds lit-
tle value to a drug development process
(see sidebar, “Summary of method equiv-
alency application”). When there is a de-
liberate change in the formulation or
method producing an expected and de-
sired change in analytical results, equiva-
lency testing is unnecessary. During Phase
I and Phase II, equivalency testing is typ-
ically informal if performed at all. One ex-
ample might be a change in paddle speed
for a dissolution method to address low
recovery or to enhance discrimination be-
tween various formulations or processing
schemes. Alternatively, if a method change
has a negligible influence on the accuracy
or precision of the method, then equiva-
lency testing may not be needed. For ex-
ample, the only change to an HPLC
method might be to reduce the flow rate
from 1.5 mL/min to 1.0 mL/min. This
change may dramatically improve the resolution of a closely eluting pair of impurities without significantly affecting the corresponding quantitative results. In this case, equivalency testing is unwarranted. Whenever there is little reason to link to historical data, equivalency testing has limited value.

Although less frequently observed, equivalency testing with limited or no validation has occurred in certain situations. For example, when an automated dissolution method is patterned after a validated manual dissolution method, equivalency testing alone may be sufficient. If any validation were needed, it would only involve affected areas such as filter differences. Depending on the degree of change and the analytical effect, equivalency testing sometimes is performed with limited or no revalidation for small changes to HPLC columns, for example column dimensions or a change in manufacturer.

Table I: Typical acceptance criteria for method transfer.
Method type                      | Acceptance criteria
Assay, content uniformity        | ±2%
Dissolution                      | ±5%
Impurities, degradation products | ±15–20% and/or comparison of profiles

Choice of samples
The choice of samples used in equivalency testing is important to obtain the best results. Homogeneous lots should be selected so that equivalency testing can distinguish method variability from sample variability (i.e., batch uniformity). To the extent possible, various representative lots covering a wide spectrum of results should be identified. Stability samples used beyond testing windows or expiry periods are appropriate sample choices. Other well-selected samples for equivalence testing would be samples that are intentionally produced by a development facility to justify a regulatory test method (i.e., aberrant samples). Comparing results from samples that will yield a range of values will provide the greatest assurance that methods are equivalent.

For cases in which a product is uniform and stable and results indicate little change over time, a sample set may be selected for equivalency testing that represents a reasonable range of potential results. Less emphasis on aberrant samples may be appropriate. Testing out-of-specification samples that fall just outside of criteria is worthwhile if samples are available. Samples with narrow result ranges should be avoided when possible.

Some sample choices should be avoided for compliance reasons, including any sample that would generate a redundant GMP result. For example, the practice of applying two methods at one or more time points in an ongoing stability program to gain confidence in method equivalence is discouraged. If method equivalency has been appropriately demonstrated, then crossover testing should not be necessary.

Documentation
There are three current documentation
practices: no documentation, standard op-
erating procedures, and protocols. Gener-
ally within the industry, standard operat-
ing procedures for method equivalency do
not exist—perhaps owing to a lack of a con-
sensus on the need for equivalency testing
and a reflection of the evolving nature of its
application. Typically, protocols are applied
during the later stages of product life (i.e.,
Phase III and after registration). When
equivalency testing is formally executed,
protocols are the most commonly used tool.
The methods under evaluation and the pro-
cedure for evaluating equivalence are de-
scribed in the written document. As with
other uses of protocols, acceptance criteria
are set, and the paperwork is signed before
experimentation is conducted. Nonethe-
less, such protocols could be substituted by
a well-defined standard operating proce-
dure, with project-specific details outlined
in a prospective plan documented in a note-
book by the involved laboratory. Documen-
tation is typically minimized or not used
when equivalency testing is informal.
Informal means are appropriate in Phases I and II, but formal documentation is sometimes available even in early development.

Acceptance criteria
The intent of method equivalency is to demonstrate that one method performs within an acceptable range of a second method. Assigning acceptance criteria is the act of deciding what may be an acceptable difference. Unfortunately, deciding what constitutes an important difference between results is difficult.

The selection of appropriate acceptance criteria to demonstrate method equivalency can be a challenging and somewhat daunting task. Choosing acceptance criteria involves asking the question, What is the goal of this exercise? Are you attempting to demonstrate that two methods are the same, or do you want to show that the method is acceptable? The answers to these questions may affect how your acceptance criteria are selected. Do you want to set your acceptance criteria so that you are making the most accurate and precise measurement possible (i.e., what you can achieve)? Or do you select acceptance criteria solely on the basis of being as accurate and precise as needed? Occasionally, those two points coincide, but sometimes the difference between what is needed and what can be achieved is significant.

How to set acceptance criteria
Setting acceptance criteria is important for many reasons. If criteria are too restrictive, results may fail the study even if they are acceptable. If criteria are set too wide, then the equivalence of two methods may not be adequately demonstrated. The PhRMA workshop included in-depth discussions about the means of setting acceptance criteria. The following highlights some of the breakout sessions on this topic.

Regardless of the manner in which acceptance criteria are selected, certain requirements must be met. All data must continue to support the previously established specifications, accommodating both release and stability specifications, if different. As previously discussed, sample selection is critical (see "Choice of samples" section). With an injudicious sample choice, the test may fail the acceptance criteria on the basis of sample variability and not because of the performance of the method.

Workshop participants agreed that popular metrics involve comparing mean values produced by the two methods relative to the predefined criteria shown in Table I. Although, to some extent, attendees discussed the appropriateness of this practice, their collaborative opinion was that it is a common industry practice.

A second group represented at the meeting expressed a desire to set acceptance criteria derived from a statistical analysis. Others felt uncomfortable using statistics because they either lacked practical guidance on its application, did not possess a fundamental understanding of statistics, or were not clear about the regulatory requirements.

Potential problems with Student's t test as an equivalence test
Student's t test often is applied to compare results from a new analytical test method with those from another test method. In any statistical test, two hypotheses can be identified. The null hypothesis expresses the widely held or conservative viewpoint. The alternative hypothesis expresses a view that would lead to a modification of the current understanding or require that an action be taken that otherwise would not be. Typically, the following hypotheses are set up in Student's t test:

H0: μ1 = μ2
Ha: μ1 ≠ μ2

in which H0 and Ha are the null and alternate hypotheses, and μ1 and μ2 are the population mean values.

These hypotheses show mathematically that Student's t test assumes the mean values of two populations are equal and requires the experimenter to prove that they are different. By convention, the null hypothesis is assumed to be correct, so Student's t test evaluates the evidence against this assumption. As a consequence, these hypotheses imply that the experimenter is attempting to prove with high confidence that the mean values of two populations are different rather than attempting to prove they are the same. This is what renders the straightforward Student's t test less appropriate for demonstrating equivalence.

Because the statistical test is based on the assumption that the variances of the two populations are equal, caution must be taken before conducting the statistical test for differences in means to ensure that the two sets of data being compared for equivalence are comparable in precision.
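As an illustration of this point, the short Python sketch below (provided for illustration only, with hypothetical assay values; it is not part of the original workshop material) sets up the conventional two-sample t test together with a simple F-ratio check that the two data sets are comparable in precision:

```python
# Illustrative sketch: conventional two-sample Student's t test (H0: mu1 = mu2).
# The assay values below are hypothetical and used only for demonstration.
import numpy as np
from scipy import stats

method_a = np.array([99.8, 100.1, 99.9, 100.0, 100.2, 99.9])   # hypothetical results
method_b = np.array([99.6, 99.9, 100.0, 99.7, 99.8, 99.9])     # hypothetical results

# Check that the two data sets are comparable in precision before pooling variances.
f_ratio = np.var(method_a, ddof=1) / np.var(method_b, ddof=1)
p_var = 2 * min(stats.f.sf(f_ratio, len(method_a) - 1, len(method_b) - 1),
                stats.f.cdf(f_ratio, len(method_a) - 1, len(method_b) - 1))

# Pooled two-sample t test: tests H0: mu1 = mu2 against Ha: mu1 != mu2.
t_stat, p_means = stats.ttest_ind(method_a, method_b, equal_var=True)

print(f"variance ratio F = {f_ratio:.2f}, two-sided p = {p_var:.2f}")
print(f"t = {t_stat:.2f}, two-sided p = {p_means:.2f}")
# A p-value above 0.05 here means only "insufficient evidence of a difference";
# it does not prove that the two methods are equivalent.
```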
Table II: An example of two sets of highly precise laboratory data.
Method                | Assay results                         | Mean  | Standard deviation
Manual test method    | 100.0, 99.9, 100.0, 99.9, 99.9, 100.1 | 100.0 | 0.08
Automated test method | 99.8, 99.8, 99.7, 99.7, 99.8, 99.8    | 99.8  | 0.05

Table III: An example of two sets of imprecise laboratory data.
Method   | Assay results          | Mean | Standard deviation
Method 1 | 82, 92, 87, 85, 85, 86 | 86   | 3.3
Method 2 | 77, 78, 87, 79, 89, 80 | 82   | 5.0

Failure to meet the acceptance criteria
Failure to meet predetermined acceptance criteria should trigger a review of the equivalence study to identify an assignable cause for the results. A thorough review of the method should be conducted to determine whether anything inherent about the method has caused the problem. The acceptance criteria also should be reviewed to determine whether they were set properly and whether they were appropriate for the purpose of the test. This evaluation also should include a decision on whether the difference between two methods is important. If the difference is significant and unacceptable, then the new method must be rejected. If a method equivalency study has failed to meet the predefined acceptance criteria, then decisions on how to proceed should be based on a complete review of relevant data and sound scientific judgment.

Data evaluation
Approaches for establishing equivalence typically are based on the observed difference between the means of data sets developed with each of the methods being compared. The size of the observed difference is gauged against a predefined limit established before the data are collected. Researchers participating in the workshop acknowledged that this comparison of results may or may not involve the use of statistical methods.

Some companies are not using statistics to evaluate equivalence data, while others are considering the use of statistical methods but do not have adequate support from a statistician.
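As a quick arithmetic check (an illustrative sketch added here, not part of the original article), the summary statistics in Tables II and III can be reproduced directly from the tabulated assay results:

```python
# Reproduce the means and standard deviations reported in Tables II and III.
import numpy as np

table_ii = {
    "Manual test method":    [100.0, 99.9, 100.0, 99.9, 99.9, 100.1],
    "Automated test method": [99.8, 99.8, 99.7, 99.7, 99.8, 99.8],
}
table_iii = {
    "Method 1": [82, 92, 87, 85, 85, 86],
    "Method 2": [77, 78, 87, 79, 89, 80],
}

for name, values in {**table_ii, **table_iii}.items():
    x = np.asarray(values, dtype=float)
    # ddof=1 gives the sample standard deviation used in the tables.
    print(f"{name}: mean = {x.mean():.1f}, s = {x.std(ddof=1):.2f}")
# Within rounding, this prints 100.0/0.08, 99.8/0.05, 86.2/3.31, and 81.7/5.05,
# matching the tabulated values of 100.0/0.08, 99.8/0.05, 86/3.3, and 82/5.0.
```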


Many of the participants that are using statistical methods indicated that they use Student's t test to evaluate method equivalence (3). Other companies are using statistical approaches that are based on those used in bioequivalence studies (4). Companies using a statistical approach do not limit the practice of statistics to data evaluation; they also include statistical principles to plan the study to ensure an accurate outcome. This approach includes calculating appropriate sample sizes, specifying acceptance criteria, selecting and handling samples, and developing appropriate experimental designs to reduce the influence of potentially confounding factors.

Although Student's t test is a popular approach, there are significant problems associated with its use as an equivalence test. Student's t test is designed to detect a difference of means, not to establish whether the means are "similar" based on a measure of practical importance. Therefore, the test may not be the proper method to reflect the intent of the investigator. Small studies with imprecise results often lead to difficulties in detecting true differences, should they truly exist. Large studies with precise data can correctly lead to the conclusion that a difference does exist,
even when the size of the difference may
be inconsequential. The sidebar, “Poten-
tial problems with Student’s t test as an
equivalence test” provides examples re-
garding the potential problems associated
with this statistical method.
The problem is that the two-sample t
test has an inappropriate hypothesis for
establishing equivalence. In a statistical hy-
pothesis test, only the alternative hypoth-
esis can be statistically proven. Though
counterintuitive, failure to reject the null
hypothesis does not prove that the null hy-
pothesis condition truly exists. In the
equivalence problem, equivalence is proven
or demonstrated. To accomplish this, the
statement of the problem must be reversed:
The appropriate statistical approach is one
designed to determine whether the differ-
ences between two methods are within a
set of boundaries (upper and lower lim-
its) predesignated by the investigator as
being practically relevant. A statistical test
is conducted independently at both the
upper and lower end to determine whether
there is sufficient evidence to reject the hy-
pothesis that the difference between the
methods is as large or larger than this limit.
The resulting statistical hypotheses can be
tested with two one-sided t tests—a
method known in the pharmaceutical in-
dustry as Schuirmann’s two one-sided test.
This test is currently used by several
PhRMA member companies. The sidebar,
“Statistical equivalence tests” details this
statistical method and demonstrates how
this approach would help in the interpre-
tation of the same examples presented in
the sidebar, “Potential problems with Stu-
dent’s t test as an equivalence test.”
Schuirmann’s two one-sided test has
been used to assess bioequivalence data for
years (4, 5). In bioequivalence tests, the
statement of the problem is whether two
formulations (e.g., a generic product and
the innovator product) can be considered
equivalent. In practice, equivalence is
determined by comparing the clinical endpoints measured against predetermined limits. Predetermined limits are established as a practical range within which the conclusion of bioequivalence is supported (e.g., 80–125%). Schuirmann's two one-sided test is not limited to the bioequivalence problem but is easily extended to other applications such as method transfers or the comparison of manufacturing process results after process modifications. The test also has been used to determine equivalence between robotic and manual assays as well as content uniformity methods for tablets (6). This approach to comparing results from two methods was outlined recently as part of the in-process revision in Pharmacopeial Forum for USP General Chapter <1010> "Analytical Data: Interpretation and Treatment" (2). Incorporating the variability of the method and prespecifying the criteria for equivalence allows the sample sizes for the study to be calculated in advance. This approach can be expanded further by incorporating distinctions between an unacceptable- and an acceptable-sized difference and then using power calculations (7). Some companies reported taking this expanded approach (8).

The two one-sided test requires prespecified criteria. Most PhRMA companies consider the selection of acceptance limits for establishing equivalence to be challenging. Some companies derive limits from method validation data, while most companies use widely held conventions (e.g., 2% for content assays; 5% for dissolution assays). Although these values may be acceptable in some cases, they may be too loose or too tight as methods and specifications evolve.

Common approaches to selecting acceptance criteria include:
• basing the criteria on the variability of the assay;
• attempting to compare methods in conjunction with specifications that have been calculated from process capability limits;
• basing criteria on commonly accepted conventions from other applications.
Each of these approaches has potential pitfalls, so caution should be exercised. In most cases, the selection of acceptance criteria is not purely a statistical issue. Acceptance criteria must be selected with input from development chemists as well as statisticians.

Most analysts will qualitatively conclude that the two sets of results in Table II are nearly the same. By applying the two-sample Student's t test, however, the calculated t value is 5.07, which is greater than the critical t value for 95% confidence and 10 degrees of freedom (2.23). Therefore, the null hypothesis is rejected, and equivalence between the manual and automated test methods cannot be claimed. By this approach, as data become more precise, the two-sample t test becomes more sensitive to small differences in the mean, even if those differences are practically unimportant. This may lead the investigator to conclude that there is sufficient evidence (>95% confidence) that the two method means are not equivalent.
Consider an example in which the laboratory data are less precise and fewer data are generated. The calculated t value based on the data in Table III (t = 1.83) is less than the critical t value for the 95% confidence level and 10 degrees of freedom (2.23). Therefore, the null hypothesis cannot be rejected, and the results might mistakenly be considered to support the conclusion that the methods are equivalent. This clearly would be an inappropriate conclusion to draw. If 95% confidence is required to reject the null hypothesis of equal method means, then the appropriate conclusion is that there is insufficient evidence to reject the initial belief that the method means are the same. This conclusion is fundamentally different from one stating that there is sufficient evidence to conclude that the two method means are the same.

These two examples demonstrate that, in the context of equivalence testing, the Student's t test approach is unable to detect true differences in the mean in the face of large variability and small sample sizes, while good precision and extensive testing may result in the ability to detect very small true mean differences that are likely to be deemed statistically significant. It clearly does not provide an adequate approach to demonstrating equivalence.
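The two t statistics discussed above can be reproduced with a short Python sketch (provided for illustration; the pooled two-sample formulation and the critical value of 2.23 for 10 degrees of freedom follow the description in the text):

```python
# Reproduce the pooled two-sample t statistics for Tables II and III.
import numpy as np
from scipy import stats

def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances (pooled standard deviation)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sp2 = ((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1)) / (len(x) + len(y) - 2)
    return (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / len(x) + 1 / len(y)))

manual    = [100.0, 99.9, 100.0, 99.9, 99.9, 100.1]   # Table II
automated = [99.8, 99.8, 99.7, 99.7, 99.8, 99.8]
method_1  = [82, 92, 87, 85, 85, 86]                  # Table III
method_2  = [77, 78, 87, 79, 89, 80]

t_crit = stats.t.ppf(0.975, df=10)                    # two-sided 95% critical value, ~2.23
print(f"Table II:  t = {pooled_t(manual, automated):.2f} (critical {t_crit:.2f})")   # ~5.07
print(f"Table III: t = {pooled_t(method_1, method_2):.2f} (critical {t_crit:.2f})")  # ~1.83
# Precise data (Table II) flag a practically trivial 0.2% difference as significant,
# while imprecise data (Table III) fail to flag a roughly 4-5% difference.
```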
Consider how the statistical equivalence
approach helps interpret the data shown
in Tables II and III. For Table II, assume
that the predefined upper and lower ac-
ceptable limits around the difference (i.e.,
θL and θU) are symmetrical and predesig-
nated as −1 and +1. Calculating a confi-
dence interval for the difference in the
sample means (0.2) requires the pooled
standard deviation, which is 0.0683, and
the critical t value for 10 degrees of free-
dom with α set for the two one-sided test
at 0.10, which is 1.81. The calculated con-
fidence limits for the difference are 0.13
to 0.27, and these are completely contained
within the predesignated interval. The
conclusion from the statistical test favors
equivalence. This result contrasts with that ob-
tained by Student’s t test in the sidebar
“Potential problems with Student’s
t test as an equivalence test,” which is that
the two method means were not equiva-
lent. Data precision leads to a very narrow
confidence band in the two one-sided test,
thereby increasing the precision of our es-
timate of the difference and our success
in drawing a correct conclusion.
Again, consider how the statistical
equivalence approach helps with the in-
terpretation of the data in Table III. As-
sume the size of the upper and lower ac-
ceptable limits around the difference has
been predesignated from −5.0 to +5.0
(common metric for dissolution, see Table
I). Calculating the confidence interval for
the difference in sample means (4%) again
requires the pooled standard deviation,
which is 4.3, and the critical t value for 10
degrees of freedom with α set for the two
one-sided test at 0.10, which is 1.81. The
endpoints of the confidence limits for the
difference are 0.49 to 8.49 (calculated using
nonrounded data), which are not con-
tained within the predesignated interval.
The conclusion from the statistical equiv-
alence test is not in favor of equivalence.
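Both confidence-interval calculations can be reproduced in the same way (an illustrative sketch; the 90% interval corresponds to α set at 0.10 for the two one-sided test, and the Table III endpoints differ slightly from the published 0.49 to 8.49 because of rounding in the reported summary statistics):

```python
# Two one-sided test (TOST) decision via a 90% confidence interval on the mean difference.
import numpy as np
from scipy import stats

def tost_interval(x, y, alpha=0.10):
    """Confidence interval for mean(x) - mean(y) using the pooled standard deviation."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    df = len(x) + len(y) - 2
    sp = np.sqrt(((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1)) / df)
    half_width = stats.t.ppf(1 - alpha / 2, df) * sp * np.sqrt(1 / len(x) + 1 / len(y))
    d = x.mean() - y.mean()
    return d - half_width, d + half_width

# Table II: acceptance limits predesignated as -1 to +1.
lo, hi = tost_interval([100.0, 99.9, 100.0, 99.9, 99.9, 100.1],
                       [99.8, 99.8, 99.7, 99.7, 99.8, 99.8])
print(f"Table II difference CI: {lo:.2f} to {hi:.2f}  -> inside (-1, +1): equivalence")

# Table III: acceptance limits predesignated as -5.0 to +5.0.
lo, hi = tost_interval([82, 92, 87, 85, 85, 86], [77, 78, 87, 79, 89, 80])
print(f"Table III difference CI: {lo:.2f} to {hi:.2f} -> not inside (-5, +5): no equivalence")
```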
This is again in contrast to the result obtained by Student's t test in the sidebar, "Potential problems with Student's t test as an equivalence test." Data imprecision now leads to a very broad confidence band, and we cannot conclude equivalence. Even a broadening of the acceptance criteria to as much as ±8.0% would not have led to a conclusion of equivalence. This is an improvement over the Student's t test result.

These two examples illustrate that the statistical equivalence test based on the two one-sided test does not favor large variability but instead rewards precision.

Statistical equivalence tests
The hypotheses for equivalence testing may be expressed as follows:

H0: μ1 − μ2 ≤ θL or μ1 − μ2 ≥ θU
Ha: θL < μ1 − μ2 < θU

in which Ha is the alternate hypothesis and θL and θU are predefined as the lower and upper acceptable limits for equivalence. In this statistical test, if the actual difference between the methods (the true mean difference) is between the boundaries designated by θ, the means of the two populations are considered sufficiently equal. This method is commonly referred to as Schuirmann's two one-sided test or the two one-sided t test.

The hypotheses for Schuirmann's two one-sided test can be represented as two one-sided pairs, one at each limit:

H01: μ1 − μ2 ≤ θL versus Ha1: μ1 − μ2 > θL
H02: μ1 − μ2 ≥ θU versus Ha2: μ1 − μ2 < θU

Ha, which is the hypothesis we wish to demonstrate, is represented by the region where the true population difference between two sets of data (μ1 − μ2) is within the lower and upper acceptance limits θL and θU.

The two one-sided test also can be executed using a confidence interval (CI) calculated for the difference in sample means, using a t value for the appropriate degree of confidence (e.g., 95%) and the sample sizes of the two sets of data being compared:

CI = (mean1 − mean2) ± t × sp × √(1/n1 + 1/n2)

in which (mean1 − mean2) is the calculated difference in the sample means; t is the t value for 95% confidence for n1 + n2 − 2 degrees of freedom (note that for 95% confidence, α is 0.10 for a two one-sided test); n1 is the sample size for the first data set; n2 is the sample size for the second data set; and sp is the pooled standard deviation.

If this confidence interval is within the predefined acceptance limits, the null hypothesis is rejected in favor of the alternative, and the two sets of results are considered equivalent. Because the magnitude of the calculated confidence interval increases as the pooled standard deviation increases, Schuirmann's two one-sided test is more likely to conclude that there is a lack of equivalence in the presence of too much variation, the opposite of the effect seen with the two-sample Student's t test.
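The sidebar's procedure also can be written as an explicit pair of one-sided t tests rather than a confidence interval. The sketch below is illustrative (the function name, its default α of 0.05 per one-sided test, and the use of SciPy are choices made here, not part of the article); both formulations lead to the same equivalence decision:

```python
# Schuirmann's two one-sided t tests (TOST) for the difference of two method means.
import numpy as np
from scipy import stats

def tost(x, y, theta_l, theta_u, alpha=0.05):
    """Declare equivalence if both one-sided nulls are rejected at level alpha each."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    df = len(x) + len(y) - 2
    sp = np.sqrt(((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1)) / df)
    se = sp * np.sqrt(1 / len(x) + 1 / len(y))
    d = x.mean() - y.mean()
    p_lower = stats.t.sf((d - theta_l) / se, df)   # tests H0: difference <= theta_L
    p_upper = stats.t.sf((theta_u - d) / se, df)   # tests H0: difference >= theta_U
    return max(p_lower, p_upper) < alpha, p_lower, p_upper

# Table II data with limits of -1 to +1, as in the worked example above.
equivalent, p_lo, p_hi = tost([100.0, 99.9, 100.0, 99.9, 99.9, 100.1],
                              [99.8, 99.8, 99.7, 99.7, 99.8, 99.8],
                              theta_l=-1.0, theta_u=1.0)
print(f"equivalent: {equivalent} (p_lower = {p_lo:.4f}, p_upper = {p_hi:.4f})")
```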
Summary
As long as there is change, there will be a desire to compare the old and the new. Changes in analytical methods may at times produce true changes in the results. A comparison of the two methods may show them to be inequivalent. Other analytical method changes may not be so dramatic. The results achieved before and after some analytical method changes (e.g., improved efficiency, minor changes to a particular step of a procedure, or automation of a manual method) may be expected to be essentially the same. Under these circumstances, one will probably need to assess the equivalency of the results produced by the two methods. Equivalency testing offers advantages over performing validation alone, because validation criteria are set to determine the soundness of a single method, not the sameness of two independent methods.

To establish analytical method equivalency, the choice of samples, acceptance criteria, data evaluation, and documentation should be considered. A simple approach using popular metrics may be sufficient, but a more statistically based evaluation also can be useful. Acceptable analytical practices for these topics, based on the input of PhRMA workshop participants, have been presented to foster good scientific rationale for determining the equivalency of two analytical methods.

Acknowledgment
We thank Drs. Phil Palermo and Galen Radebaugh for their expert assistance in leading the breakout sessions and capturing the input from meeting participants. We also thank Dr. Jeffrey D. Hofer and the PhRMA Statistical Expert Team for their helpful comments and suggestions.

References
1. International Conference on Harmonization, Q2A: Harmonized Tripartite Guideline, Text on Validation of Analytical Procedures (ICH, Geneva, Switzerland, Oct. 27, 1994), p. 1.
2. US Pharmacopeia, "General Chapter <1010> 'Analytical Data: Interpretation and Treatment,'" Pharm. Forum 30 (1), 236–263 (2004).
3. J.C. Miller and J.N. Miller, Statistics for Analytical Chemistry (Ellis Horwood Ltd, New York, NY, 3d ed., 1993), p. 55.
4. S.-C. Chow and J.-P. Liu, Design and Analysis of Bioavailability and Bioequivalence Studies (Marcel Dekker, New York, NY, 1992), p. 77.
5. D. Schuirmann, "A Comparison of the Two One-Sided Test Procedures and the Power Approach for Assessing the Equivalence of Average Bioavailability," J. Pharmacokinet. Biopharm. 15, 657–680 (1987).
6. K.R. Lung et al., "Statistical Method for the Determination of Equivalence of Automated Test Procedures," J. Automated Methods and Management in Chemistry 25 (6), 123–127 (2003).
7. J. Stein and N. Doganaksoy, "Sample Size Considerations for Assessing the Equivalence of Two Process Means," Qual. Engineer. 12 (1), 105–110 (1999–2000).
8. G.B. Limentani et al., "Beyond the t-Test: Statistical Equivalence Testing for Analytical Scientists," Anal. Chem. 77, 221A–226A (2005).