Wikipedia
PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.
PDF generated at: Sun, 13 Jul 2014 16:18:15 UTC
Contents
Articles
Chi-squared test
Chi-squared distribution
Statistics
References
Article Sources and Contributors
Article Licenses
License
Chi-squared test
A chi-squared test, also referred to as chi-square test or χ² test, is any statistical hypothesis test in which the
sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. Also
considered a chi-squared test is a test in which this is asymptotically true, meaning that the sampling distribution (if
the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the
sample size large enough.
Some examples of chi-squared tests where the chi-squared distribution is only approximately valid:
Pearson's chi-squared test, also known as the chi-squared goodness-of-fit test or chi-squared test for independence. When the chi-squared test is mentioned without any modifiers or without other precluding context, this test is usually meant (for an exact test used in place of χ², see Fisher's exact test).
Yates's correction for continuity, also known as Yates' chi-squared test.
Cochran–Mantel–Haenszel chi-squared test.
McNemar's test, used in certain 2 × 2 tables with pairing.
Tukey's test of additivity.
The portmanteau test in time-series analysis, testing for the presence of autocorrelation.
Likelihood-ratio tests in general statistical modelling, for testing whether there is evidence of the need to move from a simple model to a more complicated one (where the simple model is nested within the complicated one).
One case where the distribution of the test statistic is an exact chi-squared distribution is the test that the variance of
a normally distributed population has a given value based on a sample variance. Such a test is uncommon in practice
because values of variances to test against are seldom known exactly.
References
Weisstein, Eric W., "Chi-Squared Test" [1], MathWorld.
Corder, G.W., Foreman, D.I. (2009). Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach
Wiley, ISBN 978-0-470-45461-9
Greenwood, P.E., Nikulin, M.S. (1996) A guide to chi-squared testing. Wiley, New York. ISBN 0-471-55779-X
Nikulin, M.S. (1973). "Chi-squared test for normality". In: Proceedings of the International Vilnius Conference on Probability Theory and Mathematical Statistics, v.2, pp. 119–122.
Bagdonavicius, V., Nikulin, M.S. (2011) "Chi-square goodness-of-fit test for right censored data". The International Journal of Applied Mathematics and Statistics, p. 30–50.
References
[1] http://mathworld.wolfram.com/Chi-SquaredTest.html
Chi-squared distribution
This article is about the mathematics of the chi-squared distribution. For its uses in statistics, see chi-squared test.
For the music group, see Chi2 (band).
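Summary of properties
Notation: $\chi^2(k)$ or $\chi^2_k$
Parameters: $k \in \mathbb{N}_{>0}$ (known as "degrees of freedom")
Support: $x \in [0, +\infty)$
pdf: $\frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}$
CDF: $\frac{1}{\Gamma(k/2)}\, \gamma\left(\frac{k}{2}, \frac{x}{2}\right)$, where $\gamma$ is the lower incomplete gamma function
Mean: $k$
Median: $\approx k\left(1 - \frac{2}{9k}\right)^3$
Mode: $\max\{k - 2, 0\}$
Variance: $2k$
Skewness: $\sqrt{8/k}$
Ex. kurtosis: $12/k$
Entropy: $\frac{k}{2} + \ln\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2)$
MGF: $(1 - 2t)^{-k/2}$ for $t < 1/2$
CF: $(1 - 2it)^{-k/2}$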
In probability theory and statistics, the chi-squared distribution (also chi-square or χ²-distribution) with k degrees
of freedom is the distribution of a sum of the squares of k independent standard normal random variables. A special
case of the gamma distribution, it is one of the most widely used probability distributions in inferential statistics, e.g.,
in hypothesis testing or in construction of confidence intervals.[1] When there is a need to contrast it with the
noncentral chi-squared distribution, this distribution is sometimes called the central chi-squared distribution.
The chi-squared distribution is used in the common chi-squared tests for goodness of fit of an observed distribution
to a theoretical one, the independence of two criteria of classification of qualitative data, and in confidence interval
estimation for a population standard deviation of a normal distribution from a sample standard deviation. Many other
statistical tests also use this distribution, like Friedman's analysis of variance by ranks.
Definition
If Z1, ..., Zk are independent, standard normal random variables, then the sum of their squares,
$$Q = \sum_{i=1}^{k} Z_i^2,$$
is distributed according to the chi-squared distribution with k degrees of freedom. This is usually denoted as
$$Q \sim \chi^2(k) \quad \text{or} \quad Q \sim \chi^2_k.$$
The chi-squared distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom (i.e. the number of Zi's).
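The definition can be checked numerically. The following is a minimal sketch, not part of the original article: it draws sums of k squared standard normal values and compares the sample mean and variance with the values k and 2k listed for this distribution. The seed, the choice k = 4 and the number of draws are arbitrary assumptions.

```python
import random


def chi_squared_draw(k, rng):
    """Draw one chi-squared(k) value as the sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
k = 4                   # degrees of freedom (illustrative choice)
draws = [chi_squared_draw(k, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

# The two numbers printed should be close to k and 2k respectively.
print(sample_mean, sample_var)
```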
Characteristics
Further properties of the chi-squared distribution can be found in the box at the upper right corner of this article.
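The probability density function (pdf) of the chi-squared distribution is
$$f(x; k) = \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)} \quad \text{for } x > 0, \qquad f(x; k) = 0 \text{ otherwise},$$
where $\Gamma(k/2)$ denotes the Gamma function, which has closed-form values for integer k.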
For derivations of the pdf in the cases of one, two and k degrees of freedom, see Proofs related to chi-squared
distribution.
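As an illustration, the density can be evaluated directly. This sketch assumes the standard form of the pdf, x^(k/2−1) e^(−x/2) / (2^(k/2) Γ(k/2)) for x > 0; the function name chi2_pdf and the midpoint-rule check are illustrative choices rather than anything prescribed by the article.

```python
import math


def chi2_pdf(x, k):
    """Density of the chi-squared distribution with k degrees of freedom."""
    if x < 0:
        return 0.0
    if x == 0:
        # Limit at the origin: infinite for k < 2, 1/2 for k = 2, 0 for k > 2.
        return math.inf if k < 2 else (0.5 if k == 2 else 0.0)
    # Work on the log scale to avoid overflow for large k.
    log_f = ((k / 2 - 1) * math.log(x) - x / 2
             - (k / 2) * math.log(2) - math.lgamma(k / 2))
    return math.exp(log_f)


# Midpoint-rule check that the density integrates to about 1 for k = 3.
step = 0.001
total = sum(chi2_pdf((i + 0.5) * step, 3) * step for i in range(100000))
print(total)
```

Working with logarithms and math.lgamma keeps the computation stable when k is large, where Γ(k/2) itself would overflow.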
Differential equation
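Figure: Chernoff bound for the CDF and tail (1 − CDF) of a chi-squared random variable with ten degrees of freedom (k = 10).
Letting $z \equiv x/k$ and writing $F(x; k)$ for the CDF, Chernoff bounds on the lower and upper tails of the CDF may be obtained. For the cases when $0 < z < 1$ (which include all of the cases when this CDF is less than half):
$$F(zk; k) \le \left(z e^{1-z}\right)^{k/2}.$$
The tail bound for the cases when $z > 1$, similarly, is
$$1 - F(zk; k) \le \left(z e^{1-z}\right)^{k/2}.$$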
For another approximation for the CDF modeled after the cube of a Gaussian, see under Noncentral chi-squared
distribution.
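The Chernoff bounds can be compared with exact values. The sketch below assumes the bound has the standard form (z e^(1−z))^(k/2) with z = x/k, and uses the closed-form CDF that exists for even k (a finite Poisson-type sum); the function names are hypothetical.

```python
import math


def chi2_cdf_even(x, k):
    """Exact CDF of chi-squared(k) for even k, via a finite Poisson-type sum."""
    if k % 2 != 0 or x < 0:
        raise ValueError("this sketch handles even k and x >= 0 only")
    half = x / 2
    partial = sum(half ** j / math.factorial(j) for j in range(k // 2))
    return 1.0 - math.exp(-half) * partial


def chernoff_bound(z, k):
    """(z e^(1-z))^(k/2): bounds F(zk; k) for 0 < z < 1 and 1 - F(zk; k) for z > 1."""
    return (z * math.exp(1 - z)) ** (k / 2)


k = 10  # same degrees of freedom as in the figure caption
for z in (0.2, 0.5, 0.8, 1.5, 2.0, 3.0):
    cdf = chi2_cdf_even(z * k, k)
    exact = cdf if z < 1 else 1.0 - cdf  # lower tail below z = 1, upper tail above
    print(z, exact, chernoff_bound(z, k))
```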
Additivity
It follows from the definition of the chi-squared distribution that the sum of independent chi-squared variables is also chi-squared distributed. Specifically, if $X_1, \ldots, X_n$ are independent chi-squared variables with $k_1, \ldots, k_n$ degrees of freedom, respectively, then $Y = X_1 + \cdots + X_n$ is chi-squared distributed with $k_1 + \cdots + k_n$ degrees of freedom.
Sample mean
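The sample mean of n i.i.d. chi-squared variables of degree k is distributed according to a gamma distribution with shape $\alpha$ and scale $\theta$ parameters:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim \operatorname{Gamma}\left(\alpha = \frac{nk}{2},\ \theta = \frac{2}{n}\right), \quad \text{where } X_i \sim \chi^2(k).$$
Asymptotically, given that for a shape parameter $\alpha$ going to infinity, a Gamma distribution converges towards a Normal distribution with expectation $\mu = \alpha\theta$ and variance $\sigma^2 = \alpha\theta^2$, the sample mean converges towards:
$$\bar{X} \xrightarrow{n \to \infty} N\left(\mu = k,\ \sigma^2 = \frac{2k}{n}\right).$$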
Note that we would have obtained the same result invoking instead the central limit theorem, noting that for each chi-squared variable of degree k the expectation is k and the variance is 2k (and hence the variance of the sample mean is 2k/n).
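A short simulation can illustrate the mean k and variance 2k/n of the sample mean. This is a sketch under stated assumptions: chi-squared(k) values are drawn as Gamma(shape k/2, scale 2) using the standard library, and the seed, n = 25, k = 6 and the number of repetitions are arbitrary.

```python
import random


def sample_mean_of_chi2(n, k, rng):
    """Mean of n i.i.d. chi-squared(k) draws; chi-squared(k) is Gamma(shape=k/2, scale=2)."""
    return sum(rng.gammavariate(k / 2, 2.0) for _ in range(n)) / n


rng = random.Random(1)  # arbitrary fixed seed
n, k = 25, 6            # illustrative sample size and degrees of freedom
means = [sample_mean_of_chi2(n, k, rng) for _ in range(4000)]

avg = sum(means) / len(means)
var = sum((m - avg) ** 2 for m in means) / (len(means) - 1)

# avg should be near k, and var near the theoretical value 2k/n printed last.
print(avg, var, 2 * k / n)
```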
Entropy
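The differential entropy is given by
$$h = \frac{k}{2} + \ln\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2),$$
where $\psi(x)$ is the digamma function.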
Noncentral moments
The moments about zero of a chi-squared distribution with k degrees of freedom are given by[5][6]
$$E(X^m) = k(k+2)(k+4)\cdots(k+2m-2) = 2^m\,\frac{\Gamma(m + k/2)}{\Gamma(k/2)}.$$
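The two standard expressions for the raw moments, the product k(k+2)···(k+2m−2) and the ratio 2^m Γ(m + k/2)/Γ(k/2), can be cross-checked numerically. The sketch below assumes those forms; the function names are illustrative.

```python
import math


def chi2_raw_moment(m, k):
    """E[X^m] for X ~ chi-squared(k), via 2^m * Gamma(m + k/2) / Gamma(k/2)."""
    return 2.0 ** m * math.exp(math.lgamma(m + k / 2) - math.lgamma(k / 2))


def chi2_raw_moment_product(m, k):
    """The same moment as the product k (k + 2) ... (k + 2m - 2), for integer m >= 0."""
    result = 1
    for j in range(m):
        result *= k + 2 * j
    return result


k = 5
first = chi2_raw_moment_product(1, k)   # the mean
second = chi2_raw_moment_product(2, k)  # E[X^2]
# The variance recovered from the first two moments should equal 2k.
print(first, second, second - first ** 2)
```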
Cumulants
The cumulants are readily obtained by a (formal) power series expansion of the logarithm of the characteristic function:
$$\kappa_n = 2^{n-1}(n-1)!\,k.$$
Asymptotic properties
By the central limit theorem, because the chi-squared distribution is the sum of k independent random variables with finite mean and variance, it converges to a normal distribution for large k. For many practical purposes, for k > 50 the distribution is sufficiently close to a normal distribution for the difference to be ignored. Specifically, if $X \sim \chi^2(k)$, then as k tends to infinity, the distribution of $(X - k)/\sqrt{2k}$ tends to a standard normal distribution. However, convergence is slow as the skewness is $\sqrt{8/k}$ and the excess kurtosis is $12/k$.
The sampling distribution of $\ln(\chi^2)$ converges to normality much faster than the sampling distribution of $\chi^2$, as the logarithm removes much of the asymmetry. Other functions of the chi-squared distribution converge more rapidly to a normal distribution. Some examples are:
If $X \sim \chi^2(k)$ then $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k-1}$ and unit variance (result credited to R. A. Fisher).
If $X \sim \chi^2(k)$ then $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1 - \frac{2}{9k}$ and variance $\frac{2}{9k}$. This is known as the Wilson–Hilferty transformation.
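The quality of the cube-root normal approximation can be examined against exact values. This sketch assumes the approximation that (X/k)^(1/3) is roughly Normal with mean 1 − 2/(9k) and variance 2/(9k), and compares it with the exact CDF available in closed form for even k; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_cdf_even(x, k):
    """Exact CDF of chi-squared(k) for even k, via a finite Poisson-type sum."""
    if k % 2 != 0 or x < 0:
        raise ValueError("this sketch handles even k and x >= 0 only")
    half = x / 2
    partial = sum(half ** j / math.factorial(j) for j in range(k // 2))
    return 1.0 - math.exp(-half) * partial


def cube_root_normal_cdf(x, k):
    """Approximate P(X <= x) using (X/k)^(1/3) ~ Normal(1 - 2/(9k), 2/(9k))."""
    mean = 1 - 2 / (9 * k)
    sd = math.sqrt(2 / (9 * k))
    return NormalDist(mean, sd).cdf((x / k) ** (1 / 3))


k = 10
for x in (5.0, 10.0, 18.31):
    print(x, chi2_cdf_even(x, k), cube_root_normal_cdf(x, k))
```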
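The chi-squared distribution is related to other distributions as follows:
As $k \to \infty$, $(\chi^2_k - k)/\sqrt{2k}$ converges in distribution to $N(0, 1)$ (normal distribution).
$\chi^2_k \sim {\chi'}^2_k(0)$ (noncentral chi-squared distribution with non-centrality parameter $\lambda = 0$).
If $X \sim \chi^2(\nu)$ and $c > 0$, then $cX \sim \operatorname{Gamma}(\text{shape } \nu/2,\ \text{scale } 2c)$ (gamma distribution).
If $X \sim \chi^2_k$, then $\sqrt{X} \sim \chi_k$ (chi distribution).
If $X \sim \chi^2(\nu)$, then $1/X \sim \operatorname{Inv-}\chi^2(\nu)$ (inverse-chi-squared distribution).
If $X \sim \chi^2(\nu_1)$ and $Y \sim \chi^2(\nu_2)$ are independent, then $X/(X+Y) \sim \operatorname{Beta}(\nu_1/2, \nu_2/2)$ (beta distribution).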
Generalizations
The chi-squared distribution is obtained as the sum of the squares of k independent, zero-mean, unit-variance
Gaussian random variables. Generalizations of this distribution can be obtained by summing the squares of other
types of Gaussian random variables. Several such distributions are described below.
Linear combination
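If $X_1, \ldots, X_n$ are chi-squared random variables and $a_1, \ldots, a_n$ are positive real numbers, then a closed expression for the distribution of $X = \sum_{i=1}^{n} a_i X_i$ is not known. It may, however, be computed using the characteristic functions of the chi-squared random variables, which can be used to calculate the pdf.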
Chi-squared distributions
Noncentral chi-squared distribution
Main article: Noncentral chi-squared distribution
The noncentral chi-squared distribution is obtained from the sum of the squares of independent Gaussian random
variables having unit variance and nonzero means.
Generalized chi-squared distribution
Main article: Generalized chi-squared distribution
The generalized chi-squared distribution is obtained from the quadratic form z′Az where z is a zero-mean Gaussian
vector having an arbitrary covariance matrix, and A is an arbitrary matrix.
Applications
The chi-squared distribution has numerous applications in inferential statistics, for instance in chi-squared tests and
in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the
problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of
variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent
chi-squared random variables, each divided by their respective degrees of freedom.
Following are some of the most common situations in which the chi-squared distribution arises from a
Gaussian-distributed sample.
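If X1, ..., Xn are i.i.d. $N(\mu, \sigma^2)$ random variables, then
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 \sim \sigma^2 \chi^2_{n-1}, \quad \text{where } \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
The box below shows some statistics based on $X_i \sim \operatorname{Normal}(\mu_i, \sigma_i^2)$, i = 1, ..., k, independent random variables that have probability distributions related to the chi-squared distribution:
Name | Statistic
chi-squared distribution | $\sum_{i=1}^{k} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$
chi distribution | $\sqrt{\sum_{i=1}^{k} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2}$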
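Table of χ² values by degrees of freedom (df) and P value (probability of a larger value); cells marked ... are not legible in this copy.
df | 0.95 | 0.90 | 0.80 | 0.70 | 0.50 | 0.30  | 0.20  | 0.10  | 0.05  | 0.01  | 0.001
1  | ...  | ...  | ...  | ...  | ...  | ...   | 1.64  | 2.71  | 3.84  | 6.64  | 10.83
2  | 0.10 | ...  | ...  | ...  | ...  | ...   | 3.22  | 4.60  | 5.99  | 9.21  | 13.82
3  | 0.35 | ...  | ...  | ...  | ...  | ...   | 4.64  | 6.25  | 7.82  | 11.34 | 16.27
4  | 0.71 | ...  | ...  | ...  | ...  | ...   | 5.99  | 7.78  | 9.49  | 13.28 | 18.47
5  | 1.14 | ...  | ...  | ...  | ...  | ...   | 7.29  | 9.24  | ...   | ...   | ...
6  | 1.63 | ...  | ...  | ...  | ...  | ...   | 8.56  | ...   | ...   | ...   | ...
7  | 2.17 | ...  | ...  | ...  | ...  | ...   | 9.80  | ...   | ...   | ...   | ...
8  | 2.73 | ...  | ...  | ...  | ...  | ...   | ...   | ...   | ...   | ...   | ...
9  | 3.32 | 4.17 | 5.38 | 6.39 | 8.34 | 10.66 | 12.24 | 14.68 | 16.92 | 21.67 | 27.88
10 | 3.94 | 4.87 | 6.18 | 7.27 | 9.34 | 11.78 | 13.44 | 15.99 | 18.31 | 23.21 | 29.59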
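Some table entries can be verified without special libraries, because the upper-tail probability has a simple closed form for one and two degrees of freedom: erfc(sqrt(x/2)) for df = 1 and exp(−x/2) for df = 2. The sketch below relies on those two identities only; the function name is hypothetical and other df values are deliberately not handled.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for df = 1 or df = 2, where simple closed forms exist."""
    if df == 1:
        # X is the square of a standard normal, so P(X > x) = P(|Z| > sqrt(x)).
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        # chi-squared(2) is an exponential distribution with mean 2.
        return math.exp(-x / 2)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")


# Critical values taken from the df = 1 and df = 2 rows of the table.
for df, x in [(1, 3.84), (1, 6.64), (2, 5.99), (2, 9.21)]:
    print(df, x, round(chi2_upper_tail(x, df), 4))
```

The printed probabilities should land close to the column headings 0.05 and 0.01 under which these critical values appear.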
References
[1] NIST (2006). Engineering Statistics Handbook - Chi-Squared Distribution (http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm)
[2] Hald 1998, pp. 633–692, 27. Sampling Distributions under Normality.
[3] F. R. Helmert, "Ueber die Wahrscheinlichkeit der Potenzsummen der Beobachtungsfehler und über einige damit im Zusammenhange stehende Fragen (http://gdz.sub.uni-goettingen.de/dms/load/img/?PPN=PPN599415665_0021&DMDID=DMDLOG_0018)", Zeitschrift für Mathematik und Physik 21 (http://gdz.sub.uni-goettingen.de/dms/load/toc/?PPN=PPN599415665_0021), 1876, S. 102–219
[4] R. L. Plackett, Karl Pearson and the Chi-Squared Test, International Statistical Review, 1983, 61f. (http://www.jstor.org/stable/1402731?seq=3) See also Jeff Miller, Earliest Known Uses of Some of the Words of Mathematics (http://jeff560.tripod.com/c.html).
[5] Chi-squared distribution (http://mathworld.wolfram.com/Chi-SquaredDistribution.html), from MathWorld, retrieved Feb. 11, 2009
[6] M. K. Simon, Probability Distributions Involving Gaussian Random Variables, New York: Springer, 2002, eq. (2.35), ISBN 978-0-387-34657-1
[7] Chi-Squared Test (http://www2.lv.psu.edu/jxm57/irp/chisquar.html) Table B.2. Dr. Jacqueline S. McLaughlin at The Pennsylvania State University. In turn citing: R.A. Fisher and F. Yates, Statistical Tables for Biological Agricultural and Medical Research, 6th ed., Table IV
Further reading
Hald, Anders (1998). A history of mathematical statistics from 1750 to 1930. New York: Wiley. ISBN 0-471-17912-4.
Elderton, William Palin (1902). "Tables for Testing the Goodness of Fit of Theory to Observation". Biometrika 1 (2): 155–163. doi: 10.1093/biomet/1.2.155 (http://dx.doi.org/10.1093/biomet/1.2.155).
External links
Hazewinkel, Michiel, ed. (2001), "Chi-squared distribution" (http://www.encyclopediaofmath.org/index.php?title=p/c022100), Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
Calculator for the pdf, cdf and quantiles of the chi-squared distribution (http://calculus-calculator.com/statistics/chi-squared-distribution-calculator.html)
Earliest Uses of Some of the Words of Mathematics: entry on Chi squared has a brief history (http://jeff560.tripod.com/c.html)
Course notes on Chi-Squared Goodness of Fit Testing (http://www.stat.yale.edu/Courses/1997-98/101/chigf.htm) from Yale University Stats 101 class.
Mathematica demonstration showing the chi-squared sampling distribution of various statistics, e.g. Σx², for a normal population (http://demonstrations.wolfram.com/StatisticsAssociatedWithNormalSamples/)
Simple algorithm for approximating cdf and inverse cdf for the chi-squared distribution with a pocket calculator (http://www.jstor.org/stable/2348373)
Definition
Pearson's chi-squared test is used to assess two types of comparison: tests of goodness of fit and tests of
independence.
A test of goodness of fit establishes whether or not an observed frequency distribution differs from a theoretical
distribution.
A test of independence assesses whether paired observations on two variables, expressed in a contingency table,
are independent of each other (e.g. polling responses from people of different nationalities to see if one's
nationality is related to the response).
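The procedure of the test includes the following steps:
1. Calculate the chi-squared test statistic, χ², which resembles a normalized sum of squared deviations between observed and theoretical frequencies (see below).
2. Determine the degrees of freedom, df, of that statistic, which is essentially the number of frequencies reduced by the number of parameters of the fitted distribution.
3. Compare χ² to the critical value from the chi-squared distribution with df degrees of freedom, which in many cases gives a good approximation of the distribution of χ².
Test for fit of a distribution
Discrete uniform distribution
In this case N observations are divided among n cells. A simple application is to test the hypothesis that, in the general population, values would occur in each cell with equal frequency. The "theoretical frequency" for any cell (under the null hypothesis of a discrete uniform distribution) is thus calculated as
$$E_i = \frac{N}{n},$$
and the reduction in the degrees of freedom is p = 1, notionally because the observed frequencies $O_i$ are constrained to sum to N.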
Other distributions
When testing whether observations are random variables whose distribution belongs to a given family of
distributions, the "theoretical frequencies" are calculated using a distribution from that family fitted in some standard
way. The reduction in the degrees of freedom is calculated as p = s + 1, where s is the number of co-variates used in fitting the distribution. For instance, when checking a three-co-variate Weibull distribution, p = 4, and when checking a normal distribution (where the parameters are mean and standard deviation), p = 3. In other words, there will be n − p degrees of freedom, where n is the number of categories.
The value of the test statistic is
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i},$$
where
$\chi^2$ = Pearson's cumulative test statistic, which asymptotically approaches a $\chi^2$ distribution;
$O_i$ = an observed frequency;
$E_i$ = an expected (theoretical) frequency, asserted by the null hypothesis;
$n$ = the number of cells in the table.
The chi-squared statistic can then be used to calculate a p-value by comparing the value of the statistic to a chi-squared distribution. The number of degrees of freedom is equal to the number of cells n, minus the reduction in degrees of freedom, p.
The result about the numbers of degrees of freedom is valid when the original data are multinomial and hence the estimated parameters are efficient for minimizing the chi-squared statistic. More generally however, when maximum likelihood estimation does not coincide with minimum chi-squared estimation, the distribution will lie somewhere between a chi-squared distribution with n − p and n − 1 degrees of freedom (see for instance Chernoff and Lehmann, 1954).
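The statistic and its degrees of freedom can be computed in a few lines. The following sketch assumes the usual form of Pearson's statistic, the sum over cells of (O − E)²/E, and the rule that the degrees of freedom equal the number of cells minus the reduction p. The die-roll counts are hypothetical data made up for illustration.

```python
def pearson_chi2(observed, expected):
    """Pearson's statistic: sum over cells of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(n_cells, reduction):
    """Number of cells minus the reduction p (p = 1 for a fully specified uniform fit)."""
    return n_cells - reduction


# Hypothetical counts from 60 rolls of a die, tested against a discrete uniform distribution.
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / len(observed)] * len(observed)  # N / n in every cell

stat = pearson_chi2(observed, expected)
df = degrees_of_freedom(len(observed), 1)
print(stat, df)
```

Note that the degrees of freedom depend on the number of cells, not on the number of rolls.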
Bayesian method
For more details on this topic, see Categorical distribution § With a conjugate prior.
In Bayesian statistics, one would instead use a Dirichlet distribution as conjugate prior. If one took a uniform prior,
then the maximum likelihood estimate for the population probability is the observed probability, and one may
compute a credible region around this or another estimate.
Test of independence
In this case, an "observation" consists of the values of two outcomes and the null hypothesis is that the occurrence of
these outcomes is statistically independent. Each observation is allocated to one cell of a two-dimensional array of
cells (called a contingency table) according to the values of the two outcomes. If there are r rows and c columns in
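the table, the "theoretical frequency" for a cell, given the hypothesis of independence, is
$$E_{i,j} = \frac{\left(\sum_{n_c=1}^{c} O_{i,n_c}\right)\left(\sum_{n_r=1}^{r} O_{n_r,j}\right)}{N},$$
where N is the total sample size (the sum of all cells in the table). With the term "frequencies" this page does not refer to already normalised values. The value of the test statistic is
$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}.$$
Fitting the model of "independence" reduces the number of degrees of freedom by p = r + c − 1. The number of degrees of freedom is equal to the number of cells rc, minus the reduction in degrees of freedom, p, which reduces to (r − 1)(c − 1).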
For the test of independence, also known as the test of homogeneity, a chi-squared probability of less than or equal
to 0.05 (or the chi-squared statistic being at or larger than the 0.05 critical point) is commonly interpreted by applied
workers as justification for rejecting the null hypothesis that the row variable is independent of the column variable.
The alternative hypothesis corresponds to the variables having an association or relationship where the structure of
this relationship is not specified.
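A test of independence can be sketched as follows, assuming the usual construction in which the expected count of a cell is (row total × column total) / N and the degrees of freedom are (r − 1)(c − 1). The 2 × 2 table is hypothetical data, and the function name is an illustrative choice.

```python
def independence_test_statistic(table):
    """Return (chi-squared statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # total sample size N
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 2 contingency table of counts.
table = [[10, 20],
         [30, 40]]
stat, df = independence_test_statistic(table)
print(stat, df)
```

The statistic would then be compared with the critical value for the resulting degrees of freedom at the chosen significance level.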
Assumptions
The chi-squared test, when used with the standard approximation that a chi-squared distribution is applicable, has the following assumptions:
Simple random sample: The sample data are a random sampling from a fixed distribution or population where every collection of members of the population of the given sample size has an equal probability of selection. Variants of the test have been developed for complex samples, such as where the data are weighted. Other forms can be used such as purposive sampling.[1]
Sample size (whole table): A sample with a sufficiently large size is assumed. If a chi-squared test is conducted on a sample with a smaller size, then the chi-squared test will yield an inaccurate inference. The researcher, by using a chi-squared test on small samples, might end up committing a Type II error.
Expected cell count: Adequate expected cell counts. Some require 5 or more, and others require 10 or more. A common rule is 5 or more in all cells of a 2-by-2 table, and 5 or more in 80% of cells in larger tables, but no cells with zero expected count. When this assumption is not met, Yates's correction is applied.
Independence: The observations are always assumed to be independent of each other. This means chi-squared cannot be used to test correlated data (like matched pairs or panel data). In those cases McNemar's test may be more appropriate.
Examples
Goodness of fit
In this context, the frequencies of both theoretical and empirical distributions are unnormalised counts, and for a chi-squared test the total sample sizes N of both these distributions (sums of all cells of the corresponding contingency tables) have to be the same.
For example, to test the hypothesis that a random sample of 100 people has been drawn from a population in which men and women are equal in frequency, the observed number of men and women would be compared to the theoretical frequencies of 50 men and 50 women. If there were 44 men in the sample and 56 women, then
$$\chi^2 = \frac{(44 - 50)^2}{50} + \frac{(56 - 50)^2}{50} = 1.44.$$
If the null hypothesis is true (i.e., men and women are chosen with equal probability), the test statistic will be drawn
from a chi-squared distribution with one degree of freedom (because if the male frequency is known, then the female
frequency is determined).
Consultation of the chi-squared distribution for 1 degree of freedom shows that the probability of observing this
difference (or a more extreme difference than this) if men and women are equally numerous in the population is
approximately 0.23. This probability is higher than conventional criteria for statistical significance (0.001 or 0.05),
so normally we would not reject the null hypothesis that the number of men in the population is the same as the
number of women (i.e., we would consider our sample within the range of what we'd expect for a 50/50 male/female
ratio.)
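The worked example can be reproduced numerically. This sketch recomputes the statistic for 44 men and 56 women against expected counts of 50 each, and obtains the p-value from the identity that holds for one degree of freedom, P(X > x) = erfc(sqrt(x/2)); that identity is used here in place of a table lookup.

```python
import math

observed = [44, 56]  # men, women in the sample
expected = [50, 50]  # counts under the null hypothesis of equal frequency

# Pearson's statistic: sum of (O - E)^2 / E over the two cells.
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With one degree of freedom the statistic is a squared standard normal,
# so the upper-tail probability is erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(stat, p_value)
```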
Problems
The approximation to the chi-squared distribution breaks down if expected frequencies are too low. It will normally
be acceptable so long as no more than 20% of the events have expected frequencies below 5. Where there is only 1
degree of freedom, the approximation is not reliable if expected frequencies are below 10. In this case, a better
approximation can be obtained by reducing the absolute value of each difference between observed and expected
frequencies by 0.5 before squaring; this is called Yates's correction for continuity.
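The continuity correction described above amounts to a one-line change to the statistic. The sketch below applies it to the 44/56 counts from the goodness-of-fit example purely for illustration; with expected counts of 50 the correction would not normally be needed, and the function name is hypothetical.

```python
def yates_chi2(observed, expected):
    """Chi-squared statistic with each |O - E| reduced by 0.5 before squaring."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


def pearson_chi2(observed, expected):
    """Uncorrected statistic, for comparison."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [44, 56]
expected = [50, 50]
corrected = yates_chi2(observed, expected)
uncorrected = pearson_chi2(observed, expected)
# The corrected statistic is smaller, which makes the test more conservative.
print(uncorrected, corrected)
```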
In cases where the expected value, E, is found to be small (indicating a small underlying population probability,
and/or a small number of observations), the normal approximation of the multinomial distribution can fail, and in
such cases it is found to be more appropriate to use the G-test, a likelihood ratio-based test statistic. When the total
sample size is small, it is necessary to use an appropriate exact test, typically either the binomial test or (for
contingency tables) Fisher's exact test. This test uses the conditional distribution of the test statistic given the
marginal totals; however, it does not assume that the data were generated from an experiment in which the marginal
totals are fixed and is valid whether or not that is the case.
Notes
[1] See 'Discovering Statistics Using SPSS' by Andy Field for assumptions on chi-squared tests.
References
Chernoff, H.; Lehmann, E. L. (1954). "The Use of Maximum Likelihood Estimates in χ² Tests for Goodness of Fit". The Annals of Mathematical Statistics 25 (3): 579–586. doi: 10.1214/aoms/1177728726 (http://dx.doi.org/10.1214/aoms/1177728726).
Plackett, R. L. (1983). "Karl Pearson and the Chi-Squared Test". International Statistical Review (International Statistical Institute (ISI)) 51 (1): 59–72. doi: 10.2307/1402731 (http://dx.doi.org/10.2307/1402731). JSTOR 1402731 (http://www.jstor.org/stable/1402731).
Greenwood, P.E.; Nikulin, M.S. (1996). A guide to chi-squared testing. New York: Wiley. ISBN 0-471-55779-X.
Statistics
Statistics is the study of the collection, organization, analysis, interpretation and presentation of data.[1] It deals with all aspects of data including the planning of data collection in terms of the design of surveys and experiments. When analyzing data, it is possible to use one of two statistics methodologies: descriptive statistics or inferential statistics.
Figure: More probability density is found as one gets closer to the expected (mean) value in a normal distribution. Statistics used in standardized testing assessment are shown. The scales include standard deviations, cumulative percentages, percentile equivalents, Z-scores, T-scores, standard nines, and percentages in standard nines.
Scope
Statistics is a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data,[2] or as a branch of mathematics.[3] Some consider statistics to be a distinct mathematical science rather than a branch of mathematics.
Mathematical statistics
Mathematical statistics is the application of mathematics to statistics, which was originally conceived as the science
of the state, that is, the collection and analysis of facts about a country: its economy, land, military, population, and so
forth. Mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic
analysis, differential equations, and measure-theoretic probability theory.
Overview
In applying statistics to e.g. a scientific, industrial, or societal problem, it is necessary to begin with a population or
process to be studied. Populations can be diverse topics such as "all persons living in a country" or "every atom
composing a crystal".
Ideally, statisticians compile data about the entire population (an operation called census). This may be organized by
governmental statistical institutes. Descriptive statistics can be used to summarize the population data. Numerical
descriptors include mean and standard deviation for continuous data types (like income), while frequency and
percentage are more useful in terms of describing categorical data (like race).
When a census is not feasible, a chosen subset of the population called a sample is studied. Once a sample that is
representative of the population is determined, data is collected for the sample members in an observational or
experimental setting. Again, descriptive statistics can be used to summarize the sample data. However, the drawing
of the sample has been subject to an element of randomness, hence the established numerical descriptors from the
sample are also subject to uncertainty. In order to still draw meaningful conclusions about the entire population,
inferential statistics is needed. It uses patterns in the sample data to draw inferences about the population
represented, accounting for randomness. These inferences may take the form of: answering yes/no questions about
the data (hypothesis testing), estimating numerical characteristics of the data (estimation), describing associations
within the data (correlation) and modeling relationships within the data (for example, using regression analysis).
Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the
population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also
include data mining.
Data collection
Sampling
In case census data cannot be collected, statisticians collect data by developing specific experiment designs and
survey samples. Statistics itself also provides tools for prediction and forecasting through the use of data and statistical
models. To use a sample as a guide to an entire population, it is important that it truly represents the overall
population. Representative sampling assures that inferences and conclusions can safely extend from the sample to
the population as a whole. A major problem lies in determining the extent to which the sample chosen is actually
representative. Statistics offers methods to estimate and correct for any random trending within the sample and data
collection procedures. There are also methods of experimental design for experiments that can lessen these issues at
the outset of a study, strengthening its capability to discern truths about the population.
Sampling theory is part of the mathematical discipline of probability theory. Probability is used in "mathematical
statistics" (alternatively, "statistical theory") to study the sampling distributions of sample statistics and, more
generally, the properties of statistical procedures. The use of any statistical method is valid when the system or
population under consideration satisfies the assumptions of the method. The difference in point of view between
classic probability theory and sampling theory is, roughly, that probability theory starts from the given parameters of
a total population to deduce probabilities that pertain to samples. Statistical inference, however, moves in the
opposite direction, inductively inferring from samples to the parameters of a larger or total population.
Observational study
An example of an observational study is one that explores the correlation between smoking and lung cancer. This
type of study typically uses a survey to collect observations about the area of interest and then performs statistical
analysis. In this case, the researchers would collect observations of both smokers and non-smokers, perhaps through
a case-control study, and then look for the number of cases of lung cancer in each group.
Type of data
Main articles: Statistical data type and Levels of measurement
Various attempts have been made to produce a taxonomy of levels of measurement. The psychophysicist Stanley
Smith Stevens defined nominal, ordinal, interval, and ratio scales. Nominal measurements do not have meaningful
rank order among values, and permit any one-to-one transformation. Ordinal measurements have imprecise
differences between consecutive values, but have a meaningful order to those values, and permit any
order-preserving transformation. Interval measurements have meaningful distances between measurements defined,
but the zero value is arbitrary (as in the case with longitude and temperature measurements in Celsius or Fahrenheit),
and permit any linear transformation. Ratio measurements have both a meaningful zero value and the distances
between different measurements defined, and permit any rescaling transformation.
Because variables conforming only to nominal or ordinal measurements cannot be reasonably measured numerically,
sometimes they are grouped together as categorical variables, whereas ratio and interval measurements are grouped
together as quantitative variables, which can be either discrete or continuous, due to their numerical nature. Such
distinctions can often be loosely correlated with data type in computer science, in that dichotomous categorical
variables may be represented with the Boolean data type, polytomous categorical variables with arbitrarily assigned
integers in the integral data type, and continuous variables with the real data type involving floating point
computation. But the mapping of computer science data types to statistical data types depends on which
categorization of the latter is being implemented.
Other categorizations have been proposed. For example, Mosteller and Tukey (1977)[5] distinguished grades, ranks,
counted fractions, counts, amounts, and balances. Nelder (1990)[6] described continuous counts, continuous ratios,
count ratios, and categorical modes of data. See also Chrisman (1998),[7] van den Berg (1991).[8]
The issue of whether or not it is appropriate to apply different kinds of statistical methods to data obtained from
different kinds of measurement procedures is complicated by issues concerning the transformation of variables and
the precise interpretation of research questions. "The relationship between the data and what they describe merely
reflects the fact that certain kinds of statistical statements may have truth values which are not invariant under some
transformations. Whether or not a transformation is sensible to contemplate depends on the question one is trying to
answer" (Hand, 2004, p.82).[9]
A random variable which is a function of the random sample and of the unknown parameter, but whose probability
distribution does not depend on the unknown parameter is called a pivotal quantity or pivot. Widely used pivots
include the z-score, the chi square statistic and Student's t-value.
Between two estimators of a given parameter, the one with lower mean squared error is said to be more efficient. Furthermore, an estimator is said to be unbiased if its expected value is equal to the true value of the unknown parameter being estimated, and asymptotically unbiased if its expected value converges in the limit to the true value of that parameter.
Other desirable properties for estimators include: UMVUE estimators, which have the lowest variance for all possible values of the parameter to be estimated (this is usually an easier property to verify than efficiency), and consistent estimators, which converge in probability to the true value of that parameter.
This still leaves the question of how to obtain estimators in a given situation and carry out the computation. Several methods have been proposed: the method of moments, the maximum likelihood method, the least squares method and the more recent method of estimating equations.
Error
Working from a null hypothesis, two basic forms of error are recognized:
Type I errors where the null hypothesis is falsely rejected giving a "false positive".
Type II errors where the null hypothesis fails to be rejected and an actual difference between populations is
missed giving a "false negative".
Standard deviation refers to the extent to which individual observations in a sample differ from a central value, such
as the sample or population mean, while standard error refers to an estimate of the difference between the sample mean
and the population mean.
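As a minimal numerical illustration, using made-up data, the standard error of the mean is commonly estimated as the sample standard deviation divided by the square root of the sample size:

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, not from any study

# Spread of the individual observations around the sample mean.
sample_sd = statistics.stdev(sample)

# Estimated uncertainty of the sample mean as an estimate of the population mean.
standard_error = sample_sd / math.sqrt(len(sample))

print("sample standard deviation :", round(sample_sd, 3))       # about 2.138
print("standard error of the mean:", round(standard_error, 3))  # about 0.756
```

The standard deviation stays roughly constant as more data are collected, whereas the standard error shrinks as the sample size grows.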
A statistical error is the amount by which an observation differs from its expected value; a residual is the amount by
which an observation differs from the value the estimator of the expected value assumes on a given sample (also called
prediction).
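The difference can be seen in a small sketch. It assumes, hypothetically, that the population mean is known to be 10 (in practice it usually is not, which is why errors are generally unobservable and residuals are used instead). The data values are invented for the example.

```python
import statistics

POPULATION_MEAN = 10.0                  # assumed known here only for illustration
sample = [9.1, 10.4, 11.2, 8.7, 10.9]   # made-up observations

sample_mean = statistics.fmean(sample)  # estimate of the expected value

errors = [x - POPULATION_MEAN for x in sample]  # deviations from the true mean
residuals = [x - sample_mean for x in sample]   # deviations from the estimate

print("sum of errors   :", round(sum(errors), 6))
print("sum of residuals:", round(sum(residuals), 6))
```

Residuals from the sample mean always sum to zero (up to rounding), whereas errors need not.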
Mean squared error is used for obtaining efficient estimators, a widely used class of estimators. Root mean square
error is simply the square root of mean squared error.
Many statistical methods seek to minimize the residual sum of squares, and these are called "methods of least
squares" in contrast to least absolute deviations. The latter gives equal weight to small and big errors, while the
former gives more weight to large errors. Residual sum of squares is also differentiable, which provides a handy
property for doing regression. Least squares applied to linear regression is called the ordinary least squares method,
and least squares applied to nonlinear regression is called non-linear least squares. Also, in a linear regression model
the non-deterministic part of the model is called the error term, disturbance or, more simply, noise.
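The different weighting of large errors is easiest to see in the simplest case, estimating a single central value. In that case the least squares solution is the mean and the least absolute deviations solution is the median. The sketch below uses an invented data set with one large outlier. The function names are introduced only for this example.

```python
import statistics

data = [1, 2, 3, 4, 100]  # illustrative sample with one large outlier


def sum_squared(center):
    """Residual sum of squares around a candidate central value."""
    return sum((x - center) ** 2 for x in data)


def sum_absolute(center):
    """Sum of absolute deviations around a candidate central value."""
    return sum(abs(x - center) for x in data)


mean = statistics.fmean(data)     # minimizes the sum of squared deviations
median = statistics.median(data)  # minimizes the sum of absolute deviations

print("mean:", mean, " median:", median)                     # 22.0 and 3
print("squared loss :", sum_squared(mean), sum_squared(median))    # 7610.0 vs 9415
print("absolute loss:", sum_absolute(mean), sum_absolute(median))  # 156.0 vs 101
```

The single outlier pulls the least squares solution far from the bulk of the data, while the least absolute deviations solution is barely affected.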
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as
random (noise) or systematic (bias), but other types of errors (e.g., blunders, such as when an analyst
reports incorrect units) can also be important. The presence of missing data and/or censoring may result in biased
estimates, and specific techniques have been developed to address these problems.[11]
Interval estimation
Main article: Interval estimation
Most studies only sample part of a population, so results don't fully represent the whole population. Any estimates
obtained from the sample only approximate the population value. Confidence intervals allow statisticians to express
how closely the sample estimate matches the true value in the whole population. Often they are expressed as 95%
confidence intervals. Formally, a 95% confidence interval for a value is a range where, if the sampling and analysis
were repeated under the same conditions (yielding a different dataset), the interval would include the true
(population) value in 95% of all possible cases. This does not imply that the probability that the true value is in the
confidence interval is 95%. From the frequentist perspective, such a claim does not even make sense, as the true
value is not a random variable. Either the true value is or is not within the given interval. However, it is true that,
before any data are sampled and given a plan for how to construct the confidence interval, the probability is 95% that
the yet-to-be-calculated interval will cover the true value: at this point, the limits of the interval are
yet-to-be-observed random variables. One approach that does yield an interval that can be interpreted as having a
given probability of containing the true value is to use a credible interval from Bayesian statistics: this approach
depends on a different way of interpreting what is meant by "probability", that is as a Bayesian probability.
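The repeated-sampling interpretation can be checked by simulation. The following sketch assumes a normal population with mean 50 and known standard deviation 10, samples of 30 observations, and the usual z-interval for the mean. These values are illustrative assumptions only.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(3)  # fixed seed so the run is repeatable

TRUE_MEAN = 50.0   # assumed population mean, unknown to the analyst in practice
SIGMA = 10.0       # population standard deviation, assumed known
N = 30
REPETITIONS = 2_000
Z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

covered = 0
for _ in range(REPETITIONS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    centre = fmean(sample)
    half_width = Z * SIGMA / math.sqrt(N)
    # Each repetition yields a different interval; the true mean stays fixed.
    if centre - half_width <= TRUE_MEAN <= centre + half_width:
        covered += 1

coverage = covered / REPETITIONS
print("fraction of intervals covering the true mean:", coverage)
```

The fraction should be close to 0.95. Any single interval, once computed, either contains the true mean or does not.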
In principle, confidence intervals can be symmetrical or asymmetrical. An interval can be asymmetrical because it
works as a lower or upper bound for a parameter (left-sided interval or right-sided interval), but it can also be
asymmetrical because the two-sided interval is built violating symmetry around the estimate. Sometimes the bounds
for a confidence interval are reached asymptotically and these are used to approximate the true bounds.
Significance
Main article: Statistical significance
Statistics rarely give a simple Yes/No type answer to the question under analysis. Interpretation often comes down to
the level of statistical significance applied to the numbers, and often refers to the p-value: the probability,
computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed.
Referring to statistical significance does not necessarily mean that the overall result is significant in real world terms.
For example, in a large study of a drug it may be shown that the drug has a statistically significant but very small
beneficial effect, such that the drug is unlikely to help the patient noticeably.
Criticisms arise because the hypothesis testing approach forces one hypothesis (the null hypothesis) to be "favored,"
and can also seem to exaggerate the importance of minor differences in large studies. A difference that is highly
statistically significant can still be of no practical significance, but it is possible to properly formulate tests to
account for this. (See also criticism of hypothesis testing.)
One response involves going beyond reporting only the significance level to include the p-value when reporting
whether a hypothesis is rejected or accepted. The p-value, however, does not indicate the size of the effect. A better
and increasingly common approach is to report confidence intervals. Although these are produced from the same
calculations as those of hypothesis tests or p-values, they describe both the size of the effect and the uncertainty
surrounding it.
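The contrast between a p-value and a confidence interval can be made concrete with hypothetical summary figures from a very large study. The sample size, effect and standard deviation below are invented for the illustration, and a one-sample z-test is assumed.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics from a very large study.
n = 1_000_000
mean_effect = 0.01  # observed mean improvement, in standard-deviation units
sd = 1.0

standard_error = sd / math.sqrt(n)
z = mean_effect / standard_error

# Two-sided p-value for a z statistic: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

# 95% confidence interval for the effect.
z_crit = NormalDist().inv_cdf(0.975)
ci_low = mean_effect - z_crit * standard_error
ci_high = mean_effect + z_crit * standard_error

print("p-value:", p_value)
print("95% confidence interval:", (round(ci_low, 5), round(ci_high, 5)))
```

The p-value is extremely small, so the effect is statistically significant, but the interval shows that the effect itself is confined to roughly one hundredth of a standard deviation.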
Examples
Some well-known statistical tests and procedures are:
Misuse of statistics
Main article: Misuse of statistics
Misuse of statistics can produce subtle but serious errors in description and interpretation: subtle in the sense that
even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision
errors. For instance, social policy, medical practice, and the reliability of structures like bridges all rely on the proper
use of statistics.
Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking
expertise. The statistical significance of a trend in the data, which measures the extent to which a trend could be
caused by random variation in the sample, may or may not agree with an intuitive sense of its significance. The set
of basic statistical skills (and skepticism) that people need to deal with information in their everyday lives properly is
referred to as statistical literacy.
There is a general perception that statistical knowledge is all-too-frequently intentionally misused by finding ways to
interpret only the data that are favorable to the presenter.[12] A mistrust and misunderstanding of statistics is
associated with the quotation, "There are three kinds of lies: lies, damned lies, and statistics". Misuse of statistics can
be both inadvertent and intentional, and the book How to Lie with Statistics outlines a range of considerations. In an
attempt to shed light on the use and misuse of statistics, reviews of statistical techniques used in particular fields are
conducted (e.g. Warne, Lazo, Ramos, and Ritter (2012)).[13]
Ways to avoid misuse of statistics include using proper diagrams and avoiding bias. Misuse can occur when
conclusions are overgeneralized and claimed to be representative of more than they really are, often by either
deliberately or unconsciously overlooking sampling bias. Bar graphs are arguably the easiest diagrams to use and
understand, and they can be made either by hand or with simple computer programs. Unfortunately, most people do
not look for bias or errors, so they are not noticed. Thus, people may often believe that something is true even if it is
not well represented. To make data gathered from statistics believable and accurate, the sample taken must be
representative of the whole. According to Huff, "The dependability of a sample can be destroyed by [bias]... allow
yourself some degree of skepticism."
To assist in the understanding of statistics Huff proposed a series of questions to be asked in each case:
Who says so? (Does he/she have an axe to grind?)
How does he/she know? (Does he/she have the resources to know the facts?)
What's missing? (Does he/she give us a complete picture?)
Did someone change the subject? (Does he/she offer us the right answer to the wrong problem?)
Does it make sense? (Is his/her conclusion logical and consistent with what we already know?)
Misinterpretation: correlation
The concept of correlation is particularly noteworthy for the potential confusion it can cause. Statistical analysis of a
data set often reveals that two variables (properties) of the population under consideration tend to vary together, as if
they were connected. For example, a study of annual income that also looks at age of death might find that poor
people tend to have shorter lives than affluent people. The two variables are said to be correlated; however, they may
or may not be the cause of one another. The correlation could be caused by a third, previously
unconsidered phenomenon, called a lurking variable or confounding variable. For this reason, there is no way to
immediately infer the existence of a causal relationship between the two variables. (See Correlation does not imply
causation.)
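A lurking variable can be simulated directly. In the sketch below, which is an illustration under invented assumptions, two variables `x` and `y` never influence each other: each is simply the lurking variable plus independent noise. The `pearson` helper is written out for the example rather than taken from a library.

```python
import math
import random

random.seed(4)  # fixed seed so the run is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


n = 5_000
lurking = [random.gauss(0, 1) for _ in range(n)]

# x and y each depend on the lurking variable, but not on each other.
x = [z + random.gauss(0, 1) for z in lurking]
y = [z + random.gauss(0, 1) for z in lurking]
r_xy = pearson(x, y)

# Removing the lurking variable's contribution (its coefficient is known
# to be 1 in this simulation) leaves only the independent noise.
x_adjusted = [a - z for a, z in zip(x, lurking)]
y_adjusted = [b - z for b, z in zip(y, lurking)]
r_adjusted = pearson(x_adjusted, y_adjusted)

print("correlation of x and y          :", round(r_xy, 3))
print("after removing lurking variable :", round(r_adjusted, 3))
```

The raw correlation is substantial (about 0.5 in expectation under these assumptions) even though neither variable causes the other, and it essentially vanishes once the lurking variable is accounted for.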
The final wave, which mainly saw the refinement and expansion of earlier developments, emerged from the
collaborative work between Egon Pearson and Jerzy Neyman in the 1930s. They introduced the concepts of "Type
II" error, power of a test and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling
was in general a better method of estimation than purposive (quota) sampling.[17] Today, statistical methods are
applied in all fields that involve decision making, for making accurate inferences from a collated body of data and
for making decisions in the face of uncertainty based on statistical methodology. The use of modern computers has
expedited large-scale statistical computations, and has also made possible new methods that are impractical to
perform manually.
Trivia
Applied statistics, theoretical statistics and mathematical statistics
"Applied statistics" comprises descriptive statistics and the application of inferential statistics.[18] Theoretical
statistics concerns the logical arguments underlying the justification of approaches to statistical inference, and also
encompasses mathematical statistics. Mathematical statistics includes not only the manipulation of probability
distributions necessary for deriving results related to methods of estimation and inference, but also various aspects
of computational statistics and the design of experiments.
Statistics in society
Statistics is applicable to a wide variety of academic disciplines, including natural and social sciences, government,
and business. Statistical consultants can help organizations and companies that don't have in-house expertise relevant
to their particular questions.
Statistical computing
Main article: Computational statistics
Figure: gretl, an example of an open source statistical package
The rapid and sustained increases in computing power starting from the second half of the 20th century have had a
substantial impact on the practice of statistical science. Early statistical models were almost always from the class
of linear models, but powerful computers, coupled with suitable numerical algorithms, caused an increased interest in
nonlinear models (such as neural networks) as well as the creation of new types, such as generalized linear models
and multilevel models.
Increased computing power has also led to the growing popularity of computationally intensive methods based on
resampling, such as permutation tests and the bootstrap, while techniques such as Gibbs sampling have made use of
Bayesian models more feasible. The computer revolution has implications for the future of statistics with new
emphasis on "experimental" and "empirical" statistics. A large number of both general and special-purpose statistical
software packages are now available.
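The resampling methods mentioned above are simple to express in code, which is part of why they became popular once computing power was cheap. The following is a minimal sketch using invented measurements for two groups. The function names, the number of resampling rounds and the percentile form of the bootstrap interval are choices made for this example, not a reference implementation.

```python
import random
import statistics

random.seed(5)  # fixed seed so the run is repeatable

# Made-up measurements for two groups.
group_a = [12.1, 11.8, 13.0, 12.6, 12.9, 13.4]
group_b = [10.2, 11.1, 10.8, 11.5, 10.9, 11.3]


def permutation_p_value(a, b, rounds=5_000):
    """Two-sided permutation test for a difference in means."""
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = a + b  # new list, so the inputs are not modified
    extreme = 0
    for _ in range(rounds):
        random.shuffle(pooled)  # random relabelling of the observations
        diff = abs(statistics.fmean(pooled[:len(a)])
                   - statistics.fmean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Adding 1 counts the observed labelling itself, so p is never exactly 0.
    return (extreme + 1) / (rounds + 1)


def bootstrap_mean_interval(data, rounds=5_000, level=0.95):
    """Percentile bootstrap interval for the mean."""
    means = sorted(
        statistics.fmean(random.choices(data, k=len(data)))  # resample with replacement
        for _ in range(rounds)
    )
    lower = means[int((1 - level) / 2 * rounds)]
    upper = means[int((1 + level) / 2 * rounds) - 1]
    return lower, upper


p_value = permutation_p_value(group_a, group_b)
interval = bootstrap_mean_interval(group_a)
print("permutation p-value:", p_value)
print("bootstrap 95% interval for mean of group_a:", interval)
```

Neither procedure relies on a formula for the sampling distribution. Both replace analytical derivation with repeated recomputation of the statistic on reshuffled or resampled data.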
Specialized disciplines
Main article: List of fields of application of statistics
Statistical techniques are used in a wide range of types of scientific and social research, including: biostatistics,
computational biology, computational sociology, network biology, social science, sociology and social research.
Some fields of inquiry use applied statistics so extensively that they have specialized terminology. These disciplines
include:
Image processing
Medical Statistics
Psychological statistics
Reliability engineering
Social statistics
In addition, there are particular types of statistical analysis that have also developed their own specialised
terminology and methodology:
Statistics forms a key basic tool in business and manufacturing as well. It is used to understand measurement-system
variability, to control processes (as in statistical process control or SPC), to summarize data, and to make
data-driven decisions. In these roles, it is a key tool, and perhaps the only reliable tool.
References
[1] Dodge, Y. (2006) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9
[2] Moses, Lincoln E. (1986) Think and Explain with Statistics, Addison-Wesley, ISBN 978-0-201-15619-5. pp. 1-3
[3] Hays, William Lee (1973) Statistics for the Social Sciences, Holt, Rinehart and Winston, p. xii, ISBN 978-0-03-077945-9
[4] Freedman, D.A. (2005) Statistical Models: Theory and Practice, Cambridge University Press. ISBN 978-0-521-67105-7
[5] Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression. Boston: Addison-Wesley.
[6] Nelder, J. A. (1990). The knowledge needed to computerise the analysis and interpretation of statistical information. In Expert systems and
artificial intelligence: the need for information about data. Library Association Report, London, March, 23-27.
[7] Chrisman, Nicholas R. (1998). Rethinking Levels of Measurement for Cartography. Cartography and Geographic Information Science, vol.
25 (4), pp. 231-242
[8] van den Berg, G. (1991). Choosing an analysis method. Leiden: DSWO Press
[9] Hand, D. J. (2004). Measurement theory and practice: The world through quantification. London, UK: Arnold.
[10] P. Elio, Probabilità e Statistica, Esculapio 2007
[11] Rubin, Donald B.; Little, Roderick J. A., Statistical analysis with missing data, New York: Wiley 2002
[12] Huff, Darrell (1954) How to Lie with Statistics, WW Norton & Company, Inc. New York, NY. ISBN 0-393-31072-8
[13] Warne, R., Lazo, M., Ramos, T. and Ritter, N. (2012). Statistical Methods Used in Gifted Education Journals, 2006-2010. Gifted Child
Quarterly, 56(3) 134-149.
[14] Willcox, Walter (1938) "The Founder of Statistics". Review of the International Statistical Institute 5(4):321-328.
[15] J. Franklin, The Science of Conjecture: Evidence and Probability before Pascal, Johns Hopkins Univ Pr 2002
[16] Galton F (1877) Typical laws of heredity. Nature 15: 492-553
[17] Neyman, J (1934) On the two different aspects of the representative method: The method of stratified sampling and the method of purposive
selection. Journal of the Royal Statistical Society 97 (4) 557-625
[18] Anderson, D.R.; Sweeney, D.J.; Williams, T.A. (1994) Introduction to Statistics: Concepts and Applications, pp. 5-9. West Group. ISBN
978-0-314-03309-3
License
Creative Commons Attribution-Share Alike 3.0
//creativecommons.org/licenses/by-sa/3.0/