Mathematical Relations
Chapter 1
Bernstein's theorem on monotone functions
In real analysis, a branch of mathematics, Bernstein's theorem states that every real-valued function on the half-line [0, ∞) that is totally monotone is a mixture of exponential functions. In one important special case the mixture is a weighted average, or expected value.
Total monotonicity (sometimes also complete monotonicity) of a function f means that f is continuous on [0, ∞), infinitely differentiable on (0, ∞), and satisfies
(−1)^n (d^n f / dt^n)(t) ≥ 0
for all nonnegative integers n and for all t > 0. Another convention puts the opposite inequality in the above definition.
The "weighted average" statement can be characterized thus: there is a non-negative finite Borel measure on [0, ∞), with cumulative distribution function g, such that
f(t) = ∫₀^∞ e^(−tx) dg(x),
the integral being a Riemann–Stieltjes integral. Nonnegative functions whose derivative is completely monotone (Bernstein functions) admit the representation
f(t) = a + bt + ∫₀^∞ (1 − e^(−tx)) μ(dx)
with a, b ≥ 0 and a measure μ on (0, ∞) satisfying
∫₀^∞ min(1, x) μ(dx) < ∞.
In more abstract language, the theorem characterises Laplace transforms of positive Borel measures on [0, ∞). In this form it is known as the Bernstein–Widder theorem, or Hausdorff–Bernstein–Widder theorem. Felix Hausdorff had earlier characterised completely monotone sequences. These are the sequences occurring in the Hausdorff moment problem.
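A concrete instance (our illustration, not part of the original article): f(t) = 1/(1 + t) is totally monotone, being the Laplace transform of the exponential density e^(−x). The sketch below checks the integral representation with a plain trapezoid rule and the sign condition via the closed-form derivatives; the cutoff and step count are arbitrary numerical choices.

```python
import math

def laplace_of_exp_density(t, upper=50.0, steps=100_000):
    """Trapezoid-rule approximation of the integral of e^(-tx) * e^(-x) over [0, upper]."""
    h = upper / steps
    # Endpoint terms: the integrand is e^(-(t+1)x), which equals 1 at x = 0.
    total = 0.5 * (1.0 + math.exp(-(t + 1.0) * upper))
    for i in range(1, steps):
        total += math.exp(-(t + 1.0) * i * h)
    return total * h

# Bernstein's theorem in action: f(t) = 1/(1+t) equals its exponential mixture.
for t in (0.0, 0.5, 2.0, 10.0):
    assert abs(laplace_of_exp_density(t) - 1.0 / (1.0 + t)) < 1e-5

# Total monotonicity: (-1)^n f^(n)(t) = n! / (1+t)^(n+1) > 0 for all n and t > 0.
for n in range(6):
    for t in (0.1, 1.0, 5.0):
        assert math.factorial(n) / (1.0 + t) ** (n + 1) > 0
```

The second loop uses the closed-form derivative of 1/(1 + t) rather than numerical differentiation, which would be unstable for higher orders.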
1.1 References
Bernstein, S. N. (1928). "Sur les fonctions absolument monotones". Acta Mathematica. 52: 1–66. doi:10.1007/BF02592679.
Schilling, René L.; Song, Renming; Vondraček, Zoran (2010). Bernstein Functions. De Gruyter.
Chapter 2
Monotone cubic interpolation

In the mathematical subfield of numerical analysis, monotone cubic interpolation is a variant of cubic interpolation that preserves monotonicity of the data set being interpolated.
Monotonicity is preserved by linear interpolation but not guaranteed by cubic interpolation.
Example showing non-monotone cubic interpolation (in red) and monotone cubic interpolation (in blue) of a monotone data set.
Monotone interpolation can be accomplished using a cubic Hermite spline with the tangents m_i modified to ensure the monotonicity of the resulting Hermite spline.
An algorithm is also available for monotone quintic Hermite interpolation.
2.1 Monotone cubic Hermite interpolation
1. Compute the slopes of the secant lines between successive points:
Δ_k = (y_{k+1} − y_k) / (x_{k+1} − x_k)
for k = 1, . . . , n − 1.
2. Initialize the tangents at every interior data point as the average of the secants,
m_k = (Δ_{k−1} + Δ_k) / 2
for k = 2, . . . , n − 1; if Δ_{k−1} and Δ_k have different signs, set m_k = 0. These may be updated in further steps. For the endpoints, use one-sided differences:
m_1 = Δ_1 and m_n = Δ_{n−1}
3. For k = 1, . . . , n − 1, if Δ_k = 0 (if two successive values y_k = y_{k+1} are equal), then set m_k = m_{k+1} = 0, as the spline connecting these points must be flat to preserve monotonicity. Ignore steps 4 and 5 for those k.
4. Let α_k = m_k / Δ_k and β_k = m_{k+1} / Δ_k. If α_k or β_{k−1} is computed to be less than zero, then the input data points are not strictly monotone, and (x_k, y_k) is a local extremum. In such cases, piecewise monotone curves can still be generated by choosing m_k = 0, although global strict monotonicity is not possible.
5. To prevent overshoot and ensure monotonicity, at least one of the following conditions must be met:
(a) the function
φ(α, β) = α − (2α + β − 3)² / (3(α + β − 2))
must have a value greater than or equal to zero;
(b) α + 2β − 3 ≤ 0; or
(c) 2α + β − 3 ≤ 0.
If monotonicity must be strict, then φ(α, β) must have a value strictly greater than zero.
One simple way to satisfy this constraint is to restrict the vector (α_k, β_k) to a circle of radius 3. That is, if α_k² + β_k² > 9, then set m_k = τ_k α_k Δ_k and m_{k+1} = τ_k β_k Δ_k, where τ_k = 3 / √(α_k² + β_k²).
2.2 Cubic interpolation

After the preprocessing above, the spline is evaluated at a point x in the interval [x_lower, x_upper] as, with h = x_upper − x_lower and t = (x − x_lower) / h,
f_interpolated(x) = y_lower h₀₀(t) + h m_lower h₁₀(t) + y_upper h₀₁(t) + h m_upper h₁₁(t)
where h₀₀, h₁₀, h₀₁, h₁₁ are the basis functions for the cubic Hermite spline.
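The steps above can be sketched as a short routine. This is a minimal reading of the algorithm (function and variable names are our own, and no attempt is made at the edge-case handling of a production spline library):

```python
import math

def monotone_cubic_tangents(xs, ys):
    """Tangents m_k for a monotone cubic Hermite spline (Fritsch-Carlson scheme)."""
    n = len(xs)
    # Step 1: secant slopes between successive points.
    d = [(ys[k + 1] - ys[k]) / (xs[k + 1] - xs[k]) for k in range(n - 1)]
    # Step 2: average adjacent secants; one-sided differences at the endpoints.
    m = [d[0]] + [(d[k - 1] + d[k]) / 2.0 for k in range(1, n - 1)] + [d[-1]]
    for k in range(1, n - 1):
        if d[k - 1] * d[k] <= 0.0:   # secants change sign (or one is flat)
            m[k] = 0.0
    for k in range(n - 1):
        # Step 3: a flat secant forces flat tangents on both ends.
        if d[k] == 0.0:
            m[k] = m[k + 1] = 0.0
            continue
        # Steps 4-5 (circle variant): keep (alpha, beta) within radius 3.
        a, b = m[k] / d[k], m[k + 1] / d[k]
        s = math.hypot(a, b)
        if s > 3.0:
            tau = 3.0 / s
            m[k], m[k + 1] = tau * a * d[k], tau * b * d[k]
    return m

def evaluate(xs, ys, m, x):
    """Evaluate the cubic Hermite spline with tangents m at a point x."""
    k = max((i for i in range(len(xs) - 1) if xs[i] <= x), default=0)
    h = xs[k + 1] - xs[k]
    t = (x - xs[k]) / h
    h00, h10 = (1 + 2 * t) * (1 - t) ** 2, t * (1 - t) ** 2
    h01, h11 = t * t * (3 - 2 * t), t * t * (t - 1)
    return ys[k] * h00 + h * m[k] * h10 + ys[k + 1] * h01 + h * m[k + 1] * h11

# A monotone data set with a flat segment stays monotone after interpolation.
xs, ys = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0]
m = monotone_cubic_tangents(xs, ys)
samples = [evaluate(xs, ys, m, 0.1 * i) for i in range(31)]
assert all(b >= a - 1e-12 for a, b in zip(samples, samples[1:]))
```

Note how the flat middle segment (y = 1 on [1, 2]) zeroes the tangents at both of its ends, exactly as step 3 requires.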
2.3 References
Fritsch, F. N.; Carlson, R. E. (1980). "Monotone Piecewise Cubic Interpolation". SIAM Journal on Numerical Analysis. SIAM. 17 (2): 238–246. doi:10.1137/0717021.
Dougherty, R. L.; Edelman, A.; Hyman, J. M. (April 1989). "Positivity-, monotonicity-, or convexity-preserving cubic and quintic Hermite interpolation". Mathematics of Computation. 52 (186): 471–494. doi:10.2307/2008477.
Chapter 3
Monotonic function

"Monotonicity" redirects here. For information on monotonicity as it pertains to voting systems, see monotonicity criterion.
"Monotonic" redirects here. For other uses, see Monotone (disambiguation).
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of order theory.
Figure 1. A monotonically increasing function. It is strictly increasing on the left and right while just monotonic (unchanging) in the middle.
The terms increasing and decreasing are often taken to include the possibility of repeating the same value at successive arguments, so one finds the terms weakly increasing and weakly decreasing to stress this possibility.
The terms non-decreasing and non-increasing should not be confused with the (much weaker) negative qualifications "not decreasing" and "not increasing". For example, the function of figure 3 first falls, then rises, then falls again. It is therefore not decreasing and not increasing, but it is neither non-decreasing nor non-increasing.
The term monotonic transformation can also cause confusion because it refers to a transformation by a strictly increasing function. Notably, this is the case in economics with respect to the ordinal properties of a utility function being preserved across a monotonic transform (see also monotone preferences).[1]
A function f(x) is said to be absolutely monotonic over an interval (a, b) if the derivatives of all orders of f are nonnegative or all nonpositive at all points on the interval.
If f is a monotonic function of a real variable, then f has limits from the right and from the left at every point of its domain.
These properties are the reason why monotonic functions are useful in technical work in analysis. Two facts about these functions are:
if f is a monotonic function defined on an interval I, then f is differentiable almost everywhere on I; i.e. the set of numbers x in I such that f is not differentiable at x has Lebesgue measure zero. In addition, this result cannot be improved to countable: see Cantor function.
if f is a monotonic function defined on an interval [a, b], then f is Riemann integrable.
An important application of monotonic functions is in probability theory. If X is a random variable, its cumulative distribution function F_X(x) = Prob(X ≤ x) is a monotonically increasing function.
A function is unimodal if it is monotonically increasing up to some point (the mode) and then monotonically decreasing.
When f is a strictly monotonic function, then f is injective on its domain, and if T is the range of f , then there is an
inverse function on T for f .
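Strict monotonicity is also what makes inversion by bisection reliable: the sketch below inverts an arbitrary strictly increasing function numerically (the example function and interval are our own choices, not from the text):

```python
def monotone_inverse(f, y, lo, hi, tol=1e-12):
    """Invert a strictly increasing f on [lo, hi] by bisection.

    Strict monotonicity makes f injective, so the solution of f(x) = y
    is unique whenever y lies between f(lo) and f(hi).
    """
    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2.0
        if f(mid) < y:
            a = mid
        else:
            b = mid
    return (a + b) / 2.0

# Example: x^3 + x is strictly increasing on all of R, hence invertible.
x = monotone_inverse(lambda v: v ** 3 + v, 10.0, 0.0, 10.0)
assert abs(x - 2.0) < 1e-9   # since 2^3 + 2 = 10
```

Each bisection step discards the half-interval that monotonicity rules out, so the method never needs derivatives.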
In functional analysis on a topological vector space X, a (possibly non-linear) operator T : X → X∗ is said to be a monotone operator if
(Tu − Tv, u − v) ≥ 0 for all u, v ∈ X.
Kachurovskii's theorem shows that convex functions on Banach spaces have monotonic operators as their derivatives.
A subset G of X × X is said to be a monotone set if for every pair [u₁, w₁] and [u₂, w₂] in G,
(w₁ − w₂, u₁ − u₂) ≥ 0.
G is said to be maximal monotone if it is maximal among all monotone sets in the sense of set inclusion. The graph of a monotone operator G(T) is a monotone set. A monotone operator is said to be maximal monotone if its graph is a maximal monotone set.
In order theory, a function f between ordered sets is called monotone if x ≤ y implies f(x) ≤ f(y) for all x and y in its domain. The composite of two monotone mappings is also monotone.
A constant function is both monotone and antitone; conversely, if f is both monotone and antitone, and if the domain
of f is a lattice, then f must be constant.
Monotone functions are central in order theory. They appear in most articles on the subject and examples from special
applications are found in these places. Some notable special monotone functions are order embeddings (functions for which x ≤ y if and only if f(x) ≤ f(y)) and order isomorphisms (surjective order embeddings).
In the context of search algorithms, monotonicity (also called consistency) is a condition on heuristic functions: a heuristic h is monotonic if, for every node n and every successor n′ of n, h(n) ≤ c(n, n′) + h(n′), where c(n, n′) is the step cost from n to n′. This is a form of triangle inequality, with n, n′, and the goal G_n closest to n. Because every monotonic heuristic is also admissible, monotonicity is a stricter requirement than admissibility. In some heuristic algorithms, such as A*, the algorithm can be considered optimal if the heuristic it uses is monotonic.[2]
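The consistency condition h(n) ≤ c(n, n′) + h(n′) can be verified by a single scan of a graph's edges; the graph and heuristic values below are invented for the illustration:

```python
# Toy weighted graph: node -> list of (successor, step cost). Both the graph
# and the heuristic values h are made up for this example.
edges = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("C", 2.0), ("G", 5.0)],
    "C": [("G", 1.0)],
    "G": [],
}
h = {"A": 4.0, "B": 3.0, "C": 1.0, "G": 0.0}

def is_consistent(edges, h, goal="G"):
    """Check h(goal) == 0 and h(n) <= c(n, n') + h(n') on every edge."""
    if h[goal] != 0.0:
        return False
    return all(h[n] <= cost + h[m] for n in edges for m, cost in edges[n])

assert is_consistent(edges, h)
# Inflating one estimate breaks the triangle inequality on edge B -> C.
assert not is_consistent(edges, dict(h, B=5.0))
```

Here the cheapest path A → B → C → G costs 4, so h(A) = 4 is also admissible, as the text's admissibility claim predicts.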
3.7 See also

Pseudo-monotone operator
Total monotonicity
3.8 Notes
[1] See the section on Cardinal Versus Ordinal Utility in Simon & Blume (1994).
[2] Conditions for optimality: Admissibility and consistency pg. 94-95 (Russell & Norvig 2010).
3.9 Bibliography
Bartle, Robert G. (1976). The elements of real analysis (second ed.).
Grätzer, George (1971). Lattice theory: first concepts and distributive lattices. ISBN 0-7167-0442-0.
Pemberton, Malcolm; Rau, Nicholas (2001). Mathematics for economists: an introductory textbook. Manchester University Press. ISBN 0-7190-3341-1.
Renardy, Michael & Rogers, Robert C. (2004). An introduction to partial differential equations. Texts in Applied Mathematics 13 (Second ed.). New York: Springer-Verlag. p. 356. ISBN 0-387-00444-0.
Riesz, Frigyes & Szőkefalvi-Nagy, Béla (1990). Functional Analysis. Courier Dover Publications. ISBN 978-0-486-66289-3.
Russell, Stuart J.; Norvig, Peter (2010). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River, New Jersey: Prentice Hall. ISBN 978-0-13-604259-4.
Simon, Carl P.; Blume, Lawrence (April 1994). Mathematics for Economists (first ed.). ISBN 978-0-393-95733-4. (Definition 9.31)
Convergence of a Monotonic Sequence by Anik Debnath and Thomas Roxlo (The Harker School), Wolfram
Demonstrations Project.
Chapter 4
Pseudo-monotone operator

In mathematics, a pseudo-monotone operator from a reflexive Banach space into its continuous dual space is one that is, in some sense, almost as well-behaved as a monotone operator. Many problems in the calculus of variations can be expressed using operators that are pseudo-monotone, and pseudo-monotonicity in turn implies the existence of solutions to these problems.
4.1 Definition
Let (X, ‖·‖) be a reflexive Banach space. A map T : X → X∗ from X into its continuous dual space X∗ is said to be pseudo-monotone if T is a bounded operator (not necessarily continuous) and if whenever
u_j ⇀ u weakly in X as j → ∞
and
lim sup_{j→∞} ⟨T(u_j), u_j − u⟩ ≤ 0,
it follows that, for all v ∈ X,
lim inf_{j→∞} ⟨T(u_j), u_j − v⟩ ≥ ⟨T(u), u − v⟩.
4.3 References
Renardy, Michael & Rogers, Robert C. (2004). An introduction to partial differential equations. Texts in Applied Mathematics 13 (Second ed.). New York: Springer-Verlag. p. 367. ISBN 0-387-00444-0. (Definition 9.56, Theorem 9.57)
Chapter 5
Spearman's rank correlation coefficient
A Spearman correlation of 1 results when the two variables being compared are monotonically related, even if their relationship is not linear. This means that all data points with greater x-values than that of a given data point will have greater y-values as well. In contrast, this does not give a perfect Pearson correlation.
In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a nonparametric measure of rank correlation (statistical dependence between the rankings of two variables). It assesses how well the relationship between two variables can be described using a monotonic function.
5.1 Definition and calculation
When the data are roughly elliptically distributed and there are no prominent outliers, the Spearman correlation and Pearson correlation give similar values.
The Spearman correlation is less sensitive than the Pearson correlation to strong outliers that are in the tails of both samples. That is because Spearman's rho limits the outlier to the value of its rank.
The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the rank variables:
r_s = ρ_{rg_X, rg_Y} = cov(rg_X, rg_Y) / (σ_{rg_X} σ_{rg_Y})
where
ρ denotes the usual Pearson correlation coefficient, but applied to the rank variables,
cov(rg_X, rg_Y) is the covariance of the rank variables,
σ_{rg_X} and σ_{rg_Y} are the standard deviations of the rank variables.
Only if all n ranks are distinct integers can it be computed using the popular formula
r_s = 1 − 6 Σ d_i² / (n(n² − 1))
where
d_i = rg(X_i) − rg(Y_i) is the difference between the two ranks of each observation,
n is the number of observations.
Identical values are usually[4] each assigned fractional ranks equal to the average of their positions in the ascending
order of the values, which is equivalent to averaging over all possible permutations.
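The definition translates directly into code. The sketch below computes fractional (average) ranks, applies the Pearson formula to them, and confirms that the Σd_i² shortcut agrees when all ranks are distinct; the function names and sample data are our own:

```python
from math import sqrt

def fractional_ranks(values):
    """Ranks 1..n, with ties assigned the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0   # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank variables."""
    return pearson(fractional_ranks(x), fractional_ranks(y))

# With all ranks distinct, the d_i^2 shortcut agrees with the definition.
x = [3.0, 1.0, 4.0, 1.5, 9.0]
y = [2.0, 0.5, 7.0, 8.0, 1.0]
rx, ry = fractional_ranks(x), fractional_ranks(y)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(x)
shortcut = 1.0 - 6.0 * d2 / (n * (n * n - 1))
assert abs(spearman(x, y) - shortcut) < 1e-12
```

With ties present, `spearman` (Pearson on average ranks) remains correct while the shortcut does not, matching the caveat discussed below.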
If ties are present in the data set, this equation yields incorrect results: only if in both variables all ranks are distinct, then σ_{rg_X} σ_{rg_Y} = Var(rg_X) = Var(rg_Y) = n(n² − 1)/6 (cf. tetrahedral number T_{n−1}). The first equation (normalizing by the standard deviation) may be used even when ranks are normalized to [0, 1] (relative ranks), because it is insensitive both to translation and to linear scaling.
This method should also not be used in cases where the data set is truncated; that is, when the Spearman correlation coefficient is desired for the top X records (whether by pre-change rank or post-change rank, or both), the user should use the Pearson correlation coefficient formula given above.
The standard error of the coefficient (σ) was determined by Pearson in 1907 and Gosset in 1920. It is
σ_{r_s} = 0.6325 / √(n − 1)
5.2 Related quantities

There are several other numerical measures that quantify the extent of statistical dependence between pairs of observations. The most common of these is the Pearson product-moment correlation coefficient, which is a correlation method similar to Spearman's rank, but one that measures the linear relationships between the raw numbers rather than between their ranks.
An alternative name for the Spearman rank correlation is the "grade correlation";[5] in this, the "rank" of an observation is replaced by the "grade". In continuous distributions, the grade of an observation is, by convention, always one half less than the rank, and hence the grade and rank correlations are the same in this case. More generally, the "grade" of an observation is proportional to an estimate of the fraction of a population less than a given value, with the half-observation adjustment at observed values. Thus this corresponds to one possible treatment of tied ranks. While unusual, the term "grade correlation" is still in use.[6]
5.3 Interpretation
The sign of the Spearman correlation indicates the direction of association between X (the independent variable) and Y (the dependent variable). If Y tends to increase when X increases, the Spearman correlation coefficient is positive. If Y tends to decrease when X increases, the Spearman correlation coefficient is negative. A Spearman correlation of zero indicates that there is no tendency for Y to either increase or decrease when X increases. The Spearman correlation increases in magnitude as X and Y become closer to being perfect monotone functions of each other. When X and Y are perfectly monotonically related, the Spearman correlation coefficient becomes 1. A perfect monotone increasing relationship implies that for any two pairs of data values X_i, Y_i and X_j, Y_j, the differences X_i − X_j and Y_i − Y_j always have the same sign. A perfect monotone decreasing relationship implies that these differences always have opposite signs.
The Spearman correlation coefficient is often described as being "nonparametric". This can have two meanings: First, a perfect Spearman correlation results when X and Y are related by any monotonic function. Contrast this with the Pearson correlation, which only gives a perfect value when X and Y are related by a linear function. The other sense in which the Spearman correlation is nonparametric is that its exact sampling distribution can be obtained without requiring knowledge (i.e., knowing the parameters) of the joint probability distribution of X and Y.
5.4 Example
In this example, the raw data in the table below is used to calculate the correlation between the IQ of a person and the number of hours spent in front of TV per week.
Firstly, evaluate d_i². To do so use the following steps, reflected in the table below.
1. Sort the data by the first column (X_i). Create a new column x_i and assign it the ranked values 1, 2, 3, ..., n.
2. Next, sort the data by the second column (Y_i). Create a fourth column y_i and similarly assign it the ranked values 1, 2, 3, ..., n.
3. Create a fifth column d_i to hold the differences between the two rank columns (x_i and y_i).
4. Create one final column d_i² to hold the value of column d_i squared.
With d_i² found, add them to find Σ d_i² = 194. The value of n is 10. These values can now be substituted back into the equation
ρ = 1 − 6 Σ d_i² / (n(n² − 1))
to give
ρ = 1 − 6 × 194 / (10(10² − 1))
which evaluates to ρ = −29/165 = −0.175757575... with a P-value = 0.627188 (using the t-distribution).
Chart of the data presented. It can be seen that there might be a negative correlation, but that the relationship does not appear definitive.
This low value shows that the correlation between IQ and hours spent watching TV is very low, although the negative value suggests that the longer the time spent watching television the lower the IQ. In the case of ties in the original values, this formula should not be used; instead, the Pearson correlation coefficient should be calculated on the ranks (where ties are given ranks, as described above).
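The arithmetic of the example can be replayed exactly with rational numbers, using the totals Σd_i² = 194 and n = 10 quoted above:

```python
from fractions import Fraction

# Totals quoted in the worked example: sum of squared rank differences and n.
d2_total, n = 194, 10

# rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), computed without rounding.
rho = 1 - Fraction(6 * d2_total, n * (n * n - 1))
assert rho == Fraction(-29, 165)
print(float(rho))  # -0.17575757...
```

Keeping the computation in `Fraction` makes the exact value −29/165 visible before any decimal rounding.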
5.5 Determining significance

Another approach parallels the use of the Fisher transformation in the case of the Pearson product-moment correlation coefficient. That is, confidence intervals and hypothesis tests relating to the population value ρ can be carried out using the Fisher transformation:
F(r) = (1/2) ln((1 + r)/(1 − r)) = artanh(r).
If F(r) is the Fisher transformation of r, the sample Spearman rank correlation coefficient, and n is the sample size, then
z = √((n − 3)/1.06) F(r)
is a z-score for r which approximately follows a standard normal distribution under the null hypothesis of statistical independence (ρ = 0).[7][8]
One can also test for significance using
t = r √((n − 2)/(1 − r²))
which is distributed approximately as Student's t-distribution with n − 2 degrees of freedom under the null hypothesis.[9] A justification for this result relies on a permutation argument.[10]
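Both test statistics are one-liners. Applied to the worked example (r = −29/165 ≈ −0.176, n = 10), the t statistic comes out near −0.505, which is consistent with the large P-value reported there:

```python
import math

def spearman_z(r, n):
    """z-score via the Fisher transformation: sqrt((n - 3) / 1.06) * artanh(r)."""
    return math.sqrt((n - 3) / 1.06) * math.atanh(r)

def spearman_t(r, n):
    """t statistic with n - 2 degrees of freedom: r * sqrt((n - 2) / (1 - r^2))."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Worked example values: r = -29/165, n = 10.
t = spearman_t(-29 / 165, 10)
z = spearman_z(-29 / 165, 10)
assert abs(t + 0.505) < 1e-3   # small |t| under the null, hence a large P-value
assert abs(z) < 1.0
```

Converting t to a P-value requires the Student t CDF (e.g. from a statistics library), which is omitted here to keep the sketch dependency-free.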
A generalization of the Spearman coefficient is useful in the situation where there are three or more conditions, a number of subjects are all observed in each of them, and it is predicted that the observations will have a particular order. For example, a number of subjects might each be given three trials at the same task, and it is predicted that performance will improve from trial to trial. A test of the significance of the trend between conditions in this situation was developed by E. B. Page[11] and is usually referred to as Page's trend test for ordered alternatives.
5.8 References
[1] Scale types
[2] Lehman, Ann (2005). Jmp For Basic Univariate And Multivariate Statistics: A Step-by-step Guide. Cary, NC: SAS Press.
p. 123. ISBN 1-59047-576-3.
[3] Myers, Jerome L.; Well, Arnold D. (2003). Research Design and Statistical Analysis (2nd ed.). Lawrence Erlbaum. p. 508.
ISBN 0-8058-4037-0.
[4] Dodge, Yadolah (2010). The Concise Encyclopedia of Statistics. Springer-Verlag New York. p. 502. ISBN 978-0-387-31742-7.
[5] Yule, G. U.; Kendall, M. G. (1968) [1950]. An Introduction to the Theory of Statistics (14th ed.). Charles Griffin & Co. p. 268.
[6] Piantadosi, J.; Howlett, P.; Boland, J. (2007). "Matching the grade correlation coefficient using a copula with maximum disorder". Journal of Industrial and Management Optimization. 3 (2): 305–312.
[7] Choi, S. C. (1977). "Tests of Equality of Dependent Correlation Coefficients". Biometrika. 64 (3): 645–647. doi:10.1093/biomet/64.3.645.
[8] Fieller, E. C.; Hartley, H. O.; Pearson, E. S. (1957). "Tests for rank correlation coefficients. I". Biometrika. 44: 470–481. doi:10.1093/biomet/44.3-4.470.
[9] Press; Vetterling; Teukolsky; Flannery (1992). Numerical Recipes in C: The Art of Scientific Computing (2nd ed.). p. 640.
[10] Kendall, M. G.; Stuart, A. (1973). The Advanced Theory of Statistics, Volume 2: Inference and Relationship. Griffin. ISBN 0-85264-215-6. (Sections 31.19, 31.21)
[11] Page, E. B. (1963). "Ordered hypotheses for multiple treatments: A significance test for linear ranks". Journal of the American Statistical Association. 58 (301): 216–230. doi:10.2307/2282965.
[12] Kowalczyk, T.; Pleszczyńska, E.; Ruland, F., eds. (2004). Grade Models and Methods for Data Analysis with Applications for the Analysis of Data Populations. Studies in Fuzziness and Soft Computing. 151. Berlin Heidelberg New York: Springer Verlag. ISBN 978-3-540-21120-4.