Sigmoid function: Difference between revisions

{{Short description|Mathematical function having a characteristic S-shaped curve or sigmoid curve}}
{{Use dmy dates|date=July 2022|cs1-dates=y}}
{{Use list-defined references|date=July 2022}}
[[File:Logistic-curve.svg|thumb|320px|right|The [[logistic curve]]]]
[[File:Error Function.svg|thumb|right|320px|Plot of the [[error function]]]]
 
A '''sigmoid function''' is any [[mathematical function]] whose [[Graph of a function|graph]] has a characteristic S-shaped or '''sigmoid curve'''.
A common example of a sigmoid function is the [[logistic function]] shown in the first figure and defined by the formula:<ref name="Han-Morag_1995" />
:<math>\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{1 + e^x}=1-\sigma(-x).</math>
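The defining formula above admits a short numerical check. A minimal Python sketch (the two-branch evaluation is a common numerical-stability convention, not part of the definition):

```python
import math

def sigmoid(x):
    """Logistic sigmoid, evaluated in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For large negative x, exp(-x) would overflow; use the e^x / (1 + e^x) form.
    ex = math.exp(x)
    return ex / (1.0 + ex)

# The symmetry sigma(x) = 1 - sigma(-x) from the formula above:
assert abs(sigmoid(2.0) - (1.0 - sigmoid(-2.0))) < 1e-12
```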
 
Other standard sigmoid functions are given in the [[#Examples|Examples section]]. In some fields, most notably in the context of [[artificial neural network]]s, the term "sigmoid function" is used as an alias for the logistic function.
 
Special cases of the sigmoid function include the [[Gompertz curve]] (used in modeling systems that saturate at large values of ''x'') and the [[ogee curve]] (used in the [[spillway]] of some [[dam]]s). Sigmoid functions have a domain of all [[real number]]s, with a return (response) value that is commonly [[monotonically increasing]], though it may be decreasing. Sigmoid functions most often show a return value (''y'' axis) in the range 0 to 1. Another commonly used range is from −1 to 1.
 
A wide variety of sigmoid functions including the logistic and [[hyperbolic tangent]] functions have been used as the [[activation function]] of [[artificial neuron]]s. Sigmoid curves are also common in statistics as [[cumulative distribution function]]s (which go from 0 to 1), such as the integrals of the [[logistic density]], the [[normal density]], and [[Student's t-distribution|Student's ''t'' probability density functions]]. The logistic sigmoid function is invertible, and its inverse is the [[logit]] function.
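The inverse relationship between the logistic sigmoid and the logit can be checked directly. A hedged Python sketch (the function names are illustrative):

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Inverse of the logistic sigmoid; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# logit undoes sigmoid: logit(sigmoid(x)) == x, up to rounding.
assert abs(logit(sigmoid(1.5)) - 1.5) < 1e-9
```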
 
== Definition ==
A sigmoid function is a [[bounded function|bounded]], [[differentiable function|differentiable]], real function that is defined for all real input values and has a non-negative derivative at each point<ref name="Han-Morag_1995" /> <ref name="yibei" /> and exactly one [[inflection point]]. A sigmoid "function" and a sigmoid "curve" refer to the same object.
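For the logistic sigmoid, these defining properties can be verified from the closed form of its derivative, <math>\sigma'(x) = \sigma(x)(1 - \sigma(x))</math>, which is positive everywhere and peaks at the single inflection point <math>x = 0</math>. A small Python sketch (helper names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    # Closed form of the derivative: sigma'(x) = sigma(x) * (1 - sigma(x)).
    s = sigmoid(x)
    return s * (1.0 - s)

# Non-negative derivative everywhere, maximal at the inflection point x = 0.
assert all(sigmoid_prime(x) > 0 for x in (-5.0, -1.0, 0.0, 1.0, 5.0))
assert sigmoid_prime(0.0) == 0.25
```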
 
== Properties ==
* and in a more general form<ref name="Dunning-Kensler-Coudeville-Bailleux_2015" /> <math display="block"> f(x) = \frac{x}{\left(1 + |x|^{k}\right)^{1/k}} </math>
* Up to shifts and scaling, many sigmoids are special cases of <math display="block"> f(x) = \varphi(\varphi(x, \beta), \alpha) , </math> where <math display="block"> \varphi(x, \lambda) = \begin{cases} (1 - \lambda x)^{1/\lambda} & \lambda \ne 0 \\e^{-x} & \lambda = 0 \\ \end{cases} </math> is the inverse of the negative [[Box–Cox transformation]], and <math>\alpha < 1</math> and <math>\beta < 1</math> are shape parameters.<ref name="grex" />
* [[Non-analytic_smooth_function#Smooth_transition_functions|Smooth transition function]]<ref>{{Cite web|url=https://www.youtube.com/watch?v=vD5g8aVscUI|title=Smooth Transition Function in One Dimension &#124; Smooth Transition Function Series Part 1|via=www.youtube.com|date=16 August 2022|author=EpsilonDelta|at=13:29/14:04}}</ref> normalized to (−1, 1), where <math>m</math> is the slope at zero:
<math display="block">\begin{align}f(x) &= \begin{cases}
{\displaystyle
\frac{2}{1+e^{-2m\frac{x}{1-x^2}}} - 1}, & |x| < 1 \\
\\
\sgn(x) & |x| \ge 1 \\
\end{cases} \\
&= \begin{cases}
{\displaystyle
\tanh\left(m\frac{x}{1-x^2}\right)}, & |x| < 1 \\
\\
\sgn(x) & |x| \ge 1 \\
\end{cases}\end{align}</math> using the hyperbolic tangent mentioned above. Here, <math>m</math> is a free parameter encoding the slope at <math>x=0</math>, which must be greater than or equal to <math>\sqrt{3}</math> because any smaller value will result in a function with multiple inflection points, which is therefore not a true sigmoid. This function is unusual because it actually attains the limiting values of −1 and 1 within a finite range, meaning that its value is constant at −1 for all <math>x \leq -1</math> and at 1 for all <math>x \geq 1</math>. Nonetheless, it is [[Smoothness|smooth]] (infinitely differentiable, <math>C^\infty</math>) ''everywhere'', including at <math>x = \pm 1</math>.
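The piecewise definition translates directly into code. A Python sketch using the hyperbolic-tangent form (the default slope <code>m = 2.0</code> is an arbitrary choice satisfying the constraint <math>m \geq \sqrt{3}</math> stated above):

```python
import math

def smooth_transition(x, m=2.0):
    """tanh(m*x/(1 - x^2)) for |x| < 1, sign(x) otherwise.

    m encodes the slope at x = 0 and must be >= sqrt(3) for the
    function to have exactly one inflection point.
    """
    if x <= -1.0:
        return -1.0
    if x >= 1.0:
        return 1.0
    return math.tanh(m * x / (1.0 - x * x))

# Unlike the logistic sigmoid, the limits -1 and 1 are attained exactly.
assert smooth_transition(1.0) == 1.0
assert smooth_transition(-5.0) == -1.0
```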
 
== Applications ==
== See also ==
{{Commons category|Sigmoid functions}}
{{div col|colwidth=30em}}
* {{annotated link|Step function}}
* {{annotated link|Sign function}}
* {{annotated link|Heaviside step function}}
* {{annotated link|Logistic regression}}
* {{annotated link|Logit}}
* {{annotated link|Softplus function}}
* {{annotated link|Soboleva modified hyperbolic tangent}}
* {{annotated link|Softmax function}}
* {{annotated link|Swish function}}
* {{annotated link|Weibull distribution}}
* {{annotated link|Fermi–Dirac statistics}}
{{div col end}}