Sigmoid function
From Wikipedia, the free encyclopedia
A sigmoid function is a mathematical function that produces a sigmoid curve — a curve having an "S" shape. Often, sigmoid function refers to the special case of the logistic function shown at right and defined by the formula:
- <math>P(t) = \frac{1}{1 + e^{-t}}</math>
Contents |
[edit] Members of the sigmoid family
In general, a sigmoid function is real-valued and differentiable, having a non-negative or non-positive first derivative, one local minimum, and one local maximum.
Besides the logistic function, sigmoid functions include the ordinary arc-tangent, the hyperbolic tangent, and the error function. The integral of any smooth, positive, "bump-shaped" function will be sigmoidal, thus the cumulative distribution functions for many common probability distributions are sigmoidal.
The logistic sigmoid function is related to the hyperbolic tangent, e.g., by
- <math>1-2\frac{1}{1+e^{-x}} = - \tanh\frac{x}{2}</math>
[edit] Sigmoid functions in neural networks
Sigmoid functions are often used in neural networks to introduce nonlinearity in the model and/or to make sure that certain signals remain within a specified range. A popular neural net element computes a linear combination of its input signals, and applies a bounded sigmoid function to the result; this model can be seen as a "smoothed" variant of the classical threshold neuron.
A reason for its popularity in neural networks is because the sigmoid function satisfies this property:
- <math>\frac{d}{dt}{\rm sig}(t) = {\rm sig}(t) \left ( 1 - {\rm sig}(t) \right ) </math>
This simple polynomial relationship between the derivative and itself is computationally easy to perform.
[edit] Double sigmoid function
The double sigmoid is a function similar to the sigmoid function with numerous applications. Its general formula is:
- <math> y = \mbox{sign}(x-d) \, \Bigg(1-\exp\bigg(-\bigg(\frac{x-d}{s}\bigg)^2\bigg)\Bigg), </math>
where d is its centre and s is the steepness factor.
It is based on the Gaussian curve and graphically it is similar to two identical sigmoids bonded together at the point x = d.
One of its applications is non-linear normalization of a sample, as it has the property of eliminating outliers.


