Chain rule

From Wikipedia, the free encyclopedia

In calculus, the chain rule is a formula for the derivative of the composite of two functions.

Explanation

In intuitive terms, if a variable, y, depends on a second variable, u, which in turn depends on a third variable, x, then the rate of change of y with respect to x is the product of the rate of change of y with respect to u and the rate of change of u with respect to x.


The chain rule may be stated in any of several equivalent forms:

<math> (f \circ g)'(x) = (f(g(x)))' = f'(g(x)) g'(x),\,</math>

or in the Leibniz notation

<math>\frac {df}{dx} = \frac {df} {dg} \cdot \frac {dg}{dx}.</math>

In integration, the counterpart to the chain rule is the substitution rule.
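The forms of the chain rule stated above can be sanity-checked numerically. The following is a minimal sketch, not part of the original article: the inner function `g`, the outer function `f`, the helper `derivative` and the evaluation point are all illustrative assumptions. It compares a finite-difference derivative of the composite with the product f'(g(x)) g'(x).

```python
import math

def derivative(func, x, step=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def g(x):
    # Inner function (illustrative choice).
    return x ** 2 + 1

def f(u):
    # Outer function (illustrative choice).
    return math.sin(u)

def composite(x):
    return f(g(x))

def chain_rule_derivative(x):
    # f'(g(x)) * g'(x), with each factor approximated numerically.
    return derivative(f, g(x)) * derivative(g, x)

x0 = 0.7
direct = derivative(composite, x0)       # derivative of f(g(x)) taken directly
via_rule = chain_rule_derivative(x0)     # product of the two separate rates
print(direct, via_rule)
```

The two printed values should agree up to the finite-difference error, which is small but not zero.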

Example

Suppose, for example, that one is climbing a mountain at a rate of 0.5 kilometres per hour. The temperature is lower at higher elevations; suppose it decreases at a rate of 6 °C per kilometre. Multiplying 6 °C per kilometre by 0.5 kilometres per hour gives 3 °C per hour, the rate at which the temperature falls for the climber. This calculation is a typical application of the chain rule.
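The arithmetic of the mountain example can be written out as a short sketch. The variable names and the sign convention (a negative rate meaning the temperature falls) are assumptions made here for illustration; the figures are those given in the example.

```python
# dh/dt: climbing rate in kilometres per hour (from the example).
climb_rate_km_per_h = 0.5

# dT/dh: temperature change per kilometre of elevation.
# The sign is an assumption: negative because temperature falls with height.
lapse_rate_c_per_km = -6.0

def temperature_rate(dT_dh, dh_dt):
    """Chain rule: dT/dt = (dT/dh) * (dh/dt)."""
    return dT_dh * dh_dt

rate_c_per_h = temperature_rate(lapse_rate_c_per_km, climb_rate_km_per_h)
print(rate_c_per_h)  # magnitude 3 degrees Celsius per hour, negative sign = cooling
```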

The general power rule

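The general power rule (GPR), <math>\frac{d}{dx}\left[g(x)\right]^n = n\left[g(x)\right]^{n-1} g'(x)</math>, follows from the chain rule by taking the outer function to be <math>h(u) = u^n</math>.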

Example I

Consider <math>f(x) = (x^2 + 1)^3</math>. We have <math>f(x)=h(g(x))</math> where <math>g(x) = x^2 + 1</math> and <math>h(x) = x^3.</math> Thus,

<math>f '(x) \,</math> <math>= 3(x^2 + 1)^2(2x) \,</math>
<math>= 6x(x^2 + 1)^2. \,</math>


In order to differentiate the trigonometric function

<math>f(x) = \sin(x^2),\,</math>

one can write <math>f(x) = h(g(x))</math> with <math>h(x) = \sin x</math> and <math>g(x) = x^2</math>. The chain rule then yields

<math>f'(x) = 2x \cos(x^2) \,</math>

since <math>h'(g(x)) = \cos (x^2)</math> and <math>g'(x) = 2x</math>.
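Both derivatives in this example can be checked against a finite-difference approximation. This is only a sketch: the helper `derivative`, the function names and the sample points are assumptions chosen for illustration.

```python
import math

def derivative(func, x, step=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def power_example(x):
    return (x ** 2 + 1) ** 3

def power_example_prime(x):
    # Result derived in the text: 6x(x^2 + 1)^2
    return 6 * x * (x ** 2 + 1) ** 2

def sine_example(x):
    return math.sin(x ** 2)

def sine_example_prime(x):
    # Result derived in the text: 2x cos(x^2)
    return 2 * x * math.cos(x ** 2)

def max_error(func, func_prime, points):
    """Largest gap between the closed form and the numerical estimate."""
    return max(abs(derivative(func, x) - func_prime(x)) for x in points)

sample_points = (-1.5, 0.0, 0.3, 2.0)  # arbitrary test points
print(max_error(power_example, power_example_prime, sample_points))
print(max_error(sine_example, sine_example_prime, sample_points))
```

Both printed errors should be tiny compared with the derivative values themselves.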

Example II

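To differentiate <math>\arctan(\sin x)</math>, start from the derivative of the arctangent:

<math>\frac{d}{dx}\arctan x = \frac{1}{1+x^2}.</math>

By the chain rule, for any differentiable function f,

<math>\frac{d}{dx}\arctan f(x) = \frac{f'(x)}{1+f(x)^2}.</math>

Taking <math>f(x) = \sin x</math> gives

<math>\frac{d}{dx}\arctan(\sin x) = \frac{\cos x}{1+\sin^2 x}.</math>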
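The intermediate rule for the arctangent of an arbitrary function can be expressed as a small higher-order function. This is a sketch under stated assumptions: the names `arctan_composite_derivative` and `numeric_derivative` are hypothetical helpers introduced here, and the evaluation point is arbitrary.

```python
import math

def arctan_composite_derivative(f, f_prime):
    """Return the function x -> f'(x) / (1 + f(x)^2), the derivative of arctan(f(x))."""
    def derived(x):
        return f_prime(x) / (1.0 + f(x) ** 2)
    return derived

def numeric_derivative(func, x, step=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Specialise to f = sin, f' = cos, as in the example.
d_arctan_sin = arctan_composite_derivative(math.sin, math.cos)

x0 = 1.2
closed_form = math.cos(x0) / (1.0 + math.sin(x0) ** 2)
numeric = numeric_derivative(lambda x: math.atan(math.sin(x)), x0)
print(d_arctan_sin(x0), closed_form, numeric)
```

Passing a different pair `(f, f_prime)` reuses the same rule for other inner functions.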

Chain rule for several variables

The chain rule also applies to functions of more than one variable. Consider the function <math>z = f(x,y)</math> where <math>x = g(t)</math> and <math>y = h(t)</math>. Since z then depends on the single variable t, its derivative is an ordinary (total) derivative:

<math>{dz \over dt}={\partial f \over \partial x}{dx \over dt}+{\partial f \over \partial y}{dy \over dt}</math>
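The formula for the derivative of z with respect to t can be illustrated with concrete functions. In this sketch the choices of `f`, `g` and `h`, their hand-computed derivatives and the test point are assumptions made for illustration only.

```python
import math

def derivative(func, t, step=1e-6):
    """Central-difference approximation of func'(t)."""
    return (func(t + step) - func(t - step)) / (2 * step)

# Illustrative choices, not taken from the article.
def g(t):
    return math.cos(t)          # x = g(t)

def h(t):
    return math.sin(t)          # y = h(t)

def f(x, y):
    return x ** 2 * y           # z = f(x, y)

# Partial derivatives of f and ordinary derivatives of g and h, computed by hand.
def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x ** 2

def g_prime(t):
    return -math.sin(t)

def h_prime(t):
    return math.cos(t)

def z(t):
    return f(g(t), h(t))

def dz_dt(t):
    # (df/dx)(dx/dt) + (df/dy)(dy/dt), evaluated along the curve (g(t), h(t)).
    x, y = g(t), h(t)
    return f_x(x, y) * g_prime(t) + f_y(x, y) * h_prime(t)

t0 = 0.9
print(dz_dt(t0), derivative(z, t0))
```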

Suppose that <math>z = f(u,v)</math> is a function of two variables, where <math>u = h(x,y)</math> and <math>v = g(x,y)</math> are themselves functions of two variables, and that all of these functions are differentiable. Then the chain rule reads:

<math>{\partial z \over \partial x}={\partial z \over \partial u}{\partial u \over \partial x}+{\partial z \over \partial v}{\partial v \over \partial x}</math>


<math>{\partial z \over \partial y}={\partial z \over \partial u}{\partial u \over \partial y}+{\partial z \over \partial v}{\partial v \over \partial y}</math>


If we regard <math>\vec r = (u,v)</math> above as a vector function, the same result can be written in vector notation as the dot product of the gradient of f and the derivative of <math>\vec r</math>:

<math>\frac{\partial f}{\partial x}=\vec \nabla f \cdot \frac{\partial \vec r}{\partial x}</math>

More generally, for functions from vectors to vectors, the chain rule says that the Jacobian matrix of a composite function is the product of the Jacobian matrices of the two functions:

<math>\frac{\partial(z_1,\ldots,z_m)}{\partial(x_1,\ldots,x_p)} = \frac{\partial(z_1,\ldots,z_m)}{\partial(y_1,\ldots,y_n)} \frac{\partial(y_1,\ldots,y_n)}{\partial(x_1,\ldots,x_p)}</math>
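The Jacobian form of the rule can be checked numerically with plain Python lists. This is a minimal sketch: the maps `inner` and `outer`, the helpers `jacobian` and `matmul`, and the evaluation point are all assumptions introduced here, with p = 2, n = 3 and m = 2.

```python
import math

def jacobian(func, point, step=1e-6):
    """Finite-difference Jacobian of func at point; rows are outputs, columns are inputs."""
    columns = []
    for j in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[j] += step
        minus[j] -= step
        f_plus = func(plus)
        f_minus = func(minus)
        columns.append([(a - b) / (2 * step) for a, b in zip(f_plus, f_minus)])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(len(point))] for i in range(len(columns[0]))]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inner(p):
    # y = y(x): maps R^2 to R^3 (illustrative choice).
    x1, x2 = p
    return [x1 * x2, math.sin(x1), x2 ** 2]

def outer(q):
    # z = z(y): maps R^3 to R^2 (illustrative choice).
    y1, y2, y3 = q
    return [y1 + y2 * y3, math.exp(y1) - y3]

def composite(p):
    return outer(inner(p))

x0 = [0.4, -1.1]
direct = jacobian(composite, x0)                                    # 2 x 2
product = matmul(jacobian(outer, inner(x0)), jacobian(inner, x0))   # (2 x 3)(3 x 2)
print(direct)
print(product)
```

The two matrices should match entry by entry up to the finite-difference error. Note that the Jacobian of the outer map is evaluated at `inner(x0)`, not at `x0`.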

Proof of the chain rule

Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,

<math> g(x+\delta)-g(x)= \delta g'(x) + \epsilon(\delta)\delta \,</math> where <math> \epsilon(\delta) \to 0 \,</math> as <math>\delta\to 0.</math>

Similarly,

<math> f(g(x)+\alpha) - f(g(x)) = \alpha f'(g(x)) + \eta(\alpha)\alpha \,</math> where <math>\eta(\alpha) \to 0 \,</math> as <math>\alpha\to 0. \,</math>

Now

<math> f(g(x+\delta))-f(g(x))\, </math> <math>= f(g(x) + \delta g'(x)+\epsilon(\delta)\delta) - f(g(x)) \,</math>
<math> = \alpha_\delta f'(g(x)) + \eta(\alpha_\delta)\alpha_\delta \,</math>

where <math>\alpha_\delta = \delta g'(x) + \epsilon(\delta)\delta \,</math>. Observe that as <math>\delta\to 0,</math> <math>\frac{\alpha_\delta}{\delta}\to g'(x)</math> and <math>\alpha_\delta \to 0</math>, thus <math>\eta(\alpha_\delta)\to 0</math>. Therefore

<math> \frac{f(g(x+\delta))-f(g(x))}{\delta} \to g'(x)f'(g(x))\mbox{ as } \delta \to 0.</math>

The fundamental chain rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean space) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative (the Fréchet derivative) of the composition g ∘ f at the point x is given by

<math>\mbox{D}_x\left(g \circ f\right) = \mbox{D}_{f\left(x\right)}\left(g\right) \circ \mbox{D}_x\left(f\right).</math>

Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right hand side turns into a matrix multiplication.

A particularly clear formulation of the chain rule can be achieved in the most general setting: let M, N and P be <math>C^k</math> manifolds (or even Banach manifolds) and let

f : M → N and g : N → P

be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

<math>\mbox{d}\left(g \circ f\right) = \mbox{d}g \circ \mbox{d}f.</math>

In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of <math>C^\infty</math> manifolds with <math>C^\infty</math> maps as morphisms.

Tensors and the chain rule

See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.

Higher derivatives

Faà di Bruno's formula generalizes the chain rule to higher derivatives. The first few derivatives are

<math>\frac{df}{dx} = \frac{df}{dg}\frac{dg}{dx}</math>

<math>
 \frac{d^2 f}{d x^2}
 = \frac{d^2 f}{d g^2}\left(\frac{dg}{dx}\right)^2
   + \frac{df}{dg}\frac{d^2 g}{dx^2}
</math>

<math>
 \frac{d^3 f}{d x^3}
 = \frac{d^3 f}{d g^3} \left(\frac{dg}{dx}\right)^3
   + 3 \frac{d^2 f}{d g^2} \frac{dg}{dx} \frac{d^2 g}{d x^2}
   + \frac{df}{dg} \frac{d^3 g}{d x^3}
</math>

<math>
 \frac{d^4 f}{d x^4}
 = \frac{d^4 f}{dg^4} \left(\frac{dg}{dx}\right)^4
   + 6 \frac{d^3 f}{d g^3} \left(\frac{dg}{dx}\right)^2 \frac{d^2 g}{d x^2}
   + \frac{d^2 f}{d g^2} \left\{ 4 \frac{dg}{dx} \frac{d^3 g}{dx^3} + 3\left(\frac{d^2 g}{dx^2}\right)^2\right\}
   + \frac{df}{dg}\frac{d^4 g}{dx^4}
</math>
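The four formulas above can be transcribed directly and checked against a composite whose derivatives are known in closed form. In this sketch the function name `composite_derivatives` and the test case exp(x^2), with f = exp and g(x) = x^2, are assumptions chosen for illustration; the closed forms in `expected` were obtained by differentiating exp(x^2) by hand.

```python
import math

def composite_derivatives(fd, gd):
    """First four derivatives of f(g(x)) with respect to x.

    fd = (f', f'', f''', f'''') evaluated at g(x);
    gd = (g', g'', g''', g'''') evaluated at x.
    """
    f1, f2, f3, f4 = fd
    g1, g2, g3, g4 = gd
    d1 = f1 * g1
    d2 = f2 * g1 ** 2 + f1 * g2
    d3 = f3 * g1 ** 3 + 3 * f2 * g1 * g2 + f1 * g3
    d4 = (f4 * g1 ** 4
          + 6 * f3 * g1 ** 2 * g2
          + f2 * (4 * g1 * g3 + 3 * g2 ** 2)
          + f1 * g4)
    return d1, d2, d3, d4

x = 0.8                      # arbitrary evaluation point
e = math.exp(x ** 2)         # every derivative of exp, evaluated at g(x) = x^2
fd = (e, e, e, e)
gd = (2 * x, 2.0, 0.0, 0.0)  # derivatives of g(x) = x^2

computed = composite_derivatives(fd, gd)

# Derivatives of exp(x^2) worked out by hand for comparison.
expected = (
    2 * x * e,
    (2 + 4 * x ** 2) * e,
    (12 * x + 8 * x ** 3) * e,
    (12 + 48 * x ** 2 + 16 * x ** 4) * e,
)
print(computed)
print(expected)
```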
