The Chain Rule

Previously, we have defined differentiation, roughly speaking, as gradient-calculation. That is, the function y = f(x) has a derivative f'(t) at x = t if the tangent line to the curve y = f(x) at (t, f(t)) has equation

y = f'(t) (x-t) + f(t).

In this case, we write

\displaystyle \frac{\mathrm d }{ \mathrm dx }(f(x)) = f'(x).

The simplest functions to differentiate are the powers of x, which obey the power rule: for any rational number n,

\displaystyle \frac{\mathrm d}{\mathrm dx}(x^n) = nx^{n-1}.

Indeed, this result can be justified by adapting the calculations in this exercise.

Differentiation is “splittable” over addition:

\displaystyle \frac{\mathrm d}{\mathrm dx}(f(x) + g(x)) = \frac{\mathrm d}{\mathrm dx}(f(x)) + \frac{\mathrm d}{\mathrm dx}(g(x)).

It even works for functions scaled by a constant:

\displaystyle \frac{\mathrm d}{\mathrm dx}(c \cdot f(x) ) = c \cdot \frac{\mathrm d}{\mathrm dx}(f(x)).

That is, differentiation satisfies linearity.

But as discussed previously, differentiation does not split over products:

\displaystyle \frac{\mathrm d}{\mathrm dx}(f(x) \cdot g(x)) \neq \frac{\mathrm d}{\mathrm dx}(f(x)) \cdot \frac{\mathrm d}{\mathrm dx}(g(x)).

Nor do we get splitting over function-in-function combinations (i.e. compositions),

\displaystyle \frac{\mathrm d}{\mathrm dx}(f( g(x) )) \neq f'( g(x) ).

However, it is still possible to evaluate their derivatives.

Example 1. Define f(x) = x^2 and g(x) = x^3 + 1. Show that

\displaystyle  f( g(x) ) = (x^3 + 1)^2,

and hence, check that

\displaystyle \frac{\mathrm d}{\mathrm dx} ( f( g(x) ) ) = f'( g(x) ) \cdot g'(x).

Proof. By definition of the individual functions,

\begin{aligned} f( g(x) ) &= g(x)^2 \\ &= ( x^3 + 1 )^2 \\ &= (x^3)^2 + 2 \cdot x^3 \cdot 1 + 1^2 \\ &= x^6 + 2x^3 + 1. \end{aligned}

By the linearity of differentiation,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}( f( g(x) ) ) &= \frac{\mathrm d}{\mathrm dx} (x^6 + 2x^3 + 1) \\ &= \frac{\mathrm d}{\mathrm dx}( x^6 ) + 2 \cdot \frac{\mathrm d}{\mathrm dx} ( x^3 ) + \frac{\mathrm d}{\mathrm dx} ( 1 ) \\ &= 6x^5 + 2 \cdot 3x^2 + 0 \\ &= 6x^5 + 6x^2.\end{aligned}

On the other hand,

f'(x) = \displaystyle \frac{\mathrm d}{\mathrm dx}(f(x)) = \frac{\mathrm d}{\mathrm dx}(x^2) = 2x

and using linearity,

\begin{aligned} g'(x) = \frac{\mathrm d}{\mathrm dx}(g(x)) &= \frac{\mathrm d}{\mathrm dx}(x^3+1) \\ &= \frac{\mathrm d}{\mathrm dx}(x^3) + \frac{\mathrm d}{\mathrm dx}(1) \\ &= 3x^2 + 0 = 3x^2. \end{aligned}

Therefore,

\begin{aligned} f'( g(x) ) \cdot g'(x) &= 2 \cdot g(x) \cdot g'(x) \\ &= 2 \cdot (x^3 + 1) \cdot 3x^2 \\ &= 6x^2 \cdot (x^3 + 1) \\ &= 6x^2 \cdot x^3 + 6x^2 \cdot 1 \\ &= 6x^5 + 6x^2. \end{aligned}

Hence,

\displaystyle \frac{\mathrm d}{\mathrm dx}( f( g(x) ) ) = 6x^5 + 6x^2 = f'(g(x)) \cdot g'(x).

This result is true in general, and is known as the chain rule.

Theorem 1 (Chain Rule). For functions f(x), g(x) with derivatives f'(x), g'(x),

\displaystyle \frac{\mathrm d}{\mathrm dx} ( f( g(x) ) )  = f'( g(x) ) \cdot g'(x).

Writing u = g(x) so that \displaystyle \frac{ \mathrm du }{ \mathrm dx } = g'(x),

\displaystyle \frac{\mathrm d}{\mathrm dx} ( f( u ) )  = f'( u ) \cdot \frac{ \mathrm du }{ \mathrm dx }.

Proof. See this post.
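As a sanity check (not a proof), we can compare both sides of the chain rule numerically for the functions of Example 1. The sketch below is a minimal illustration: the helper central_difference, its step size, and the sample points are our own assumptions rather than part of the theorem, and a small gap only suggests agreement up to rounding.

```python
def f(x):
    return x ** 2


def g(x):
    return x ** 3 + 1


def f_prime(x):
    return 2 * x


def g_prime(x):
    return 3 * x ** 2


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient (assumed helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


def chain_rule_gap(x):
    """Numerical slope of f(g(x)) minus the chain-rule prediction f'(g(x)) * g'(x)."""
    numerical = central_difference(lambda t: f(g(t)), x)
    predicted = f_prime(g(x)) * g_prime(x)
    return numerical - predicted


# Arbitrary sample points; each gap should be close to zero.
sample_points = [-1.5, -0.5, 0.0, 0.5, 2.0]
gaps = [chain_rule_gap(x) for x in sample_points]
print(gaps)
```

The printed gaps are tiny compared with the derivative values themselves, which is consistent with Theorem 1.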

Example 2. Show that \displaystyle \frac{\mathrm d}{\mathrm dx}(h(x)^n) = n \cdot h(x)^{n-1} \cdot h'(x).

Solution. Setting f(x) = x^n and g(x) = h(x), using the chain rule,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}(h(x)^n)= \frac{\mathrm d}{\mathrm dx}( f(g(x)) ) &= f'(g(x)) \cdot g'(x) \\ &= f'(h(x)) \cdot h'(x). \end{aligned}

Using the power rule, f'(x) = nx^{n-1} implies that f'(h(x)) = n \cdot h(x)^{n-1}:

\begin{aligned} \frac{\mathrm d}{\mathrm dx}(h(x)^n) &= f'(h(x)) \cdot h'(x) \\ &= n \cdot h(x)^{n-1} \cdot h'(x). \end{aligned}

Remark 1. In particular, setting n = 2 and n = -1 respectively,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}(h(x)^2) &= 2 \cdot h(x) \cdot h'(x), \\ \frac{\mathrm d}{\mathrm dx} \left( \frac 1{ h(x) }\right) &= - \frac{ h'(x) }{ h(x)^2 }. \end{aligned}
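To see Remark 1 in action, here is a rough numerical sketch. The choice h(x) = x^2 + 1 is a hypothetical sample (picked because it is never zero, so 1/h(x) is always defined), and central_difference is an assumed helper, not something defined earlier in this post.

```python
def h(x):
    # Hypothetical sample function; never zero, so 1 / h(x) is defined everywhere.
    return x ** 2 + 1


def h_prime(x):
    return 2 * x


def square_derivative(x):
    """Remark 1 with n = 2: d/dx (h(x)^2) = 2 * h(x) * h'(x)."""
    return 2 * h(x) * h_prime(x)


def reciprocal_derivative(x):
    """Remark 1 with n = -1: d/dx (1 / h(x)) = -h'(x) / h(x)^2."""
    return -h_prime(x) / h(x) ** 2


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient (assumed helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.0
square_gap = central_difference(lambda t: h(t) ** 2, x0) - square_derivative(x0)
reciprocal_gap = central_difference(lambda t: 1 / h(t), x0) - reciprocal_derivative(x0)
print(square_gap, reciprocal_gap)
```

Both gaps should be negligible, which agrees with the two formulas in Remark 1 at this sample point.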

Example 3. Define h(x) = f(x) + g(x). Show that

\begin{aligned} \frac{\mathrm d}{\mathrm dx}( h(x)^2 ) &= \frac{\mathrm d}{\mathrm dx}(f(x)^2 ) + \frac{\mathrm d}{\mathrm dx}(g(x)^2 ) \\ &\phantom{==} + 2 \cdot (f'(x) \cdot g(x) + g'(x) \cdot f(x) ).\end{aligned}

Solution. Using Remark 1,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}( h(x)^2 ) &= 2 \cdot h(x) \cdot h'(x). \end{aligned}

Using linearity,

\begin{aligned} h'(x) = \frac{\mathrm d}{\mathrm dx}(h(x)) &= \frac{\mathrm d}{\mathrm dx}( f(x) + g(x) ) \\ &= \frac{\mathrm d}{\mathrm dx}(f(x)) + \frac{\mathrm d}{\mathrm dx}(g(x)) \\ &= f'(x) + g'(x). \end{aligned}

Together with h(x) = f(x) + g(x),

\begin{aligned} \frac{\mathrm d}{\mathrm dx}( h(x)^2 ) &= 2 \cdot (f(x) + g(x)) \cdot (f'(x) + g'(x)) \\ &= 2 \cdot f(x) \cdot f'(x) + 2 \cdot g(x) \cdot g'(x) \\ &\phantom{==} + 2 \cdot (f'(x) \cdot g(x) + g'(x) \cdot f(x)). \end{aligned}

By Remark 1 again,

\displaystyle \frac{\mathrm d}{\mathrm dx}(f(x)^2) = 2 \cdot f(x) \cdot f'(x), \quad \frac{\mathrm d}{\mathrm dx}(g(x)^2) = 2 \cdot g(x) \cdot g'(x).

Hence,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}( h(x)^2 ) &= \frac{\mathrm d}{\mathrm dx}(f(x)^2 ) + \frac{\mathrm d}{\mathrm dx}(g(x)^2 ) \\ &\phantom{==} + 2 \cdot (f'(x) \cdot g(x) + g'(x) \cdot f(x) ).\end{aligned}
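The identity in Example 3 can also be spot-checked with exact arithmetic. In this sketch we reuse f(x) = x^2 and g(x) = x^3 + 1 from Example 1 purely as sample inputs (an assumption for illustration; Example 3 holds for any differentiable f and g), and we evaluate both sides at a few rational points using Python's Fraction type so that no rounding is involved.

```python
from fractions import Fraction


def f(x):
    return x ** 2


def f_prime(x):
    return 2 * x


def g(x):
    return x ** 3 + 1


def g_prime(x):
    return 3 * x ** 2


def left_side(x):
    """2 * h(x) * h'(x), where h = f + g and h' = f' + g' by linearity."""
    h_value = f(x) + g(x)
    h_slope = f_prime(x) + g_prime(x)
    return 2 * h_value * h_slope


def right_side(x):
    """d/dx(f^2) + d/dx(g^2) + 2 * (f' * g + g' * f), using Remark 1 for the squares."""
    squares = 2 * f(x) * f_prime(x) + 2 * g(x) * g_prime(x)
    cross_terms = 2 * (f_prime(x) * g(x) + g_prime(x) * f(x))
    return squares + cross_terms


# Arbitrary rational sample points; Fraction keeps every step exact.
points = [Fraction(-3, 2), Fraction(0), Fraction(1, 3), Fraction(2)]
matches = [left_side(x) == right_side(x) for x in points]
print(matches)
```

Agreement at a handful of points is only evidence, of course; the algebra in the solution above is what actually establishes the identity.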

Remark 2. Example 3 helps us prove the product rule, which, in turn, together with the second result in Remark 1, helps us prove the quotient rule. We will visit both results next time.

The chain rule empowers us to differentiate all sorts of functions.

Example 4. Evaluate \displaystyle \frac{\mathrm d}{\mathrm dx}((x^{67} + 89)^{100}).

Solution. While this expression looks terrifying (and tragically anti-funny), the chain rule renders the problem trivial. By Example 2 and linearity,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}((x^{67} + 89)^{100}) &=  100 (x^{67} + 89)^{99} \cdot \frac{\mathrm d}{\mathrm dx}( (x^{67} + 89) ) \\ &= 100 (x^{67} + 89)^{99} \cdot \left( \frac{\mathrm d}{\mathrm dx}( x^{67} ) + \frac{\mathrm d}{\mathrm dx} ( 89) \right) \\ &= 100 (x^{67} + 89)^{99} \cdot ( 67x^{66} + 0 ) \\ &= 100 (x^{67} + 89)^{99} \cdot 67x^{66} \\ &= 6700 x^{66} (x^{67} + 89)^{99}. \end{aligned}
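Since the numbers here are enormous, a quick numerical comparison is a reasonable guard against slips in the exponents. The sketch below evaluates the product 100 (x^{67} + 89)^{99} \cdot 67x^{66} from the working above at the sample point x = 1 and compares it with a difference quotient. The helper central_difference, the step size, and the choice of sample point are our own assumptions. We measure the relative error because the values are far too large for an absolute comparison to be meaningful.

```python
def p(x):
    return (x ** 67 + 89) ** 100


def p_prime(x):
    """Chain-rule result from the working: 100 * (x^67 + 89)^99 * 67 * x^66."""
    return 100 * (x ** 67 + 89) ** 99 * 67 * x ** 66


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient (assumed helper)."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.0  # arbitrary sample point; values stay within floating-point range here
numerical = central_difference(p, x0)
relative_error = abs(numerical - p_prime(x0)) / abs(p_prime(x0))
print(relative_error)
```

A relative error close to zero suggests the exponent bookkeeping is right; an off-by-one exponent on the bracket would change the answer by a factor of 90 at this point and would be caught immediately.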

Example 5. For any constant r > 0, evaluate \displaystyle \frac{\mathrm d}{\mathrm dx}(\sqrt{r^2 - x^2}).

Solution. Recall that using the power rule,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}(\sqrt{x}) &= \frac{\mathrm d}{\mathrm dx}(x^{1/2}) = \frac 12 x^{-1/2} = \frac 1{2 \sqrt x}. \end{aligned}

Hence, using the chain rule (or Example 2) and linearity,

\begin{aligned} \frac{\mathrm d}{\mathrm dx}(\sqrt{r^2 - x^2}) &= \frac 1{ 2\sqrt{ r^2 - x^2} } \cdot \frac{\mathrm d}{\mathrm dx}(r^2-x^2) \\ &= \frac 1{ 2\sqrt{ r^2- x^2} } \cdot \left( \frac{\mathrm d}{\mathrm dx}(r^2)-\frac{\mathrm d}{\mathrm dx}(x^2) \right) \\ &= \frac 1{ 2\sqrt{ r^2 - x^2} } \cdot \left( 0 - 2x \right) \\ &= -\frac{2x}{ 2\sqrt{ r^2 - x^2}  } \\ &= -\frac{x}{\sqrt{ r^2-x^2}}.\end{aligned}

Remark 3. Example 5 gives us yet another proof that the radius of a circle must be perpendicular to its tangent.
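To make Remark 3 concrete: at a point (x, y) on the upper semicircle y = \sqrt{r^2 - x^2}, the radius from the origin has slope y/x, while Example 5 gives the tangent slope -x/y, so the two slopes multiply to -1. The sketch below checks this for a sample radius r = 5. The function names, the radius, and the sample x-values are illustrative assumptions, and we need 0 < |x| < r so that both slopes are defined.

```python
import math


def semicircle(x, r):
    """Height of the upper semicircle of radius r at horizontal position x."""
    return math.sqrt(r * r - x * x)


def tangent_slope(x, r):
    """Derivative from Example 5: -x / sqrt(r^2 - x^2)."""
    return -x / math.sqrt(r * r - x * x)


def radius_slope(x, r):
    """Slope of the segment from the origin to (x, semicircle(x, r)); needs x != 0."""
    return semicircle(x, r) / x


r = 5.0  # sample radius (assumption)
sample_xs = (-4.0, -3.0, 1.0, 3.0)
products = [tangent_slope(x, r) * radius_slope(x, r) for x in sample_xs]
print(products)
```

Each product should equal -1 up to rounding, which is exactly the perpendicularity condition for two non-vertical lines.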

The chain rule is, arguably, the most powerful theorem pertaining to differentiation. We can use it to prove the product rule and the quotient rule, and these latter rules help us compute expressions such as

\displaystyle \frac{\mathrm d}{\mathrm dx} ( f(x) \cdot g(x) )  \quad \text{and} \quad \frac{\mathrm d}{\mathrm dx} \left( \frac{ f(x) }{ g(x) } \right)

correctly. These we visit next time.

—Joel Kindiak, 8 Jan 26, 1925H
