Francais | English | Espanõl

Power iteration

From Wikipedia, the free encyclopedia

(Redirected from Power method)
Jump to: navigation, search

In mathematics, power iteration is an eigenvalue algorithm: given a matrix A, the algorithm will produce a number λ (the eigenvalue) and a nonzero vector v (the eigenvector), such that Av = λv.

The power iteration is a very simple algorithm. It does not compute a matrix decomposition, and hence it can be used when A is very large sparse matrix. However, it will find only one eigenvalue (the one with the greatest absolute value) and it may converge only slowly.

Contents

[edit] The method

The power iteration algorithm starts with a vector b0, which may be an approximation to the dominant eigenvector or a random vector. The method is described by the iteration

<math> b_{k+1} = \frac{Ab_k}{\|Ab_k\|}. </math>

So, at every iteration, the vector bk is multiplied by the matrix A and normalized.

Under the assumptions:

  • A has an eigenvalue that is strictly greater in magnitude than its other eigenvalues
  • The starting vector <math>b_{0}</math> has a nonzero component in the direction of an eigenvector associated with the dominant eigenvalue.

then:

  • A subsequence of <math>\left( b_{k} \right)</math> converges to an eigenvector associated with the dominant eigenvalue

Note that the sequence <math>\left( b_{k} \right)</math> does not necessarily converge. It can be shown that:
<math>b_{k} = e^{i \phi k} v_{1} + r_{k}</math> where: <math>v_{1}</math> is an eigenvector associated with the dominant eigenvalue, and <math> \| r_{k} \| \rightarrow 0</math>. The presence of the term <math>e^{i \phi k}</math> implies that <math>\left( b_{k} \right) </math> does not converge unless <math>e^{i \phi k} = 1</math> Under the two assumptions listed above, the sequence <math>\left( \mu_{k} \right)</math> defined by: <math>\mu_{k} = \frac{b_{k}^{*}Ab_{k}}{b_{k}^{*}b_{k}}</math> converges to the dominant eigenvalue.


The method can also be used to calculate the spectral radius of a matrix by computing the Rayleigh quotient

<math> \frac{b_k^\top A b_k}{b_k^\top b_k} = \frac{b_{k+1}^\top b_k}{b_k^\top b_k}. </math>

[edit] Analysis

Let <math>A</math> be decomposed into its Jordan canonical form: <math>A=VJV^{-1}</math>, where the first column of <math>V</math> is an eigenvector of <math>A</math> corresponding to the dominant eigenvalue <math>\lambda_{1}</math>. Since the dominant eigenvalue of <math>A</math> is unique, the first Jordan block of <math>J</math> is the <math>1 \times 1</math> matrix <math>\begin{bmatrix} \lambda_{1} \end{bmatrix} </math>, where <math>\lambda_{1}</math> is the largest eigenvalue of A in magnitude. The starting vector <math>b_{0}</math> can be written as a linear combination of the columns of V: <math>b_{0} = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n}</math>. By assumption, <math>b_{0}</math> has a nonzero component in the direction of the dominant eigenvalue, so <math>c_{1} \ne 0</math>. The computationally useful recurrence relation for <math>b_{k+1}</math> can be rewritten as: <math>b_{k+1}=\frac{Ab_{k}}{\|Ab_{k}\|}=\frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|}</math>, where the expression: <math>\frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|}</math> is more amenable to the following analysis.
<math> \begin{matrix} b_{k} &=& \frac{A^{k}b_{0}}{\| A^{k} b_{0} \|} \\

     &=& \frac{\left( VJV^{-1} \right)^{k} b_{0}}{\|\left( VJV^{-1} \right)^{k}b_{0}\|} \\
     &=& \frac{ VJ^{k}V^{-1} b_{0}}{\| V J^{k} V^{-1} b_{0}\|} \\
     &=& \frac{ VJ^{k}V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)}
              {\| V J^{k} V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)\|} \\
     &=& \frac{ VJ^{k}\left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}
               {\| V J^{k} \left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \|} \\
     &=& \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|}
         \frac{ v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} 
                     \left( c_{2}e_{2} +  \cdots + c_{n}e_{n} \right)}
              {\| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} 
                     \left( c_{2}e_{2} +  \cdots + c_{n}e_{n} \right) \| }
          

\end{matrix} </math>
The expression above simplifies as <math>k \rightarrow \infty </math>
<math> \left( \frac{1}{\lambda_{1}} J \right)^{k} = \begin{bmatrix} [1] & & & & \\ & \left( \frac{1}{\lambda_{1}} J_{2} \right)^{k}& & & \\ & & \ddots & \\ & & & \left( \frac{1}{\lambda_{1}} J_{m} \right)^{k} \\ \end{bmatrix} \rightarrow \begin{bmatrix} 1 & & & & \\ & 0 & & & \\ & & \ddots & \\ & & & 0 \\ \end{bmatrix} </math> as <math> k \rightarrow \infty </math>.
The limit follows from the fact that the eigenvalue of <math> \frac{1}{\lambda_{1}} J_{i} </math> is less than in 1 in magnitude, so <math> \left( \frac{1}{\lambda_{1}} J_{i} \right)^{k} \rightarrow 0 </math> as <math> k \rightarrow \infty </math>
It follows that:
<math> \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \rightarrow 0 </math> as <math> k \rightarrow \infty </math>
Using this fact, <math>b_{k}</math> can be written in a form that emphasizes its relationship with <math>v_{1}</math> when k is large:
<math> \begin{matrix} b_{k} &=& \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|}

         \frac{ v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} 
                     \left( c_{2}e_{2} +  \cdots + c_{n}e_{n} \right)}
              {\| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} 
                     \left( c_{2}e_{2} +  \cdots + c_{n}e_{n} \right) \| }
     &=& e^{i \phi k} \frac{c_{1}}{|c_{1}|} v_{1} + r_{k}

\end{matrix} </math> where <math> e^{i \phi} = \lambda_{1} / |\lambda_{1}|</math> and <math> \| r_{k} \| \rightarrow 0 </math> as <math>k \rightarrow \infty </math>
The sequence <math> \left( b_{k} \right)</math> is bounded, so it contains a convergent subsequence. Note that the eigenvector corresponding to the dominant eigenvalue is only unique up to a scalar, so although the sequence <math>\left(b_{k}\right)</math> may not converge, <math>b_{k}</math> is nearly an eigenvector of A for large k.

Alternatively, if A is diagonalizable, then the following proof yields the same result
Let λ1, λ2, …, λm be the m eigenvalues (counted with multiplicity) of A and let v1, v2, …, vm be the corresponding eigenvectors. Suppose that <math>\lambda_1</math> is the dominant eigenvalue, so that <math>|\lambda_1| > |\lambda_j|</math> for <math>j>1</math>.

The initial vector <math>b_0</math> can be written:

<math>b_0 = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{m}v_{m}.</math>

If <math>b_0</math> is chosen randomly (with uniform probability), then c1 ≠ 0 with probability 1. Now,

<math>\begin{matrix}A^{k}b_0 & = & c_{1}A^{k}v_{1} + c_{2}A^{k}v_{2} + \cdots + c_{m}A^{k}v_{m} \\

& = & c_{1}\lambda_{1}^{k}v_{1} + c_{2}\lambda_{2}^{k}v_{2} + \cdots + c_{m}\lambda_{m}^{k}v_{m} \\ & = & c_{1}\lambda_{1}^{k} \left( v_{1} + \frac{c_{2}}{c_{1}}\left(\frac{\lambda_{2}}{\lambda_{1}}\right)^{k}v_{2} + \cdots + \frac{c_{m}}{c_{1}}\left(\frac{\lambda_{m}}{\lambda_{1}}\right)^{k}v_{m}\right). \end{matrix}</math>

The expression within parentheses converges to <math>v_1</math> because <math>|\lambda_j/\lambda_1| < 1</math> for <math>j>1</math>. On the other hand, we have

<math> b_k = \frac{A^kb_0}{\|A^kb_0\|}. </math>

Therefore, <math>b_k</math> converges to (a multiple of) the eigenvector <math>v_1</math>. The convergence is geometric, with ratio

<math> \left| \frac{\lambda_1}{\lambda_2} \right|, </math>

where <math>\lambda_2</math> denotes the second dominant eigenvalue. Thus, the method converges slowly if there is an eigenvalue close in magnitude to the dominant eigenvalue.

[edit] Applications

Power iteration is not used very much because it can find only the dominant eigenvalue. Nevertheless, the algorithm is very useful in some specific situations. For instance, Google uses it to calculate the page rank of documents in their search engine. <ref>Ipsen, Ilse, and Rebecca M. Wills, Analysis and Computation of Google's PageRank, 7th IMACS International Symposium on Iterative Methods in Scientific Computing, Fields Institute, Toronto, Canada, 5–8 May 2005.</ref>

Some of the more advanced eigenvalue algorithms can be understood as variations of the power iteration. For instance, the inverse iteration method applies power iteration to the matrix <math>A^{-1}</math>. Other algorithms look at the whole subspace generated by the vectors <math>b_k</math>. This subspace is known as the Krylov subspace. It can be computed by Arnoldi iteration or Lanczos iteration.

[edit] References

<references />

[edit] External links

  • Power method, part of lecture notes on numerical linear algebra by E. Bruce Pitman, State University of New York.
  • Power method on www.math.gatech.edu
Personal tools