Francais | English | Espanõl

Metropolis-Hastings algorithm

From Wikipedia, the free encyclopedia

Jump to: navigation, search

The Proposal distribution Q proposes the next point that the random walk might move to.

In mathematics and physics, the Metropolis-Hastings algorithm is an algorithm used to generate a sequence of samples from a probability distribution that is difficult to directly sample from. This sequence can be used in Markov chain Monte Carlo to approximate the distribution (as with a histogram), or to compute an integral (such as an expected value). It is named after Nicholas Metropolis, who published it in 1953 for the specific case of the Boltzmann distribution, and W.K. Hastings,[1] who generalized it in 1970. The Gibbs sampling algorithm is a special case of the Metropolis-Hastings algorithm.

The Metropolis-Hastings algorithm can draw samples from any probability distribution <math>p(x)</math>, requiring only that the density can be calculated at <math>x</math>. The algorithm generates a Markov chain in which each state <math>x^t</math> depends only on the previous state <math>x^{t-1}</math>. The algorithm uses a proposal density <math>Q(x'; x^t)</math>, which depends on the current state <math>x^t</math>, to generate a new proposed sample <math>x'</math>. This proposal is 'accepted' as the next value (<math>x^{t+1}</math>=<math>x'</math>) if <math>u</math> drawn from <math>U(0,1)</math> is

<math>

u < \frac{P(x')Q(x^t|x')}{P(x^t)Q(x'|x^t)} </math>

For example, the proposal density could be a Gaussian function centred on the current state <math>x^t</math>

<math>

Q( x'; x^t ) \sim N( x^t, \sigma^2 I) </math>

reading <math>Q(x'; x^t)</math> as the probability density function for <math>x'</math> given the previous value <math>x^t</math>. This proposal density would generate samples centred around the current state with variance <math>\sigma^2 I</math>. The original Metropolis algorithm calls for the proposal density to be symmetric ( <math>Q(x; y) = Q(y; x)</math> ); generalization by Hastings lifts this restriction. It is allowed for <math>Q(x', x^t)</math> not to depend on <math>x'</math> at all, in which case the algorithm is called "Independence Chain Metropolis-Hastings" ( as opposed to "Random Walk Metropolis-Hastings" ). Independence chain M-H algorithm with suitable proposal density function can offer higher accuracy than random walk version, but it requires some a priori knowledge of the distribution.

Now, we draw a new proposal state <math>x'</math> with probability <math>Q(x'; x^t)</math> and then calculate a value

<math>

a = a_1 a_2\, </math>

where

<math>

a_1 = \frac{P(x')}{P(x^t)} </math>

is the likelihood ratio between the proposed sample <math>x'</math> and the previous sample <math>x^t</math>, and

<math>

a_2 = \frac{Q( x^t; x' )}{Q(x';x^t)} </math>

is the ratio of the proposal density in two directions (from <math>x^t</math> to <math>x'</math> and vice versa). This is equal to 1 if the proposal density is symmetric. Then the new state <math>x^{t+1}</math> is chosen with the rule

<math>

x^{t+1} = \left \{

\begin{matrix}
 x'                            & \mbox{if }a > 1 \\
 x'\mbox{ with probability }a, & \mbox{if }a < 1
\end{matrix}

\right. </math>

The Markov chain is started from a random initial value <math>x^0</math> and the algorithm is run for many iterations until this initial state is "forgotten". These samples, which are discarded, are known as burn-in. The algorithm works best if the proposal density matches the shape of the target distribution <math>p(x)</math>, that is <math>Q(x'; x^t) \approx p(x')</math>, but in most cases this is unknown. If a Gaussian proposal is used the variance parameter <math>\sigma^2</math> has to be tuned during the burn-in period. This is usually done by calculating the acceptance rate, which is the fraction of proposed samples that is accepted in a window of the last <math>N</math> samples. This is usually set to be around 60%. If the proposal steps are too small the chain will mix slowly (i.e., it will move around the space slowly and converge slowly to <math>p(x)</math>). If the proposal steps are too large the acceptance rate will be very low because the proposals are likely to land in regions of much lower probability density so <math>a_1</math> will be very small.

[edit] See also

[edit] References

  • Bernd A. Berg. "Markov Chain Monte Carlo Simulations and Their Statistical Analysis". Singapore, World Scientific 2004.
  • Chib, Siddhartha and Edward Greenberg: "Understanding the Metropolis–Hastings Algorithm". American Statistician, 49(4), 327–335, 1995
  • W.K. Hastings. "Monte Carlo Sampling Methods Using Markov Chains and Their Applications", Biometrika, 57(1):97-109, 1970.
  • N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller. "Equations of State Calculations by Fast Computing Machines". Journal of Chemical Physics, 21(6):1087-1092, 1953. [2]

[edit] External links

it:Algoritmo di Metropolis-Hastings ja:メトロポリス法

Personal tools