
Forward–backward algorithm

The forward–backward algorithm is an inference algorithm for hidden Markov models which computes the posterior marginals of all hidden state variables given a sequence of observations/emissions $o_{1:T} := o_1, \dots, o_T$, i.e. it computes, for all hidden state variables $X_t \in \{X_1, \dots, X_T\}$, the distribution $P(X_t \mid o_{1:T})$. This inference task is usually called smoothing. The algorithm makes use of the principle of dynamic programming to efficiently compute the values that are required to obtain the posterior marginal distributions in two passes. The first pass goes forward in time while the second goes backward in time; hence the name forward–backward algorithm. The term forward–backward algorithm is also used to refer to any algorithm belonging to the general class of algorithms that operate on sequence models in a forward–backward manner. In this sense, the descriptions in the remainder of this article refer to only one specific instance of this class.
In the first pass, the forward–backward algorithm computes a set of forward probabilities which provide, for all $t \in \{1, \dots, T\}$, the probability of ending up in any particular state given the first $t$ observations in the sequence, i.e. $P(X_t \mid o_{1:t})$. In the second pass, the algorithm computes a set of backward probabilities which provide the probability of observing the remaining observations given any starting point $t$, i.e. $P(o_{t+1:T} \mid X_t)$. These two sets of probability distributions can then be combined to obtain the distribution over states at any specific point in time given the entire observation sequence:

$$P(X_t \mid o_{1:T}) = P(X_t \mid o_{1:t}, o_{t+1:T}) \propto P(o_{t+1:T} \mid X_t)\, P(X_t \mid o_{1:t}).$$

The last step follows from an application of Bayes' rule and the conditional independence of $o_{t+1:T}$ and $o_{1:t}$ given $X_t$.
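The two passes and the combination step can be sketched in plain Python. The model below (two weather states observed through umbrella sightings) is a hypothetical illustration, not taken from the article; forward and backward quantities are renormalised at each step for numerical stability, which does not change the posteriors since they are renormalised again when combined.

```python
# A minimal sketch of the forward–backward algorithm for a discrete HMM.
# The two-state "rain/dry" model and its parameters are hypothetical
# illustration values, not part of the original article.

def forward_backward(obs, states, start_p, trans_p, emit_p):
    """Return smoothed posteriors P(X_t | o_{1:T}) for each time step t."""
    T = len(obs)

    # Forward pass: alpha[t][s] is proportional to P(X_t = s | o_{1:t}).
    alpha = []
    for t in range(T):
        a = {}
        for s in states:
            prior = start_p[s] if t == 0 else sum(
                alpha[t - 1][r] * trans_p[r][s] for r in states)
            a[s] = prior * emit_p[s][obs[t]]
        z = sum(a.values())
        alpha.append({s: a[s] / z for s in states})

    # Backward pass: beta[t][s] is proportional to P(o_{t+1:T} | X_t = s).
    beta = [dict() for _ in range(T)]
    for s in states:
        beta[T - 1][s] = 1.0
    for t in range(T - 2, -1, -1):
        for s in states:
            beta[t][s] = sum(
                trans_p[s][r] * emit_p[r][obs[t + 1]] * beta[t + 1][r]
                for r in states)
        z = sum(beta[t].values())
        beta[t] = {s: beta[t][s] / z for s in states}

    # Combine: P(X_t | o_{1:T}) ∝ alpha[t][s] * beta[t][s], then normalise.
    posteriors = []
    for t in range(T):
        p = {s: alpha[t][s] * beta[t][s] for s in states}
        z = sum(p.values())
        posteriors.append({s: p[s] / z for s in states})
    return posteriors

# Hypothetical example: infer rain from umbrella observations.
states = ("rain", "dry")
start_p = {"rain": 0.5, "dry": 0.5}
trans_p = {"rain": {"rain": 0.7, "dry": 0.3},
           "dry":  {"rain": 0.3, "dry": 0.7}}
emit_p = {"rain": {"umbrella": 0.9, "none": 0.1},
          "dry":  {"umbrella": 0.2, "none": 0.8}}

smoothed = forward_backward(("umbrella", "umbrella", "none"),
                            states, start_p, trans_p, emit_p)
```

Because the backward probabilities let later observations inform earlier states, the smoothed posterior at $t=1$ differs from the filtered estimate $P(X_1 \mid o_1)$ alone.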

[ "Variable-order Markov model", "Markov property" ]
Parent Topic
Child Topic
    No Parent Topic