Bellman Equation Derivation


Using math equations as magic formulas has never been my way. To understand a concept deeply, a key step is to first understand all the prior material that the new concept builds on.

In Reinforcement Learning, an important equation is the Bellman Equation. In this video, Constantin Bürgi presents a very clear derivation of the equation, transforming the original infinite horizon problem into a dynamic programming one.

Original problem

We want to solve the following infinite horizon optimization problem:

under the constraint

where:

  • $k_{t+1}$: capital invested in t+1
  • $c_t$: capital consumed in t
  • $k_t$: capital invested in t
  • $r$: interest rate

From the constraint, $c_t$ can also be expressed as:

The infinite horizon problem, also known as the value function $V(k_0)$, can be expressed as:

Setting aside the first element, t=0, $V(k_0)$ can be expressed as:

Now we can express $V(k_0)$ as the sum of a first-period expression plus the discounted $V(k_1)$:

$V(k_1)$ is a function that depends only on the value of $k_1$, which, for its part, depends solely on $k_0$ and the choice of $c_0$.
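The steps above can be written out as one chain of equalities. This is only a sketch under assumed notation, since the original expressions are not reproduced here: $\beta \in (0,1)$ is taken to be a discount factor, $u$ a per-period utility function, and the constraint is assumed to have the form $k_{t+1} = (1+r)\,k_t - c_t$. A different form of the constraint would change the argument of $u$ but not the structure of the argument.

```latex
\begin{align*}
V(k_0)
  &= \max_{\{c_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t}\, u(c_t)
     \quad \text{s.t.} \quad k_{t+1} = (1+r)\,k_t - c_t \\
  % substitute c_t = (1+r) k_t - k_{t+1} from the (assumed) constraint
  &= \max_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty}
     \beta^{t}\, u\big((1+r)\,k_t - k_{t+1}\big) \\
  % pull the t = 0 term out of the sum and factor one beta out of the rest
  &= \max_{\{k_{t+1}\}_{t=0}^{\infty}} \Big[ u\big((1+r)\,k_0 - k_1\big)
     + \beta \sum_{t=1}^{\infty} \beta^{t-1}\, u\big((1+r)\,k_t - k_{t+1}\big) \Big] \\
  % given k_1, the remaining sum is the same problem started one period later
  &= \max_{k_1} \Big[ u\big((1+r)\,k_0 - k_1\big) + \beta\, V(k_1) \Big]
\end{align*}
```

The last step relies on the problem looking identical from any starting period: once $k_1$ is fixed, maximizing the remaining sum over $k_2, k_3, \dots$ is the original problem with $k_1$ in the role of $k_0$.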

Bellman equation or value function

Renaming the variables $k_0 \to k$ and $k_1 \to k'$, we obtain the expression of the Bellman Equation:
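To see the recursive form at work, here is a minimal value iteration sketch in Python. It is not taken from the video: it assumes the Bellman equation has the form V(k) = max over k' of [u((1+r)k - k') + beta * V(k')], with log utility, a discount factor `beta`, and a finite grid of capital levels. The function name `value_iteration` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def value_iteration(beta=0.95, r=0.04, n_grid=200, k_min=0.1, k_max=10.0,
                    tol=1e-8, max_iter=5000):
    """Iterate V <- max_{k'} [u((1+r)k - k') + beta * V(k')] on a capital grid.

    Assumptions (illustrative, not from the source): log utility and the
    budget constraint k' = (1 + r) * k - c.
    """
    k = np.linspace(k_min, k_max, n_grid)

    # c[i, j]: consumption when holding k[i] today and choosing k[j] for tomorrow
    c = (1.0 + r) * k[:, None] - k[None, :]

    # Log utility where consumption is positive; -inf rules out infeasible choices
    u = np.full_like(c, -np.inf)
    feasible = c > 0
    u[feasible] = np.log(c[feasible])

    V = np.zeros(n_grid)
    for _ in range(max_iter):
        # Right-hand side of the Bellman equation for every (k, k') pair
        V_new = (u + beta * V[None, :]).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Policy: the next-period capital that attains the maximum at each k
    policy = k[np.argmax(u + beta * V[None, :], axis=1)]
    return k, V, policy


if __name__ == "__main__":
    k, V, policy = value_iteration()
    print("V at lowest and highest capital:", V[0], V[-1])
    print("chosen k' at highest capital:", policy[-1])
```

Because `beta` is below 1, each pass shrinks the distance between successive value functions, so the loop settles on a fixed point of the grid version of the equation. The grid and the utility function are the main simplifications; a finer grid gives a closer approximation at higher cost.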
