Markov Decision Processes and Partially Observable Markov Decision Processes
Decision-Making Under Uncertainty and Incomplete Information
A Markov Decision Process (MDP) models decision-making in uncertain environments, providing a formal mathematical framework for sequential choices where outcomes are probabilistic rather than deterministic.
Structure
An MDP consists of four components:
a set of states (S)
a set of actions (A)
transition probabilities P(s' | s, a), the likelihood of moving to state s' after taking action a in state s
rewards R(s, a), the immediate payoff for taking action a in state s
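These four components can be written down directly as data. The sketch below uses a hypothetical two-state, two-action market model (the states, actions, and probabilities are illustrative only, not a MorMag model):

```python
# Minimal sketch of the four MDP components, on a toy two-state model.
# All names and numbers are hypothetical, for illustration only.

states = ["bull", "bear"]          # S: the set of states
actions = ["hold", "rebalance"]    # A: the set of actions

# transitions[(s, a)] maps each next state s' to P(s' | s, a)
transitions = {
    ("bull", "hold"):      {"bull": 0.8, "bear": 0.2},
    ("bull", "rebalance"): {"bull": 0.7, "bear": 0.3},
    ("bear", "hold"):      {"bull": 0.3, "bear": 0.7},
    ("bear", "rebalance"): {"bull": 0.5, "bear": 0.5},
}

# rewards[(s, a)]: expected immediate reward for taking action a in state s
rewards = {
    ("bull", "hold"): 1.0,  ("bull", "rebalance"): 0.8,
    ("bear", "hold"): -0.5, ("bear", "rebalance"): 0.2,
}

# Sanity check: each transition distribution must sum to 1
for dist in transitions.values():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```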
Objective
The goal is to maximise expected cumulative reward over time.
This makes MDPs fundamentally about optimising decisions across a sequence of uncertain future outcomes, rather than making isolated, one-off choices.
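One standard algorithm for this objective is value iteration, which repeatedly applies the Bellman optimality update until the value of each state converges. A minimal sketch on a hypothetical two-state market model (all numbers illustrative):

```python
# Value iteration on a toy two-state MDP (hypothetical numbers),
# illustrating maximisation of expected discounted cumulative reward.

states = ["bull", "bear"]
actions = ["hold", "rebalance"]
gamma = 0.9  # discount factor: how much future reward is worth today

P = {  # P[(s, a)][s'] = P(s' | s, a)
    ("bull", "hold"):      {"bull": 0.8, "bear": 0.2},
    ("bull", "rebalance"): {"bull": 0.7, "bear": 0.3},
    ("bear", "hold"):      {"bull": 0.3, "bear": 0.7},
    ("bear", "rebalance"): {"bull": 0.5, "bear": 0.5},
}
R = {("bull", "hold"): 1.0, ("bull", "rebalance"): 0.8,
     ("bear", "hold"): -0.5, ("bear", "rebalance"): 0.2}

def q(s, a, V):
    """Expected return of taking a in s, then following values V."""
    return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())

V = {s: 0.0 for s in states}
for _ in range(200):  # Bellman optimality update until convergence
    V = {s: max(q(s, a, V) for a in actions) for s in states}

# Greedy policy with respect to the converged values
policy = {s: max(actions, key=lambda a: q(s, a, V)) for s in states}
```

With these toy numbers the greedy policy holds in the favourable state and rebalances in the unfavourable one; the point is not the numbers but the shape of the computation: optimising over sequences, not single decisions.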
Application at MorMag
MDP frameworks are conceptually relevant for:
sequential decision-making
dynamic allocation strategies
adaptive portfolio management
They align naturally with the idea that markets are evolving systems where decisions must continuously adjust as new information arrives.
Extending the Framework: Partial Observability
In many real-world systems, however, the assumption of fully observable states breaks down.
This leads to the extension of MDPs into Partially Observable Markov Decision Processes (POMDPs). The key difference is that instead of observing states directly, decisions are based on beliefs about states: probability distributions over what the current state might be.
This introduces an additional layer of complexity, where the decision-maker must operate not only under uncertainty about outcomes, but also uncertainty about the current state itself.
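Operationally, the decision-maker maintains a belief distribution and updates it with Bayes' rule after each observation. A minimal sketch, assuming a hypothetical two-state model with a noisy up/down signal (all probabilities illustrative):

```python
# Bayesian belief update for a POMDP: the agent never sees the state,
# only a noisy observation, and maintains P(state) instead.
# All probabilities are hypothetical, for illustration only.

states = ["bull", "bear"]

# Transition model under the chosen action: P(s' | s)
P = {"bull": {"bull": 0.8, "bear": 0.2},
     "bear": {"bull": 0.3, "bear": 0.7}}

# Observation model: P(observation | state)
O = {"bull": {"up": 0.7, "down": 0.3},
     "bear": {"up": 0.4, "down": 0.6}}

def update_belief(belief, observation):
    """Predict with the transition model, then correct with Bayes' rule."""
    predicted = {s2: sum(belief[s1] * P[s1][s2] for s1 in states)
                 for s2 in states}
    unnorm = {s: O[s][observation] * predicted[s] for s in states}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}

belief = {"bull": 0.5, "bear": 0.5}   # uninformative prior
belief = update_belief(belief, "down")  # a "down" signal shifts weight to bear
```

Each new observation repeats this predict-then-correct step, so the belief tracks the hidden state as evidence accumulates; this is the extra layer of inference that a plain MDP does not require.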
Relevance to Markets
Financial markets are inherently partially observable:
the true market state is unknown
signals are noisy and incomplete
Market participants never observe the full underlying system. Instead, they infer it through price movements, macro data, flows, and other imperfect signals. This makes a pure MDP framework insufficient on its own for realistic modelling.
Application at MorMag
POMDP thinking informs:
probabilistic modelling
signal interpretation
decision-making under uncertainty
It reinforces the idea that investment decisions should not be based on assumed certainty, but on probabilistic belief distributions that are continuously updated as new data arrives.
Conclusion
MDPs provide a formal structure for decision-making under uncertainty, aligning with probabilistic investment frameworks.
POMDPs extend this into a more realistic domain, accounting for incomplete information and noisy observations.
Together, they form a powerful conceptual foundation for modelling decision-making in complex, uncertain environments, particularly in financial markets, where both outcomes and states are inherently uncertain.