Optimally solving Markov decision processes with total expected discounted reward function: Linear programming revisited

Oguzhan Alagoz*, Mehmet U.S. Ayvaci, Jeffrey T. Linderoth

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. Using randomly generated test problems as well as a real-life health-care problem, we show empirically that, contrary to previous reports, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state- and action-space size, and the LP solution method.
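The two methods the abstract compares can be illustrated on a toy problem. The following sketch is not from the paper: the 2-state, 2-action MDP, its transition probabilities `P`, rewards `R`, and discount factor `GAMMA` are invented for illustration. It implements the policy iteration algorithm (policy evaluation followed by greedy improvement); the equivalent LP, which the paper solves with barrier methods, is noted in a comment.

```python
# Toy discounted MDP, invented for illustration (not from the paper).
# The LP formulation would be: minimize sum_s V(s) subject to
#   V(s) >= R(a,s) + GAMMA * sum_t P(a,s,t) * V(t)  for all s, a.
# Below we instead run the policy iteration algorithm (PIA).

GAMMA = 0.9
P = {0: [[0.9, 0.1], [0.2, 0.8]],   # P[a][s][t]: prob. of s -> t under action a
     1: [[0.5, 0.5], [0.6, 0.4]]}
R = {0: [1.0, 0.0],                 # R[a][s]: immediate reward in s under a
     1: [0.0, 2.0]}
STATES, ACTIONS = (0, 1), (0, 1)

def q_value(a, s, V):
    """One-step lookahead: R(a,s) + GAMMA * E[V(next state)]."""
    return R[a][s] + GAMMA * sum(P[a][s][t] * V[t] for t in STATES)

def evaluate(policy, tol=1e-10):
    """Policy evaluation: fixed point of V = R_pi + GAMMA * P_pi V."""
    V = [0.0, 0.0]
    while True:
        newV = [q_value(policy[s], s, V) for s in STATES]
        if max(abs(newV[s] - V[s]) for s in STATES) < tol:
            return newV
        V = newV

def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0, 0]
    while True:
        V = evaluate(policy)
        improved = [max(ACTIONS, key=lambda a: q_value(a, s, V)) for s in STATES]
        if improved == policy:
            return policy, V
        policy = improved

policy, V = policy_iteration()
```

At convergence the returned values satisfy the Bellman optimality equations, which is the same certificate of optimality the LP's dual feasibility provides.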

Original language: English (US)
Pages (from-to): 311-316
Number of pages: 6
Journal: Computers and Industrial Engineering
Volume: 87
DOIs
State: Published - Jun 10 2015
Externally published: Yes

Keywords

  • Linear programming
  • MDP
  • Markov decision process
  • Policy iteration
  • Total expected discounted reward
  • Treatment optimization

ASJC Scopus subject areas

  • Computer Science (all)
  • Engineering (all)

