TY - JOUR
T1 - Optimally solving Markov decision processes with total expected discounted reward function
T2 - Linear programming revisited
AU - Alagoz, Oguzhan
AU - Ayvaci, Mehmet U.S.
AU - Linderoth, Jeffrey T.
N1 - Publisher Copyright:
© 2015 Elsevier Ltd. All rights reserved.
PY - 2015/6/10
Y1 - 2015/6/10
N2 - We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. Using randomly generated test problems as well as a real-life health-care problem, we show empirically that, contrary to previous reports, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action space size, and the LP solution method.
AB - We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. Using randomly generated test problems as well as a real-life health-care problem, we show empirically that, contrary to previous reports, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action space size, and the LP solution method.
KW - Linear programming
KW - MDP
KW - Markov decision process
KW - Policy iteration
KW - Total expected discounted reward
KW - Treatment optimization
UR - http://www.scopus.com/inward/record.url?scp=84930651795&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84930651795&partnerID=8YFLogxK
U2 - 10.1016/j.cie.2015.05.031
DO - 10.1016/j.cie.2015.05.031
M3 - Article
AN - SCOPUS:84930651795
SN - 0360-8352
VL - 87
SP - 311
EP - 316
JO - Computers & Industrial Engineering
JF - Computers & Industrial Engineering
ER -