Abstract
We consider a problem in which we seek to optimally design a Markov decision process (MDP). That is, subject to resource constraints we first design the action sets that will be available in each state when we later optimally control the process. The control policy is subject to additional constraints governing state-action pair frequencies, and we allow randomized policies. When the design decision is made, we are uncertain of some of the parameters governing the MDP, but we assume a distribution for these stochastic parameters is known. We focus on transient MDPs with a finite number of states and actions. We formulate, analyze and solve a two-stage stochastic integer program that yields an optimal design. A simple example threads its way through the paper to illustrate the development. The paper concludes with a larger application involving optimal design of malaria intervention strategies in Nigeria.
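The paper's exact formulation is not reproduced here, but a minimal sketch of the two-stage idea can be written in Python under some simplifying assumptions: first-stage binary variables choose which actions to make available in each state subject to a budget, and the second stage evaluates a design by solving, for each sampled scenario of the uncertain rewards, the transient-MDP control problem as a linear program over state-action frequencies. All data below (states, transitions, rewards, design costs, budget) are illustrative placeholders, not from the paper, and the brute-force enumeration of designs stands in for the paper's stochastic integer program.

```python
# A minimal sketch (not the paper's implementation) of two-stage MDP design:
# stage 1 picks which actions to make available under a budget; stage 2, for
# each sampled reward scenario, solves the transient-MDP control problem as an
# LP over state-action frequencies x[s, a]. All data here are illustrative.
import itertools
import numpy as np
from scipy.optimize import linprog

S, A = 3, 2                          # small transient MDP: 3 states, 2 actions
alpha = np.array([1.0, 0.0, 0.0])    # initial-state distribution
# Substochastic transitions P[a][s, s'] (row sums < 1, so the process is transient).
P = [np.array([[0.0, 0.8, 0.1],
               [0.0, 0.0, 0.7],
               [0.0, 0.0, 0.0]]),
     np.array([[0.0, 0.5, 0.4],
               [0.1, 0.0, 0.6],
               [0.2, 0.0, 0.0]])]
design_cost = np.array([[1.0, 2.0],  # cost of making action a available in state s
                        [1.0, 2.0],
                        [1.0, 2.0]])
budget = 4.0
rng = np.random.default_rng(0)
scenarios = [1.0 + 0.3 * rng.standard_normal((S, A)) for _ in range(20)]  # sampled rewards

def second_stage_value(avail, r):
    """Optimal expected reward of the transient MDP when only the actions flagged
    in `avail` (boolean S x A) exist, for reward scenario `r`:
        max  sum_{s,a} r[s,a] x[s,a]
        s.t. sum_a x[s,a] - sum_{s',a'} P[a'][s',s] x[s',a'] = alpha[s],  x >= 0."""
    n = S * A
    c = -r.flatten()                                   # linprog minimizes
    A_eq = np.zeros((S, n))
    for s in range(S):
        for a in range(A):
            A_eq[s, s * A + a] += 1.0                  # frequency of leaving state s
            for sp in range(S):
                A_eq[s, sp * A + a] -= P[a][sp, s]     # expected inflow into state s
    bounds = [(0, None) if avail[i // A, i % A] else (0, 0) for i in range(n)]
    res = linprog(c, A_eq=A_eq, b_eq=alpha, bounds=bounds, method="highs")
    return -res.fun if res.success else -np.inf

best_design, best_val = None, -np.inf
for bits in itertools.product([0, 1], repeat=S * A):   # enumerate candidate designs
    y = np.array(bits).reshape(S, A).astype(bool)
    if (design_cost * y).sum() > budget or not y.any(axis=1).all():
        continue                                       # budget; keep one action per state
    val = np.mean([second_stage_value(y, r) for r in scenarios])
    if val > best_val:
        best_design, best_val = y, val

print("best action-set design:\n", best_design.astype(int))
print("expected second-stage value: %.3f" % best_val)
```

In this sketch the expectation over the random parameters is approximated by a sample average, and additional constraints on state-action pair frequencies could be attached to the second-stage LP as extra rows on x; the paper instead formulates and solves the design problem directly as a two-stage stochastic integer program.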
Original language | English (US) |
---|---|
Pages (from-to) | 167-193 |
Number of pages | 27 |
Journal | Operations Research/Computer Science Interfaces Series |
Volume | 47 |
DOIs | |
State | Published - Dec 1 2009 |
Keywords
- Action space design
- Markov decision process
- Stochastic optimization
ASJC Scopus subject areas
- Computer Science (all)
- Management Science and Operations Research