ALTERNATIVE SCHEMES FOR INVESTIGATING MARKOV DECISION PROCESSES

This report is concerned with the problem of finding optimal policies for infinite-horizon Markov decision models, i.e., systems for which termination is not in sight. For such systems the optimal policy is the one that maximizes the expected gain per transition. Five algorithms, applicable to discrete time, discrete state, single chain Markov processes with undiscounted rewards, are described: The simple iterative method, Howard's policy-iteration algorithm, Schweitzer's policy-iteration algorithm, White's iterative method, and an algorithm which is called 'the iterative method with limits.' The validity of the last of these methods is proved and its properties are discussed. The method consists of finding upper and lower limits for the optimal gain per transition of the infinite-horizon process. These limits improve on each iteration of the algorithm, and converge to the final value of the optimal gain, ultimately. Simultaneously, the algorithms produces the 'relative value' variables, as well as the optimal policy. The computational efficiency of the aforementioned algorithms was compared by solving several test probelms of varying size. It was found that for problems for which the number of alternatives per state is considerably less than the number of states, the iterative method with limits enjoys a serious speed advantage over the policy iterative algorithms. However, for problems for which the number of alternatives per state approaches the number of states, this speed advantage disappears. The iterative method with limits also produces very accurate results with very little programming effort. Finally, the applicability of the various algorithms to the pertinent classes of problems is discussed briefly.

  • Corporate Authors:

    Massachusetts Institute of Technology

    Sloan School of Management, Operations Research Center
    Cambridge, MA  USA  02139
  • Authors:
    • Odoni, A
  • Publication Date: 1967-6

Subject/Index Terms

Filing Info

  • Accession Number: 00074421
  • Record Type: Publication
  • Source Agency: FLIGHT TRANSPORTATION LABORATORY, MIT DEPT. OF AERONAUTICS AND ASTRONAUTICS
  • Report/Paper Numbers: 28
  • Files: TRIS
  • Created Date: Sep 5 1974 12:00AM