SOLUTION PROCEDURES FOR PARTIALLY OBSERVED MARKOV DECISION PROCESSES
The authors present three algorithms for solving the infinite-horizon, expected discounted total-reward partially observed Markov decision process (POMDP). Each algorithm integrates a successive approximations algorithm for the POMDP due to R. D. Smallwood and E. J. Sondik with an appropriately generalized numerical technique that has been shown to reduce CPU time to convergence in the completely observed case. The first technique is reward revision; the second is reward revision integrated with modified policy iteration; the third is a standard extrapolation. A numerical study indicates the potentially significant computational value of these algorithms.
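The successive approximations approach referenced above operates on the belief state, the posterior distribution over the hidden state that is updated by Bayes' rule after each action and observation. The sketch below illustrates that belief update on a hypothetical two-state, two-action, two-observation POMDP; the transition and observation matrices are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical POMDP, for illustration only (not from the paper):
# T[a][s, s'] = P(s' | s, a), transition probabilities for action a.
# O[a][s', o] = P(o | s', a), observation probabilities after action a.
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.4, 0.6]]])
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]],
              [[0.8, 0.2],
               [0.3, 0.7]]])

def belief_update(b, a, o):
    """Bayes update of the belief b after taking action a and observing o:
    b'(s') is proportional to O[a][s', o] * sum_s T[a][s, s'] * b(s)."""
    b_pred = b @ T[a]                # predicted next-state distribution
    unnorm = b_pred * O[a][:, o]     # weight by observation likelihood
    return unnorm / unnorm.sum()     # renormalize to a probability vector

b0 = np.array([0.5, 0.5])            # uniform prior over the two states
b1 = belief_update(b0, a=0, o=0)     # belief after action 0, observation 0
```

Value iteration for the POMDP (the Smallwood-Sondik scheme) then performs dynamic programming over this continuous belief space, exploiting the fact that each finite-horizon value function is piecewise linear and convex in the belief.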
Corporate Authors:
Operations Research Society of America
Mount Royal and Guilford Avenue
Baltimore, MD 21202, United States
Authors:
- White III, C. C.
- Scherer, W. T.
- Publication Date: 1989-09
Media Info
- Features: References; Tables;
- Pagination: p. 791-797
Serial:
- Operations Research
- Volume: 37
- Issue Number: 5
- Publisher: Newell, Gordon F.
Subject/Index Terms
- TRT Terms: Algorithms; Decision making; Dynamic programming; Markov processes; Planning
- Subject Areas: Highways; Planning and Forecasting; I72: Traffic and Transport Planning;
Filing Info
- Accession Number: 00489584
- Record Type: Publication
- Files: TRIS
- Created Date: Nov 30 1989 12:00AM