Q-Learning for Flexible Learning of Daily Activity Plans

Q-learning is a method from artificial intelligence for solving the reinforcement learning problem (RLP), defined as follows. An agent is faced with a set of states, S. For each state s there is a set of actions, A(s), available to the agent; each action takes the agent (deterministically or stochastically) to another state. For each state the agent receives a (possibly stochastic) reward. The task is to select actions such that the total reward is maximized. Activity generation is part of demand generation in the context of transportation simulation. For each member of a synthetic population, a daily activity plan stating a sequence of activities (e.g., home-work-shop-home), including locations and times, needs to be found. Activities at different locations generate demand for transportation. Activity generation can be modeled as an RLP with the states given by the triple (type of activity, starting time of activity, time already spent at activity). The possible actions are either to stay at a given activity or to move to another activity. Rewards are given as "utility per time slice," which corresponds to a coarse version of marginal utility. Q-learning has the property that, by repeating similar experiences over and over again, the agent learns to look forward in time; that is, it can also follow paths through state space on which high rewards are given only at the end. This paper presents computational results with such an algorithm for daily activity planning.
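
The formulation above can be illustrated with a minimal tabular Q-learning sketch in Python. It is not the paper's implementation. The activity set, the hourly time slices, the utility values, the travel cost, and the learning parameters (alpha, gamma, epsilon) are all hypothetical and chosen only to show the state triple, the stay/move actions, and the per-slice reward in code.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
ACTIVITIES = ("home", "work", "shop")
SLOTS = 24          # one-hour time slices over one day
TRAVEL_COST = 1.0   # penalty charged whenever the agent changes activity


def utility(activity, t, spent):
    """Hypothetical utility of spending time slice t at an activity."""
    if activity == "work":
        return 2.0 if 8 <= t < 18 and spent < 8 else -1.0
    if activity == "shop":
        return 1.5 if 9 <= t < 20 and spent < 1 else -1.0
    return 0.5  # home


def actions(state):
    """Either stay at the current activity or move to another one."""
    return ["stay"] + [a for a in ACTIVITIES if a != state[0]]


def step(state, action):
    """Deterministic transition; a state is (activity, start, time spent)."""
    activity, start, spent = state
    t_next = start + spent + 1
    if t_next >= SLOTS:
        return None, 0.0  # the day is over
    if action == "stay":
        nxt, cost = (activity, start, spent + 1), 0.0
    else:
        nxt, cost = (action, t_next, 0), TRAVEL_COST
    return nxt, utility(nxt[0], t_next, nxt[2]) - cost


def q_learning(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = ("home", 0, 0)
        while state is not None:
            acts = actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)
            else:
                action = max(acts, key=lambda a: q[(state, a)])
            nxt, r = step(state, action)
            future = 0.0 if nxt is None else max(q[(nxt, a)] for a in actions(nxt))
            # Standard Q-learning update toward reward plus best future value.
            q[(state, action)] += alpha * (r + gamma * future - q[(state, action)])
            state = nxt
    return q


def greedy_plan(q):
    """Follow the highest-valued action from the start of the day."""
    state, plan = ("home", 0, 0), ["home"]
    while True:
        action = max(actions(state), key=lambda a: q[(state, a)])
        state, _ = step(state, action)
        if state is None:
            return plan
        plan.append(state[0])


print(" ".join(greedy_plan(q_learning())))
```

The printed line lists one activity per time slice. Because the update bootstraps from the best value of the next state, rewards that arrive late in the day propagate backward over repeated episodes, which is the look-ahead property the abstract describes.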

Language

  • English

Filing Info

  • Accession Number: 01023224
  • Record Type: Publication
  • ISBN: 0309094097
  • Files: TRIS, TRB, ATRI
  • Created Date: Apr 25 2006 5:03PM