Forootani, Ali, Tipaldi, Massimo, Zarch, Majid Ghaniee, Liuzza, Davide and Glielmo, Luigi (2020) A Least-Squares Temporal Difference based method for solving resource allocation problems. IFAC Journal of Systems and Control, 13. p. 100106. ISSN 24686018
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Value function approximation plays a central role in Approximate Dynamic Programming (ADP) in
overcoming the so-called curse of dimensionality associated with real stochastic processes. In this regard,
we propose a novel Least-Squares Temporal Difference (LSTD) based method: the “Multi-trajectory
Greedy LSTD” (MG-LSTD). It is an exploration-enhanced recursive LSTD algorithm with the policy
improvement embedded within the LSTD iterations. It uses multi-trajectory Monte Carlo
simulations to enhance exploration of the system state space. The method is applied to resource allocation problems modeled via a constrained
Stochastic Dynamic Programming (SDP) based framework. In particular, such problems are formulated
as a set of parallel Birth–Death Processes (BDPs). Some operational scenarios are defined and solved
to show the effectiveness of the proposed approach. Finally, we provide some experimental evidence
on the convergence properties of the MG-LSTD algorithm as a function of its key parameters.
© 2020 Elsevier Ltd. All rights reserved.
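For readers unfamiliar with LSTD, the following is a minimal sketch of plain batch LSTD(0) policy evaluation on a single birth–death chain, using several short simulated trajectories started from random states. It is not the MG-LSTD algorithm of the paper: it omits the recursive update and the embedded policy improvement. The chain parameters (`p_birth`, `p_death`, `n_max`), the cost `g(s) = s`, the quadratic features, and the small ridge term are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_step(s, n_max, p_birth, p_death, rng):
    """One transition of a birth-death chain on states 0..n_max (assumed dynamics)."""
    u = rng.random()
    if u < p_birth and s < n_max:
        return s + 1  # birth
    if p_birth <= u < p_birth + p_death and s > 0:
        return s - 1  # death
    return s  # otherwise the state is unchanged


def transition_matrix(n_max, p_birth, p_death):
    """Exact transition matrix of the same chain, used only to check the estimate."""
    P = np.zeros((n_max + 1, n_max + 1))
    for s in range(n_max + 1):
        up = p_birth if s < n_max else 0.0
        down = p_death if s > 0 else 0.0
        if s < n_max:
            P[s, s + 1] = up
        if s > 0:
            P[s, s - 1] = down
        P[s, s] = 1.0 - up - down
    return P


def features(s, n_max):
    """Hypothetical quadratic feature vector on the normalized state."""
    x = s / n_max
    return np.array([1.0, x, x * x])


def lstd(n_max=10, p_birth=0.3, p_death=0.4, gamma=0.9,
         n_traj=200, horizon=50, seed=0):
    """Batch LSTD(0): accumulate A and b over many trajectories, then solve A theta = b."""
    rng = np.random.default_rng(seed)
    k = 3
    A = 1e-6 * np.eye(k)  # small ridge term (an assumption) to keep A invertible
    b = np.zeros(k)
    for _ in range(n_traj):
        s = int(rng.integers(0, n_max + 1))  # random start state per trajectory
        for _ in range(horizon):
            s_next = simulate_step(s, n_max, p_birth, p_death, rng)
            phi = features(s, n_max)
            phi_next = features(s_next, n_max)
            A += np.outer(phi, phi - gamma * phi_next)
            b += phi * float(s)  # stage cost g(s) = s (assumed)
            s = s_next
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    theta = lstd()
    P = transition_matrix(10, 0.3, 0.4)
    v_exact = np.linalg.solve(np.eye(11) - 0.9 * P, np.arange(11, dtype=float))
    v_hat = np.array([features(s, 10) @ theta for s in range(11)])
    print("approximate values:", np.round(v_hat, 2))
    print("exact values:      ", np.round(v_exact, 2))
```

Starting each trajectory from a random state is one simple way to spread samples across the state space; the exploration scheme actually used by MG-LSTD is described in the paper itself.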
| Item Type: | Article |
|---|---|
| Additional Information: | Cite as: Ali Forootani, Massimo Tipaldi, Majid Ghaniee Zarch, Davide Liuzza, Luigi Glielmo, A Least-Squares Temporal Difference based method for solving resource allocation problems, IFAC Journal of Systems and Control, Volume 13, 2020, 100106, ISSN 2468-6018, https://doi.org/10.1016/j.ifacsc.2020.100106. |
| Keywords: | Least-squares temporal difference; Approximate dynamic programming; Monte Carlo simulations; Markov decision process; Birth–death process |
| Academic Unit: | Faculty of Science and Engineering > Research Institutes > Hamilton Institute |
| Item ID: | 16105 |
| Identification Number: | 10.1016/j.ifacsc.2020.100106 |
| Depositing User: | Ali Forootani |
| Date Deposited: | 15 Jun 2022 11:54 |
| Journal or Publication Title: | IFAC Journal of Systems and Control |
| Publisher: | Elsevier |
| Refereed: | Yes |
| Use Licence: | This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA). |