Forootani, Ali and Tipaldi, Massimo and Zarch, Majid Ghaniee and Liuzza, Davide and Glielmo, Luigi
(2020)
A Least-Squares Temporal Difference based method for solving resource allocation problems.
IFAC Journal of Systems and Control, 13.
p. 100106.
ISSN 2468-6018
Abstract
Value function approximation plays a central role in Approximate Dynamic Programming (ADP) in
overcoming the so-called curse of dimensionality associated with real stochastic processes. In this regard,
we propose a novel Least-Squares Temporal Difference (LSTD) based method: the “Multi-trajectory
Greedy LSTD” (MG-LSTD). It is an exploration-enhanced recursive LSTD algorithm with the policy
improvement embedded within the LSTD iterations. It makes use of multi-trajectory Monte Carlo
simulations in order to enhance the exploration of the system state space. This method is applied to
resource allocation problems modeled via a constrained
Stochastic Dynamic Programming (SDP) based framework. In particular, such problems are formulated
as a set of parallel Birth–Death Processes (BDPs). Some operational scenarios are defined and solved
to show the effectiveness of the proposed approach. Finally, we provide some experimental evidence
on the convergence properties of the MG-LSTD algorithm as a function of its key parameters.
© 2020 Elsevier Ltd. All rights reserved.
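For readers unfamiliar with LSTD, the following is a minimal sketch of standard batch LSTD(0) policy evaluation on a single small birth–death chain, with samples collected over several trajectories started from different states. It is not the authors' MG-LSTD algorithm: it is not recursive and contains no policy improvement step. The chain size, transition probabilities, stage cost, discount factor, and feature map are all assumptions chosen for illustration.

```python
import random

ALPHA = 0.9                   # discount factor (assumed)
N_MAX = 4                     # states are 0..N_MAX (assumed)
P_BIRTH, P_DEATH = 0.3, 0.4   # transition probabilities under a fixed policy (assumed)


def step(state, rng):
    """Sample one birth-death transition; moves past the boundaries are dropped."""
    u = rng.random()
    if u < P_BIRTH and state < N_MAX:
        return state + 1
    if P_BIRTH <= u < P_BIRTH + P_DEATH and state > 0:
        return state - 1
    return state


def features(state):
    """Linear features [1, s] (a hypothetical choice for illustration)."""
    return [1.0, float(state)]


def lstd(num_trajectories=200, horizon=50, seed=0):
    """Batch LSTD(0): accumulate A and b over many trajectories, then solve A theta = b."""
    rng = random.Random(seed)
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for k in range(num_trajectories):
        # Cycle through the initial states so that every state is visited.
        s = k % (N_MAX + 1)
        for _ in range(horizon):
            s_next = step(s, rng)
            phi, phi_next = features(s), features(s_next)
            cost = float(s)  # stage cost g(s) = s (assumed)
            for i in range(2):
                b[i] += phi[i] * cost
                for j in range(2):
                    A[i][j] += phi[i] * (phi[j] - ALPHA * phi_next[j])
            s = s_next
    # Solve the 2x2 linear system with Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    theta0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    theta1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return [theta0, theta1]


theta = lstd()
# Approximate cost-to-go for each state under the fixed policy.
values = [theta[0] + theta[1] * s for s in range(N_MAX + 1)]
print(values)
```

Starting the trajectories from different states is only a crude stand-in for the exploration enhancement mentioned in the abstract. A recursive variant would update the inverse of `A` sample by sample instead of solving the system once at the end.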
Item Type: Article
Additional Information: Cite as: Ali Forootani, Massimo Tipaldi, Majid Ghaniee Zarch, Davide Liuzza, Luigi Glielmo, A Least-Squares Temporal Difference based method for solving resource allocation problems, IFAC Journal of Systems and Control, Volume 13, 2020, 100106, ISSN 2468-6018, https://doi.org/10.1016/j.ifacsc.2020.100106.
Keywords: Least-squares temporal difference; Approximate dynamic programming; Monte Carlo simulations; Markov decision process; Birth–death process
Academic Unit: Faculty of Science and Engineering > Research Institutes > Hamilton Institute
Item ID: 16105
Identification Number: https://doi.org/10.1016/j.ifacsc.2020.100106
Depositing User: Ali Forootani
Date Deposited: 15 Jun 2022 11:54
Journal or Publication Title: IFAC Journal of Systems and Control
Publisher: Elsevier
Refereed: Yes
Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).