Variational Dynamic Programming for Stochastic Optimal Control - CAO et robotique (CAOR)
Preprint, Working Paper Year: 2024

Variational Dynamic Programming for Stochastic Optimal Control

Abstract

We consider stochastic optimal control problems in which the state-feedback control policy takes the form of a probability distribution and a penalty on its entropy is added to the cost. By viewing the cost function as a Kullback-Leibler (KL) divergence between two Markov chains, we bring the tools of variational inference to bear on the optimal control problem. This yields a dynamic programming principle in which the value function is itself defined as a KL divergence. We then approximate the control policies with Gaussian distributions and apply the theory to control-affine nonlinear systems with quadratic costs. This results in closed-form recursive updates that generalize LQR control and the backward Riccati equation. We illustrate this novel method on the simple problem of stabilizing an inverted pendulum.
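For readers unfamiliar with the baseline that the abstract says is generalized, the following is a minimal sketch of the classical finite-horizon backward Riccati recursion for a scalar linear-quadratic problem. It is not the variational method of the paper: the function names, the scalar system x[t+1] = a*x[t] + b*u[t], and all numerical values are assumptions chosen for illustration only. With additive zero-mean noise in the dynamics, the same gains remain optimal for the standard LQR problem (certainty equivalence), so the noise-free rollout below is only a sanity check.

```python
def lqr_backward_riccati(a, b, q, r, q_final, horizon):
    """Classical scalar finite-horizon LQR (illustrative, not the paper's method).

    Dynamics:      x[t+1] = a * x[t] + b * u[t]
    Stage cost:    q * x[t]**2 + r * u[t]**2
    Terminal cost: q_final * x[horizon]**2
    Returns gains K (u[t] = -K[t] * x[t]) and cost-to-go coefficients P,
    where the optimal cost-to-go at time t is P[t] * x**2.
    """
    P = [0.0] * (horizon + 1)
    K = [0.0] * horizon
    P[horizon] = q_final
    # Sweep backward in time: P[t] is computed from P[t + 1].
    for t in range(horizon - 1, -1, -1):
        s = r + b * P[t + 1] * b                    # curvature of the cost in u
        K[t] = b * P[t + 1] * a / s                 # optimal feedback gain
        P[t] = q + a * P[t + 1] * a - a * P[t + 1] * b * K[t]
    return K, P


def simulate(a, b, K, x0):
    """Noise-free closed-loop rollout under u[t] = -K[t] * x[t]."""
    xs = [x0]
    for k in K:
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


# Hypothetical unstable open-loop system (a > 1) with unit cost weights.
K, P = lqr_backward_riccati(a=1.2, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
xs = simulate(1.2, 1.0, K, x0=1.0)
```

Far from the terminal time, the gains settle to a constant value and the closed-loop factor a - b*K[0] has magnitude below one, so the rollout decays toward the origin even though the open-loop system is unstable.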
Main file
VariationalDP.pdf (645.35 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04553255, version 1 (22-04-2024)
hal-04553255, version 2 (23-04-2024)

License

Attribution

Identifiers

  • HAL Id: hal-04553255, version 1

Cite

Marc Lambert, Silvère Bonnabel, Francis Bach. Variational Dynamic Programming for Stochastic Optimal Control. 2024. ⟨hal-04553255v1⟩