Variational Dynamic Programming for Stochastic Optimal Control

Marc Lambert; Silvère Bonnabel; Francis Bach

Preprint, Working Paper. Year: 2024

Marc Lambert

Role: Author
PersonId : 1377005

Statistical Machine Learning and Parsimony

DGA

Silvère Bonnabel

Role: Author
PersonId : 866182
IdHAL : silvere-bonnabel
ORCID : 0000-0002-6001-7766

Mines Paris - PSL (École nationale supérieure des mines de Paris)

Centre de Robotique

Francis Bach

Role: Author
PersonId : 863126

Statistical Machine Learning and Parsimony

Abstract

We consider the problem of stochastic optimal control where the state-feedback control policies take the form of a probability distribution, and where a penalty on the entropy is added. By viewing the cost function as a Kullback-Leibler (KL) divergence between two Markov chains, we bring the tools of variational inference to bear on our optimal control problem. This allows us to derive a dynamic programming principle in which the value function is itself defined as a KL divergence. We then resort to Gaussian distributions to approximate the control policies, and apply the theory to control-affine nonlinear systems with quadratic costs. This results in closed-form recursive updates, which generalize LQR control and the backward Riccati equation. We illustrate this novel method on the simple problem of stabilizing an inverted pendulum.
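For orientation, the classical backward Riccati recursion that the abstract says the method generalizes can be sketched as follows. This is a minimal illustration of standard finite-horizon, discrete-time LQR in the scalar case, not the authors' variational algorithm. The function name `riccati_backward`, the scalar dynamics x[t+1] = a*x[t] + b*u[t], the stage cost q*x^2 + r*u^2 and all numerical values are assumptions chosen for the example.

```python
def riccati_backward(a, b, q, r, q_final, horizon):
    """Classical scalar LQR backward pass (hypothetical helper for illustration).

    Assumed dynamics: x[t+1] = a * x[t] + b * u[t]
    Assumed cost:     sum_t (q * x[t]**2 + r * u[t]**2) + q_final * x[T]**2
    Returns the feedback gains k[0..T-1] (control u[t] = -k[t] * x[t]) and the
    cost-to-go coefficients p[0..T] (value function V_t(x) = p[t] * x**2).
    """
    p = [0.0] * (horizon + 1)
    k = [0.0] * horizon
    p[horizon] = q_final  # terminal condition of the backward recursion
    for t in range(horizon - 1, -1, -1):
        # Gain that minimizes the quadratic cost-to-go at step t.
        k[t] = a * b * p[t + 1] / (r + b * b * p[t + 1])
        # Riccati update of the quadratic value-function coefficient.
        p[t] = q + a * a * p[t + 1] - a * b * p[t + 1] * k[t]
    return k, p


# Example with assumed values a = b = q = r = 1 over 50 steps.
gains, values = riccati_backward(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
print(gains[0], values[0])
```

With these assumed values the recursion settles on the fixed point of p = 1 + p - p^2 / (1 + p), that is p^2 = p + 1, so `values[0]` approaches the golden ratio and the closed-loop factor a - b * k is below one in magnitude.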

Keywords

Variational approximation; Linear quadratic control; Dynamic programming; DP; Maximum entropy

Domains

Optimization and Control [math.OC]; Automatic Control / Robotics

Main file

VariationalDP.pdf (645.35 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-04553255

Submitted on: Monday, April 22, 2024, 15:26:59

Last modified on: Monday, April 29, 2024, 03:14:40

Dates and versions

hal-04553255 , version 1 (22-04-2024)

hal-04553255 , version 2 (23-04-2024)

License

Attribution

Identifiers

HAL Id: hal-04553255, version 1

Cite

Marc Lambert, Silvère Bonnabel, Francis Bach. Variational Dynamic Programming for Stochastic Optimal Control. 2024. ⟨hal-04553255v1⟩
