Publications


Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets.
Presented at 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.


Model-based Offline Reinforcement Learning with Local Misspecification.
Oral at 37th AAAI Conference on Artificial Intelligence (AAAI), 2023.


Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data.
Presented at 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.
Oral at AAAI 2023 Workshop on Reinforcement Learning Ready for Production, 2023.


Offline Policy Optimization with Eligible Actions.
Presented at 38th Conference on Uncertainty in Artificial Intelligence (UAI), 2022.


SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics.
Presented at 5th Conference on Reinforcement Learning and Decision Making (RLDM), 2022.


Sample-Efficient Deep Reinforcement Learning for Control, Exploration and Safety.
PhD Thesis, 2021.


Adversarially Guided Actor-Critic.
Presented at 9th International Conference on Learning Representations (ICLR), 2021.


Learning Value Functions in Deep Policy Gradients using Residual Variance.
Presented at 9th International Conference on Learning Representations (ICLR), 2021.


Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL.
Presented at 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020.


Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics.
Biology, 9(5), p.94, 2020.

Invited Talks

Efficient Actor-Critics under the Prism of Variance
July 2021

Oral & Panel Discussion: Do we control the algorithms we create?
November 2019

Improving Policy Gradient Updates with MERL and SAUNA
October 2019

Deep Reinforcement Learning at Scale
April 2019

QA and Deep Learning for Language Understanding
November 2017

Selected Software

rlberry

A Reinforcement Learning Library for Research and Education (PyTorch)

adversarially-guided-actor-critic

AGAC: Adversarially Guided Actor-Critic (PyTorch & TensorFlow)

actor-with-variance-estimated-critic

AVEC: Actor with Variance Estimated Critic (TensorFlow)

rlss-2019

Materials for the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch)

Teaching

Reinforcement Learning - Fall 2019 - MVA - ENS Paris-Saclay

Teaching Assistant
Instructors: Alessandro Lazaric, Matteo Pirotta

Reinforcement Learning Summer School 2019

Teaching Assistant
Instructors: Felix Berkenkamp, Tristan Cazenave, Ludovic Denoyer, Gabriel Dulac-Arnold, Audrey Durand, Vincent François-Lavet, Matteo Hessel, Emilie Kaufmann, Marc Lanctot, Max Lapan, Alessandro Lazaric, Odalric-Ambrym Maillard, Jérémie Mary, Gerhard Neumann, Guillaume Obozinski, Olivier Pietquin, Bilal Piot, Matteo Pirotta, Bruno Scherrer, Florian Strub, Eleni Vasilaki, Oriol Vinyals

Professional Experience

January 2022 – Present
Palo Alto, CA, USA

Postdoctoral Scholar

Stanford University

November 2017 – October 2018
Nantes, FR

Machine Learning Engineer

iAdvize

Designed and executed product-focused research agendas, which led to a deep learning conversational model for human-machine interaction.
 
August 2017 – November 2017
Copenhagen, DK

Research Assistant

DTU

Conducted research at the DTU Compute laboratory on deep convolutional neural networks for image classification and on generative adversarial networks for image generation from a mixture of human artworks and photographs.
 
March 2017 – August 2017
Copenhagen, DK

Machine Learning Researcher

Soply (part-time during MSc)

Defined a roadmap for the company's ML projects with the co-founders, which led to a system that recommends artists according to their photographic style and to three artwork classification models (content, style & type) built in collaboration with the National Gallery of Denmark.
 
November 2015 – January 2017
Copenhagen, DK

Machine Learning Engineer

EasyTranslate (part-time during MSc)

Carried out several research projects in collaboration with the product team, including a seq2seq machine translation model for specialized text and a recommendation system for human translators using LDA models trained on Wikipedia, deployed on AWS.
