Yannis Flet-Berliac

Postdoc at Stanford

Associate Program Chair at ICML

Stanford AI Lab

Welcome!

I am a Postdoctoral Scholar at the Stanford AI Lab, where I work with Emma Brunskill on Offline Reinforcement Learning and Deep RL algorithms. I am particularly interested in using Offline RL as a driving force to achieve practical LLM alignment through RL from human feedback.

I completed my PhD at Inria, where my research focused on sample-efficient Deep RL algorithms for control, exploration and safety. I hold two MSc degrees, from DTU and École Centrale.

Professionally, I have worked as an ML Engineer at iAdvize, specializing in NLP, and have gained experience in three startups in Denmark, with roles spanning NLP, Computer Vision, and Speech Processing.

I am honored to serve as an Associate Program Chair at the 40th International Conference on Machine Learning (ICML), which will be held in Hawaii.

Interests

AI <> Society
LLMs & (Offline) RL
Deep RL
Running marathons 🏃🏽‍♂️, skate racing, skiing, windsurfing
Producing music & videos

Education

Postdoc in Computer Science, 2022
Stanford University (Emma Brunskill lab), CA, USA

PhD in Computer Science, 2021
Inria (SequeL team), Lille, FR

MSc in Computer Science, 2017
Technical University of Denmark, Copenhagen, DK

MSc in General Engineering, 2017
École Centrale, Nantes, FR

Publications

A. Badrinath, Y. Flet-Berliac, A. Nie, E. Brunskill
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets.
Presented at 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.

K. Dong, Y. Flet-Berliac, A. Nie, E. Brunskill
Model-based Offline Reinforcement Learning with Local Misspecification.
Oral at 37th AAAI Conference on Artificial Intelligence (AAAI), 2023.

A. Nie, Y. Flet-Berliac, D. Richmond, W. Steenbergen, E. Brunskill
Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data.
Presented at 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.
Oral at AAAI 2023 Workshop on Reinforcement Learning Ready for Production, 2023.

Y. Liu, Y. Flet-Berliac, E. Brunskill
Offline Policy Optimization with Eligible Actions.
Presented at 38th Conference on Uncertainty in Artificial Intelligence (UAI), 2022.

Y. Flet-Berliac, D. Basu
SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics.
Presented at 5th Conference on Reinforcement Learning and Decision Making (RLDM), 2022.

Y. Flet-Berliac
Jury: Ann Nowé, Bruno Scherrer, Luce Brotcorne, Anders Jonsson, Joëlle Pineau, Adam White
Sample-Efficient Deep Reinforcement Learning for Control, Exploration and Safety.
PhD Thesis, 2021.

Y. Flet-Berliac, J. Ferret, O. Pietquin, P. Preux, M. Geist
Adversarially Guided Actor-Critic.
Presented at 9th International Conference on Learning Representations (ICLR), 2021.

Y. Flet-Berliac, R. Ouhamma, O.-A. Maillard, P. Preux
Learning Value Functions in Deep Policy Gradients using Residual Variance.
Presented at 9th International Conference on Learning Representations (ICLR), 2021.

Y. Flet-Berliac, P. Preux
Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL.
Presented at 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020.

J. Demongeot, Y. Flet-Berliac, H. Seligmann
Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics.
Biology, 9(5), p.94, 2020.

Invited Talks

Efficient Actor-Critics under the Prism of Variance

July, 2021

UC Berkeley Robot Learning Lab

Adversarially Guided Actor-Critic

February, 2021

DeepMind

Learning Value Functions using Residual Variance in Deep Policy Gradients

October, 2020

DeepMind

Oral & Panel Discussion: Do we control the algorithms we create?

November, 2019

Open Forum Art and Research

Improving Policy Gradient Updates with MERL and SAUNA

October, 2019

Inria Seminars (SequeL)

Deep Reinforcement Learning at Scale

April, 2019

HPC - BigData Inria Project Lab

QA and Deep Learning for Language Understanding

November, 2017

Machine Learning Meetup

Selected Software

rlberry

A Reinforcement Learning Library for Research and Education (PyTorch)

adversarially-guided-actor-critic

AGAC: Adversarially Guided Actor-Critic (PyTorch & TensorFlow)

actor-with-variance-estimated-critic

AVEC: Actor with Variance Estimated Critic (TensorFlow)

rlss-2019

Materials for the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch)

Teaching

Reinforcement Learning - Fall 2019 - MVA - ENS Paris-Saclay

Teaching Assistant
Instructors: Alessandro Lazaric, Matteo Pirotta

Reinforcement Learning Summer School 2019

Teaching Assistant
Instructors: Felix Berkenkamp, Tristan Cazenave, Ludovic Denoyer, Gabriel Dulac-Arnold, Audrey Durand, Vincent François-Lavet, Matteo Hessel, Emilie Kaufmann, Marc Lanctot, Max Lapan, Alessandro Lazaric, Odalric-Ambrym Maillard, Jérémie Mary, Gerhard Neumann, Guillaume Obozinski, Olivier Pietquin, Bilal Piot, Matteo Pirotta, Bruno Scherrer, Florian Strub, Eleni Vasilaki, Oriol Vinyals

Professional Experience

January 2022 – Present

Palo Alto, CA, USA

Postdoctoral Scholar

Stanford University

November 2017 – October 2018

Nantes, FR

Machine Learning Engineer

iAdvize

Designed and executed product-focused research agendas, which led to building a conversational model for human/machine interface using deep learning.

August 2017 – November 2017

Copenhagen, DK

Research Assistant

DTU

Conducted research at the DTU Compute laboratory on deep convolutional neural network models for image classification and generative adversarial network models for image generation from a mixture of human artworks and photographs.

March 2017 – August 2017

Copenhagen, DK

Machine Learning Researcher

Soply (part-time during MSc)

Defined a roadmap for the company's ML projects with the co-founders, which led to building a system to recommend artists according to their photographic style and three artwork classification models (content, style & type) in collaboration with the National Gallery of Denmark.

November 2015 – January 2017

Copenhagen, DK

Machine Learning Engineer

EasyTranslate (part-time during MSc)

Worked on several research projects in collaboration with the product team, including a seq2seq machine translation model for specialized text and a recommendation system for human translators using LDA models trained on Wikipedia, deployed on AWS.