Co-founder of Reliant AI, a generative AI startup based in Montréal, Canada and Berlin, Germany.
Core industry member and Canada CIFAR AI Chair at Mila. My PhD research introduced the Atari 2600 as a large-scale benchmark for reinforcement learning research. It led to the emergence of the field now known as deep reinforcement learning (our paper in Nature). In more recent work, my research group has continued to push the frontiers of applied deep reinforcement learning, including the first major commercial application of RL as part of a joint project with Loon. My PhD advisors were Michael Bowling and Joel Veness.
I previously led the reinforcement learning efforts of the Google Brain team in Montréal and, before that, worked as a research scientist at DeepMind in the UK.
Our book surveys the core elements of distributional reinforcement learning, which seeks to understand how the various sources of randomness in an environment combine to produce complex distributions of outcomes, and how these distributions can be estimated from experience. Among other applications, the theory has been used as a model of dopaminergic neurons in the brain, to reduce the risk of failure in robotic grasping, and to achieve state-of-the-art performance in simulated car racing and video-game playing. Written with Will Dabney and Mark Rowland. Draft available at http://distributional-rl.org.
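To make the idea of estimating a distribution of outcomes from experience concrete, here is a minimal sketch. It is not code from the book. The two-step chain, the Bernoulli rewards, the three-atom support, and the names `ATOMS`, `project`, and `learn` are all assumptions chosen for illustration. Each state keeps a categorical distribution over returns and updates it toward the projected distribution of "reward plus the next state's return", using a 1/n step size.

```python
import random

# Hypothetical toy setup: fixed, evenly spaced support for the return distribution.
ATOMS = [0.0, 1.0, 2.0]
GAMMA = 1.0  # no discounting in this toy chain


def project(reward, next_probs, terminal):
    """Project the distribution of reward + GAMMA * Z(next state) onto ATOMS."""
    spacing = ATOMS[1] - ATOMS[0]
    if terminal:
        source = [(reward, 1.0)]
    else:
        source = [(reward + GAMMA * z, p) for z, p in zip(ATOMS, next_probs)]
    target = [0.0] * len(ATOMS)
    for value, prob in source:
        # Clip to the support, then split the mass between the two nearest atoms.
        value = min(max(value, ATOMS[0]), ATOMS[-1])
        pos = (value - ATOMS[0]) / spacing
        lo = int(pos)
        hi = min(lo + 1, len(ATOMS) - 1)
        frac = pos - lo
        target[lo] += prob * (1.0 - frac)
        target[hi] += prob * frac
    return target


def learn(num_episodes=20000, seed=0):
    """Sample episodes s0 -> s1 -> terminal; each step pays 0 or 1 with equal probability."""
    rng = random.Random(seed)
    probs = {s: [1.0 / len(ATOMS)] * len(ATOMS) for s in ("s0", "s1")}
    counts = {"s0": 0, "s1": 0}
    for _ in range(num_episodes):
        for state, next_state in (("s0", "s1"), ("s1", None)):
            reward = 1.0 if rng.random() < 0.5 else 0.0
            terminal = next_state is None
            next_probs = None if terminal else probs[next_state]
            target = project(reward, next_probs, terminal)
            counts[state] += 1
            step = 1.0 / counts[state]  # running-average step size
            probs[state] = [
                (1.0 - step) * p + step * t for p, t in zip(probs[state], target)
            ]
    return probs


if __name__ == "__main__":
    for state, dist in learn().items():
        print(state, [round(p, 3) for p in dist])
```

In this toy chain the return from `s0` is the sum of two fair coin flips, so its learned distribution should approach probabilities of roughly 0.25, 0.5, and 0.25 on returns 0, 1, and 2. The expected return alone (1.0) would hide that spread.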
As part of a collaboration with Loon, we used deep reinforcement learning to improve the navigation capabilities of stratospheric balloons. Based on a 13-balloon, 39-day controlled experiment over the Pacific Ocean, we found evidence of significantly improved power efficiency and increased time within range of a designated station, and determined that the controller had discovered new navigation techniques. Training the controller was made possible by a statistically plausible simulator that could model the wind field's "known unknowns" and the effect of the diurnal cycle on power availability. Read our 2020 paper here.
In 2022 we open-sourced a high-fidelity replica of the original simulator, offering a unique challenge for reinforcement learning algorithms. Read the blog post.
The Arcade Learning Environment (ALE) is a reinforcement-learning interface that enables artificial agents to play Atari 2600 games. We released the first complete version of the benchmark in 2012 (see paper in the Journal of Artificial Intelligence Research). The ALE was popularized by the release of the highly successful DQN algorithm (see our 2015 paper in Nature) and continues to support deep reinforcement learning research today.
My academic research has focused on two complementary problems in reinforcement learning. First comes the problem of representation: How should a learning system structure and update its knowledge about the environment it operates in? The second problem is concerned with exploration: How should the same learning system organize its decisions to be maximally effective at discovering its environment, and in particular to be able to rapidly acquire information to build better representations?