From LEGO competitions to DeepMind's robotics lab
A blog post following a software engineer on DeepMind's robotics team, from LEGO competitions to joining the company and a typical day at work.
From LEGO competitions to DeepMind's robotics lab
A personal story about overcoming self-doubt to work at DeepMind's robotics lab.
DALL·E 2 research preview update
Exploring the latest in DALL·E 2 research advancements.
Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning
Exploring how reinforcement learning agents learn bartering behaviour and economic decision-making in a multi-agent environment.
In Trading, Machine Learning Benchmarks Don’t Track What You Care About
A blog post on how machine learning benchmarks in trading fail to track what traders care about.
Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning
Exploring how populations of deep RL agents learn microeconomic behaviours, such as production, consumption, and trading of goods.
A Generalist Agent
Applying a large-scale language model approach to build a single generalist agent capable of multi-modal, multi-task, multi-embodiment interactions.
A Generalist Agent
Applying large-scale language modelling to build a multi-modal, multi-task, multi-embodiment generalist agent capable of playing games, captioning images, chatting, and performing physical tasks.
Active offline policy selection
Active offline policy selection uses offline data, a special kernel, and Bayesian optimization for efficient RL policy evaluation.
Active offline policy selection
Using active offline policy selection (A-OPS) to improve the selection of RL policies for real-world applications by leveraging prerecorded datasets and limited interactions with the environment.
OpenAI leadership team update
Exploring the latest developments in the OpenAI leadership team.
Tackling multiple tasks with a single visual language model
Flamingo, a single visual language model, sets a new state of the art in few-shot learning on a wide range of multimodal tasks, outperforming fine-tuned and data-intensive methods.