OpenAI: Gathering human feedback
Training AI models with human feedback in place of hand-crafted reward functions
SoundCloud: Inside a SoundCloud Microservice
Exploring the inner workings of a microservice at SoundCloud
OpenAI: Better exploration with parameter noise
Improving exploration in reinforcement learning by adding noise to a policy's parameters
OpenAI: Proximal Policy Optimization
Understanding Proximal Policy Optimization for enhanced reinforcement learning
OpenAI: Robust adversarial inputs
Creating adversarial inputs that keep fooling machine learning models across changes in scale and perspective
OpenAI: Hindsight Experience Replay
Exploring the benefits of hindsight experience replay in reinforcement learning algorithms
OpenAI: Teacher–student curriculum learning
Exploring teacher-student curriculum learning, in which a teacher model selects training tasks for a student model
OpenAI: Faster physics in Python
Optimizing Python code performance for physics simulations
SoundCloud: Remote device sign-in
A method for signing in to a device without a keyboard using a game controller and onscreen keyboard
SoundCloud: A Better Model of Data Ownership
Defining ownership of datasets and ensuring the right teams own the right datasets for better data management
OpenAI: Learning from human preferences
Training reinforcement learning agents using human comparisons of their behavior
OpenAI: Learning to cooperate, compete, and communicate
Exploring the dynamics of cooperation, competition, and communication in multi-agent reinforcement learning