Nov 1, 2025.
In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional methods, this algorithm is not based on temporal difference (TD) learning (which has scalability challenges),... Continue
Sep 1, 2025.
What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern language... Continue
Jul 1, 2025.
Predicting Ego-centric Video from human Actions (PEVA). Given past video frames and an action specifying a desired change in 3D pose, PEVA predicts the next video frame. Our results show that, given the first... Continue
Apr 11, 2025.
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection attacks are listed as the #1 threat by OWASP to LLM-integrated... Continue
Apr 8, 2025.
PLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space of protein folding models. The awarding of the 2024 Nobel Prize to AlphaFold2 marks an... Continue
Mar 25, 2025.
We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle "stop-and-go" waves, those frustrating... Continue
Nov 12, 2024.
We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience. What does it mean for large... Continue
Sep 20, 2024.
Sample language model responses to different varieties of English and native speaker reactions. ChatGPT does amazingly well at communicating with people in English. But whose English? Only 15% of ChatGPT users are from the US,... Continue
Aug 28, 2024.
When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by translating forbidden prompts into obscure languages. Excited by this result, we attempted to reproduce it... Continue
Jul 20, 2024.
Humans excel at processing vast arrays of visual information, a skill that is crucial for achieving artificial general intelligence (AGI). Over the decades, AI researchers have developed Visual Question Answering (VQA) systems to interpret scenes... Continue
May 29, 2024.
The ability of LLMs to execute commands through plain language (e.g. English) has enabled agentic systems that can complete a user query by orchestrating the right set of tools (e.g. ToolFormer, Gorilla). This, along with... Continue
Mar 21, 2024.
As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images... Continue
Mar 11, 2024.
Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research... Continue
Feb 18, 2024.
AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on... Continue
Nov 14, 2023.
The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text. Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to... Continue
Nov 14, 2023.
Asymmetric Certified Robustness via Feature-Convex Neural Networks TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce... Continue
Oct 17, 2023.
Goal Representations for Instruction Following A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use... Continue
Oct 16, 2023.
Rethinking the Role of PPO in RLHF TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single,... Continue
Jul 14, 2023.
Training Diffusion Models with Reinforcement Learning Diffusion models have recently emerged as the de facto standard for generating complex, high-dimensional outputs. You may know them for their ability to produce stunning AI art and... Continue
Jul 10, 2023.
Figure 1: stepwise behavior in self-supervised learning. When training common SSL algorithms, we find that the loss descends in a stepwise fashion (top left) and the learned embeddings iteratively increase in dimensionality (bottom left). Direct... Continue
Jun 29, 2023.
Figure 1: CoarsenConf architecture. Molecular conformer generation is a fundamental task in computational chemistry. The objective is to predict stable low-energy 3D molecular structures, known as conformers, given the 2D molecule. Accurate molecular conformations are... Continue
May 23, 2023.
TL;DR: Text Prompt -> LLM -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image. Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse... Continue
Apr 6, 2023.
Figure 1: “Interactive Fleet Learning” (IFL) refers to robot fleets in industry and academia that fall back on human teleoperators when necessary and continually learn from them over time. In the last few years we... Continue
Apr 3, 2023.
In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web. We describe the dataset curation and training process of our model, and also present the... Continue
Jan 20, 2023.
Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats. But practical applications of reinforcement learning are often far from natural: instead... Continue
Sep 19, 2022.
To regulate the distribution shift experienced by learning-based controllers, we seek a mechanism for constraining the agent to regions of high data density throughout its trajectory (left). Here, we present an approach which achieves this... Continue
Aug 29, 2022.
Deep neural networks have enabled technological wonders ranging from voice recognition to machine translation to protein engineering, but their design and application are nonetheless notoriously unprincipled. The development of tools and methods to guide this... Continue
Jul 10, 2022.
In cooperative multi-agent reinforcement learning (MARL), policy gradient (PG) methods, due to their on-policy nature, are typically believed to be less sample efficient than value decomposition (VD) methods, which are off-policy. However, some recent empirical... Continue
Jun 30, 2022.
FIGS (Fast Interpretable Greedy-tree Sums): A method for building interpretable models by simultaneously growing an ensemble of decision trees in competition with one another. Recent machine-learning advances have led to increasingly complex predictive models, often... Continue
May 20, 2022.
We recently published the Berkeley Crossword Solver (BCS), the current state of the art for solving American-style crossword puzzles. The BCS combines neural question answering and probabilistic inference to achieve near-perfect performance on most American-style... Continue
May 3, 2022.
Figure 1: In real-world applications, we think there exists a human-machine loop where humans and machines mutually augment each other. We call it Artificial Augmented Intelligence. How do we build and evaluate an AI... Continue
Apr 29, 2022.
Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications. Notable examples include DeepMind’s work on controlling a nuclear reactor or on improving YouTube video... Continue
Apr 25, 2022.
Figure 1: Summary of our recommendations for when a practitioner should use BC and various imitation-learning-style methods, and when they should use offline RL approaches. Offline reinforcement learning allows learning policies from previously collected... Continue
Apr 20, 2022.
A demonstration of the RvS policy we learn with just supervised learning and a depth-two MLP. It uses no TD learning, advantage reweighting, or Transformers! Offline reinforcement learning (RL) is conventionally approached using value-based methods... Continue
Mar 21, 2022.
Figure 1: Airmass measurements (clouds) over Ukraine from February 18 to March 1, 2022 from the SEVIRI instrument. Data accessed via the EUMETSAT Viewer. Satellite imagery is a critical source of information during the... Continue
Feb 23, 2022.
Large-scale semantic image annotation is a significant challenge for learning-based perception systems in robotics. Supervised learning requires labeled data, and a common approach is for humans to hand-label images with segmentation masks, keypoints, and class... Continue
Feb 23, 2022.
Unsupervised Reinforcement Learning (RL), where RL agents pre-train with self-supervised rewards, is an emerging paradigm for developing RL agents that are capable of generalization. Recently, we released the Unsupervised RL Benchmark (URLB) which we covered... Continue
Feb 2, 2022.
imodels: A Python package with cutting-edge techniques for concise, transparent, and accurate predictive modeling. All sklearn-compatible and easy to use. Recent machine-learning advances have led to increasingly complex predictive models, often at the cost of... Continue
Dec 15, 2021.
The shortcomings of supervised RL Reinforcement Learning (RL) is a powerful paradigm for solving many problems of interest in AI, such as controlling autonomous vehicles, digital assistants, and resource allocation to name a few. We’ve... Continue
Nov 19, 2021.
Sequence Modeling Solutions for Reinforcement Learning Problems Long-horizon predictions of (top) the Trajectory Transformer compared to those of (bottom) a single-step dynamics model. Modern machine learning success stories often have one thing in common: they... Continue
Nov 19, 2021.
Processing raw sensory inputs is crucial for applying deep RL algorithms to real-world problems. For example, autonomous vehicles must make decisions about how to drive safely given information flowing from cameras, radar, and microphones about... Continue
Nov 18, 2021.
Fig. 1: The BRIDGE dataset contains 7200 demonstrations of 71 kitchen-themed manipulation tasks across 10 domains. Note that any GIF compression artifacts in this animation are not present in the dataset itself. When... Continue
Nov 8, 2021.
Cross-posted from Bounded Regret. To understand neural networks, researchers often use similarity metrics to measure how similar or different two neural networks are to each other. For instance, they are used to compare vision transformers... Continue
Nov 5, 2021.
Many experimental works have observed that generalization in deep RL appears to be difficult: although RL agents can learn to perform very complex tasks, they don’t seem to generalize over diverse task distributions as well... Continue
Nov 3, 2021.
An example of our method deployed on a Clearpath Jackal ground robot (left) exploring a suburban environment to find a visual target (inset). (Right) Egocentric observations of the robot. Imagine you’re in an unfamiliar neighborhood... Continue
Oct 25, 2021.
Figure 1: Offline Model-Based Optimization (MBO): The goal of offline MBO is to optimize an unknown objective function $f(x)$ with respect to $x$, provided access only to a static, previously-collected dataset of designs. Machine learning... Continue
Oct 25, 2021.
Fig 1. Measures of generalization performance for neural networks trained on four different boolean functions (colors) with varying training set size. For both MSE (left) and learnability (right), theoretical predictions (curves) closely match true performance... Continue
Oct 22, 2021.
Diagram of MURAL, our method for learning uncertainty-aware rewards for RL. After the user provides a few examples of desired outcomes, MURAL automatically infers a reward function that takes into account these examples and the... Continue
Oct 14, 2021.
Cross-posted from Bounded Regret. Earlier this year, my research group commissioned 6 questions for professional forecasters to predict about AI. Broadly speaking, 2 were on geopolitical aspects of AI and 4 were on future capabilities:... Continue
Oct 6, 2021.
Fig. 1: Given the original image $\mathbf{x}$, we would like to generate a compressed image $\hat{\mathbf{x}}$ such that the user's action $\mathbf{a}$ upon seeing the compressed image is similar to what it would have been... Continue
Sep 29, 2021.
Along with researchers from Google Brain and OpenAI, we are releasing a paper on Unsolved Problems in ML Safety. Due to emerging safety challenges in ML, such as those introduced by recent large-scale models, we... Continue
Sep 28, 2021.
Fig 1. A wavelet adapting to new data. Recent deep neural networks (DNNs) often predict extremely well, but sacrifice interpretability and computational efficiency. Interpretability is crucial in many disciplines, such as science and medicine, where... Continue
Sep 24, 2021.
How do humans become so skillful? Well, initially we are not, but from infancy, we discover and practice increasingly complex skills through self-supervised play. But this play is not random - the child development literature... Continue
Jul 22, 2021.
We consider a problem: Can a machine learn from a few labeled pixels to predict every pixel in a new image? This task is extremely challenging (see Fig. 1) as a single body part could... Continue
Jul 14, 2021.
Recent years have demonstrated the potential of deep multi-agent reinforcement learning (MARL) to train groups of AI agents that can collaborate to solve complex tasks - for instance, AlphaStar achieved professional-level performance in the StarCraft... Continue
Jul 8, 2021.
TL;DR: We are launching a NeurIPS competition and benchmark called BASALT: a set of Minecraft environments and a human evaluation protocol that we hope will stimulate research and investigation into solving tasks with no pre-specified... Continue
May 3, 2021.
Reinforcement learning (RL) has been used successfully for solving tasks which have a well defined reward function – think AlphaZero for Go, OpenAI Five for Dota, or AlphaStar for StarCraft. However, in many practical situations... Continue
Apr 20, 2021.
Cross-posted from the DeepMind Safety blog. In many reinforcement learning problems the objective is too complex to be specified procedurally, and a reward function must instead be learned from user data. However, how can you... Continue
Apr 19, 2021.
Model-based reinforcement learning (MBRL) is a variant of reinforcement learning that includes a structured component of the system optimized solely to model the environment dynamics. Learning a model is... Continue
Mar 23, 2021.
Transformers have been successfully applied to a wide variety of modalities: natural language, vision, protein modeling, music, robotics, and more. A common trend with using large models is to train a transformer on a large... Continue
Mar 9, 2021.
Nearly all real-world applications of reinforcement learning involve some degree of shift between the training environment and the testing environment. However, prior work has observed that even small shifts in the environment cause most RL... Continue
Feb 25, 2021.
Our method learns a task in a fixed, simulated environment and quickly adapts to new environments (e.g. the real world) solely from online interaction during deployment. The ability for humans to generalize their knowledge and... Continue
Jan 5, 2021.
The Successor Representation, Gamma-Models, and Infinite-Horizon Prediction Standard single-step models have a horizon of one. This post describes a method for training predictive dynamics models in continuous state spaces with an infinite, probabilistic horizon. Reinforcement... Continue
Dec 20, 2020.
Most likely not. Yet, OpenAI’s GPT-2 language model does know how to reach a certain Peter W--- (name redacted for privacy). When prompted with a short snippet of Internet text, the model accurately generates Peter’s... Continue
Dec 7, 2020.
Deep reinforcement learning has made significant progress in the last few years, with success stories in robotic control, game playing and science problems. While RL methods present a general paradigm where an agent learns from... Continue
Nov 20, 2020.
Many tasks that we do on a regular basis, such as navigating a city, cooking a meal, or loading a dishwasher, require planning over extended periods of time. Accomplishing these tasks may seem simple to... Continue
Nov 18, 2020.
Multi-agent interacting systems are prevalent in the world, from purely physical systems to complicated social dynamic systems. The interactions between entities / components can give rise to very complex behavior patterns at the level of... Continue
Nov 16, 2020.
Current machine learning methods provide unprecedented accuracy across a range of domains, from computer vision to natural language processing. However, in many important high-stakes applications, such as medical diagnosis or autonomous driving, rare mistakes can... Continue
Nov 13, 2020.
Goodhart’s Law is an adage which states the following: “When a measure becomes a target, it ceases to be a good measure.” This is particularly pertinent in machine learning, where the source of many of... Continue
Nov 5, 2020.
Imagine that you are building the next generation machine learning model for handwriting transcription. Based on previous iterations of your product, you have identified a key challenge for this rollout: after deployment, new end users... Continue
Oct 13, 2020.
The two most common perspectives on reinforcement learning (RL) are optimization and dynamic programming. Methods that compute the gradients of the non-differentiable expected reward objective, such as the REINFORCE trick, are commonly grouped into the... Continue
Oct 6, 2020.
This post is cross-listed on the CMU ML blog. To operate successfully in unstructured open-world environments, autonomous intelligent agents need to solve many different tasks and learn new tasks quickly. Reinforcement learning has enabled artificial... Continue
Sep 10, 2020.
Our method learns complex behaviors by training offline from prior datasets (expert demonstrations, data from previous experiments, or random exploration data) and then fine-tuning quickly with online interaction. Robots trained with reinforcement learning (RL) have... Continue
Aug 16, 2020.
Editor’s Note: The following blog is a special guest post by a recent graduate of Berkeley BAIR’s AI4ALL summer program for high school students. AI4ALL is a nonprofit dedicated to increasing diversity and inclusion in... Continue
Aug 3, 2020.
The case fatality rate quantifies how dangerous COVID-19 is, and how risk of death varies with strata like geography, age, and race. Current estimates of the COVID-19 case fatality rate (CFR) are biased for dozens... Continue
Jul 24, 2020.
Despite recent advances in artificial intelligence (AI) research, human children are still by far the best learners we know of, learning impressive skills like language and high-level reasoning from very little data. Children’s learning is... Continue
Jul 19, 2020.
A remarkable characteristic of human intelligence is our ability to learn tasks quickly. Most humans can learn reasonably complex skills like tool-use and gameplay within just a few hours, and understand the basics after only... Continue
Jul 11, 2020.
Many neural network architectures that underlie various artificial intelligence systems today bear an interesting similarity to the early computers a century ago. Just as early computers were specialized circuits for specific purposes like solving linear... Continue
Jun 25, 2020.
In the last decade, one of the biggest drivers for success in machine learning has arguably been the rise of high-capacity models such as neural networks along with large datasets such as ImageNet to produce... Continue
Jun 14, 2020.
The World is Continuously Varying Imagine we want to train a self-driving car in New York so that we can take it all the way to Seattle without tediously driving it for over 48 hours.... Continue
May 14, 2020.
Human thumb next to our OmniTact sensor, and a US penny for scale. Touch has been shown to be important for dexterous manipulation in robotics. Recently, the GelSight sensor has caught significant interest for learning-based... Continue
May 5, 2020.
Humans manipulate 2D deformable structures such as fabric on a daily basis, from putting on clothes to making beds. Can robots learn to perform similar tasks? Successful approaches can advance applications such as dressing assistance... Continue
May 1, 2020.
This post is cross-listed on the CMU ML blog. The history of machine learning has largely been a story of increasing abstraction. In the dawn of ML, researchers spent considerable effort engineering features. As deep... Continue
Apr 27, 2020.
Robots have been useful in environments that can be carefully controlled, such as those commonly found in industrial settings (e.g. assembly lines). However, in unstructured settings like the home, we need robotic systems that are... Continue
Apr 23, 2020.
The interpretability of neural networks is becoming increasingly necessary, as deep learning is being adopted in settings where accurate and justifiable predictions are required. These applications range from finance to medical imaging. However, deep neural... Continue
Apr 3, 2020.
Quadruped robot learning locomotion skills by imitating a dog. Whether it’s a dog chasing after a ball, or a monkey swinging through the trees, animals can effortlessly perform an incredibly rich repertoire of agile locomotion... Continue
Mar 27, 2020.
Deep reinforcement learning (RL) has achieved superhuman performance in problems ranging from data center cooling to video games. RL policies may soon be widely deployed, with research underway in autonomous driving, negotiation and automated trading.... Continue
Mar 16, 2020.
Reinforcement learning has seen a great deal of success in solving complex decision making problems ranging from robotics to games to supply chain management to recommender systems. Despite their success, deep reinforcement learning algorithms can... Continue
Mar 12, 2020.
Look at the images above. If I asked you to bring me a picnic blanket in the grassy field, would you be able to? Of course. If I asked you to bring over a cart... Continue
Mar 5, 2020.
Model Training Can Be Slow In deep learning, using more compute (e.g., increasing model size, dataset size, or training steps) often leads to higher accuracy. This is especially true given the recent success of unsupervised... Continue
Jan 16, 2020.
In this blog post, we share our experiences in developing two critical software libraries that many BAIR researchers use to execute large-scale AI experiments: Ray Tune and the Ray Cluster Launcher, both of which now... Continue
Dec 18, 2019.
All living organisms carve out environmental niches within which they can maintain relative predictability amidst the ever-increasing entropy around them (1), (2). Humans, for example, go to great lengths to shield themselves from surprise —... Continue
Dec 16, 2019.
People give massive amounts of their personal data to companies every day, and these data are used to generate tremendous business value. Some economists and politicians argue that people should be paid for their contributions—but... Continue
Dec 13, 2019.
This work presents AVID, a method that allows a robot to learn a task, such as making coffee, directly by watching a human perform the task. One of the most important markers of intelligence is... Continue
Dec 12, 2019.
Reinforcement learning systems can make decisions in one of two ways. In the model-based approach, a system uses a predictive model of the world to ask questions of the form “what will happen if I... Continue
Dec 5, 2019.
One of the primary factors behind the success of machine learning approaches in open world settings, such as image recognition and natural language processing, has been the ability of high-capacity deep neural network function approximators... Continue
Nov 26, 2019.
This post is cross-listed at the SAIL Blog and the CMU ML blog. In the last decade, we’ve seen learning-based systems provide transformative solutions for a wide range of perception and reasoning problems, from recognizing... Continue
Nov 22, 2019.
Prof. Anca Dragan gave a talk as part of the WIRED25 summit, explaining some of the challenges robots face when interacting with people. First, robots that share space with people, from autonomous cars to quadrotors... Continue
Nov 4, 2019.
The incredible success of BERT in Natural Language Processing (NLP) showed that large models trained on unlabeled data are able to learn powerful representations of language. These representations have been shown to encode information about... Continue
Oct 28, 2019.
When learning to follow natural language instructions, neural networks tend to be very data hungry – they require a huge number of examples pairing language with actions in order to learn effectively. This post is... Continue
Oct 21, 2019.
AI agents have learned to play Dota, StarCraft, and Go by training to beat an automated system that increases in difficulty as the agent gains skill at the game: in vanilla self-play, the AI agent... Continue
Oct 14, 2019.
In this blog post, we explore a functional paradigm for implementing reinforcement learning (RL) algorithms. The paradigm will be that developers write the numerics of their algorithm as independent, pure functions, and then use a... Continue
Sep 30, 2019.
Figure 1: Our approach (PDDM) can efficiently and effectively learn complex dexterous manipulation skills in both simulation and the real world. Here, the learned model is able to control the 24-DoF Shadow Hand to rotate... Continue
Sep 26, 2019.
In this post, we share some recent promising results regarding the applications of Deep Learning in analog IC design. While this work targets a specific application, the proposed methods can be used in other black... Continue
Sep 24, 2019.
UPDATE (15 Feb 2020): Documentation is now available for rlpyt! See it at rlpyt.readthedocs.io. It describes program flow, code organization, and implementation details, including class, method, and function references for all components. The code examples still introduce... Continue
Sep 19, 2019.
We introduce Bit-Swap, a scalable and effective lossless data compression technique based on deep learning. It extends previous work on practical compression with latent variable models, based on bits-back coding and asymmetric numeral systems. In... Continue
Aug 13, 2019.
It is important whenever designing new technologies to ask “how will this affect people’s privacy?” This topic is especially important with regard to machine learning, where machine learning models are often trained on sensitive user... Continue
Jun 10, 2019.
To operate successfully in a complex and changing environment, learning agents must be able to acquire new skills quickly. Humans display remarkable skill in this area — we can learn to recognize a new object... Continue
Jun 7, 2019.
Effect of Population Based Augmentation applied to images; the applied augmentation differs at different points in training. In this blog post we introduce Population Based Augmentation (PBA), an algorithm that quickly and efficiently learns a state-of-the-art approach... Continue
Jun 3, 2019.
We are in the midst of an unprecedented convergence of two rapidly growing trends on our roadways: sharply increasing congestion and the deployment of autonomous vehicles. Year after year, highways get slower and slower: famously,... Continue
May 28, 2019.
Communicating the goal of a task to another person is easy: we can use language, show them an image of the desired outcome, point them to a how-to video, or use some combination of all... Continue
May 20, 2019.
Imagine a robot trying to learn how to stack blocks and push objects using visual inputs from a camera feed. In order to minimize cost and safety concerns, we want our robot to learn these... Continue
May 13, 2019.
Existing Computer Vision Setting vs. Real-World Scenario One day, an ecologist came to us. He wanted to use modern computer vision techniques to perform automatic animal identification in his wildlife camera trap image datasets. We... Continue
May 6, 2019.
Figure 1: Our model-based meta reinforcement learning algorithm enables a legged robot to adapt online in the face of an unexpected system malfunction (note the broken front right leg). Humans have the ability to seamlessly... Continue
Apr 11, 2019.
In many animals, tool-use skills emerge from a combination of observational learning and experimentation. For example, by watching one another, chimpanzees can learn how to use twigs to “fish” for insects. Similarly, capuchin monkeys demonstrate... Continue
Mar 27, 2019.
We all dream of a future in which autonomous cars can drive us to every corner of the world. Numerous researchers and companies are working day and night to chase this dream by overcoming scientific... Continue
Mar 24, 2019.
Last updated November 2020. The University of California Berkeley Artificial Intelligence Research (BAIR) Lab is pleased to announce the BAIR Open Research Commons, a new industrial affiliate program launched to accelerate cutting-edge AI research. AI... Continue
Mar 21, 2019.
Guiding our fingers while typing, enabling us to nimbly strike a matchstick, and inserting a key in a keyhole all rely on our sense of touch. It has been shown that the sense of touch... Continue
Mar 18, 2019.
TL;DR We present a benchmark for studying generalization in deep reinforcement learning (RL). Systematic empirical evaluation shows that vanilla deep RL algorithms generalize better than specialized deep RL algorithms designed specifically for generalization. In other... Continue
Feb 15, 2019.
“Scientific research has changed the world. Now it needs to change itself.” – The Economist, 2013. There has been a growing concern about the validity of scientific findings. A multitude of journals, papers and reports have... Continue
Feb 11, 2019.
It would be great if we could all have household robots do our chores for us. Chores are tasks that we want done to make our houses cater more to our preferences; they are a... Continue
Dec 14, 2018.
We are announcing the release of our state-of-the-art off-policy model-free reinforcement learning algorithm, soft actor-critic (SAC). This algorithm has been developed jointly at UC Berkeley and Google, and we have been using it internally for... Continue
Dec 12, 2018.
An earlier version of this post is on the RISELab blog. It is posted here with the permission of the authors. We just rolled out general support for multi-agent reinforcement learning in Ray RLlib 0.6.0.... Continue
Dec 5, 2018.
Figure: An artistic representation of single-cell RNA sequencing. The stars in the sky represent cells in a heterogeneous tissue. The projection of the stars onto the river reveals relationships among them that are not apparent... Continue
Nov 30, 2018.
With very little explicit supervision and feedback, humans are able to learn a wide range of motor skills by simply interacting with and observing the world through their senses. While there has been significant progress... Continue
Nov 26, 2018.
Figure 1: (left) LED Array Microscope constructed using a standard commercial microscope and an LED array. (middle) Close up on the LED array dome mounted on the microscope. (right) LED array displaying patterns at 100Hz.... Continue
Nov 14, 2018.
In many tasks in machine learning, it is common to want to answer questions given fixed, pre-collected datasets. In some applications, however, we are not given data a priori; instead, we must collect the data... Continue
Oct 23, 2018.
Top left: image of a 3D cube. Top right: example depth image, with darker points representing areas closer to the camera (source: Wikipedia). Next two rows: examples of depth and RGB image pairs for grasping... Continue
Oct 9, 2018.
Simulated characters imitating skills from YouTube videos. Whether it’s everyday tasks like washing our hands or stunning feats of acrobatic prowess, humans are able to learn an incredible array of skills by watching other humans.... Continue
Sep 6, 2018.
We want to build agents that can accomplish arbitrary goals in unstructured complex environments, such as a personal robot that can perform household chores. A promising approach is to use deep reinforcement learning, which is... Continue
Aug 31, 2018.
In this post, we demonstrate how deep reinforcement learning (deep RL) can be used to learn how to control dexterous hands for a variety of manipulation tasks. We discuss how such methods can learn to... Continue
Aug 6, 2018.
An earlier version of this post was published on Off the Convex Path. It is reposted here with the author’s permission. In the last few years, deep learning practitioners have proposed a litany of different... Continue
Jun 28, 2018.
Learning a new skill by observing another individual, the ability to imitate, is a key part of intelligence in humans and animals. Can we enable a robot to do the same, learning to manipulate a... Continue
Jun 18, 2018.
We are excited by the interest and excitement generated by our BDD100K dataset. Our data release and blog post were covered in an unsolicited article by the UC Berkeley newspaper, the Daily Cal, which was... Continue
May 30, 2018.
Update 06/18/2018: please also check our follow-up blog post after reading this. TL;DR: we released the largest and most diverse driving video dataset with rich annotations, called BDD100K. You can access the data for research... Continue
May 17, 2018.
Machine learning systems trained to minimize prediction error may often exhibit discriminatory behavior based on sensitive characteristics such as race and gender. One reason could be historical bias in the data. In various... Continue
Apr 26, 2018.
You’ve decided that you want to bike from your house by UC Berkeley to the Golden Gate Bridge. It’s a nice 20 mile ride, but there’s a problem: you’ve never ridden a bike before! To... Continue
Apr 18, 2018.
A blind, autonomous pilot (left), suboptimal human pilot (center), and combined human-machine team (right) play the Lunar Lander game. Imagine a drone pilot remotely flying a quadrotor, using an onboard camera to navigate and land.... Continue
Apr 10, 2018.
Simulated humanoid performing a variety of highly dynamic and acrobatic skills. Motion control problems have become standard benchmarks for reinforcement learning, and deep RL methods have been shown to be effective for a diverse suite... Continue
Mar 13, 2018.
Left: Given movie poster, Right: New movie title generated by MC-GAN. Text is a prominent visual element of 2D design. Artists invest significant time into designing glyphs that are visually compatible with other elements in... Continue
Feb 6, 2018.
Humans physically interact with each other every day – from grabbing someone’s hand when they are about to spill their drink, to giving your friend a nudge to steer them in the right direction, physical... Continue
Jan 23, 2018.
Feature selection is a common method for dimensionality reduction that encourages model interpretability. With large data sets becoming ever more prevalent, feature selection has seen widespread usage across a variety of real-world tasks in recent... Continue
Jan 9, 2018.
As machine learning algorithms and techniques have advanced, more and more machine learning applications require multiple machines and must exploit parallelism. However, the infrastructure for doing machine learning on clusters remains ad-hoc. While good solutions... Continue
Dec 30, 2017.
This post is based on recent research by Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, Dawn Song, and Florian Tramèr. Deep neural networks (DNNs) have enabled great progress... Continue
Dec 20, 2017.
Reinforcement Learning (RL) is a powerful technique capable of solving complex tasks such as locomotion, Atari games, racing games, and robotic manipulation tasks, all through training an agent to optimize behaviors over a reward function.... Continue
Dec 12, 2017.
Democratization of Robots in Factories In modern factories, human workers and robots are two major workforces. For safety reasons, the two are normally separated, with robots confined in metal cages, which limits productivity as... Continue
Dec 5, 2017.
The Problem: Fast and Safe Motion Planning Real time autonomous motion planning and navigation is hard, especially when we care about safety. This becomes even more difficult when we have systems with complicated dynamics, external... Continue
Nov 30, 2017.
Fig 1. A learned neural network dynamics model enables a hexapod robot to learn to run and follow desired trajectories, using just 17 minutes of real-world experience. Enabling robots to act autonomously in the real-world... Continue
Nov 9, 2017.
Why we need Attention What we see through our eyes is only a very small part of the world around us. At any given time our eyes are sampling only a fraction of the surrounding... Continue
Oct 26, 2017.
Toyota HSR Trained with DART to Make a Bed. In Imitation Learning (IL), also known as Learning from Demonstration (LfD), a robot learns a control policy from analyzing demonstrations of the policy performed by an... Continue
Oct 17, 2017.
Deep imitation learning and deep reinforcement learning have potential to learn robot control policies that map high-dimensional sensor inputs to controls. While these approaches have been very successful at learning short duration tasks, such as... Continue
Oct 6, 2017.
Deep reinforcement learning (deep RL) has achieved success in many tasks, such as playing video games from raw pixels (Mnih et al., 2015), playing the game of Go (Silver et al., 2016), and simulated robotic... Continue
Sep 12, 2017.
Since we posted our paper on “Learning to Optimize” last year, the area of optimizer learning has received growing attention. In this article, we provide an introduction to this line of work and share our... Continue
Sep 5, 2017.
Consider looking at a photograph of a chair. We humans have the remarkable capacity of inferring properties about the 3D shape of the chair from this single photograph even if we might not have seen... Continue
Aug 31, 2017.
This post was initially published on Off the Convex Path. It is reposted here with authors’ permission. A core, emerging problem in nonconvex optimization involves the escape of saddle points. While recent research has shown... Continue
Aug 23, 2017.
Digitally reconstructing 3D geometry from images is a core problem in computer vision. There are various applications, such as movie productions, content generation for video games, virtual and augmented reality, 3D printing and many more.... Continue
Aug 17, 2017.
Be careful what you reward “Be careful what you wish for!” – we’ve all heard it! The story of King Midas is there to warn us of what might happen when we’re not. Midas, a... Continue
Aug 8, 2017.
Given an image, humans can easily infer the salient entities in it, and describe the scene effectively, such as, where objects are located (in a forest or in a kitchen?), what attributes an object has... Continue
Aug 2, 2017.
Over the last few years we have experienced an enormous data deluge, which has played a key role in the surge of interest in AI. A partial list of some large datasets: ImageNet, with over... Continue
Jul 18, 2017.
A key aspect of intelligence is versatility – the capability of doing many different things. Current AI systems excel at mastering a single skill, such as Go, Jeopardy, or even helicopter aerobatics. But, when you... Continue
Jul 11, 2017.
Given only a single 2D image, humans are able to effortlessly infer the rich 3D structure of the underlying scene. Since inferring 3D from 2D is an ambiguous task by itself (see e.g. the left... Continue
Jul 6, 2017.
(Based on joint work with David Held, Aviv Tamar, and Pieter Abbeel.) Deep reinforcement learning (RL) has enabled some remarkable achievements in hard control problems: with deep RL, agents have learned to play video games... Continue
Jun 27, 2017.
Reliable robot grasping across many objects is challenging due to sensor noise and occlusions that lead to uncertainty about the precise shape, position, and mass of objects. The Dexterity Network (Dex-Net) 2.0 is a project... Continue
Jun 20, 2017.
(Joint work with Ronghang Hu, Marcus Rohrbach, Trevor Darrell, Dan Klein and Kate Saenko.) Suppose we’re building a household robot, and want it to be able to answer questions about its surroundings. We might ask... Continue
Jun 20, 2017.
Berkeley AI Research (BAIR) brings together researchers at UC Berkeley across the areas of computer vision, machine learning, natural language processing, planning, and robotics, and each year we publish cutting edge research across all of... Continue