Archive

All Blog Posts in Reverse Chronological Order

29 Jul 2026 » From CUDA to MLX: How K-Search Brings Decades of Kernel Expertise to Apple Silicon
26 Jul 2026 » Teaching LLMs to Update Beliefs for Efficient Long-Horizon Interaction
07 Jul 2026 » Intelligence is Free, Now What? Data Systems for, of, and by Agents
01 Jul 2026 » 2026 BAIR Graduate Showcase
08 May 2026 » Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling
20 Apr 2026 » Gradient-based Planning for World Models at Longer Horizons
13 Mar 2026 » Identifying Interactions at Scale for LLMs
10 Jan 2026 » Information-Driven Design of Imaging Systems
01 Nov 2025 » RL without TD learning
01 Sep 2025 » What exactly does word2vec learn?
01 Jul 2025 » Whole-Body Conditioned Egocentric Video Prediction
11 Apr 2025 » Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)
08 Apr 2025 » Repurposing Protein Folding Models for Generation with Latent Diffusion
25 Mar 2025 » Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment
12 Nov 2024 » Virtual Personas for Language Models via an Anthology of Backstories
20 Sep 2024 » Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
28 Aug 2024 » How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark
20 Jul 2024 » Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!
29 May 2024 » TinyAgent: Function Calling at the Edge
21 Mar 2024 » Modeling Extremely Large Images with xT
11 Mar 2024 » 2024 BAIR Graduate Directory
18 Feb 2024 » The Shift from Models to Compound AI Systems
14 Nov 2023 » Ghostbuster: Detecting Text Ghostwritten by Large Language Models
14 Nov 2023 » Asymmetric Certified Robustness via Feature-Convex Neural Networks
17 Oct 2023 » Goal Representations for Instruction Following
16 Oct 2023 » Rethinking the Role of PPO in RLHF
14 Jul 2023 » Training Diffusion Models with Reinforcement Learning
10 Jul 2023 » On the Stepwise Nature of Self-Supervised Learning
29 Jun 2023 » Generating 3D Molecular Conformers via Equivariant Coarse-Graining and Aggregated Attention
23 May 2023 » GPT-4 + Stable-Diffusion = ?: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
06 Apr 2023 » Interactive Fleet Learning
03 Apr 2023 » Koala: A Dialogue Model for Academic Research
20 Jan 2023 » Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation
19 Sep 2022 » Keeping Learning-Based Control Safe by Regulating Distributional Shift
29 Aug 2022 » Reverse engineering the NTK: towards first-principles architecture design
10 Jul 2022 » Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation
30 Jun 2022 » FIGS: Attaining XGBoost-level performance with the interpretability and speed of CART
20 May 2022 » The Berkeley Crossword Solver
03 May 2022 » Rethinking Human-in-the-Loop for Artificial Augmented Intelligence
29 Apr 2022 » Designing Societally Beneficial Reinforcement Learning Systems
25 Apr 2022 » Should I Use Offline RL or Imitation Learning?
20 Apr 2022 » Offline RL Made Easier: No TD Learning, Advantage Reweighting, or Transformers
21 Mar 2022 » Accelerating Ukraine Intelligence Analysis with Computer Vision on Synthetic Aperture Radar Imagery
23 Feb 2022 » Unsupervised Skill Discovery with Contrastive Intrinsic Control
02 Feb 2022 » imodels: leveraging the unreasonable effectiveness of rules
15 Dec 2021 » The Unsupervised Reinforcement Learning Benchmark
19 Nov 2021 » Sequence Modeling Solutions for Reinforcement Learning Problems
19 Nov 2021 » Which Mutual Information Representation Learning Objectives are Sufficient for Control?
18 Nov 2021 » Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
05 Nov 2021 » Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
03 Nov 2021 » RECON: Learning to Explore the Real World with a Ground Robot
01 Nov 2021 » Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
25 Oct 2021 » Designs from Data: Offline Black-Box Optimization via Conservative Training
25 Oct 2021 » A First-Principles Theory of Neural Network Generalization
22 Oct 2021 » Making RL Tractable by Learning More Informative Reward Functions: Example-Based Control, Meta-Learning, and Normalized Maximum Likelihood
14 Oct 2021 » Updates and Lessons from AI Forecasting
06 Oct 2021 » PICO: Pragmatic Compression for Human-in-the-Loop Decision-Making
29 Sep 2021 » Unsolved ML Safety Problems
28 Sep 2021 » Distilling neural networks into wavelet models using interpretations
24 Sep 2021 » What Can I Do Here? Learning New Skills by Imagining Visual Affordances
22 Jul 2021 » Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning
14 Jul 2021 » The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
08 Jul 2021 » BASALT: A Benchmark for Learning from Human Feedback
03 May 2021 » Learning What To Do by Simulating the Past
20 Apr 2021 » An EPIC way to evaluate reward functions
19 Apr 2021 » The Importance of Hyperparameter Optimization for Model-based Reinforcement Learning
23 Mar 2021 » Pretrained Transformers as Universal Computation Engines
09 Mar 2021 » Maximum Entropy RL (Provably) Solves Some Robust RL Problems
25 Feb 2021 » Self-Supervised Policy Adaptation during Deployment
05 Jan 2021 » The Successor Representation, $\gamma$-Models, and Infinite-Horizon Prediction
20 Dec 2020 » Does GPT-2 Know Your Phone Number?
07 Dec 2020 » Offline Reinforcement Learning: How Conservative Algorithms Can Enable New Applications
20 Nov 2020 » Learning State Abstractions for Long-Horizon Planning
13 Nov 2020 » Goodhart’s Law, Diversity and a Series of Seemingly Unrelated Toy Problems
05 Nov 2020 » Adapting on the Fly to Test Time Distribution Shift
13 Oct 2020 » Reinforcement learning is supervised learning on optimized data
06 Oct 2020 » Plan2Explore: Active Model-Building for Self-Supervised Visual Reinforcement Learning
10 Sep 2020 » AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
16 Aug 2020 » AI Will Change the World. Who Will Change AI? We Will.
03 Aug 2020 » Estimating the fatality rate is difficult but doable with better data
24 Jul 2020 » Exploring Exploration: Comparing Children with RL Agents in Unified Environments
19 Jul 2020 » Can RL From Pixels be as Efficient as RL From State?
11 Jul 2020 » Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
25 Jun 2020 » D4RL: Building Better Benchmarks for Offline Reinforcement Learning
14 Jun 2020 » Open Compound Domain Adaptation
14 May 2020 » OmniTact: A Multi-Directional High-Resolution Touch Sensor
05 May 2020 » Four Novel Approaches to Manipulating Fabric using Model-Free and Model-Based Deep Learning in Simulation
01 May 2020 » Unsupervised Meta-Learning: Learning to Learn without Supervision
27 Apr 2020 » The Ingredients of Real World Robotic Reinforcement Learning
23 Apr 2020 » Making Decision Trees Accurate Again: Explaining What Explainable AI Did Not
03 Apr 2020 » Robots Learning to Move like Animals
27 Mar 2020 » Physically Realistic Attacks on Deep Reinforcement Learning
16 Mar 2020 » Does On-Policy Data Collection Fix Errors in Off-Policy Reinforcement Learning?
12 Mar 2020 » BADGR: The Berkeley Autonomous Driving Ground Robot
05 Mar 2020 » Speeding Up Transformer Training and Inference By Increasing Model Size
16 Jan 2020 » Large Scale Training at BAIR with Ray Tune
18 Dec 2019 » Emergent Behavior by Minimizing Chaos
16 Dec 2019 » What is My Data Worth?
13 Dec 2019 » Learning to Imitate Human Demonstrations via CycleGAN
12 Dec 2019 » Model-Based Reinforcement Learning: Theory and Practice
05 Dec 2019 » Data-Driven Deep Reinforcement Learning
26 Nov 2019 » RoboNet: A Dataset for Large-Scale Multi-Robot Learning
22 Nov 2019 » Prof. Anca Dragan Talks About Human-Robot Interaction for WIRED
04 Nov 2019 » Can We Learn the Language of Proteins?
28 Oct 2019 » Look then Listen: Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following
21 Oct 2019 » Collaborating with Humans Requires Understanding Them
14 Oct 2019 » Functional RL with Keras and Tensorflow Eager
30 Sep 2019 » Deep Dynamics Models for Dexterous Manipulation
26 Sep 2019 » Sample Efficient Evolutionary Algorithm for Analog Circuit Design
24 Sep 2019 » rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch
19 Sep 2019 » A Deep Learning Approach to Data Compression
13 Aug 2019 » Evaluating and Testing Unintended Memorization in Neural Networks
10 Jun 2019 » Learning to Learn with Probabilistic Task Embeddings
07 Jun 2019 » 1000x Faster Data Augmentation
03 Jun 2019 » Autonomous Vehicles for Social Good: Learning to Solve Congestion
28 May 2019 » End-to-End Deep Reinforcement Learning without Reward Engineering
20 May 2019 » Model-Based Reinforcement Learning from Pixels with Structured Latent Variable Models
13 May 2019 » Large-Scale Long-Tailed Recognition in an Open World
06 May 2019 » Robots that Learn to Adapt
11 Apr 2019 » Robots that Learn to Use Improvised Tools
27 Mar 2019 » CVPR 2019 Challenges on Domain Adaptation in Autonomous Driving
24 Mar 2019 » Announcing the BAIR Open Research Commons
21 Mar 2019 » Manipulation By Feel
18 Mar 2019 » Assessing Generalization in Deep Reinforcement Learning
15 Feb 2019 » Controlling False Discoveries in Large-Scale Experimentation: Challenges and Solutions
11 Feb 2019 » Learning Preferences by Looking at the World
14 Dec 2018 » Soft Actor Critic—Deep Reinforcement Learning with Real-World Robots
12 Dec 2018 » Scaling Multi-Agent Reinforcement Learning
05 Dec 2018 » Building Gene Expression Atlases with Deep Generative Models for Single-cell Transcriptomics
30 Nov 2018 » Visual Model-Based Reinforcement Learning as a Path towards Generalist Robots
26 Nov 2018 » Physics-Based Learned Design: Teaching a Microscope How to Image
14 Nov 2018 » AdaSearch: A Successive Elimination Approach to Adaptive Search
23 Oct 2018 » Drilling Down on Depth Sensing and Deep Learning
09 Oct 2018 » Learning Acrobatics by Watching YouTube
06 Sep 2018 » Visual Reinforcement Learning with Imagined Goals
31 Aug 2018 » Dexterous Manipulation with Reinforcement Learning: Efficient, General, and Low-Cost
06 Aug 2018 » When Recurrent Models Don’t Need to be Recurrent
28 Jun 2018 » One-Shot Imitation from Watching Videos
18 Jun 2018 » BDD100K Blog Update
30 May 2018 » BDD100K: A Large-scale Diverse Driving Video Database
17 May 2018 » Delayed Impact of Fair Machine Learning
26 Apr 2018 » TDM: From Model-Free to Model-Based Deep Reinforcement Learning
18 Apr 2018 » Shared Autonomy via Deep Reinforcement Learning
10 Apr 2018 » Towards a Virtual Stuntman
13 Mar 2018 » Transfer Your Font Style with GANs
06 Feb 2018 » Learning Robot Objectives from Physical Human Interaction
23 Jan 2018 » Kernel Feature Selection via Conditional Covariance Minimization
09 Jan 2018 » Ray: A Distributed System for AI
30 Dec 2017 » Physical Adversarial Examples Against Deep Neural Networks
20 Dec 2017 » Reverse Curriculum Generation for Reinforcement Learning Agents
12 Dec 2017 » Towards Intelligent Industrial Co-robots
05 Dec 2017 » FaSTrack: Ensuring Safe Real-Time Navigation of Dynamic Systems
30 Nov 2017 » Model-based Reinforcement Learning with Neural Network Dynamics
09 Nov 2017 » The Emergence of a Fovea while Learning to Attend
26 Oct 2017 » DART: Noise Injection for Robust Imitation Learning
17 Oct 2017 » Learning Long Duration Sequential Task Structure From Demonstrations with Application in Surgical Robotics
06 Oct 2017 » Learning Diverse Skills via Maximum Entropy Deep Reinforcement Learning
12 Sep 2017 » Learning to Optimize with Reinforcement Learning
05 Sep 2017 » Learning a Multi-View Stereo Machine
31 Aug 2017 » How to Escape Saddle Points Efficiently
23 Aug 2017 » High Quality 3D Object Reconstruction from a Single Color Image
17 Aug 2017 » Cooperatively Learning Human Values
08 Aug 2017 » Captioning Novel Objects in Images
02 Aug 2017 » Minibatch Metropolis-Hastings
18 Jul 2017 » Learning to Learn
11 Jul 2017 » The Confluence of Geometry and Learning
06 Jul 2017 » Constrained Policy Optimization
27 Jun 2017 » Releasing the Dexterity Network (Dex-Net) 2.0 Dataset for Deep Grasping
20 Jun 2017 » Learning to Reason with Neural Module Networks
20 Jun 2017 » Introducing the BAIR Blog