For our first episode of 2022, we are joined by our friends from the Towards Data Science podcast to discuss the AI-related trends and events of 2021.
Some things we discuss are:
Foundation models continue to grow, but one interesting trend is the focus on efficiency along with (instead of?) scale. For example, while DeepMind’s Gopher model has fewer than twice the parameters of GPT-3, it’s reportedly 25 times more efficient, meaning that much more value is being squeezed out of the same training data and compute. AI21 Labs’ Jurassic models are also roughly equal to GPT-3 on a parameter count basis, but reflect a focus on architecture optimization over raw scaling that we expect to persist into 2022. (That’s not to say significant scaling won’t happen, or that it hasn’t happened already; Microsoft and NVIDIA’s Megatron-Turing NLG, released a few months ago, is over half a trillion parameters in size. But it’s safe to say that scaling won’t be done without simultaneous efficiency optimizations that were less of a focus in late 2020.)
Procedural environment generation has been a big theme in reinforcement learning. In Open-Ended Learning Leads to Generally Capable Agents, the team at DeepMind showed how training RL agents on a wide range of environments can lead to emergent behaviour associated with generalization, like trial and error and cooperation with friendly agents.
Open-ended learning (OEL) seems like an interesting wildcard, which some researchers think might be an important ingredient in the final AGI recipe. We spoke with OpenAI’s head of open-ended learning, Ken Stanley, about what role OEL might play in the future of AI on this episode of the TDS podcast.
A NeurIPS spotlight paper titled Optimal Policies Tend to Seek Power, and subsequent work by the same author, are showing that we should expect highly capable AI systems to engage in dangerous behaviour that’s misaligned with human values, by default. Specifically, highly competent agents will tend to search for states that are powerful, in the sense that they offer many downstream options. This finding makes a compelling case that AI alignment ought to be prioritized, particularly given the rate of progress we’re seeing in AI capabilities more broadly. If it really is the case that capable AI systems will be dangerous by default, active effort must be invested in safety research.
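To make the idea of "states that offer many downstream options" concrete, here is a minimal sketch in Python. It is not the paper's formal definition of power; it simply counts how many distinct states are reachable from each state in a hypothetical toy environment, as a rough proxy. The state names and the `TRANSITIONS` table are invented for illustration.

```python
from collections import deque

# Hypothetical toy environment (an assumption for illustration only):
# each state maps to the states reachable from it in one step.
TRANSITIONS = {
    "start": ["hub", "dead_end"],
    "hub": ["a", "b", "c"],
    "dead_end": [],
    "a": [],
    "b": [],
    "c": [],
}


def downstream_options(state, transitions):
    """Count distinct states reachable from `state`, excluding itself.

    This reachability count is only a crude stand-in for "power";
    it ignores rewards and probabilities entirely.
    """
    seen = {state}
    queue = deque([state])
    while queue:
        current = queue.popleft()
        for nxt in transitions.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1


# Number of downstream options for every state in the toy environment.
option_counts = {s: downstream_options(s, TRANSITIONS) for s in TRANSITIONS}

for name, count in option_counts.items():
    print(f"{name}: {count} downstream states")
```

In this toy setup, an agent at "start" that is unsure which goal it will later be asked to reach keeps more options open by moving to "hub" than to "dead_end", which is the intuition behind the power-seeking argument.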
Outline:
0:00 Intro
2:15 Rise of multi-modal models
7:40 Growth of hardware and compute
13:20 Reinforcement learning
20:45 Open-ended learning
26:15 Power seeking paper
32:30 Safety and assumptions
35:20 Intrinsic vs. extrinsic motivation
42:00 Mapping natural language
46:20 Timnit Gebru’s research institute
49:20 Wrap-up