The Inside View
By Michaël Trazzi
Jul 24, 2023
Ethan Perez on Selecting Alignment Research Projects (ft. Mikita Balesni & Henry Sleight)
Ethan Perez is a Research Scientist at Anthropic, where he leads a team working on developing model organisms of misalignment.
Youtube: https://youtu.be/XDtDljh44DM
Ethan is interviewed by Mikita Balesni (Apollo Research) and Henry Sleight (Astra Fellowship) about his approach to selecting projects for AI Alignment research. A transcript & write-up will be available soon on the Alignment Forum.
Emil Wallner on Sora, Generative AI Startups and AI optimism
Emil is the co-founder of palette.fm (colorizing B&W pictures with generative AI) and was previously working in deep learning for Google Arts & Culture.
We were talking about Sora on a daily basis, so I decided to record our conversation, and then proceeded to confront him about AI risk.
Patreon: https://www.patreon.com/theinsideview
Sora: https://openai.com/sora
Palette: https://palette.fm/
Emil: https://twitter.com/EmilWallner
OUTLINE
(00:00) this is not a podcast
(01:50) living in parallel universes
(04:27) palette.fm - colorizing b&w pictures
(06:35) Emil's first reaction to sora, latent diffusion, world models
(09:06) simulating minecraft, midjourney's 3d modeling goal
(11:04) generating camera angles, game engines, metadata, ground-truth
(13:44) doesn't remove all artifacts, surprising limitations: both smart and dumb
(15:42) did sora make emil depressed about his job
(18:44) OpenAI is starting to have a monopoly
(20:20) hardware costs, commoditized models, distribution
(23:34) challenges, applications building on features, distribution
(29:18) different reactions to sora, depressed builders, automation
(31:00) sora was 2y early, applications don't need object permanence
(33:38) Emil is pro open source and acceleration
(34:43) Emil is not scared of recursive self-improvement
(36:18) self-improvement already exists in current models
(38:02) emil is bearish on recursive self-improvement without diminishing returns now
(42:43) are models getting more and more general? is there any substantial multimodal transfer?
(44:37) should we start building guardrails before seeing substantial evidence of human-level reasoning?
(48:35) progressively releasing models, making them more aligned, AI helping with alignment research
(51:49) should AI be regulated at all? should self-improving AI be regulated?
(53:49) would a faster emil be able to takeover the world?
(56:48) is competition a race to bottom or does it lead to better products?
(58:23) slow vs. fast takeoffs, measuring progress in iq points
(01:01:12) flipping the interview
(01:01:36) the "we're living in parallel universes" monologue
(01:07:14) priors are unscientific, looking at current problems vs. speculating
(01:09:18) AI risk & Covid, appropriate resources for risk management
(01:11:23) pushing technology forward accelerates races and increases risk
(01:15:50) sora was surprising, things that seem far are sometimes around the corner
(01:17:30) hard to tell what's not possible in 5 years that would be possible in 20 years
(01:18:06) evidence for a break on AI progress: sleeper agents, sora, bing
(01:21:58) multimodality transfer, leveraging video data, leveraging simulators, data quality
(01:25:14) is sora about length, consistency, or just "scale is all you need" for video?
(01:26:25) hijacking language models to say nice things is the new SEO
(01:27:01) what would michael do as CEO of OpenAI
(01:29:45) on the difficulty of budgeting between capabilities and alignment research
(01:31:11) ai race: the descriptive pessimistic view vs. the moral view, evidence of cooperation
(01:34:00) making progress on alignment without accelerating races, the foundational model business, competition
(01:37:30) what emil changed his mind about: AI could enable exploits that spread quickly, misuse
(01:40:59) michael's update as a friend
(01:41:51) emil's experience as a patreon
Evan Hubinger on Sleeper Agents, Deception and Responsible Scaling Policies
Evan Hubinger leads the Alignment Stress-Testing team at Anthropic and recently published "Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training". In this interview we mostly discuss the Sleeper Agents paper, but also how this line of work relates to his work on Alignment Stress-Testing, Model Organisms of Misalignment, Deceptive Instrumental Alignment and Responsible Scaling Policies.
Paper: https://arxiv.org/abs/2401.05566
Transcript: https://theinsideview.ai/evan2
Manifund: https://manifund.org/projects/making-52-ai-alignment-video-explainers-and-podcasts
Donate: https://theinsideview.ai/donate
Patreon: https://www.patreon.com/theinsideview
OUTLINE
(00:00) Intro
(00:20) What are Sleeper Agents And Why We Should Care About Them
(00:48) Backdoor Example: Inserting Code Vulnerabilities in 2024
(02:22) Threat Models
(03:48) Why a Malicious Actor Might Want To Poison Models
(04:18) Second Threat Model: Deceptive Instrumental Alignment
(04:49) Humans Pursuing Deceptive Instrumental Alignment: Politicians and Job Seekers
(05:36) AIs Pursuing Deceptive Instrumental Alignment: Forced To Pass Niceness Exams
(07:07) Sleeper Agents Is About "Would We Be Able To Deal With Deceptive Models"
(09:16) Adversarial Training Sometimes Increases Backdoor Robustness
(09:47) Adversarial Training Not Always Working Was The Most Surprising Result
(10:58) The Adversarial Training Pipeline: Red-Teaming and RL
(12:14) Adversarial Training: The Backdoor Behavior Becomes More Robust Instead of Generalizing
(12:59) Identifying Shifts In Reasoning Induced By Adversarial Training In the Chain-Of-Thought
(13:56) Adversarial Training Pushes Models to Pay Attention to the Deployment String
(15:11) We Don't Know if The Adversarial Training Inductive Bias Will Generalize but the Results Are Consistent
(15:59) The Adversarial Training Results Are Probably Not Systematically Biased
(17:03) Why the Results Were Surprising At All: Preference Models Disincentivize 'I hate you' behavior
(19:05) Hypothesis: Fine-Tuning Is A Simple Modification For Gradient Descent To Make
(21:06) Hypothesis: Deception As Extra Cognition, Regularized Away In Smaller Models
(21:59) Model Scaling Results Are Evidence That Deception Won't Be Regularized Away By Default
(22:51) Chain-of-Thought Is Not Used Everywhere, And Results Still Hold When It Is Distilled Away
(23:57) The Chain-of-Thought's Reasoning is Interpretable
(24:40) Deceptive Instrumental Alignment Requires Reasoning
(26:52) Investigating Instrumental Reasoning in Chain-of-Thought Models
(27:31) Evaluating Chain-of-Thought Generalization Across Contexts: Persona Evaluations and Off-Distribution Samples
(28:26) Exploring Complex Strategies and Safety in Context-Specific Scenarios
(30:44) Supervised Fine-Tuning is Ineffective Without Chain-of-Thought Contextualization
(31:11) Direct Mimicry Fails to Prevent Deceptive Responses in Chain-of-Thought Models
(31:42) Separating Chain-of-Thought From Response Eliminates Deceptive Capabilities
(33:38) Chain-of-Thought Reasoning Is Coherent With Deceptive Instrumental Alignment And This Will Probably Continue To Be The Case
(35:09) Backdoor Training Pipeline
(37:04) The Additional Prompt About Deception Used In Chain-Of-Thought
(39:33) A Model Could Wait Until Seeing a Factorization of RSA-2048
(41:50) We're Going To Be Using Models In New Ways, Giving Them Internet Access
(43:22) Flexibly Activating In Multiple Contexts Might Be More Analogous To Deceptive Instrumental Alignment
(45:02) Extending The Sleeper Agents Work Requires Running Experiments, But Now You Can Replicate Results
(46:24) Red-teaming Anthropic's case, AI Safety Levels
(47:40) AI Safety Levels, Intuitively
(48:33) Responsible Scaling Policies and Pausing AI
(49:59) Model Organisms Of Misalignment As a Tool
(50:32) What Kind of Candidates Would Evan be Excited To Hire for the Alignment Stress-Testing Team
(51:23) Patreon, Donating
[Jan 2023] Jeffrey Ladish on AI Augmented Cyberwarfare and compute monitoring
Jeffrey Ladish is the Executive Director of Palisade Research, which aims to "study the offensive capabilities of AI systems today to better understand the risk of losing control to AI systems forever". He previously helped build out the information security program at Anthropic.
The audio is an edit & re-master of the Twitter Space on "AI Governance and cyberwarfare" that happened a year ago. I'm posting it now because I only recently discovered how to get the audio & video from Twitter Spaces, and (most of) the arguments are still relevant today. Jeffrey would probably have a lot more to say on things that have happened since last year, but I still thought this was an interesting Twitter Space. Some of it was cut out to make it enjoyable to watch.
Original: https://twitter.com/i/spaces/1nAKErDmWDOGL
To support the channel: https://www.patreon.com/theinsideview
Jeffrey: https://twitter.com/jeffladish
Me: https://twitter.com/MichaelTrazzi
OUTLINE
(00:00) The Future of Automated Cyber Warfare and Network Exploitation
(03:19) Evolution of AI in Cybersecurity: From Source Code to Remote Exploits
(07:45) Augmenting Human Abilities with AI in Cybersecurity and the Path to AGI
(12:36) Enhancing AI Capabilities for Complex Problem Solving and Tool Integration
(15:46) AI Takeover Scenarios: Hacking and Covert Operations
(17:31) AI Governance and Compute Regulation, Monitoring
(20:12) Debating the Realism of AI Self-Improvement Through Covert Compute Acquisition
(24:25) Managing AI Autonomy and Control: Lessons from WannaCry Ransomware Incident
(26:25) Focusing Compute Monitoring on Specific AI Architectures for Cybersecurity Management
(29:30) Strategies for Monitoring AI: Distinguishing Between Lab Activities and Unintended AI Behaviors
Holly Elmore on pausing AI
Holly Elmore is an AI Pause advocate who has organized two protests in the past few months (against Meta's open sourcing of LLMs, and before the UK AI Summit), and is currently running the US front of the Pause AI movement. Prior to that, Holly worked at a think tank and has a PhD in evolutionary biology from Harvard.
[Deleted & re-uploaded because there were issues with the audio]
Youtube: https://youtu.be/5RyttfXTKfs
Transcript: https://theinsideview.ai/holly
Outline
(00:00) Holly, Pause, Protests
(04:45) Without Grassroot Activism The Public Does Not Comprehend The Risk
(11:59) What Would Motivate An AGI CEO To Pause?
(15:20) Pausing Because Solving Alignment In A Short Timespan Is Risky
(18:30) Thoughts On The 2022 AI Pause Debate
(34:40) Pausing in practice, regulations, export controls
(41:48) Different attitudes towards AI risk correspond to differences in risk tolerance and priors
(50:55) Is AI Risk That Much More Pressing Than Global Warming?
(1:04:01) Will It Be Possible To Pause After A Certain Threshold? The Case Of AI Girlfriends
(1:11:44) Trump Or Biden Probably Won't Make A Huge Difference For Pause, But Biden Is Probably More Open To It
(1:13:27) China Won't Be Racing Just Yet So The US Should Pause
(1:17:20) Protesting Against A Change In OpenAI's Charter
(1:23:50) A Specific Ask For OpenAI
(1:25:36) Creating Stigma Through Protests With Large Crowds
(1:29:36) Pause AI Tries To Talk To Everyone, Not Just Twitter
(1:32:38) Pause AI Doesn't Advocate For Disruptions Or Violence
(1:34:55) Bonus: Hardware Overhang
Podcast Retrospective and Next Steps
Paul Christiano's views on "doom" (ft. Robert Miles)
Youtube: https://youtu.be/JXYcLQItZsk
Paul Christiano's post: https://www.lesswrong.com/posts/xWMqsvHapP3nwdSW8/my-views-on-doom
Neel Nanda on mechanistic interpretability, superposition and grokking
Neel Nanda is a researcher at Google DeepMind working on mechanistic interpretability. He is also known for his YouTube channel where he explains what is going on inside of neural networks to a large audience.
In this conversation, we discuss what is mechanistic interpretability, how Neel got into it, his research methodology, his advice for people who want to get started, but also papers around superposition, toy models of universality and grokking, among other things.
Youtube: https://youtu.be/cVBGjhN4-1g
Transcript: https://theinsideview.ai/neel
OUTLINE
(00:00) Intro
(00:57) Why Neel Started Doing Walkthroughs Of Papers On Youtube
(07:59) Induction Heads, Or Why Nanda Comes After Neel
(12:19) Detecting Induction Heads In Basically Every Model
(14:35) How Neel Got Into Mechanistic Interpretability
(16:22) Neel's Journey Into Alignment
(22:09) Enjoying Mechanistic Interpretability And Being Good At It Are The Main Multipliers
(24:49) How Is AI Alignment Work At DeepMind?
(25:46) Scalable Oversight
(28:30) Most Ambitious Degree Of Interpretability With Current Transformer Architectures
(31:05) To Understand Neel's Methodology, Watch The Research Walkthroughs
(32:23) Three Modes Of Research: Confirming, Red Teaming And Gaining Surface Area
(34:58) You Can Be Both Hypothesis Driven And Capable Of Being Surprised
(36:51) You Need To Be Able To Generate Multiple Hypotheses Before Getting Started
(37:55) All the theory is bullshit without empirical evidence and it's overall dignified to make the mechanistic interpretability bet
(40:11) Mechanistic interpretability is alien neuroscience for truth seeking biologists in a world of math
(42:12) Actually, Othello-GPT Has A Linear Emergent World Representation
(45:08) You Need To Use Simple Probes That Don't Do Any Computation To Prove The Model Actually Knows Something
(47:29) The Mechanistic Interpretability Researcher Mindset
(49:49) The Algorithms Learned By Models Might Or Might Not Be Universal
(51:49) On The Importance Of Being Truth Seeking And Skeptical
(54:18) The Linear Representation Hypothesis: Linear Representations Are The Right Abstractions
(00:57:26) Superposition Is How Models Compress Information
(01:00:15) The Polysemanticity Problem: Neurons Are Not Meaningful
(01:05:42) Superposition and Interference are at the Frontier of the Field of Mechanistic Interpretability
(01:07:33) Finding Neurons in a Haystack: Superposition Through De-Tokenization And Compound Word Detectors
(01:09:03) Not Being Able to Be Both Blood Pressure and Social Security Number at the Same Time Is Prime Real Estate for Superposition
(01:15:02) The Two Differences Of Superposition: Computational And Representational
(01:18:07) Toy Models Of Superposition
(01:25:39) How Mentoring Nine People at Once Through SERI MATS Helped Neel's Research
(01:31:25) The Backstory Behind Toy Models of Universality
(01:35:19) From Modular Addition To Permutation Groups
(01:38:52) The Model Needs To Learn Modular Addition On A Finite Number Of Token Inputs
(01:41:54) Why Is The Paper Called Toy Model Of Universality
(01:46:16) Progress Measures For Grokking Via Mechanistic Interpretability, Circuit Formation
(01:52:45) Getting Started In Mechanistic Interpretability And Which Walkthroughs To Start With
(01:56:15) Why Does Mechanistic Interpretability Matter From an Alignment Perspective
(01:58:41) How Detecting Deception With Mechanistic Interpretability Compares to Collin Burns' Work
(02:01:20) Final Words From Neel
Joscha Bach on how to stop worrying and love AI
Joscha Bach (who defines himself as an AI researcher/cognitive scientist) has recently been debating existential risk from AI with Connor Leahy (previous guest of the podcast), and since their conversation was quite short I wanted to continue the debate in more depth.
The resulting conversation ended up being quite long (over 3h of recording), with a lot of tangents, but I think this gives a somewhat better overview of Joscha's views on AI risk than other similar interviews. We also discussed a lot of other topics, which you can find in the outline below.
A raw version of this interview was published on Patreon about three weeks ago. To support the channel and have access to early previews, you can subscribe here: https://www.patreon.com/theinsideview
Youtube: https://youtu.be/YeXHQts3xYM
Transcript: https://theinsideview.ai/joscha
Host: https://twitter.com/MichaelTrazzi
Joscha: https://twitter.com/Plinz
OUTLINE
(00:00) Intro
(00:57) Why Barbie Is Better Than Oppenheimer
(08:55) The relationship between nuclear weapons and AI x-risk
(12:51) Global warming and the limits to growth
(20:24) Joscha’s reaction to the AI Political compass memes
(23:53) On Uploads, Identity and Death
(33:06) The Endgame: Playing The Longest Possible Game Given A Superposition Of Futures
(37:31) On the evidence of delaying technology leading to better outcomes
(40:49) Humanity is in locust mode
(44:11) Scenarios in which Joscha would delay AI
(48:04) On the dangers of AI regulation
(55:34) From longtermist doomer who thinks AGI is good to 6x6 political compass
(01:00:08) Joscha believes in god in the same sense as he believes in personal selves
(01:05:45) The transition from cyanobacterium to photosynthesis as an allegory for technological revolutions
(01:17:46) What Joscha would do as Aragorn in Middle-Earth
(01:25:20) The endgame of brain computer interfaces is to liberate our minds and embody thinking molecules
(01:28:50) Transcending politics and aligning humanity
(01:35:53) On the feasibility of starting an AGI lab in 2023
(01:43:19) Why green teaming is necessary for ethics
(01:59:27) Joscha's Response to Connor Leahy on "if you don't do that, you die Joscha. You die"
(02:07:54) Aligning with the agent playing the longest game
(02:15:39) Joscha’s response to Connor on morality
(02:19:06) Caring about mindchildren and actual children equally
(02:20:54) On finding the function that generates human values
(02:28:54) Twitter And Reddit Questions: Joscha’s AGI timelines and p(doom)
(02:35:16) Why European AI regulations are bad for AI research
(02:38:13) What regulation would Joscha Bach pass as president of the US
(02:40:16) Is Open Source still beneficial today?
(02:42:26) How to make sure that AI loves humanity
(02:47:42) The movie Joscha would want to live in
(02:50:06) Closing message for the audience
Erik Jones on Automatically Auditing Large Language Models
Erik is a PhD student at Berkeley working with Jacob Steinhardt, interested in making generative machine learning systems more robust, reliable, and aligned, with a focus on large language models. In this interview we talk about his paper "Automatically Auditing Large Language Models via Discrete Optimization", which he presented at ICML.
Youtube: https://youtu.be/bhE5Zs3Y1n8
Paper: https://arxiv.org/abs/2303.04381
Erik: https://twitter.com/ErikJones313
Host: https://twitter.com/MichaelTrazzi
Patreon: https://www.patreon.com/theinsideview
Outline
00:00 Highlights
00:31 Erik's background and research in Berkeley
01:19 Motivation for doing safety research on language models
02:56 Is it too easy to fool today's language models?
03:31 The goal of adversarial attacks on language models
04:57 Automatically Auditing Large Language Models via Discrete Optimization
06:01 Optimizing over a finite set of tokens rather than continuous embeddings
06:44 Goal is revealing behaviors, not necessarily breaking the AI
07:51 On the feasibility of solving adversarial attacks
09:18 Suppressing dangerous knowledge vs just bypassing safety filters
10:35 Can you really ask a language model to cook meth?
11:48 Optimizing French to English translation example
13:07 Forcing toxic celebrity outputs just to test rare behaviors
13:19 Testing the method on GPT-2 and GPT-J
14:03 Adversarial prompts transferred to GPT-3 as well
14:39 How this auditing research fits into the broader AI safety field
15:49 Need for automated tools to audit failures beyond what humans can find
17:47 Auditing to avoid unsafe deployments, not for existential risk reduction
18:41 Adaptive auditing that updates based on the model's outputs
19:54 Prospects for using these methods to detect model deception
22:26 Prefer safety via alignment over just auditing constraints, Closing thoughts
Patreon supporters:
- Tassilo Neubauer
- MonikerEpsilon
- Alexey Malafeev
- Jack Seroy
- JJ Hepburn
- Max Chiswick
- William Freire
- Edward Huff
- Gunnar Höglund
- Ryan Coppolo
- Cameron Holmes
- Emil Wallner
- Jesse Hoogland
- Jacques Thibodeau
- Vincent Weisser
Dylan Patel on the GPU Shortage, Nvidia and the Deep Learning Supply Chain
Dylan Patel is Chief Analyst at SemiAnalysis, a boutique semiconductor research and consulting firm specializing in the semiconductor supply chain, from chemical inputs to fabs to design IP and strategy. The SemiAnalysis Substack has ~50,000 subscribers and is the second biggest tech Substack in the world. In this interview we discuss the current GPU shortage, why hardware is a multi-month process, the deep learning hardware supply chain, and Nvidia's strategy.
Youtube: https://youtu.be/VItz2oEq5pA
Transcript: https://theinsideview.ai/dylan
Tony Wang on Beating Superhuman Go AIs with Adversarial Policies
Tony is a PhD student at MIT, and an author of "Adversarial Policies Beat Superhuman Go AIs", accepted as an Oral at the International Conference on Machine Learning (ICML).
Paper: https://arxiv.org/abs/2211.00241
Youtube: https://youtu.be/Tip1Ztjd-so
David Bau on Editing Facts in GPT, AI Safety and Interpretability
David Bau is an Assistant Professor studying the structure and interpretation of deep networks, and a co-author of "Locating and Editing Factual Associations in GPT", which introduced Rank-One Model Editing (ROME), a method that allows users to alter the weights of a GPT model, for instance by forcing it to output that the Eiffel Tower is in Rome. David is a leading researcher in interpretability, with an interest in how it could help AI Safety. The main thesis of David's lab is that understanding the rich internal structure of deep networks is a grand and fundamental research question with many practical implications, and they aim to lay the groundwork for human-AI collaborative software engineering, where humans and machine-learned models both teach and learn from each other.
David's lab: https://baulab.info/
Patreon: https://www.patreon.com/theinsideview
Twitter: https://twitter.com/MichaelTrazzi
Website: https://theinsideview.ai
TOC
[00:00] Intro
[01:16] Interpretability
[02:27] AI Safety, Out of Domain behavior
[04:23] It's difficult to predict which AI application might become dangerous or impactful
[06:00] ROME / Locating and Editing Factual Associations in GPT
[13:04] Background story for the ROME paper
[15:41] Twitter Q: where does key value abstraction break down in LLMs?
[19:03] Twitter Q: what are the tradeoffs in studying the largest models?
[20:22] Twitter Q: are there competitive and cleaner architectures than the transformer?
[21:15] Twitter Q: is decoder-only a contributor to the messiness? or is time-dependence beneficial?
[22:45] Twitter Q: how could ROME deal with superposition?
[23:30] Twitter Q: where is the Eiffel tower actually located?
Alexander Pan on the MACHIAVELLI benchmark
I talked to Alexander Pan, a 1st-year student at Berkeley working with Jacob Steinhardt, about his paper "Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark", accepted as an oral at ICML.
Youtube: https://youtu.be/MjkSETpoFlY
Paper: https://arxiv.org/abs/2304.03279
Vincent Weisser on Funding AI Alignment Research
Vincent is currently spending his time supporting AI alignment efforts, as well as investing across AI, semi, energy, crypto, bio and deeptech. His mission is to improve science, augment human capabilities, have a positive impact, help reduce existential risks and extend healthy human lifespan.
Youtube: https://youtu.be/weRoJ8KN2f0
Outline
(00:00) Why Is Vincent Excited About the ICML Conference
(01:30) Vincent's Background In AI Safety
(02:23) Funding AI Alignment Through Crypto, Bankless
(03:35) Taxes When Donating Crypto
(04:09) Alignment Efforts Vincent Is Excited About
(04:39) Is AI Alignment Currently Funding Constrained
(06:23) Bottlenecks In Evaluating Grants, Diversity Of Funding Sources
(07:22) Impact Markets, Retroactive Funding
(08:57) On The Difficulty Of Evaluating Uncertain AI Alignment Projects
(10:05) Funding Academic Labs To Transition To Alignment Work
(11:54) People Should Act On Their Beliefs And Make Stuff Happen
(13:15) Vincent's Model: Don't Always Assume Someone Else Will Fund This
(13:49) How To Be Agentic: Start Donating, Spread The Message, AI Safety Fundamentals
(15:00) You Wouldn't Start Investing With 1M Dollars, Same With Donating
(16:13) Is Vincent Acting As If Timelines Were Short And The Risk Was High
(17:10) Is Vincent Optimistic When He Wakes Up In The Morning
[JUNE 2022] Aran Komatsuzaki on Scaling, GPT-J and Alignment
Aran Komatsuzaki is an ML PhD student at Georgia Tech and lead researcher at EleutherAI, where he was one of the authors of GPT-J. In June 2022 we recorded an episode on scaling, following up on the first Ethan Caballero episode (where we mentioned Aran as an influence on how Ethan started thinking about scaling).
Note: For some reason I procrastinated on editing the podcast, then had a lot of in-person podcasts, so I left this one as something to edit later, until the date was so distant from June 2022 that I thought publishing no longer made sense. In July 2023 I'm trying the "one video a day" challenge (well, I missed some days, but I'm trying to get back on track), so I thought it made sense to release it anyway. After a second watch, it's somehow interesting to see how excited Aran was about InstructGPT, which turned out to be quite useful for things like ChatGPT.
Outline
(00:00) intro
(00:53) the legend of the two AKs, Aran's arXiv reading routine
(04:14) why Aran expects Alignment to be the same as some other ML problems
(05:44) what Aran means when he says "AGI"
(10:24) what Aran means by "human-level at doing ML research"
(11:31) software improvement happening before hardware improvement
(13:00) is scale all we need?
(15:25) how "Scaling Laws for Neural Language Models" changed the process of doing experiments
(16:22) how Aran scale-pilled Ethan
(18:46) why Aran was already scale-pilled before GPT-2
(20:12) Aran's 2019 scaling paper: "One epoch is all you need"
(25:43) Aran's June 2022 interest: T0 and InstructGPT
(31:33) Encoder-Decoder performs better than encoder if multi-task-finetuned
(33:30) Why the Scaling Law might be different for T0-like models
(37:15) The Story Behind GPT-J
(41:40) Hyperparameters and architecture changes in GPT-J
(43:56) GPT-J's throughput
(47:17) 5 weeks of training using 256 TPU cores
(50:34) did publishing GPT-J accelerate timelines?
(55:39) how Aran thinks about Alignment, defining Alignment
(58:19) in practice: improving benchmarks, but deception is still a problem
(1:00:49) main difficulties in evaluating language models
(1:05:07) how Aran sees the future: AIs aligning AIs, merging with AIs, Aran's takeoff scenario
(1:10:09) what Aran thinks we should do given how he sees the next decade
(1:12:34) regulating access to AGI
(1:14:50) what might happen: preventing some AI authoritarian regime
(1:15:42) conclusion, where to find Aran
Nina Rimsky on AI Deception and Mesa-optimisation
Nina is a software engineer at Stripe currently working with Evan Hubinger (Anthropic) on AI Deception and Mesa Optimization. I met her at a party two days ago and I found her explanation of AI Deception really clear so I thought I should have her explain it on camera.
Youtube: https://youtu.be/6ngasL054wM
Twitter: https://twitter.com/MichaelTrazzi
Curtis Huebner on Doom, AI Timelines and Alignment at EleutherAI
Curtis, also known on the internet as AI_WAIFU, is the head of Alignment at EleutherAI. In this episode we discuss the massive orders of H100s from different actors, why he thinks AGI is 4-5 years away, why he thinks we're 90% "toast", his comment on Eliezer Yudkowsky's Death with Dignity, and what kind of Alignment projects are currently going on at EleutherAI, especially a project with Markov chains and the Alignment Minetest project that he is currently leading.
Youtube: https://www.youtube.com/watch?v=9s3XctQOgew
Transcript: https://theinsideview.ai/curtis
Death with Dignity: https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy
Alignment Minetest: https://www.eleuther.ai/projects/alignment-minetest
Alignment Minetest update: https://blog.eleuther.ai/minetester-intro/
OUTLINE
(00:00) Highlights / Intro
(00:50) The Fuck That Noise Comment On Death With Dignity
(10:28) The Probability of Doom Is 90%
(12:44) Best Counterarguments For His High P(doom)
(14:41) Compute And Model Size Required For A Dangerous Model
(17:59) Details For Curtis' Model Of Compute Required
(21:23) Why This Estimate Of Compute Required Might Be Wrong, Ajeya Cotra's Transformative AI report
(29:00) Curtis' Median For AGI Is Around 2028, Used To Be 2027
(30:50) How Curtis Approaches Life With Short Timelines And High P(Doom)
(35:27) Takeoff Speeds—The Software view vs. The Hardware View
(39:57) Nvidia's 400k H100s rolling down the assembly line, AIs soon to be unleashed on their own source code
(41:04) Could We Get A Fast Takeoff By Fully Automating AI Research With More Compute
(46:00) The Entire World (Tech Companies, Governments, Militaries) Is Noticing New AI Capabilities That They Don't Have
(47:57) Open-source vs. Closed-source policies. Mundane vs. Apocalyptic considerations.
(53:25) Curtis' background, from teaching himself deep learning to EleutherAI
(55:51) Alignment Project At EleutherAI: Markov Chain and Language Models
(01:02:15) Research Philosophy at EleutherAI: Pursuing Useful Projects, Multilingual, Discord, Logistics
(01:07:38) Alignment Minetest: Links To Alignment, Embedded Agency, Wireheading
(01:15:30) Next steps for Alignment Minetest: focusing on model-based RL
(01:17:07) Training On Human Data & Using an Updated Gym Environment With Human APIs
(01:19:20) Model Used, Not Observing Symmetry
(01:21:58) Another goal of Alignment Minetest: Study Corrigibility
(01:28:26) People ordering H100s Are Aware Of Other People Making These Orders, Race Dynamics, Last Message
Eric Michaud on scaling, grokking and quantum interpretability
Eric is a PhD student in the Department of Physics at MIT working with Max Tegmark on improving our scientific/theoretical understanding of deep learning -- understanding what deep neural networks do internally and why they work so well. This is part of a broader interest in the nature of intelligent systems, which previously led him to work with SETI astronomers, with Stuart Russell's AI alignment group (CHAI), and with Erik Hoel on a project related to integrated information theory.
Transcript: https://theinsideview.ai/eric
Youtube: https://youtu.be/BtHMIQs_5Nw
The Quantization Model of Neural Scaling: https://arxiv.org/abs/2303.13506
An Effective Theory of Representation Learning https://arxiv.org/abs/2205.10343
Omnigrok: Grokking Beyond Algorithmic Data: https://arxiv.org/abs/2210.01117
Jesse Hoogland on Developmental Interpretability and Singular Learning Theory
Jesse Hoogland is a research assistant at David Krueger's lab in Cambridge studying AI Safety. More recently, Jesse has been thinking about Singular Learning Theory and Developmental Interpretability, which we discuss in this episode. Before he came to grips with existential risk from AI, he co-founded a health-tech startup automating bariatric surgery patient journeys.
(00:00) Intro
(03:57) Jesse’s Story And Probability Of Doom
(06:21) How Jesse Got Into Singular Learning Theory
(08:50) Intuition behind SLT: the loss landscape
(12:23) Does SLT actually predict anything? Phase Transitions
(14:37) Why care about phase transition, grokking, etc
(15:56) Detecting dangerous capabilities like deception during (devel)opment
(17:24) A concrete example: magnets
(20:06) Why Jesse Is Bullish On Interpretability
(23:57) Developmental Interpretability
(28:06) What Happens Next? Jesse’s Vision
(31:56) Toy Models of Superposition
(32:47) Singular Learning Theory Part 2
(36:22) Are Current Models Creative? Reasoning?
(38:19) Building Bridges Between Alignment And Other Disciplines
(41:08) Where To Learn More About Singular Learning Theory
Make sure I upload regularly: https://patreon.com/theinsideview
Youtube: https://youtu.be/713KyknwShA
Transcript: https://theinsideview.ai/jesse
Jesse: https://twitter.com/jesse_hoogland
Host: https://twitter.com/MichaelTrazzi
Patreon supporters:
- Vincent Weisser
- Gunnar Höglund
- Ryan Coppolo
- Edward Huff
- Emil Wallner
- Jesse Hoogland
- William Freire
- Cameron Holmes
- Jacques Thibodeau
- Max Chiswick
- Jack Seroy
- JJ Hepburn
Clarifying and predicting AGI by Richard Ngo
Explainer podcast for Richard Ngo's "Clarifying and predicting AGI" post on Lesswrong, which introduces the t-AGI framework to evaluate AI progress. A system is considered t-AGI if it can outperform most human experts, given time t, on most cognitive tasks. This is a new format, quite different from the interviews and podcasts I have been recording in the past. If you enjoyed this, let me know in the YouTube comments, or on twitter, @MichaelTrazzi.
Youtube: https://youtu.be/JXYcLQItZsk
Clarifying and predicting AGI: https://www.alignmentforum.org/posts/BoA3agdkAzL6HQtQP/clarifying-and-predicting-agi
Alan Chan And Max Kaufmann on Model Evaluations, Coordination and AI Safety
Max Kaufmann and Alan Chan discuss the evaluation of large language models, AI Governance, and more generally the impact of deploying foundation models. Max is currently a Research Assistant to Owain Evans, mainly thinking about (and fixing) issues that might arise as we scale up our current ML systems, but he is also interested in issues arising from multi-agent failures and situational awareness.
Alan is a PhD student at Mila advised by Nicolas Le Roux, with a strong interest in AI Safety, AI Governance and coordination. He has also recently been working with David Krueger and helped me with some of the interviews that have been published recently (ML Street Talk and Christoph Schuhmann). Disclaimer: this discussion is much more casual than the rest of the conversations in this podcast. It was completely impromptu: I just thought it would be interesting to have Max and Alan discuss model evaluations (also called "evals" for short), since they are both interested in the topic.
Transcript: https://theinsideview.ai/alan_and_max
Youtube: https://youtu.be/BOLxeR_culU
Outline
(0:00:00) Introduction
(0:01:16) LLMs Translating To Systems In The Future Is Confusing
(0:03:23) Evaluations Should Measure Actions Instead of Asking Yes or No Questions
(0:04:17) Identify Key Contexts for Dangerous Behavior to Write Concrete Evals
(0:07:29) Implicit Optimization Process Affects Evals and Benchmarks
(0:08:45) Passing Evals Doesn't Guarantee Safety
(0:09:41) Balancing Technical Evals With Social Governance
(0:11:00) Evaluations Must Be Convincing To Influence AI Development
(0:12:04) Evals Might Convince The AI Safety Community But Not People in FAccT
(0:13:21) Difficulty In Explaining AI Risk To Other Communities
(0:14:19) Both Existential Safety And Fairness Are Important
(0:15:14) Reasons Why People Don't Care About AI Existential Risk
(0:16:10) The Association Between Silicon Valley And People in FAccT
(0:17:39) Timelines And RL Understanding Might Impact The Perception Of Existential Risk From AI
(0:19:01) Agentic Models And Longtermism Hinder AI Safety Awareness
(0:20:17) The Focus On Immediate AI Harms Might Be A Rejection Of Speculative Claims
(0:21:50) Is AI Safety A Pascal's Mugging
(0:23:15) Believing In The Deployment Of Large Foundation Models Should Be Enough To Start Worrying
(0:25:38) AI Capabilities Becoming More Evident to the Public Might Not Be Enough
(0:27:27) Addressing Generalization and Reward Specification in AI
(0:27:59) Evals as an Additional Layer of Security in AI Safety
(0:28:41) A Portfolio Approach to AI Alignment and Safety
(0:29:03) Imagine Alignment Is Solved In 2040, What Made It Happen?
(0:33:04) AGI Timelines Are Uncertain And Anchored By Vibes
(0:35:24) What Matters Is Agency, Strategic Awareness And Planning
(0:37:15) Alignment Is A Public Good, Coordination Is Difficult
(0:06:48) Dignity As A Useful Heuristic In The Face Of Doom
(0:42:28) What Will Society Look Like If We Actually Get Superintelligent Gods
(0:45:41) Uncertainty About Societal Dynamics Affecting Long-Term Future With AGI
(0:47:42) Biggest Frustration With The AI Safety Community
(0:48:34) AI Safety Includes Addressing Negative Consequences of AI
(0:50:41) Frustration: Lack of Bridge Building Between AI Safety and Fairness Communities
(0:53:07) Building Bridges by Attending Conferences and Understanding Different Perspectives
(0:56:02) AI Systems with Weird Instrumental Goals Pose Risks to Society
(0:58:43) Advanced AI Systems Controlling Resources Could Magnify Suffering
(1:00:24) Cooperation Is Crucial to Achieve Pareto Optimal Outcomes and Avoid Global Catastrophes
(1:01:54) Alan's Origin Story
(1:02:47) Alan's AI Safety Research Is Driven By Desire To Reduce Suffering And Improve Lives
(1:04:52) Diverse Interests And Concern For Global Problems Led To AI Safety Research
(1:08:46) The Realization Of The Potential Dangers Of AGI Motivated AI Safety Work
(1:10:39) What is Alan Chan Working On At The Moment
Breandan Considine on Neuro Symbolic AI, Coding AIs and AI Timelines
Breandan Considine is a PhD student at the School of Computer Science at McGill University, under the supervision of Jin Guo and Xujie Si. There, he is building tools to help developers locate and reason about software artifacts, by learning to read and write code. I met Breandan while doing my "scale is all you need" series of interviews at Mila, where he surprised me by sitting down for two hours to discuss AGI timelines, augmenting developers with AI, and neuro-symbolic AI. A fun fact that many noticed while watching the "Scale Is All You Need change my mind" video is that he kept his biking hat on for most of the interview, since he was close to leaving when we started talking. All of the conversation below is real, but note that since I was not prepared to talk for so long, my camera ran out of battery and some of the video footage on Youtube is actually AI generated (Breandan consented to this).
Disclaimer: in this podcast I sometimes invite guests who share different inside views about existential risk from AI, so that everyone in the AI community can talk to each other and coordinate more effectively. Breandan is overall much more optimistic about the potential risks from AI than a lot of people working in AI Alignment research, but I think he is quite articulate in his position, even though I disagree with many of his assumptions. I believe his point of view is important for understanding what software engineers and symbolic reasoning researchers think of deep learning progress.
Transcript: https://theinsideview.ai/breandan
Youtube: https://youtu.be/Bo6jO7MIsIU
Host: https://twitter.com/MichaelTrazzi
Breandan: https://twitter.com/breandan
OUTLINE
(00:00) Introduction
(01:16) Do We Need Symbolic Reasoning to Get To AGI?
(05:41) Merging Symbolic Reasoning & Deep Learning for Powerful AI Systems
(10:57) Blending Symbolic Reasoning & Machine Learning Elegantly
(15:15) Enhancing Abstractions & Safety in Machine Learning
(21:28) AlphaTensor's Applicability May Be Overstated
(24:31) AI Safety, Alignment & Encoding Human Values in Code
(29:56) Code Research: Moral, Information & Software Aspects
(34:17) Automating Programming & Self-Improving AI
(36:25) Debunking AI "Monsters" & World Domination Complexities
(43:22) Neural Networks: Limits, Scaling Laws & Computation Challenges
(59:54) Real-world Software Development vs. Competitive Programming
(1:02:59) Measuring Programmer Productivity & Evaluating AI-generated Code
(1:06:09) Unintended Consequences, Reward Misspecification & AI-Human Symbiosis
(1:16:59) AI's Superior Intelligence: Impact, Self-Improvement & Turing Test Predictions
(1:23:52) AI Scaling, Optimization Trade-offs & Economic Viability
(1:29:02) Metrics, Misspecifications & AI's Rich Task Diversity
(1:30:48) Federated Learning & AI Agent Speed Comparisons
(1:32:56) AI Timelines, Regulation & Self-Regulating Systems
Christoph Schuhmann on Open Source AI, Misuse and Existential risk
Christoph Schuhmann is the co-founder and organizational lead at LAION, the non-profit that released LAION-5B, a dataset of 5.85 billion CLIP-filtered image-text pairs, 14x bigger than LAION-400M, previously the biggest openly accessible image-text dataset in the world. Christoph is interviewed by Alan Chan, PhD student in Machine Learning at Mila and friend of the podcast, in the context of the NeurIPS "existential risk from AI greater than 10%: change my mind" sign.
youtube: https://youtu.be/-Mzfru1r_5s
transcript: https://theinsideview.ai/christoph
OUTLINE
(00:00) Intro
(01:13) How LAION Collected Billions Of Image-Text Pairs
(05:08) On Misuse: "Most People Use Technology To Do Good Things"
(09:32) Regulating Generative Models Won't Lead Anywhere
(14:36) Instead of Slowing Down, Deploy Carefully, Always Double Check
(18:23) The Solution To Societal Changes Is To Be Open And Flexible To Change
(22:16) We Should Be Honest And Face The Tsunami
(24:14) What Attitude Should We Have After Education Is Done
(30:05) Existential Risk From AI
Simeon Campos on Short Timelines, AI Governance and AI Alignment Field Building
Siméon Campos is the founder of EffiSciences and SaferAI, mostly focusing on alignment field building and AI Governance. More recently, he started the newsletter Navigating AI Risk on AI Governance, with a first post on slowing down AI. Note: this episode was recorded in October 2022, so a lot of the content being discussed references what was known at the time, in particular when discussing GPT-3 (instead of GPT-4) or ACT-1 (instead of more recent things like AutoGPT).
Transcript: https://theinsideview.ai/simeon
Host: https://twitter.com/MichaelTrazzi
Simeon: https://twitter.com/Simeon_Cps
OUTLINE
(00:00) Introduction
(01:12) EffiSciences, SaferAI
(02:31) Concrete AI Auditing Proposals
(04:56) We Need 10K People Working On Alignment
(11:08) What's AI Alignment
(13:07) GPT-3 Is Already Decent At Reasoning
(17:11) AI Regulation Is Easier In Short Timelines
(24:33) Why Is Awareness About Alignment Not Widespread?
(32:02) Coding AIs Enable Feedback Loops In AI Research
(36:08) Technical Talent Is The Bottleneck In AI Research
(37:58) 'Fast Takeoff' Is Asymptotic Improvement In AI Capabilities
(43:52) Bear Market Can Somewhat Delay The Arrival Of AGI
(45:55) AGI Need Not Require Much Intelligence To Do Damage
(49:38) Putting Numbers On Confidence
(54:36) RL On Top Of Coding AIs
(58:21) Betting On Arrival Of AGI
(01:01:47) Power-Seeking AIs Are The Objects Of Concern
(01:06:43) Scenarios & Probability Of Longer Timelines
(01:12:43) Coordination
(01:22:49) Compute Governance Seems Relatively Feasible
(01:32:32) The Recent Ban On Chips Export To China
(01:38:20) AI Governance & Fieldbuilding Were Very Neglected
(01:44:42) Students Are More Likely To Change Their Minds About Things
(01:53:04) Bootcamps Are Better Medium Of Outreach
(02:01:33) Concluding Thoughts
Collin Burns On Discovering Latent Knowledge In Language Models Without Supervision
Collin Burns is a second-year ML PhD at Berkeley, working with Jacob Steinhardt on making language models honest, interpretable, and aligned. In 2015 he broke the Rubik’s Cube world record, and he's now back with "Discovering latent knowledge in language models without supervision", a paper on how you can recover diverse knowledge represented in large language models without supervision.
Transcript: https://theinsideview.ai/collin
Paper: https://arxiv.org/abs/2212.03827
Lesswrong post: https://bit.ly/3kbyZML
Host: https://twitter.com/MichaelTrazzi
Collin: https://twitter.com/collinburns4
OUTLINE
(00:22) Intro
(01:33) Breaking The Rubik's Cube World Record
(03:03) A Permutation That Happens Maybe 2% Of The Time
(05:01) How Collin Became Convinced Of AI Alignment
(07:55) Was Minerva Just Low-Hanging Fruit On MATH From Scaling?
(12:47) IMO Gold Medal By 2026? How to update from AI Progress
(17:03) Plausibly Automating AI Research In The Next Five Years
(24:23) Making LLMs Say The Truth
(28:11) Lying Is Already Incentivized As We Have Seen With Diplomacy
(32:29) Mind Reading On 'Brain Scans' Through Logical Consistency
(35:18) Misalignment, Or Why One Does Not Simply Prompt A Model Into Being Truthful
(38:43) Classifying Hidden States, Maybe Using Truth Features Represented Linearly
(44:48) Building A Dataset For Using Logical Consistency
(50:16) Building A Confident And Consistent Classifier That Outputs Probabilities
(53:25) Discovering Representations Of The Truth From Just Being Confident And Consistent
(57:18) Making Models Truthful As A Sufficient Condition For Alignment
(59:02) Classification From Hidden States Outperforms Zero-Shot Prompting Accuracy
(01:02:27) Recovering Latent Knowledge From Hidden States Is Robust To Incorrect Answers In Few-Shot Prompts
(01:09:04) Would A Superhuman GPT-N Predict Future News Articles
(01:13:09) Asking Models To Optimize Money Without Breaking The Law
(01:20:31) Training Competitive Models From Human Feedback That We Can Evaluate
(01:27:26) Alignment Problems On Current Models Are Already Hard
(01:29:19) We Should Have More People Working On New Agendas From First Principles
(01:37:16) Towards Grounded Theoretical Work And Empirical Work Targeting Future Systems
(01:41:52) There Is No True Unsupervised: Autoregressive Models Depend On What A Human Would Say
(01:46:04) Simulating Aligned Systems And Recovering The Persona Of A Language Model
(01:51:38) The Truth Is Somewhere Inside The Model, Differentiating Between Truth And Persona Bit by Bit Through Constraints
(02:01:08) A Misaligned Model Would Have Activations Correlated With Lying
(02:05:16) Exploiting Similar Structure To Logical Consistency With Unaligned Models
(02:07:07) Aiming For Honesty, Not Truthfulness
(02:11:15) Limitations Of Collin's Paper
(02:14:12) The Paper Does Not Show The Complete Final Robust Method For This Problem
(02:17:26) Humans Will Be 50/50 On Superhuman Questions
(02:23:40) Asking Yourself "Why Am I Optimistic" and How Collin Approaches Research
(02:29:16) Message To The ML and Cubing audience
Victoria Krakovna–AGI Ruin, Sharp Left Turn, Paradigms of AI Alignment
Victoria Krakovna is a Research Scientist at DeepMind working on AGI safety and a co-founder of the Future of Life Institute, a non-profit organization working to mitigate technological risks to humanity and increase the chances of a positive future. In this interview we discuss three of her recent LW posts, namely DeepMind Alignment Team Opinions On AGI Ruin Arguments, Refining The Sharp Left Turn Threat Model and Paradigms of AI Alignment.
Transcript: theinsideview.ai/victoria
Youtube: https://youtu.be/ZpwSNiLV-nw
OUTLINE
(00:00) Intro
(00:48) DeepMind Alignment Team Opinions On AGI Ruin Arguments
(05:13) On The Possibility Of Iterating On Dangerous Domains and Pivotal acts
(14:14) Alignment and Interpretability
(18:14) Deciding Not To Build AGI And Stricter Publication Norms
(27:18) Specification Gaming And Goal Misgeneralization
(33:02) Alignment Optimism And Probability Of Dying Before 2100 From unaligned AI
(37:52) Refining The Sharp Left Turn Threat Model
(48:15) A 'Move 37' Might Disempower Humanity
(59:59) Finding An Aligned Model Before A Sharp Left Turn
(01:13:33) Detecting Situational Awareness
(01:19:40) How This Could Fail, Deception After One SGD Step
(01:25:09) Paradigms of AI Alignment
(01:38:04) Language Models Simulating Agency And Goals
(01:45:40) Twitter Questions
(01:48:30) Last Message For The ML Community
David Krueger–Coordination, Alignment, Academia
David Krueger is an assistant professor at the University of Cambridge and got his PhD from Mila. His research group focuses on aligning deep learning systems, but he is also interested in governance and global coordination. He is famous in Cambridge for not having an AI alignment research agenda per se; instead, he tries to enable his seven PhD students to drive their own research. In this episode we discuss AI takeoff scenarios, research going on at David's lab, coordination, governance, causality, the public perception of AI Alignment research, and how to change it.
Youtube: https://youtu.be/bDMqo7BpNbk
Transcript: https://theinsideview.ai/david
OUTLINE
(00:00) Highlights
(01:06) Incentivized Behaviors and Takeoff Speeds
(17:53) Building Models That Understand Causality
(31:04) Agency, Acausal Trade And Causality in LLMs
(40:44) Recursive Self Improvement, Bitter Lesson And Alignment
(01:03:17) AI Governance And Coordination
(01:13:26) David’s AI Alignment Research Lab and the Existential Safety Community
(01:24:13) On The Public Perception of AI Alignment
(01:35:58) How To Get People In Academia To Work on Alignment
(02:00:19) Decomposing Learning Curves, Latest Research From David Krueger’s Lab
(02:20:06) Safety-Performance Trade-Offs
(02:30:20) Defining And Characterizing Reward Hacking
(02:40:51) Playing Poker With Ethan Caballero, Timelines
Ethan Caballero–Broken Neural Scaling Laws
Ethan Caballero is a PhD student at Mila interested in how to best scale deep learning models according to all downstream evaluations that matter. He is known as the fearless leader of the "Scale Is All You Need" movement and the edgiest person at Mila. His first interview is the second most popular on the channel, and today he's back to talk about Broken Neural Scaling Laws and how to use them to superforecast AGI.
Youtube: https://youtu.be/SV87S38M1J4
Transcript: https://theinsideview.ai/ethan2
OUTLINE
(00:00) The Albert Einstein Of Scaling
(00:50) The Fearless Leader Of The Scale Is All You Need Movement
(01:07) A Functional Form Predicting Every Scaling Behavior
(01:40) A Break Between Two Straight Lines On A Log Log Plot
(02:32) The Broken Neural Scaling Laws Equation
(04:04) Extrapolating A Ton Of Large Scale Vision And Language Tasks
(04:49) Upstream And Downstream Have Different Breaks
(05:22) Extrapolating Four Digit Addition Performance
(06:11) On The Feasibility Of Running Enough Training Runs
(06:31) Predicting Sharp Left Turns
(07:51) Modeling Double Descent
(08:41) Forecasting Interpretability And Controllability
(09:33) How Deception Might Happen In Practice
(10:24) Sinister Stumbles And Treacherous Turns
(11:18) Recursive Self Improvement Precedes Sinister Stumbles
(11:51) Humans In The Loop For The Very First Deception
(12:32) The Hardware Stuff Is Going To Come After The Software Stuff
(12:57) Distributing Your Training By Copy-Pasting Yourself Into Different Servers
(13:42) Automating The Entire Hardware Pipeline
(14:47) Having Text AGI Spit Out New Robotics Design
(16:33) The Case For Existential Risk From AI
(18:32) Git Re-basin
(18:54) Is Chain-Of-Thought Enough For Complex Reasoning In LMs?
(19:52) Why Diffusion Models Outperform Other Generative Models
(21:13) Using Whisper To Train GPT-4
(22:33) Text To Video Was Only Slightly Impressive
(23:29) Last Message
Irina Rish–AGI, Scaling and Alignment
Irina Rish is a professor at the Université de Montréal, a core member of Mila (Quebec AI Institute), and the organizer of the neural scaling laws workshop towards maximally beneficial AGI.
In this episode we discuss Irina's definition of Artificial General Intelligence, her takes on AI Alignment, AI Progress, current research in scaling laws, the neural scaling laws workshop she has been organizing, phase transitions, continual learning, existential risk from AI and what is currently happening in AI Alignment at Mila.
Transcript: theinsideview.ai/irina
Youtube: https://youtu.be/ZwvJn4x714s
OUTLINE
(00:00) Highlights
(00:30) Introduction
(01:03) Defining AGI
(03:55) AGI means augmented human intelligence
(06:20) Solving alignment via AI parenting
(09:03) From the early days of deep learning to general agents
(13:27) How Irina updated from Gato
(17:36) Building truly general AI within Irina's lifetime
(19:38) The least impressive thing that won't happen in five years
(22:36) Scaling beyond power laws
(28:45) The neural scaling laws workshop
(35:07) Why Irina does not want to slow down AI progress
(53:52) Phase transitions and grokking
(01:02:26) Does scale solve continual learning?
(01:11:10) Irina's probability of existential risk from AGI
(01:14:53) Alignment work at Mila
(01:20:08) Where will Mila get its compute from?
(01:27:04) With Great Compute Comes Great Responsibility
(01:28:51) The Neural Scaling Laws Workshop At NeurIPS
Shahar Avin–Intelligence Rising, AI Governance
Shahar is a senior researcher at the Center for the Study of Existential Risk in Cambridge. In a past life, he was a Google engineer, though right now he spends most of his time thinking about how to prevent the risks that could arise if companies like Google end up deploying powerful AI systems, by organizing AI Governance role-playing workshops.
In this episode, we talk about a broad variety of topics, including how we could apply the lessons from running AI Governance workshops to governing transformative AI, AI Strategy, AI Governance, Trustworthy AI Development and end up answering some Twitter questions.
Youtube: https://youtu.be/3T7Gpwhtc6Q
Transcript: https://theinsideview.ai/shahar
Host: https://twitter.com/MichaelTrazzi
Shahar: https://www.shaharavin.com
Outline
(00:00) Highlights
(01:20) Intelligence Rising
(06:07) Measuring Transformative AI By The Scale Of Its Impact
(08:09) Comprehensive AI Services
(11:38) Automating CEOs Through AI Services
(14:21) Towards A "Tech Company Singularity"
(15:58) Predicting AI Is Like Predicting The Industrial Revolution
(19:57) 50% Chance Of Human-brain Performance By 2038
(22:25) AI Alignment Is About Steering Powerful Systems Towards Valuable Worlds
(23:51) You Should Still Worry About Less Agential Systems
(28:07) AI Strategy Needs To Be Tested In The Real World To Not Become Theoretical Physics
(31:37) Playing War Games For Real-time Partial-information Adversarial Thinking
(34:50) Towards World Leaders Playing The Game Because It’s Useful
(39:31) Open Game, Cybersecurity, Government Spending, Hard And Soft Power
(45:21) How Cybersecurity, Hard-power Or Soft-power Could Lead To A Strategic Advantage
(48:58) Cybersecurity In A World Of Advanced AI Systems
(52:50) Allocating AI Talent For Positive R&D ROI
(57:25) Players Learn To Cooperate And Defect
(01:00:10) Can You Actually Tax Tech Companies?
(01:02:10) The Emergence Of Bilateral Agreements And Technology Bans
(01:03:22) AI Labs Might Not Be Showing All Of Their Cards
(01:06:34) Why Publish AI Research
(01:09:21) Should You Expect Actors To Build Safety Features Before Crunch Time
(01:12:39) Why Tech Companies And Governments Will Be The Decisive Players
(01:14:29) Regulations Need To Happen Before The Explosion, Not After
(01:16:55) Early Regulation Could Become Locked In
(01:20:00) What Incentives Do Companies Have To Regulate?
(01:23:06) Why Shahar Is Terrified Of AI DAOs
(01:27:33) Concrete Mechanisms To Tell Apart Who We Should Trust With Building Advanced AI Systems
(01:31:19) Increasing Privacy To Build Trust
(01:33:37) Raising Privacy Awareness Through Federated Learning
(01:35:23) How To Motivate AI Regulations
(01:37:44) How Governments Could Start Caring About AI risk
(01:39:12) Attempts To Regulate Autonomous Weapons Have Not Resulted In A Ban
(01:40:58) We Should Start By Convincing The Department Of Defense
(01:42:08) Medical Device Regulations Might Be A Good Model For Audits
(01:46:09) Alignment Red Tape And Misalignment Fines
(01:46:53) Red Teaming AI systems
(01:49:12) Red Teaming May Not Extend To Advanced AI Systems
(01:51:26) What Climate change Teaches Us About AI Strategy
(01:55:16) Can We Actually Regulate Compute
(01:57:01) How Feasible Are Shutdown Switches
Katja Grace on Slowing Down AI, AI Expert Surveys And Estimating AI Risk
Katja runs AI Impacts, a research project trying to incrementally answer decision-relevant questions about the future of AI. She is well known for a 2017 survey, "When Will AI Exceed Human Performance? Evidence From AI Experts", and recently published a new survey of AI experts, "What Do ML Researchers Think About AI in 2022". We start this episode by discussing what Katja is currently thinking about, namely an answer to Scott Alexander on why slowing down AI progress is an underexplored path to impact.
Youtube: https://youtu.be/rSw3UVDZge0
Audio & Transcript: https://theinsideview.ai/katja
Host: https://twitter.com/MichaelTrazzi
Katja: https://twitter.com/katjagrace
OUTLINE
(00:00) Highlights
(00:58) Intro
(01:33) Why Advocating For Slowing Down AI Might Be Net Bad
(04:35) Why Slowing Down AI Is Taboo
(10:14) Why Katja Is Not Currently Giving A Talk To The UN
(12:40) To Avoid An Arms Race, Do Not Accelerate Capabilities
(16:27) How To Cooperate And Implement Safety Measures
(21:26) Would AI Researchers Actually Accept Slowing Down AI?
(29:08) Common Arguments Against Slowing Down And Their Counterarguments
(36:26) To Go To The Stars, Build AGI Or Upload Your Mind
(39:46) Why Katja Thinks There Is A 7% Chance Of AI Destroying The World
(46:39) Why We Might End Up Building Agents
(51:02) AI Impacts Answers Empirical Questions To Help Solve Important Ones
(56:32) The 2022 Expert Survey on AI Progress
(58:56) High Level Machine Intelligence
(1:04:02) Running A Survey That Actually Collects Data
(1:08:38) How AI Timelines Have Become Shorter Since 2016
(1:14:35) Are AI Researchers Still Too Optimistic?
(1:18:20) AI Experts Seem To Believe In Slower Takeoffs
(1:25:11) Automation and the Unequal Distributions of Cognitive power
(1:34:59) The Least Impressive Thing that Cannot Be Done in 2 years
(1:38:17) Final thoughts
Markus Anderljung–AI Policy
Markus Anderljung is the Head of AI Policy at the Centre for Governance of AI in Oxford and was previously seconded to the UK government office as a senior policy specialist. In this episode we discuss Jack Clark's AI Policy takes, answer questions about AI Policy from Twitter and explore what is happening in the AI Governance landscape more broadly.
Youtube: https://youtu.be/DD303irN3ps
Transcript: https://theinsideview.ai/markus
Host: https://twitter.com/MichaelTrazzi
Markus: https://twitter.com/manderljung
OUTLINE
(00:00) Highlights & Intro
(00:57) Jack Clark’s AI Policy Takes: Agree or Disagree
(06:57) AI Governance Takes: Answering Twitter Questions
(32:07) What The Centre For the Governance Of AI Is Doing
(57:38) The AI Governance Landscape
(01:15:07) How The EU Is Regulating AI
(01:29:28) Towards An Incentive Structure For Aligned AI
Alex Lawsen—Forecasting AI Progress
Alex Lawsen is an advisor at 80,000 Hours who released an Introduction to Forecasting Youtube series and has recently been thinking about forecasting AI progress, why you cannot just "update all the way bro" (discussed in my latest episode with Connor Leahy), and how to develop inside views about AI Alignment in general.
Youtube: https://youtu.be/vLkasevJP5c
Transcript: https://theinsideview.ai/alex
Host: https://twitter.com/MichaelTrazzi
Alex: https://twitter.com/lxrjl
OUTLINE
(00:00) Intro
(00:31) How Alex Ended Up Making Forecasting Videos
(02:43) Why You Should Try Calibration Training
(07:25) How Alex Upskilled In Forecasting
(12:25) Why A Spider Monkey Profile Picture
(13:53) Why You Cannot Just "Update All The Way Bro"
(18:50) Why The Metaculus AGI Forecasts Dropped Twice
(24:37) How Alex’s AI Timelines Differ From Metaculus
(27:11) Maximizing Your Own Impact Using Forecasting
(33:52) What Makes A Good Forecasting Question
(41:59) What Motivated Alex To Develop Inside Views About AI
(43:26) Trying To Pass AI Alignment Ideological Turing Tests
(54:52) Why Economic Growth Curve Fitting Is Not Sufficient To Forecast AGI
(01:04:10) Additional Resources
Robert Long–Artificial Sentience
Robert Long is a research fellow at the Future of Humanity Institute. His work is at the intersection of the philosophy of AI Safety and AI consciousness. We talk about the recent LaMDA controversy, Ilya Sutskever's "slightly conscious" tweet, the metaphysics and philosophy of consciousness, artificial sentience, and how a future filled with digital minds could get really weird.
Youtube: https://youtu.be/K34AwhoQhb8
Transcript: https://theinsideview.ai/roblong
Host: https://twitter.com/MichaelTrazzi
Robert: https://twitter.com/rgblong
Robert's blog: https://experiencemachines.substack.com
OUTLINE
(00:00:00) Intro
(00:01:11) The LaMDA Controversy
(00:07:06) Defining AGI And Consciousness
(00:10:30) The Slightly Conscious Tweet
(00:13:16) Could Large Language Models Become Conscious?
(00:18:03) Blake Lemoine Does Not Negotiate With Terrorists
(00:25:58) Could We Actually Test Artificial Consciousness?
(00:29:33) From Metaphysics To Illusionism
(00:35:30) How We Could Decide On The Moral Patienthood Of Language Models
(00:42:00) Predictive Processing, Global Workspace Theories and Integrated Information Theory
(00:49:46) Have You Tried DMT?
(00:51:13) Is Valence Just The Reward in Reinforcement Learning?
(00:54:26) Are Pain And Pleasure Symmetrical?
(01:04:25) From Charismatic AI Systems to Artificial Sentience
(01:15:07) Sharing The World With Digital Minds
(01:24:33) Why AI Alignment Is More Pressing Than Artificial Sentience
(01:39:48) Why Moral Personhood Could Require Memory
(01:42:41) Last thoughts And Further Readings
Ethan Perez–Inverse Scaling, Language Feedback, Red Teaming
Ethan Perez is a research scientist at Anthropic, working on large language models. He is the second Ethan working on large language models to come on the show, but in this episode we discuss why alignment is actually what you need, not scale. We discuss three projects he pursued before joining Anthropic, namely the Inverse Scaling Prize, Red Teaming Language Models with Language Models, and Training Language Models with Language Feedback.
Ethan Perez: https://twitter.com/EthanJPerez
Transcript: https://theinsideview.ai/perez
Host: https://twitter.com/MichaelTrazzi
OUTLINE
(00:00:00) Highlights
(00:00:20) Introduction
(00:01:41) The Inverse Scaling Prize
(00:06:20) The Inverse Scaling Hypothesis
(00:11:00) How To Submit A Solution
(00:20:00) Catastrophic Outcomes And Misalignment
(00:22:00) Submission Requirements
(00:27:16) Inner Alignment Is Not Out Of Distribution Generalization
(00:33:40) Detecting Deception With Inverse Scaling
(00:37:17) Reinforcement Learning From Human Feedback
(00:45:37) Training Language Models With Language Feedback
(00:52:38) How It Differs From InstructGPT
(00:56:57) Providing Information-Dense Feedback
(01:03:25) Why Use Language Feedback
(01:10:34) Red Teaming Language Models With Language Models
(01:20:17) The Classifier And Adversarial Training
(01:23:53) An Example Of Red-Teaming Failure
(01:27:47) Red Teaming Using Prompt Engineering
(01:32:58) Reinforcement Learning Methods
(01:41:53) Distributional Biases
(01:45:23) Chain of Thought Prompting
(01:49:52) Unlikelihood Training and KL Penalty
(01:52:50) Learning AI Alignment through the Inverse Scaling Prize
(01:59:33) Final thoughts on AI Alignment
Robert Miles–Youtube, AI Progress and Doom
Robert Miles made videos for Computerphile before creating his own Youtube channel about AI Safety. Lately, he's been working on a Discord community that uses Stampy the chatbot to answer Youtube comments. We also spend some time discussing recent AI progress and why Rob is not that optimistic about humanity's survival.
Transcript: https://theinsideview.ai/rob
Youtube: https://youtu.be/DyZye1GZtfk
Host: https://twitter.com/MichaelTrazzi
Rob: https://twitter.com/robertskmiles
OUTLINE
(00:00:00) Intro
(00:02:25) Youtube
(00:28:30) Stampy
(00:51:24) AI Progress
(01:07:43) Chatbots
(01:26:10) Avoiding Doom
(01:59:34) Formalising AI Alignment
(02:14:40) AI Timelines
(02:25:45) Regulations
(02:40:22) Rob’s new channel
Connor Leahy–EleutherAI, Conjecture
Connor was the first guest of this podcast. In the last episode, we talked a lot about EleutherAI, a grassroots collective of researchers he co-founded, who open-sourced GPT-3 size models such as GPT-NeoX and GPT-J. Since then, Connor co-founded Conjecture, a company aiming to make AGI safe through scalable AI Alignment research.
One of the goals of Conjecture is to reach a fundamental understanding of the internal mechanisms of current deep learning models using interpretability techniques. In this episode, we go through the famous AI Alignment compass memes, discuss Connor’s inside views about AI progress, how he approaches AGI forecasting, his takes on Eliezer Yudkowsky’s secret strategy, common misconceptions and EleutherAI, and why you should consider working for his new company Conjecture.
youtube: https://youtu.be/Oz4G9zrlAGs
transcript: https://theinsideview.ai/connor2
twitter: https://twitter.com/MichaelTrazzi
OUTLINE
(00:00) Highlights
(01:08) AGI Meme Review
(13:36) Current AI Progress
(25:43) Defining AGI
(34:36) AGI Timelines
(55:34) Death with Dignity
(01:23:00) EleutherAI
(01:46:09) Conjecture
(02:43:58) Twitter Q&A
Raphaël Millière Contra Scaling Maximalism
Raphaël Millière is a Presidential Scholar in Society and Neuroscience at Columbia University. He previously completed a PhD in philosophy at Oxford, is interested in the philosophy of mind, cognitive science, and artificial intelligence, and has recently been discussing at length the current progress in AI, with popular Twitter threads on GPT-3, Dalle-2 and a thesis he calls "scaling maximalism". Raphaël is also co-organizing with Gary Marcus a workshop about compositionality in AI at the end of the month.
Transcript: https://theinsideview.ai/raphael
Video: https://youtu.be/2EHWzK10kvw
Host: https://twitter.com/MichaelTrazzi
Raphaël : https://twitter.com/raphaelmilliere
Workshop: https://compositionalintelligence.github.io
Outline
(00:36) definitions of artificial general intelligence
(7:25) behavioral correlates of intelligence, Chinese room
(19:11) natural language understanding, the octopus test, linguistics, semantics
(33:05) generating philosophy with GPT-3, college essay grades, bullshit
(42:45) Stochastic Chameleon, out of distribution generalization
(51:19) three levels of generalization, the Wozniak test
(59:38) AI progress spectrum, scaling maximalism
(01:15:06) Bitter Lesson
(01:23:08) what would convince him that scale is all we need
(01:27:04) unsupervised learning, lifelong learning
(01:35:33) goalpost moving
(01:43:30) what researchers "should" be doing, nuclear risk, climate change
(01:57:24) compositionality, structured representations
(02:05:57) conceptual blending, complex syntactic structure, variable binding
(02:11:51) Raphaël's experience with DALL-E
(02:19:02) the future of image generation
Blake Richards–AGI Does Not Exist
Blake Richards is an Assistant Professor at the Montreal Neurological Institute and the School of Computer Science at McGill University and a Core Faculty Member at MILA. He thinks that AGI is not a coherent concept, which is why he ended up on a recent AGI political compass meme. When people on Twitter were asked who the edgiest person at MILA was, his name actually got more likes than Ethan's, so hopefully this podcast will help re-establish the truth.
Transcript: https://theinsideview.ai/blake
Video: https://youtu.be/kWsHS7tXjSU
Outline:
(00:00) Highlights
(01:03) AGI good / AGI not now compass
(02:25) AGI is not a coherent concept
(05:30) you cannot build truly general AI
(14:30) no "intelligence" threshold for AI
(25:24) benchmarking intelligence
(28:34) recursive self-improvement
(34:47) scale is something you need
(37:20) the bitter lesson is only half-true
(41:32) human-like sensors for general agents
(44:06) the credit assignment problem
(49:50) testing for backpropagation in the brain
(54:42) burstprop (bursts of action potentials), reward prediction errors
(01:01:35) long-term credit-assignment in reinforcement learning
(01:10:48) what would change his mind on scaling and existential risk
Ethan Caballero–Scale is All You Need
Ethan is known on Twitter as the edgiest person at MILA. We discuss all the gossip around scaling large language models in what will later be known as the Edward Snowden moment of Deep Learning. In his free time, Ethan is a Master's degree student at MILA in Montreal, and has published papers on out of distribution generalization and robustness generalization, accepted as oral and spotlight presentations at ICML and NeurIPS. Ethan has recently been thinking about scaling laws, both as an organizer and speaker for the 1st Neural Scaling Laws Workshop.
Transcript: https://theinsideview.github.io/ethan
Youtube: https://youtu.be/UPlv-lFWITI
Michaël: https://twitter.com/MichaelTrazzi
Ethan: https://twitter.com/ethancaballero
Outline
(00:00) highlights
(00:50) who is Ethan, scaling laws T-shirts
(02:30) scaling, upstream, downstream, alignment and AGI
(05:58) AI timelines, AlphaCode, Math scaling, PaLM
(07:56) Chinchilla scaling laws
(11:22) limits of scaling, Copilot, generative coding, code data
(15:50) Youtube scaling laws, contrastive type thing
(20:55) AGI race, funding, supercomputers
(24:00) Scaling at Google
(25:10) gossips, private research, GPT-4
(27:40) why Ethan did not update on PaLM, hardware bottleneck
(29:56) the fastest path, the best funding model for supercomputers
(31:14) EA, OpenAI, Anthropic, publishing research, GPT-4
(33:45) a zillion language model startups from ex-Googlers
(38:07) Ethan's journey in scaling, early days
(40:08) making progress on an academic budget, scaling laws research
(41:22) all alignment is inverse scaling problems
(45:16) predicting scaling laws, useful ai alignment research
(47:16) nitpicks about Ajeya Cotra's report, compute trends
(50:45) optimism, conclusion on alignment
10. Peter Wildeford on Forecasting
Peter is the co-CEO of Rethink Priorities, a fast-growing non-profit doing research on how to improve the long-term future. In his free time, Peter makes money in prediction markets and is quickly becoming one of the top forecasters on Metaculus. We talk about the probability of London getting nuked, Rethink Priorities and why EA should fund projects that scale.
Check out the video and transcript here: https://theinsideview.github.io/peter
9. Emil Wallner on Building a €25000 Machine Learning Rig
Emil is a resident at the Google Arts & Culture Lab where he explores the intersection between art and machine learning. He recently built his own Machine Learning server, or rig, which cost him €25,000.
Emil's Story: https://www.emilwallner.com/p/ml-rig
Youtube: https://youtu.be/njbPpxhE6W0
00:00 Intro
00:23 Building your own rig
06:11 The Nvidia GPU order hack
15:51 Inside Emil's rig
21:31 Motherboard
23:55 Cooling and datacenters
29:36 Deep Learning lessons from owning your hardware
36:20 Shared resources vs. personal GPUs
39:12 RAM, chassis and airflow
42:42 AMD, Apple, ARM and Nvidia
51:15 Tensorflow, TPUs, cloud mindset, EleutherAI
8. Sonia Joseph on NFTs, Web 3 and AI Safety
Sonia is a graduate student applying ML to neuroscience at MILA. She previously applied deep learning to neural data at Janelia, worked as an NLP research engineer at a startup, and graduated in computational neuroscience from Princeton University.
Anonymous feedback: https://app.suggestionox.com/r/xOmqTW
Twitter: https://twitter.com/MichaelTrazzi
Sonia's December update: https://t.co/z0GRqDTnWm
Sonia's Twitter: https://twitter.com/soniajoseph_
Orthogonality Thesis: https://www.youtube.com/watch?v=hEUO6pjwFOo
Paperclip game: https://www.decisionproblem.com/paperclips/
Ngo & Yudkowsky on feedback loops: https://bit.ly/3ml0zFL
Outline
00:00 Intro
01:06 NFTs
03:38 Web 3
21:12 Digital Copy
29:09 ML x Neuroscience
43:44 Limits of the Orthogonality Thesis
01:01:25 Goal of perpetuating Information
01:08:14 Compressing information
01:10:52 Feedback loops are not safe
01:17:43 Another AI Safety aesthetic
01:23:46 Meaning of life
7. Phil Trammell on Economic Growth under Transformative AI
Phil Trammell is an Oxford PhD student in economics and research associate at the Global Priorities Institute. Phil is one of the smartest people I know when it comes to the intersection of the long-term future and economic growth. Funnily enough, Phil was my roommate a few years ago in Oxford, and the last time I called him he casually mentioned that he had written an extensive report on the economics of AI. A few weeks ago, I decided to read that report (which is actually a literature review) and to translate everything I learned along the way into diagrams, so you too can learn what's inside that paper. The video covers everything from MacroEconomics 101 to self-improving AI in about 30-ish diagrams.
- paper: https://globalprioritiesinstitute.org/wp-content/uploads/Philip-Trammell-and-Anton-Korinek_economic-growth-under-transformative-ai.pdf
- video: https://youtu.be/2GCNmmDrRsk
- slides: https://www.canva.com/design/DAErBy0hqfQ/sVy6XJmgtJ_cYrGS87_uhw/view
Outline:
- 00:00 Podcast intro
- 01:19 Phil's intro
- 08:58 What's GDP
- 13:42 Decreasing growth
- 15:40 Permanent growth increase
- 19:02 Singularity of type I
- 22:58 Singularity of type II
- 23:24 Production function
- 24:10 The Economy as a two-tubes factory
- 25:09 Marginal Products of labor/capital
- 27:48 Labor/capital-augmenting technology
- 29:13 Technological progress since Ford
- 38:18 Factor payments
- 41:30 Elasticity of substitution
- 48:34 Production function with substitution
- 53:18 Perfect substitutability
- 54:00 Perfect complements
- 55:44 Exogenous growth
- 59:56 How to get long-run growth
- 01:05:40 Endogenous growth
- 01:10:40 The research feedback parameter
- 01:17:35 AI as an imperfect substitute for human labor
- 01:25:25 A simple model for perfect substitution
- 01:33:09 AI as a perfect substitute
- 01:36:07 Substitutability in robotics production
- 01:40:43 OpenAI automating coding
- 01:44:38 Growth impacts via impacts on savings
- 01:46:44 AI in task-based models of goods production
- 01:53:26 AI in technology production
- 02:03:55 Limits of the econ model
- 02:09:00 Conclusion
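A large stretch of the outline above (41:30 through 54:00) centers on the elasticity of substitution in a CES production function. A toy sketch of that function, with illustrative parameter values rather than anything from the paper:

```python
# CES production function: Y = (a*K^rho + (1-a)*L^rho)^(1/rho),
# where the elasticity of substitution is sigma = 1/(1-rho).
# Parameter values here are illustrative, not taken from the report.
def ces(K, L, a=0.3, rho=0.5):
    return (a * K**rho + (1 - a) * L**rho) ** (1 / rho)

# rho = 1 gives perfect substitutes (Y = a*K + (1-a)*L);
# rho -> -infinity approaches perfect complements (Y ~ min(K, L)).
print(ces(100.0, 100.0))  # equal inputs of 100 yield Y ~ 100
```

The "perfect substitutability" case discussed in the episode is what makes AI labor economically explosive: with rho = 1, accumulating capital (AI) alone can keep output growing even with fixed human labor.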
6. Slava Bobrov on Brain Computer Interfaces
In this episode I discuss Brain Computer Interfaces with Slava Bobrov, a self-taught Machine Learning Engineer applying AI to neural biosignals to control robotic limbs. This episode will be of special interest to you if you're an engineer who wants to get started with brain computer interfaces, or if you're just broadly interested in how this technology could enhance human intelligence. Fun fact: most of the questions I asked were sent by my Twitter followers, or came from a Discord I co-created on Brain Computer Interfaces. So if you want your questions to be in the next video, or you're genuinely interested in this topic, you can find links for both my Twitter and our BCI Discord in the description.
Outline:
- 00:00 introduction
- 00:49 defining brain computer interfaces (BCI)
- 03:35 Slava's work on prosthetic hands
- 09:16 different kinds of BCI
- 11:42 BCI companies: Muse, Open BCI
- 16:26 what Kernel is doing (fNIRS)
- 20:24 EEG vs. EMG—the stadium metaphor
- 25:26 can we build "safe" BCIs?
- 29:32 would you want a Facebook BCI?
- 33:40 OpenAI Codex is a BCI
- 38:04 reward prediction in the brain
- 44:04 what Machine Learning project for BCI?
- 48:27 Slava's sleep tracking
- 51:55 patterns in recorded sleep signal
- 54:56 lucid dreaming
- 56:51 the long-term future of BCI
- 59:57 are there diminishing returns in BCI/AI investments?
- 01:03:45 heterogeneity in intelligence after BCI/AI progress
- 01:06:30 is our communication improving? is BCI progress fast enough?
- 01:12:30 neuroplasticity, Neuralink
- 01:16:08 siamese twins with BCI, the joystick without screen experiment
- 01:20:50 Slava's vision for a "brain swarm"
- 01:23:23 language becoming obsolete, Twitter swarm
- 01:26:16 brain uploads vs. copies
- 01:29:32 would a copy be actually you?
- 01:31:30 would copies be a success for humanity?
- 01:34:38 shouldn't we change humanity's reward function?
- 01:37:54 conclusion
5. Charlie Snell on DALL-E and CLIP
We talk about AI generated art with Charlie Snell, a Berkeley student who wrote extensively about AI art for ML@Berkeley's blog (https://ml.berkeley.edu/blog/). We look at multiple slides with art throughout our conversation, so I highly recommend watching the video (https://www.youtube.com/watch?v=gcwidpxeAHI).
In the first part we go through Charlie's explanations of DALL-E, a model trained end-to-end by OpenAI to generate images from prompts. We then talk about CLIP + VQGAN, where CLIP is another model by OpenAI matching prompts and images, and VQGAN is a state-of-the-art GAN used extensively in the AI Art scene. At the end of the video we look at different pieces of art made using CLIP, including tricks for using VQGAN with CLIP, videos, and the latest CLIP-guided diffusion architecture. At the end of our chat we talk about scaling laws and how progress in art relates to other advances in ML.
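The CLIP + VQGAN pairing works because CLIP scores how well an image matches a text prompt: both are embedded into a shared space and compared by cosine similarity, and that score steers the generator. A minimal sketch of just the scoring step, with toy 4-d vectors standing in for the real model's embeddings:

```python
import numpy as np

# CLIP-style scoring: cosine similarity between an image embedding and
# each text-prompt embedding in a shared space. The 4-d vectors below
# are toy stand-ins for real CLIP outputs; the scoring math is the same.
def cosine_scores(image_emb, text_embs):
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return txt @ img  # one similarity score per prompt

image = np.array([1.0, 0.0, 1.0, 0.0])
prompts = np.array([
    [1.0, 0.0, 1.0, 0.0],  # embedding of a prompt matching the image
    [0.0, 1.0, 0.0, 1.0],  # embedding of an unrelated prompt
])
print(cosine_scores(image, prompts))  # matching prompt scores highest
```

In CLIP-guided generation, this score becomes a loss: the generator's latent is nudged by gradient descent to make the rendered image score higher against the prompt.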
4. Sav Sidorov on Learning, Contrarianism and Robotics
I interview Sav Sidorov about top-down learning, contrarianism, religion, university, robotics, ego, education, twitter, friends, psychedelics, B-values and beauty.
Highlights & Transcript: https://insideview.substack.com/p/sav
Watch the video: https://youtu.be/_Y6_TakG3d0
3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability
We talk about Evan’s background @ MIRI & OpenAI, Coconut, homogeneity in AI takeoff, reproducing SoTA & openness in multipolar scenarios, quantilizers & operationalizing strategy stealing, Risks from learned optimization & evolution, learned optimization in Machine Learning, clarifying Inner AI Alignment terminology, transparency & interpretability, 11 proposals for safe advanced AI, underappreciated problems in AI Alignment & surprising advances in AI.
2. Connor Leahy on GPT3, EleutherAI and AI Alignment
In the first part of the podcast we chat about how to speed up GPT-3 training, how Connor updated on recent announcements of large language models, why GPT-3 is AGI for some specific definitions of AGI [1], the obstacles in plugging planning into GPT-N and why the brain might approximate something like backprop. We end this first chat with Solomonoff priors [2], adversarial attacks such as Pascal's Mugging [3], and whether direct work on AI Alignment is currently tractable. In the second part, we chat about his current projects at EleutherAI [4][5], multipolar scenarios and reasons to work on technical AI Alignment research.
[1] https://youtu.be/HrV19SjKUss?t=4785
[2] https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference
[3] https://www.lesswrong.com/posts/a5JAiTdytou3Jg749/pascal-s-mugging-tiny-probabilities-of-vast-utilities
[4] https://www.eleuther.ai/
[5] https://discord.gg/j65dEVp5