AI Industry Highlights

🎙️ Podcast Picks

OpenAI’s Yann Dubois: Why AI Progress Suddenly Feels Real

The MAD Podcast with Matt Turck · 2026-05-21

Speaker 1 | 00:00 - 00:26
You need to reach this level of reliability to really make any of these AI tools very useful, and I think we just crossed that probably December last year, at least at OpenAI. Now we can trust these models to do a lot of the work that we are doing. The last few months have been pretty wild. We moved from like competitions to usefulness to users and that’s what we are feeling right now. I think most of the time the bottleneck is the last mile.

Speaker 1 | 00:26 - 00:34
There will always be a lot of space left for this last mile in different verticals. And I would highly e…

🎧 Listen to the full episode

🐦 X/Twitter Highlights

Swyx (@swyx)

more here [3 ❤️]
--dangerously-skip-git [15 ❤️ 1 🔄]
i think this stack has won the localfirst battle btw. maybe more chapters to this story but i think this is it if you are building fast apps fast [79 ❤️ 5 🔄]

Josh Woodward (@joshwoodward)

What a ride for @GoogleLabs at IO! Hope you’re enjoying lots of the new stuff! [106 ❤️ 2 🔄]
♥️ ♥️♥️

iOS, Android, and Web - so fun to see people loving the new Neural Expressive design! [331 ❤️ 15 🔄]

Peter Yang (@petergyang)

Game changer Codex automation 🙂 [69 ❤️]
Ok trying this out [19 ❤️]

Google Labs (@GoogleLabs)

(5/5) Discover our latest experiments and help shape the future of AI technology at

Share what you create! We want to see! [21 ❤️ 3 🔄]

(4/5) And then, we took our Labster to the Grand Canyon with Project Genie. [25 ❤️ 2 🔄]
(3/5) A vibe-designed website from @StitchbyGoogle ft. our Labs experiments as mini games, and our very own 8-bit “Labster.” [45 ❤️ 1 🔄]

Amjad Masad (@amasad)

Monetize your apps and we’ll give you credit rewards. [153 ❤️ 3 🔄]
We’re always excited to talk to customers but you shouldn’t be forced to talk to us to buy the product. [259 ❤️ 11 🔄]

Aaron Levie (@levie)

What’s happened is that we went from AI chat tools that were relatively cheap and had small context windows, to AI agents that have giant context windows, the ability to keep track of longer running work, and models that cost an order of magnitude more on inference because they’re that much better.

This has compounded far faster than most realized (unless you were paying close attention at the middle or end of last year, which many here were), and the dollars flowing in now are much more real.

What follows is a continued march of AI capability that will continue to be used by anyone with a frontier use-case (like coding, sciences, finance, consulting) and then a peeling off of tasks to lower cost models that are capable enough for the job. Whereas we thought the cost of AI might converge on a single low price per token before, it’s clear the stratification is only widening based on the task you need performed.

This will be yet another component that has to be figured out for broad AI diffusion. Enterprises will need to put in programs, new finance teams, and technology solutions to manage this all. The labs and platforms that can ensure customers can price optimize for the task at hand will be in the best position. [474 ❤️ 48 🔄]

Ryo Lu (@ryolu_)

building software is more fun together

try our new model, interface, sdk, automations with your team [171 ❤️ 4 🔄]

Garry Tan (@garrytan)

How does one engineer become a 1000x founder? @sdianahu and I give you the real goods here

Thanks @AnjneyMidha for having us! [76 ❤️ 4 🔄]

San Francisco is safer because of Flock Safety. Every city can be safer. We don’t have to choose a world where people are unsafe. It is a choice, though. [328 ❤️ 26 🔄]
Everyone should have an agent with a GBrain [228 ❤️ 5 🔄]

Matt Turck (@mattturck)

This fantastic conversation with @yanndubs is also available on Spotify, Apple Podcasts and here on YouTube (like and subscribe!): [3 ❤️]
Why AI Progress Suddenly Feels Real - my conversation with @yanndubs, who co-leads the Post-Training Frontiers team at @OpenAI

00:00 - Intro

01:30 - Why recent AI progress feels like a step function

04:13 - Model reliability & the emotional rollercoaster of shipping GPT-5.5

07:33 - How OpenAI structures vertical and horizontal teams

09:49 - Improving model efficiency and test-time compute

12:32 - Yann’s journey from Switzerland to OpenAI

15:37 - Reasoning in 2026: Real-world utility vs verifiable rewards

18:34 - GPT-5.5 Thinking vs Pro: Scaling test-time compute

20:09 - How reasoning models become more efficient

23:23 - Pre-training scaling and overcoming the data wall

27:03 - Multimodal data, synthetic data, and embodied AI

31:05 - Demystifying mid-training and post-training

37:21 - Does RL create new capabilities in AI?

38:53 - The challenges and frontier of scaling RL

43:09 - Is building AI models a craft or a strict science

48:21 - How AI models generalize across different domains

54:18 - How reinforcement learning cures AI hallucinations

56:04 - Negative generalization and conflicting instructions

58:05 - Can RL scale to law, medicine, and the broader economy?

1:00:19 - The evaluation bottleneck and Model as a Judge

1:04:21 - Continuous AI progress & continual learning

1:08:49 - Will foundation models eat the agent harness

1:11:23 - Why startups should focus on the last mile of AI [18 ❤️ 4 🔄]

Zara Zhang (@zarazhangrui)

GitHub:
Introducing the Claude Code Lark/Feishu Bridge 🌉 (open-source)

Talk to Claude Code in Lark/Feishu like a colleague

Use Claude Code on your phone via Lark chat
Manage multiple CC sessions as group chats in Lark (one chat = one session), say goodbye to messy terminal tabs
Claude Code can read all your work context in Lark (chat, docs, meeting transcripts, etc) via CLI
Claude Code can write Lark Docs for you; you can even @ mention it in the comment and it will reply
Forward any Lark messages to Claude and it can just get the task done
Claude can send you interactive cards with buttons and UI

Open-source; try it now: [28 ❤️ 2 🔄]

Nikunj Kothari (@nikunj)

ex founders are THE driving force helping scale some of the most iconic companies (@tryramp, @mercor_ai, @figma, @AnthropicAI, @cognition and others).

if you are an ex founder wanting to hang out with a stellar peer group, sign up for our next event! [53 ❤️ 2 🔄]

Dan Shipper (@danshipper)

~20 years ago I submitted a story that made it to the top of @digg

Feels good to be back! [51 ❤️ 2 🔄]

Aditya Agarwal (@adityaag)

4 thoughts on early-stage hiring:

1/ If an engineer is trying to pick between a pre Series-B company and a BigCo/BigLab –> stop talking to them immediately. They are clearly not ready for a startup.

2/ If someone isn’t willing to take a 70% cash paycut (relative to BigCo/BigLab) –> stop talking to them immediately. They will be unhappy/stressed.

3/ You learn a lot about a candidate during the negotiation/closing process. Do not be afraid to walk away if you get new information.

4/ Startups have zero work-life balance. If you are not willing to put in the hours, you are not in the right headspace to grind. [403 ❤️ 16 🔄]

Two of my favorite people (sorry @rsanghvi) on stage together. [14 ❤️ 1 🔄]

Sam Altman (@sama)

what problem do you most hope AI will solve in the future?

maybe we can help! [7462 ❤️ 420 🔄]

new codex ships today! [2524 ❤️ 115 🔄]
the attack at the mosque in san diego is one of the most chilling i have seen.

my deepest condolences to the victims, families, and community. [13266 ❤️ 635 🔄]

Claude (@claudeai)

What are you making with Claude Design? [115 ❤️ 4 🔄]

📝 Blog Posts

Claude Code auto mode: a safer way to skip permissions

Auto-generated by Follow Builders · 2026-05-22

Chenyu's Blog (沉鱼的博客)

AI Industry Highlights - 2026-05-22