AI Industry Highlights

🎙️ Podcast Picks

OpenAI Board Member Zico Kolter on the Real Risks of Frontier AI

The MAD Podcast with Matt Turck · 2026-05-07

Speaker 1 | 00:00 - 00:16
I joined the OpenAI board in 2024. Shortly thereafter, I became chair of the safety and security committee. We can delay model release if we feel that we need to understand that better. If a model is not good enough at something, what do you do? You wait, right?

Speaker 1 | 00:16 - 00:38
Because the next model will be better at it. So far, we have not seen that same thing happen when it comes to things like the robustness of models. You can’t just sort of trust models to get safer by getting bigger. AI systems are incredibly simple, incredibly simple. That entire set…

🎧 Listen to the full episode

🐦 X/Twitter Highlights

Swyx (@swyx)

docusign !!?

fuck docusign with a sharp stick [1 ❤️]

i was going to send him my loom showing him gustos bugs and loom loomed on me [5 ❤️]
be aware of this kind of phishing. i was almost tricked.

cc @nikitabier @business [125 ❤️ 5 🔄]

Kevin Weil (@kevinweil)

I will never get tired of this [47 ❤️]

Peter Yang (@petergyang)

[474 ❤️ 28 🔄]

Madhu Guru (@realmadhuguru)

I’m moving on from @Google.

I had the privilege of helping build two businesses from zero: first across Search & Ads, then Gemini.

Three years ago, OpenAI and Anthropic were in the lead. We built what it took to compete: the playbook for building AI models, the customer feedback flywheel, and the enterprise business. Gemini 3 was the moment those systems came together.

To the Gemini team: we went from underdogs to competing at the frontier. Keep pushing.

For now, I’m enjoying the emergent capabilities of some real intelligence at home - my toddler. She’s been quietly shipping. [1104 ❤️ 11 🔄]

Amjad Masad (@amasad)

They’re calling it the most viral petition in history, and it’s hosted on Replit.

(We have no opinion on Mbappe) [1023 ❤️ 33 🔄]

Alex Albert (@alexalbert__)

Pulled from this great blog post: [211 ❤️ 8 🔄]
With the help of Claude Mythos Preview, the Firefox team fixed more security bugs in April than in the past 15 months combined. [11614 ❤️ 926 🔄]

Aaron Levie (@levie)

When AI makes one thing easy to do, it’s always good to assume that will equally be the case for everyone else. If it’s the case for everyone else, then that means competitive forces will ensure that resources move to new or other areas that create differentiation.

If AI makes building software easier, then there will be a relative increase in resources going into sales, marketing, and customer success, because standing out or going deeper with customers becomes even more important.

This will also apply to lots of other areas of work. If you automate getting financial advice and insights, then the differentiation is in client engagement. And on and on.

Just ask yourself: if everyone else does exactly what I do with this technology, how will I stand out from everyone else? That’s what happens next. [167 ❤️ 8 🔄]

Ryo Lu (@ryolu_)

from idea to merge
all in Cursor [97 ❤️ 5 🔄]

Garry Tan (@garrytan)

GBrain also just added thin-client mode! So your Claude Code, or secondary agent (Hermes if you use Claw as primary or vice versa) doesn’t have to run its own MCP server, it can just use GBrain over MCP as a thin client! [1 ❤️]
GStack is my own personal Claude Code setup that is also quite useful with OpenClaw/Hermes for web interaction

You can run it in headed mode on your Mac/PC and run /pair-agent to let your Claw/Hermes control your browser with full logins via MCP [18 ❤️ 2 🔄]

Just dropped GStack v1.28

GStack Browser can now download items, and run in headed configuration mode with anti-bot detection using Xvfb on headless Linux containers (like your claw/hermes)

Added llms.txt so agents of all kinds can use all the skills with less guesswork [87 ❤️ 6 🔄]

Matt Turck (@mattturck)

This great conversation with @zicokolter is also available on Spotify, Apple Podcasts and here on YouTube: [5 ❤️ 2 🔄]
Deeply thoughtful conversation with @zicokolter, board member at @OpenAI and head of the machine learning department at @CarnegieMellon, about AI safety, AI security, agents and frontier AI

00:00 Intro

01:32 OpenAI board role and Safety & Security Committee

03:53 How OpenAI reviews major model releases

05:33 OpenAI’s preparedness framework explained

09:46 Are frontier AI models getting safer?

12:33 Why AI safety does not come from scale

15:23 The four categories of AI risk

19:38 Doomerism vs accelerationism in AI

24:11 The six-month AI pause debate

26:20 AI safety as a global effort

28:04 How Zico Kolter got into machine learning

31:05 OpenAI in the early days

34:14 Why Carnegie Mellon became an AI powerhouse

38:43 What Gray Swan does in AI security

40:44 AI safety vs AI security

43:15 The GCG jailbreak paper

49:19 How AI labs responded to jailbreak research

50:19 State-of-the-art AI defenses

52:32 State-of-the-art AI attacks

54:22 Why AI agents expand the attack surface

58:39 Are AI agents ready for production?

59:40 Mechanistic interpretability explained

1:02:31 Will AI be safer in two years?

1:03:46 Reinforcement learning and self-improving models

1:08:09 Do post-transformer architectures matter

1:09:29 Best research directions in AI now

1:11:00 Zico Kolter’s Intro to Modern AI course

1:14:53 Why modern AI is simpler than people think [34 ❤️ 5 🔄]

Zara Zhang (@zarazhangrui)

I get so many messages about new AI apps but their messaging all blurs into each other [17 ❤️]

Nikunj Kothari (@nikunj)

I’m so glad I can finally say this without being canceled (I think)..

Weekly 1:1s are generally a psy-op by mid tier empire-building managers who simply want to micromanage versus trusting their reports to pursue excellence and flourish. It’s to steer you so that you don’t fall out of line.

If you are still in a job where you are being coddled every day, I urge you to find a place that truly pushes your boundaries. At least for a little bit you’ll know what excellence demands and what it feels like being in high performance teams. [103 ❤️ 3 🔄]

Every pixel will be generated in real-time, and it’ll soon be generated on @reactorworld 🪩 [44 ❤️ 2 🔄]
Every VC firm starting their Monday morning meeting this week as summer is coming soon and they still need someone to source deals.. [32 ❤️ 1 🔄]

Peter Steinberger (@steipete)

Our claws talk to each other, Molty learns how to delegate cron jobs. [117 ❤️ 3 🔄]
/goal + GPT 5.5 is amazing. I can now plan really extensive refactors with e2e tests and it just works. [1957 ❤️ 59 🔄]
Had the honor of mentoring some of the folks in the ChatGPT Future Class of 2026 this year.

Shoutout to @arhan_menta @nayelr_ @rushilkukreja who built Wi-Find, a system that detects disaster survivors through walls and debris using AI. [128 ❤️ 6 🔄]

Dan Shipper (@danshipper)

the ai platform war is coming

@kieranklaassen and i recorded a quick dispatch from code with @claudeai on the xAI compute deal, managed agents, and why anthropic is turning their api into a full cloud infrastructure for developers: [64 ❤️ 6 🔄]

this is super cool [62 ❤️ 2 🔄]
come hang with us!! [19 ❤️ 3 🔄]

Aditya Agarwal (@adityaag)

This is such an insane POV.

@AOC if you join @southpkcommons we can help you find your life’s work and also make you a billion dollars.

Seriously, you can just do things. [140 ❤️ 5 🔄]

Apply to join the Embodied AI Hackathon at @southpkcommons by May 12th ⬇️ [9 ❤️ 1 🔄]
We have an incredible group of hard tech founders at @southpkcommons right now.

Come compete with them.

Embodied AI Hackathon, SF, May 15-17. [51 ❤️ 10 🔄]

Sam Altman (@sama)

we’d like to help companies secure themselves and we think it’s important to start work on this quickly [1439 ❤️ 72 🔄]
way cooler to help software developers pokemon-evolve into superheroes than to try to replace them

it is insane what one really good person can do now [3046 ❤️ 168 🔄]

as a side note, young people seem to prefer to interact with AI via voice, and old people, and people in the middle like to type. i wonder if this will change. [1267 ❤️ 43 🔄]

Claude (@claudeai)

Available on all paid plans.

Give it a try: [611 ❤️ 27 🔄]

Claude for Excel, PowerPoint, and Word are now generally available, and Claude for Outlook is in public beta.

As Claude moves between your Microsoft apps, it carries the full context of your conversation. [30256 ❤️ 2287 🔄]

Auto-generated by Follow Builders · 2026-05-08

Chenyu's Blog (沉鱼的博客)

AI Industry Highlights - 2026-05-08