AI Industry Highlights

🎙️ Podcast Picks

Ep 89: AI Research Legend’s Honest Assessment of Where We Are

Unsupervised Learning · 2026-06-03

Speaker 1 | 00:00 - 00:04
Is reasoning enough to get to generalization, or is another method needed?

Speaker 2 | 00:04 - 00:08
It does feel like there is something else that possibly could generalize much better.

Speaker 1 | 00:08 - 00:12
Why do you think Anthropic was the first to be, like, really successful on the coding side?

Speaker 2 | 00:12 - 00:21
Anthropic made this very good decision to focus on coding. OpenAI was like, we’re doing ChatGPT. Part of why Anthropic made this decision was that they just could not compete.

Speaker 1 | 00:21 - 00:26
What’s your kind of gut intuition on the…

🎧 Listen to the full episode


🐦 X/Twitter Highlights

Swyx (@swyx)

  • one popular theory is that research paper alpha* and lab publishing ~died when researchers realized that instead of fighting with marketing depts they could simply walk out the door and get >$100m for their legally protected tacit knowledge gained

california non-noncompetes have a bigger impact on knowledge spreading than github, arxiv, and huggingface combined

*btw this is a motivator for me to set up @aidotengineer as a product-centric industry conference to complement the paper-centric research conferences [134 ❤️ 5 🔄]

Peter Yang (@petergyang)

  • Good night [4 ❤️]
  • this agentic coding crack is more addictive than video games smh [104 ❤️ 3 🔄]
  • There should be a way to filter or sort all my Codex threads in different ways vs. only by project.

Like filter or sort by:

  • All waiting for approval
  • All currently working

I’m trying to keep it to 10 threads but it’s already getting unwieldy. wdyt @ajambrosino? [16 ❤️]

Madhu Guru (@realmadhuguru)

  • Routing to models is genuinely hard. It means mapping each task to the right model - which requires benchmarking models against your product’s specific tasks and dialing in the quality/cost trade-off.

And there is an opportunity in that difficulty.

Here is the progression I saw with enterprises while on Gemini.

Phase 1 (2024): Default to the “it” model. Everybody used GPT regardless of task, because it was the shiny new thing.

Phase 2 (early 2025): Over-optimize. Teams over-corrected, looking for the smallest/cheapest model for their task, but did not have evals sophisticated enough to map tasks to models. They ended up burning cycles and shipping slower.

Phase 3: Nuanced routing. The industry’s eval muscle and model diversity got to a point where the most sophisticated AI-native startups succeeded in breaking their product into sub-agents and routed each task to the right model - e.g. hardest reasoning to Claude, simplest to Gemini Flash-Lite or open-weight models.

And like most product patterns, enterprises followed the AI-native builders 6-9 months later. [27 ❤️ 3 🔄]
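
As a rough illustration of the routing idea in the post above, here is a minimal Python sketch, not any vendor's actual approach. It assumes you already have per-task eval scores for each model. The model names, prices, scores, and the `route` helper are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # hypothetical price in USD
    eval_scores: dict          # task type -> score in [0, 1] from your own evals


def route(task_type, models, min_quality):
    """Return the cheapest model whose eval score for task_type meets min_quality.

    If no model meets the bar, fall back to the highest-scoring model.
    """
    eligible = [m for m in models if m.eval_scores.get(task_type, 0.0) >= min_quality]
    if eligible:
        return min(eligible, key=lambda m: m.cost_per_1k_tokens)
    return max(models, key=lambda m: m.eval_scores.get(task_type, 0.0))


# Hypothetical catalog: numbers are made up for illustration only.
MODELS = [
    ModelOption("frontier-model", 15.0, {"reasoning": 0.92, "extraction": 0.95}),
    ModelOption("small-model", 0.5, {"reasoning": 0.60, "extraction": 0.90}),
]

print(route("reasoning", MODELS, 0.85).name)   # frontier-model
print(route("extraction", MODELS, 0.85).name)  # small-model
```

The quality threshold is the knob for the quality/cost trade-off: raising it sends more tasks to the expensive model, lowering it sends more to the cheap one. The routing is only as good as the evals behind the scores.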

Amjad Masad (@amasad)

  • When I spoke out against the Gaza genocide, a bunch of midwit VCs ganged up on me both in public and tried to hurt me in private too.

Many of those who stood by me were also VCs. Just the better ones. Both morally, and in return profile.

The best way to avoid sociopaths is to have them self select out of your life by standing for your beliefs. [4935 ❤️ 396 🔄]

  • Vibecon [494 ❤️ 13 🔄]

Aaron Levie (@levie)

  • Token costs are becoming one of the hottest topics for any enterprise I talk with right now. It’s very bullish for AI in general because it means these systems are being used at a scale that wasn’t contemplated before.

It also gives way to another form of differentiation that will emerge for the applied AI layer, which is model routing.

As tokens take on a significant amount of the cost of any given workflow, then companies will inevitably want to ensure that their dollars go into the most efficient use of tokens for the particular job at hand.

Frontier intelligence will always be relevant at the high end of tasks, like coding, legal and financial analysis, healthcare, and more. And dollars spent here will only go up over time. But, equally, you can peel off individual tasks to lower cost models (whether they’re from open weights vendors or the major labs) and deliver a more efficient end outcome.

To do this effectively, the applied AI layer needs to understand the workflows in their domain better than anyone else, and be able to mix and match models to different jobs. If you’re doing document extraction, you need to know which models perform better or worse for any given document type. If you’re doing legal analysis, you want to know which models perform various types of tasks best. And so on.

This will become one of the bigger differentiation points over time. The companies with the best evals, the best ability to route the workloads, and those that have business models directly aligned to customers’ financial goals, will be in a great position. [404 ❤️ 63 🔄]

Garry Tan (@garrytan)

  • We never said we don’t upload any user data to the cloud. We said the code (file contents) specifically doesn’t.

Paxel is here to help you, and over time as local models get better, we’ll be able to do even more locally. Can’t wait, tbh. [46 ❤️]

  • I also want to help people become more legit with Paxel [144 ❤️ 10 🔄]
  • Local government matters

Oakland mismanagement is fixable but there has not yet been a resurgence of common sense the way SF had it

@empoweroak is working on this [101 ❤️ 4 🔄]

Zara Zhang (@zarazhangrui)

  • Really enjoyed this talk!
    The value of static content is going down, the value of live interaction is going up
    People want to connect with the human being behind a piece of work, whether it’s content or software
    Raw & opinionated > polished & generic [31 ❤️ 3 🔄]

Nikunj Kothari (@nikunj)

  • don’t forget to touch sand this weekend 🌞 [50 ❤️]
  • A Walk In The Park (part II) feat @taiuti 🌎

(00:00) 👋
(01:10) What world models are
(03:42) Origin story from text-to-3D to @reactorworld
(09:22) Deciding to start the company
(11:22) GTA, games, and the path into programming
(15:09) Building in stealth and keeping secrets
(18:32) Picking investors with independent conviction
(21:43) Where world models will grow first
(26:08) Why low latency matters
(29:34) Becoming CEO and scaling the team
(32:23) Final advice [70 ❤️ 3 🔄]

Dan Shipper (@danshipper)

  • my absolute favorite of Plato’s Dialogues is a deep discussion of the limits of techne and the necessity of aidos and dike

Protagoras here, in the way he talks about where knowledge comes from and whether virtue can be taught, presages LLMs: [8 ❤️ 1 🔄]

  • to add two more that increase in value:

Aidōs—reverence and responsiveness to others
Dikē—the capacity to perceive what is right [33 ❤️ 2 🔄]

  • LLMs are not conscious.

LLMs are not not conscious.

Both true. [41 ❤️ 4 🔄]


Auto-generated by Follow Builders · 2026-06-07