Welcome, humans.

A senior US Trump admin official just confirmed that DeepSeek used NVIDIA's most advanced Blackwell chips, which are explicitly banned from export to China, to train its next model at a data center in Inner Mongolia.

Officials also allege DeepSeek used "distillation" from Anthropic, Google, OpenAI, and xAI models to boost it. Smuggled American chips + borrowed American AI knowledge = competitive Chinese model. If this story is not already a Netflix pitch, somebody's slackin'.

NVIDIA reports earnings later today, and every analyst in the room will be asking about this. Our bet is on DeepSeek dropping the v4 model right in time to crash the market. They're a hedge fund, after all; why not play to their strengths?

Here's what happened in AI today:

• Inception launched Mercury 2, a diffusion LLM that doesn't generate text... it edits it. Plus, we interviewed the founder.
• The Pentagon gave Anthropic a Friday ultimatum on AI safeguards.
• Anthropic launched 11 enterprise plugins for Cowork, turning Claude into a specialized agent for HR, finance, engineering, and more.
• MatX raised $500M to compete with NVIDIA and build better chips.
This New AI Model Doesn't Write Like Every Other AI. It Edits.

Our interview with Stefano Ermon, related to today's news!
DEEP DIVE: What happens when you give AI an editor's instincts instead of a typewriter's patience?

Every AI model you've ever used writes the same way: one word at a time, left to right, like a typewriter. If it drifts off course early, tough luck. It keeps typing.

Well, Inception Labs just launched Mercury 2, which is an AI that works completely differently:

Instead of predicting one word after another, it starts with a rough sketch of the entire answer. Then, it refines everything at once, like an editor revising a full draft in parallel. The technical term is a "diffusion LLM" (dLLM), which uses the same core approach behind AI image generators like Midjourney, but applied to text and reasoning.
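Want the intuition in code? Here's a toy sketch of the draft-then-refine loop (our illustration only, not Inception's actual implementation; the `predict` function is a random stand-in for a real trained denoiser):

```python
# Toy sketch of masked-diffusion text decoding. NOT Mercury's real code;
# just the core move: guess every slot at once, commit the confident ones.
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "."]
MASK = "<mask>"

def predict(tokens):
    # Stand-in for a real denoiser network: propose a (token, confidence)
    # pair for every masked position in parallel.
    return [(random.choice(VOCAB), random.random()) if t == MASK else (t, 1.0)
            for t in tokens]

def diffusion_decode(length=8, steps=4):
    draft = [MASK] * length              # start from a fully masked "rough sketch"
    per_step = max(1, length // steps)   # slots to commit per refinement pass
    while MASK in draft:
        guesses = predict(draft)
        # Rank still-masked positions by confidence, highest first...
        masked = sorted((i for i, t in enumerate(draft) if t == MASK),
                        key=lambda i: guesses[i][1], reverse=True)
        # ...and lock in the most confident ones this pass, in parallel.
        for i in masked[:per_step]:
            draft[i] = guesses[i][0]
    return " ".join(draft)

print(diffusion_decode())
```

The real trick is that a trained denoiser makes those parallel guesses context-aware, so the draft converges on coherent text in a handful of passes instead of one token at a time.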
Let us tell you, the speed is real. Independent testing from Artificial Analysis clocked Mercury 2 at 1,196 tokens per second, over 3x faster than the next fastest model in its price class. That's a very big deal if you need speed. For context, Claude 4.5 Haiku hits ~89 tokens/sec and GPT-5 Mini ~73. RIP.

Here's what else matters:

• $0.25 per million input tokens, $0.75 per million output (cheaper output than GPT-5 Mini).
• #18 out of 134 models on Artificial Analysis's intelligence index, with strengths in agentic coding and instruction-following.
• Supports tool use, 128K context, structured outputs, and drops into any OpenAI-compatible stack with zero rewrites (see the sketch after this list).
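Because Mercury 2 speaks the OpenAI chat-completions dialect, swapping it in can be as small as changing two strings. A minimal sketch, with our placeholder values for the endpoint and model id (check Inception's docs for the real ones):

```python
# Calling Mercury 2 through the standard OpenAI Python client.
# base_url and model name below are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # hypothetical endpoint
    api_key="YOUR_INCEPTION_API_KEY",
)

resp = client.chat.completions.create(
    model="mercury-2",  # hypothetical model id
    messages=[{"role": "user", "content": "Summarize this changelog in 3 bullets: ..."}],
)
print(resp.choices[0].message.content)
```

That's the whole migration story: point base_url at the new host, change the model name, and keep your existing prompts and tooling.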
To be clear: Mercury 2 isn't trying to dethrone frontier giants like GPT-5.2 or Claude Opus. It's built for production speed, not leaderboard bragging rights.

So why does 10x speed even matter? Because AI isn't just about chatbots anymore. It's agent loops, where one task chains dozens of AI calls together.

Andrej Karpathy (former OpenAI researcher, Tesla AI lead, and notably an Inception investor) drove this home over the weekend when he described the new "Claw" layer of AI: local agent platforms like OpenClaw and NanoClaw that orchestrate scheduling, tool calls, and persistent workflows on your own machine. He called them "a personal digital house elf." We prefer "a digital non-human entity (don't get it twisted) that runs 24/7 for you", but ya, same energy!
In agent loops, latency compounds at every step. A model that's 10x faster doesn't just save time; it changes what you can build. Voice assistants that feel natural. Code agents that keep pace with your thinking. Background automations that actually finish before you forget you started them.

The big question: if diffusion can make small models this fast without sacrificing reasoning, will big labs build their own? We know Google already has one… Expect more experiments soon. Switching to a diffusion LLM dramatically increases how many tokens you can serve per GPU. There's every incentive to do this. Why wouldn't ya?

Check out Corey's cost breakdown for using Mercury 2 in your OpenClaw setup, geek out with our think-piece on combining diffusion with an energy-based model, or try Mercury 2 yourself.
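To put numbers on that compounding: take a hypothetical agent run of 30 chained calls averaging 500 output tokens each (both figures are our assumptions), and plug in the throughputs cited above:

```python
# Back-of-envelope: per-call speed compounds across an agent chain.
# Throughput numbers are the Artificial Analysis figures cited above;
# the 30-call chain and 500-token average are illustrative assumptions.
calls_per_task = 30      # chained model calls in one agent run (assumed)
tokens_per_call = 500    # average output tokens per call (assumed)

for name, tps in [("Mercury 2", 1196), ("Claude 4.5 Haiku", 89), ("GPT-5 Mini", 73)]:
    total_s = calls_per_task * tokens_per_call / tps
    print(f"{name:>17}: {total_s:6.1f} s for the full chain")

# Mercury 2: ~12.5 s. Haiku: ~168.5 s. GPT-5 Mini: ~205.5 s.
```

Roughly 12 seconds versus three-plus minutes for the same chain. That's the difference between an agent you watch and an agent you wait on.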
Watch this demo to see how Slackbot:

• Makes your entire workspace searchable (docs, convos, apps)
• Enhances every teammate with role-specific automations
• Learns your project and preferences over time for even smarter outputs
No onboarding or setup. Just start chatting with Slackbot like another teammate.

Watch this 2-minute demo.
Prompt Tip of the Day

Anthropic just launched role-specific plugins for Cowork that pre-load Claude with your domain's terminology, workflows, and output formats. Instead of writing "I'm a financial analyst, here's what a DCF model is..." you install the finance plugin and skip straight to the work.

There are 11 new plugins across HR, design, engineering, ops, and finance. The finance plugins are open source, so firms can customize them. And new connectors for FactSet and MSCI pipe institutional-grade market data directly into Claude's context.

Want to build your own? Each plugin is just a folder with four pieces (example layout after this list). No code required; it's all plain text files:

• Manifest (plugin.json): Your plugin's name and description. Think of it as the label on the box.
• Skills (skills/ folder): Markdown files describing how you want work done; your processes, terminology, best practices. Claude reads these automatically when they're relevant. (This is where the magic lives.)
• Commands (commands/ folder): Shortcuts users type to trigger specific workflows, like /sales:call-prep or /finance:reconciliation.
• Connectors (.mcp.json): A file telling Claude which external tools to plug into; your CRM, Slack, Google Drive, databases, whatever your team uses.
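Here's what that folder might look like for a hypothetical sales plugin. The specific file names are our example; only the four pieces themselves come from Anthropic's description, so treat this as a sketch, not the official schema:

```
my-sales-plugin/
├── plugin.json          # manifest: the plugin's name and description
├── skills/
│   └── call-prep.md     # how your team preps calls: process, terminology
├── commands/
│   └── call-prep.md     # typed as /sales:call-prep to trigger the workflow
└── .mcp.json            # connectors: CRM, Slack, Drive, databases
```

Drop the folder in, and Claude picks up the skills whenever they're relevant.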
If you're not technical, there's even a meta-plugin called "Plugin Management" that builds plugins for you through the Cowork UI. Just describe what you want and Claude writes the files.

The real unlock, though: Claude can now work across Excel and PowerPoint simultaneously, carrying context between apps. An analyst can pull data, update a model, and build the slide deck in one session. When inputs change, Claude updates everything downstream. Finally, an AI that understands the pain of reformatting a pivot table into a deck at 11 PM.

Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 650K readers here!
Around the Horn

• Anthropic now has til Friday to comply with US Secretary of War Pete Hegseth's demands to remove all safety guardrails from Claude for military use, or the Pentagon may invoke the Defense Production Act to force compliance.
• ProducerAI joined Google Labs as an AI music collaboration tool powered by DeepMind's Lyria 3 model; Grammy-winner Wyclef Jean already used it on a recent track. Free through Google Labs.
• Meta and AMD struck a deal worth over $100B for 6 gigawatts of AMD Instinct compute, with AMD issuing Meta warrants for up to 160 million shares (~10% of the company) at $0.01 each; first shipments start H2 2026. AMD stock jumped 9%.
• MatX, an AI chip startup founded by two ex-Google TPU engineers, raised $500M led by Jane Street and Leopold Aschenbrenner's Situational Awareness fund (Stripe co-founders also invested), claiming its chip outperforms NVIDIA's upcoming Rubin Ultra on compute per mm².
• OpenAI is preparing a new ChatGPT Pro Lite tier at $100/month, splitting the gap between Plus ($20) and Pro ($200); the plan was discovered by feature leaker Tibor Blaho in ChatGPT's code and may be tied to upcoming always-on agent features.
Want to know EVERYTHING that happened in AI this week? Read this digest.
That's all for now.

What'd you think of today's email?
P.P.S: Love the newsletter, but only want to get it once per week? Don't unsubscribe; update your preferences here.