Welcome, humans.

A senior US Trump admin official just confirmed that DeepSeek used NVIDIA's most advanced Blackwell chips, which are explicitly banned from export to China, to train its next model at a data center in Inner Mongolia.

Officials also allege DeepSeek used "distillation" from Anthropic, Google, OpenAI, and xAI models to boost it. Smuggled American chips + borrowed American AI knowledge = competitive Chinese model. If this story is not already a Netflix pitch, somebody's slackin'.

NVIDIA reports earnings later today, and every analyst in the room will be asking about this. Our bet is on DeepSeek dropping the v4 model right in time to crash the market. They're a hedge fund, after all; why not play to their strengths?

Here's what happened in AI today:

• Inception launched Mercury 2, a diffusion LLM that doesn't generate text... it edits it. Plus, we interviewed the founder.
• The Pentagon gave Anthropic a Friday ultimatum on AI safeguards.
• Anthropic launched 11 enterprise plugins for Cowork, turning Claude into a specialized agent for HR, finance, engineering, and more.
• MatX raised $500M to compete with NVIDIA and build better chips.
This New AI Model Doesn't Write Like Every Other AI. It Edits.

Our interview with Stefano Ermon, related to today's news!
DEEP DIVE: What happens when you give AI an editor's instincts instead of a typewriter's patience?

Every AI model you've ever used writes the same way: one word at a time, left to right, like a typewriter. If it drifts off course early, tough luck. It keeps typing.

Well, Inception Labs just launched Mercury 2, which is an AI that works completely differently:

Instead of predicting one word after another, it starts with a rough sketch of the entire answer. Then, it refines everything at once, like an editor revising a full draft in parallel. The technical term is a "diffusion LLM" (dLLM), which uses the same core approach behind AI image generators like Midjourney, but applied to text and reasoning.
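Want the intuition in code? Here's a toy sketch of the draft-then-refine loop (our illustration only, not Inception's actual implementation; the `predict` function is a random stand-in for a real trained denoiser):

```python
# Toy sketch of masked-diffusion text decoding. NOT Mercury's real code;
# just the core move: guess every slot at once, commit the confident ones.
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "."]
MASK = "<mask>"

def predict(tokens):
    # Stand-in for a real denoiser network: propose a (token, confidence)
    # pair for every masked position in parallel.
    return [(random.choice(VOCAB), random.random()) if t == MASK else (t, 1.0)
            for t in tokens]

def diffusion_decode(length=8, steps=4):
    draft = [MASK] * length              # start from a fully masked "rough sketch"
    per_step = max(1, length // steps)   # slots to commit per refinement pass
    while MASK in draft:
        guesses = predict(draft)
        # Rank still-masked positions by confidence, highest first...
        masked = sorted((i for i, t in enumerate(draft) if t == MASK),
                        key=lambda i: guesses[i][1], reverse=True)
        # ...and lock in the most confident ones this pass, in parallel.
        for i in masked[:per_step]:
            draft[i] = guesses[i][0]
    return " ".join(draft)

print(diffusion_decode())
```

The real trick is that a trained denoiser makes those parallel guesses context-aware, so the draft converges on coherent text in a handful of passes instead of one token at a time.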
Let us tell you, the speed is real. Independent testing from Artificial Analysis clocked Mercury 2 at 1,196 tokens per second, over 3x faster than the next fastest model in its price class. That's a very big deal if you need speed. For context, Claude 4.5 Haiku hits ~89 tokens/sec and GPT-5 Mini ~73. RIP.

Here's what else matters:

• $0.25 per million input tokens, $0.75 per million output (cheaper output than GPT-5 Mini).
• #18 out of 134 models on Artificial Analysis's intelligence index, with strengths in agentic coding and instruction-following.
• Supports tool use, 128K context, structured outputs, and drops into any OpenAI-compatible stack with zero rewrites (see the sketch after this list).
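Because Mercury 2 speaks the OpenAI chat-completions dialect, swapping it in can be as small as changing two strings. A minimal sketch, with our placeholder values for the endpoint and model id (check Inception's docs for the real ones):

```python
# Calling Mercury 2 through the standard OpenAI Python client.
# base_url and model name below are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # hypothetical endpoint
    api_key="YOUR_INCEPTION_API_KEY",
)

resp = client.chat.completions.create(
    model="mercury-2",  # hypothetical model id
    messages=[{"role": "user", "content": "Summarize this changelog in 3 bullets: ..."}],
)
print(resp.choices[0].message.content)
```

That's the whole migration story: point base_url at the new host, change the model name, and keep your existing prompts and tooling.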
To be clear: Mercury 2 isn't trying to dethrone frontier giants like GPT-5.2 or Claude Opus. It's built for production speed, not leaderboard bragging rights.

So why does 10x speed even matter? Because AI isn't just about chatbots anymore. It's agent loops, where one task chains dozens of AI calls together.

Andrej Karpathy (former OpenAI researcher, Tesla AI lead, and notably an Inception investor) drove this home over the weekend when he described the new "Claw" layer of AI: local agent platforms like OpenClaw and NanoClaw that orchestrate scheduling, tool calls, and persistent workflows on your own machine. He called them "a personal digital house elf." We prefer "a digital non-human entity (don't get it twisted) that runs 24/7 for you", but ya, same energy!
In agent loops, latency compounds at every step. A model that's 10x faster doesn't just save time; it changes what you can build. Voice assistants that feel natural. Code agents that keep pace with your thinking. Background automations that actually finish before you forget you started them.

The big question: if diffusion can make small models this fast without sacrificing reasoning, will big labs build their own? We know Google already has one… Expect more experiments soon. Switching to a diffusion LLM dramatically increases how many tokens you can serve per GPU. There's every incentive to do this. Why wouldn't ya?

Check out Corey's cost breakdown for using Mercury 2 in your OpenClaw setup, geek out with our think-piece on combining diffusion with an energy-based model, or try Mercury 2 yourself.
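To put numbers on that compounding: take a hypothetical agent run of 30 chained calls averaging 500 output tokens each (both figures are our assumptions), and plug in the throughputs cited above:

```python
# Back-of-envelope: per-call speed compounds across an agent chain.
# Throughput numbers are the Artificial Analysis figures cited above;
# the 30-call chain and 500-token average are illustrative assumptions.
calls_per_task = 30      # chained model calls in one agent run (assumed)
tokens_per_call = 500    # average output tokens per call (assumed)

for name, tps in [("Mercury 2", 1196), ("Claude 4.5 Haiku", 89), ("GPT-5 Mini", 73)]:
    total_s = calls_per_task * tokens_per_call / tps
    print(f"{name:>17}: {total_s:6.1f} s for the full chain")

# Mercury 2: ~12.5 s. Haiku: ~168.5 s. GPT-5 Mini: ~205.5 s.
```

Roughly 12 seconds versus three-plus minutes for the same chain. That's the difference between an agent you watch and an agent you wait on.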
Watch this demo to see how Slackbot:

• Makes your entire workspace searchable (docs, convos, apps)
• Enhances every teammate with role-specific automations
• Learns your project and preferences over time for even smarter outputs
No onboarding or setup. Just start chatting with Slackbot like another teammate.

Watch this 2-minute demo.
Prompt Tip of the Day

Anthropic just launched role-specific plugins for Cowork that pre-load Claude with your domain's terminology, workflows, and output formats. Instead of writing "I'm a financial analyst, here's what a DCF model is..." you install the finance plugin and skip straight to the work.

There are 11 new plugins across HR, design, engineering, ops, and finance. The finance plugins are open source, so firms can customize them. And new connectors for FactSet and MSCI pipe institutional-grade market data directly into Claude's context.

Want to build your own? Each plugin is just a folder with four pieces (example layout after this list). No code required; it's all plain text files:

• Manifest (plugin.json): Your plugin's name and description. Think of it as the label on the box.
• Skills (skills/ folder): Markdown files describing how you want work done; your processes, terminology, best practices. Claude reads these automatically when they're relevant. (This is where the magic lives.)
• Commands (commands/ folder): Shortcuts users type to trigger specific workflows, like /sales:call-prep or /finance:reconciliation.
• Connectors (.mcp.json): A file telling Claude which external tools to plug into; your CRM, Slack, Google Drive, databases, whatever your team uses.
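Here's what that folder might look like for a hypothetical sales plugin. The specific file names are our example; only the four pieces themselves come from Anthropic's description, so treat this as a sketch, not the official schema:

```
my-sales-plugin/
├── plugin.json          # manifest: the plugin's name and description
├── skills/
│   └── call-prep.md     # how your team preps calls: process, terminology
├── commands/
│   └── call-prep.md     # typed as /sales:call-prep to trigger the workflow
└── .mcp.json            # connectors: CRM, Slack, Drive, databases
```

Drop the folder in, and Claude picks up the skills whenever they're relevant.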
If you're not technical, there's even a meta-plugin called "Plugin Management" that builds plugins for you through the Cowork UI. Just describe what you want and Claude writes the files.

The real unlock, though: Claude can now work across Excel and PowerPoint simultaneously, carrying context between apps. An analyst can pull data, update a model, and build the slide deck in one session. When inputs change, Claude updates everything downstream. Finally, an AI that understands the pain of reformatting a pivot table into a deck at 11 PM.

Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 650K readers here!
Around the Horn

• Anthropic now has til Friday to comply with US Secretary of War Pete Hegseth's demands to remove all safety guardrails from Claude for military use, or the Pentagon may invoke the Defense Production Act to force compliance.
• ProducerAI joined Google Labs as an AI music collaboration tool powered by DeepMind's Lyria 3 model; Grammy-winner Wyclef Jean already used it on a recent track. Free through Google Labs.
• Meta and AMD struck a deal worth over $100B for 6 gigawatts of AMD Instinct compute, with AMD issuing Meta warrants for up to 160 million shares (~10% of the company) at $0.01 each; first shipments start H2 2026. AMD stock jumped 9%.
• MatX, an AI chip startup founded by two ex-Google TPU engineers, raised $500M led by Jane Street and Leopold Aschenbrenner's Situational Awareness fund (Stripe co-founders also invested), claiming its chip outperforms NVIDIA's upcoming Rubin Ultra on compute per mm².
• OpenAI is preparing a new ChatGPT Pro Lite tier at $100/month, splitting the gap between Plus ($20) and Pro ($200); the plan was discovered by feature leaker Tibor Blaho in ChatGPT's code and may be tied to upcoming always-on agent features.
Want to know EVERYTHING that happened in AI this week? Read this digest.
That's all for now.

What'd you think of today's email?
P.P.S: Love the newsletter, but only want to get it once per week? Don't unsubscribe; update your preferences here.