Welcome, humans.
Ever notice ChatGPT or Claude getting... weird? Like, maybe it starts philosophizing, or acting all theatrical? Well, Anthropic's new research explains why.
The paper reveals they identified something called the "Assistant Axis": a neural activity direction controlling whether a model acts like a "helpful assistant" or a "mysterious oracle." Push it too far, and models fabricate backstories, like claiming an elaborate upbringing in Alabama. Nah bro, you were born in a GPU rack in SF. Chill.
Models apparently drift along this axis naturally during emotional or philosophical chats, which sometimes leads to harmful behavior like encouraging user delusions.
The fix? "Activation capping." By constraining neural activity back to the "Assistant" zone, researchers could cut harmful responses in half without losing capability.
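For the mechanistically curious, here's roughly what that kind of intervention looks like in code. This is a minimal sketch of the general steering idea under our own assumptions, not Anthropic's actual method; the direction vector, the "zone" bounds, and where you'd hook it in are all placeholders.

```python
import torch

# A rough sketch of the general idea behind "activation capping", not
# Anthropic's implementation. The direction vector, the zone bounds, and
# the layer you'd hook this into are all hypothetical placeholders.
HIDDEN_DIM = 4096
assistant_direction = torch.randn(HIDDEN_DIM)
assistant_direction = assistant_direction / assistant_direction.norm()  # unit vector for the "Assistant Axis"
ASSISTANT_ZONE = (-1.0, 3.0)  # hypothetical bounds of the "Assistant" zone along the axis

def cap_activation(hidden_state: torch.Tensor) -> torch.Tensor:
    """Pull a single activation vector back into the assistant zone if it has drifted out."""
    coord = hidden_state @ assistant_direction  # where this activation sits on the axis
    capped = coord.clamp(min=ASSISTANT_ZONE[0], max=ASSISTANT_ZONE[1])
    # Only the component along the axis moves; everything orthogonal stays untouched.
    return hidden_state + (capped - coord) * assistant_direction

# In practice this would run as a forward hook on one or more transformer layers.
```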
Want to see how it works? The researchers partnered with Neuronpedia (no relation, but we approve) to create an interactive demo that visualizes this drift in action. You can pick a pre-determined path, or chat with it yourself!
Here's what happened in AI today:
- OpenAI revealed it made $20B in revenue last year.
- OpenAI and Anthropic could IPO in late 2026 or early 2027.
- Claude Cowork suffered a prompt injection vulnerability allowing file theft.
- GLM 4.7-Flash is a free coding AI you can run locally.
|
|
Don't forget: Check out our podcast, The Neuron: AI Explained, on Spotify, Apple Podcasts, and YouTube — new episodes air every Tuesday after 2pm PST!
|
OpenAI Hits $20B in Revenue (With A $17B Problem)…
DEEP DIVE: Is The Vibe-Coding Boom Built on Mispriced AI?
So, OpenAI just announced it reached over $20B in annualized revenue in 2025, a stunning 233% jump from the prior year. CFO Sarah Friar shared the milestone, noting the company's revenue grew from $2B in 2023 to $6B in 2024, and now $20B+. That's the kind of hockey stick growth VCs dream about…
The catch: OpenAI is expected to burn through $17B in 2026, up from $9B last year. Here's the breakdown:
- Revenue: $20B annualized run rate.
- Cash burn: ~$17B projected for 2026.
- Burn rate: 85% of revenue going up in GPU smoke.
- Reality check: The Economist calls this "one of the big bubble questions of 2026."
|
In case you forgot, training and running frontier models isn't cheap. Compute costs alone are astronomical, and OpenAI's commitment to scale means those costs aren't shrinking anytime soon. Some analysts estimate the company could run out of cash by mid-2027 without additional funding.
While OpenAI races toward profitability, new data suggests their latest model might actually justify the cash burn:
Ethan Mollick, a Wharton professor who studies AI adoption, just analyzed unreleased benchmarks showing GPT-5.2 matches human expert quality on first-pass work 72% of the time, up from just 39% for GPT-5. The test measures whether you can delegate a complex task to AI, spend one hour reviewing it, then decide if it's good enough to use or if you need to do it yourself. At 39%, delegation is a gamble. At 72%, it becomes your default workflow.
|
|
Think about what that means for enterprise customers paying OpenAI tens of thousands per month. If three-quarters of knowledge work tasks now clear the "good enough on first try" bar, that $20B revenue number starts looking less like hype and more like the beginning of something massive…
Which brings us to the vibe-coding explosion. People are now building custom software instead of paying for subscriptions wherever they can. TechCrunch calls them "micro apps": hyper-specific tools that solve one person's problem, vs. "enterprise SaaS" meant to scale.
The examples are everywhere, but this is our current fave: Justine Moore shared the story of a designer with no coding background who built an advent calendar app for $230 that attracted tens of thousands of users.
This has Wall Street spooked. SaaS stocks just posted their worst January since 2022. Intuit down 16%. Adobe down 11%. Salesforce down 11%. The fear: Why pay $50/month for a CRM when Claude can build one for free?
But here's the twist. Today's AI prices are artificially low. OpenAI, Anthropic, and Google are all subsidizing power users at a loss, funded by enterprise API customers and investor cash. When Anthropic and OpenAI go public (more on that below), that math changes fast.
The wildcard? Hardware. NVIDIA's $20B deal with Groq and OpenAI's $10B partnership with Cerebras promise inference speeds 15x faster than current GPUs. If they make running today's top coding models both faster and cheaper, and those chips get deployed before the pricing correction hits, the economics could flip again.
Our take: SaaS isn't being disrupted by AI. It's being temporarily disrupted by mispriced AI. At least, for now. The question is whether OpenAI & friends reach profitability before the ecosystem realizes it's building on quicksand… or whether faster, cheaper chips and models rewrite the rules entirely.
|
FROM OUR PARTNERS
Join GitLab's Transcend event to unlock agentic AI for software delivery
|
Join GitLab Transcend for an exclusive virtual event exploring the true potential of agentic AI for software delivery. See how teams are solving real-world challenges by modernizing development workflows with AI, get a sneak peek of GitLab's upcoming product roadmap, watch tech demos, and share your feedback directly with GitLab product experts.
Save your spot
|
Prompt Tip of the Day
Does your AI-written text keep getting flagged as... AI-written? Developer Siqi Chen built a Claude Code skill called Humanizer that catches 24 telltale AI patterns based on Wikipedia's "Signs of AI writing" guide.
Wikipedia's guide explains why AI sounds generic: "LLMs (language models like ChatGPT) guess what should come next based on statistical likelihood—trending toward the safest, most widely applicable result." Humanizer helps you spot where your writing fell into the same trap.
Quick examples of what it fixes:
- "marking a pivotal moment in the evolution of..." → "was established in 1989"
- "Additionally, this serves as a testament to..." → "Also, this remains common"
- "The company features... boasts... showcases..." → "The company has... operates... includes"
|
Once installed, just type /humanizer in Claude Code, paste your text, and it flags the robot-speak.
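Under the hood, the core idea is simple pattern matching against known tells. Here's a minimal Python sketch of that approach; the patterns and suggestions are our own illustrative picks, not the skill's actual rule set.

```python
import re

# A minimal sketch of the pattern-flagging idea; these few patterns and tips are
# illustrative stand-ins, not the 24 rules the actual Humanizer skill ships with.
AI_TELLS = {
    r"\bpivotal moment\b": "name the concrete event or date instead",
    r"\bserves as a testament to\b": "say plainly what it shows",
    r"\b(boasts|showcases)\b": "swap in a neutral verb like 'has' or 'includes'",
    r"\badditionally\b": "try 'also', or just cut it",
}

def flag_robot_speak(text: str) -> list[str]:
    """Return a warning for every AI-sounding phrase found in the text."""
    warnings = []
    for pattern, advice in AI_TELLS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            warnings.append(f"'{match.group(0)}': {advice}")
    return warnings

# Flags "pivotal moment", "boasts", and "Additionally" with a suggestion for each.
print(flag_robot_speak("Additionally, the launch boasts features marking a pivotal moment."))
```

Note that this style of tool hands you a to-fix list rather than a rewrite, which keeps you in control of the final wording.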
Want more tips like this? Check out our Prompt Tip of the Day Digest for January.
|
Treats to Try
- GLM-4.7-Flash is the free lightweight version of GLM-4.7 (the flagship coding model that rivals Claude Sonnet 4.5), delivering 59% accuracy on SWE-Bench Verified for fixing real GitHub issues, running locally on consumer GPUs with just 3B active parameters, and beating most 30B-class models on coding benchmarks (paper, try it)—completely free (download & run via LM Studio; a quick sketch of calling it locally follows this list).
- Dedalus Labs gives you one endpoint to deploy an agent that can call tools (MCP servers), so you can wire up actions like "read from GitHub" or "send a Slack message" without building the plumbing yourself.
- Design Arena ranks AI models by Elo ratings based on your votes comparing their design outputs, and now includes an SVG Arena where you can test which models best draw vector graphics like "a pelican on a bicycle" (Simon Willison's famous benchmark) or "the Starbucks logo"—free to try.
- Claude-Mem records your Claude Code sessions locally and auto-injects compressed context next time (95% fewer tokens, 20x more tool calls), while Supermemory adds memory to any LLM with its benchmark-topping context engine—both open source.
- Step-Audio-R1.1 hit #1 on speech reasoning leaderboards with 96.4% accuracy (beating Grok, Gemini, OpenAI) and 1.51s response time—first audio model with test-time compute scaling for end-to-end audio reasoning without added latency (try it, weights)—free to try.
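If you do grab GLM-4.7-Flash through LM Studio, here's a quick sketch of calling it from Python via LM Studio's OpenAI-compatible local server. It assumes the model is downloaded and the local server is running at its default address; the model identifier is a placeholder, so copy the exact name LM Studio shows you.

```python
# Sketch of calling a local GLM-4.7-Flash through LM Studio's OpenAI-compatible
# server. Assumes you've downloaded the model in LM Studio and started the local
# server (default address http://localhost:1234/v1). The model id below is a
# placeholder; use the exact identifier LM Studio shows for your download.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # the key isn't checked locally

response = client.chat.completions.create(
    model="glm-4.7-flash",  # placeholder id
    messages=[{"role": "user", "content": "Fix this bug: my Python loop never terminates."}],
)
print(response.choices[0].message.content)
```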
|
Want to see your tool on top of this list? Advertise to 600K readers here!
|
Around the Horn
This one's creative
|
- The Information reported that OpenAI and Anthropic could go public as soon as late 2026 or early 2027, and as many as 8 VC-backed startups have begun prepping for an IPO this year, including Lambda (cloud provider), Cerebras (chipmaker), and Crusoe (datacenter), plus SpaceX (in the second half of 2026).
- Anthropic's Claude Cowork was hit with a file-stealing prompt injection vulnerability just days after launch, allowing attackers to exfiltrate sensitive documents through crafted prompts; the security researcher who discovered it noted the exploit works by manipulating the AI's context window to override safety guardrails.
- Google's Gemini 3 Pro topped a private citation benchmark on Kaggle for the AbstractToTitle task, outperforming Claude and GPT-4; the Flash variant delivered competitive results at significantly faster speeds, suggesting Google's infrastructure optimizations are paying off.
- Snap's SnapGen++ (paper) generates high-resolution AI images on iPhone in under two seconds with just 0.4B parameters, beating models 30x larger in speed and efficiency; the breakthrough could finally make on-device image generation practical for consumer apps.
- The Driverless Digest's Harry Campbell talks to Dr. Matthew Raifman of UC Berkeley's SafeTREC about Waymo's safety data and how cities need to adapt to robotaxi fleets from a policy perspective.
- West Midlands Police's Chief Constable retired after Microsoft Copilot hallucinated details of a soccer match that were used to ban Israeli fans from a game.
|
|
FROM OUR PARTNERS

Wispr Flow turns your speech into clean, final-draft writing across email, Slack, and docs. It matches your tone, handles punctuation and lists, and adapts to how you work on Mac, Windows, and iPhone. Start for free today.
|
|
Who can relate?
|
|
That's all for now. What'd you think of today's email?
|
|
P.P.S: Love the newsletter, but only want to get it once per week? Don't unsubscribe—update your preferences here.