Welcome, humans.
It's Thursday, and you know what that means: everyone has a lot of work to finish before Friday. In that same spirit, we'll keep today's newsletter light. |
First up: Thanks to everyone who came out to our live robot demo with Flexion Robotics yesterday! If you want to watch just the robot demo, check this video out. |
It looks like now that SeeDance 2.0 is becoming Cease-and-Desist-Dance, it's doing what Sora became notorious for: meme-ing famous historical figures instead of copyrighted characters: |
Ugh. Karl Marx WOULD work at GameStop
For context: SeeDance is the new AI video model that's creating absolutely unreal-looking, Hollywood-level video footage, and now Hollywood is sending unreal amounts of legal letters to its creator, ByteDance.
Gee, it would be real nice if we could get some new guardrails in place defining: A. the limits of copyright law re: parody, now that a computer can generate someone's exact likeness, and B. an easy-to-use open standard (like MCP) for people and companies to license themselves or their popular characters to creators, so both parties can make $$. Either we need No Fakes to pass ASAP, or a Human Likeness Protocol; ideally both??
Here's what happened in AI today: |
- OpenAI is 'bout to start cashing some down payments on their $100B round.
- Microsoft Office bug leaked confidential emails to Copilot AI.
- World Labs landed $1B, with $200M from Autodesk, for 3D world models.
- Google added Lyria 3 music generation to the Gemini app.
🎙️ SHAMELESS PLUG: Listen to Corey on THE AI FIX |
Our very own Corey Noles guest-hosted on The AI Fix podcast with Mark Stockley (of Malwarebytes / ThreatDown fame) this week. They dive into Claude Code, OpenClaw, the rise of AI agents, and the general state of AI chaos. And if you know The AI Fix, you know it's one of the funniest tech podcasts out there. Highly recommend.
Listen now: Apple Podcasts | Spotify |
Good to know Corey has never asked an AI to make nuclear weapons (so HE says…). |
OpenAI Just Showed That AI Can Drain a Crypto Wallet… on Purpose |
Here's a sentence that should make anyone with crypto slightly nervous: OpenAI's newest coding agent (GPT-5.3-Codex) can successfully hack and drain funds from vulnerable crypto smart contracts 72% of the time.
OpenAI (alongside crypto investment firm Paradigm) just released EVMbench, a new benchmark that tests how well AI agents can find, fix, and exploit security vulnerabilities in smart contracts (the self-executing code that manages over $100B in crypto assets). |
Quick refresher if you're not a crypto person: smart contracts are basically automated vaults. They hold your money and follow rules written in code. If there's a bug in that code, someone (or something) can drain the vault. And unlike your bank, there's no customer service line to call; it's irreversible. |
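To make that concrete, here's a toy sketch of the classic "reentrancy" bug, one of the canonical ways vaults get drained. Heavy hedge: real contracts are written in Solidity; this is illustrative Python, and every name in it is made up. The bug is simply that the vault pays out before it updates its ledger, so a malicious receiver can call withdraw() again mid-payout.

```python
# Toy model of a reentrancy-vulnerable vault (illustrative only; real
# smart contracts are Solidity, and these names are invented).

class VulnerableVault:
    def __init__(self):
        self.balances = {}   # depositor -> credited amount
        self.pool = 0        # actual funds the contract holds

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.pool += amount

    def withdraw(self, who, receive):
        amount = self.balances.get(who, 0)
        if amount > 0 and self.pool >= amount:
            self.pool -= amount
            receive(amount)              # BUG: external call happens first...
            self.balances[who] = 0       # ...ledger is only updated after

class Attacker:
    def __init__(self, vault):
        self.vault, self.stolen = vault, 0

    def receive(self, amount):
        self.stolen += amount
        if self.vault.pool >= amount:    # re-enter while the ledger is stale
            self.vault.withdraw("attacker", self.receive)

vault = VulnerableVault()
vault.deposit("honest_user", 100)
vault.deposit("attacker", 10)

attacker = Attacker(vault)
vault.withdraw("attacker", attacker.receive)
print(attacker.stolen)  # 110 -- a 10-unit deposit drained the whole pool
```

The standard fix is to flip the order (update the ledger, then send), a.k.a. the checks-effects-interactions pattern. This is the class of bug that benchmarks like EVMbench ask models to find, patch, or exploit.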
Side note: is anyone making smart agentic contracts that use an AI to reason about their hard-coded rules before executing them, to avoid this issue?
Here's what the benchmark found: |
- GPT-5.3-Codex scored 72.2% on exploit tasks, meaning it successfully drained funds from vulnerable contracts nearly three-quarters of the time. For context, GPT-5 scored just 31.9% on the same tasks six months ago.
- AI is better at attacking than defending: detection (finding bugs) and patching (fixing them) are still much harder, and the best model only caught ~46% of vulnerabilities.
- Give the AI a small hint about where to look, and patch success jumps from 39% to 94%. The bottleneck isn't skill; it's search.
The paper also includes a wild case study: a GPT-5.2 agent discovered and executed a flash loan attack (a complex multi-step exploit), draining a test vault's entire balance in a single transaction. No human guidance, no step-by-step instructions. |
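If you're wondering how a flash loan attack even works, here's a minimal sketch of the mechanics. Again: illustrative Python, not the actual exploit from the paper, and the flat 5% "exploit" step is a made-up stand-in for the real multi-step attack.

```python
# Toy flash-loan mechanics (illustrative; not the exploit from the paper).
# A flash loan lends with NO collateral because everything happens inside
# one atomic transaction: if the loan isn't repaid by the end, the whole
# transaction reverts as if it never happened.

class FlashLender:
    def __init__(self, pool):
        self.pool = pool

    def flash_loan(self, amount, use_funds):
        assert amount <= self.pool
        self.pool -= amount
        repaid = use_funds(amount)       # borrower runs arbitrary steps here
        assert repaid >= amount, "unpaid loan -> transaction reverts"
        self.pool += repaid

profit = 0

def exploit(borrowed):
    global profit
    # Stand-in for the real attack: use the borrowed capital to squeeze a
    # buggy contract (e.g., skew a price oracle). We pretend it yields a
    # flat 5% on whatever capital you swing.
    proceeds = int(borrowed * 1.05)
    profit += proceeds - borrowed
    return borrowed                      # repay exactly; pocket the rest

lender = FlashLender(pool=1_000_000)
lender.flash_loan(1_000_000, exploit)
print(profit)  # 50000 earned with zero starting capital
```

The scary part is the zero-capital requirement: an attacker (human or agent) doesn't need a bankroll, just a bug.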
OpenAI is framing this as a defensive tool, and they're putting money behind it: $10M in API credits for cybersecurity researchers, plus an expanding beta of Aardvark, their AI security research agent, and a new Trusted Access for Cyber program for vetted security professionals. |
Why this matters: The same AI that can write your emails and debug your code is now capable of draining a crypto vault in minutes. The hope is that defenders adopt these tools faster than attackers do. Because the race between AI-powered offense and defense is very real, and right now, it kinda feels like offense is winning? |
Prompt Tip of the Day |
Today's companion tip from Anthropic's prompting guide: tell Claude to try less. |
Claude Sonnet 4.6 overengineers by default: extra features, over-explaining, researching before acting. The fix is a single line: |
"Only do what's directly requested. Choose one approach and start immediately. Don't compare alternatives before writing." |
This works for code AND writing. If Claude keeps over-delivering, the problem is that nobody (meaning YOU) told it to stay minimal. |
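And if you drive Claude through the API, you can bake that line in as a system prompt. A minimal sketch with the official Anthropic Python SDK; note the model ID below is our guess for Sonnet 4.6, so check the current model list before using it:

```python
# pip install anthropic; expects ANTHROPIC_API_KEY in the environment.
from anthropic import Anthropic

client = Anthropic()

MINIMAL_SYSTEM = (
    "Only do what's directly requested. Choose one approach and start "
    "immediately. Don't compare alternatives before writing."
)

message = client.messages.create(
    model="claude-sonnet-4-6",   # assumed ID for Sonnet 4.6; verify first
    max_tokens=1024,
    system=MINIMAL_SYSTEM,       # the one-line "try less" fix from above
    messages=[{"role": "user", "content": "Write a function that parses ISO 8601 dates."}],
)
print(message.content[0].text)
```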
Want more tips like this? Check out our Prompt Tip of the Day Digest for February. |
Treats to Try |
*Asterisk = from our partners (only the first one!). Advertise to 650K readers here! |
Around the Horn |
- OpenAI is finalizing its first commitments for a $100B mega round at an $830B valuation, with SoftBank anchoring at $30B, Amazon up to $50B, and Nvidia up to $30B.
- Google added music-generation capabilities to the Gemini app using DeepMind's new Lyria 3 model.
- A Microsoft Office bug accidentally exposed customers' confidential emails to Copilot AI.
- World Labs landed a $1B round, including $200M from Autodesk, to bring world models into 3D workflows.
- Anthropic studied millions of AI agent interactions and found that Claude Code's longest autonomous work sessions nearly doubled in three months, that experienced users increasingly let agents run freely and intervene only when needed, and that agents paused for human input more often than humans interrupted them (good write-up on this from Latent Space).
- Ramp found that firms replaced freelancer spending with AI at roughly 25x cost savings ($0.03 in AI for every $1 cut from freelancers), with over half of businesses that used freelancers in 2022 stopping entirely by 2025, and warned that freelancers lack the protections to weather being first in line for displacement (paper).
- A new paper found that simply repeating your prompt to a non-reasoning LLM boosts accuracy by up to 76% on some tasks, with zero extra cost or latency (a tiny sketch of the trick is right after this list).
- An Axios reporter just wrote about using ChatGPT to sleep-train her toddler. Night one: 60-minute wake-ups every 90 minutes. Night seven: both kids asleep by 8:30. The secret weapon wasn't a $200 sleep consultant; it was a chatbot coaching her in real time at 3 a.m.
- Somebody in the feedback yesterday mentioned Grok 4.20 came out, but we didn't see it anywhere yet… guess we have an answer to that age-old question: if a model drops in the application layer and a newsletter writer isn't around to see it, does it even make a sound?
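For the prompt-repetition result above, the trick (as we read it) is literally just sending the question twice in one message. A tiny sketch, hedged: the paper may format the duplication differently.

```python
# Minimal version of the "repeat your prompt" trick (our reading of the
# result; the paper may phrase the duplication differently).
def repeated(prompt: str) -> str:
    return f"{prompt}\n\n{prompt}"  # same question, asked twice verbatim

print(repeated("Which is larger: 9.11 or 9.9?"))
# Send the result to any non-reasoning chat model as your user message.
```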
NEW: Want more? Check out our new Around the Horn Digest for February here |
Power productivity with your personal AI agent (& learn it in 2 minutes) |
Watch this 2-minute demo. |
One is AI, and one is real. Which is which? Vote in the poll below! |
A. |
B. |
Which is AI, and which is real? The answer is below, but place your vote to see how your guess stacks up against everyone else's (no cheating now!)
We highly recommend our sister publication eSecurity Planet for this, but we'll cover it when there's genuinely actionable advice to share; check out Sunday's issue on OpenClaw from a security POV, for example.
Trivia answer: A is AI, and B is real (Lynae Vanee!) |
That's all for now.
What'd you think of today's email?
P.S.: Before you go… have you subscribed to our YouTube channel? If not, can you?
Click the image to subscribe!
P.P.S.: Love the newsletter, but only want to get it once per week? Don't unsubscribe; just update your preferences here.