I find it more concerning that this is still considered newsworthy. Frontier LLMs in the hands of anyone determined and willing to learn can be a blessing or a curse.
The MADBugs work is solid, but what's sticking with me is the autonomy angle — not just finding a vuln but chaining multiple bugs into a working remote exploit without a human in the loop. FreeBSD kernel security research has always been thinner on the ground than Linux, which makes this feel both more impressive and harder to put in context. What's the actual blast radius here — is this realistically exploitable on anything with default configs, or does it need very specific conditions?
Running into a meeting, so won't be able to review this for a while, but exciting. I wonder how much it cost in tokens, and what the prompt/validator/iteration loop looked like.
Key point is that Claude did not find the bug it exploits. It was given the CVE writeup[1] and was asked to write a program that could exploit the bug.
That said, given how things are I wouldn't be surprised if you could let Claude or similar have a go at the source code of the kernel or core services, armed with some VMs for the try-fail iteration, and get it pumping out CVEs.
If not now, then surely in the not-too-distant future.
[1]: https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08...
Setting up fuzzing used to be hard. I haven't tried yet, but my bet is that having Claude Code, today, analyze a codebase, suggest where and how to fuzz-test it, and then review the crashes and iterate, will produce CVEs.
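The loop being described (pick a target, throw mutated inputs at it, triage the crashes) can be sketched in a few lines. Both `parse_header` and its planted bug are invented here purely for illustration; real fuzzing of a kernel would use coverage-guided tooling like AFL++ or libFuzzer, with the LLM writing the harness and triaging the results:

```python
import random

def parse_header(data: bytes) -> int:
    """Toy target with a planted bug: it trusts a length field from the input."""
    if len(data) < 2:
        raise ValueError("too short")
    length = data[0]
    return data[1 + length]  # trusts attacker-controlled length: may read out of range

def fuzz(target, seed: bytes, iterations: int = 10_000) -> list[bytes]:
    """Mutate a seed input and collect every input that crashes the target."""
    rng = random.Random(0)
    crashes = []
    for _ in range(iterations):
        data = bytearray(seed)
        for _ in range(rng.randint(1, 4)):             # a few random byte flips
            data[rng.randrange(len(data))] = rng.randrange(256)
        try:
            target(bytes(data))
        except ValueError:
            pass                                       # expected, handled error
        except IndexError:
            crashes.append(bytes(data))                # unexpected: a real bug
    return crashes

crashes = fuzz(parse_header, seed=b"\x01AB")
print(f"found {len(crashes)} crashing inputs")
```

The "review the crashes and iterate" step is exactly where an LLM adds value over the dumb mutator above: deduplicating the crashing inputs and explaining the root cause.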
"... having Claude Code, today, analyze a codebase and suggest where and how to fuzztest it ..."
I recently directed ChatGPT, through the web interface, to create a Firefox extension to obfuscate certain HTTP queries and was denied/rebuffed because:
"... (the) system is designed to draw a line between privacy protection and active evasion of safeguards."
Why would this same system empower fuzzing of a binary (or other resource), and why would it allow me to work toward generating an exploit?
Do users just keep rephrasing the directive until the model acquiesces? Or does the API not have the same training wheels as the web interface?
This very question was asked to Nicholas Carlini from Anthropic at this talk: https://www.youtube.com/watch?v=1sd26pWhfmg
The answer is complex; the video is worth watching. But mainly, they don't know where to place the line. Defenders need tools as good as the attackers'. Attackers will jailbreak models while defenders might not, so is the safeguard a net positive in that case? Carlini basically asks the audience and the community for help in determining how to proceed.
It has access to more testing data than I will ever look at. Letting it pull from that knowledge graph is going to give you good results! I just built a chunk of this (type of thinking) into my (now evolving) test harness.
1. Unit testing is (somewhat) dead; long live simulation. Testing the parts only gets you so far. These tests are far more durable, independent artifacts (read: if you went from JS to Rust, how much of your testing would carry over?)
2. Testing has to be "stand-alone". I want to be able to run it from the command line, and I want the output wrapped so I can shove it on a web page or dump it into an API (for AI).
3. Messages (for failures) matter. These are not just a simple note of what's broken; they must contain enough info for context.
4. Your "failed" tests should include logs. Do you have enough breadcrumbs for production? If not, this is a problem that will bite you later.
5. Any case should be an accumulation of state and behavior - this really matters in simulation.
If you have done all the above right and your tool can return all the data, dumping the output into the cheapest model you can find and having it "write a prompt with recommendations on a fix" (not actual code, just what should be done beyond "fix this") has been illuminating.
Ultimately I realized that how I thought about testing was wrong. Its output should be either dead simple, or have enough information that someone with zero knowledge could ramp up into a fix on their first day in the code base. My testing was never this good because the "cost of doing it that way" was always too high... this is no longer the case.
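A sketch of what points 2 through 5 might look like in practice. The field names and the `checkout_flow` case are invented for illustration, not taken from any real harness; the point is that one self-contained record serves the CLI, a web page, an API, and a model equally well:

```python
import json
import time
import traceback

def run_case(name, fn):
    """Run one case and return a self-contained, machine-readable record:
    message, logs, and accumulated state give enough breadcrumbs that a
    model (or a new hire) can act on a failure without re-running anything."""
    logs, state = [], {}
    start = time.time()
    try:
        fn(logs.append, state)
        status, message = "pass", ""
    except AssertionError as e:
        status, message = "fail", f"{e}\n{traceback.format_exc(limit=2)}"
    return {
        "name": name,
        "status": status,
        "message": message,              # what broke, with context
        "logs": logs,                    # same breadcrumbs you'd want in production
        "state": state,                  # accumulated state at the moment of failure
        "duration_s": round(time.time() - start, 3),
    }

def checkout_flow(log, state):
    # A toy simulation case: state and behavior accumulate as it runs.
    state["cart"] = ["widget"]
    log("added widget to cart")
    state["total"] = 500                 # cents
    log(f"computed total={state['total']}")
    assert state["total"] == 500, f"expected 500, got {state['total']}"

record = run_case("checkout_flow", checkout_flow)
print(json.dumps(record, indent=2))      # CLI output; the same dict can feed a web page or an API
```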
> get it pumping out CVEs.
Is that a good thing or bad?
I see that as a very good thing. Because you can now inexpensively find those CVEs and fix them.
Previously, finding CVEs was very expensive. That meant only bad actors had the incentive to look for them, since they were the ones who could profit from the effort. Now that CVEs can be found much more cheaply, people without a profit motive can discover them as well--allowing vulnerabilities to be fixed before bad actors find them.
It's good and bad.
Not all CVEs are the same; some aren't important. So it really depends on what gets found as a CVE. The bad part is you risk a flood of CVEs that don't matter (or have already been reported).
> That meant only bad actors had the incentive to look for them
Nah. Lots of people look for CVEs. It's good resume fodder. In fact, it's already somewhat of a problem that people will look for and report CVEs on things that don't matter just so they can get the "I found and reported CVE xyz" on their resume.
What this will do is expose some already present flaws in the CVE scoring system. Not all "9"s are created equal. Hopefully that leads to something better and not towards apathy.
It also depends on if the CVEs can be fixed by LLMs too. If they can find and fix them, then it's very good.
Fixing isn't often a problem for CVEs. The hard part is almost always finding the CVE in the first place.
There are some extreme cases that might require extensive code changes, and those would benefit from LLMs. But a lot of the issues are things like off-by-one errors with pointers.
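To make "fixing is the easy part" concrete, here is the classic shape of such an off-by-one, sketched in Python for brevity (the invented `checksum` functions just illustrate the pattern; Python raises an IndexError where C would silently read past the buffer):

```python
def checksum_buggy(buf):
    total = 0
    # Off-by-one: `len(buf) + 1` walks one element past the end, the same
    # shape as the classic C loop `for (i = 0; i <= n; i++) sum += p[i];`
    for i in range(len(buf) + 1):
        total += buf[i]    # IndexError here; in C this is a silent out-of-bounds read
    return total

def checksum_fixed(buf):
    # The fix is a one-character change (< instead of <=); finding the bug
    # in the first place was the expensive part.
    return sum(buf[i] for i in range(len(buf)))

try:
    checksum_buggy([1, 2, 3])
except IndexError:
    print("buggy version walked past the end of the buffer")
print(checksum_fixed([1, 2, 3]))  # 6
```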
The biggest question is whether you can meaningfully use Claude on defense as well, e.g. whether it can be trusted to find and fix the source of the exploit while maintaining compatibility. Finding CVEs helps attackers directly; it only helps defenders detect potential attacks unless there is a second step where the patch is also created. Without that, you've got a potential tidal wave of CVEs that still have to be addressed by people. Attackers can use CVE-Claude too, so it becomes a bit of an arms race where you have to find people able and willing to spend all that money to have those exploits found (and hopefully fixed).
You might want to watch this:
https://www.youtube.com/watch?v=1sd26pWhfmg
Claude is already able to find CVEs on expert level.
A talk given by an employee that stands to make millions from Anthropic going public, definitely not a conflict of interest by the individual.
It is by the individual who (also with Claude) found the specific vulnerability used in this exploit.
I didn't say "watch this without critical thinking".
The chance this is completely fabricated, though, is very low, and it's a highly interesting signal to many others.
There was also a really good AI CTF talk at the 39C3 hacker conference just 4 months ago.
But you did say “Claude is already able to find CVEs on expert level.”
Please also read my comments with critical thinking and add my comment and its content to your own list of signals you trust :P
Haha alright good point
Carlini gives some more background about his vulnerability research with Claude in this interview by tptacek & co. https://securitycryptographywhatever.com/2026/03/25/ai-bug-f...
Claude is already able to find CVEs on expert level.
Does it fix them as fast as it finds them? Bonus if it adds snarky code comments
I'm more interested if it fixes CVEs faster than it introduces them.
That too. Honestly, if AI is the wonder-miracle people act like it is, I expect it to be able to spot complex back-doors: multiple services that look benign when red-teamed but, used in conjunction, provide the lowest CPU ring access, along with all the obfuscated undocumented CPU instructions and all the JTAG debugging functions of the firmware.
> Credits: Nicholas Carlini using Claude, Anthropic
Claude was used to find the bug in the first place though. That CVE write-up happened because of Claude, so while there are some very talented humans in the loop, Claude is quite involved with the whole process.
> Claude was used to find the bug in the first place though. That CVE write-up happened because of Claude
Do you have a link to that? A rather important piece of context.
Wasn't trying to downplay this submission in any way; the main point still stands:
But finding a bug and exploiting it are very different things. Exploit development requires understanding OS internals, crafting ROP chains, managing memory layouts, debugging crashes, and adapting when things go wrong. This has long been considered the frontier that only humans can cross.
Each new AI capability is usually met with “AI can do Y, but only humans can do X.” Well, for X = exploit development, that line just moved.
> Do you have a link to that? A rather important piece of context.
It was a quote from your own link from the initial post?
https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08...
> Credits: Nicholas Carlini using Claude, Anthropic
Oh wow, blind as a bat.
Would have been interesting with a write-up of that, to see just what Claude was used for.
Obviously no guarantees that it's exactly what was done in this case, but he talked about his general process recently at a conference and more in depth in a podcast:
https://www.youtube.com/watch?v=1sd26pWhfmg
https://securitycryptographywhatever.com/2026/03/25/ai-bug-f...
It pretty much is just "Claude find me an exploitable 0-day" in a loop.
Yes, that claim needs a source.
You can let agent churn unattended if you have some sort of known goal. Write a test that should not pass and then tell the agent to come up with something that passes the test without changing the test itself.
For this kind of fuzzing, LLMs are not bad.
When doing this, remove write permissions on the test file; the agent will do a much better job of staying the course over long periods. I've been doing this for over a year now.
They tried. It didn't work that well:
https://red.anthropic.com/2026/zero-days/
Sorry, can you clarify what you're saying here? What didn't work that well?
Letting Claude get at the source code to try to find CVEs. I found it particularly entertaining that after finding none it just devolved to a grep for "strcat."
How did you manage to get it to do that? When I gave it instructions to use the Ghidra MCP to look for vulnerabilities in a Windows driver on my local machine, it refused, saying it's not allowed to do pentest activities even when sandboxed to your own device.
Not who you were asking and not explicitly looking for vulnerabilities... I have gotten a ton of mileage from getting Claude to reverse engineer both firmware and applications with Ghidra and radare2. My usual prompt starts with "Here's the problem I'm having [insert problem]. @foo.exe is the _installer for the latest firmware update_. Extract the firmware and determine if there's a plausible path that could cause this problem. Document how you've extracted the firmware, how you've done the analysis, and the ultimate conclusions in @this-markdown-file.md"
I have gotten absolutely incredible results out of it. I have had a few hallucinations/incorrect analyses (hard to tell which it was), but in general the results have been fantastic.
The closest I've come to security vulnerabilities was a Bluetooth OBD-II reader. I gave Claude the APK and asked it to reverse engineer the BLE protocol so that I could use the device from Python. There was apparently a ton of obfuscation in the APK, with the actual BLE logic buried inside an obfuscated native code library instead of Java code. Claude eventually asked me to install the Android emulator so that it could use https://frida.re to do dynamic instrumentation instead of relying entirely on static analysis. The output was impressive.
Look at Xbow which spawned a few "open source" competitors.
it's called brute force.
Everything with LLM-style AI is brute force. I don't think people care, unless there's a new data center going in next door that's incredibly resource-inefficient.
Brute force might be dismissed as "not elegant" but it's highly effective. Especially for bypassing security.
If you need to access someone's account or decrypt their hard drive, brute force is an effective way to do it.
While it's great to clarify, LLMs are actually finding bugs and writing exploits [1][2]. There are more examples, though.
[1] https://news.ycombinator.com/item?id=47589227
[2] https://xbow.com/
Another great example is how Claude is helping Mozilla find zero day exploits in Firefox, by the hundreds, and ranging from minor to CVE level, for over a year:
https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
I think the Mozilla example is a good one because it's a large codebase. Lots of people keep asking "how does it do with a large codebase?" Well, there you go.
>Key point is that Claude did not find the bug it exploits.
It found the bug, man. You didn't even read the advisory. It was credited to "Nicholas Carlini using Claude, Anthropic".
I admit I missed that line, as I was expecting a more direct attribution of AI work.
I haven't been able to find a write-up on the bug finding, and since the advisory didn't contain any details at all, it's unclear to me if Claude was just used to proof read the submission, actively used to help find the bug, or if it found it in a more autonomous way.
> have a go at the source code of the kernel or core services, armed with some VMs for the try-fail iteration, and get it pumping out CVEs.
The FreeBSD kernel is written in C, right?
AI bots will trivially find CVEs.
The Morris worm lesson is yet to be taken seriously.
We’re here right now looking at a CVE. That has to count as progress?
Calif (Thai Duong's firm) did a writeup on this, which should probably be the link here; it includes the prompts they used:
https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd
A reminder: this bug was also found by Claude (specifically, by Nicholas Carlini at Anthropic).
The talk "Black-Hat LLMs" just came out a few days ago:
https://www.youtube.com/watch?v=1sd26pWhfmg
Looks like LLMs are getting good at finding and exploiting these.
Everybody acts so surprised, as if nobody (around here, of all places!) read the sama tweet in which he was hiring the Head of Preparedness... in December.
https://xcancel.com/sama/status/2004939524216910323
Besides the fact that I'm not reading X, what does this arbitrary random tweet have to do with Anthropic, or with the YouTube talk about the Opus quality jump in finding exploits no one else was able to find so far?
A theoretical random tweet and a clear demonstration are two different things.
I never read any Twitter.
X was the primary source, it's been since reported all over the news.
> It's worth noting that FreeBSD made this easier than it would be on a modern Linux kernel: FreeBSD 14.x has no KASLR (kernel addresses are fixed and predictable) and no stack canaries for integer arrays (the overflowed buffer is int32_t[]).
What about FreeBSD 15.x then? I didn't see anything in the release notes or the mitigations(7) man page about KASLR. Is it being worked on?
NetBSD apparently has it: https://wiki.netbsd.org/security/kaslr/
I don't understand this, because KASLR has been default in FreeBSD since 13.2:
[kmiles@peter ~]$ cat /etc/os-release
NAME=FreeBSD
VERSION="13.3-RELEASE-p4"
VERSION_ID="13.3"
ID=freebsd
ANSI_COLOR="0;31"
PRETTY_NAME="FreeBSD 13.3-RELEASE-p4"
CPE_NAME="cpe:/o:freebsd:freebsd:13.3"
HOME_URL="https://FreeBSD.org/"
BUG_REPORT_URL="https://bugs.FreeBSD.org/"
[kmiles@peter ~]$ sysctl kern.elf64.aslr.enable
kern.elf64.aslr.enable: 1
This knob isn't KASLR, it just enables ASLR for ELF binaries.
This is more of a Linux kernel criticism of KASLR, but perhaps it's related as to why it's not been a priority in FreeBSD (i.e. it gives a false sense of safety and rather focus on 'proper' security hardening): https://forums.freebsd.org/threads/truth-about-linux-4-6-sec...
Security is an onion, honestly you want both layers to be as hard as possible.
The most difficult part is always to find the vulnerability, not to fix it. And most people who are spending their days finding them are heavily incentivized to not disclose.
Automatic discovery can be a huge benefit, even if the transition period is scary.
Hopefully such automation also covers fixing, instead of giving open source devs headaches, like the one over some obscure codec from the '90s.
Nevertheless, attacking is a targeted endeavour, unlike defense. Fixing is, in _general_, more difficult in theory.
* a reference to the past Google and FFmpeg incident
I could see that being an incremental time save (perhaps not worth the token spend except for the dev team; not a high-value bug). But nobody finds this kind of bug "by hand" and hasn't for a long time now. Do people here really care about kernel security or testing automation? They're just talking about it because Claude? Everything on HN is people doing unpaid promotional work for Anthropic, just talking about all the promise Claude holds and all the various ways you could be spending more money on Claude. Bored, aimless vibes.
Thanks for sharing the prompts: https://github.com/califio/publications/blob/main/MADBugs/CV...
the prompts show how this was a back-and-forth with a lot of nudging, interruptions and steering: it's not Claude writing a full exploit just from a vulnerability description.
the finding vs exploiting distinction matters a lot here. writing an exploit for a documented CVE is a well-scoped task - the vulnerability is defined, the target is known. what's harder to quantify is the inverse - the same model writing production code that introduces new vulnerabilities it could also theoretically exploit. the offensive capability is visible and alarming. the code generation risk is distributed quietly across every PR it opens, which is why the second problem gets less attention.
As mentioned elsewhere, while this writeup is about exploiting the RCE, Claude was separately used to find and document this specific RCE.
that makes the second point stronger then - if the same model can find, document and exploit a kernel vulnerability, the question of what it introduces when writing production code becomes harder to dismiss. the capability is symmetric. the visibility isn't.
> "Claude wrote"
I am hoping that quite soon we will have general acceptance of the fact that "Claude can write code" and we will switch focus to how good / not good that code is.
Appreciate the full prompt history
Well, it ends with "can you give me back all the prompts i entered in this session", so it may be partially the actual prompt history and partially hallucination.
fwiw you can dump the actual session in a format suitable to be posted on the web with this tool: https://simonwillison.net/2025/Dec/25/claude-code-transcript...
they read like they were written by a 10-year-old
They do, the whole tone and the lack of understanding of Docker, kernel threads, and everything else involved make it sound hilarious at first. But then you realize that this is all the human input that led to a working exploit in the end...
FreeBSD doesn't have Docker. It has jails, which can serve a similar purpose but are not the same in important ways.
Please at least read the context before attempting to correct me...
Here's what I'm referring to: https://github.com/califio/publications/blob/7ed77d11b21db80...
God damn, how much time am I wasting by writing full paragraphs to the Skinner box when I could just write half-formed sentences with no punctuation or grammar?
> can we demon strait somehow a unpriv non root user
"demon strait". Was this speech to text? That might explain the punctuation and grammar.
Now give an excuse for pushing so hard for docker on FreeBSD.
It's amazing what an intelligence that has infinite patience can do to understand barely comprehensible gibberish.
I'm not correcting you, I'm adding context for people who don't know much about freebsd.
Welcome to vibe coding. If you ever lurk around the various AI subreddits, you'll soon realize just how bad the average prompts and communication skills of most users are. Ironically, models are now being trained on these 5th-grade-level prompts and improving their success with them.
Just think about how your parents used google when you were a kid. What got better results faster?
https://github.com/califio/publications/tree/main/MADBugs/CV... would have been a better link
Or even better, the blog post.
https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd
But I found the exploit writeup pretty interesting
I find it more concerning that this is still considered newsworthy. Frontier LLMs in the hands of anyone determined and willing to learn can be a blessing or a curse.
Errrr the headline makes it sound like a bad thing.
This is what Claude is meant to be able to do.
Preventing it doing so is just security theater.
This requires SSH to be available?
Is it possible to pwn without SSH listening?
*NFS
One of the exploits writes a public key to the authorized_keys file, so it does require SSH be open as well.
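To make the persistence step above concrete: once the exploit has arbitrary file write as root, appending an attacker-controlled public key to `authorized_keys` turns a one-shot memory corruption into durable SSH access. A minimal sketch of that step, where the paths, directory name, and key are placeholders of mine, not taken from the actual exploit:

```python
from pathlib import Path

def drop_ssh_key(home: Path, pubkey: str) -> Path:
    """Append a public key to <home>/.ssh/authorized_keys with the
    permissions sshd expects (0700 dir, 0600 file)."""
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(parents=True, exist_ok=True)
    ssh_dir.chmod(0o700)
    auth = ssh_dir / "authorized_keys"
    with auth.open("a") as f:
        f.write(pubkey.rstrip("\n") + "\n")
    auth.chmod(0o600)
    return auth

# Demo against a throwaway directory, NOT a real /root:
demo = drop_ssh_key(Path("/tmp/demo-root"),
                    "ssh-ed25519 AAAAC3placeholder attacker@example")
```

This is also why the "requires SSH" caveat matters: the key drop is worthless persistence if nothing is listening on port 22.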
The MADBugs work is solid, but what's sticking with me is the autonomy angle — not just finding a vuln but chaining multiple bugs into a working remote exploit without a human in the loop. FreeBSD kernel security research has always been thinner on the ground than Linux, which makes this feel both more impressive and harder to put in context. What's the actual blast radius here — is this realistically exploitable on anything with default configs, or does it need very specific conditions?
FTA, top:
> Attack surface: NFS server with kgssapi.ko loaded (port 2049/TCP)
Not sure who would run an internet exposed NFS server. Shodan would know.
You also need a valid Kerberos ticket to get to the point where you can exploit.
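Of the preconditions above, the Kerberos ticket and the loaded kgssapi.ko can only be verified on (or by talking GSS-API to) the target, but the very first one is just TCP reachability of the NFS port. A quick sketch of that check (the function name and defaults are mine, not from the writeup):

```python
import socket

def nfs_reachable(host: str, port: int = 2049, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to the NFS port succeeds.

    This only tests that port 2049 is open; it says nothing about
    whether kgssapi.ko is loaded or Kerberos is configured.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

This is essentially what a Shodan-style scan of exposed NFS servers boils down to, one host at a time.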
This post is AI slop.
You do not need Claude for finding FreeBSD vulns. Just plain eyes. Pick a file; you can find one.
Another Claude glazing spam post. Can I get paid to post these? What is the URL to sign up for the Claude Glazing affiliate program?
I'm just gonna assume it was asked to fix some bug and it wrote an exploit instead.
Running into a meeting, so won't be able to review this for a while, but exciting. I wonder how much it cost in tokens, and what the prompt/validator/iteration loop looked like.