
Mythos finds a curl vulnerability

Ch12.009 Mythos finds a curl vulnerability

📊 Level ⭐⭐ | 45.1KB | entities/mythos-finds-a-curl-vulnerability.md

"Mythos finds a curl vulnerability"

URL Source: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/
Published Time: 2026-05-11T08:01:35+02:00


May 11, 2026, by Daniel Stenberg

Yes, as in singular one.

Back in April 2026 Anthropic caused a lot of media noise when they concluded that their new AI model Mythos is dangerously good at finding security flaws in source code. Apparently Mythos was so good at this that Anthropic would not release the model to the public yet, but would instead trickle it out to a select few companies for a while, to allow a few good ones(?) to get a head start and fix the most pressing problems first, before the general populace would get their hands on it.

The whole world seemed to lose its marbles. Is this the end of the world as we know it? An amazingly successful marketing stunt for sure.

My (non-) access

Part of the deal with project Glasswing was that Anthropic also offered access to their latest AI model to "Open Source projects" via the Linux Foundation. The Linux Foundation let their project Alpha Omega handle this part, and I was contacted by their representatives. As lead developer of curl I was offered access to the magic model and I graciously accepted the offer. Sure, I'd like to see what it can find in curl.

I signed the contract for getting access, but then nothing happened. Weeks went past and I was told there was a hiccup somewhere and access was delayed. Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report.

To me, the distinction isn't that important. It's not as if I would have had a lot of time to explore lots of different prompts and go on deep-dive adventures anyway. Getting the tool to generate a first proper scan and analysis would be great, whoever did it. I happily accepted this offer.

(I am purposely leaving out the identity of the individual(s) involved in getting the curl analysis done as it is not the point of this blog post.)

AI scans of curl

Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of "normal" static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc). Primarily AISLE, Zeropath and OpenAI's Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl throughout the most recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.

Nowadays we also use tools like GitHub's Copilot and Augment code to review pull requests, and their remarks and complaints help us to land better code and avoid merging new bugs. I mean, we still merge bugs of course but the PR review bots regularly highlight issues that we fix: our merges would be worse without them. The AI reviews are used in addition to the human reviews. They help us, they don't replace us.

We also see a high volume of high quality security reports flooding in: security researchers now use AI extensively and effectively.

Security is a top priority for us in the curl project. We follow every guideline and we do software engineering properly, to reduce the number of flaws in code. Scanning for flaws is just one of many steps to keep this ship safe. You need to search long and hard to find another software project that does as much as, or goes further than, curl for software security.

[Image: Steps involved in keeping curl secure]

May 6, 2026

It was with great anticipation we received the first source code analysis report generated with Mythos. Another chance for us to find areas to improve and bugs to fix. To make an even better curl. This initial scan was made on curl's git repository and its master branch at a certain recent commit. It counted 178K lines of code analyzed in the src/ and lib/ subdirectories. The analysis details the different approaches and methods it used in the search, and which kinds of flaws it focused on trying to find. A fun note at the top of the report says:

"curl is one of the most fuzzed and audited C codebases in existence (OSS-Fuzz, Coverity, CodeQL, multiple paid audits). Finding anything in the hot paths (HTTP/1, TLS, URL parsing core) is unlikely."

… and it correctly found no problems in those areas.

[Image: Completely unscientific poll on Mastodon about people's expectations for Mythos scanning curl]

The size of curl

curl is currently 176,000 lines of C code when we exclude blank lines. The source code consists of 660,000 words, which is 12% more words than the entire English edition of the novel War and Peace. On average, every single production source code line of curl has been written (and then rewritten) 4.14 times. We have polished on this. Right now, the existing production code in git master that still remains, has been authored by 573 separate individuals. Over time, a total of 1,465 individuals have so far had their proposed changes merged into curl's git repository. We have published 188 CVEs for curl up until now. curl is installed in over twenty billion instances. It runs on over 110 operating systems and 28 CPU architectures. It runs in every smart phone, tablet, car, TV, game console and server on earth.

Five findings became one

The report concluded it found five "Confirmed security vulnerabilities". I think using the term confirmed is a little amusing when the AI says it confidently by itself. Yes, the AI thinks they are confirmed, but the curl security team has a slightly different take.

Five issues felt like nothing as we had expected an extensive list. Once my curl security team fellows and I had poked at this short list for a number of hours and dug into the details, we had trimmed the list down and were left with one confirmed vulnerability. Of the other four, three were false positives (they highlighted shortcomings that are documented in API documentation) and the fourth we deemed "just a bug".

The single confirmed vulnerability is going to end up a severity low CVE planned to get published in sync with our pending next curl release 8.21.0 in late June. The flaw is not going to make anyone gasp for breath. All details of that vulnerability will of course not be made public before then, so you need to hold out for details on that.

The Mythos report on curl also contained a number of spotted bugs that it concluded were not vulnerabilities, much like any new code analyzer does when you run it on hundreds of thousands of lines of code. All the bugs in the report are being investigated and one by one we are fixing those that we agree with. All in all about twenty bugs that are described and explained very nicely. Barely any false positives, so I presume they have had a rather high threshold for certainty.

curl is certainly getting better thanks to this report, but counted by the volume of issues found, all the previous AI tools we have used have resulted in larger bugfix amounts. This is only natural of course since the first tools we ran had many more and easier bugs to find. As we have fixed issues along the way, finding new ones is slowly becoming harder. Additionally, a bug can be small or big so it's not always fair to just compare numbers.

Not particularly "dangerous"

My personal conclusion, however, cannot be anything other than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particularly higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing. This is just one source code repository and maybe it is much better on other things. I can only tell and comment on what it found here.

Still very good

But allow me to highlight and reiterate what I have said before: AI powered code analyzers are significantly better at finding security flaws and mistakes in source code than any traditional code analyzers did in the past. All modern AI models are good at this now. Anyone with time and an experimental spirit can find security problems now. The high quality chaos is real. Any project that has not scanned its source code with AI powered tooling will likely find a huge number of flaws, bugs and possible vulnerabilities with this new generation of tools. Mythos will, and so will many of the others. Not using AI code analyzers in your project means that you leave adversaries and attackers time and opportunity to find and exploit the flaws you don't find.

How AI analyzers differ

  • They can spot when the comment says something about the code and then conclude that the code does not work as the comment says.
  • They can check code for platforms and configurations we otherwise cannot run analyzers for.
  • They "know" details about 3rd party libraries and their APIs so they can detect abuse or bad assumptions.
  • They "know" details about protocols curl implements and can question details in the code that seem to violate or contradict protocol specifications.
  • They are typically good at summarizing and explaining the flaw, something which can be rather tedious and difficult with old style analyzers.
  • They can often generate and offer a patch for the issue they found (even if the patch usually is not a 100% fix).
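
The first point in the list, a comment that promises one thing while the code does another, can be pictured with a small C function. This is a hypothetical sketch written for illustration only: `copy_capped` is not a curl function and is not taken from the Mythos report. The comment inside the function describes the kind of off-by-one variant that would contradict the documented contract.

```c
#include <assert.h>
#include <stddef.h>

/* Documented contract: copy at most 'dstsize - 1' bytes of 'src' into
   'dst' and always null-terminate. Returns the number of bytes copied,
   not counting the terminator. Hypothetical helper for illustration. */
size_t copy_capped(char *dst, size_t dstsize, const char *src)
{
  size_t i = 0;

  assert(dst && src);
  if(!dstsize)
    return 0; /* no room even for the terminator */

  /* A mismatching variant would test 'i < dstsize' here. It would fill
     the whole buffer and then write the terminator one byte past the
     end, which is not what the comment above promises. A reviewer that
     reads the comment and the loop together can flag that difference. */
  while(i < dstsize - 1 && src[i]) {
    dst[i] = src[i];
    i++;
  }
  dst[i] = '\0';
  return i;
}
```

Calling `copy_capped(buf, sizeof(buf), "abcdef")` with a four-byte `buf` stores "abc" plus the terminator, which matches the comment.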

More details from the report

Zero memory-safety vulnerabilities found. Methodology note: this review is hand-driven analysis using LLM subagents for parallel file reads, with every candidate finding re-verified by direct source inspection in the main session before being recorded. The CVE to variant-hunt mapping was built from curl's own vuln.json. No automated SAST tooling was used. This outcome is consistent with curl's status as one of the most heavily fuzzed and audited C codebases. The defensive infrastructure (capped dynbufs everywhere, curlx_str_number with explicit max on every numeric parse, curlx_memdup0 overflow guard, CURL_PRINTF format-string enforcement, per-protocol response-size caps, pingpong 64KB line cap) systematically closes the bug classes that would normally be productive in a codebase this size. Coverage now includes: all minor protocols, all file parsers, all TLS backends' verify paths, http/1/2/3, ftp full depth, mprintf, x509asn1, doh, all auth mechanisms, content encoding, connection reuse, session cache, CLI tool, platform-specific code, and CI/build supply chain.
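
Two of the defenses named in the report, a numeric parse with an explicit maximum and a buffer with a hard size cap, follow a pattern that can be sketched in a few lines of C. The code below is an illustration under stated assumptions, not curl's implementation: `parse_number_capped` and the `capped_buf_*` functions are hypothetical names, and curl's real `curlx_str_number` and dynbuf APIs may differ in types and details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse an unsigned decimal number from *strp and
   refuse any value above 'max'. Returns 0 and advances *strp past the
   digits on success; returns -1 (leaving *strp untouched) if there are
   no digits or the value would exceed 'max'. */
int parse_number_capped(const char **strp, unsigned long *out,
                        unsigned long max)
{
  const char *p;
  unsigned long n = 0;

  assert(strp && *strp && out);
  p = *strp;
  if(*p < '0' || *p > '9')
    return -1;
  while(*p >= '0' && *p <= '9') {
    unsigned long digit = (unsigned long)(*p - '0');
    /* check before multiplying, so 'n' can never wrap around */
    if(digit > max || n > (max - digit) / 10)
      return -1;
    n = n * 10 + digit;
    p++;
  }
  *out = n;
  *strp = p;
  return 0;
}

/* Hypothetical growable buffer that can never exceed 'limit' bytes */
struct capped_buf {
  char *data;   /* heap storage, null-terminated once non-NULL */
  size_t len;   /* bytes stored, not counting the terminator */
  size_t limit; /* hard upper bound for 'len' */
};

void capped_buf_init(struct capped_buf *b, size_t limit)
{
  assert(b && limit < (size_t)-1); /* leave room for the terminator */
  b->data = NULL;
  b->len = 0;
  b->limit = limit;
}

/* Append 'n' bytes. Returns -1 without modifying the buffer if the
   cap would be exceeded or memory runs out. */
int capped_buf_add(struct capped_buf *b, const void *src, size_t n)
{
  char *p;

  if(n > b->limit - b->len) /* len <= limit always holds: no underflow */
    return -1;
  p = realloc(b->data, b->len + n + 1);
  if(!p)
    return -1;
  if(n)
    memcpy(p + b->len, src, n);
  b->len += n;
  p[b->len] = '\0';
  b->data = p;
  return 0;
}

void capped_buf_free(struct capped_buf *b)
{
  free(b->data);
  b->data = NULL;
  b->len = 0;
}
```

The design choice in both helpers is the same: the caller states the largest acceptable value or size up front, and the check happens before the arithmetic or allocation that could otherwise overflow.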

AI finds existing kinds of errors

It should be noted that the AI tools find the usual and established kind of errors we already know about. It just finds new instances of them. We have not seen any AI so far report a vulnerability that would somehow be of a novel kind or something totally new. They do not reinvent the field in that way, but they do dig up more issues than any other tools did before.

More to find

These were absolutely not the last bugs to find or report. Just while I was writing the drafts for this blog post we have received more reports from security researchers about suspected problems. The AI tools will improve further and the researchers can find new and different ways to prompt the existing AIs to make them find more. We have not reached the end of this yet. I hope we can keep getting more curl scans done with Mythos and other AIs, over and over until they truly stop finding new problems.

Credits

Thanks to Anthropic and Alpha Omega for providing the model, the tools and doing the scan for us. Thanks also to the individual who did the scan for us. Much appreciated!

Top image by Jin Kim from Pixabay.

Thanks for flying curl. It's never dull.


23 thoughts on "Mythos finds a curl vulnerability"

  1. Pavan (May 11, 2026 at 08:45): This was a fun read, thanks Daniel for the writeup.
  2. Ximon Eighteen (May 11, 2026 at 09:10): Minor typo in "It "knows" details about protocols curl implements and can question details in the code that seem to violate or contract protocol specifications". I suspect that should be "contradict" not "contract".
    1. Daniel Stenberg (May 11, 2026 at 09:58): @Ximon: thanks, fixed!
  3. Demi Marie Obenour (May 11, 2026 at 10:26): I think nghttp2, ngtcp2, and nghttp3 would be good next targets. All three seem to be maintained by one person, and all three are used by libcurl. And I suspect none of them gets anywhere near as much attention. Other relevant libraries would be OpenSSL, c-ares, and libidn2.
  4. sin99xx (May 11, 2026 at 13:27): curl flooding with "High Quality Report Spam" is crazy
  5. The pedant (May 11, 2026 at 13:52): gasp for breath 🙂
  6. TimothyEricsson (May 11, 2026 at 14:39): Excellent writeup. I'm not worried about Mythos, I'm worried about what comes next year! New exotic security findings, it'll be really cool to see what superintelligence can find.
  7. Yawar (May 11, 2026 at 15:44): I remember the days when you were getting flooded with low-quality AI-assisted vulnerability spam. Those seemed like quite desperate days! Have we come out of that tunnel of despair?
  8. Karl Ots (May 11, 2026 at 17:30): Very useful report, thanks for sharing, Daniel! Feels like a good reality check across all the FUD and speaks volumes on behalf of good hygiene and secure development practices. I'm curious, can you share the token cost spent to find these 5 "confirmed" vulnerabilities? How did this compare to Codex Security?
    1. Daniel Stenberg (May 12, 2026 at 00:01): @Karl: we get all of this for free, thanks to friendly donors. I don't know the spending nor the real costs.
  9. Jacob (May 11, 2026 at 18:44): Thanks for this insightful write up Daniel. Did the team that performed the scan and analysis mention a ball park on token usage? I have heard differing things in my circles.
    1. Daniel Stenberg (May 12, 2026 at 00:00): @Jakob: nope, and I didn't ask and I frankly did not care. This access and all the tokens needed were provided as a gift.
  10. Tom (May 11, 2026 at 20:07): Do you have the option of running it on an older code base to see what it would find, to do a pile for like comparison for when you used other AI issue checkers?
    1. Daniel Stenberg (May 11, 2026 at 23:59): @Tom: I'll leave such comparisons for someone else. I mostly care about improving curl.
  11. Bala (May 11, 2026 at 20:19): Daniel, the individual who ran the scan for you, did they just do static code analysis or dynamic fuzzing? Not finding any critical vuln speaks very highly of Curl's though this might be an exception.
  12. Peter Tärning (May 11, 2026 at 22:29): I think it's maybe not entirely fair to judge Mythos' capabilities based on a security scan for vulnerabilities in curl. Still, it's good that it actually found something that can be fixed. Maybe Anthropic/Mythos is just hype or marketing; I personally don't think so, at least judging by other reports. I'd rather say (again, just my own opinion) that curl is one of the few open source projects where deep expertise, dedication, and long-term commitment really pay off. Feel free to compare Daniel (curl) with Linus (Linux).
  13. William Kiely (May 12, 2026 at 01:41): [Disclaimer: Layperson here.] It sounds like Mythos was just prompted once to run a scan and analysis on curl and then generate a report? If so, might Mythos find several more vulnerabilities with more or better prompting? Naively, one hypothesis I'd have for the following is that the other tools may have been used better and/or more times than Mythos (if Mythos was in fact just prompted once to do the scan and generate a report), and that might be why Mythos found fewer bugs than those other tools: "counted by the volume of issues found, all the previous AI tools we have used have resulted in larger bugfix amounts". Getting more insight into what the individual with the Mythos access did to generate this report and whether or not this was close to exhaustive use of Mythos to improve curl would be helpful. On "Five issues felt like nothing as we had expected an extensive list.": Hypothesis: Your expectations weren't off; the Mythos tool just wasn't used as effectively or exhaustively as it potentially could have been. Thoughts on this hypothesis?
    1. meh (May 13, 2026 at 13:22): Great post, thanks for your work! @William Kiely: As stated in the post, someone related to the Linux Foundation did the analysis. Without any further knowledge I would assume: First, this person is probably a well educated software developer or security researcher as well. And second, this person has probably done this for multiple FOSS projects, at least more often than Stenberg who would've only used it once on curl (also an assumption). And while "curl is one of the most fuzzed and audited C codebases in existence" sounds like an excuse placed at the beginning of the report, curl is without any doubt one of the most fuzzed and audited codebases in existence, and you would not expect many bugs there.
  14. Willy (May 12, 2026 at 04:57): We've found a bunch of bugs and a few vulnerabilities in haproxy using AI-based tools, which is great, but honestly when reading this I'm having more and more doubts about Mythos being more than marketing hype. OK it might be more powerful than other models, but I think there's no point running something that big if you haven't first run other models locally to catch all the more visible slack. And only once you're used to seeing only false positives or low-importance stuff that you don't care about, it might make sense to try bigger models like Mythos and see if they find anything different.
  15. Hardik Cholera (May 12, 2026 at 05:08): This was an awesome read, thank you Daniel. Keep flying Curl!
  16. Mathias Przybylowicz (May 12, 2026 at 11:46): Great stuff Daniel, thanks for sharing. You & other people here might be interested in an article about the "Mythos effect" I published last week: https://mathiasprzybylowicz.substack.com/p/claude-mythos-software-security The article itself is an exec summary, there's a 57-pages (yeah, it's crazy :D) downloadable monograph which might complement your views.
  17. Iken (May 12, 2026 at 14:18): Can you try again with GPT 5.5 Cyber? https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities
  18. Wolfgang (May 12, 2026 at 14:41): It is somewhat disappointing that the lavishly funded AlphaOmega organization first promises access to open source developers and then backpedals. We do not know if Anthropic issues directives to only contact projects if issues have been found in order to avoid the situation that Mythos has not found anything at all in a specific project. That is probably one of the reasons for the embargo. If only a fraction of the money that AlphaOmega receives went to open source authors, perhaps no external audits would be needed. It also disturbs me that Anthropic/AlphaOmega get all the glory, money and publicity for finding a small amount of issues while the real open source authors get very little. There are no NYT articles if the Curl authors find and quietly fix issues.


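In-depth analysis

Daniel Stenberg's first-hand account of the Mythos scan of the curl codebase shows where AI security analysis tools actually stand:

1. An empirical check of the "dangerously good" claim. In April 2026 Anthropic said Mythos was dangerously good at finding security flaws in source code, which drew worldwide attention. But when it was run on curl, a very mature codebase of 176,000 lines of C with 188 published CVEs, Mythos reported 5 "confirmed" vulnerabilities, and only 1 of them survived human review as a real vulnerability (severity low). For this codebase at least, that points to a clear gap between the media hype and the observed results.

2. The distinct value of AI analyzers. Although Mythos fell short of expectations on curl, Daniel stresses that AI code analyzers are "significantly better" than traditional code analyzers. Tools such as AISLE, Zeropath and Codex Security have triggered somewhere between 200 and 300 bugfixes in curl over the past 8-10 months. This "high quality chaos" means AI tools are changing how security research is done, so that more flaws get found before attackers find them.

3. The "ceiling effect" of mature codebases. curl may be one of the most thoroughly audited C codebases in existence: OSS-Fuzz, Coverity, CodeQL and multiple paid audits. With that much fuzzing and review, finding anything new in the hot paths (HTTP/1, TLS, URL parsing core) is unlikely. Mythos found no problems in those areas, which is consistent with this.

4. AI tools still find known kinds of errors. Daniel states it plainly: "We have not seen any AI so far report a vulnerability that would somehow be of a novel kind or something totally new. They do not reinvent the field in that way, but they do dig up more issues than any other tools did before." AI is a strong force multiplier, but not a silver bullet for security.

5. The complicated relationship between open source and AI security. The Linux Foundation's Alpha Omega project set out to give open source projects access to Mythos, but access was delayed in practice and curl ended up with an indirect scan run by someone else. The case shows the obstacles between an AI security tool being announced and it being usable by the projects it targets.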

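Practical takeaways

For security teams:

1. Do not be swayed by marketing hype around AI security. When a tool claims to be "revolutionary", validate it inside the toolchain you already have. Daniel's experience suggests that combining several tools (fuzzing + SAST + AI) works better than relying on a single "magic" one.
2. Adjust expectations for mature codebases. If your project has been through years of security audits, AI tools may still catch things that were missed, but these are more likely to be low-severity issues outside the hot paths. Focus on building a habit of continuous scanning instead of searching for a "holy grail".
3. Treat AI review as a supplement to human review, not a replacement. The curl team uses AI to review pull requests, which helps them avoid merging bugs, but human review remains essential. AI can spot patterns that people overlook; understanding context and intent still takes a human expert.

For developers:

1. Build defense in depth instead of relying on a single measure. Much of curl's result comes from its systematic defensive infrastructure (capped dynbufs, curlx_str_number with an explicit max, the curlx_memdup0 overflow guard, and so on). Even as AI tools get stronger, these basic engineering practices stay necessary.
2. Scan your code with AI tools regularly. If you have not yet integrated AI code analysis into your project, now is a good time. Projects that scanned early have already fixed their easy-to-find problems; a project that has never been scanned should expect to find many, and every day without scanning leaves attackers time to find them first.
3. Judge tools by results, not by publicity. Daniel chose to share the details of the experiment, including what did not pan out, which is a responsible way to treat the community. When evaluating a tool, look for this kind of first-hand report from practitioners.

For open source maintainers:

1. Reach out to AI security programs. Initiatives such as Alpha Omega offer free security scans to open source projects. Even if actual access is delayed, signing up is worth trying.
2. Document your security practices. curl's detailed defensive infrastructure and its history of 188 CVEs make it a useful benchmark for testing AI tools. Your project may not need records this elaborate, but knowing what has already been tested helps you assess a new tool's results.
3. Balance publicity against real value. If your project gets attention because an AI tool found a vulnerability, stay realistic. Over-publicizing a low-severity flaw can do more harm than good; focus on actually improving the security of the code.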

Related entities