Claude Opus 4.8: The System Card¶
Published Time: 2026-05-29T20:50:28+00:00
Only six weeks after Opus 4.7, we have Opus 4.8.
For everyone, that means another incremental upgrade to Claude. It is once again smarter, and can do tasks for longer, and comes with a number of hot new features.
For me, that also means reading another 244-page system card.
It was only April 20 when I did a full review of the Opus 4.7 system card, plus an additional post focusing on related issues of model welfare.
These updates are incremental and coming more rapidly, and this still is below the capability level of Claude Mythos, so the focus will be on the delta. What is different about Opus 4.8 versus what we already know about Opus 4.7 and Mythos?
It turns out there’s still a lot to talk about.

Image created as self-portrait for this post by Claude Opus 4.8
Table of Contents¶
- Here We Go Again: Executive Summary.
- Introduction (1).
- RSP Evaluations (2).
- Move That Goalpost.
- The Failures Are News.
- Alignment Risk Slowly Rises.
- New Risk Pathways Just Dropped.
- Cyber (3).
- Harmful Requests (4.1).
- We Need To Talk (4.2 and 4.3).
- Overcoming Bias (4.4).
- Agentic Safety (5).
- Prompt Injection (5.2).
- Alignment (6).
- Looking For Problems.
- Who Watches The Training (6.2.2).
- Automated Behavioral Audit.
- The Model Is Smarter Than The Eval (6.2.3.2).
- You Should See The Other Guy.
- UK AISI Testing (6.2.4).
- In Vendbench (6.2.5).
- Honesty (6.3.3 to 6.3.6).
- Chain of Thought (CoT) Monitorability (6.5).
- What’s In The Box? (6.6).
- That’s All For Now.
Here We Go Again: Executive Summary¶
Again, this is my summary of their summary, plus additional key points.
- Mythos still exists, so it is unsurprising this did not set off the RSP triggers.
- Cyber capabilities are better than 4.7 but still well behind Mythos. Mythos seems to be an outlier in its cyber capabilities, relative to its other capabilities.
- Other capabilities are also better than 4.7 but still behind Mythos.
- Honesty is improved quite a bit across the board, especially agentic honesty.
- Mundane safety is, in all key aspects, as good or better for 4.8 than for 4.7.
- Mundane alignment is also robustly as good or better for 4.8 than for 4.7.
- There was some backsliding on prompt injections, computer use and adversarial situations, likely because training on these was taken out to avoid dishonesty.
- The ‘can you pull off various underhanded tasks’ tests still failed, although if it was properly underhanded you would see that, wouldn’t you?
- Anthropic evaluates the model welfare situation as good.
Introduction (1)¶
Standard training disclosures. No changes.
RSP Evaluations (2)¶
Because Mythos exists there is no new Risk Report for Claude Opus 4.8. Fair.
They go over the evals and keep saying ‘Mythos is better.’ Again, reasonably fair.
I don’t love that they used this as a reason to skip a bunch