Tiny Spoon

Big AI news, in small bites

PRODUCTAnthropic
1,000 SUBAGENTS!CLAUDE OPUS 4.8

Anthropic shipped Claude Opus 4.8 alongside Dynamic Workflows: Claude writes a JavaScript script that plans the work, then orchestrates up to 1,000 subagents (16 concurrent) in Claude Code. SWE-bench Verified: 88.6%. Pricing flat at $5/$25 per Mtok. Anthropic called the model "a modest but tangible improvement."

The model upgrade is incremental. The Dynamic Workflows feature is not. One Claude plans a job. Claude writes a runtime script. Up to 1,000 sub-agents do the work in parallel.

Benchmarks: 88.6% SWE-bench Verified (up from 87.6%), 74.6% Terminal-Bench 2.1, 93.6% GPQA Diamond, 1890 Elo on GDPval-AA. Sub-features matter more. Mid-task system messages on the Messages API. Optional 2.5x fast mode for cheaper inference. Honesty improvements in the alignment assessment. Same $5/$25 per Mtok pricing as Opus 4.7. Anthropic deliberately calls it a "modest" release. They're saving the Opus 5 marketing budget.

For PMs building agent products: Dynamic Workflows kills the long-running single-prompt pattern. Plan the orchestrator, not the prompt. For execs: 1,000-subagent batch jobs make overnight agentic work the new SLA. For dev infra: budget for Claude Code spend to 2-3x by end of summer.

▾ full brief & sources

Why this matters

  • Dynamic Workflows is the first production-grade implementation of "agent that orchestrates 1,000 sub-agents". Not a demo. In Claude Code.
  • Mid-conversation system messages let agents change behavior mid-task without losing prompt-cache hits. Big cost saver for long runs.
  • Anthropic shipped same-day with the $65B raise. The model release is the proof point for the round.

🔍 What happened

  • May 28, 2026. Anthropic released Claude Opus 4.8 across Claude, the API (claude-opus-4-8), Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
  • Benchmarks vs Opus 4.7: SWE-bench Verified 88.6% (87.6%), SWE-bench Pro 69.2% (64.3%), MCP-Atlas 82.2% (77.3%), BrowseComp 84.3% (79.3%).
  • Terminal-Bench 2.1: 74.6%. GPQA Diamond: 93.6%. GDPval-AA: 1890 Elo, leading.
  • Dynamic Workflows: Claude writes a JavaScript orchestrator that runs in a background runtime, dispatches subagents, checks checkpoints, resumes from saved state.
  • Limits: 16 concurrent subagents, 1,000 total per workflow run.
  • Mid-task system messages: role: "system" turns accepted after user turns, preserving prompt cache hits.
  • Optional 2.5x fast mode for cheaper inference where quality tolerance allows.
  • Pricing flat at $5 / $25 per Mtok (input / output).

💬 Smart takes

  • Simon Willison: "A modest but tangible improvement." Notes mid-conversation system messages are "really powerful" for steering an agent without breaking the cache.
  • Anthropic launch post: Opus 4.8 is for "fiduciary-grade AI systems for legal and tax professionals." Honesty improvements are the headline.
  • Every (Vibe Check): "Anthropic should've rounded up to 5." The capability jump is bigger than the version number suggests.
  • Skeptic: 1,000 subagent runs are also 1,000 ways to silently spend money. Without budgets and observability, finance teams find out at month-end.

🧭 Where this goes

  1. OpenAI and Google ship comparable orchestration primitives (Codex sessions, Gemini Spark workflows) within 60 days.
  2. Dev-tools layer rewrites: CI/CD systems start delegating to Claude Code Dynamic Workflows for long-running migrations.
  3. Enterprise FinOps gets a new category: per-workflow agent budgets, kill-switches at $X per run.
  4. Anthropic ships Opus 5 with a new capability axis (multi-modal or world-model) before Q4.

🎯 Implication

  • For PMs building agent products: design for orchestrators, not prompts. The new unit of work is a workflow, not a turn.
  • For execs procuring AI: ask vendors what their "1,000 subagent" demo looks like in your environment. The gap between labs widens here.
  • For dev infra leaders: add per-workflow budget caps to your Claude Code rollout this quarter.