A coding agent is the model plus everything you build around it.
Model · Harness · Ratchet · HaaS
Claude Code、Cursor、Codex — 底層模型不同,差異主要來自 harness。
Terminal Bench 2.0
同樣 Opus 4.6,
只改 harness →
Top 30 → Top 5
AGENTS.mdrm -rf / DROP TABLEAgent 錯誤不是一次性故事,不是「運氣差」,是永久信號。
.skip( / xit(
模型的推理能力隨 context 填滿而下降。三種技術應對 context rot。
"I told the agent"
≠ "The system enforces"
rm -rf / git push --force / DROP TABLE"Every component in a harness encodes an assumption about what the model can't do on its own."
— Anthropic Engineering
Top coding agents look more like each other than their underlying models do.