SWE-bench Pro
Leaderboard view. Green bars use WarpGrep; outlined bars are the baseline.
SWE-bench Pro Improvement
Legend: Model + WarpGrep, Baseline
15% Average Cost Reduction
19% Average Time Reduction
28 Turns Saved on Average
Performance Breakdown
Official SWE-bench Pro benchmark: MiniMax 2.5 with and without WarpGrep.
Sweep: MiniMax 2.5
| Metric | Baseline | WarpGrep | Delta |
|---|---|---|---|
| Avg events/instance | 157 | 135 | 14% fewer |
| Avg prompt tokens | 2,926,502 | 2,461,973 | 16% fewer |
| Avg completion tokens | 17,190 | 15,222 | 11% fewer |
| Avg reasoning tokens | 7,347 | 6,835 | 7% fewer |
| Avg cost/instance | $0.18 | $0.15 | 17% cheaper |
| Total cost (18 instances) | $3.26 | $2.77 | 15% cheaper |
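
The Delta column appears to be the relative reduction from the baseline, rounded to a whole percent. A minimal sketch that recomputes it from the table values is shown here; the function name `percent_reduction` and the `rows` dictionary are illustrative and not part of any WarpGrep or Morph API.

```python
def percent_reduction(baseline: float, with_tool: float) -> int:
    """Relative reduction from the baseline, rounded to a whole percent."""
    return round((baseline - with_tool) / baseline * 100)


# (baseline, WarpGrep) pairs copied from the table above.
rows = {
    "Avg events/instance": (157, 135),
    "Avg prompt tokens": (2_926_502, 2_461_973),
    "Avg completion tokens": (17_190, 15_222),
    "Avg reasoning tokens": (7_347, 6_835),
    "Avg cost/instance": (0.18, 0.15),
    "Total cost": (3.26, 2.77),
}

# Recompute the Delta column for every metric.
deltas = {name: percent_reduction(b, w) for name, (b, w) in rows.items()}

for name, pct in deltas.items():
    print(f"{name}: {pct}% lower")
```

Under that assumption, the recomputed values match the percentages reported in the table for all six rows.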
WarpGrep helps models focus on coding, not searching.
-39% Input Tokens
-26% Agent Turns
+10% Tasks Solved
Claude 4.5 Opus on SWE-bench, with vs. without WarpGrep.
15% cheaper · 10% more tasks solved · 26% fewer turns
Build better coding agents
WarpGrep is available as an API and SDK component. Join 500+ teams using Morph.