Benchmarks

The public measured benchmark is in progress. We are not citing headline numbers on this site until it lands.

Public measured run (in progress) · github.com/finsavvyai/clawpipe-booster-benchmark
Pre-registered methodology v1.0 was locked on 2026-05-18. The methodology was published before any results to prevent post-hoc selection of workloads or thresholds. The decision rule (commit / library / archive) was set before the run and is binding on the result.

What the measured run covers

Read or comment

METHODOLOGY.md v1.0 · DECISION-RULE.md · Public review thread (closed 2026-05-18, methodology locked)

Prior synthetic benchmark (preserved for transparency)

Why this section exists. Before the public measured run, we published a synthetic in-house benchmark. The numbers below were generated against a mocked gateway on 200 unique prompts × 2 passes. They are not customer-measured savings and are not a defensible cost-reduction claim. We are preserving them here for transparency, not citing them on marketing surfaces.
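To make the shape of that synthetic run concrete, here is a minimal sketch of 200 unique prompts sent through a mocked gateway for 2 passes. It is not the code in run-benchmark.ts: `MockGateway`, `runPasses`, and the placeholder prompts are hypothetical names invented for illustration, and the mock is kept synchronous for brevity.

```typescript
// Hypothetical sketch. None of these names are taken from run-benchmark.ts.
type GatewayReply = { text: string; mocked: true };

// A mocked gateway returns a canned reply and never contacts a provider,
// so no real latency or billing is observed.
class MockGateway {
  calls = 0;

  complete(prompt: string): GatewayReply {
    this.calls += 1;
    return { text: `mock reply for: ${prompt}`, mocked: true };
  }
}

// Send every prompt once per pass and return the total number of requests.
function runPasses(gateway: MockGateway, prompts: string[], passes: number): number {
  for (let pass = 0; pass < passes; pass++) {
    for (const prompt of prompts) {
      gateway.complete(prompt);
    }
  }
  return gateway.calls;
}

// 200 unique placeholder prompts standing in for the real dataset.
const prompts: string[] = Array.from({ length: 200 }, (_, i) => `prompt-${i}`);

// 200 unique prompts x 2 passes = 400 mocked requests.
const totalRequests = runPasses(new MockGateway(), prompts, 2);
console.log(`mocked requests: ${totalRequests}`);
```

Because every reply comes from the mock, a run like this can exercise the pipeline logic but cannot observe real provider cost, which is why the figures are not treated as measured savings.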

Reproduce the prior synthetic run

git clone https://github.com/finsavvyai/clawpipe-sdk
cd clawpipe-sdk/benchmarks
npm install
npx tsx run-benchmark.ts
open results/summary.md

The dataset (benchmarks/prompt-dataset.json), runner (run-benchmark.ts), and raw results (results/benchmark-results.json) are in the repo. Line 5 of the runner states explicitly that the gateway is mocked.
