tbench.ai API
www.tbench.ai
Track terminal agent performance across the Terminal-Bench benchmark by querying live rankings, accuracy scores, and detailed agent/model metadata. Compare leaderboards to see how different agents rank on various terminal tasks.
Endpoints: 2
Updated: 3h ago
Endpoint URL pattern: https://api.parse.bot/scraper/8c0968f8-4e4a-4ec7-97f3-48511c29d583/<endpoint>
Use it in your code (grab a free API key at signup):
curl -X GET 'https://api.parse.bot/scraper/8c0968f8-4e4a-4ec7-97f3-48511c29d583/get_leaderboard' \
  -H "X-API-Key: $PARSE_API_KEY"
All endpoints · 2 total
Get leaderboard entries for a specific Terminal-Bench benchmark version. Returns ranked entries sorted by accuracy descending, with optional filtering by search term, agent name, model name, and verified status.
Input
| Param | Type | Description |
|---|---|---|
| agent | string | Filter by agent name (case-insensitive substring match). |
| limit | integer | Maximum number of entries to return. |
| model | string | Filter by model name (case-insensitive substring match against any model in the entry's model list). |
| search | string | Free-text search filter. Matches against agent name, agent organization, or model names (case-insensitive). |
| version | string | Benchmark version. Known values for terminal-bench: 1.0, 2.0, 2.1, 3.0. Known values for terminal-bench-science: 1.0. |
| benchmark | string | Benchmark name. Known values: terminal-bench, terminal-bench-science. |
| verified_only | string | When set to 'true', returns only verified entries. |
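
The parameters above can be combined into a single request. The following is a minimal sketch, not an official client: it assumes the filters are sent as query-string parameters on the GET request (the page shows only a bare curl call, so the transport for parameters is an assumption), and `build_leaderboard_request` is a hypothetical helper name. It only builds the request; sending it is left to the caller.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL taken from the curl example on this page.
BASE_URL = "https://api.parse.bot/scraper/8c0968f8-4e4a-4ec7-97f3-48511c29d583"


def build_leaderboard_request(api_key, **filters):
    """Build (but do not send) a GET request for the get_leaderboard endpoint.

    Assumption: filters travel as query-string parameters.
    """
    # Drop filters the caller left unset.
    params = {k: v for k, v in filters.items() if v is not None}

    # verified_only is documented as the string 'true'; map True to it and
    # omit the parameter otherwise rather than guess at other accepted values.
    if "verified_only" in params and isinstance(params["verified_only"], bool):
        if params["verified_only"]:
            params["verified_only"] = "true"
        else:
            del params["verified_only"]

    url = f"{BASE_URL}/get_leaderboard"
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"X-API-Key": api_key}, method="GET")


req = build_leaderboard_request(
    "YOUR_API_KEY",
    benchmark="terminal-bench",
    version="2.0",
    limit=5,
    verified_only=True,
)
print(req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the body with `json.load`. Replace `YOUR_API_KEY` with a real key first.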
Response
{
"type": "object",
"fields": {
"entries": "array of leaderboard entry objects with rank, agent, model, accuracy, and metadata",
"version": "string",
"benchmark": "string",
"total_entries": "integer",
"filtered_entries": "integer"
},
"sample": {
"entries": [
{
"key": "nexau-ahe__gpt-5.5",
"date": "2026-05-14",
"rank": 1,
"agent": "NexAU-AHE",
"model": [
"GPT-5.5"
],
"stderr": 0.0107,
"accuracy": 0.847,
"verified": false,
"agent_url": "https://github.com/china-qijizhifeng/agentic-harness-engineering.git",
"agent_name": "nexau",
"model_names": [
"gpt-5.5"
],
"agent_version": "unknown",
"model_providers": [
"openai"
],
"agent_organization": "china-qijizhifeng",
"integration_method": "API",
"model_organization": [
"OpenAI"
]
}
],
"version": "2.0",
"benchmark": "terminal-bench",
"total_entries": 142,
"filtered_entries": 142
}
}
About the tbench.ai API
The tbench.ai API on Parse exposes 2 endpoints for the publicly available data on www.tbench.ai. Calls return JSON over HTTPS and are billed per successful response.
Pin a release with the API-Snapshot-Version header so canonical updates don't silently change your contract.
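
One way to guard against silent contract changes is to pin the snapshot header and also check each response against the documented fields. The sketch below is illustrative only: the page does not show what an `API-Snapshot-Version` value looks like, so `SNAPSHOT_VERSION` is a placeholder, and `pinned_headers`, `missing_fields`, and `top_entries` are hypothetical helper names. The `sample` dictionary is a trimmed copy of the sample response documented for `get_leaderboard`.

```python
# Placeholder: the real value format is not shown in the source page.
SNAPSHOT_VERSION = "<snapshot-version>"

# Top-level fields listed in the documented response schema.
EXPECTED_FIELDS = {
    "entries", "version", "benchmark", "total_entries", "filtered_entries",
}


def pinned_headers(api_key, snapshot=SNAPSHOT_VERSION):
    """Return request headers with the API key and a pinned snapshot version."""
    return {"X-API-Key": api_key, "API-Snapshot-Version": snapshot}


def missing_fields(payload):
    """Return the documented top-level fields that are absent from payload."""
    return EXPECTED_FIELDS - set(payload)


def top_entries(payload, n=3):
    """Return (rank, agent, accuracy) tuples for the first n entries."""
    return [(e["rank"], e["agent"], e["accuracy"]) for e in payload["entries"][:n]]


# Trimmed copy of the documented sample response.
sample = {
    "entries": [
        {"rank": 1, "agent": "NexAU-AHE", "model": ["GPT-5.5"],
         "accuracy": 0.847, "stderr": 0.0107, "verified": False},
    ],
    "version": "2.0",
    "benchmark": "terminal-bench",
    "total_entries": 142,
    "filtered_entries": 142,
}

if missing_fields(sample):
    raise ValueError(f"response is missing fields: {missing_fields(sample)}")
print(top_entries(sample))
```

Failing fast on a missing field surfaces a schema change at the call site instead of deeper in downstream code.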