epoch.ai API

Retrieve FrontierMath benchmark scores for 90+ AI models across Tier 4 and Tier 1-3 difficulty levels, with provider info and release dates.

Endpoints: 3
Updated: 14d ago

Endpoint URL pattern: api.parse.bot/scraper/8633ba07-9301-42b3-905a-68d6f1233019/<endpoint>

Use it in your code (grab a free API key at signup):
curl -X GET 'https://api.parse.bot/scraper/8633ba07-9301-42b3-905a-68d6f1233019/get_scores' \
  -H "X-API-Key: $PARSE_API_KEY"
All endpoints · 3 total

get_scores: Get all FrontierMath benchmark scores for both Tier 4 and Tier 1-3, merged into a single table. Each model has scores for both tiers (null if not evaluated for a tier). Sorted by Tier 4 score descending, then Tier 1-3 score.

Input

No input parameters required.

Response
{
  "type": "object",
  "fields": {
    "models": "array of model objects with model_name, model_id, provider, release_date, tier4_score, tier4_score_pct, tier4_error, tier4_error_pct, tier13_score, tier13_score_pct, tier13_error, tier13_error_pct",
    "tier4_task": "string - Tier 4 task identifier",
    "tier13_task": "string - Tier 1-3 task identifier",
    "tier4_count": "integer - number of models with Tier 4 scores",
    "tier13_count": "integer - number of models with Tier 1-3 scores",
    "total_models": "integer - total unique models across both tiers"
  },
  "sample": {
    "data": {
      "models": [
        {
          "model_id": "gpt-5.5-pro-pre-release_high",
          "provider": "OpenAI",
          "model_name": "GPT-5.5 Pro (high)",
          "tier4_error": 0.071,
          "tier4_score": 0.396,
          "release_date": "2026-04-23",
          "tier13_error": 0.029,
          "tier13_score": 0.524,
          "tier4_error_pct": 7.1,
          "tier4_score_pct": 39.6,
          "tier13_error_pct": 2.9,
          "tier13_score_pct": 52.4
        }
      ],
      "tier4_task": "FrontierMath-Tier-4-2025-07-01-Private",
      "tier13_task": "FrontierMath-2025-02-28-Private",
      "tier4_count": 62,
      "tier13_count": 92,
      "total_models": 96
    },
    "status": "success"
  }
}
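
The response above can be read with the Python standard library. The following is a minimal sketch, not an official client: it assumes the live response uses the same "data" / "status" envelope as the sample, and the helper names fetch_scores and parse_models are hypothetical. The API key is read from a PARSE_API_KEY environment variable, as in the curl example.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.parse.bot/scraper/8633ba07-9301-42b3-905a-68d6f1233019"


def fetch_scores(endpoint: str = "get_scores", timeout: float = 30.0) -> dict:
    """GET one endpoint and return the decoded JSON body (makes a network call)."""
    request = urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        headers={"X-API-Key": os.environ["PARSE_API_KEY"]},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def parse_models(payload: dict) -> list[dict]:
    """Return the model list, assuming the {"data": ..., "status": ...} envelope."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    return payload["data"]["models"]


# Offline example using a trimmed copy of the sample response above.
sample = {
    "data": {
        "models": [
            {
                "model_id": "gpt-5.5-pro-pre-release_high",
                "provider": "OpenAI",
                "tier4_score_pct": 39.6,
                "tier13_score_pct": 52.4,
            }
        ],
        "total_models": 96,
    },
    "status": "success",
}
models = parse_models(sample)
print(models[0]["model_id"], models[0]["tier4_score_pct"])
```

For a live call, parse_models(fetch_scores()) would return the full list, provided the envelope assumption holds.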

About the epoch.ai API

This API exposes 3 endpoints that return FrontierMath benchmark scores from Epoch AI's leaderboard, covering 90+ AI models from OpenAI, Anthropic, Google DeepMind, Meta AI, DeepSeek, and others. The get_scores endpoint merges Tier 4 and Tier 1-3 results into a single table, while dedicated endpoints for each tier return score percentages, error margins, model release dates, and provider organization for every evaluated model.

Endpoints and Coverage

The API provides three endpoints covering Epoch AI's FrontierMath leaderboard. get_scores returns a unified view of both difficulty tiers merged into one array, where each model object carries tier4_score, tier4_score_pct, tier13_score, tier13_score_pct, and associated error fields — with null values where a model has not been evaluated on a particular tier. The response also includes summary counts: tier4_count, tier13_count, and total_models.
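
Because a missing tier is reported as null, client code has to check for it before doing arithmetic on the scores. The sketch below shows one way to compute the gap between tiers and to list models evaluated on only one tier. The function names are hypothetical, and the second record is invented here purely to illustrate the null case.

```python
from typing import Optional


def tier_gap(model: dict) -> Optional[float]:
    """Tier 1-3 minus Tier 4 score in percentage points, or None if a tier is missing."""
    tier4 = model.get("tier4_score_pct")
    tier13 = model.get("tier13_score_pct")
    if tier4 is None or tier13 is None:
        return None
    return round(tier13 - tier4, 1)


def single_tier_models(models: list[dict]) -> list[str]:
    """IDs of models that have a score on exactly one of the two tiers."""
    return [
        m["model_id"]
        for m in models
        if (m.get("tier4_score_pct") is None) != (m.get("tier13_score_pct") is None)
    ]


models = [
    # Values taken from the sample response.
    {"model_id": "gpt-5.5-pro-pre-release_high", "tier4_score_pct": 39.6, "tier13_score_pct": 52.4},
    # Hypothetical record, invented to show a model with no Tier 4 result.
    {"model_id": "example-model", "tier4_score_pct": None, "tier13_score_pct": 10.0},
]

for m in models:
    print(m["model_id"], tier_gap(m))
print(single_tier_models(models))
```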

Tier-Specific Endpoints

get_tier4_scores targets the hardest FrontierMath problems (approximately 50 in total) and returns all models that have been evaluated at that level, sorted by score descending. Each model object includes display_name, model_id, release_date, organization, score_pct, and score_error. get_tier13_scores covers the 290-problem Tier 1-3 set and returns the same field shape. Both endpoints expose the task identifier string, which identifies the benchmark task tracked on the leaderboard.

Data Shape and Sorting

All three endpoints sort results by score descending — get_scores sorts by Tier 4 score first, then Tier 1-3 score as a tiebreaker. Model identity is consistent across endpoints via model_id. Provider attribution is available as organization (in the tier-specific endpoints) or provider (in get_scores). None of the endpoints require input parameters; each call returns the full current leaderboard snapshot.
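
Since the tier-specific endpoints and get_scores use different field names for the same information, code that combines them may want to normalize the records first. This is a rough sketch under stated assumptions: the field names come from the descriptions above, but the tier-specific response is not shown on this page, so the example record is hypothetical and its values are copied from the get_scores sample. score_error is left out because its unit is not stated here.

```python
def normalize_tier_record(record: dict, tier: str) -> dict:
    """Rename tier-specific fields to the names used by get_scores.

    `tier` is the merged-response prefix: "tier4" or "tier13".
    """
    if tier not in ("tier4", "tier13"):
        raise ValueError(f"unknown tier prefix: {tier!r}")
    return {
        "model_id": record["model_id"],
        "model_name": record.get("display_name"),
        "provider": record.get("organization"),
        "release_date": record.get("release_date"),
        f"{tier}_score_pct": record.get("score_pct"),
    }


# Hypothetical tier-specific record (shape assumed from the field list above).
tier4_record = {
    "display_name": "GPT-5.5 Pro (high)",
    "model_id": "gpt-5.5-pro-pre-release_high",
    "organization": "OpenAI",
    "release_date": "2026-04-23",
    "score_pct": 39.6,
}

normalized = normalize_tier_record(tier4_record, "tier4")
print(normalized)
```

Keying the normalized records by model_id then lets results from any endpoint be joined, since model_id is the field described as consistent across all three.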

Common use cases
  • Track which AI models lead the FrontierMath Tier 4 leaderboard using score_pct and score_error fields.
  • Compare Tier 4 vs Tier 1-3 performance gaps for a given model using the merged get_scores endpoint.
  • Filter models by organization to monitor benchmarks for a specific AI lab such as Anthropic or DeepSeek.
  • Plot score progression over time using release_date alongside score_pct across model generations.
  • Identify models that have only been evaluated on one tier by checking for null scores in the merged response.
  • Build a research dashboard showing frontier math reasoning capability across 90+ models and multiple providers.
  • Alert pipelines when new models appear on the leaderboard by comparing total_models across periodic calls.
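
For the alerting use case, comparing total_models tells you that something changed but not what. A small sketch of an alternative is to diff the model_id values between two snapshots, which also catches the case where one model is added and another removed in the same interval. The function name and the snapshot IDs below are hypothetical.

```python
def new_model_ids(previous: list[dict], current: list[dict]) -> list[str]:
    """IDs present in the current snapshot but not in the previous one."""
    seen = {m["model_id"] for m in previous}
    return [m["model_id"] for m in current if m["model_id"] not in seen]


# Hypothetical snapshots from two periodic calls (IDs invented for illustration).
yesterday = [{"model_id": "model-a"}, {"model_id": "model-b"}]
today = [{"model_id": "model-c"}, {"model_id": "model-a"}, {"model_id": "model-b"}]

added = new_model_ids(yesterday, today)
if added:
    print("New models on the leaderboard:", added)
```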
Pricing & limits

Tier        Price      Credits/month   Rate limit
Free        $0/mo      100             5 req/min
Hobby       $30/mo     1,000           20 req/min
Developer   $100/mo    5,000           250 req/min

One credit = one API call regardless of which marketplace API you call. Exceeding the rate limit returns a 429 response. Authenticate with the X-API-Key header.
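
One way a client might handle the 429 response is to retry with exponential backoff. The sketch below is an assumption-laden illustration: the retry count and delays are arbitrary choices, this page does not say whether the service sends a Retry-After header, and call_with_backoff is a hypothetical helper. The send argument stands in for whatever function performs the request and returns a (status, body) pair.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = 429, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles on each retry: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status, body


# Offline demonstration with a fake sender that is rate-limited twice.
responses = iter([(429, ""), (429, ""), (200, "ok")])
delays: list = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Each retry is itself an API call, so a tight retry loop can make rate limiting worse; on the Free tier (5 req/min) spacing calls roughly 12 seconds apart avoids the limit in the first place.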

Frequently asked questions
Does Epoch AI have an official developer API for FrontierMath data?
Epoch AI does not currently offer a public developer API for the FrontierMath leaderboard data. The organization publishes the leaderboard at epoch.ai/frontiermath/tiers-1-4 as a research resource, but there is no documented REST or GraphQL API available to developers.
What does the `get_scores` endpoint return that the tier-specific endpoints don't?
get_scores merges both tiers into one array and exposes tier4_score, tier4_score_pct, tier4_error, tier13_score, and tier13_score_pct as parallel fields on each model object. It also returns tier4_count, tier13_count, and total_models as top-level summary integers. The tier-specific endpoints (get_tier4_scores, get_tier13_scores) return a simpler shape focused on one tier's score, score_pct, and score_error, and include a display_name field not present in the merged response.
Can I filter results by provider or retrieve scores for a single model by ID?
The endpoints do not accept filter parameters — each call returns the full leaderboard. Filtering by organization/provider or selecting by model_id needs to be done client-side on the returned array. You can fork this API on Parse and revise it to add a filtered endpoint that accepts an organization name or model_id as a query parameter.
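
Client-side filtering on the get_scores array can be as simple as the following sketch. The helper names are hypothetical, the first record uses values from the sample response, and the second record is invented for illustration.

```python
from typing import Optional


def filter_by_provider(models: list[dict], provider: str) -> list[dict]:
    """Models whose provider matches, ignoring case and surrounding spaces."""
    wanted = provider.strip().lower()
    return [m for m in models if (m.get("provider") or "").strip().lower() == wanted]


def find_by_id(models: list[dict], model_id: str) -> Optional[dict]:
    """The first model with the given model_id, or None if absent."""
    return next((m for m in models if m.get("model_id") == model_id), None)


models = [
    # Values taken from the sample response.
    {"model_id": "gpt-5.5-pro-pre-release_high", "provider": "OpenAI", "tier4_score_pct": 39.6},
    # Hypothetical record, invented to give the filter something to exclude.
    {"model_id": "example-model", "provider": "Anthropic", "tier4_score_pct": None},
]

print(filter_by_provider(models, "openai"))
print(find_by_id(models, "example-model"))
```

For the tier-specific endpoints the same approach applies, with organization in place of provider.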
Does the API include historical scores or benchmark results beyond FrontierMath?
Not currently. The API covers the FrontierMath Tier 4 and Tier 1-3 leaderboards only, reflecting the current published scores for each model. Historical score snapshots and other Epoch AI benchmarks (such as MATH or GPQA data published elsewhere on the site) are not included. You can fork this API on Parse and revise it to add endpoints targeting other Epoch AI leaderboard pages.
How fresh is the leaderboard data, and are all 90+ models guaranteed to have scores on both tiers?
The data reflects the current state of the FrontierMath leaderboard as published by Epoch AI. Not all models are evaluated on both tiers — get_scores uses null for missing tier scores, and tier4_count vs tier13_count will typically differ. New models appear once Epoch AI adds them to the leaderboard; there is no fixed refresh schedule.
Spec covers 3 endpoints from epoch.ai.