Arena API (arena.ai)

Fetch AI model leaderboard rankings from arena.ai across agent, text, image, video, and code arenas. Get ELO scores, confidence intervals, and model metadata.

Developer Tools reviews ratings other

Endpoints: 1 · Updated 2d ago

What is the Arena API?

The Arena AI Leaderboard API exposes a single get_leaderboard endpoint that returns ranked AI model data for nine arena values: agent, text, text-to-image, image-edit, text-to-video, image-to-video, video-edit, code/webdev, and code/image-to-webdev. Each response includes per-model scores, confidence intervals, and metadata. The agent arena also returns signal-based sub-scores for task outcome, steerability, and tool hallucination, plus session-level totals and a last-updated timestamp.

This call costs 1 credit per call, charged only on success.

Try it

arena

The arena/modality to fetch rankings for. Accepts exactly one of: agent, text, text-to-image, image-edit, text-to-video, image-to-video, video-edit, code/webdev, code/image-to-webdev.

→ api.parse.bot/scraper/dee580cd-1286-483f-818e-02b6459a0d69/<endpoint>

Call it over HTTP (grab a free API key at signup)

curl -X GET 'https://api.parse.bot/scraper/dee580cd-1286-483f-818e-02b6459a0d69/get_leaderboard?arena=agent' \
  -H "X-API-Key: $PARSE_API_KEY"

Python SDK · recommended

Typed, relational, agent-ready

A generated client with real types, enums, and the links between objects — the structure a flat JSON response can't carry. Autocompletes in your editor and reads cleanly to coding agents.

Fully typed · autocompletes
Objects link to objects
Typed errors & pagination

Typed Python client. Set up the SDK in your uv project, then pull this API’s typed client:

uv add parse-sdk
uv run parse init
uv run parse add --marketplace arena-ai-api

uv run parse add --marketplace pulls a pinned snapshot of this canonical API — it won’t change underneath you. To customize it, subscribe and swap to your own copy.

"""
Arena AI Leaderboard API Client

Fetch AI model leaderboard rankings from arena.ai across different modalities.
Get your API key from: https://parse.bot/settings
"""

import os
from typing import Optional

import requests


class ParseClient:
    """Client for interacting with the Parse API for Arena AI Leaderboard data."""

    def __init__(self, api_key: Optional[str] = None):
        """Initialize the ParseClient with API credentials.
        
        Args:
            api_key: API key for Parse API. If not provided, reads from PARSE_API_KEY env var.
        """
        self.base_url = "https://api.parse.bot"
        self.scraper_id = "dee580cd-1286-483f-818e-02b6459a0d69"
        self.api_key = api_key or os.getenv("PARSE_API_KEY")
        
        if not self.api_key:
            raise ValueError("API key must be provided or set in PARSE_API_KEY environment variable")

    def _call(self, endpoint: str, method: str = "POST", **params) -> dict:
        """Make an API call to the Parse API.
        
        Args:
            endpoint: The endpoint name to call
            method: HTTP method (GET or POST)
            **params: Parameters to pass to the endpoint
            
        Returns:
            The JSON response from the API
            
        Raises:
            requests.exceptions.RequestException: If the API call fails
        """
        url = f"{self.base_url}/scraper/{self.scraper_id}/{endpoint}"
        headers = {
            "X-API-Key": self.api_key,
            "Content-Type": "application/json"
        }
        
        try:
            # A timeout keeps a stalled connection from hanging the caller.
            if method == "GET":
                response = requests.get(url, headers=headers, params=params, timeout=30)
            elif method == "POST":
                response = requests.post(url, headers=headers, json=params, timeout=30)
            else:
                raise ValueError(f"Unsupported HTTP method: {method}")
            
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error calling {endpoint}: {e}")
            raise

    def get_leaderboard(self, arena: str = "agent") -> dict:
        """Get the full leaderboard rankings for a specified arena.
        
        Returns all ranked models with their scores, confidence intervals, and metadata.
        The agent arena returns signal-based scores (task outcome, steerability, etc.),
        while other arenas return ELO-style ratings with vote counts.
        
        Args:
            arena: The arena/modality to fetch rankings for. Options:
                - agent (default)
                - text
                - text-to-image
                - image-edit
                - text-to-video
                - image-to-video
                - video-edit
                - code/webdev
                - code/image-to-webdev
                
        Returns:
            Dictionary containing leaderboard data with models list
        """
        return self._call("get_leaderboard", method="GET", arena=arena)


def format_model_info(model: dict) -> str:
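    """Format model information for display.
    
    Field names follow the agent-arena sample; entries from other arenas may
    lack some of them and fall back to the defaults below.
    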
    Args:
        model: Model ranking object from leaderboard
        
    Returns:
        Formatted string with model information
    """
    rank = model.get("rank", "N/A")
    name = model.get("model", "Unknown")
    org = model.get("organization", "Unknown")
    avg_score = model.get("avg_score", {})
    score_value = avg_score.get("value", 0) if isinstance(avg_score, dict) else 0
    ci = avg_score.get("ci", 0) if isinstance(avg_score, dict) else 0
    sessions = model.get("sessions", 0)
    
    return (f"#{rank} {name} (by {org}) | "
            f"Score: {score_value:.4f} ±{ci:.4f} | "
            f"Sessions: {sessions:,}")


def main():
    """Main function demonstrating practical usage of the Arena AI Leaderboard API."""
    
    # Initialize the client
    client = ParseClient()
    
    print("=" * 80)
    print("Arena AI Leaderboard Analysis")
    print("=" * 80)
    
    # Define the arenas to analyze
    arenas_to_check = ["agent", "text", "code/webdev"]
    
    # Collect leaderboard data from multiple arenas
    leaderboard_data = {}
    
    for arena in arenas_to_check:
        print(f"\nFetching leaderboard for '{arena}' arena...")
        
        try:
            response = client.get_leaderboard(arena=arena)
            
            if response.get("status") == "success" or "data" in response:
                data = response.get("data", response)
                leaderboard_data[arena] = data
                
                print(f"✓ Successfully retrieved {arena} arena data")
                print(f"  Total models: {data.get('model_count', 0)}")
                
                if arena == "agent":
                    print(f"  Last updated: {data.get('last_updated', 'N/A')}")
                    print(f"  Total sessions: {data.get('total_sessions', 0):,}")
                else:
                    print(f"  Leaderboard: {data.get('leaderboard_slug', 'N/A')}")
                    
            else:
                print(f"✗ Failed to retrieve data for {arena}")
                
        except Exception as e:
            print(f"✗ Error fetching {arena}: {e}")
            continue
    
    # Analyze and display results
    print("\n" + "=" * 80)
    print("Leaderboard Rankings Summary")
    print("=" * 80)
    
    for arena, data in leaderboard_data.items():
        models = data.get("models", [])
        
        if not models:
            print(f"\nNo models found for {arena} arena")
            continue
        
        print(f"\n📊 {arena.upper()} ARENA - Top 5 Models:")
        print("-" * 80)
        
        # Show top 5 models
        for model in models[:5]:
            print(f"  {format_model_info(model)}")
        
        # Note how many ranked models were left out of the preview
        if len(models) > 5:
            print(f"\n  ... and {len(models) - 5} more models")
        
        # For agent arena, show signal score analysis
        if arena == "agent" and models:
            top_model = models[0]
            signal_scores = top_model.get("signal_scores", {})
            
            if signal_scores:
                print(f"\n  🎯 Top Model ({top_model.get('model')}) Signal Scores:")
                for signal_name, score in signal_scores.items():
                    print(f"     - {signal_name}: {score:.4f}")
    
    # Comparative analysis
    print("\n" + "=" * 80)
    print("Comparative Analysis")
    print("=" * 80)
    
    if len(leaderboard_data) > 1:
        print("\nArena Comparison:")
        print("-" * 80)
        
        for arena, data in leaderboard_data.items():
            model_count = data.get("model_count", 0)
            top_model = data.get("models", [{}])[0] if data.get("models") else {}
            top_name = top_model.get("model", "N/A")
            
            print(f"  {arena:20} → {model_count:3} models | Top: {top_name}")
    
    print("\n" + "=" * 80)
    print("Analysis Complete")
    print("=" * 80)


if __name__ == "__main__":
    main()

All endpoints · 1 total

Get the full leaderboard rankings for a specified arena. Returns all ranked models with their scores, confidence intervals, and metadata. The agent arena returns signal-based scores (task outcome, steerability, tool hallucination, etc.), while other arenas return ELO-style ratings with vote counts.

Input

Param	Type	Description
arena	string	The arena/modality to fetch rankings for. Accepts exactly one of: agent, text, text-to-image, image-edit, text-to-video, image-to-video, video-edit, code/webdev, code/image-to-webdev.
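
Because the parameter accepts only the listed values, a caller can reject typos before spending a request. The following is a minimal sketch, not part of the published client: the ARENAS tuple and the build_leaderboard_url helper are hypothetical names, and percent-encoding the slash in code/webdev is simply what standard URL encoding does.

```python
from urllib.parse import urlencode

# Arena values copied from the parameter description above.
ARENAS = (
    "agent", "text", "text-to-image", "image-edit",
    "text-to-video", "image-to-video", "video-edit",
    "code/webdev", "code/image-to-webdev",
)

BASE_URL = (
    "https://api.parse.bot/scraper/"
    "dee580cd-1286-483f-818e-02b6459a0d69/get_leaderboard"
)


def build_leaderboard_url(arena: str) -> str:
    """Return the GET URL for one arena, rejecting unknown values early."""
    if arena not in ARENAS:
        raise ValueError(
            f"Unknown arena {arena!r}; expected one of: {', '.join(ARENAS)}"
        )
    # urlencode percent-encodes the slash in values such as code/webdev.
    return f"{BASE_URL}?{urlencode({'arena': arena})}"
```

Passing "video" or "code" raises ValueError, since only the specific slugs are accepted.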

Response

{
  "type": "object",
  "fields": {
    "arena": "string",
    "models": "array of model ranking objects",
    "model_count": "integer",
    "last_updated": "string (ISO datetime, agent arena only)",
    "total_sessions": "integer (agent arena only)",
    "leaderboard_slug": "string (non-agent arenas only)"
  },
  "sample": {
    "data": {
      "arena": "agent",
      "models": [
        {
          "rank": 1,
          "model": "GPT 5.5 (High)",
          "license": "Proprietary",
          "sessions": 27140,
          "avg_score": {
            "ci": 0.0129,
            "value": 0.0922,
            "pipelines": 5
          },
          "signal_ci": {
            "steerability": 0.0239,
            "task_outcome_explicit": 0.023
          },
          "rank_spread": {
            "max": 5,
            "min": 1
          },
          "organization": "OpenAI",
          "signal_scores": {
            "steerability": 0.0959,
            "task_outcome_explicit": 0.0613
          }
        }
      ],
      "model_count": 20,
      "last_updated": "2026-06-08T13:00:00.000Z",
      "total_sessions": 463644
    },
    "status": "success"
  }
}
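
One way to work with a response shaped like the sample is to flatten each agent-arena entry into a small typed record. This is a sketch under the assumption that live responses match the sample; the AgentModel class and parse_agent_models function are hypothetical names, not part of the Parse SDK.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AgentModel:
    rank: int
    model: str
    organization: str
    score: float
    ci: float
    sessions: int


def parse_agent_models(payload: dict[str, Any]) -> list[AgentModel]:
    """Extract agent-arena rows from a response shaped like the sample."""
    data = payload.get("data", payload)  # tolerate an unwrapped body
    rows = []
    for entry in data.get("models", []):
        avg = entry.get("avg_score") or {}
        rows.append(AgentModel(
            rank=entry["rank"],
            model=entry["model"],
            organization=entry.get("organization", "Unknown"),
            score=float(avg.get("value", 0.0)),
            ci=float(avg.get("ci", 0.0)),
            sessions=int(entry.get("sessions", 0)),
        ))
    return rows
```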

About the Arena API

What the API returns

The get_leaderboard endpoint accepts an arena parameter and returns a ranked list of AI models for that modality. The response always includes the arena name, a models array of ranking objects, and a model_count integer. The models array carries per-model scores and confidence intervals, allowing you to compare statistical separation between models rather than just raw rank order.
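
To illustrate what statistical separation can mean in practice, the sketch below treats ci as a symmetric half-width around value, which is how the sample client prints it (score ±ci). That reading is an assumption, and interval and clearly_separated are hypothetical helper names.

```python
def interval(avg_score: dict) -> tuple[float, float]:
    """Return (low, high), treating ci as a half-width around value."""
    value, ci = avg_score["value"], avg_score["ci"]
    return value - ci, value + ci


def clearly_separated(a: dict, b: dict) -> bool:
    """True when the two score intervals do not overlap."""
    low_a, high_a = interval(a)
    low_b, high_b = interval(b)
    return high_a < low_b or high_b < low_a
```

Two models whose intervals overlap may swap places between snapshots, so a rank difference between them carries less weight than one between clearly separated models.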

Agent arena vs. other arenas

The agent arena response includes fields that the other arenas do not: last_updated (ISO datetime), total_sessions (the number of evaluation sessions behind the rankings), and signal-based sub-scores such as task outcome, steerability, and tool hallucination rate. Non-agent arenas (text, the image and video arenas, and the code arenas) return ELO-style ratings with vote counts and expose a leaderboard_slug string instead. Because of these structural differences, branch on the arena field in your response handling.
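
Branching on the arena field might look like the sketch below. It reads only the top-level fields named in the response spec; the summarize function is a hypothetical helper, and data is assumed to be the object under the response's data key.

```python
def summarize(data: dict) -> str:
    """Build a one-line summary using the fields each arena type exposes."""
    arena = data.get("arena", "unknown")
    count = data.get("model_count", 0)
    if arena == "agent":
        # Agent-only fields: total_sessions and last_updated.
        sessions = data.get("total_sessions", 0)
        updated = data.get("last_updated", "unknown")
        return f"{arena}: {count} models, {sessions:,} sessions, updated {updated}"
    # Non-agent arenas expose leaderboard_slug instead.
    slug = data.get("leaderboard_slug", "unknown")
    return f"{arena}: {count} models, slug {slug}"
```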

Supported arenas

The arena input accepts exactly one value per call: agent, text, text-to-image, image-edit, text-to-video, image-to-video, video-edit, code/webdev, or code/image-to-webdev. There is no batch or multi-arena endpoint; separate calls are needed to compare rankings across modalities. The model_count field lets you quickly confirm how many models are ranked without iterating the full models array.
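
Since each call covers one arena, a cross-arena comparison is a loop of separate calls. The sketch below takes the fetch step as a parameter so it stays independent of any particular client; top_model_per_arena and fetch are hypothetical names, and reading the name from the model key of non-agent entries follows the sample client rather than a documented schema.

```python
from typing import Callable


def top_model_per_arena(
    fetch: Callable[[str], dict], arenas: list[str]
) -> dict[str, str]:
    """Make one fetch per arena and return the first-listed model of each."""
    result = {}
    for arena in arenas:
        data = fetch(arena)  # one API call, and one credit, per arena
        models = data.get("models", [])
        if models:
            result[arena] = models[0].get("model", "N/A")
    return result
```

With the sample client above, fetch could be something like lambda arena: client.get_leaderboard(arena)["data"].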

Reliability & maintenance

The Arena API is a managed, monitored endpoint for arena.ai — not a raw scraper you maintain. Every endpoint is automatically health-checked on a schedule, and when arena.ai changes and a check fails, the API is automatically queued for repair and re-verified. It is built to keep working as the site underneath it changes.

This isn't an official arena.ai API — it's an independent, maintained REST wrapper over public data. Where the source has no official API (or only a limited one), Parse gives you a stable contract over a source that never promised one, and keeps it current. Need a new endpoint or field? You can revise it yourself in plain English and the agent rebuilds it against the live site in minutes — contributing the change back to the shared API is free.

Will this API break when the source site changes?

It's built not to. Every endpoint is health-checked on a schedule with automated test probes. When the source site changes and a check fails, the API is automatically queued for repair and re-verified — that's the self-healing layer. Each API page shows when its endpoints were last verified. And because marketplace APIs are shared, any fix reaches everyone using it.

Is this an official API from the source site?

No — Parse APIs are independent, managed REST wrappers over publicly available data. That is the point: where a site has no official API (or only a limited one), Parse gives you a maintained, monitored endpoint for that data and keeps it working as the site changes — so you get a stable contract over a source that never promised one.

Can I fix or extend this API myself if I need a new endpoint or field?

Yes — and you don't have to wait on us. This API was generated by the Parse agent, which stays attached. Describe the change in plain English ("add an endpoint that returns reviews", "fix the price field") in the revise box on the API page or via the revise_api MCP tool, and the agent rebuilds it against the live site in minutes. Contributing the change back to the public API is free.

What happens if I call an endpoint that has an issue?

Errors are machine-readable: a bad call returns a clean status with the list of available endpoints and a repair hint, so an agent (or you) can recover or trigger a fix instead of failing silently. Confirmed failures feed the automatic repair queue.

Common use cases

Build a model selection dashboard that surfaces the current top-ranked agents and their task-outcome scores from the agent arena.
Track ELO-style rating changes for text models over time by polling get_leaderboard with arena=text on a schedule.
Compare image generation models by pulling text-to-image and image-edit arena rankings side by side using model_count to normalize comparisons.
Alert engineering teams when a preferred model drops below a threshold rank in any arena, using confidence interval fields to filter noise.
Populate a live leaderboard UI that shows total_sessions and last_updated for the agent arena to signal data freshness to end users.
Audit tool hallucination rates across agent models to shortlist candidates for production agentic workflows.
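
The rank-alert use case above could be sketched as follows. It assumes that rank_spread in the agent sample describes the range of plausible ranks for a model, which the source does not state outright; should_alert is a hypothetical helper.

```python
def should_alert(model: dict, threshold_rank: int) -> bool:
    """Alert only when even the best plausible rank is past the threshold."""
    spread = model.get("rank_spread") or {}
    # Fall back to the point rank when no spread is reported.
    best_plausible = spread.get("min", model["rank"])
    return best_plausible > threshold_rank
```

Comparing against the optimistic end of the spread rather than the point rank avoids alerts caused by a model moving within its own uncertainty range.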

Pricing & limits

Tier	Price	Credits/month	Rate limit
Free	$0/mo	100	5 req/min
Hobby	$30/mo	1,000	20 req/min
Developer	$100/mo	5,000	100 req/min

One credit = one API call regardless of which marketplace API you call. Exceeding the rate limit returns a 429 response. Authenticate with the X-API-Key header.
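
A caller that may hit the rate limit can retry on 429 with exponential backoff. The sketch below is one possible approach, not documented Parse behavior: call_with_backoff is a hypothetical helper, the 12 second base delay is simply one request interval at the Free tier's 5 req/min, and send is any zero-argument callable returning an object with a status_code attribute.

```python
import time
from typing import Any, Callable


def call_with_backoff(
    send: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 12.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a non-429 response or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Wait 12 s, 24 s, 48 s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return response  # still 429 after the final attempt
```

With the requests library, send could be lambda: requests.get(url, headers=headers, timeout=30).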

Frequently asked questions

Does arena.ai have an official developer API?

Arena.ai does not publish a documented public developer API for leaderboard data. This Parse API provides structured access to the leaderboard data available at arena.ai/agent and related pages.

How does the agent arena response differ from other arenas?

The agent arena returns signal-based sub-scores (task outcome, steerability, tool hallucination), a total_sessions count, and a last_updated ISO datetime. Non-agent arenas return ELO-style ratings and a leaderboard_slug field instead. Both include arena, models, and model_count.

Can I retrieve historical leaderboard rankings or track rank changes over time?

The API returns the current leaderboard snapshot only; there is no built-in historical endpoint. You can store responses over time in your own database to build a history. If you need a dedicated historical-data endpoint, you can fork this API on Parse and revise it to add one.
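
Building your own history can be as simple as appending each snapshot to a local table. The sketch below uses SQLite from the Python standard library; the snapshots table layout and the save_snapshot helper are hypothetical, and data is assumed to be the object under the response's data key.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def save_snapshot(
    conn: sqlite3.Connection, data: dict, fetched_at: Optional[str] = None
) -> int:
    """Append one row per ranked model and return the number of rows stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "fetched_at TEXT, arena TEXT, rank INTEGER, model TEXT, entry_json TEXT)"
    )
    ts = fetched_at or datetime.now(timezone.utc).isoformat()
    rows = [
        # Keep the full entry as JSON so arena-specific fields are not lost.
        (ts, data.get("arena"), m.get("rank"), m.get("model"), json.dumps(m))
        for m in data.get("models", [])
    ]
    conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Rank changes for a model can then be read back with a query ordered by fetched_at.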

Are video or code arenas available as separate arena values?

Yes. The arena parameter accepts three video values (text-to-video, image-to-video, video-edit) and two code values (code/webdev, code/image-to-webdev), alongside agent, text, text-to-image, and image-edit. There is no bare video or code value; pass one of the specific slugs. If arena.ai adds further arenas, you can fork this API on Parse and revise it to support the new slugs.

What does the models array contain for each ranked model?

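Each object in models includes the model's rank, name, and scores with confidence intervals. In the agent arena sample, an entry carries rank, model, organization, license, sessions, an avg_score object (value, ci, pipelines), a rank_spread object (min, max), and signal_scores with matching signal_ci values. Exact field names depend on the arena; agent arena entries are the most granular.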

Page content last updated June 9, 2026. Spec covers 1 endpoint from arena.ai.
