semanticscholar.org API
www.semanticscholar.org
Search millions of academic papers on Semantic Scholar and retrieve full metadata: abstracts, citation counts, references, authors, and TLDR summaries.
curl -X GET 'https://api.parse.bot/scraper/3592e139-8a42-48f1-8833-43555acf6680/search_papers?page=1&sort=relevance&query=transformer+attention&page_size=5' \
  -H "X-API-Key: $PARSE_API_KEY"
Typed Python client. Install the CLI, sign in, then pull this API’s generated client:
pip install parse-sdk
parse login
parse add --marketplace semanticscholar-org-api
parse add --marketplace pulls a pinned snapshot of this canonical API — it won’t change underneath you. To customize it, subscribe and swap to your own copy.
"""Walkthrough: Semantic Scholar SDK: search papers, drill into details."""
from parse_apis.Semantic_Scholar_API import SemanticScholar, Sort, PaperNotFound

client = SemanticScholar()

# Search papers sorted by citation count; limit caps total items fetched.
for paper in client.paper_summaries.search(query="transformer attention", sort=Sort.CITATION_COUNT, limit=5):
    print(paper.title, paper.citation_count)

# Drill down: take one result and get full details via .details()
summary = client.paper_summaries.search(query="neural networks", sort=Sort.RELEVANCE, limit=1).first()
if summary:
    full = summary.details()
    print(full.title, full.abstract[:100] if full.abstract else "No abstract")
    print(f"  References: {full.reference_count}, Citations: {full.citation_count}")

    # Direct paper lookup by known ID (kept inside the guard so it only runs when a summary exists)
    paper = client.papers.get(paper_id=summary.paper_id)
    print(paper.title, paper.year, paper.venue)
    for author in paper.authors[:3]:
        print(f"  Author: {author.name}")

# Typed error handling for a missing paper
try:
    client.papers.get(paper_id="does_not_exist")
except PaperNotFound as exc:
    print(f"Paper not found: {exc.paper_id}")

print("exercised: paper_summaries.search / summary.details / papers.get / PaperNotFound")
Search for academic papers by keyword query. Returns paginated results with each paper's title, authors, year, and citation count. Supports sorting by relevance (default, ranked by search relevance score) or by citation count descending. The total number of matching papers is included in the response. When sorted by relevance, uses the standard search API (subject to rate limits handled via proxy rotation). When sorted by citation count, uses the bulk search API for reliable results.
| Param | Type | Description |
|---|---|---|
| page | integer | Page number for pagination (1-based). |
| sort | string | Sort order for results. |
| query (required) | string | Search query keywords to match against paper titles and content. |
| page_size | integer | Number of results per page (1-100). |
{
"type": "object",
"fields": {
"page": "integer",
"papers": "array of paper summaries with paper_id, title, authors, year, citation_count",
"page_size": "integer",
"total_results": "integer"
},
"sample": {
"data": {
"page": 1,
"papers": [
{
"year": 2017,
"title": "Attention is All you Need",
"authors": [
{
"name": "John Doe",
"author_id": "40348417"
},
{
"name": "Jane Doe",
"author_id": "1846258"
}
],
"paper_id": "204e3073870fae3d05bcbc2f6a8e263d9b72e776",
"citation_count": 180883
}
],
"page_size": 3,
"total_results": 55272
},
"status": "success"
}
}
About the semanticscholar.org API
This API provides 2 endpoints for querying Semantic Scholar's academic paper index and retrieving detailed paper metadata. Use search_papers to run keyword searches across paper titles and content with pagination and sort controls, or call get_paper with a Semantic Scholar paper ID to fetch the full record including abstract, citation count, reference count, venue, fields of study, and an AI-generated TLDR summary when available.
Search Academic Papers
The search_papers endpoint accepts a required query string and returns a paginated list of matching papers. Each result includes a paper_id (the 40-character hex identifier used across Semantic Scholar), title, authors array, year, and citation_count. You can control pagination with page (1-based) and page_size (1–100), and sort results either by relevance (default) or by citation count descending via the sort parameter. The response also includes total_results so you can calculate how many pages exist.
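The request shape and the page arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the official client: the base URL is copied from the curl example on this page, the helper names `build_search_url` and `total_pages` are made up for the example, and only the `relevance` sort value is used because it is the only one shown verbatim here.

```python
import math
from urllib.parse import urlencode

# Base URL copied from the curl example on this page.
BASE_URL = "https://api.parse.bot/scraper/3592e139-8a42-48f1-8833-43555acf6680"


def build_search_url(query, page=1, page_size=10, sort="relevance"):
    """Return the search_papers URL for one page of results (hypothetical helper)."""
    if page < 1:
        raise ValueError("page is 1-based")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    params = {"query": query, "page": page, "page_size": page_size, "sort": sort}
    # urlencode escapes spaces as '+', matching the curl example.
    return f"{BASE_URL}/search_papers?{urlencode(params)}"


def total_pages(total_results, page_size):
    """Number of pages needed to cover total_results items (hypothetical helper)."""
    return math.ceil(total_results / page_size)


if __name__ == "__main__":
    print(build_search_url("transformer attention", page_size=5))
    # total_results and page_size taken from the sample response on this page.
    print(total_pages(55272, 3))
```

Validating `page` and `page_size` locally avoids spending a credit on a request that is outside the documented ranges.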
Retrieve Full Paper Details
The get_paper endpoint takes a paper_id from search results and returns the complete record for that paper. Fields include abstract, reference_count, venue, publication_date, fields_of_study (an array of discipline strings such as "Computer Science" or "Biology"), and tldr — a short machine-generated summary that appears when Semantic Scholar has generated one for the paper. Authors are returned as objects with both name and author_id fields.
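As a rough sketch of consuming a get_paper record, the snippet below builds the request URL and formats a one-line description. Two assumptions are worth flagging: the `/get_paper?paper_id=...` path is inferred from the search_papers curl example and is not confirmed on this page, and the record passed to `describe_paper` is an invented illustration rather than real API output. Both helper names are hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the curl example on this page.
BASE_URL = "https://api.parse.bot/scraper/3592e139-8a42-48f1-8833-43555acf6680"


def build_get_paper_url(paper_id):
    """ASSUMPTION: get_paper follows the same URL pattern as search_papers."""
    return f"{BASE_URL}/get_paper?{urlencode({'paper_id': paper_id})}"


def describe_paper(record):
    """Format a paper record, preferring tldr, then abstract, then a fallback."""
    summary = record.get("tldr") or record.get("abstract") or "No summary available"
    fields = ", ".join(record.get("fields_of_study") or []) or "unknown field"
    return f"{record.get('title', 'Untitled')} [{fields}]: {summary}"


if __name__ == "__main__":
    # Illustrative record only; the field values are not real API output.
    example = {
        "title": "Attention is All you Need",
        "abstract": None,
        "tldr": None,
        "fields_of_study": ["Computer Science"],
    }
    print(build_get_paper_url("204e3073870fae3d05bcbc2f6a8e263d9b72e776"))
    print(describe_paper(example))
```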
Data Coverage and Limitations
Semantic Scholar indexes over 200 million academic papers across disciplines. The abstract and tldr fields can be null when a paper record lacks that data. Sorting in search_papers supports two modes: relevance ranking and citation count descending — there is no date-based sort option currently exposed. Paper IDs are stable Semantic Scholar identifiers and can be used directly in get_paper without any transformation.
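Because no date-based sort is exposed, one workaround is to order results on the client using the `year` field that every search result carries. The sketch below assumes `year` may occasionally be missing or null (an assumption, not something this page states) and pushes such papers to the end. The helper name `sort_by_year` is hypothetical.

```python
def sort_by_year(papers, newest_first=True):
    """Order paper summaries by year; papers without a year go last."""
    dated = [p for p in papers if p.get("year") is not None]
    undated = [p for p in papers if p.get("year") is None]
    dated.sort(key=lambda p: p["year"], reverse=newest_first)
    return dated + undated


if __name__ == "__main__":
    # Illustrative records only.
    page = [
        {"title": "A", "year": 2017},
        {"title": "B", "year": None},
        {"title": "C", "year": 2021},
    ]
    print([p["title"] for p in sort_by_year(page)])
```

This only reorders the pages you have already fetched; it does not change which papers the API returns for a given page.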
The semanticscholar.org API is a managed, monitored endpoint for www.semanticscholar.org — not a raw scraper you maintain. Every endpoint is automatically health-checked on a schedule, and when www.semanticscholar.org changes and a check fails, the API is automatically queued for repair and re-verified. It is built to keep working as the site underneath it changes.
This isn't an official www.semanticscholar.org API — it's an independent, maintained REST wrapper over public data. Where the source has no official API (or only a limited one), Parse gives you a stable contract over a source that never promised one, and keeps it current. Need a new endpoint or field? You can revise it yourself in plain English and the agent rebuilds it against the live site in minutes — contributing the change back to the shared API is free.
- Build a literature review tool that ranks papers by `citation_count` to surface the most-cited work on a topic
- Populate a citation management app using `abstract`, `authors`, `venue`, and `year` returned by `get_paper`
- Filter recent research by `fields_of_study` to narrow domain-specific paper discovery pipelines
- Display AI-generated `tldr` summaries to give readers a one-sentence overview before reading a full paper
- Track a paper's `reference_count` and `citation_count` to analyze its influence and depth of prior work
- Power a research alert system that searches `query` terms daily and surfaces newly indexed matching papers
- Aggregate `author_id` values from search results to build a co-authorship or collaboration graph
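The co-authorship idea in the last item can be sketched as a weighted edge count over the `authors` arrays returned by search. This is one possible approach under simple assumptions: each author object carries an `author_id`, and an edge weight is the number of papers two authors share. The function name `coauthor_edges` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count how many papers each pair of author_ids appears on together."""
    edges = Counter()
    for paper in papers:
        # Deduplicate and sort so each pair has one canonical ordering.
        ids = sorted({a["author_id"] for a in paper.get("authors", []) if a.get("author_id")})
        for pair in combinations(ids, 2):
            edges[pair] += 1
    return edges


if __name__ == "__main__":
    # Author IDs taken from the sample response on this page.
    papers = [
        {"authors": [{"name": "John Doe", "author_id": "40348417"},
                     {"name": "Jane Doe", "author_id": "1846258"}]},
    ]
    for (a, b), weight in coauthor_edges(papers).items():
        print(a, b, weight)
```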
| Tier | Price | Credits/month | Rate limit |
|---|---|---|---|
| Free | $0/mo | 100 | 5 req/min |
| Hobby | $30/mo | 1,000 | 20 req/min |
| Developer | $100/mo | 5,000 | 250 req/min |
One credit = one API call regardless of which marketplace API you call. Exceeding the rate limit returns a 429 response. Authenticate with the X-API-Key header.
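A client that may hit the rate limit can retry on 429 with exponential backoff. The sketch below uses only the Python standard library. `fetch_json` sends the `X-API-Key` header as described above, and `call_with_retry` is a hypothetical wrapper whose retry count and delays are arbitrary choices, not values recommended by Parse. It assumes the API key is available in the `PARSE_API_KEY` environment variable, as in the curl example.

```python
import json
import os
import time
import urllib.error
import urllib.request


def fetch_json(url, api_key):
    """GET a URL with the X-API-Key header and decode the JSON body."""
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def call_with_retry(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on HTTP 429, wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate-limit error, or the final failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    url = ("https://api.parse.bot/scraper/3592e139-8a42-48f1-8833-43555acf6680/"
           "search_papers?page=1&sort=relevance&query=transformer+attention&page_size=5")
    key = os.environ["PARSE_API_KEY"]
    print(call_with_retry(lambda: fetch_json(url, key)))
```

Whether a request rejected with 429 still consumes a credit is not stated here, so spacing calls out in advance (for example, at most one every 12 seconds on the Free tier's 5 req/min) may be preferable to relying on retries alone.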
What does `get_paper` return that `search_papers` does not?
get_paper adds abstract, reference_count, venue, publication_date, fields_of_study, and the tldr summary string. The search_papers endpoint returns only paper_id, title, authors, year, and citation_count per result, which is enough to browse and select but not enough for full metadata display without a follow-up call to get_paper.
Is there a way to sort search results by publication date?
The sort parameter currently supports relevance (default) and citation count descending. Date-based sorting is not exposed. Each paper result does include a year field, so you can sort client-side on that value. You can also fork the API on Parse and revise it to add a date sort option if the underlying data supports it.
How large can a single search response be, and how does pagination work?
page_size accepts values from 1 to 100. The response includes total_results so you can compute the number of pages. Pagination is 1-based via the page parameter. For very broad queries, total_results may be large but only the requested page slice is returned.
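Walking through pages can be sketched as a generator that stops once `total_results` has been covered. The example keeps the network call out of the loop: `fetch_page` is a hypothetical callable you supply, assumed to return the response's inner `data` object (with `papers` and `total_results`) for a given page and page size. The `max_items` cap is an addition for this sketch, useful because every page costs one credit.

```python
def iter_papers(fetch_page, page_size=100, max_items=None):
    """Yield paper summaries page by page until results or max_items run out."""
    if max_items is not None and max_items <= 0:
        return
    page, seen = 1, 0
    while True:
        data = fetch_page(page, page_size)  # one API call, one credit
        papers = data["papers"]
        for paper in papers:
            yield paper
            seen += 1
            if max_items is not None and seen >= max_items:
                return
        # Stop on an empty page or once the last page has been read.
        if not papers or page * page_size >= data["total_results"]:
            return
        page += 1


if __name__ == "__main__":
    # Stand-in for a real request, serving five fake papers.
    items = [{"paper_id": str(i)} for i in range(5)]

    def fake_fetch(page, page_size):
        start = (page - 1) * page_size
        return {"papers": items[start:start + page_size], "total_results": len(items)}

    print([p["paper_id"] for p in iter_papers(fake_fetch, page_size=2)])
```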