docs.opensearch.org API
Access OpenSearch documentation via API: search across versions, fetch page content, navigation trees, breaking changes, and version history with 7 endpoints.
curl -X GET 'https://api.parse.bot/scraper/3f87876e-c890-4425-b122-c258b102f74a/get_page_content?path=%2Flatest%2Fquery-dsl%2F' \
  -H "X-API-Key: $PARSE_API_KEY"
Fetches the full textual content and metadata of a documentation page including headings, code blocks, and table of contents.
| Param | Type | Description |
|---|---|---|
| path (required) | string | The URL path of the documentation page (e.g. '/latest/query-dsl/') |
{
"type": "object",
"fields": {
"toc": "array of objects with text and anchor fields",
"url": "string, full URL of the page",
"title": "string, page title from h1",
"content": "string, full text content of the main body",
"headings": "array of objects with level, text, and anchor fields",
"breadcrumbs": "array of strings representing the breadcrumb trail",
"code_blocks": "array of strings, each a code block from the page"
},
"sample": {
"data": {
"toc": [],
"url": "https://docs.opensearch.org/latest/query-dsl/",
"title": "Query DSL",
"content": "Documentation\nQuery DSL\nOpenSearch provides a search language called...",
"headings": [
{
"text": "A note on Unicode special characters in text fields",
"level": 2,
"anchor": null
},
{
"text": "Expensive queries",
"level": 2,
"anchor": null
}
],
"breadcrumbs": [],
"code_blocks": [
"GET testindex/_search\n{\n \"query\": {\n \"match_all\": { \n }\n }\n}\n"
]
},
"status": "success"
}
}
About the docs.opensearch.org API
This API exposes 7 endpoints for querying the OpenSearch documentation at docs.opensearch.org, covering full-text search, page content extraction, version listing, and navigation structure. The search_docs endpoint accepts a keyword query and an optional version parameter, returning matched pages with titles, URLs, content snippets, and ancestor breadcrumbs. The get_page_content endpoint delivers structured data including headings, table of contents, code blocks, and breadcrumbs for any documentation path.
Page Content and Search
The get_page_content endpoint takes a path parameter (e.g. /latest/query-dsl/) and returns the full body content as a string alongside structured headings (with level, text, and anchor fields), a toc array, breadcrumbs, and all code_blocks extracted from the page. This makes it straightforward to ingest documentation text for indexing, analysis, or display without parsing HTML yourself.
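As a rough illustration of the request and response described above, the following Python sketch builds the same URL as the curl example and turns the `headings` array into an indented outline. The base URL, the `path` parameter, the `X-API-Key` header, and the `data` wrapper come from this page; the helper names (`build_page_content_url`, `fetch_page_content`, `outline`) are hypothetical, and `fetch_page_content` assumes network access and a `PARSE_API_KEY` environment variable.

```python
import json
import os
import urllib.parse
import urllib.request

# Base URL copied from the curl example on this page.
BASE_URL = "https://api.parse.bot/scraper/3f87876e-c890-4425-b122-c258b102f74a"


def build_page_content_url(path: str) -> str:
    """Return the get_page_content URL with the docs path percent-encoded."""
    query = urllib.parse.urlencode({"path": path})
    return f"{BASE_URL}/get_page_content?{query}"


def fetch_page_content(path: str) -> dict:
    """Call the endpoint (hypothetical helper; needs network and PARSE_API_KEY)."""
    request = urllib.request.Request(
        build_page_content_url(path),
        headers={"X-API-Key": os.environ["PARSE_API_KEY"]},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def outline(payload: dict) -> list[str]:
    """Render the headings array as an outline, indented two spaces per level."""
    headings = payload.get("data", {}).get("headings", [])
    return ["  " * (h["level"] - 1) + h["text"] for h in headings]
```

Calling `outline(fetch_page_content("/latest/query-dsl/"))` would then give one line per heading, which can be handy for a quick look at a page before indexing its `content` or `code_blocks`.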
The search_docs endpoint accepts a query string and an optional version filter (e.g. latest, 2.19, 3.6). Results include url, type, version, versionLabel, content snippet, title, and an ancestors array showing where in the docs hierarchy the match lives. Searching without a version queries across all indexed content.
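A search request could be assembled along the following lines. This is only a sketch: it assumes that `search_docs` lives under the same base URL as `get_page_content`, that the parameters are literally named `query` and `version`, and that `ancestors` is a list of strings. None of these details are confirmed on this page, and the helper names are hypothetical.

```python
import urllib.parse
from typing import Optional

# Base URL copied from the curl example; reusing it for search_docs is an assumption.
BASE_URL = "https://api.parse.bot/scraper/3f87876e-c890-4425-b122-c258b102f74a"


def build_search_url(query: str, version: Optional[str] = None) -> str:
    """Build a search_docs URL; the version filter is omitted when not given."""
    params = {"query": query}  # parameter name assumed
    if version is not None:
        params["version"] = version  # parameter name assumed
    return f"{BASE_URL}/search_docs?{urllib.parse.urlencode(params)}"


def format_hit(hit: dict) -> str:
    """Format one result as '[version] ancestor > ... > title (url)'."""
    trail = " > ".join(list(hit.get("ancestors", [])) + [hit.get("title", "")])
    return f"[{hit.get('version', '?')}] {trail} ({hit.get('url', '')})"
```

Leaving `version` out of the request mirrors the documented behaviour of searching across all indexed content.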
Versioning and Navigation
list_versions returns three fields: latest_version, active_versions, and archived_versions. These are useful for checking which version strings the other endpoints accept. get_navigation_tree accepts an optional version and returns a recursive navigation_tree of objects with title, url, and children, mirroring the sidebar structure of the docs site.
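Both responses lend themselves to small helpers. The sketch below checks a version string against a `list_versions` payload and walks a `navigation_tree` depth-first. It assumes that `active_versions` and `archived_versions` are lists of strings and that `latest` is always accepted as an alias; the function names are hypothetical.

```python
from typing import Iterator


def is_known_version(versions: dict, candidate: str) -> bool:
    """True if candidate is 'latest' or appears in any list_versions field."""
    known = {
        versions["latest_version"],
        *versions["active_versions"],
        *versions["archived_versions"],
    }
    # Treating "latest" as always valid is an assumption based on the examples here.
    return candidate == "latest" or candidate in known


def walk(nodes: list[dict], depth: int = 0) -> Iterator[tuple[int, str, str]]:
    """Yield (depth, title, url) for every node, parents before children."""
    for node in nodes:
        yield depth, node["title"], node["url"]
        # A missing or null children field is treated as a leaf.
        yield from walk(node.get("children") or [], depth + 1)
```

Printing `"  " * depth + title` for each tuple from `walk` reproduces a sidebar-style outline; collecting only the URLs gives a flat site map.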
Release and Change Tracking
get_version_history fetches the version history table from the docs, returning an array of history objects with fields for OpenSearch version, Release highlights, Release date, and links. get_breaking_changes returns an array of breaking_changes objects, each with a version_context heading and the change text, grouped as they appear in the documentation. summarize_page extracts a concise summary from the first substantial paragraph of a page, along with a key_topics list of h2 and h3 headings — useful for building quick overviews without fetching the full page content.
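Because each `breaking_changes` entry carries its own `version_context`, regrouping the flat array by heading takes only a few lines. The sketch below assumes each entry is a dict with `version_context` and `change` string fields, as described in the FAQ; the function name and the placeholder data in the usage note are hypothetical.

```python
from collections import defaultdict


def group_changes(breaking_changes: list[dict]) -> dict[str, list[str]]:
    """Group change texts under their version_context heading, keeping order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in breaking_changes:
        grouped[entry["version_context"]].append(entry["change"])
    return dict(grouped)
```

For example, `group_changes([{"version_context": "Heading A", "change": "first"}, {"version_context": "Heading A", "change": "second"}])` returns `{"Heading A": ["first", "second"]}`, which is a convenient shape for rendering a changelog feed.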
- Build a documentation search interface that queries search_docs with version filtering to scope results to a specific OpenSearch release.
- Populate a changelog feed by polling get_version_history and get_breaking_changes to surface new releases and incompatibilities.
- Generate a documentation site map or sidebar by traversing the recursive navigation_tree returned by get_navigation_tree.
- Extract and index all code samples from documentation pages using the code_blocks array from get_page_content.
- Build a version-aware upgrade assistant by cross-referencing breaking_changes entries with the target version from list_versions.
- Create a documentation digest tool that uses summarize_page to generate short overviews of key_topics without fetching full page content.
- Validate that a documentation path exists and retrieve its breadcrumb hierarchy before linking to it in an external tool.
| Tier | Price | Credits/month | Rate limit |
|---|---|---|---|
| Free | $0/mo | 100 | 5 req/min |
| Hobby | $30/mo | 1,000 | 20 req/min |
| Developer | $100/mo | 5,000 | 250 req/min |
One credit = one API call regardless of which marketplace API you call. Exceeding the rate limit returns a 429 response. Authenticate with the X-API-Key header.
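One way a client might stay within these limits is to space requests by tier and back off when a 429 arrives. The sketch below is an assumption-laden illustration: the per-minute figures come from the table above, while the exponential backoff schedule, the helper names, and the `send` callable (anything that returns a status code and a parsed body) are hypothetical choices, not behaviour documented by the API.

```python
import time
from typing import Callable

# Requests per minute for each tier, taken from the pricing table.
RATE_LIMITS = {"Free": 5, "Hobby": 20, "Developer": 250}


def min_interval_seconds(tier: str) -> float:
    """Smallest even spacing between calls that stays within the tier's limit."""
    return 60.0 / RATE_LIMITS[tier]


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Call send(); on HTTP 429 wait base_delay * 2**attempt and try again."""
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        sleep(base_delay * 2 ** attempt)
        status, body = send()
    return status, body
```

Note that, taking the one-credit-per-call rule at face value, each retry that reaches the API may also consume a credit, so a small `max_attempts` is probably the safer default on the lower tiers.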
Does OpenSearch have an official developer API for its documentation?
What does `get_breaking_changes` return, and how are results organized?
get_breaking_changes returns a breaking_changes array where each object contains a version_context string (the heading grouping the changes) and a change string describing the individual breaking change. The version input parameter lets you scope results to a specific documentation version such as latest, 3.6, or 2.19.
Does the API cover plugin-specific or security documentation pages, or only core docs?
get_page_content or summarize_page, and search_docs queries across all indexed sections. Coverage depends on what is published on the docs site.
Can I get paginated search results or retrieve a result count from `search_docs`?
search_docs returns a results array but does not currently expose pagination parameters, total hit counts, or offset controls. The API covers keyword search with optional version filtering. You can fork it on Parse and revise it to add pagination or result-count fields.
Does the API expose historical archived docs content beyond what `list_versions` returns in `archived_versions`?
list_versions identifies which version strings are archived, and other endpoints like get_navigation_tree, get_page_content, and search_docs accept those version strings as inputs. Archived versions are accessible where the underlying docs site retains them. Content availability for very old archived versions depends on what remains published.