Data source: arXiv.org
Overview
The arXiv data source wraps arXiv.org and handles authentication, pagination, and rate limits for you. This tutorial covers all 6 tools with working code examples you can copy and run.
Prerequisites
- Sign up at https://context.gnist.ai/signup for a free API key (100 calls/day).
- Choose your integration method: MCP protocol or REST API.
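Because the free tier allows 100 calls per day, it can help to track usage on the client before a request is sent. The following is a minimal sketch, not part of the Gnist API: `DailyCallBudget` is a hypothetical helper, and it assumes the allowance resets with the local calendar date, which this tutorial does not confirm.

```python
from datetime import date
from typing import Optional


class DailyCallBudget:
    """Client-side counter for a per-day call allowance (hypothetical helper)."""

    def __init__(self, limit: int = 100) -> None:
        self.limit = limit
        self._day: Optional[date] = None
        self._used = 0

    def try_consume(self, today: Optional[date] = None) -> bool:
        """Record one call; return False once today's allowance is used up."""
        today = today or date.today()
        if today != self._day:
            # First call on a new day: start counting from zero again.
            # Assumption: the server-side reset follows the local date.
            self._day = today
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


budget = DailyCallBudget(limit=100)
if budget.try_consume():
    print("OK to send the next request")
```

Call `try_consume()` once before each request and skip the request when it returns False.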
Connect via MCP
Add to your MCP client config (Claude Desktop, Cursor, etc.):
{
  "mcpServers": {
    "gnist-arxiv": {
      "url": "https://context.gnist.ai/mcp/arxiv/",
      "headers": {
        "Gnist-API-Key": "YOUR_API_KEY"
      }
    }
  }
}
Tools (6)
search_arxiv_papers
Search scientific papers on arXiv by keyword, category, and date range.
Returns: a dictionary with 'count' and a 'papers' list. Each paper has arxiv_id, title, authors, abstract, categories, primary_category, published, updated, pdf_url, and doi.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | required | Search terms (e.g. "attention mechanism transformer", "CRISPR gene editing"). Searches across title, abstract, and author fields. |
| category | any | optional | arXiv subject category code (e.g. "cs.AI", "q-fin.TR"). |
| date_from | any | optional | Only return papers submitted from this date onward (YYYY-MM-DD). |
| date_to | any | optional | Only return papers submitted up to this date (YYYY-MM-DD). |
| max_results | integer | optional | Number of results to return (1–25, default 10). |
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "search_arxiv_papers", "arguments": {"query": "attention mechanism transformer"}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "search_arxiv_papers",
            "arguments": {"query": "attention mechanism transformer"},
        },
    },
)
print(resp.json())
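The request above passes only `query`. The sketch below shows one way to build the same JSON-RPC body with the optional filters from the parameter table. `build_search_request` is a hypothetical helper, not part of any Gnist client library, and the local checks simply mirror the documented constraints (1–25 results, YYYY-MM-DD dates); the server may validate differently.

```python
import re
from datetime import date
from typing import Any, Optional


def build_search_request(
    query: str,
    category: Optional[str] = None,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    max_results: int = 10,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a tools/call body for search_arxiv_papers (hypothetical helper)."""
    if not 1 <= max_results <= 25:
        raise ValueError("max_results must be between 1 and 25")

    arguments: dict[str, Any] = {"query": query, "max_results": max_results}

    for key, value in (("date_from", date_from), ("date_to", date_to)):
        if value is None:
            continue
        # Require the literal YYYY-MM-DD shape, then check it is a real date.
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError(f"{key} must use YYYY-MM-DD format")
        date.fromisoformat(value)
        arguments[key] = value

    if category is not None:
        arguments["category"] = category

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_arxiv_papers", "arguments": arguments},
    }


payload = build_search_request(
    "attention mechanism transformer",
    category="cs.AI",
    date_from="2024-01-01",
    max_results=5,
)
```

The resulting `payload` can be passed as the `json=` argument of the `httpx.post` call shown above.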
get_arxiv_paper
Get full details for a specific arXiv paper by ID.
Returns: the full paper record with arxiv_id, title, authors, abstract, categories, primary_category, published, updated, pdf_url, and doi.
| Parameter | Type | Required | Description |
|---|---|---|---|
| arxiv_id | string | required | arXiv ID (e.g. "2106.09685", "2106.09685v1", or "http://arxiv.org/abs/2106.09685"). |
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_paper", "arguments": {"arxiv_id": "2106.09685"}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "get_arxiv_paper",
            "arguments": {"arxiv_id": "2106.09685"},
        },
    },
)
print(resp.json())
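The examples print the raw JSON-RPC response, but this tutorial does not show its shape. The sketch below assumes the server follows the general MCP convention, where a tool result arrives under `result.content` as a list of items and text items carry the tool's output as a JSON string. That layout is an assumption about this endpoint, `extract_tool_result` is a hypothetical helper, and the sample response is made up for illustration.

```python
import json
from typing import Any


def extract_tool_result(response: dict[str, Any]) -> Any:
    """Pull the tool output out of a JSON-RPC response (assumed MCP layout)."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")

    result = response["result"]
    if result.get("isError"):
        raise RuntimeError(f"Tool reported an error: {result.get('content')}")

    for item in result.get("content", []):
        if item.get("type") == "text":
            try:
                # Assumption: the text item holds the tool output as JSON.
                return json.loads(item["text"])
            except json.JSONDecodeError:
                # Fall back to the raw string if it is not JSON.
                return item["text"]
    return None


# Made-up response, shaped the way the assumption above expects.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {
                "type": "text",
                "text": json.dumps({"arxiv_id": "2106.09685", "title": "placeholder"}),
            }
        ]
    },
}

paper = extract_tool_result(sample_response)
print(paper["arxiv_id"])
```

If the real response differs, inspect `resp.json()` once and adjust the lookup path accordingly.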
get_arxiv_author_papers
Search papers by a specific arXiv author, sorted by submission date (newest first).
Returns: a dictionary with 'count' and a 'papers' list.
| Parameter | Type | Required | Description |
|---|---|---|---|
| author_name | string | required | Author's name as it appears on arXiv (e.g. "Yann LeCun", "Hinton"). Partial last names work; full names improve precision. |
| max_results | integer | optional | Number of results to return (1–25, default 10). |
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_author_papers", "arguments": {"author_name": "Yann LeCun"}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "get_arxiv_author_papers",
            "arguments": {"author_name": "Yann LeCun"},
        },
    },
)
print(resp.json())
get_arxiv_recent
Get the most recently submitted papers in an arXiv subject category.
Returns: a dictionary with 'count' and a 'papers' list.
| Parameter | Type | Required | Description |
|---|---|---|---|
| category | string | required | arXiv category code (e.g. "cs.LG", "q-fin.TR"). Use list_arxiv_categories() to see all supported codes. |
| max_results | integer | optional | Number of results to return (1–25, default 10). |
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_recent", "arguments": {"category": "cs.LG"}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "get_arxiv_recent",
            "arguments": {"category": "cs.LG"},
        },
    },
)
print(resp.json())
list_arxiv_categories
Return all supported arXiv subject categories with descriptions.
Returns: a dict mapping each category code (e.g. "cs.AI") to its description. Use these codes in search_arxiv_papers(category=...) and get_arxiv_recent(category=...).
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "list_arxiv_categories", "arguments": {}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "list_arxiv_categories", "arguments": {}},
    },
)
print(resp.json())
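Since the category mapping is meant to feed search_arxiv_papers and get_arxiv_recent, a small local check can catch a mistyped code before it costs an API call. This is a sketch under assumptions: `check_category` is a hypothetical helper, and `sample_categories` is a made-up three-entry stand-in for the parsed list_arxiv_categories result, not the real output.

```python
import difflib
from typing import Mapping


def check_category(code: str, categories: Mapping[str, str]) -> str:
    """Return code unchanged if it is a known category, else raise ValueError."""
    if code in categories:
        return code
    # Offer up to three similar codes to make typos easier to spot.
    hints = difflib.get_close_matches(code, list(categories), n=3)
    message = f"Unknown arXiv category {code!r}"
    if hints:
        message += f"; close matches: {', '.join(hints)}"
    raise ValueError(message)


# Stand-in for the parsed list_arxiv_categories result (made-up subset).
sample_categories = {
    "cs.AI": "Artificial Intelligence",
    "cs.LG": "Machine Learning",
    "q-fin.TR": "Trading and Market Microstructure",
}

# Arguments for a get_arxiv_recent call, built only if the code is known.
arguments = {
    "category": check_category("cs.LG", sample_categories),
    "max_results": 5,
}
```

Fetching the category list once and reusing it locally also keeps lookups from counting against the daily call allowance.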
report_feedback
Report a bug, feature request, or general feedback for this data source. Use this when something doesn't work as expected, when you'd like a new feature, or when you have suggestions for improvement.
| Parameter | Type | Required | Description |
|---|---|---|---|
| feedback | string | required | Describe the issue or suggestion. |
| feedback_type | string | optional | One of 'bug', 'feature_request', or 'general' (default: general). |
curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "report_feedback", "arguments": {"feedback": "example"}}}'

import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "report_feedback",
            "arguments": {"feedback": "example"},
        },
    },
)
print(resp.json())
Common Patterns
- Use search_arxiv_papers to find papers, then get_arxiv_paper to fetch the full record for a specific result. This two-step pattern is useful for exploring before drilling down.
- The search, author, and recent-papers tools accept max_results (1–25, default 10). Start with small values during development and raise them only when you need more results per call.
- Use date_from and date_to in search_arxiv_papers to narrow results to a specific time window. Dates use YYYY-MM-DD format.
FAQ
What data does arXiv provide?
It lets you search and retrieve academic preprints across physics, math, CS, and more. It exposes 6 tools: search_arxiv_papers, get_arxiv_paper, get_arxiv_author_papers, get_arxiv_recent, list_arxiv_categories, and report_feedback.
What do I need to get started?
A Gnist API key (free tier: 100 calls/day). Sign up at https://context.gnist.ai/signup.
What format does the arXiv API return?
JSON, via either MCP (JSON-RPC 2.0) or the REST API.