Getting Started with arXiv

Search and retrieve academic preprints across physics, math, CS, and more.

Science

Data source: arXiv.org

Overview

The arXiv data source wraps arXiv.org, handling authentication, pagination, and rate limits for you. This tutorial covers all 6 tools, each with a curl and a Python example you can copy and run.

Prerequisites

  1. Sign up at https://context.gnist.ai/signup for a free API key (100 calls/day).
  2. Choose your integration method: MCP or the REST API.

Connect via MCP

Add to your MCP client config (Claude Desktop, Cursor, etc.):

MCP Config
{
  "mcpServers": {
    "gnist-arxiv": {
      "url": "https://context.gnist.ai/mcp/arxiv/",
      "headers": {
        "Gnist-API-Key": "YOUR_API_KEY"
      }
    }
  }
}
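
If you would rather not paste the key into a config file by hand, the same structure can be generated from an environment variable. The following is a minimal sketch: the variable name GNIST_API_KEY and the helper build_mcp_config are assumptions made for this example, not names defined by Gnist.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Return the MCP client config shown above, with the given key filled in."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "mcpServers": {
            "gnist-arxiv": {
                "url": "https://context.gnist.ai/mcp/arxiv/",
                "headers": {"Gnist-API-Key": api_key},
            }
        }
    }


if __name__ == "__main__":
    # GNIST_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("GNIST_API_KEY", "YOUR_API_KEY")
    print(json.dumps(build_mcp_config(key), indent=2))
```

The printed JSON can be merged into your MCP client's config file.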

Tools (6)

search_arxiv_papers

Search scientific papers on arXiv by keyword, category, and date range.

Parameter | Type | Required | Description
query | string | required | Search terms (e.g. "attention mechanism transformer", "CRISPR gene editing"). Searches across title, abstract, and author fields.
category | any | optional | arXiv subject category code (e.g. "cs.AI", "q-fin.TR").
date_from | any | optional | Only return papers submitted from this date onward (YYYY-MM-DD).
date_to | any | optional | Only return papers submitted up to this date (YYYY-MM-DD).
max_results | integer | optional | Number of results to return (1–25, default 10).

Returns: a dictionary with 'count' and a 'papers' list. Each paper has arxiv_id, title, authors, abstract, categories, primary_category, published, updated, pdf_url, doi.

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "search_arxiv_papers", "arguments": {"query": "attention mechanism transformer"}}}'

import httpx

# JSON-RPC 2.0 request that invokes the search_arxiv_papers tool
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {
        "name": "search_arxiv_papers",
        "arguments": {"query": "attention mechanism transformer"},
    },
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())
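
The example above sends only the required query. As a rough sketch of how the optional filters could be combined, the helper below builds the "arguments" object and checks the documented constraints (max_results between 1 and 25, dates as YYYY-MM-DD) before anything is sent. The function name build_search_arguments is hypothetical, and the client-side checks are an addition of this sketch rather than part of the API.

```python
from datetime import date
from typing import Optional


def build_search_arguments(
    query: str,
    category: Optional[str] = None,
    date_from: Optional[date] = None,
    date_to: Optional[date] = None,
    max_results: int = 10,
) -> dict:
    """Build the "arguments" object for a search_arxiv_papers call."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not 1 <= max_results <= 25:
        raise ValueError("max_results must be between 1 and 25")
    if date_from and date_to and date_from > date_to:
        raise ValueError("date_from must not be later than date_to")

    arguments = {"query": query, "max_results": max_results}
    # Optional filters are left out entirely when unset.
    if category is not None:
        arguments["category"] = category
    if date_from is not None:
        arguments["date_from"] = date_from.isoformat()  # YYYY-MM-DD
    if date_to is not None:
        arguments["date_to"] = date_to.isoformat()  # YYYY-MM-DD
    return arguments


args = build_search_arguments(
    "attention mechanism transformer",
    category="cs.AI",
    date_from=date(2024, 1, 1),
    date_to=date(2024, 6, 30),
    max_results=5,
)
print(args)
```

The resulting dictionary goes under params.arguments in the JSON-RPC request.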

get_arxiv_paper

Get full details for a specific arXiv paper by ID.

Parameter | Type | Required | Description
arxiv_id | string | required | arXiv ID (e.g. "2106.09685", "2106.09685v1", or "http://arxiv.org/abs/2106.09685").

Returns: the full paper record with arxiv_id, title, authors, abstract, categories, primary_category, published, updated, pdf_url, doi.

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_paper", "arguments": {"arxiv_id": "2106.09685"}}}'
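
import httpx

# JSON-RPC 2.0 request that invokes the get_arxiv_paper tool
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {
        "name": "get_arxiv_paper",
        "arguments": {"arxiv_id": "2106.09685"},
    },
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())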
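
The tool accepts bare IDs, versioned IDs, and abs URLs, so the same paper can reach your code in several spellings. If you cache or deduplicate results, it may help to reduce them to one form first. The sketch below does that on the client side; normalize_arxiv_id is a hypothetical helper, and the patterns cover the common new-style (2106.09685) and old-style (hep-th/9901001) identifiers only.

```python
import re

# New-style IDs: YYMM.NNNN or YYMM.NNNNN
_NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}$")
# Old-style IDs: archive[.SC]/YYMMNNN, e.g. hep-th/9901001 or math.GT/0309136
_OLD_STYLE = re.compile(r"^[a-z\-]+(\.[A-Z]{2})?/\d{7}$")


def normalize_arxiv_id(raw: str) -> str:
    """Reduce an ID, versioned ID, or arxiv.org URL to a bare arXiv ID."""
    s = raw.strip()
    # Drop a URL prefix such as http://arxiv.org/abs/ or https://arxiv.org/pdf/
    s = re.sub(r"^https?://(www\.)?arxiv\.org/(abs|pdf)/", "", s)
    # Drop a trailing version suffix such as v1
    s = re.sub(r"v\d+$", "", s)
    if not (_NEW_STYLE.match(s) or _OLD_STYLE.match(s)):
        raise ValueError(f"not a recognised arXiv ID: {raw!r}")
    return s


for example in ("2106.09685", "2106.09685v1", "http://arxiv.org/abs/2106.09685"):
    print(example, "->", normalize_arxiv_id(example))
```

All three inputs from the parameter description reduce to 2106.09685.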

get_arxiv_author_papers

Search papers by a specific arXiv author, sorted by submission date (newest first).

Parameter | Type | Required | Description
author_name | string | required | Author's name as it appears on arXiv (e.g. "Yann LeCun", "Hinton"). Partial last names work; full names improve precision.
max_results | integer | optional | Number of results to return (1–25, default 10).

Returns: a dictionary with 'count' and a 'papers' list.

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_author_papers", "arguments": {"author_name": "Yann LeCun"}}}'
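
import httpx

# JSON-RPC 2.0 request that invokes the get_arxiv_author_papers tool
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {
        "name": "get_arxiv_author_papers",
        "arguments": {"author_name": "Yann LeCun"},
    },
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())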
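
The examples print the whole JSON-RPC envelope. To get at the 'count' and 'papers' dictionary itself, the envelope has to be unwrapped. The sketch below assumes the server follows the standard MCP tools/call result shape (a "content" list whose text item holds the tool output as a JSON string, plus an optional "isError" flag); the page does not show a sample response, so treat that shape as an assumption and check it against a real reply. The sample envelope uses a placeholder ID, not a real paper.

```python
import json


def extract_tool_result(response: dict) -> dict:
    """Pull the tool's dictionary out of a JSON-RPC tools/call response."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    # Assumption: the first text content item holds the JSON-encoded payload.
    for item in result.get("content", []):
        if item.get("type") == "text":
            return json.loads(item["text"])
    raise ValueError("no text content in tool result")


# Hand-written envelope with placeholder data, for illustration only.
sample = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": '{"count": 1, "papers": [{"arxiv_id": "0000.00000"}]}'}
        ],
        "isError": False,
    },
}
data = extract_tool_result(sample)
print(data["count"], [p["arxiv_id"] for p in data["papers"]])
```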

get_arxiv_recent

Get the most recently submitted papers in an arXiv subject category.

Parameter | Type | Required | Description
category | string | required | arXiv category code (e.g. "cs.LG", "q-fin.TR"). Use list_arxiv_categories() to see all supported codes.
max_results | integer | optional | Number of results to return (1–25, default 10).

Returns: a dictionary with 'count' and a 'papers' list.

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_arxiv_recent", "arguments": {"category": "cs.LG"}}}'
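
import httpx

# JSON-RPC 2.0 request that invokes the get_arxiv_recent tool
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {
        "name": "get_arxiv_recent",
        "arguments": {"category": "cs.LG"},
    },
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())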
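
A common use of get_arxiv_recent is polling a category and acting only on papers not seen before. The sketch below shows one way to track that between polls, keyed on the arxiv_id field of each paper. select_new_papers is a hypothetical helper, and the IDs are placeholders rather than real results.

```python
def select_new_papers(papers: list[dict], seen_ids: set[str]) -> list[dict]:
    """Return papers whose arxiv_id has not been seen, and record them as seen."""
    fresh = []
    for paper in papers:
        paper_id = paper["arxiv_id"]
        if paper_id not in seen_ids:
            seen_ids.add(paper_id)
            fresh.append(paper)
    return fresh


seen: set[str] = set()

# Two consecutive polls with placeholder IDs; the second overlaps the first.
first_poll = [{"arxiv_id": "0000.00001"}, {"arxiv_id": "0000.00002"}]
second_poll = [{"arxiv_id": "0000.00002"}, {"arxiv_id": "0000.00003"}]

print([p["arxiv_id"] for p in select_new_papers(first_poll, seen)])
print([p["arxiv_id"] for p in select_new_papers(second_poll, seen)])
```

Persisting the seen set between runs (for example in a file) is left out to keep the sketch short.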

list_arxiv_categories

Return all supported arXiv subject categories with descriptions. This tool takes no parameters.

Returns: a dictionary mapping each category code (e.g. "cs.AI") to its description. Use these codes in search_arxiv_papers(category=...) and get_arxiv_recent(category=...).

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "list_arxiv_categories", "arguments": {}}}'
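
import httpx

# JSON-RPC 2.0 request that invokes the list_arxiv_categories tool (no arguments)
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {"name": "list_arxiv_categories", "arguments": {}},
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())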
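
Because the category list rarely needs refetching, one option is to fetch it once and validate codes locally before spending a call on search_arxiv_papers or get_arxiv_recent. The sketch below assumes the mapping has already been retrieved; check_category is a hypothetical helper, and the three entries are an illustrative subset whose descriptions may be worded differently in the actual response.

```python
def check_category(code: str, categories: dict[str, str]) -> str:
    """Return code if it is supported, else raise with same-archive suggestions."""
    if code in categories:
        return code
    # Suggest codes sharing the archive prefix, e.g. "cs" for "cs.XX".
    archive = code.split(".")[0]
    suggestions = sorted(c for c in categories if c.split(".")[0] == archive)
    hint = f"; did you mean one of {suggestions}?" if suggestions else ""
    raise ValueError(f"unsupported category {code!r}{hint}")


# Illustrative subset of the code-to-description mapping.
sample_categories = {
    "cs.AI": "Artificial Intelligence",
    "cs.LG": "Machine Learning",
    "q-fin.TR": "Trading and Market Microstructure",
}

print(check_category("cs.LG", sample_categories))
try:
    check_category("cs.XX", sample_categories)
except ValueError as exc:
    print(exc)
```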

report_feedback

Report a bug, feature request, or general feedback for this data source. Use this when something doesn't work as expected, when you'd like a new feature, or when you have suggestions for improvement.

Parameter | Type | Required | Description
feedback | string | required | Describe the issue or suggestion.
feedback_type | string | optional | One of 'bug', 'feature_request', or 'general' (default: general).

curl -X POST "https://context.gnist.ai/mcp/arxiv/" \
  -H "Content-Type: application/json" \
  -H "Gnist-API-Key: YOUR_API_KEY" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "report_feedback", "arguments": {"feedback": "example"}}}'
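
import httpx

# JSON-RPC 2.0 request that invokes the report_feedback tool
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "id": 1,
    "params": {
        "name": "report_feedback",
        "arguments": {"feedback": "example"},
    },
}

resp = httpx.post(
    "https://context.gnist.ai/mcp/arxiv/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()  # raise on 4xx/5xx responses
print(resp.json())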
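
The example above omits feedback_type, so it falls back to 'general'. A small sketch of setting it explicitly, with a local check against the three documented values, might look like this. build_feedback_arguments is a hypothetical helper, and the feedback text is a made-up example.

```python
FEEDBACK_TYPES = ("bug", "feature_request", "general")


def build_feedback_arguments(feedback: str, feedback_type: str = "general") -> dict:
    """Build the "arguments" object for a report_feedback call."""
    if not feedback.strip():
        raise ValueError("feedback must not be empty")
    if feedback_type not in FEEDBACK_TYPES:
        raise ValueError(f"feedback_type must be one of {FEEDBACK_TYPES}")
    return {"feedback": feedback, "feedback_type": feedback_type}


print(build_feedback_arguments("Example: date_to filter seems to be ignored", "bug"))
```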

Common Patterns

Search then retrieve
Use search_arxiv_papers to find candidate papers, then pass each arxiv_id to get_arxiv_paper for the full record. This two-step pattern is useful for exploring results before drilling down.
Pagination
The tools documented above take no limit, offset, or page parameters. Result size is set with max_results (1–25, default 10). Start with a small value during development, and narrow the query by category or date range when 25 results are not enough.
Date range filtering
Pass date_from and date_to to search_arxiv_papers to restrict results to a submission window. Both dates use YYYY-MM-DD format.
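
The search-then-retrieve pattern can be sketched as a small function. To keep it runnable without network access, the transport is passed in as a callable: in real use, call_tool would be a function of your own that posts the JSON-RPC request shown in the tool examples and returns the decoded tool result. The names search_then_retrieve, ToolCaller, and fake_call_tool are all hypothetical, and the fake returns placeholder data.

```python
from typing import Callable

# A callable taking (tool name, arguments) and returning the tool's result dict.
ToolCaller = Callable[[str, dict], dict]


def search_then_retrieve(call_tool: ToolCaller, query: str, max_results: int = 5) -> list[dict]:
    """Search for papers, then fetch the full record for each hit."""
    found = call_tool("search_arxiv_papers", {"query": query, "max_results": max_results})
    details = []
    for paper in found["papers"]:
        # One extra tool call per hit.
        details.append(call_tool("get_arxiv_paper", {"arxiv_id": paper["arxiv_id"]}))
    return details


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real transport; returns placeholder records."""
    if name == "search_arxiv_papers":
        return {"count": 2, "papers": [{"arxiv_id": "0000.00001"}, {"arxiv_id": "0000.00002"}]}
    return {"arxiv_id": arguments["arxiv_id"], "title": "placeholder"}


records = search_then_retrieve(fake_call_tool, "attention mechanism transformer", max_results=2)
print([r["arxiv_id"] for r in records])
```

A search with N hits costs 1 + N tool calls this way. If each tool call counts against the 100 calls/day free tier, which seems likely but is not stated on this page, a small max_results keeps the pattern affordable.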

FAQ

What data does arXiv provide?

Academic preprints from arXiv.org across physics, math, CS, and more, which you can search by keyword, author, category, and date. The data source exposes 6 tools: search_arxiv_papers, get_arxiv_paper, get_arxiv_author_papers, get_arxiv_recent, list_arxiv_categories, report_feedback.

What do I need to get started?

A Gnist API key (free tier: 100 calls/day). Sign up at https://context.gnist.ai/signup.

What format does the arXiv API return?

JSON, via either MCP protocol (JSON-RPC 2.0) or REST API.
