Data source: PubChem (NCBI)
Overview
The Gnist PubChem data source wraps PubChem (NCBI), handling authentication, pagination, and rate limits for you. This tutorial covers all 5 tools with code examples you can copy and run.
Prerequisites
- Sign up at https://context.gnist.ai/signup for a free API key (100 calls/day).
- Choose your integration method: MCP protocol or REST API.
Connect via MCP
Add to your MCP client config (Claude Desktop, Cursor, etc.):
{
  "mcpServers": {
    "gnist-pubchem": {
      "url": "https://context.gnist.ai/mcp/pubchem/",
      "headers": {
        "Gnist-API-Key": "YOUR_API_KEY"
      }
    }
  }
}
Tools (5)
get_compound
Look up a chemical compound by name, CID, InChIKey, or SMILES. PubChem covers 100M+ chemical structures. Returns molecular formula, weight, SMILES, InChI, IUPAC name, and key physicochemical descriptors.
Args:
- identifier: The compound identifier. Examples:
  - By name: "aspirin", "caffeine", "glucose"
  - By CID: "2244" (aspirin's PubChem CID)
  - By InChIKey: "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
  - By SMILES: "CC(=O)Oc1ccccc1C(=O)O"
- namespace: How to interpret the identifier. One of: "name" (default), "cid", "inchikey", "smiles".
Returns: Compound record with cid, iupac_name, molecular_formula, molecular_weight, canonical_smiles, isomeric_smiles, inchi, inchikey, xlogp, exact_mass, tpsa, hbond_donors, hbond_acceptors, rotatable_bonds, heavy_atom_count, charge, and complexity.
| Parameter | Type | Required | Description |
|---|---|---|---|
| identifier | string | required | The compound identifier. Examples: - By name: "aspirin", "caffeine", "glucose" - By CID: "2244" (aspirin's PubChem CID) - By InChIKey: "BSYNRYMUTXBXSQ-UHFFFAOYSA-N" - By SMILES: "CC(=O)O... |
| namespace | string | optional | How to interpret the identifier. One of: "name", "cid", "inchikey", "smiles". (default: name) |
curl -X POST "https://context.gnist.ai/mcp/pubchem/" \
-H "Content-Type: application/json" \
-H "Gnist-API-Key: YOUR_API_KEY" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_compound", "arguments": {"identifier": "2244", "namespace": "cid"}}}'
import httpx

# Look up aspirin by its PubChem CID; namespace tells the tool how to read the identifier.
resp = httpx.post(
    "https://context.gnist.ai/mcp/pubchem/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": 1,
        "params": {
            "name": "get_compound",
            "arguments": {"identifier": "2244", "namespace": "cid"},
        },
    },
)
print(resp.json())
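Every request in this tutorial uses the same JSON-RPC envelope, so it can be factored into a small helper. The following is a minimal sketch, not an official client: the names `build_tool_call` and `call_tool` are hypothetical, and the endpoint and header are the ones shown in this tutorial.

```python
from typing import Any

MCP_URL = "https://context.gnist.ai/mcp/pubchem/"


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build the JSON-RPC 2.0 envelope used by the examples in this tutorial."""
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": request_id,
        "params": {"name": name, "arguments": arguments},
    }


def call_tool(api_key: str, name: str, arguments: dict[str, Any]) -> Any:
    """Hypothetical helper: POST one tool call and return the decoded JSON body."""
    import httpx  # imported lazily so build_tool_call works without httpx installed

    resp = httpx.post(
        MCP_URL,
        headers={"Gnist-API-Key": api_key},
        json=build_tool_call(name, arguments),
        timeout=30.0,
    )
    resp.raise_for_status()  # surface HTTP-level failures (4xx/5xx) early
    return resp.json()


# Aspirin by CID: namespace is set to "cid" because the default namespace is "name".
payload = build_tool_call("get_compound", {"identifier": "2244", "namespace": "cid"})
```

With a real key, `call_tool("YOUR_API_KEY", "get_compound", {"identifier": "aspirin"})` would send the same kind of request as the examples in this section.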
search_compounds
Search PubChem for chemical compounds by name or keyword. Returns a compact list of matching compounds (CID, IUPAC name, formula, weight). Use get_compound with the returned CID for full details.
Args:
- query: Chemical name or keyword (e.g., "aspirin", "beta-lactam", "serotonin").
- max_results: Number of results to return (1–50, default 10).
Returns: List of matching compounds with cid, iupac_name, molecular_formula, molecular_weight.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | required | Chemical name or keyword (e.g., "aspirin", "beta-lactam", "serotonin"). |
| max_results | integer | optional | Number of results to return, 1–50. (default: 10) |
curl -X POST "https://context.gnist.ai/mcp/pubchem/" \
-H "Content-Type: application/json" \
-H "Gnist-API-Key: YOUR_API_KEY" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "search_compounds", "arguments": {"query": "aspirin"}}}'
import httpx

# Search by chemical name or keyword; max_results is omitted, so the default of 10 applies.
resp = httpx.post(
    "https://context.gnist.ai/mcp/pubchem/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": 1,
        "params": {
            "name": "search_compounds",
            "arguments": {"query": "aspirin"},
        },
    },
)
print(resp.json())
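The search_compounds description suggests following up with get_compound on each returned CID. A rough sketch of that two-step flow is shown here. It assumes you already have a callable that runs one tool and returns its decoded result (a list of dicts with a `cid` key for search_compounds, as the Returns text describes). The names `ToolCaller`, `search_then_fetch`, and `fake_call_tool` are hypothetical, and the fake return values exist only to show the call order offline.

```python
from typing import Any, Callable

# A callable that runs one tool and returns its already-decoded result.
# How the server's JSON-RPC response is unwrapped into this shape is left open here.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def search_then_fetch(call_tool: ToolCaller, query: str, max_results: int = 3) -> list[Any]:
    """Search by keyword, then fetch the full record for each hit by CID."""
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    hits = call_tool("search_compounds", {"query": query, "max_results": max_results})
    return [
        call_tool("get_compound", {"identifier": str(hit["cid"]), "namespace": "cid"})
        for hit in hits
    ]


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Offline stand-in with made-up return values, used only to show the call flow."""
    if name == "search_compounds":
        return [{"cid": 2244}]
    return {"cid": int(arguments["identifier"]), "source_tool": name}


records = search_then_fetch(fake_call_tool, "aspirin")
```

Each run makes one search call plus one get_compound call per hit, so a small `max_results` helps stay within the free tier's 100 calls/day.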
get_compound_properties
Fetch physicochemical properties for a PubChem compound by CID. Returns a richer property set than get_compound, including stereochemistry counts, covalent units, and monoisotopic mass.
Args:
- cid: PubChem Compound ID (e.g., 2244 for aspirin). Found in search_compounds results.
- properties: Optional list of specific property names to fetch. If omitted, returns all standard properties. Available properties include: MolecularFormula, MolecularWeight, CanonicalSMILES, IsomericSMILES, InChI, InChIKey, IUPACName, XLogP, ExactMass, MonoisotopicMass, TPSA, Complexity, Charge, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, CovalentUnitCount, AtomStereoCount, BondStereoCount.
Returns: Dict of property name → value for the requested properties.
| Parameter | Type | Required | Description |
|---|---|---|---|
| cid | integer | required | PubChem Compound ID (e.g., 2244 for aspirin). Found in search_compounds results. |
| properties | any | optional | Optional list of specific property names to fetch. If omitted, returns all standard properties. Available properties include: MolecularFormula, MolecularWeight, CanonicalSMILES, IsomericSMILES, InC... |
curl -X POST "https://context.gnist.ai/mcp/pubchem/" \
-H "Content-Type: application/json" \
-H "Gnist-API-Key: YOUR_API_KEY" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "get_compound_properties", "arguments": {"cid": 5}}}'
import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/pubchem/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": 1,
        "params": {
            "name": "get_compound_properties",
            "arguments": {"cid": 5},
        },
    },
)
print(resp.json())
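The optional `properties` argument narrows the response to specific fields. The sketch below builds the `arguments` object for such a request and rejects misspelled names before any call is spent. The helper name `build_properties_arguments` is hypothetical, and the allow-list is copied from the description in this section. That description says the available properties "include" these names, so the server may accept others, and the check here is only a conservative client-side guard.

```python
from typing import Any, Optional, Sequence

# Property names listed in the get_compound_properties description.
KNOWN_PROPERTIES = frozenset({
    "MolecularFormula", "MolecularWeight", "CanonicalSMILES", "IsomericSMILES",
    "InChI", "InChIKey", "IUPACName", "XLogP", "ExactMass", "MonoisotopicMass",
    "TPSA", "Complexity", "Charge", "HBondDonorCount", "HBondAcceptorCount",
    "RotatableBondCount", "HeavyAtomCount", "CovalentUnitCount",
    "AtomStereoCount", "BondStereoCount",
})


def build_properties_arguments(
    cid: int, properties: Optional[Sequence[str]] = None
) -> dict[str, Any]:
    """Build the arguments for get_compound_properties, validating property names."""
    arguments: dict[str, Any] = {"cid": cid}
    if properties is not None:
        unknown = sorted(set(properties) - KNOWN_PROPERTIES)
        if unknown:
            raise ValueError(f"unknown property names: {unknown}")
        arguments["properties"] = list(properties)
    return arguments


# Request three descriptors for aspirin (CID 2244).
arguments = build_properties_arguments(2244, ["MolecularWeight", "XLogP", "TPSA"])
```

The resulting dict goes into `params.arguments` of the same tools/call request used elsewhere in this section. Omitting `properties` leaves the key out, which the description says returns all standard properties.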
find_similar_compounds
Find structurally similar compounds using 2D Tanimoto similarity. Uses PubChem's fast 2D fingerprint similarity search. Useful for identifying drug analogs, structural scaffolds, or related molecules.
Args:
- cid: PubChem Compound ID to use as the query structure.
- threshold: Tanimoto similarity threshold (0–100, default 90). Higher values return only very close structural matches. 90 is a common threshold for drug analog searches.
- max_results: Number of similar compounds to return (1–50, default 10).
Returns: List of similar compounds with cid, iupac_name, molecular_formula, molecular_weight.
| Parameter | Type | Required | Description |
|---|---|---|---|
| cid | integer | required | PubChem Compound ID to use as the query structure. |
| threshold | integer | optional | Tanimoto similarity threshold, 0–100. Higher values return only very close structural matches. 90 is a common threshold for drug analog searches. (default: 90) |
| max_results | integer | optional | Number of similar compounds to return, 1–50. (default: 10) |
curl -X POST "https://context.gnist.ai/mcp/pubchem/" \
-H "Content-Type: application/json" \
-H "Gnist-API-Key: YOUR_API_KEY" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "find_similar_compounds", "arguments": {"cid": 5}}}'
import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/pubchem/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": 1,
        "params": {
            "name": "find_similar_compounds",
            "arguments": {"cid": 5},
        },
    },
)
print(resp.json())
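To build intuition for the `threshold` parameter, the Tanimoto coefficient of two fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below computes it on toy bit sets and scales it to the 0–100 range the tool uses. The bit sets are invented for illustration and are not real PubChem fingerprints, the function names are hypothetical, and treating the threshold as inclusive is an assumption about the server.

```python
def tanimoto_percent(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient of two sets of 'on' fingerprint bits, scaled to 0-100."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return 100 * len(bits_a & bits_b) / len(union)


def passes_threshold(bits_a: set[int], bits_b: set[int], threshold: int = 90) -> bool:
    """Assumed inclusive comparison against a 0-100 threshold."""
    return tanimoto_percent(bits_a, bits_b) >= threshold


query_bits = set(range(10))         # bits 0..9 set
close_match = set(range(1, 10))     # shares 9 bits; union has 10 bits
looser_match = set(range(2, 10))    # shares 8 bits; union has 10 bits

close_score = tanimoto_percent(query_bits, close_match)    # 9 / 10 of the union
loose_score = tanimoto_percent(query_bits, looser_match)   # 8 / 10 of the union
```

In this toy case the first candidate scores 90 and would pass the default threshold, while the second scores 80 and would only appear if `threshold` were lowered.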
report_feedback
Report a bug, feature request, or general feedback for this data source. Use this when something doesn't work as expected, when you'd like a new feature, or when you have suggestions for improvement.
Args:
- feedback: Describe the issue or suggestion.
- feedback_type: One of 'bug', 'feature_request', or 'general'.
| Parameter | Type | Required | Description |
|---|---|---|---|
| feedback | string | required | Describe the issue or suggestion. |
| feedback_type | string | optional | One of 'bug', 'feature_request', or 'general'. (default: general) |
curl -X POST "https://context.gnist.ai/mcp/pubchem/" \
-H "Content-Type: application/json" \
-H "Gnist-API-Key: YOUR_API_KEY" \
-d '{"jsonrpc": "2.0", "method": "tools/call", "id": 1, "params": {"name": "report_feedback", "arguments": {"feedback": "example"}}}'
import httpx

resp = httpx.post(
    "https://context.gnist.ai/mcp/pubchem/",
    headers={"Gnist-API-Key": "YOUR_API_KEY"},
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "id": 1,
        "params": {
            "name": "report_feedback",
            "arguments": {"feedback": "example"},
        },
    },
)
print(resp.json())
Common Patterns
Use search_compounds to find compounds, then get_compound to get full details. This two-step pattern is common for exploring data before drilling down.
search_compounds and find_similar_compounds accept a max_results parameter (1–50, default 10). Start with a small value during development, then increase it for production queries.
FAQ
What data does PubChem provide?
Chemical compound data: structures, properties, bioactivity, and safety information. It exposes 5 tools: get_compound, search_compounds, get_compound_properties, find_similar_compounds, and report_feedback.
What do I need to get started?
A Gnist API key (free tier: 100 calls/day). Sign up at https://context.gnist.ai/signup.
What format does the PubChem API return?
JSON, via either MCP protocol (JSON-RPC 2.0) or REST API.
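Because the MCP route speaks JSON-RPC 2.0, a response body carries either a `result` member or an `error` member with a `code` and `message`. The sketch below separates the two cases. The names `JsonRpcError` and `unwrap_jsonrpc` are hypothetical, and the sample bodies are hand-written to show the shape, not captured from the server. What this server places inside `result` is not specified in this tutorial, so the helper returns it untouched.

```python
from typing import Any


class JsonRpcError(Exception):
    """Raised when a JSON-RPC 2.0 response carries an 'error' member."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"JSON-RPC error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_jsonrpc(body: dict[str, Any]) -> Any:
    """Return the 'result' member, or raise if the response carries an 'error'."""
    if "error" in body:
        err = body["error"]
        raise JsonRpcError(err.get("code", 0), err.get("message", "unknown error"))
    return body["result"]


# Hand-written sample bodies, for illustration only.
ok_body = {"jsonrpc": "2.0", "id": 1, "result": {"cid": 2244}}
error_body = {"jsonrpc": "2.0", "id": 1, "error": {"code": -32601, "message": "Method not found"}}

result = unwrap_jsonrpc(ok_body)
```

In practice this would wrap `resp.json()` from the httpx examples, so that a failed tool call raises an exception instead of being printed as if it were data.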