Reactome Database
"Query Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, expression analysis, for systems biology studies."
Tools: reactome2py, requests
The full skill
---
name: reactome-database
description: "Query Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, expression analysis, for systems biology studies."
---
# Reactome Database
## Overview
Reactome is a free, open-source, curated pathway database with 2,825+ human pathways. Use it to query biological pathways, run overrepresentation and expression analysis, map genes to pathways, and explore molecular interactions through its REST API or the Python client.
## When to Use This Skill
This skill should be used when:
- Performing pathway enrichment analysis on gene or protein lists
- Analyzing gene expression data to identify relevant biological pathways
- Querying specific pathway information, reactions, or molecular interactions
- Mapping genes or proteins to biological pathways and processes
- Exploring disease-related pathways and mechanisms
- Visualizing analysis results in the Reactome Pathway Browser
- Conducting comparative pathway analysis across species
## Core Capabilities
Reactome provides two main API services and a Python client library:
### 1. Content Service – Data Retrieval
Query and retrieve biological pathway data, molecular interactions, and entity information.
**Common operations:**
- Retrieve pathway information and hierarchies
- Query specific entities (proteins, reactions, complexes)
- Get participating molecules in pathways
- Access database version and metadata
- Explore pathway compartments and locations
**API Base URL:** `https://reactome.org/ContentService`
### 2. Analysis Service – Pathway Analysis
Perform computational analysis on gene lists and expression data.
**Analysis types:**
- **Overrepresentation Analysis**: Identify statistically significant pathways from gene/protein lists
- **Expression Data Analysis**: Analyze gene expression datasets to find relevant pathways
- **Species Comparison**: Compare pathway data across different organisms
**API Base URL:** `https://reactome.org/AnalysisService`
### 3. reactome2py Python Package
Python client library that wraps Reactome API calls for easier programmatic access.
**Installation:**
```bash
uv pip install reactome2py
```
**Note:** The reactome2py package (version 3.0.0, released January 2021) is functional but not actively maintained. For the most up-to-date functionality, consider using direct REST API calls.
## Querying Pathway Data
### Using Content Service REST API
The Content Service is a REST API that returns JSON or plain text, depending on the endpoint.
**Get database version:**
```python
import requests

response = requests.get("https://reactome.org/ContentService/data/database/version")
response.raise_for_status()  # Fail early on HTTP errors
version = response.text  # This endpoint returns plain text, not JSON
print(f"Reactome version: {version}")
```
**Query a specific entity:**
```python
import requests

entity_id = "R-HSA-69278"  # Example pathway ID
response = requests.get(f"https://reactome.org/ContentService/data/query/{entity_id}")
data = response.json()
```
**Get participating molecules in a pathway:**
```python
import requests

event_id = "R-HSA-69278"
# Participant queries live under /data/participants/ in the Content Service
response = requests.get(
    f"https://reactome.org/ContentService/data/participants/{event_id}/participatingPhysicalEntities"
)
molecules = response.json()
```
### Using reactome2py Package
```python
from reactome2py import content

# Query pathway information
pathway_info = content.query_id("R-HSA-69278")

# Get database version (confirm this helper exists in your installed release, e.g. with dir(content))
version = content.get_database_version()
```
**For detailed API endpoints and parameters**, refer to `references/api_reference.md` in this skill.
## Performing Pathway Analysis
### Overrepresentation Analysis
Submit a list of gene/protein identifiers to find enriched pathways.
**Using REST API:**
```python
import requests

# Prepare identifier list
identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
data = "\n".join(identifiers)

# Submit analysis
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
token = result["summary"]["token"]  # Save token to retrieve results later

# Access pathways
for pathway in result["pathways"]:
    print(f"{pathway['stId']}: {pathway['name']} (p-value: {pathway['entities']['pValue']})")
```
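Once the response is parsed, the pathway list can be narrowed to the significant hits. The following is a minimal sketch, assuming each pathway entry carries a false discovery rate under `entities["fdr"]` alongside `entities["pValue"]`. The helper name `significant_pathways`, the 0.05 cutoff, and the sample data are illustrative assumptions, not part of the Reactome API.

```python
def significant_pathways(result, max_fdr=0.05):
    """Return (stId, name, fdr) tuples with fdr <= max_fdr, most significant first."""
    hits = [
        (p["stId"], p["name"], p["entities"]["fdr"])
        for p in result.get("pathways", [])
        if p["entities"]["fdr"] <= max_fdr
    ]
    return sorted(hits, key=lambda hit: hit[2])


# Hand-written stand-in for a parsed response; IDs, names, and values are made up
sample_result = {
    "pathways": [
        {"stId": "R-HSA-0000002", "name": "Placeholder B", "entities": {"pValue": 0.2, "fdr": 0.4}},
        {"stId": "R-HSA-0000001", "name": "Placeholder A", "entities": {"pValue": 0.001, "fdr": 0.01}},
        {"stId": "R-HSA-0000003", "name": "Placeholder C", "entities": {"pValue": 0.004, "fdr": 0.03}},
    ]
}

for st_id, name, fdr in significant_pathways(sample_result):
    print(f"{st_id}\t{name}\tFDR={fdr}")
```

Filtering on FDR rather than the raw p-value accounts for the many pathways tested in a single submission. Pass the real `result` dictionary in place of `sample_result`.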
**Retrieve analysis by token:**
```python
# Token is valid for 7 days
response = requests.get(f"https://reactome.org/AnalysisService/token/{token}")
results = response.json()
```
### Expression Data Analysis
Analyze gene expression datasets with quantitative values.
**Input format (TSV with header starting with #):**
```
#Gene Sample1 Sample2 Sample3
TP53 2.5 3.1 2.8
BRCA1 1.2 1.5 1.3
EGFR 4.5 4.2 4.8
```
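When the expression values already live in Python objects, the payload can be assembled in memory instead of written by hand. This sketch assumes tab-separated columns and a `#Gene` header label, as in the sample above. `build_expression_tsv` is a hypothetical helper name.

```python
def build_expression_tsv(expression, sample_names):
    """Build a tab-separated payload: a '#'-prefixed header, then one row per identifier."""
    lines = ["#Gene\t" + "\t".join(sample_names)]
    for identifier, values in expression.items():
        if len(values) != len(sample_names):
            raise ValueError(
                f"{identifier}: expected {len(sample_names)} values, got {len(values)}"
            )
        lines.append(identifier + "\t" + "\t".join(str(v) for v in values))
    return "\n".join(lines)


payload = build_expression_tsv(
    {"TP53": [2.5, 3.1, 2.8], "BRCA1": [1.2, 1.5, 1.3], "EGFR": [4.5, 4.2, 4.8]},
    ["Sample1", "Sample2", "Sample3"],
)
print(payload)
```

The resulting string can be sent as the request body in place of the contents of a file.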
**Submit expression data:**
```python
import requests

# Read TSV file
with open("expression_data.tsv", "r") as f:
    data = f.read()

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
```
### Species Projection
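The `/identifiers/projection/` endpoint maps non-human identifiers to their human equivalents, so the analysis runs against human pathways only:
```python
import requests

# `data` is a newline-separated identifier list, as in the overrepresentation example
data = "\n".join(["TP53", "BRCA1", "EGFR", "MYC"])
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/projection/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
```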
## Visualizing Results
Analysis results can be visualized in the Reactome Pathway Browser by constructing URLs with the analysis token:
```python
token = result["summary"]["token"]
pathway_id = "R-HSA-69278"
# Pathway Browser fragments start with "#/" followed by the stable ID
url = f"https://reactome.org/PathwayBrowser/#/{pathway_id}&DTAB=AN&ANALYSIS={token}"
print(f"View results: {url}")
```
## Working with Analysis Tokens
- Analysis tokens are valid for **7 days**
- Tokens allow retrieval of previously computed results without re-submission
- Store tokens to access results across sessions
- Use `GET /token/{TOKEN}` endpoint to retrieve results
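To reuse tokens across sessions, it helps to store each token with its submission time and check locally whether it is still likely to be valid. This is a minimal sketch under two assumptions: the 7-day window counts from the moment of submission, and the record layout (`token`, `created_at`) is a hypothetical choice. The server remains the authority on whether a token still resolves.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=7)  # Validity period stated above


def make_token_record(token, created_at=None):
    """Return a JSON-serializable record for an analysis token."""
    created_at = created_at or datetime.now(timezone.utc)
    return {"token": token, "created_at": created_at.isoformat()}


def token_is_usable(record, now=None):
    """Return True if the record is younger than the 7-day lifetime."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromisoformat(record["created_at"])
    return now - created < TOKEN_LIFETIME


record = make_token_record("EXAMPLE_TOKEN")  # Placeholder, not a real token
print(token_is_usable(record))
```

Records of this shape can be written to disk with `json.dump` and checked before calling `GET /token/{TOKEN}`. When the check fails, re-submit the identifiers.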
## Data Formats and Identifiers
### Supported Identifier Types
Reactome accepts various identifier formats:
- UniProt accessions (e.g., P04637)
- Gene symbols (e.g., TP53)
- Ensembl IDs (e.g., ENSG00000141510)
- EntrezGene IDs (e.g., 7157)
- ChEBI IDs for small molecules
The system automatically detects identifier types.
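Detection happens on the server, but a rough client-side check can catch mixed or malformed lists before submission. The sketch below is only a heuristic: the regular expressions are assumptions based on the public formats of each identifier family (including a `CHEBI:` prefix for ChEBI IDs), `guess_identifier_type` is a hypothetical helper, and nothing here reproduces Reactome's own detection logic.

```python
import re

# Heuristic patterns, checked in order; anything unmatched is treated as a gene symbol
PATTERNS = [
    ("ChEBI", re.compile(r"^CHEBI:\d+$", re.IGNORECASE)),
    ("Ensembl", re.compile(r"^ENS[A-Z]*[GTP]\d{11}$")),
    (
        "UniProt",
        re.compile(
            r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
            r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
        ),
    ),
    ("EntrezGene", re.compile(r"^\d+$")),
]


def guess_identifier_type(identifier):
    """Return a best-guess label for one identifier string."""
    value = identifier.strip()
    for label, pattern in PATTERNS:
        if pattern.match(value):
            return label
    return "gene symbol"


for example in ["P04637", "TP53", "ENSG00000141510", "7157"]:
    print(example, "->", guess_identifier_type(example))
```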
### Input Format Requirements
**For overrepresentation analysis:**
- Plain text list of identifiers (one per line)
- OR single column in TSV format

**For expression analysis:**
- TSV format with mandatory header row starting with "#"
- Column 1: identifiers
- Columns 2+: numeric expression values
- Use period (.) as decimal separator
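The expression-analysis requirements above can be checked locally before a file is submitted. This is a sketch that assumes strictly tab-separated columns. `validate_expression_input` is a hypothetical helper and the messages it returns are illustrative.

```python
def validate_expression_input(text):
    """Return a list of problems found in an expression payload (empty if none)."""
    rows = [line for line in text.splitlines() if line.strip()]
    if not rows or not rows[0].startswith("#"):
        return ["first row must be a header starting with '#'"]

    problems = []
    n_columns = len(rows[0].split("\t"))
    if n_columns < 2:
        problems.append("header needs an identifier column plus at least one sample column")

    # Data rows are numbered from 2 because the header is row 1
    for number, row in enumerate(rows[1:], start=2):
        fields = row.split("\t")
        if len(fields) != n_columns:
            problems.append(f"row {number}: expected {n_columns} columns, got {len(fields)}")
            continue
        for value in fields[1:]:
            try:
                float(value)  # Rejects comma decimals such as "2,5"
            except ValueError:
                problems.append(f"row {number}: non-numeric value {value!r}")
    return problems


print(validate_expression_input("#Gene\tS1\tS2\nTP53\t2.5\t3.1\n"))
```

An empty list means the payload satisfies the four rules. Any other result lists the rows to fix.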
### Output Format
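Analysis Service responses are JSON objects containing:
- `pathways`: Array of enriched pathways with statistical metrics
- `summary`: Analysis metadata, including the token
- `identifiersNotFound`: Number of submitted identifiers that could not be mapped
- Per-pathway `entities` object: matched-entity counts (`found`, `total`) and statistical values `pValue` and `fdr` (false discovery rate)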
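For downstream work in a spreadsheet or data frame, the nested response can be flattened into one row per pathway. The sketch below assumes each pathway's `entities` object exposes `found`, `total`, `pValue`, and `fdr`. The helper name `pathways_to_csv`, the column order, and the sample data are illustrative.

```python
import csv
import io


def pathways_to_csv(result):
    """Flatten the 'pathways' array of an analysis result into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["stId", "name", "found", "total", "pValue", "fdr"])
    for pathway in result.get("pathways", []):
        entities = pathway["entities"]
        writer.writerow([
            pathway["stId"],
            pathway["name"],
            entities["found"],
            entities["total"],
            entities["pValue"],
            entities["fdr"],
        ])
    return buffer.getvalue()


# Hand-written stand-in for a parsed response; the values are made up
example_result = {
    "pathways": [
        {
            "stId": "R-HSA-0000001",
            "name": "Placeholder pathway",
            "entities": {"found": 3, "total": 120, "pValue": 0.001, "fdr": 0.01},
        }
    ]
}
print(pathways_to_csv(example_result))
```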
## Helper Scripts
This skill includes `scripts/reactome_query.py`, a helper script for common Reactome operations:
```bash
# Query pathway information
python scripts/reactome_query.py query R-HSA-69278

# Perform overrepresentation analysis
python scripts/reactome_query.py analyze gene_list.txt

# Get database version
python scripts/reactome_query.py version
```
## Additional Resources
- **API Documentation**: https://reactome.org/dev
- **User Guide**: https://reactome.org/userguide
- **Documentation Portal**: https://reactome.org/documentation
- **Data Downloads**: https://reactome.org/download-data
- **reactome2py Docs**: https://reactome.github.io/reactome2py/
For comprehensive API endpoint documentation, see `references/api_reference.md` in this skill.
## Current Database Statistics (Version 94, September 2025)
- 2,825 human pathways
- 16,002 reactions
- 11,630 proteins
- 2,176 small molecules
- 1,070 drugs
- 41,373 literature references