Reactome Database

"Query Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, expression analysis, for systems biology studies."

Free | Risk: low
Tags: reactome, database, python, uniprot

Tools: reactome2py, requests

The full skill

---
name: reactome-database
description: "Query Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, expression analysis, for systems biology studies."
---

# Reactome Database

## Overview

Reactome is a free, open-source, curated pathway database with 2,825+ human pathways. Query biological pathways, perform overrepresentation and expression analysis, map genes to pathways, explore molecular interactions via REST API and Python client for systems biology research.

## When to Use This Skill

This skill should be used when:
- Performing pathway enrichment analysis on gene or protein lists
- Analyzing gene expression data to identify relevant biological pathways
- Querying specific pathway information, reactions, or molecular interactions
- Mapping genes or proteins to biological pathways and processes
- Exploring disease-related pathways and mechanisms
- Visualizing analysis results in the Reactome Pathway Browser
- Conducting comparative pathway analysis across species

## Core Capabilities

Reactome provides two main API services and a Python client library:

### 1. Content Service - Data Retrieval

Query and retrieve biological pathway data, molecular interactions, and entity information.

**Common operations:**
- Retrieve pathway information and hierarchies
- Query specific entities (proteins, reactions, complexes)
- Get participating molecules in pathways
- Access database version and metadata
- Explore pathway compartments and locations

**API Base URL:** `https://reactome.org/ContentService`

### 2. Analysis Service - Pathway Analysis

Perform computational analysis on gene lists and expression data.

**Analysis types:**
- **Overrepresentation Analysis**: Identify statistically significant pathways from gene/protein lists
- **Expression Data Analysis**: Analyze gene expression datasets to find relevant pathways
- **Species Comparison**: Compare pathway data across different organisms

**API Base URL:** `https://reactome.org/AnalysisService`

### 3. reactome2py Python Package

Python client library that wraps Reactome API calls for easier programmatic access.

**Installation:**
```bash
uv pip install reactome2py
```

**Note:** The reactome2py package (version 3.0.0, released January 2021) is functional but not actively maintained. For the most up-to-date functionality, consider using direct REST API calls.

## Querying Pathway Data

### Using Content Service REST API

The Content Service is a REST API that returns data as JSON or plain text.

**Get database version:**
```python
import requests

# The version endpoint returns plain text, not JSON
response = requests.get("https://reactome.org/ContentService/data/database/version")
version = response.text
print(f"Reactome version: {version}")
```

**Query a specific entity:**
```python
import requests

entity_id = "R-HSA-69278"  # Example pathway ID
response = requests.get(f"https://reactome.org/ContentService/data/query/{entity_id}")
data = response.json()
```

**Get participating molecules in a pathway:**
```python
import requests

event_id = "R-HSA-69278"
# Participant queries live under /data/participants/ in the Content Service
response = requests.get(
    f"https://reactome.org/ContentService/data/participants/{event_id}/participatingPhysicalEntities"
)
molecules = response.json()
```

### Using reactome2py Package

```python
import reactome2py
from reactome2py import content

# Query pathway information
pathway_info = content.query_by_id("R-HSA-69278")

# Get database version
version = content.get_database_version()
```

**For detailed API endpoints and parameters**, refer to `references/api_reference.md` in this skill.

## Performing Pathway Analysis

### Overrepresentation Analysis

Submit a list of gene/protein identifiers to find enriched pathways.
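
The analysis endpoints take identifiers as plain text, one per line. A minimal sketch of preparing that request body from a Python list follows. `build_identifier_payload` is a hypothetical helper, not part of Reactome or reactome2py, and dropping blank entries and repeats is an assumption about what is convenient rather than an API requirement.

```python
from typing import Iterable


def build_identifier_payload(identifiers: Iterable[str]) -> str:
    """Join identifiers into a newline-separated string, dropping blanks and repeats.

    Hypothetical helper for illustration; order of first appearance is preserved.
    """
    seen = set()
    cleaned = []
    for raw in identifiers:
        identifier = raw.strip()
        if identifier and identifier not in seen:
            seen.add(identifier)
            cleaned.append(identifier)
    return "\n".join(cleaned)


payload = build_identifier_payload(["TP53", " BRCA1", "", "EGFR", "TP53", "MYC"])
print(payload)
```

The resulting string can be passed as the `data` argument of a `requests.post` call with a `text/plain` content type.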

**Using REST API:**
```python
import requests

# Prepare identifier list (one identifier per line)
identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
data = "\n".join(identifiers)

# Submit analysis
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
token = result["summary"]["token"]  # Save token to retrieve results later

# Access pathways
for pathway in result["pathways"]:
    print(f"{pathway['stId']}: {pathway['name']} (p-value: {pathway['entities']['pValue']})")
```

**Retrieve analysis by token:**
```python
import requests

# `token` comes from result["summary"]["token"] of an earlier submission
# Token is valid for 7 days
response = requests.get(f"https://reactome.org/AnalysisService/token/{token}")
results = response.json()
```

### Expression Data Analysis

Analyze gene expression datasets with quantitative values.

**Input format (TSV with header starting with #):**
```
#Gene	Sample1	Sample2	Sample3
TP53	2.5	3.1	2.8
BRCA1	1.2	1.5	1.3
EGFR	4.5	4.2	4.8
```

**Submit expression data:**
```python
import requests

# Read TSV file
with open("expression_data.tsv", "r") as f:
    data = f.read()

# Same endpoint as overrepresentation analysis; the numeric columns mark the input as expression data
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
```

### Species Projection

Project the submitted identifiers onto human pathways, so that results are reported for Homo sapiens only, using the `/projection/` endpoint:

```python
import requests

# Newline-separated identifier list, as in overrepresentation analysis
identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
data = "\n".join(identifiers)

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/projection/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()
```

## Visualizing Results

Analysis results can be visualized in the Reactome Pathway Browser by constructing URLs with the analysis token:

```python
# `result` is the JSON response of an earlier analysis submission
token = result["summary"]["token"]
pathway_id = "R-HSA-69278"
# The pathway ID follows "#/" in Pathway Browser URLs
url = f"https://reactome.org/PathwayBrowser/#/{pathway_id}&DTAB=AN&ANALYSIS={token}"
print(f"View results: {url}")
```

## Working with Analysis Tokens

- Analysis tokens are valid for **7 days**
- Tokens allow retrieval of previously computed results without re-submission
- Store tokens to access results across sessions
- Use the `GET /token/{TOKEN}` endpoint to retrieve results

## Data Formats and Identifiers

### Supported Identifier Types

Reactome accepts various identifier formats:
- UniProt accessions (e.g., P04637)
- Gene symbols (e.g., TP53)
- Ensembl IDs (e.g., ENSG00000141510)
- EntrezGene IDs (e.g., 7157)
- ChEBI IDs for small molecules

The system automatically detects identifier types.

### Input Format Requirements

**For overrepresentation analysis:**
- Plain text list of identifiers (one per line)
- OR single column in TSV format

**For expression analysis:**
- TSV format with mandatory header row starting with "#"
- Column 1: identifiers
- Columns 2+: numeric expression values
- Use period (.) as decimal separator

### Output Format

Analysis Service responses are JSON containing:
- `pathways`: Array of enriched pathways with statistical metrics
- `summary`: Analysis metadata and token
- `entities`: Matched and unmapped identifiers
- Statistical values: pValue, FDR (false discovery rate)

## Helper Scripts

This skill includes `scripts/reactome_query.py`, a helper script for common Reactome operations:

```bash
# Query pathway information
python scripts/reactome_query.py query R-HSA-69278

# Perform overrepresentation analysis
python scripts/reactome_query.py analyze gene_list.txt

# Get database version
python scripts/reactome_query.py version
```

## Additional Resources

- **API Documentation**: https://reactome.org/dev
- **User Guide**: https://reactome.org/userguide
- **Documentation Portal**: https://reactome.org/documentation
- **Data Downloads**: https://reactome.org/download-data
- **reactome2py Docs**: https://reactome.github.io/reactome2py/

For comprehensive API endpoint documentation, see `references/api_reference.md` in this skill.

## Current Database Statistics (Version 94, September 2025)

- 2,825 human pathways
- 16,002 reactions
- 11,630 proteins
- 2,176 small molecules
- 1,070 drugs
- 41,373 literature references
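
As a closing illustration of the output format described under "Output Format", the sketch below picks the significant pathways out of an analysis response. It makes two assumptions: that the FDR sits beside `pValue` in each pathway's `entities` object under the key `fdr`, and that 0.05 is a reasonable default cutoff. `significant_pathways` is a hypothetical helper, and the sample response is hand-written with placeholder IDs, not real Reactome output.

```python
from typing import Any


def significant_pathways(
    result: dict[str, Any], max_fdr: float = 0.05
) -> list[tuple[str, str, float]]:
    """Return (stId, name, fdr) for pathways at or below max_fdr, most significant first."""
    hits = []
    for pathway in result.get("pathways", []):
        # Assumption: the FDR is stored next to pValue under the pathway's "entities" object
        fdr = pathway["entities"]["fdr"]
        if fdr <= max_fdr:
            hits.append((pathway["stId"], pathway["name"], fdr))
    return sorted(hits, key=lambda hit: hit[2])


# Hand-written sample shaped like an analysis response; IDs and values are placeholders
sample_result = {
    "summary": {"token": "EXAMPLE_TOKEN"},
    "pathways": [
        {"stId": "R-HSA-0000001", "name": "Example pathway A",
         "entities": {"pValue": 0.02, "fdr": 0.20}},
        {"stId": "R-HSA-0000002", "name": "Example pathway B",
         "entities": {"pValue": 0.0001, "fdr": 0.003}},
        {"stId": "R-HSA-0000003", "name": "Example pathway C",
         "entities": {"pValue": 0.001, "fdr": 0.04}},
    ],
}

for st_id, name, fdr in significant_pathways(sample_result):
    print(f"{st_id}\t{name}\tFDR={fdr:.3g}")
```

With a real submission, the dictionary returned by `response.json()` would take the place of `sample_result`.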