Publication:

A knowledge graph foundation model for neurological disease

Date

2025-06-24

Published Version

Citation

Noori, Ayush. 2025. A Knowledge Graph Foundation Model for Neurological Disease. Bachelors Thesis, Harvard University Engineering and Applied Sciences.

Abstract

Neurological disorders are the leading global cause of disability, yet most lack disease-modifying treatments or cures. Efforts to understand, diagnose, and treat these conditions have been frustrated by disease complexity and heterogeneity. To address these challenges, we introduce CIPHER, a foundation model for neurological disease. CIPHER is a 578-million-parameter heterogeneous graph Transformer for neuroscientific discovery and precision medicine. CIPHER is trained on NeuroKG, a multiscale knowledge graph assembled from 36 large-scale biomedical databases and containing 147,020 nodes and 7,366,745 edges. To provide brain-specific context, NeuroKG integrates 2,480,956 neurons and 888,263 non-neuronal cells from a human brain single-cell RNA-sequencing atlas, as well as 387,483 nuclei from patients with Parkinson’s disease and matched controls. We evaluate CIPHER across a broad range of tasks, diseases, and datasets: a CRISPR/Cas9 essentiality screen in hPSC-derived dopamine neurons; GWAS and RVAS hits in Parkinson’s and Alzheimer’s disease; genome- and proteome-wide alpha-synuclein-related experimental screens, including overexpression and proximity labeling assays and a targeted exome screen in n = 496 synucleinopathy patients; pesticides toxic to patient-derived dopaminergic neurons; FDA-approved treatments for 25 neurological diseases; and cerebrospinal fluid proteomics from n = 956 subjects representing multiple genetic subtypes of Parkinson’s disease and healthy controls. CIPHER demonstrates strong performance in disease-associated gene and pesticide discovery, in silico experimentation, hypothesis generation, therapeutic prioritization, and disease subtyping, paving the way toward AI-driven neuroscientific discovery.

Keywords

AI-guided science, Artificial intelligence, Graph representation learning, Knowledge graph, Neurological disease, Parkinson's disease, Computer science, Neurosciences

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
