Publication: A knowledge graph foundation model for neurological disease
Abstract
Neurological disorders are the leading global cause of disability, yet most lack disease-modifying treatments or cures. Efforts to understand, diagnose, and treat these conditions have been frustrated by disease complexity and heterogeneity. To address these challenges, we introduce CIPHER, a foundation model for neurological disease. CIPHER is a 578-million-parameter heterogeneous graph Transformer for neuroscientific discovery and precision medicine. It is trained on NeuroKG, a multiscale knowledge graph assembled from 36 large-scale biomedical databases and containing 147,020 nodes and 7,366,745 edges. To provide brain-specific context, NeuroKG integrates 2,480,956 neurons and 888,263 non-neuronal cells from a human brain single-cell RNA-sequencing atlas, as well as 387,483 nuclei from patients with Parkinson’s disease and matched controls. We evaluate CIPHER across a broad range of tasks, diseases, and datasets: a CRISPR/Cas9 essentiality screen in hPSC-derived dopamine neurons; GWAS and RVAS hits in Parkinson’s disease and Alzheimer’s disease; genome-wide and proteome-wide alpha-synuclein-related experimental screens, including overexpression and proximity labeling assays and a targeted exome screen in n = 496 synucleinopathy patients; pesticides toxic to patient-derived dopaminergic neurons; FDA-approved treatments for 25 neurological diseases; and cerebrospinal fluid proteomics from n = 956 subjects representing multiple genetic subtypes of Parkinson’s disease and healthy controls. CIPHER demonstrates strong performance in disease-associated gene and pesticide discovery, in silico experimentation, hypothesis generation, therapeutic prioritization, and disease subtyping, paving the way toward AI-driven neuroscientific discovery.