Person: Marks, Debora
Last Name: Marks
First Name: Debora
Name: Marks, Debora
22 results
Search Results
Now showing 1 - 10 of 22
Publication: Disease variant prediction with deep generative models of evolutionary data
(Springer Science and Business Media LLC, 2021-10-27) Frazer, Jonathan; Notin, Pascal; Bernardes Pereira De, Mafalda; Gomez, Aidan; Min, Joseph; Brock, Kelly; Gal, Yarin; Marks, Debora

Publication: Learning from prepandemic data to forecast viral escape
(Springer Science and Business Media LLC, 2023-10-11) Thadani, Nicole; Gurev, Sarah; Notin, Pascal; Youssef, Noor; Rollins, Nathan; Ritter, Daniel; Sander, Chris; Gal, Yarin; Marks, Debora
Summary: Effective pandemic preparedness relies on anticipating viral mutations that are able to evade host immune responses in order to facilitate vaccine and therapeutic design. However, current strategies for viral evolution prediction are not available early in a pandemic – experimental approaches require host polyclonal antibodies to test against and existing computational methods draw heavily from current strain prevalence to make reliable predictions of variants of concern. To address this, we developed EVEscape, a generalizable, modular framework that combines fitness predictions from a deep learning model of historical sequences with biophysical structural information. EVEscape quantifies the viral escape potential of mutations at scale and has the advantage of being applicable before surveillance sequencing, experimental scans, or 3D structures of antibody complexes are available. We demonstrate that EVEscape, trained on sequences available prior to 2020, is as accurate as high-throughput experimental scans at anticipating pandemic variation for SARS-CoV-2 and is generalizable to other viruses including Influenza, HIV, and understudied viruses with pandemic potential such as Lassa and Nipah.
We provide continually updated escape scores for all current strains of SARS-CoV-2 and predict likely additional mutations to forecast emerging strains as a tool for ongoing vaccine development (evescape.org).

Publication: Sequence co-evolution gives 3D contacts and structures of protein complexes
(eLife Sciences Publications, Ltd, 2014) Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora
Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001

Publication: Genetic variation in human drug-related genes
(BioMed Central, 2017) Schärfe, Charlotta Pauline Irmgard; Tremmel, Roman; Schwab, Matthias; Kohlbacher, Oliver; Marks, Debora
Background: Variability in drug efficacy and adverse effects are observed in clinical practice.
While the extent of genetic variability in classic pharmacokinetic genes is rather well understood, the role of genetic variation in drug targets is typically less studied. Methods: Based on 60,706 human exomes from the ExAC dataset, we performed an in-depth computational analysis of the prevalence of functional variants in 806 drug-related genes, including 628 known drug targets. We further computed the likelihood of 1236 FDA-approved drugs to be affected by functional variants in their targets in the whole ExAC population as well as different geographic sub-populations. Results: We find that most genetic variants in drug-related genes are very rare (f < 0.1%) and thus will likely not be observed in clinical trials. Furthermore, we show that patient risk varies for many drugs and with respect to geographic ancestry. A focused analysis of oncological drug targets indicates that the probability of a patient carrying germline variants in oncological drug targets is, at 44%, high enough to suggest that not only somatic alterations but also germline variants carried over into the tumor genome could affect the response to antineoplastic agents. Conclusions: This study indicates that even though many variants are very rare and thus likely not observed in clinical trials, four in five patients are likely to carry a variant with possibly functional effects in a target for commonly prescribed drugs. Such variants could potentially alter drug efficacy. 
Electronic supplementary material: The online version of this article (doi:10.1186/s13073-017-0502-5) contains supplementary material, which is available to authorized users.

Publication: FreeContact: fast and free software for protein contact prediction from residue co-evolution
(BioMed Central, 2014) Kaján, László; Hopf, Thomas; Kalaš, Matúš; Marks, Debora; Rost, Burkhard
Background: 20 years of improved technology and growing sequences now renders residue-residue contact constraints in large protein families through correlated mutations accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. On top, EVfold-mfDCA depends on proprietary software. Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library “libfreecontact”, complete with command line tool “freecontact”, as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.
Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).

Publication: Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models
(Public Library of Science, 2015) Stein, Richard R.; Marks, Debora; Sander, Chris
Maximum entropy-based inference methods have been successfully used to infer direct interactions from biological datasets such as gene expression data or sequence ensembles. Here, we review undirected pairwise maximum-entropy probability models in two categories of data types, those with continuous and categorical random variables. As a concrete example, we present recently developed inference methods from the field of protein contact prediction and show that a basic set of assumptions leads to similar solution strategies for inferring the model parameters in both variable types. These parameters reflect interactive couplings between observables, which can be used to predict global properties of the biological system. Such methods are applicable to the important problems of protein 3-D structure prediction and association of gene–gene networks, and they enable potential applications to the analysis of gene alteration patterns and to protein design.

Publication: Structure, Dynamics and Implied Gating Mechanism of a Human Cyclic Nucleotide-Gated Channel
(Public Library of Science, 2014) Gofman, Yana; Schärfe, Charlotta; Marks, Debora; Haliloglu, Turkan; Ben-Tal, Nir
Cyclic nucleotide-gated (CNG) ion channels are nonselective cation channels, essential for visual and olfactory sensory transduction. Although the channels include voltage-sensor domains (VSDs), their conductance is thought to be independent of the membrane potential, and their gating regulated by cytosolic cyclic nucleotide–binding domains. Mutations in these channels result in severe, degenerative retinal diseases, which remain untreatable.
The lack of structural information on CNG channels has prevented mechanistic understanding of disease-causing mutations, precluded structure-based drug design, and hampered in silico investigation of the gating mechanism. To address this, we built a 3D model of the cone tetrameric CNG channel, based on homology to two distinct templates with known structures: the transmembrane (TM) domain of a bacterial channel, and the cyclic nucleotide-binding domain of the mouse HCN2 channel. Since the TM-domain template had low sequence-similarity to the TM domains of the CNG channels, and to reconcile conflicts between the two templates, we developed a novel, hybrid approach, combining homology modeling with evolutionary coupling constraints. Next, we used elastic network analysis of the model structure to investigate global motions of the channel and to elucidate its gating mechanism. We found the following: (i) In the main mode of motion, the TM and cytosolic domains counter-rotated around the membrane normal. We related this motion to gating, a proposition that is supported by previous experimental data, and by comparison to the known gating mechanism of the bacterial KirBac channel. (ii) The VSDs could facilitate gating (supplementing the pore gate), explaining their presence in such ‘voltage-insensitive’ channels. (iii) Our elastic network model analysis of the CNGA3 channel supports a modular model of allosteric gating, according to which protein domains are quasi-independent: they can move independently, but are coupled to each other allosterically.

Publication: PconsFold: improved contact predictions improve protein models
(Oxford University Press, 2014) Michel, Mirco; Hayat, Sikander; Skwark, Marcin J.; Sander, Chris; Marks, Debora; Elofsson, Arne
Motivation: Recently it has been shown that the quality of protein contact prediction from evolutionary information can be improved significantly if direct and indirect information is separated.
Given sufficiently large protein families, the contact predictions contain sufficient information to predict the structure of many protein families. However, since the first studies contact prediction methods have improved. Here, we ask how much the final models are improved if improved contact predictions are used. Results: In a small benchmark of 15 proteins, we show that the TM-scores of top-ranked models are improved by on average 33% using PconsFold compared with the original version of EVfold. In a larger benchmark, we find that the quality is improved with 15–30% when using PconsC in comparison with earlier contact prediction methods. Further, using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved. Availability: PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information. PconsFold is based on PconsC contact prediction and uses the Rosetta folding protocol. Due to its modularity, the contact prediction tool can be easily exchanged. The source code of PconsFold is available on GitHub at https://www.github.com/ElofssonLab/pcons-fold under the MIT license. PconsC is available from http://c.pcons.net/. Contact: arne@bioinfo.se Supplementary information: Supplementary data are available at Bioinformatics online.

Publication: Antiparallel protocadherin homodimers use distinct affinity- and specificity-mediating regions in cadherin repeats 1-4
(eLife Sciences Publications, Ltd, 2016) Nicoludis, John M.; Vogt, Bennett; Green, Anna; Schärfe, Charlotta PI; Marks, Debora; Gaudet, Rachelle
Protocadherins (Pcdhs) are cell adhesion and signaling proteins used by neurons to develop and maintain neuronal networks, relying on trans homophilic interactions between their extracellular cadherin (EC) repeat domains.
We present the structure of the antiparallel EC1-4 homodimer of human PcdhγB3, a member of the γ subfamily of clustered Pcdhs. Structure and sequence comparisons of α, β, and γ clustered Pcdh isoforms illustrate that subfamilies encode specificity in distinct ways through diversification of loop region structure and composition in EC2 and EC3, which contains isoform-specific conservation of primarily polar residues. In contrast, the EC1/EC4 interface comprises hydrophobic interactions that provide non-selective dimerization affinity. Using sequence coevolution analysis, we found evidence for a similar antiparallel EC1-4 interaction in non-clustered Pcdh families. We thus deduce that the EC1-4 antiparallel homodimer is a general interaction strategy that evolved before the divergence of these distinct protocadherin families. DOI: http://dx.doi.org/10.7554/eLife.18449.001

Publication: Core Genes Evolve Rapidly in the Long-Term Evolution Experiment with Escherichia coli
(Oxford University Press, 2017) Maddamsetti, Rohan; Hatcher, Philip J.; Green, Anna; Williams, Barry L.; Marks, Debora; Lenski, Richard E.
Bacteria can evolve rapidly under positive selection owing to their vast numbers, allowing their genes to diversify by adapting to different environments. We asked whether the same genes that evolve rapidly in the long-term evolution experiment (LTEE) with Escherichia coli have also diversified extensively in nature. To make this comparison, we identified ∼2000 core genes shared among 60 E. coli strains. During the LTEE, core genes accumulated significantly more nonsynonymous mutations than flexible (i.e., noncore) genes. Furthermore, core genes under positive selection in the LTEE are more conserved in nature than the average core gene. In some cases, adaptive mutations appear to modify protein functions, rather than merely knocking them out. The LTEE conditions are novel for E. coli, at least in relation to its evolutionary history in nature.
The constancy and simplicity of the environment likely favor the complete loss of some unused functions and the fine-tuning of others.