Person:

Galagan, James E.

Search Results

Now showing 1 - 10 of 17
  • Publication

    Large-scale identification of genetic design strategies using local search

    (Nature Publishing Group, 2009) Lun, Desmond S; Rockwell, G; Guido, Nicholas; Baym, Michael; Kelner, Jonathan A; Berger, Bonnie; Galagan, James E.; Church, George

    In the past decade, computational methods have been shown to be well suited to unraveling the complex web of metabolic reactions in biological systems. Methods based on flux–balance analysis (FBA) and bi-level optimization have been used to great effect in aiding metabolic engineering. These methods predict the result of genetic manipulations and allow for the best set of manipulations to be found computationally. Bi-level FBA is, however, limited in applicability because the required computational time and resources scale poorly as the size of the metabolic system and the number of genetic manipulations increase. To overcome these limitations, we have developed Genetic Design through Local Search (GDLS), a scalable, heuristic, algorithmic method that employs an approach based on local search with multiple search paths, which results in effective, low-complexity search of the space of genetic manipulations. Thus, GDLS is able to find genetic designs with greater in silico production of desired metabolites than can feasibly be found using a globally optimal search and performs favorably in comparison with heuristic searches based on evolutionary algorithms and simulated annealing.

  • Publication

    Comparative Genomic Characterization of Francisella tularensis Strains Belonging to Low and High Virulence Subspecies

    (Public Library of Science, 2009) Champion, Mia D.; Zeng, Qiandong; Nix, Eli B.; Nano, Francis E.; Keim, Paul; Kodira, Chinnappa D.; Koehrsen, Michael; Pearson, Matthew; Howarth, Clint; Larson, Lisa; White, Jared; Alvarado, Lucia; Forsman, Mats; Bearden, Scott W.; Sjöstedt, Anders; Titball, Richard; Michell, Stephen L.; Birren, Bruce; Borowsky, Mark L; Young, Sarah; Engels, Reinhard; Galagan, James E.

    Tularemia is a geographically widespread, severely debilitating, and occasionally lethal disease in humans. It is caused by infection by a gram-negative bacterium, Francisella tularensis. In order to better understand its potency as an etiological agent as well as its potential as a biological weapon, we have completed draft assemblies and report the first complete genomic characterization of five strains belonging to the following different Francisella subspecies (subsp.): the F. tularensis subsp. tularensis FSC033, F. tularensis subsp. holarctica FSC257 and FSC022, and F. tularensis subsp. novicida GA99-3548 and GA99-3549 strains. Here, we report the sequencing of these strains and comparative genomic analysis with recently available public Francisella sequences, including the rare F. tularensis subsp. mediasiatica FSC147 strain isolate from the Central Asian Region. We report evidence for the occurrence of large-scale rearrangement events in strains of the holarctica subspecies, supporting previous proposals that further phylogenetic subdivisions of the Type B clade are likely. We also find a significant enrichment of disrupted or absent ORFs proximal to predicted breakpoints in the FSC022 strain, including a genetic component of the Type I restriction-modification defense system. Many of the pseudogenes identified are also disrupted in the closely related rarely human pathogenic F. tularensis subsp. mediasiatica FSC147 strain, including modulator of drug activity B (mdaB) (FTT0961), which encodes a known NADPH quinone reductase involved in oxidative stress resistance. We have also identified genes exhibiting sequence similarity to effectors of the Type III (T3SS) and components of the Type IV secretion systems (T4SS). One of the genes, msrA2 (FTT1797c), is disrupted in F. tularensis subsp. mediasiatica and has recently been shown to mediate bacterial pathogen survival in host organisms. 
    Our findings suggest that in addition to the duplication of the Francisella Pathogenicity Island, and acquisition of individual loci, adaptation by gene loss in the more recently emerged tularensis, holarctica, and mediasiatica subspecies occurred and was distinct from evolutionary events that differentiated these subspecies, and the novicida subspecies, from a common ancestor. Our findings are applicable to future studies focused on variations in Francisella subspecies pathogenesis, and of broader interest to studies of genomic pathoadaptation in bacteria.

  • Publication

    Genomic Analysis of the Basal Lineage Fungus Rhizopus oryzae Reveals a Whole-Genome Duplication

    (Public Library of Science, 2009) Ma, Li-Jun; Ibrahim, Ashraf S.; Skory, Christopher; Grabherr, Manfred G.; Burger, Gertraud; Butler, Margi; Elias, Marek; Idnurm, Alexander; Lang, B. Franz; Sone, Teruo; Abe, Ayumi; Corrochano, Luis M.; Fu, Jianmin; Hansberg, Wilhelm; Kim, Jung-Mi; Kodira, Chinnappa D.; Koehrsen, Michael J.; Miranda-Saavedra, Diego; O'Leary, Sinead; Ortiz-Castellanos, Lucila; Poulter, Russell; Rodriguez-Romero, Julio; Ruiz-Herrera, José; Shen, Yao-Qing; Zeng, Qiandong; Birren, Bruce W.; Cuomo, Christina A.; Wickes, Brian L.; Calvo, Sarah; Engels, Reinhard; Galagan, James E.; Liu, Bo

    Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a representative of the paraphyletic basal group of the fungal kingdom called “zygomycetes,” R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99–880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin–proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14α-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments.

  • Publication

    Short-Term Genome Evolution of Listeria monocytogenes in a Non-Controlled Environment

    (BioMed Central, 2008) Orsi, Renato H; Lauer, Peter; Nusbaum, Chad; Birren, Bruce W; Ivy, Reid A; Graves, Lewis M; Swaminathan, Bala; Wiedmann, Martin; Borowsky, Mark L; Young, Sarah K.; Galagan, James E.; Sun, Qi

    Background: While increasing data on bacterial evolution in controlled environments are available, our understanding of bacterial genome evolution in natural environments is limited. We thus performed full genome analyses on four Listeria monocytogenes isolates, including human and food isolates from both a 1988 case of sporadic listeriosis and a 2000 listeriosis outbreak, which had been linked to contaminated food from a single processing facility. All four isolates had been shown to have identical subtypes, suggesting that a specific L. monocytogenes strain persisted in this processing plant over at least 12 years. While a genome sequence for the 1988 food isolate has been reported, we sequenced the genomes of the 1988 human isolate as well as a human and a food isolate from the 2000 outbreak to allow for comparative genome analyses. Results: The two L. monocytogenes isolates from 1988 and the two isolates from 2000 had highly similar genome backbone sequences with very few single nucleotide (nt) polymorphisms (1–8 SNPs/isolate; confirmed by re-sequencing). While no genome rearrangements were identified in the backbone genome of the four isolates, a 42 kb prophage inserted in the chromosomal comK gene showed evidence for major genome rearrangements. The human-food isolate pairs from 1988 and from 2000 each had identical prophage sequences; however, there were significant differences in the prophage sequences between the 1988 and 2000 isolates. Diversification of this prophage appears to have been caused by multiple homologous recombination events or possibly prophage replacement. In addition, only the 2000 human isolate contained a plasmid, suggesting plasmid loss or acquisition events. Surprisingly, besides the polymorphisms found in the comK prophage, a single SNP in the tRNA Thr-4 prophage represents the only SNP that differentiates the 1988 isolates from the 2000 isolates.
    Conclusion: Our data support the hypothesis that the 2000 human listeriosis outbreak was caused by an L. monocytogenes strain that persisted in a food processing facility over 12 years and show that genome sequencing is a valuable and feasible tool for retrospective epidemiological analyses. Short-term evolution of L. monocytogenes in non-controlled environments appears to involve limited diversification beyond plasmid gain or loss and prophage diversification, highlighting the importance of phages in bacterial evolution.

  • Publication

    Inferring Carbon Sources from Gene Expression Profiles Using Metabolic Flux Models

    (Public Library of Science, 2012) Brandes, Aaron; Lun, Desmond S.; Ip, Kuhn; Zucker, Jeremy Daniel Hofeld; Colijn, Caroline; Weiner, Brian; Galagan, James E.

    Background: Bacteria have evolved the ability to efficiently and resourcefully adapt to changing environments. A key means by which they optimize their use of available nutrients is through adjustments in gene expression with consequent changes in enzyme activity. We report a new method for drawing environmental inferences from gene expression data. Our method prioritizes a list of candidate carbon sources for their compatibility with a gene expression profile using the framework of flux balance analysis to model the organism’s metabolic network. Principal Findings: For each of six gene expression profiles for Escherichia coli grown under differing nutrient conditions, we applied our method to prioritize a set of eighteen different candidate carbon sources. Our method ranked the correct carbon source as one of the top three candidates for five of the six expression sets when used with a genome-scale model. The correct candidate ranked fifth in the remaining case. Additional analyses show that these rankings are robust with respect to biological and measurement variation, and depend on specific gene expression, rather than general expression level. The gene expression profiles are highly adaptive: simulated production of biomass averaged 94.84% of maximum when the in silico carbon source matched the in vitro source of the expression profile, and 65.97% when it did not. Conclusions: Inferences about a microorganism’s nutrient environment can be made by integrating gene expression data into a metabolic framework. This work demonstrates that reaction flux limits for a model can be computed which are realistic in the sense that they affect in silico growth in a manner analogous to that in which a microorganism’s alteration of gene expression is adaptive to its nutrient environment.

  • Publication

    Comparative analysis of Mycobacterium and related Actinomycetes yields insight into the evolution of Mycobacterium tuberculosis pathogenesis

    (BioMed Central, 2012) Weiner, Brian; Raman, Sahadevan; Dolganov, Gregory; Peterson, Matthew; Riley, Robert; Abeel, Thomas; White, Jared; Sisk, Peter; Stolte, Christian; Koehrsen, Mike; Yamamoto, Robert T; Iacobelli-Martinez, Milena; Kidd, Matthew J; Maer, Andreia M; Schoolnik, Gary K; Regev, Aviv; McGuire, Abigail Manson; Park, Sang T.; Wapinski, Ilan; Zucker, Jeremy Daniel Hofeld; Galagan, James E.

    Background: The sequence of the pathogen Mycobacterium tuberculosis (Mtb) strain H37Rv has been available for over a decade, but the biology of the pathogen remains poorly understood. Genome sequences from other Mtb strains and closely related bacteria present an opportunity to apply the power of comparative genomics to understand the evolution of Mtb pathogenesis. We conducted a comparative analysis using 31 genomes from the Tuberculosis Database (TBDB.org), including 8 strains of Mtb and M. bovis, 11 additional Mycobacteria, 4 Corynebacteria, 2 Streptomyces, Rhodococcus jostii RHA1, Nocardia farcinica, Acidothermus cellulolyticus, Rhodobacter sphaeroides, Propionibacterium acnes, and Bifidobacterium longum. Results: Our results highlight the functional importance of lipid metabolism and its regulation, and reveal variation between the evolutionary profiles of genes implicated in saturated and unsaturated fatty acid metabolism. They also suggest that DNA repair and molybdopterin cofactors are important in pathogenic Mycobacteria. By analyzing sequence conservation and gene expression data, we identify nearly 400 conserved noncoding regions. These include 37 predicted promoter regulatory motifs, of which 14 correspond to previously validated motifs, as well as 50 potential noncoding RNAs, of which we experimentally confirm the expression of four. Conclusions: Our analysis of protein evolution highlights gene families that are associated with the adaptation of environmental Mycobacteria to obligate pathogenesis. These families include fatty acid metabolism, DNA repair, and molybdopterin biosynthesis. Our analysis reinforces recent findings suggesting that small noncoding RNAs are more common in Mycobacteria than previously expected. Our data provide a foundation for understanding the genome and biology of Mtb in a comparative context, and are available online and through TBDB.org.

  • Publication

    Interpreting Expression Data with Metabolic Flux Models: Predicting Mycobacterium tuberculosis Mycolic Acid Production

    (Public Library of Science, 2009) Colijn, Caroline; Brandes, Aaron; Zucker, Jeremy Daniel Hofeld; Lun, Desmond S.; Weiner, Brian; Farhat, Maha; Cheng, Tan-Yun; Moody, David; Murray, Megan; Galagan, James E.

    Metabolism is central to cell physiology, and metabolic disturbances play a role in numerous disease states. Despite its importance, the ability to study metabolism at a global scale using genomic technologies is limited. In principle, complete genome sequences describe the range of metabolic reactions that are possible for an organism, but cannot quantitatively describe the behaviour of these reactions. We present a novel method for modeling metabolic states using whole cell measurements of gene expression. Our method, which we call E-Flux (as a combination of flux and expression), extends the technique of Flux Balance Analysis by modeling maximum flux constraints as a function of measured gene expression. In contrast to previous methods for metabolically interpreting gene expression data, E-Flux utilizes a model of the underlying metabolic network to directly predict changes in metabolic flux capacity. We applied E-Flux to Mycobacterium tuberculosis, the bacterium that causes tuberculosis (TB). Key components of mycobacterial cell walls are mycolic acids which are targets for several first-line TB drugs. We used E-Flux to predict the impact of 75 different drugs, drug combinations, and nutrient conditions on mycolic acid biosynthesis capacity in M. tuberculosis, using a public compendium of over 400 expression arrays. We tested our method using a model of mycolic acid biosynthesis as well as on a genome-scale model of M. tuberculosis metabolism. Our method correctly predicts seven of the eight known fatty acid inhibitors in this compendium and makes accurate predictions regarding the specificity of these compounds for fatty acid biosynthesis. Our method also predicts a number of additional potential modulators of TB mycolic acid biosynthesis. E-Flux thus provides a promising new approach for algorithmically predicting metabolic state from gene expression data.

  • Publication

    Independent Large Scale Duplications in Multiple M. tuberculosis Lineages Overlapping the Same Genomic Region

    (Public Library of Science, 2012) Weiner, Brian; Victor, Thomas C.; Warren, Robert M.; Plikaytis, Bonnie B.; Posey, James E.; van Helden, Paul D.; Gey van Pittius, Nicolaas C.; Koehrsen, Michael; Sisk, Peter; Stolte, Christian; White, Jared; Gagneux, Sebastien; Birren, Bruce; Gomez, James; Sloutsky, Alexander; Hung, Deborah; Murray, Megan; Galagan, James E.

    Mycobacterium tuberculosis, the causative agent of most human tuberculosis, infects one third of the world's population and kills an estimated 1.7 million people a year. With the world-wide emergence of drug resistance, and the finding of more functional genetic diversity than previously expected, there is a renewed interest in understanding the forces driving genome evolution of this important pathogen. Genetic diversity in M. tuberculosis is dominated by single nucleotide polymorphisms and small-scale gene deletion, with little or no evidence for the large-scale genome rearrangements seen in other bacteria. Recently, a single report described a large-scale genome duplication that was suggested to be specific to the Beijing lineage. We report here multiple independent large-scale duplications of the same genomic region of M. tuberculosis detected through whole-genome sequencing. The duplications occur in strains belonging to both M. tuberculosis lineages 2 and 4, and are thus not limited to Beijing strains. The duplications occur in both drug-resistant and drug-susceptible strains. The duplicated regions also have substantially different boundaries in different strains, indicating different originating duplication events. We further identify a smaller segmental duplication of a different genomic region of a lab strain of H37Rv. The presence of multiple independent duplications of the same genomic region suggests either instability in this region, a selective advantage conferred by the duplication, or both. The identified duplications suggest that large-scale gene duplication may be more common in M. tuberculosis than previously considered.

  • Publication

    How Accurate Can Genetic Predictions Be?

    (BioMed Central, 2012) Dreyfuss, Jonathan M; Levner, D; Galagan, James E.; Church, George; Ramoni, Marco F

    Background: Pre-symptomatic prediction of disease and drug response based on genetic testing is a critical component of personalized medicine. Previous work has demonstrated that the predictive capacity of genetic testing is constrained by the heritability and prevalence of the tested trait, although these constraints have only been approximated under the assumption of a normally distributed genetic risk distribution. Results: Here, we mathematically derive the absolute limits that these factors impose on test accuracy in the absence of any distributional assumptions on risk. We present these limits in terms of the best-case receiver-operating characteristic (ROC) curve, consisting of the best-case test sensitivities and specificities, and the AUC (area under the curve) measure of accuracy. We apply our method to genetic prediction of type 2 diabetes and breast cancer, and we additionally show the best possible accuracy that can be obtained from integrated predictors, which can incorporate non-genetic features. Conclusion: Knowledge of such limits is valuable in understanding the implications of genetic testing even before additional associations are identified.

  • Publication

    Resource Competition May Lead to Effective Treatment of Antibiotic Resistant Infections

    (Public Library of Science, 2013) Gomes, Antonio L. C.; Galagan, James E.; Segrè, Daniel

    Drug resistance is a common problem in the fight against infectious diseases. Recent studies have shown conditions (which we call antiR) that select against resistant strains. However, no specific drug administration strategies based on this property exist yet. Here, we mathematically compare growth of resistant versus sensitive strains under different treatments (no drugs, antibiotic, and antiR), and show how a precisely timed combination of treatments may help defeat resistant strains. Our analysis is based on a previously developed model of infection and immunity in which a costly plasmid confers antibiotic resistance. As expected, antibiotic treatment increases the frequency of the resistant strain, while the plasmid cost causes a reduction of resistance in the absence of antibiotic selection. Our analysis suggests that this reduction occurs under competition for limited resources. Based on this model, we estimate treatment schedules that would lead to a complete elimination of both sensitive and resistant strains. In particular, we derive an analytical expression for the rate of resistance loss, and hence for the time necessary to turn a resistant infection into a sensitive one (tclear). This time depends on the experimentally measurable rates of pathogen division, growth and plasmid loss. We then estimated tclear for a specific case, using available empirical data, and found that resistance may be lost up to 15 times faster under antiR treatment when compared to a no-treatment regime. This strategy may be particularly suitable to treat chronic infections. Finally, our analysis suggests that accounting explicitly for a resistance-decaying rate may drastically change predicted outcomes in host-population models.