Person: Angelino, Elaine Lee
Last Name: Angelino
First Name: Elaine Lee
Name: Angelino, Elaine Lee
Search Results
Publication: Computational Caches (ACM Press, 2013)
Waterland, Amos; Angelino, Elaine Lee; Cubuk, Ekin; Kaxiras, Efthimios; Adams, Ryan Prescott; Appavoo, Jonathan; Seltzer, Margo
Caching is a well-known technique for speeding up computation. We cache data from file systems and databases; we cache dynamically generated code blocks; we cache page translations in TLBs. We propose to cache the act of computation, so that we can apply it later and in different contexts. We use a state-space model of computation to support such caching, involving two interrelated parts: speculatively memoized predicted/resultant state pairs that we use to accelerate sequential computation, and trained probabilistic models that we use to generate predicted states from which to speculatively execute. The key techniques that make this approach feasible are designing probabilistic models that automatically focus on regions of program execution state space in which prediction is tractable and identifying state space equivalence classes so that predictions need not be exact.

Publication: Accelerating Markov chain Monte Carlo via parallel predictive prefetching (2014-10-21)
Angelino, Elaine Lee; Seltzer, Margo I.; Adams, Ryan Prescott; Seltzer, Margo; Adams, Ryan; Kohler, Eddie
We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. This dissertation demonstrates that MCMC inference can be accelerated in a model of parallel computation that uses speculation to predict and complete computational work ahead of when it is known to be useful. By exploiting fast, iterative approximations to the target density, we can speculatively evaluate many potential future steps of the chain in parallel. In Bayesian inference problems, this approach can accelerate sampling from the target distribution, without compromising exactness, by exploiting subsets of data.
It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.

Publication: Flash Caching on the Storage Client (USENIX Association, 2013)
Holland, David A.; Angelino, Elaine Lee; Wald, Gideon; Seltzer, Margo
Flash memory has recently become popular as a caching medium. Most uses to date are on the storage server side. We investigate a different structure: flash as a cache on the client side of a networked storage environment. We use trace-driven simulation to explore the design space. We consider a wide range of configurations and policies to determine the potential client-side caches might offer and how best to arrange them. Our results show that the flash cache writeback policy does not significantly affect performance. Write-through is sufficient; this greatly simplifies cache consistency handling. We also find that the chief benefit of the flash cache is its size, not its persistence. Cache persistence offers additional performance benefits at system restart at essentially no runtime cost. Finally, for some workloads a large flash cache allows using minuscule amounts of RAM for file caching (e.g., 256 KB), leaving more memory available for application use.

Publication: A Composite of Multiple Signals Distinguishes Causal Variants in Regions of Positive Selection (American Association for the Advancement of Science, 2010)
Shlyakhter, Ilya; Karlsson, Elinor K; Byrne, Elizabeth; Morales, Shannon; Frieden, Gabriel; Hostetter, Elizabeth; Angelino, Elaine Lee; Garber, Manuel; Zuk, Or; Lander, Eric; Schaffner, Stephen; Sabeti, Pardis; Grossman, Sharon
The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown.
We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS not only identifies individual loci but also implicates precise variants selected by evolution.

Publication: Provenance Integration Requires Reconciliation (2011)
Angelino, Elaine Lee; Braun, Uri; Holland, David; Macko, Peter; Margo, Daniel; Seltzer, Margo
While there has been a great deal of research on provenance systems, there has been little discussion about challenges that arise when making different provenance systems interoperate. In fact, most of the literature focuses on provenance systems in isolation and does not discuss interoperability – what it means, its requirements, and how to achieve it. We designed the Provenance-Aware Storage System to be a general-purpose substrate on top of which it would be “easy” to add other provenance-aware systems in a way that would provide “seamless integration” for the provenance captured at each level. While the system did exactly what we wanted on toy problems, when we began integrating StarFlow, a Python-based workflow/provenance system, we discovered that integration is far trickier and more subtle than anyone has suggested in the literature. This work describes our experience undertaking the integration of StarFlow and PASS, identifying several important additions to existing provenance models necessary for interoperability among provenance systems.

Publication: StarFlow: A Script-Centric Data Analysis Environment (Springer, 2010)
Angelino, Elaine Lee; Yamins, Daniel Louis Kanef; Seltzer, Margo
We introduce StarFlow, a script-centric environment for data analysis.
StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe a range of real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.

Publication: Identification and Functional Validation of the Novel Antimalarial Resistance Locus PF10_0355 in Plasmodium falciparum (Public Library of Science, 2011)
Van Tyne, Daria; Park, Daniel John; Schaffner, Stephen; Neafsey, Daniel; Angelino, Elaine Lee; Cortese, Joseph F.; Barnes, Kayla G.; Rosen, David M.; Lukens, Amanda; Daniels, Rachel; Milner, Danny; Johnson, Charles A.; Shlyakhter, Ilya; Grossman, Sharon; Becker, Justin S.; Yamins, Daniel Louis Kanef; Karlsson, Elinor K; Ndiaye, Daouda; Sarr, Ousmane; Mboup, Souleymane; Happi, Christian; Furlotte, Nicholas A.; Eskin, Eleazar; Kang, Hyun Min; Hartl, Daniel; Birren, Bruce W.; Wiegand, Roger; Lander, Eric; Wirth, Dyann; Volkman, Sarah; Sabeti, Pardis
The Plasmodium falciparum parasite's ability to adapt to environmental pressures, such as the human immune system and antimalarial drugs, makes malaria an enduring burden to public health. Understanding the genetic basis of these adaptations is critical to intervening successfully against malaria. To that end, we created a high-density genotyping array that assays over 17,000 single nucleotide polymorphisms (~1 SNP/kb), and applied it to 57 culture-adapted parasites from three continents. We characterized genome-wide genetic diversity within and between populations and identified numerous loci with signals of natural selection, suggesting their role in recent adaptation.
In addition, we performed a genome-wide association study (GWAS), searching for loci correlated with resistance to thirteen antimalarials; we detected both known and novel resistance loci, including a new halofantrine resistance locus, PF10_0355. Through functional testing we demonstrated that PF10_0355 overexpression decreases sensitivity to halofantrine, mefloquine, and lumefantrine, but not to structurally unrelated antimalarials, and that increased gene copy number mediates resistance. Our GWAS and follow-on functional validation demonstrate the potential of genome-wide studies to elucidate functionally important loci in the malaria parasite genome.