Person: Wait, Alexander
Search Results
Publication: Accurate Whole-Genome Sequencing and Haplotyping from 10 to 20 Human Cells
(Nature Publishing Group, 2012) Peters, Brock A.; Kermani, Bahram G.; Sparks, Andrew B.; Alferov, Oleg; Hong, Peter; Alexeev, Andrei; Jiang, Yuan; Dahl, Fredrik; Tang, Y. Tom; Haas, Juergen; Robasky, Kimberly; Lee, Je-Hyuk; Peterson, Joseph E.; Perazich, Helena; Yeung, George; Liu, Jia; Chen, Linsu; Kennemer, Michael I.; Pothuraju, Kaliprasad; Konvicka, Karel; Tsoupko-Sitnikov, Mike; Pant, Krishna P.; Ebert, Jessica C.; Nilsen, Geoffrey B.; Baccash, Jonathan; Halpern, Aaron L.; Church, George; Drmanac, Radoje; Wait, Alexander; Ball, Madeleine
Recent advances in whole genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, Long Fragment Read (LFR) technology, similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ~100 pg of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants (SNVs) were assembled into long haplotype contigs. Removal of false positive SNVs not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 Mb. Cost-effective and accurate genome sequencing and haplotyping from 10-20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications.
Publication: Swift: Primary Data Analysis for the Illumina Solexa Sequencing Platform
(Oxford University Press, 2009) Whiteford, Nava; Skelly, Tom; Curtis, Christina; Ritchie, Matt E.; Löhr, Andrea; Wait, Alexander; Abnizova, Irina; Brown, Clive
Motivation: Primary data analysis methods are of critical importance in second generation DNA sequencing. Improved methods have the potential to increase yield and reduce the error rates. Openly documented analysis tools enable the user to understand the primary data; this is important for the optimization and validity of their scientific work. Results: In this article, we describe Swift, a new tool for performing primary data analysis on the Illumina Solexa Sequencing Platform. Swift is the first tool, outside of the vendor's own software, that completes the full analysis process, from raw images through to base calls. As such it provides an alternative to, and independent validation of, the vendor-supplied tool. Our results show that Swift is able to increase yield by 13.8%, at a comparable error rate. Availability and Implementation: Swift is implemented in C++ and supported under Linux. It is supplied under an open source license (LGPL3), allowing researchers to build upon the platform. Swift is available from http://swiftng.sourceforge.net. Supplementary information: Supplementary data are available at Bioinformatics online.
Publication: A Survey of Genomic Traces Reveals a Common Sequencing Error, RNA Editing, and DNA Editing
(Public Library of Science, 2010) Wait, Alexander; Levanon, Erez Y.; Zecharia, Tomer; Clegg, Tom; Church, George
While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events is challenging. Here we use large genomic data sets, such as the two billion sequences in the NCBI Trace Archive, to look for clusters of mismatches of the same type, which are a hallmark of editing events caused by APOBEC3 and ADAR. We align 603,249,815 traces from the NCBI Trace Archive to their reference genomes. In clusters of mismatches of increasing size, at least one systematic sequencing error dominates the results (G-to-A). It is still present in mismatches with 99% accuracy and only vanishes in mismatches at 99.99% accuracy or higher. The error appears to have entered into about 1% of the HapMap, possibly affecting other users that rely on this resource. Further investigation, using stringent quality thresholds, uncovers thousands of mismatch clusters with no apparent defects in their chromatograms. These traces provide the first reported candidates of endogenous DNA editing in human, further elucidating RNA editing in human and mouse and also revealing, for the first time, extensive RNA editing in Xenopus tropicalis. We show that the NCBI Trace Archive provides a valuable resource for the investigation of the phenomena of DNA and RNA editing, as well as setting the stage for a comprehensive mapping of editing events in large-scale genomic datasets.