Person: Taylor, Bradley Ryan
Search Results
Publication: An Analytical Method for Inferring Library Identity From Illumina NGS Read Groups
(2017-04-04) Taylor, Bradley Ryan; Denkin, Steven M.; Farjoun, Yossi
When working with genetic sequence data, it is important to know the individual, sample, and sequencing library from which the data derives. While multiple software packages exist that can detect when data has been swapped between samples, there are presently no methods to detect library mislabeling. Such mislabellings can negatively impact downstream analyses. Here, we present a tool for reconstructing library relationships from aligned sequence read groups. The basic approach relies on quantifying the similarity of read groups based on their distributions of duplicated insert molecules. Several similarity measures were considered. Library-identity decisions based on similarity-modeling resulted in >91% sensitivity for identifying pairs of read groups from the same library and >98% specificity for identifying pairs from different libraries. Further improvements were seen through unbiased clustering of read groups. Without analytical methods to detect library misidentification, there is no way to know how pervasive this problem is in sequencing data sets. The present tool addresses this unmet need, and provides researchers with unique insight into their data’s chain of evidence.