Knowledge-Based Reconstruction of mRNA Transcripts with Short Sequencing Reads for Transcriptome Research

dc.contributor.author Seok, Junhee
dc.contributor.author Xu, Weihong
dc.contributor.author Jiang, Hui
dc.contributor.author Davis, Ronald W.
dc.contributor.author Xiao, Wenzhong
dc.date.accessioned 2012-03-28T21:02:41Z
dc.date.issued 2012
dc.identifier.citation Seok, Junhee, Weihong Xu, Hui Jiang, Ronald W. Davis, and Wenzhong Xiao. 2012. Knowledge-based reconstruction of mRNA transcripts with short sequencing reads for transcriptome research. PLoS ONE 7(2): e31440. en_US
dc.identifier.issn 1932-6203 en_US
dc.identifier.uri http://nrs.harvard.edu/urn-3:HUL.InstRepos:8461900
dc.description.abstract While most transcriptome analyses in high-throughput clinical studies focus on gene-level expression, the existence of alternative isoforms of gene transcripts is a major source of the diversity in the biological functionalities of the human genome. It is, therefore, essential to annotate isoforms of gene transcripts for genome-wide transcriptome studies. Recently developed mRNA sequencing technology presents an unprecedented opportunity to discover new forms of transcripts, and at the same time brings bioinformatic challenges due to its short read length and incomplete coverage of the transcripts. In this work, we proposed a computational approach to reconstruct new mRNA transcripts from short sequencing reads with reference information of known transcripts in existing databases. The prior knowledge helped to define exon boundaries and fill in the transcript regions not covered by sequencing data. This approach was demonstrated using a deep sequencing data set of human muscle tissue with transcript annotations in RefSeq as prior knowledge. We identified 2,973 junctions, 7,471 exons, and 7,571 transcripts not previously annotated in RefSeq. Of these new transcripts, 73% were supported by UCSC Known Genes, Ensembl, or EST transcript annotations. In addition, the reconstructed transcripts were much longer than those from de novo approaches that assume no prior knowledge. These previously unannotated transcripts can be integrated with known transcript annotations to improve both the design of microarrays and the follow-up analyses of isoform expression. The overall results demonstrated that incorporating transcript annotations from genomic databases significantly helps the reconstruction of novel transcripts from short sequencing reads for transcriptome research. en_US
dc.language.iso en_US en_US
dc.publisher Public Library of Science en_US
dc.relation.isversionof doi:10.1371/journal.pone.0031440 en_US
dc.relation.hasversion http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3270033/pdf/ en_US
dash.license LAA
dc.subject biology en_US
dc.subject biochemistry en_US
dc.subject nucleic acids en_US
dc.subject biophysics en_US
dc.subject computational biology en_US
dc.subject genomics en_US
dc.subject genome analysis tools en_US
dc.subject molecular genetics en_US
dc.subject genetics en_US
dc.subject molecular cell biology en_US
dc.title Knowledge-Based Reconstruction of mRNA Transcripts with Short Sequencing Reads for Transcriptome Research en_US
dc.type Journal Article en_US
dc.description.version Version of Record en_US
dc.relation.journal PLoS ONE en_US
dash.depositing.author Xiao, Wenzhong
dc.date.available 2012-03-28T21:02:41Z

Files in this item

File Size Format
3270033.pdf 394.3Kb PDF
