Publication: Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius
Open/View Files
Date
2010
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
Public Library of Science
The Harvard community has made this article openly available. Please share how this access benefits you.
Citation
Al-Swailem, Abdulaziz M., Maher M. Shehata, Faisel M. Abu-Duhier, Essam J. Al-Yamani, Khalid A. Al-Busadah, Mohammed S. Al-Arawi, Ali Y. Al-Khider, et al. 2010. Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius. PLoS ONE 5(5): e10720.
Research Data
Abstract
Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ~40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism.
Description
Other Available Sources
Keywords
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service