dc.contributor.author: Wall, Dennis Paul
dc.contributor.author: Kudtarkar, Parul
dc.contributor.author: Fusaro, Vincent Alfred
dc.contributor.author: Pivovarov, Rimma
dc.contributor.author: Patil, Prasad
dc.contributor.author: Tonellato, Peter J
dc.date.accessioned: 2012-03-29T19:29:31Z
dc.date.issued: 2010
dc.identifier.citation: Wall, Dennis P., Parul Kudtarkar, Vincent A. Fusaro, Rimma Pivovarov, Prasad Patil, and Peter J. Tonellato. 2010. Cloud computing for comparative genomics. BMC Bioinformatics 11:259. [en_US]
dc.identifier.issn: 1471-2105 [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:8462974
dc.description.abstract: Background: Large comparative genomics studies and tools are becoming increasingly compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with this increase, especially as the breadth of questions continues to grow. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Compute Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results: We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high-capacity compute nodes using the Amazon Web Services Elastic MapReduce and included a wide mix of large and small genomes. The total computation took just under 70 hours and cost $6,302 USD. Conclusions: The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provide a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: BioMed Central [en_US]
dc.relation.isversionof: http://www.biomedcentral.com/1471-2105/11/259 [en_US]
dc.relation.hasversion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098063/pdf/ [en_US]
dash.license: LAA
dc.title: Cloud Computing for Comparative Genomics [en_US]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en_US]
dc.relation.journal: BMC Bioinformatics [en_US]
dash.depositing.author: Wall, Dennis Paul
dc.date.available: 2012-03-29T19:29:31Z
dash.affiliation.other: HMS^Center for Biomedical Informatics at Countway [en_US]
dash.affiliation.other: HMS^Pediatrics-Children's Hospital [en_US]
dc.identifier.doi: 10.1186/1471-2105-11-259
dash.contributor.affiliated: Fusaro, Vincent Alfred
dash.contributor.affiliated: Wall, Dennis Paul
dash.contributor.affiliated: Tonellato, Peter

