| Title: | Evaluation of a Large-Scale Biomedical Data Annotation Initiative |
| Author: | Pitzer, Erik; Hinske, Christian; Galante, Pedro; Ohno-Machado, Lucila; Lacson, Ronilda C. Note: Order does not necessarily reflect citation order of authors. |
| Citation: | Lacson, Ronilda, Erik Pitzer, Christian Hinske, Pedro Galante, and Lucila Ohno-Machado. 2009. Evaluation of a large-scale biomedical data annotation initiative. BMC Bioinformatics 10(Suppl 9): S10. |
| Full Text & Related Files: | 2745681.pdf (243.4Kb; PDF) |
| Abstract: | Background: This study describes a large-scale manual re-annotation of data samples in the Gene Expression Omnibus (GEO), using variables and values derived from the National Cancer Institute thesaurus. A framework is described for creating an annotation scheme for various diseases that is flexible, comprehensive, and scalable. The annotation structure is evaluated by measuring coverage and agreement between annotators. Results: There were 12,500 samples annotated with approximately 30 variables, in each of six disease categories – breast cancer, colon cancer, inflammatory bowel disease (IBD), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), and Type 1 diabetes mellitus (DM). The annotators provided excellent variable coverage, with known values for over 98% of three critical variables: disease state, tissue, and sample type. There was 89% strict inter-annotator agreement and 92% agreement when using semantic and partial similarity measures. Conclusion: We show that it is possible to perform manual re-annotation of a large repository in a reliable manner. |
| Published Version: | doi:10.1186/1471-2105-10-S9-S10 |
| Other Sources: | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745681/pdf/ |
| Terms of Use: | This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA |
| Citable link to this page: | http://nrs.harvard.edu/urn-3:HUL.InstRepos:4931095 |