Publication:
ClinVar data parsing

Date

2017

Publisher

F1000Research

Citation

Zhang, Xiaolei, Eric V. Minikel, Anne H. O'Donnell-Luria, Daniel G. MacArthur, James S. Ware, and Ben Weisburd. 2017. “ClinVar data parsing.” Wellcome Open Research 2 (1): 33. doi:10.12688/wellcomeopenres.11640.1. http://dx.doi.org/10.12688/wellcomeopenres.11640.1.

Abstract

This software repository provides a pipeline for converting raw ClinVar data files into analysis-friendly tab-delimited tables, and also provides these tables for the most recent ClinVar release. Separate tables are generated for genome builds GRCh37 and GRCh38, as well as for mono-allelic variants and complex multi-allelic variants. Additionally, the tables are augmented with allele frequencies from the ExAC and gnomAD datasets, as these are often consulted when analyzing ClinVar variants. Overall, this work provides ClinVar data in a format that is easier to work with and can be loaded directly into a variety of popular analysis tools such as R, Python pandas, and SQL databases.
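As a rough illustration of the loading step the abstract mentions, the sketch below reads a small tab-delimited table into an in-memory SQLite database and queries it by clinical significance and allele frequency. It uses only the Python standard library. The column names, the sample rows, and the helper names `load_table` and `rare_pathogenic` are hypothetical and are not taken from the repository's actual output, so they would need to be adapted to the real table headers.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one of the tab-delimited tables.
# Column names and values are invented for illustration only.
SAMPLE_TSV = (
    "chrom\tpos\tref\talt\tclinical_significance\tallele_freq\n"
    "1\t1000\tA\tG\tPathogenic\t0.00001\n"
    "2\t2000\tC\tT\tBenign\t0.12\n"
    "3\t3000\tG\tA\tPathogenic\t0.03\n"
)


def load_table(handle, conn, table="variants"):
    """Load a tab-delimited table with a header row into SQLite.

    Returns the number of data rows inserted.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    columns = reader.fieldnames
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    rows = [tuple(row[c] for c in columns) for row in reader]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


def rare_pathogenic(conn, max_freq=0.001):
    """Return pathogenic variants with allele frequency below max_freq."""
    cur = conn.execute(
        "SELECT chrom, pos, ref, alt FROM variants "
        "WHERE clinical_significance = ? "
        "AND CAST(allele_freq AS REAL) < ? "
        "ORDER BY chrom, pos",
        ("Pathogenic", max_freq),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
n_loaded = load_table(io.StringIO(SAMPLE_TSV), conn)
hits = rare_pathogenic(conn)
print(n_loaded, hits)
```

To use a real file instead of the in-memory sample, an open file handle (for example from `open(path, newline="")`) could be passed to `load_table` in place of the `io.StringIO` object. All values are stored as text here, which is why the query casts the frequency column before comparing.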

Keywords

Articles, Bioinformatics, Genomics, variant interpretation, ClinVar, XML parsing, Mendelian disease, pathogenic variants

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
