Publication:
Nonlinear dimensionality reduction methods for synthetic biology biobricks’ visualization

Date

2017

Publisher

BioMed Central

Citation

Yang, Jiaoyun, Haipeng Wang, Huitong Ding, Ning An, and Gil Alterovitz. 2017. “Nonlinear dimensionality reduction methods for synthetic biology biobricks’ visualization.” BMC Bioinformatics 18 (1): 47. doi:10.1186/s12859-017-1484-4. http://dx.doi.org/10.1186/s12859-017-1484-4.

Abstract

Background: Visualizing data through dimensionality reduction is an important strategy in bioinformatics. It can help uncover hidden properties of the data and detect data quality issues such as noise and inappropriately labeled data. Because crowdsourcing-based synthetic biology databases face similar data quality issues, we propose visualizing biobricks to tackle them. However, existing dimensionality reduction methods cannot be applied directly to biobrick datasets. We therefore use normalized edit distance to enhance dimensionality reduction methods, including Isomap and Laplacian Eigenmaps. Results: Biobricks were extracted from the synthetic biology database Registry of Standard Biological Parts, and six combinations of various types of biobricks were tested. The visualization graphs show both discriminated biobricks and inappropriately labeled biobricks. The K-means clustering algorithm was used to quantify the reduction results. The average clustering accuracies for Isomap and Laplacian Eigenmaps are 0.857 and 0.844, respectively. In addition, Laplacian Eigenmaps is 5 times faster than Isomap, and its visualization graph is more concentrated for discriminating biobricks. Conclusions: By combining normalized edit distance with Isomap and Laplacian Eigenmaps, synthetic biology biobricks are successfully visualized in two-dimensional space. Various types of biobricks can be discriminated and inappropriately labeled biobricks can be identified, which could help to assess the quality of crowdsourcing-based synthetic biology databases and to support biobrick selection. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1484-4) contains supplementary material, which is available to authorized users.
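To make the role of the distance measure concrete, the following is a minimal Python sketch of a normalized edit distance between sequence strings and the pairwise distance matrix that a manifold method would consume. The abstract does not state the exact normalization the authors use, so dividing the Levenshtein distance by the length of the longer sequence is an assumption here, and the function names and toy sequences are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, keeping only one previous row of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance scaled into [0, 1].

    Assumption: normalize by the longer sequence length; the paper may
    define the normalization differently.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return edit_distance(a, b) / longest


def distance_matrix(sequences: Sequence[str]) -> List[List[float]]:
    """Symmetric matrix of pairwise normalized edit distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = normalized_edit_distance(sequences[i], sequences[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical toy sequences standing in for biobrick DNA strings.
toy_parts = ["ACGTAC", "ACGTTC", "TTGACA"]
toy_matrix = distance_matrix(toy_parts)
```

A matrix like `toy_matrix` could then be handed to a manifold learner that accepts precomputed distances, for example scikit-learn's `Isomap` with `metric="precomputed"`. That pairing is one possible setup and not necessarily the implementation used in the article.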

Keywords

Visualization, Synthetic biology, Biobricks, Dimensionality reduction, Edit distance

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
