Publication:

Composable visualization systems for biological data

Date

2025-01-15

The Harvard community has made this article openly available.

Citation

Manz, Trevor. 2025. Composable Visualization Systems for Biological Data. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

The field of biology demands innovative visualization approaches to handle its expanding, integrated data landscape. This need has driven advancements in visualization design—such as genome browsers, molecular renderers, and image viewers—that enhance understanding of complex datasets. However, the push for specialization and novelty has fostered development practices that produce isolated, single-purpose applications. Monolithic tools require substantial engineering and often lack interoperability, making them difficult to repurpose or integrate into broader workflows. This thesis investigates whether composable, reusable components can break down these silos and increase adaptability and impact.

To address these challenges, I introduce infrastructure that bridges layers within the visualization ecosystem for distributing interactive tools in data-centric workflows. First, I present anywidget, a high-level specification and toolkit for authoring reusable, web-based visualizations across environments like computational notebooks and standalone applications. As a standard, anywidget facilitates interchange between visualization authors and users, enabling both general-purpose and domain-specific visualizations to interoperate. Second, I introduce Viv, a web-based bioimaging toolkit that visualizes multi-terabyte datasets directly from cloud storage without software installation. Viv operates at the layer between data producers and consumers, demonstrating how alignment with open data standards enhances accessibility and scalability. Together, these contributions promote modularity, accessibility, and sustainable visualization practices within and beyond biological data systems.

Building on this infrastructure, I show both the adaptation of existing toolkits and the creation of new, composable applications in computational notebooks. I introduce Gos, a Python library that embeds the Gosling visualization grammar into data-centric environments, reducing context switching and enhancing usability for data analysis. Finally, I present a framework for comparing embedding visualizations—originally motivated by single-cell biology—that leverages the modular components developed in this thesis. Unlike standalone applications tied to specific data types, this framework is adaptable across domains and data types, highlighting the flexibility of a modular approach.

This thesis introduces practices that extend the reach and adaptability of visualization tools by rethinking software development approaches. The approach amplifies the impact of specialized tools, fostering a more connected, sustainable ecosystem for visualization across diverse research domains.

Keywords

composability, computational notebooks, data visualization, genomics, human-computer interaction, software systems, Bioinformatics, Genetics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
