Publication: Composable visualization systems for biological data
Abstract
The field of biology demands innovative visualization approaches to handle its expanding, integrated data landscape. This need has driven advancements in visualization design—such as genome browsers, molecular renderers, and image viewers—that enhance understanding of complex datasets. However, the push for specialization and novelty has fostered development practices that produce isolated, single-purpose applications. Monolithic tools require substantial engineering and often lack interoperability, making them difficult to repurpose or integrate into broader workflows. This thesis investigates whether composable, reusable components can break down these silos and increase adaptability and impact.
To address these challenges, I introduce infrastructure that bridges layers within the visualization ecosystem for distributing interactive tools in data-centric workflows. First, I present anywidget, a high-level specification and toolkit for authoring reusable, web-based visualizations across environments like computational notebooks and standalone applications. As a standard, anywidget facilitates interchange between visualization authors and users, enabling both general-purpose and domain-specific visualizations to interoperate. Second, I introduce Viv, a web-based bioimaging toolkit that visualizes multi-terabyte datasets directly from cloud storage without software installation. Viv operates at the layer between data producers and consumers, demonstrating how alignment with open data standards enhances accessibility and scalability. Together, these contributions promote modularity, accessibility, and sustainable visualization practices within and beyond biological data systems.
Building on this infrastructure, I show both the adaptation of existing toolkits and the creation of new, composable applications in computational notebooks. I introduce Gos, a Python library that embeds the Gosling visualization grammar into data-centric environments, reducing context switching and enhancing usability for data analysis. Finally, I present a framework for comparing embedding visualizations—originally motivated by single-cell biology—that leverages the modular components developed in this thesis. Unlike standalone applications tied to specific data types, this framework is adaptable across domains and data types, highlighting the flexibility of a modular approach.
This thesis introduces software development practices that extend the reach and adaptability of visualization tools. These practices amplify the impact of specialized tools, fostering a more connected, sustainable ecosystem for visualization across diverse research domains.