StarFlow: A Script-Centric Data Analysis Environment

Title: StarFlow: A Script-Centric Data Analysis Environment
Author: Angelino, Elaine Lee; Yamins, Daniel Louis Kanef; Seltzer, Margo I.

Note: Order does not necessarily reflect citation order of authors.

Citation: Angelino, Elaine, Daniel Yamins, and Margo Seltzer. 2010. StarFlow: a script-centric data analysis environment. Lecture Notes in Computer Science 6378: 236-250. Also published in Proceedings of the Third International Provenance and Annotation Workshop (IPAW 2010), Troy, NY, USA, June 15-16, 2010: Revised Selected Papers. Berlin: Springer.
Abstract: We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe a range of real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.
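
The dependency-network idea in features (1) and (2) of the abstract can be illustrated with a small sketch. This is not StarFlow's API: the `annotate` decorator, the `REGISTRY` dictionary, the `stale_steps` function, and the file names are all hypothetical, chosen only for illustration. It covers only the user-annotation route (no static or dynamic analysis) and shows how a change to one file might be propagated to the steps that depend on it.

```python
from collections import defaultdict, deque

# Hypothetical registry: step name -> (files read, files written).
# This is an illustrative stand-in, not StarFlow's actual data model.
REGISTRY = {}

def annotate(reads=(), writes=()):
    """Hypothetical decorator recording which files a step reads and writes."""
    def wrap(func):
        REGISTRY[func.__name__] = (tuple(reads), tuple(writes))
        return func
    return wrap

@annotate(reads=["raw.csv"], writes=["clean.csv"])
def clean(): ...

@annotate(reads=["clean.csv"], writes=["stats.csv"])
def summarize(): ...

@annotate(reads=["clean.csv", "stats.csv"], writes=["plot.png"])
def plot(): ...

def build_edges(registry):
    """Link step A -> step B whenever B reads a file that A writes."""
    producers = {}
    for step, (_, writes) in registry.items():
        for path in writes:
            producers[path] = step
    edges = defaultdict(set)
    for step, (reads, _) in registry.items():
        for path in reads:
            if path in producers:
                edges[producers[path]].add(step)
    return edges

def stale_steps(registry, changed_path):
    """Return the steps to re-run, in dependency order, after a file changes."""
    edges = build_edges(registry)
    # Steps that read the changed file directly seed the search.
    seeds = [s for s, (reads, _) in registry.items() if changed_path in reads]
    affected = set(seeds)
    queue = deque(seeds)
    while queue:  # breadth-first walk to collect all downstream steps
        step = queue.popleft()
        for nxt in edges[step]:
            if nxt not in affected:
                affected.add(nxt)
                queue.append(nxt)
    # Kahn's algorithm, restricted to the affected steps.
    indegree = {s: 0 for s in affected}
    for s in affected:
        for nxt in edges[s]:
            if nxt in affected:
                indegree[nxt] += 1
    ready = sorted(s for s, d in indegree.items() if d == 0)
    order = []
    while ready:
        s = ready.pop(0)
        order.append(s)
        for nxt in sorted(edges[s]):
            if nxt in affected:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    return order
```

Under these assumptions, `stale_steps(REGISTRY, "raw.csv")` lists every step downstream of the raw data, while `stale_steps(REGISTRY, "stats.csv")` lists only the plotting step. Steps that become ready at the same time have no dependency on each other, which is the property a scheduler could exploit to run them in parallel.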
Terms of Use: This article is made available under the terms and conditions applicable to Open Access Policy Articles.