Publication: Local Clustering in Provenance Graphs (Extended Version)
Open/View Files
Date
2013
Published Version
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
The Harvard community has made this article openly available. Please share how this access benefits you.
Citation
Macko, Peter, Daniel Margo, and Margo Seltzer. 2013. Local Clustering in Provenance Graphs (Extended Version). Harvard Computer Science Group Technical Report TR-03-13.
Research Data
Abstract
Systems that capture and store data provenance, the record of how an object has arrived at its current state, accumulate historical metadata over time, forming a large graph. Local clustering in these graphs, in which we start with a seed vertex and grow a cluster around it, is of paramount importance because it supports critical provenance applications such as identifying semantically meaningful tasks in an object’s history and selecting appropriate truncation points for returning an object’s ancestry or lineage. Generic graph clustering algorithms are not effective at producing semantically meaningful clusters in provenance graphs. We identify three key properties of provenance graphs and exploit them to justify two new centrality metrics we developed, specifically for use in performing local clustering on provenance graphs.
Description
Other Available Sources
Keywords
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service