Person:

Holland, David

Last Name

Holland

First Name

David

Search Results

  • Publication

    Provenance Integration Requires Reconciliation

    (2011) Angelino, Elaine Lee; Braun, Uri; Holland, David; Macko, Peter; Margo, Daniel; Seltzer, Margo

    While there has been a great deal of research on provenance systems, there has been little discussion about challenges that arise when making different provenance systems interoperate. In fact, most of the literature focuses on provenance systems in isolation and does not discuss interoperability – what it means, its requirements, and how to achieve it. We designed the Provenance-Aware Storage System to be a general-purpose substrate on top of which it would be “easy” to add other provenance-aware systems in a way that would provide “seamless integration” for the provenance captured at each level. While the system did exactly what we wanted on toy problems, when we began integrating StarFlow, a Python-based workflow/provenance system, we discovered that integration is far trickier and more subtle than anyone has suggested in the literature. This work describes our experience undertaking the integration of StarFlow and PASS, identifying several important additions to existing provenance models necessary for interoperability among provenance systems.

  • Publication

    Choosing a Data Model and Query Language for Provenance

    (Springer, 2008) Holland, David; Braun, Uri; Maclean, Diana; Muniswamy-Reddy, Kiran-Kumar; Seltzer, Margo

    The ancestry relationships found in provenance form a directed graph. Many provenance queries require traversal of this graph. The data and query models for provenance should directly and naturally address this graph-centric nature of provenance. To that end, we set out the requirements for a provenance data and query model and discuss why the common solutions (relational, XML, RDF) fall short. A semistructured data model is more suited for handling provenance. We propose a query model based on the Lorel query language, and briefly describe how our query language PQL extends Lorel.

  • Publication

    Multicore OSes: Looking Forward from 1991, er, 2011

    (USENIX Association, 2011) Holland, David; Seltzer, Margo

    Upcoming multicore processors, with hundreds of cores or more in a single chip, require a degree of parallel scalability that is not currently available in today’s system software. Based on prior experience in the super-computing sector, the likely trend for multicore processors is away from shared memory and toward shared-nothing architectures based on message passing. In light of this, the lightweight messages and channels programming model, found among other places in Erlang, is likely the best way forward. This paper discusses what adopting this model entails, describes the architecture of an OS based on this model, and outlines a few likely implementation challenges.

  • Publication

    Layering in Provenance Systems

    (USENIX Association, 2009) Muniswamy-Reddy, Kiran-Kumar; Braun, Uri; Holland, David; Macko, Peter; Maclean, Diana; Margo, Daniel; Seltzer, Margo; Smogor, Robin

    Digital provenance describes the ancestry or history of a digital object. Most existing provenance systems, however, operate at only one level of abstraction: the system call layer, a workflow specification, or the high-level constructs of a particular application. The provenance collectable in each of these layers is different, and all of it can be important. Single-layer systems fail to account for the different levels of abstraction at which users need to reason about their data and processes. These systems cannot integrate data provenance across layers and cannot answer questions that require an integrated view of the provenance. We have designed a provenance collection structure facilitating the integration of provenance across multiple levels of abstraction, including a workflow engine, a web browser, and an initial runtime Python provenance tracking wrapper. We layer these components atop provenance-aware network storage (NFS) that builds upon a Provenance-Aware Storage System (PASS). We discuss the challenges of building systems that integrate provenance across multiple layers of abstraction, present how we augmented systems in each layer to integrate provenance, and present use cases that demonstrate how provenance spanning multiple layers provides functionality not available in existing systems. Our evaluation shows that the overheads imposed by layering provenance systems are reasonable.

  • Publication

    Layering in Provenance-Aware Storage Systems

    (2008) Muniswamy-Reddy, Kiran-Kumar; Barillari, Joseph; Braun, Uri; Holland, David; Maclean, Diana; Seltzer, Margo; Holland, Stephen D.

    Digital provenance describes the ancestry or history of a digital document. Provenance provides answers to questions such as: “How does the ancestry of these objects differ?” “Are there source code files tainted by proprietary software?” “How was this object created?” Prior systems used to collect and maintain provenance operate within a single layer of abstraction: the system call boundary, a workflow specification language, or in a domain-specific application level. The provenance collected at each of these layers of abstraction is different, and all of it is important at one time or another. All of these solutions fundamentally fail to account for the different layers of abstraction at which users need to reason about their data and processes. None of these systems support queries across different layers of abstraction to answer a question such as “The calculated values in my spreadsheet have changed. Is this due to a change in the spreadsheet, a difference in the spreadsheet application, the libraries being used, or the operating system being used?” We present an architecture for provenance collection that facilitates the integration of provenance across multiple layers of abstraction and across network boundaries. We show how the need to support provenance collection at multiple layers drives the architecture. We present provenance-aware use cases from the field of thermography and quantify system overheads, showing that we can provide new functionality with acceptable overhead.

  • Publication

    An Architecture A Day Keeps The Hacker Away

    (Association for Computing Machinery, 2005) Holland, David; Lim, Ada T.; Seltzer, Margo

    System security as it is practiced today is a losing battle. In this paper, we outline a possible comprehensive solution for binary-based attacks, using virtual machines, machine descriptions, and randomization to achieve broad heterogeneity at the machine level. This heterogeneity increases the “cost” of broad-based binary attacks to a sufficiently high level that they cease to become feasible. The convergence of several recent technologies appears to make our approach achievable at a reasonable cost, with only moderate run-time overhead.

  • Publication

    Provenance-Aware Storage Systems

    (2006) Muniswamy-Reddy, Kiran-Kumar; Holland, David; Braun, Uri; Seltzer, Margo

    A Provenance-Aware Storage System (PASS) is a storage system that automatically collects and maintains provenance or lineage, the complete history or ancestry of an item. We discuss the advantages of treating provenance as meta-data collected and maintained by the storage system, rather than as manual annotations stored in a separately administered database. We present a PASS implementation, discussing the challenges and performance cost, and the new functionality it enables. We show that with reasonable overhead, we can provide useful functionality not available in today’s file systems or provenance management systems.