Show simple item record

dc.contributor.advisorSeltzer, Margo I.en_US
dc.contributor.authorMacko, Peteren_US
dc.date.accessioned2015-03-18T13:08:52Z
dc.date.created2015-03en_US
dc.date.issued2015-01-16en_US
dc.date.submitted2015en_US
dc.identifier.citationMacko, Peter. 2015. LLAMA: A Persistent, Mutable Representation for Graphs. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.en_US
dc.identifier.urihttp://nrs.harvard.edu/urn-3:HUL.InstRepos:14226042
dc.description.abstractGraph-structured data is large, ever-changing, and ubiquitous. These features demand that graph analytic applications both compute on and modify large graphs efficiently, and it is also beneficial for analysts to be able to focus just on the last few hours or days of data. We present LLAMA, a graph storage and analysis system that supports mutability and out-of-memory execution, and SLOTH, a sliding window extension of LLAMA that efficiently maintains a view of just the most recently added parts of a graph. LLAMA performs comparably to immutable main-memory analysis systems for graphs that fit in memory and significantly outperforms existing out-of-memory analysis systems for graphs that exceed main memory. It bases its implementation on the compressed sparse row (CSR) representation, which is a read-only representation commonly used for graph analytics. We augment this representation to support mutability and persistence using a novel implementation of multi-versioned array snapshots, making it ideal for applications that receive a steady stream of new data, but need to perform whole-graph analysis on consistent views of the data. We leverage the multi-versioned nature of this representation to build SLOTH, a sliding window data structure that advances the window by creating a snapshot from new data and aging out old data by deleting the corresponding snapshots. We compare LLAMA to state-of-the-art systems on representative graph analysis workloads, showing that LLAMA scales well both out-of-memory and across parallel cores. Our evaluation shows that LLAMA's mutability introduces modest overheads of 3-18% relative to immutable CSR for in memory execution and that it outperforms state-of-the-art out-of-memory systems in most cases, with a best case improvement of a factor of 5 on breadth-first-search. We evaluate SLOTH with various sliding window configurations and demonstrate that LLAMA is a good building block for sliding window computation.en_US
dc.format.mimetypeapplication/pdfen_US
dc.language.isoenen_US
dash.licenseLAAen_US
dc.subjectComputer Scienceen_US
dc.titleLLAMA: A Persistent, Mutable Representation for Graphsen_US
dc.typeThesis or Dissertationen_US
dash.depositing.authorMacko, Peteren_US
dc.date.available2015-03-18T13:08:52Z
thesis.degree.date2015en_US
thesis.degree.grantorGraduate School of Arts & Sciencesen_US
thesis.degree.levelDoctoralen_US
thesis.degree.nameDoctor of Philosophyen_US
dc.contributor.committeeMemberWaldo, James H.en_US
dc.contributor.committeeMemberChong, Stephenen_US
dc.contributor.committeeMemberIdreos, Stratosen_US
dc.type.materialtexten_US
thesis.degree.departmentEngineering and Applied Sciences - Computer Scienceen_US
dash.identifier.vireohttp://etds.lib.harvard.edu/gsas/admin/view/108en_US
dc.description.keywordsgraphs; data structures; CSR; updates; persistence; streaming; sliding windowsen_US
dash.author.emailpmacko@gmail.comen_US
dash.identifier.drsurn-3:HUL.DRS.OBJECT:25119199en_US
dash.contributor.affiliatedMacko, Peter


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record