dc.contributor.author: Sturmann, Lilian
dc.date.accessioned: 2020-09-14T15:48:56Z
dc.date.created: 2019-05
dc.date.issued: 2020-03-03
dc.date.submitted: 2019
dc.identifier.citation: Sturmann, Lilian. 2019. Using Performance Variation for Instrumentation Placement in Distributed Systems. Master's thesis, Harvard Extension School.
dc.identifier.uri: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365088
dc.description.abstract: Distributed systems are now ubiquitous in the infrastructures underpinning our everyday lives, yet diagnosing performance problems in these systems remains extremely challenging. The current state of the art for problem diagnosis in these systems relies on data from instrumentation in the system, but the placement of this instrumentation is an unsolved challenge in systems research and in production environments. This work presents an implementation and evaluation of a performance variation-based tool that helps developers understand where instrumentation should be placed in a distributed system to better diagnose current and future performance problems. This tool identifies under-instrumented regions in these systems by localizing performance variation seen in system requests. Contributions of this work include the tool itself; implementations of several methods for localizing performance variation, including a method that prioritizes performance variation deeper in request call graphs; a conversion module that can also function as a stand-alone toolkit to allow the performance variation-based tool to be used across a variety of systems, including those instrumented using the Open Tracing model as well as those using a more general directed acyclic graph (DAG) model; and several experiments evaluating the tool and these methods on an open source distributed application. The key insight informing this work is that similar workflows in the same system should perform similarly. Building on existing workflow-centric tracing tools to profile system behavior, the tools and methods presented have the potential to significantly cut down on time spent diagnosing performance problems in distributed systems. The experiments evaluate their utility both for understanding where to place additional instrumentation for current problems in these systems as they arise, and for guiding informative placement of default system instrumentation to better handle future problems. Potentially, the tools and methods could also be adapted for use in a broader framework that seeks to dynamically tune instrumentation in running systems to the current system state.
dc.description.sponsorship: Software Engineering
dc.format.mimetype: application/pdf
dash.license: LAA
dc.subject: distributed systems
dc.subject: tracing
dc.subject: monitoring
dc.title: Using Performance Variation for Instrumentation Placement in Distributed Systems
dc.type: Thesis or Dissertation
dash.depositing.author: Sturmann, Lilian
dc.date.available: 2020-09-14T15:48:56Z
thesis.degree.date: 2019
thesis.degree.grantor: Harvard Extension School
thesis.degree.level: Masters
thesis.degree.name: ALM
dc.contributor.committeeMember: Sambasivan, Raja R.
dc.contributor.committeeMember: Jaume, Sylvain
dc.type.material: text
thesis.degree.department: Software Engineering
dash.identifier.vireo:
dc.identifier.orcid: 0000-0002-9911-1864
dash.author.email: lily.sturmann@gmail.com

