Publication:

Improving Data Placement Decisions for Heterogeneous Clustered File Systems

Date

2020-03-03

Citation

Allen, Cyril Anthony. 2018. Improving Data Placement Decisions for Heterogeneous Clustered File Systems. Master's thesis, Harvard Extension School.

Abstract

With the advent of cloud computing, datacenters rely on distributed applications more than ever. MapReduce has been used to process more than 20 petabytes of data per day on very large numbers of commodity servers (Dean & Ghemawat, 2008). Many companies run large-scale clusters to perform computational tasks with Hadoop, the open-source MapReduce implementation (White, 2012), or operate virtualized datacenters in which virtual machines can be migrated between hosts for high availability. As hardware economics change, a scalable cloud will likely need to mix node types, so that higher-performance, higher-capacity nodes run alongside lower-performance, lower-capacity ones. This thesis presents an adaptive data placement method for the Nutanix distributed file system that remedies some common problems found in many heterogeneous clustered file systems.

Keywords

filesystems, performance, distributed systems

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
