Publication: Autonomous Replication in Wide-Area Internetworks
Date
1995
Authors
Gwertzman, James
Citation
Gwertzman, James. 1995. Autonomous Replication in Wide-Area Internetworks. Harvard Computer Science Group Technical Report TR-17-95.
Abstract
The number of users connected to the Internet has been growing at an exponential rate, with corresponding increases in network traffic and Internet server load. Advances in microprocessors and network technology have kept pace with this growth so far, but we are reaching the limits of hardware solutions. For the Internet's growth to continue, we must efficiently distribute server load and reduce the network traffic generated by its various services. Traditional wide-area caching schemes are client initiated: decisions on where and when to cache information are made without the benefit of the server's global knowledge of the situation. We introduce push-caching, a server-initiated technique that leaves caching decisions to the server. The server uses its knowledge of network topology, geography, and access patterns to minimize network traffic and server load. The World Wide Web is an example of a large-scale distributed information system that would benefit from this geographic distribution, and we present an architecture that allows a Web server to autonomously replicate Web files. We use a trace-driven simulation of the Internet to evaluate several competing caching strategies. Our results show that while simple client caching reduces server load and network bandwidth demands by up to 30%, adding server-initiated caching reduces server load by an additional 20% and network bandwidth demands by an additional 10%. Furthermore, push-caching is more efficient than client caching, using an order of magnitude less cache space for comparable bandwidth and load savings. To determine the optimal cache consistency protocol, we used a generic server simulator to evaluate several cache consistency protocols and found that weak-consistency protocols are sufficient for the World Wide Web: they use the same bandwidth as an atomic protocol, impose less server load, and return stale data less than 1% of the time.
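The core idea of server-initiated push-caching can be sketched as follows. This is a minimal illustrative sketch, not the report's actual algorithm: the class name, the per-region request counters, and the fixed push threshold are all assumptions introduced here for clarity. It shows a server tracking demand per (file, client region) and pushing a replica toward a region once that region's demand crosses a threshold, so later requests from the region no longer hit the origin server.

```python
from collections import defaultdict

# Assumed threshold: requests from one region before the server replicates
# a file there. The real system would instead use its knowledge of network
# topology, geography, and access patterns.
PUSH_THRESHOLD = 10

class PushCachingServer:
    """Hypothetical sketch of a server that makes caching decisions itself."""

    def __init__(self):
        self.hits = defaultdict(int)      # (filename, region) -> request count
        self.replicas = defaultdict(set)  # filename -> regions holding a replica

    def handle_request(self, filename, region):
        """Serve one request and decide whether to push a replica."""
        if region in self.replicas[filename]:
            # A nearby replica absorbs the load; the origin is untouched.
            return "served-by-replica"
        self.hits[(filename, region)] += 1
        if self.hits[(filename, region)] >= PUSH_THRESHOLD:
            self.push_replica(filename, region)
        return "served-by-origin"

    def push_replica(self, filename, region):
        # In a real deployment this would transfer the file to a cache host
        # chosen near the requesting region; here we only record the decision.
        self.replicas[filename].add(region)
```

Under these assumptions, the first ten requests for a file from one region are served by the origin (the tenth triggers the push), and every later request from that region is absorbed by the replica, which is the load-shifting effect the abstract measures.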
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service