Transparent replication

Date

2006

Abstract

Rising user expectations and demands have pushed web services away from single-server systems and toward distributed systems, which can provide higher throughput, better availability, and lower response times. However, for a service to run on a distributed system, each running instance must be able to access the data shared among the instances. Although existing off-the-shelf replication systems - e.g. distributed file systems [52, 61, 32, 38, 41], replicated databases [64, 75], distributed hash tables [58, 59, 63, 34], etc. - simplify access to shared data by exporting well-researched interfaces, their implementations are typically not engineered for the unique environments presented by many web services. For example, replication systems that require synchronization across multiple nodes to handle modified data [38, 12], or systems that require all nodes to keep a copy of all data [64, 75], may not be practical for use in such services. Although the general replication problem cannot be solved [11, 62, 33], we focus our study on a class of single-writer services, which we denote Information Dissemination Services, that form a restricted but important set of web services.
Our research makes two key contributions. First, we show that for Information Dissemination Services, TRIP replicates dynamic data in a manner that is nearly transparent to the service. We (1) develop a novel dual-channel replication algorithm for TRIP that uses spare network capacity as background traffic to speculatively replicate data in a safe, non-interfering fashion, (2) show how to integrate safe speculative replication with mechanisms that use invalidates to provide consistency, and (3) demonstrate how our combination of consistency and safe speculative replication allows us to provide near-ideal consistency, performance, and availability for Information Dissemination Services.
Second, we show that the core principles behind TRIP can be extended to build a new replication framework and a more general replication toolkit. In particular, we show that the dual-queue mechanisms developed for TRIP can be extended to a multi-writer environment in which nodes synchronize multiple incoming streams of data and consistency information. Our extension can provide various forms of consistency over arbitrary topologies - two key properties of the PRACTI [6] (Partial Replication, Arbitrary Consistency, Topology Independence) architecture.
