New coding techniques for distributed storage systems: enabling locality, availability and security
Abstract
Distributed storage systems (a.k.a. cloud storage networks) are becoming increasingly important, given the need to put away vast amounts of data that are being generated, analyzed and accessed across multiple disciplines today. Besides serving as backbone systems for large institutions such as CERN, Google and Microsoft, distributed storage systems have been instrumental in the emergence and rapid growth of the modern cloud computing framework. This dissertation takes a coding theoretic approach to address key issues related to designing these systems. First, the problem of enabling efficient mechanisms for restoring the state of the system after storage node failures in considered. In particular, this dissertation studies locally repairable codes that allow for reconstruction of the content stored on a failed node by contacting a small number of intact nodes. Since resilience to permanent loss of the stored information in the event of catastrophic failures is of primary interest in storage systems, explicit constructions for locally repairable codes with optimal minimum distance are presented. This dissertation also designs locally repairable codes that minimize repair-bandwidth, i.e., the amount of data downloaded during a node repair, in addition to the number of intact nodes contributing to the repair process. This dissertation further investigates a generalization of locally repairable codes where codes with the following property are studied: any small set of failed nodes is recoverable from a small number of other intact nodes. This is referred to as cooperative local repair. The main contributions in this regard are bounds on the minimum distance and the dimension of such codes, as well as explicit constructions of families of codes that enable cooperative local repair. The second issue addressed in this dissertation concerns management of hot data, i.e., the frequently accessed information. Towards this, the codes that allow for parallel accesses to information blocks are considered. This part of the dissertation explores the rate vs. minimum distance trade-off for such codes and presents explicit constructions of the codes that have high rate and large minimum distance while supporting (potentially scaling number of) parallel accesses to the stored information. One of the main contributions of the dissertation in this direction is a novel graph theoretic approach to construct batch codes. The batch codes as defined in the literature enable load balancing in a storage system in the sense that multiple requests for information blocks can be served in a parallel manner without downloading too much data from any particular storage node. Finally, this dissertation considers the issue of designing secure coding scheme for distributed storage systems. Given the decentralized nature of these systems and their increasing utilization to store valuable and confidential information, it is desirable that they be resilient against eavesdropping attacks. The problem of characterizing perfect secrecy capacity of distributed storage systems is studied. Using secrecy-precoding, this dissertation presents coding schemes that are secure against eavesdropping attacks while allowing for efficient node repair mechanisms.