File migration in large distributed computer systems

Date

1995-12

Publisher

Texas Tech University

Abstract

File Migration (FM) is a tool for distributed file allocation, which is a subset of distributed resource management. FM is a method of dynamically allocating individual files, and copies of files, across the nodes of a distributed computer network. The purpose of a distributed file allocation mechanism is to provide service while reducing the system-wide costs associated with the storage and use of remotely accessed files.

The process of optimally allocating files to nodes in a distributed computing system is an NP-complete problem. Allowing replicated file copies increases the difficulty of the problem. The search for good heuristic solutions for distributed file allocation requires a thorough understanding of the distributed system and its operating environment. Very little data exists quantifying the usage of resources within a distributed environment, and the available data tends to be specific to one installation or a small category of distributed computing environments.

This research presents a tool to aid in the development of distributed resource management through file migration. The tool is called the File Migration Analysis Tool (FMAT). FMAT is a fine-grained simulation of a distributed file system and the significant portions of the distributed computing environment.

In this research, FMAT is shown to be an accurate simulator for large distributed computing systems. The operation and fidelity of FMAT were validated through direct comparison to the body of research in this field carried out by researchers at the University of Toronto. This comparison showed the value of simulating large distributed systems using a fine-grained approach. In addition, the ability to monitor system characteristics from the client perspective proved to be an important contribution of the FMAT simulation.

The value of FMAT as a tool to analyze multiple file migration engines and evaluate various distributed system topologies was demonstrated. Multiple decision engines were compared in the same distributed environment, and a single engine was evaluated in multiple distributed computing environments. These comparisons illustrated the importance of viewing system performance from the client node perspective and the impact that the underlying network and distributed system design have on the choice of file migration policies.
