IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Parallel and Distributed Computing and Networking
Avoiding Performance Impacts by Re-Replication Workload Shifting in HDFS Based Cloud Storage
Thanda SHWEMasayoshi ARITSUGI
Author information
JOURNAL FREE ACCESS

2018 Volume E101.D Issue 12 Pages 2958-2967

Details
Abstract

Data replication in cloud storage systems brings a lot of benefits, such as fault tolerance, data availability, data locality and load balancing both from reliability and performance perspectives. However, each time a datanode fails, data blocks stored on the failed datanode must be restored to maintain replication level. This may be a large burden for the system in which resources are highly utilized with users' application workloads. Although there have been many proposals for replication, the approach of re-replication has not been properly addressed yet. In this paper, we present a deferred re-replication algorithm to dynamically shift the re-replication workload based on current resource utilization status of the system. As workload pattern varies depending on the time of the day, simulation results from synthetic workload demonstrate a large opportunity for minimizing impacts on users' application workloads with the simple algorithm that adjusts re-replication based on current resource utilization. Our approach can reduce performance impacts on users' application workloads while ensuring the same reliability level as default HDFS can provide.

Content from these authors
© 2018 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top