sponsors
usenix conference policies
Analysis of HDFS Under HBase: A Facebook Messages Case Study
Tyler Harter, University of Wisconsin—Madison; Dhruba Borthakur, Siying Dong, Amitanand Aiyer, and Liyin Tang, Facebook Inc.; Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau, University of Wisconsin—Madison
We present a multilayer study of the Facebook Messages stack, which is based on HBase and HDFS. We collect and analyze HDFS traces to identify potential improvements, which we then evaluate via simulation. Messages represents a new HDFS workload: whereas HDFS was built to store very large files and receive mostly sequential I/O, 90% of files are smaller than 15MB and I/O is highly random. We find hot data is too large to easily fit in RAM and cold data is too large to easily fit in flash; however, cost simulations show that adding a small flash tier improves performance more than equivalent spending on RAM or disks. HBase’s layered design offers simplicity, but at the cost of performance; our simulations show that network I/O can be halved if compaction bypasses the replication layer. Finally, although Messages is read-dominated, several features of the stack (i.e., logging, compaction, replication, and caching) amplify write I/O, causing writes to dominate disk I/O.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Tyler Harter and Dhruba Borthakur and Siying Dong and Amitanand Aiyer and Liyin Tang and Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau},
title = {Analysis of {HDFS} Under {HBase}: A Facebook Messages Case Study},
booktitle = {12th USENIX Conference on File and Storage Technologies (FAST 14)},
year = {2014},
isbn = {ISBN 978-1-931971-08-9},
address = {Santa Clara, CA},
pages = {199--212},
url = {https://www.usenix.org/conference/fast14/technical-sessions/presentation/harter},
publisher = {USENIX Association},
month = feb
}
connect with us