High Available Hadoop Deployment Modes with Enterprise-Level Capabilities
A Berkaoui, Y Gahi - … Conference on Future Internet of Things …, 2023 - ieeexplore.ieee.org
2023 10th International Conference on Future Internet of Things …, 2023 • ieeexplore.ieee.org
Businesses today recognize the tremendous value of data and seek to leverage it proactively to drive growth and success. They therefore adopt a data-driven approach, collecting, analyzing, and using massive volumes of data from multiple sources. Data-driven companies need to invest in advanced technologies, such as data and analytics platforms, to unlock the full potential of their data and gain a significant competitive advantage.

Accordingly, technology innovation has supported these business trends over the years through the continuous release of new platforms, patterns, and architectures capable of scaling and managing large data volumes. Apache Hadoop has emerged as one of these platforms, offering distributed storage and processing that runs on physical commodity hardware and scales to support petabytes of data. However, platforms of this kind, when deployed in an enterprise datacenter, are usually confronted with the growing use of virtualization as a core component of infrastructure provisioning.

This paper addresses infrastructure considerations for Hadoop deployment within enterprise datacenters. It covers deployment scenarios on both physical and virtual servers with respect to the prerequisites, limitations, and constraints of each. Furthermore, it suggests a hybrid approach with the enterprise-level capabilities usually required in datacenters. Finally, the suggested infrastructure choices are presented and discussed, along with perspectives and future research areas.