LS1.1 - V6 Generalized Architecture of Big Data Systems
Pravin Y Pawar
Big data architecture style
• A big data architecture is designed to handle the ingestion, processing, and analysis of data that is
too large or complex for traditional database systems.
Source : https://docs.microsoft.com/en-us/azure/architecture/guide/architecture-styles/big-data
Big Data Applications
Workloads
• Big data solutions typically involve one or more of the following types of workload:
Store and process data in volumes too large for a traditional database
Transform unstructured data for analysis and reporting
Capture, process, and analyze unbounded streams of data in real time, or with low latency
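The third workload, processing an unbounded stream with low latency, can be illustrated with a small sketch. It is not taken from the source; it assumes events arrive as (timestamp_seconds, key) pairs in timestamp order, and the function name tumbling_window_counts is hypothetical. It groups events into fixed-size (tumbling) windows and emits each window's counts as soon as the window closes, rather than waiting for the whole data set.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[int, str]], window_seconds: int
) -> Iterator[Tuple[int, Counter]]:
    """Yield (window_start, counts) each time a fixed-size window closes.

    Assumption: events are (timestamp_seconds, key) pairs in timestamp order.
    """
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        # Align the timestamp to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[key] += 1
    # Only reached for a finite input; a truly unbounded stream never ends.
    if current_start is not None:
        yield current_start, counts


if __name__ == "__main__":
    sample = [(0, "a"), (1, "b"), (4, "a"), (5, "a"), (11, "b")]
    for start, window in tumbling_window_counts(sample, window_seconds=5):
        print(start, dict(window))
```

Because the function is a generator, results are available window by window, which is the property that separates stream processing from batch processing over stored data. Production stream processors also have to handle late and out-of-order events, which this sketch deliberately ignores.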
Big data architecture Benefits
Advantages
• Technology choices
A variety of technology options is available, both open source and from vendors
• Performance through parallelism
Big data solutions take advantage of parallelism, enabling high-performance solutions that scale to
large volumes of data.
• Elastic scale
All of the components in the big data architecture support scale-out provisioning, so that you can adjust
your solution to small or large workloads, and pay only for the resources that you use.
• Interoperability with existing solutions
The components of the big data architecture are also used for IoT processing and enterprise BI
solutions, enabling you to create an integrated solution across data workloads.
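The "Performance through parallelism" point above can be sketched as a partition, process, and merge pattern. This example is an illustration and not part of the source: the function names count_words and parallel_word_count are hypothetical, and a thread pool stands in for the cluster of worker nodes a real big data engine would use. In CPython, threads do not speed up CPU-bound work, so the sketch shows the structure of the computation and not the performance gain.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def count_words(partition: List[str]) -> Counter:
    """Process one partition independently of all others (the 'map' step)."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(
    partitions: Iterable[List[str]], max_workers: int = 4
) -> Counter:
    """Count words per partition concurrently, then merge the partial results."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each partition is handled by whichever worker is free.
        for partial in pool.map(count_words, partitions):
            total.update(partial)  # the 'reduce' (merge) step
    return total


if __name__ == "__main__":
    data = [["big data", "data lake"], ["big stream"]]
    print(parallel_word_count(data))
```

Since each partition is processed without reference to the others, adding workers (or machines) lets the same code handle more partitions, which is also the idea behind the scale-out provisioning described under "Elastic scale".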
Big data architecture Challenges
Things to consider
• Complexity
Big data solutions can be extremely complex, with numerous components to handle data ingestion from
multiple data sources. It can be challenging to build, test, and troubleshoot big data processes.
• Skillset
Many big data technologies are highly specialized, and use frameworks and languages that are not typical of
more general application architectures. On the other hand, big data technologies are evolving new APIs that
build on more established languages.
• Technology maturity
Many of the technologies used in big data are evolving. While core Hadoop technologies such as Hive and
Pig have stabilized, emerging technologies such as Spark introduce extensive changes and enhancements
with each new release.
Thank You!
In our next session: Streaming Data Systems