Cloud Computing Unit-5
MapReduce programming offers several benefits that help you gain valuable insights
from your big data.
An example of MapReduce
This is a very simple example of MapReduce. No matter the amount of data you need
to analyze, the key principles remain the same.
Assume you have five files, and each file contains two columns (a key and a value in
Hadoop terms) that represent a city and the corresponding temperature recorded in
that city for the various measurement days. The city is the key, and the temperature is
the value. For example: (Toronto, 20). Out of all the data collected, you want
to find the maximum temperature for each city across the data files (note that each file
might contain the same city multiple times).
Using the MapReduce framework, you can break this down into five map tasks, where
each mapper works on one of the five files. The mapper task goes through the data and
returns the maximum temperature for each city.
For example, the results produced from one mapper task for the data above would
look like this: (Toronto, 20) (Whitby, 25) (New York, 22) (Rome, 33)
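The map and reduce phases described above can be simulated in plain Python (a local sketch, not Hadoop itself; the file contents are invented for illustration):

```python
from collections import defaultdict

def mapper(lines):
    """Emit (city, temperature) pairs, keeping the maximum per city
    found within a single file (the per-mapper result)."""
    local_max = {}
    for line in lines:
        city, temp = line.split(",")
        temp = int(temp)
        if city not in local_max or temp > local_max[city]:
            local_max[city] = temp
    return list(local_max.items())

def reducer(mapped_outputs):
    """Combine the per-file maxima into a global maximum per city."""
    combined = defaultdict(list)
    for pairs in mapped_outputs:
        for city, temp in pairs:
            combined[city].append(temp)
    return {city: max(temps) for city, temps in combined.items()}

# Two illustrative "files"; file1 reproduces the mapper output shown above.
file1 = ["Toronto,20", "Whitby,25", "New York,22", "Rome,33", "Toronto,18"]
file2 = ["Toronto,22", "Whitby,19", "Rome,31"]
result = reducer([mapper(file1), mapper(file2)])
# result holds the maximum temperature per city across both files
```

In a real Hadoop job the framework would run one mapper per file in parallel and shuffle the intermediate pairs to the reducers; the combining logic is the same.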
VirtualBox:
Oracle VM VirtualBox is a tool for virtualizing the x86 and AMD64/Intel64
computing architectures, enabling users to deploy desktops, servers, and
operating systems as virtual machines. One can use this solution to deploy as many
virtual machines as the host architecture has the resources for.
App Engine:
App Engine is a fully managed, serverless platform for developing and hosting
web applications at scale. You can choose from several popular languages, libraries,
and frameworks to develop your apps, and then let App Engine take care of
provisioning servers and scaling your app instances based on demand.
The flexible environment is optimal for applications that receive consistent traffic,
scale up and down gradually, and run in Docker containers, optionally with custom runtimes.
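App Engine's Python runtimes serve standard WSGI applications; the platform handles provisioning and scaling, while the app only defines the request handler. A minimal stdlib-only sketch (the greeting and route are illustrative, not part of any App Engine API):

```python
# A minimal WSGI application of the kind App Engine's Python runtime can host.
# App Engine itself provisions and scales the servers behind it.
def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# Simulate one request locally, without starting a server:
captured = {}
def _start_response(status, headers):
    captured["status"] = status

response_body = b"".join(app({"PATH_INFO": "/"}, _start_response))
```

Locally the same callable could be served with `wsgiref.simple_server`; on App Engine, deployment configuration (an `app.yaml` file) points the runtime at it.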
OpenStack
OpenStack is a free, open standard cloud computing platform. It is mostly deployed
as infrastructure-as-a-service (IaaS) in both public and private clouds where virtual servers and other
resources are made available to users. The software platform consists of interrelated components
that control diverse, multi-vendor hardware pools of processing, storage, and networking resources
throughout a data center. Users manage it either through a web-based dashboard,
through command-line tools, or through RESTful web services.
OpenStack began in 2010 as a joint project of Rackspace Hosting and NASA. As of 2012, it was
managed by the OpenStack Foundation, a non-profit corporate entity established in September
2012 to promote OpenStack software and its community. By 2018, more than 500 companies had
joined the project. In 2020 the foundation announced it would be renamed the Open Infrastructure
Foundation in 2021.
Federation in the Cloud:
1. In the federated cloud, the users can interact with the architecture
either centrally or in a decentralized manner. In centralized interaction,
the user interacts with a broker that mediates between them and the
clouds in the federation. Decentralized interaction permits the user to
interact directly with the clouds in the federation.
2. Federated clouds can serve various niches, both commercial and
non-commercial.
3. The visibility of a federated cloud helps the user understand how the
several clouds in the federated environment are organized.
4. A federated cloud can be monitored in two ways. MaaS (Monitoring as a
Service) provides the user with information for tracking contracted
services. Global monitoring aids in maintaining the federated cloud.
5. The providers who participate in the federation publish their offers to a
central entity. The user interacts with this central entity to verify the
prices and propose an offer.
6. Marketed objects such as infrastructure, software, and platform services
pass through the federation when consumed in the federated cloud.
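The centralized interaction model in points 1 and 5 can be sketched as a simple broker: providers publish offers to a central entity, and the user queries it to compare prices. All provider names, resource types, and prices below are hypothetical:

```python
# Hypothetical sketch of a federation broker (point 5): providers publish
# offers to a central entity; the user queries it to verify prices.
class FederationBroker:
    def __init__(self):
        self.offers = []  # published (provider, resource, price) offers

    def publish(self, provider, resource, price_per_hour):
        """A provider in the federation publishes an offer."""
        self.offers.append({"provider": provider,
                            "resource": resource,
                            "price": price_per_hour})

    def cheapest(self, resource):
        """Return the lowest-priced offer for a resource type, or None."""
        matches = [o for o in self.offers if o["resource"] == resource]
        return min(matches, key=lambda o: o["price"]) if matches else None

broker = FederationBroker()
broker.publish("CloudA", "vm.small", 0.05)  # illustrative prices
broker.publish("CloudB", "vm.small", 0.04)
best = broker.cheapest("vm.small")
```

In the decentralized model of point 1, the user would instead query each cloud's own interface directly rather than going through the broker.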
Technologies Supporting Cloud Federation:
The technologies that aid cloud federation and cloud services are:
1. OpenNebula
It is a cloud computing platform for managing heterogeneous distributed data
center infrastructures. Through its interoperability it can leverage existing
information technology assets, protecting prior investments, and it exposes
application programming interfaces (APIs) for integration with other services.
2. Aneka coordinator
The Aneka coordinator combines the Aneka services and Aneka peer
components (network architectures), giving the cloud the ability and
performance to interact with other cloud services.
3. Eucalyptus
Eucalyptus pools computational, storage, and network resources that can be
scaled up or down as application workloads change. It is an open-source
framework that provides storage, network, and other computational resources
for building a cloud environment.
ADFS Deployment:
You should properly plan your environment and ensure that the business requirements will be
met by your proposed solution. For example, if you want to provide SSO for an extranet
application in your perimeter network, you will need to ensure that your design includes an AD
forest and ADFS servers in the perimeter network. You will also need to ensure that the
applications support claims-based authentication using ADFS. After you document business
requirements, you can begin designing your deployment. Figure 4.63 depicts an ADFS
deployment with an application installed in the perimeter network. ADFS in this design
provides SSO for corporate users with existing user accounts in an internal AD forest.
ADFS has several prerequisites that must be met prior to deployment.
The prerequisites are:
▪ PKI—ADFS requires certificates to secure communications between two
environments. Self-signed certificates can be used for testing and lab
purposes but should not be used in production deployments.
▪ Windows Server 2008 R2 Enterprise—ADFS servers require Windows Server
2008 R2 Enterprise edition or greater.
▪ AD Domains—ADFS requires that an AD domain exists on both the account
and resource side.
▪ FS Web Agent installed on application server—The Web server hosting the
application will need the federation services Web agent installed.
Future of Federation:
The next big evolution of the internet is Cloud Computing, where everyone from
individuals to major corporations and governments moves their data storage and
processing into remote data centres. Although Cloud Computing has grown, developed,
and evolved very rapidly over the last half decade, Cloud Federation remains an
open issue in the current cloud market.
Cloud Federation would address many existing limitations in cloud computing:
- Cloud end-users are often tied to a unique cloud provider, because the different
APIs, image formats, and access methods exposed by different providers make it
very difficult for an average user to move their applications from one cloud to
another, leading to a vendor lock-in problem.
- Many SMEs have their own on-premise private cloud infrastructures to support
internal computing needs and workloads. These infrastructures are often
over-sized to satisfy peak demand periods and avoid performance slow-downs. The
hybrid cloud (or cloud-bursting) model is a solution that reduces the on-premise
infrastructure size, so that it can be dimensioned for an average load and
complemented with external resources from a public cloud provider to satisfy
peak demands.
- Many big companies (e.g. banks, hosting companies) and many large institutions
maintain several distributed data centers or server farms, for example to serve
multiple geographically distributed offices, to implement high availability (HA),
or to guarantee server proximity to the end user. Resources and networks in these
distributed data centers are usually configured as non-cooperative, separate
elements, so that every single service or workload is deployed at a single site
or replicated across multiple sites.
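The hybrid cloud (cloud-bursting) model described above amounts to a placement decision: run workloads on the on-premise private cloud while capacity remains, and burst the overflow to a public provider at peak demand. A minimal sketch, with hypothetical workload names and capacities:

```python
def place_workloads(workloads, private_capacity):
    """Assign each (name, cpus) workload to the private cloud until its
    capacity is exhausted, then burst the remainder to a public provider."""
    placement, used = {}, 0
    for name, cpus in workloads:
        if used + cpus <= private_capacity:
            placement[name] = "private"
            used += cpus
        else:
            placement[name] = "public"  # burst at peak demand
    return placement

# Private cloud dimensioned for average load (8 CPUs); peak demand exceeds it:
plan = place_workloads([("web", 4), ("db", 4), ("batch", 6)],
                       private_capacity=8)
```

A real cloud-bursting scheduler would also weigh data locality, network cost, and per-provider pricing, but the dimension-for-average, burst-for-peak principle is the same.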