Unit 5.1 DBMS

DBMS for school

Uploaded by

srnarayanan_slm
Copyright
© © All Rights Reserved

Chapter 5 Web-based databases & Data Warehousing

5.1 Web Database

A Web database is a database application designed to be managed and accessed through the
Internet. Website operators can manage this collection of data and present analytical results based
on the data in the Web database application. Web databases first appeared in the 1990s, and have been
an asset for businesses, allowing the collection of large amounts of data from large numbers of
customers.

A web database is essentially a database that can be accessed from a local network or the
internet instead of one that has its data stored on a desktop or its attached storage.

The Web-based database management system is one of the essential parts of DBMS and is
used to store web application data. A web-based database management system handles
databases that hold data for e-commerce, e-business, blogs, e-mail, and other
online applications.

5.2 Requirements for Web-DBMS Integration

While many DBMS vendors are working to provide proprietary database connectivity
solutions for the Web, most organizations need a more general solution that prevents them from
being tied to a single technology. Listed here are some of the most significant requirements for
integrating database applications with the Web. These requirements are ideals and are not fully
attainable at present. They are listed in no particular order:

 The ability and right to use valuable corporate data in a fully secured manner.
 Data- and vendor-independent connectivity that allows freedom of choice in
selecting the DBMS for present and future use.
 The capability to interface to the database, independent of any proprietary Web browser
and/or Web server.

 In the Web SQL Database API, neither openDatabase() nor openDatabaseSync() takes a
server argument. Both methods take the name of the database, the version of the database, a
display name, and the estimated size in bytes of the data that is to be stored in the database.
openDatabase() is available on Window and WorkerUtils; openDatabaseSync() is
available only on WorkerUtils.

 A connectivity solution that takes advantage of all the features of an organization's DBMS.
 An open-architectural structure that allows interoperability with a variety of systems and
technologies; such as:

 Different types of Web servers


 Microsoft's Distributed Component Object Model (DCOM) / Component Object Model
(COM)
 CORBA / IIOP
 Java / RMI (Remote Method Invocation)
 XML (Extensible Markup Language)
 Various Web services (SOAP, UDDI, etc.)

 A cost-effective approach that allows for scalability, growth, and changes in strategic
directions, and helps reduce the costs of developing and maintaining applications.
 Support for transactions that span multiple HTTP requests.
 Minimal administration overhead.

5.3 Benefits of the Web-DBMS Approach

The use of a web-based DBMS brings various benefits:

 Simplicity
 Platform independence
 Graphical User Interface (GUI)
 Standardization
 Cross-platform support
 Transparent network access
 Scalability
 Innovation

Applicable Uses

Businesses both large and small can use Web databases to create website polls, feedback
forms, client or customer and inventory lists. Personal Web database use can range from storing
personal email accounts to a home inventory to personal website analytics. The Web database is
entirely customizable to an individual's or business's needs.

MySQL

MySQL is often mentioned in the world of Web databases. It is a relational database management
system, queried with SQL (Structured Query Language), that manages many different Web databases.
It operates as a server and is an open source project. MySQL is often included with Web hosting
for managing either personal or business website databases. Because it is driven by SQL
statements, it can be more difficult to work with than a packaged Web database software program.

In the Web SQL Database API, the user agent treats statements that use the BEGIN, COMMIT, and
ROLLBACK SQL features as unsupported and marks them as bogus, because transactions are managed
through the API itself.

NoSQL stands for "Not only SQL".

Database Types

Other than the SQL (relational) database, there are six different types of database systems. Here’s a summary
of them:

 Distributed database: This system spreads the storage and processing of records across
multiple sites. It uses database replication to keep information consistent
across the different physical sites.
 Cloud database: These are more modern databases that run in a virtual environment. They
offer high computing power for processing very large numbers of records, and resources
can be scaled up quickly whenever the need arises.
 NoSQL database: NoSQL systems do not use the fixed table schemas of SQL set-ups. They are
well suited to handling large sets of unstructured data, and they commonly run across multiple
servers for better efficiency.
 Hierarchical database: A hierarchical DBMS stores information in a tree-like structure.
With this method, data is kept in categories that expand to various subcategories. Records
are linked through parent-child relationships.
 Centralized database: This database stores data in a central location. The
configuration allows multiple users to access the information remotely. Furthermore,
it’s easier to configure and manage.
 Network database: Network databases link records in a graph-like structure, so a record
can be related to several others. They suit organizations that handle multiple interrelated
datasets, such as customers, transactions, staff, and marketing.

5.4 Cloud database

A cloud database is a database service built and accessed through a cloud platform. The
cloud platform facilitates the storage, management, and retrieval of structured and unstructured data. It
serves many of the same functions as a traditional database with the added flexibility of cloud
computing. Users install software on a cloud infrastructure to implement the database.

In the Web SQL Database API, transaction() and readTransaction() are the two methods that take
three arguments: the transaction callback as the first argument, the error callback as the second,
and the success callback as the third.
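
The three-callback pattern can be sketched outside the browser. The following Python example is only an illustration, assuming the standard sqlite3 module as a stand-in for a Web SQL database. The helper run_transaction and the notes table are hypothetical names, and the sketch runs synchronously, whereas the Web SQL methods invoke their callbacks asynchronously.

```python
import sqlite3


def run_transaction(conn, callback, error_callback=None, success_callback=None):
    """Hypothetical helper: callback first, error callback second, success third."""
    try:
        with conn:  # commits if the callback succeeds, rolls back if it raises
            callback(conn)
    except sqlite3.Error as exc:
        if error_callback is not None:
            error_callback(exc)
        return
    if success_callback is not None:
        success_callback()


conn = sqlite3.connect(":memory:")
events = []  # records which callback fired for each transaction


def create_and_fill(tx):
    tx.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    tx.execute("INSERT INTO notes (body) VALUES (?)", ("first note",))


def bad_insert(tx):
    tx.execute("INSERT INTO notes (body) VALUES (?)", ("second note",))
    tx.execute("INSERT INTO missing_table VALUES (1)")  # fails, so the insert above is undone


for work in (create_and_fill, bad_insert):
    run_transaction(
        conn,
        work,
        error_callback=lambda exc: events.append(("error", str(exc))),
        success_callback=lambda: events.append(("success", None)),
    )
```

The failed transaction is rolled back as a whole, so only the row written by the first transaction remains in the table.
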
Key features:

 A database service built and accessed through a cloud platform

 One of the advantages of Cloud database is that it is automated

 Enables enterprise users to host databases without buying dedicated hardware

 Can be managed by the user or offered as a service and managed by a provider

 Can support relational databases (including MySQL and PostgreSQL) and NoSQL
databases (including MongoDB and Apache CouchDB)

 Accessed through a web interface or vendor-provided API

 In the Web SQL Database API, the changeVersion() method verifies the current version number and changes it as part of
a schema update. When this method is invoked, it returns immediately and then runs the transaction steps
asynchronously, taking the transaction callback as the third argument, the error callback as the fourth
argument, and the success callback as the fifth argument.
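
The idea of checking a version number and updating the schema in one step can be sketched in Python. This is an analogue only, not the Web SQL API: it assumes the standard sqlite3 module and SQLite's PRAGMA user_version as the stored version, and change_version and the customers table are hypothetical names. Unlike the browser method, the sketch runs synchronously.

```python
import sqlite3


def change_version(conn, old_version, new_version, migrate):
    """Hypothetical analogue: verify the version, migrate, then store the new version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    if current != old_version:
        raise RuntimeError(f"expected version {old_version}, found {current}")
    conn.execute("BEGIN")
    try:
        migrate(conn)  # schema changes run inside the transaction
        conn.execute(f"PRAGMA user_version = {int(new_version)}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # schema and version stay as they were
        raise


# isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT above.
conn = sqlite3.connect(":memory:", isolation_level=None)

change_version(
    conn, 0, 1,
    lambda c: c.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"),
)
change_version(
    conn, 1, 2,
    lambda c: c.execute("ALTER TABLE customers ADD COLUMN email TEXT"),
)
```

Calling change_version with an old version that does not match the stored one raises an error before any schema change is attempted.
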
Organizations that are implementing databases in the public cloud choose between the following
two deployment models:

1. Self-managed database. This is an infrastructure as a service (IaaS) environment, in which the
database runs in a virtual machine on a system operated by a cloud provider. The provider
manages and supports the cloud infrastructure, including servers, operating systems and storage
devices. But the user organization is responsible for database deployment, administration and
maintenance. As a result, it's akin to an on-premises deployment for the DBA, who retains full
management control of the database.

2. Managed database service. Also referred to as database as a service (DBaaS), these
environments are fully managed by the vendor, which could be a cloud platform
provider or another database vendor that runs its cloud DBMS on a platform provider's
infrastructure. Under the DBaaS model, both the system infrastructure and the database platform
are managed for the customer. The DBaaS vendor handles provisioning, backups, scaling,
patching, upgrades and other basic database administration functions, while the DBA monitors
the database and coordinates with the vendor on some administrative tasks. Similar data
warehouse as a service (DWaaS) offerings are also available for deployments of cloud data
warehouses.

In addition, some cloud providers -- Amazon Web Services (AWS) and Oracle, for example
-- offer versions of their DBaaS technologies for installation in on-premises data centers as part of a
private cloud or a hybrid cloud infrastructure that combines public and private clouds. As with a
regular DBaaS environment, the provider deploys the databases on its own systems and manages
them for customers, except that it instead delivers the systems to a customer's data center to run
there and then manages the databases remotely.

5.5 Types of cloud databases

A wide variety of cloud databases are available, matching the different types of database
technologies that can be deployed on premises. At this point, every notable database vendor offers
its software in the cloud. That includes cloud-native databases developed specifically for use in
cloud environments and existing on-premises databases that now support the cloud.

The following are examples of cloud database offerings:

i. Amazon Web Services
ii. Snowflake Computing
iii. Google Cloud Spanner
iv. Oracle Database Cloud Services
v. Microsoft SQL Server

The following are the key types of databases that cloud users can take
advantage of:

 Relational databases. SQL-based relational software has dominated the database market since
the 1990s and remains the most widely used technology, particularly well suited for transaction
processing and other applications involving structured data.

 NoSQL databases. NoSQL systems forego the rigid schemas of relational databases, making
them a better option for unstructured data. There are four major NoSQL product categories:
document databases, graph databases, wide-column stores and key-value databases.

 Multimodel databases. They support more than one data model, enabling them to run a wider
set of applications. Many relational and NoSQL databases now qualify as multimodel through
add-ons -- for example, the addition of a graph module to a relational DBMS.

 Distributed SQL databases. Initially labeled as NewSQL, these technologies distribute
relational databases across multiple computing nodes to create transactional systems that can
provide NoSQL-like levels of scalability.

 Cloud data warehouses. Initially developed to provide data warehousing capabilities for
business intelligence and reporting applications, they typically now also support data lake
development, machine learning and other advanced analytics functions.

5.6 Cloud database benefits


Compared with running databases on premises, cloud databases offer the following potential IT and
business advantages to an organization:

 Increased scalability and flexibility. Cloud database systems can be easily scaled up by adding
more processing and storage capacity when workloads increase. Some vendors offer autoscaling
features that do so dynamically, without users even needing to submit a request. In addition, an
organization can quickly deploy new databases and shut down ones that it no longer needs.

 Elimination of IT infrastructure. Because the cloud provider is responsible for the system
infrastructure in a cloud database environment, an organization may be able to reduce its own IT
footprint by decommissioning systems, especially if it's moving on-premises databases to the
cloud.

 Faster access to new features. With on-premises databases, users typically need to wait for and
then install a software upgrade to get new features and functionality. DBaaS vendors can update
their cloud databases on an ongoing basis, enabling organizations to take advantage of new
features as soon as they're available.

 More reliable systems with guaranteed uptime. Cloud vendors provide high availability,
automated backup and disaster recovery capabilities that may be more advanced than what an
organization implements itself.

5.7 Mobile databases

Mobile databases make data from database applications available to mobile users, and they
support applications that involve data processing. In general, a mobile database enables a
connection between computing devices across a wireless mobile network.

Mobile computing makes it possible for users to communicate while on the move.

Properties of mobile databases

 It resides on mobile devices.

 It provides a communication link between a central database server and mobile clients
that allows data to be transferred.
 The mobile database connects to the database of the central system.

 A mobile database enables users to view information while on the move.

 Mobile databases analyze data on mobile devices.

Types of mobile databases

Client-server mobile database

In the client-server model, the database server serves the client machines, which run the
application programs that users work with. These programs are in charge of query generation.

The computer network carries each generated query to the
database server. The database server checks the query, the data required to process it,
which user sent it, and the authenticity of that user.

The processed result is allocated to the respective client machine, which displays the result
to the user. The central server checks the syntax of the commands from the client’s queries. The
server system is hidden so that the user does not know about the server’s hardware and software.

Peer-to-peer mobile database

It is a database stored on the devices of the users of a mobile peer-to-peer network. Database
maintenance activities are distributed among the clients. Each device stores some
data items, and together these form the mobile peer-to-peer database.

Peer clients forward a request from one to another until the data items are
found. The peer-to-peer database concept suits searches for local information of a
temporary nature, for example the availability of a parking slot in a specific geographical area.

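
The forwarding of a request from peer to peer can be sketched as a breadth-first search with a hop limit. This is a simplified illustration, not a real protocol: the peer names, the links between them, the max_hops limit, and the find_item function are all assumptions made for the example.

```python
from collections import deque


def find_item(peers, links, start, key, max_hops=3):
    """Forward a query from peer to peer until `key` is found or hops run out.

    peers: peer name -> dict of data items stored on that device
    links: peer name -> list of neighbouring peer names
    Returns (peer_name, value, hops) or None if nothing was found in range.
    """
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        peer, hops = queue.popleft()
        if key in peers[peer]:
            return peer, peers[peer][key], hops
        if hops == max_hops:
            continue  # do not forward the request any further from this peer
        for neighbour in links.get(peer, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


peers = {
    "phone_a": {},
    "phone_b": {"parking:lot_12": "2 free slots"},
    "phone_c": {},
    "phone_d": {"parking:lot_7": "1 free slot"},
}
links = {
    "phone_a": ["phone_b", "phone_c"],
    "phone_b": ["phone_a"],
    "phone_c": ["phone_a", "phone_d"],
    "phone_d": ["phone_c"],
}

result = find_item(peers, links, "phone_a", "parking:lot_7")
```

The hop limit reflects the temporary, local nature of the data: a request that has travelled too far is simply dropped.
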
Applications that use peer-to-peer mobile databases are:

 Social networks.

 Airport application.
 Mobile E-commerce.

 Transport safety and efficiency.

Advantages of mobile databases

 It needs low power.

 It enhances mobility.

 It does not require many resources.

Disadvantages of mobile databases

 It is less secure.

 Bandwidth is limited.

 The mobile database consumes more power.

Characteristics of mobile environments

 Limited bandwidth of wireless networks.

 Limited power supply, so power consumption must be low.

 Mobility of users and devices.

 Frequent disconnections.

 Limited device resources.

Requirements of mobile databases

1. Memory footprint - This is the amount of primary memory a process takes up. The size of the mobile
database affects the amount of memory a process will take. Mobile databases should have a
small footprint since mobile devices have limited memory.

2. Flash-optimized storage system - Mobile databases need to be optimized for the flash
storage devices used in mobile hardware.

3. Data synchronization - Mobile databases should have synchronization functionality to
maintain consistency within the data.

4. Security - Mobile databases should implement end-to-end security to ensure secure data
transfer.

5. Low power consumption - Mobile databases need to be optimized for efficient
power consumption.

6. Self-management - Mobile databases need to manage themselves and perform
database administrator tasks automatically.

7. Embeddable - Databases must be embeddable as a Dynamic Link Library (DLL) file so that
administrators can have direct access to mobile devices.
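
One simple way to keep a device and the central server consistent is a last-write-wins merge. The sketch below assumes every record carries a timestamp and that the newer write should win; real synchronization engines use more elaborate conflict handling. The synchronize function and the sample records are hypothetical.

```python
def synchronize(local, remote):
    """Merge two {key: (timestamp, value)} stores; the newer write wins."""
    merged = dict(remote)
    for key, (stamp, value) in local.items():
        if key not in merged or stamp > merged[key][0]:
            merged[key] = (stamp, value)
    return merged


# Records changed on the device while it was disconnected.
device = {"cart": (5, ["pen"]), "profile": (2, "old address")}
# Records changed on the central server in the meantime.
server = {"profile": (4, "new address"), "orders": (1, [])}

merged = synchronize(device, server)
# After the merge, both sides adopt the same consistent copy.
device = dict(merged)
server = dict(merged)
```
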

Transaction management in mobile database systems

 A transaction is a unit of work carrying out instructions within a database management
system. In a mobile environment, one should allow for restricted bandwidth because of the
mobility of the host.

 If the bandwidth is high, the connection is strong and information is easily accessed; if the
bandwidth is low, the connection is weak. A mobile transaction checks whether the
bandwidth is high or low, and it can switch from a strong connection to a weak
connection or vice versa.

5.8 Data Warehouse

A Data Warehouse (DW) is a relational database that is designed for query and analysis
rather than transaction processing. It includes historical data derived from transaction data from
single or multiple sources. The data warehouse is not generally updated in real time.

A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on
providing support for decision-makers for data modeling and analysis. The data warehouse can
include database tables, flat files, and online data.

A Data Warehouse is a group of data specific to the entire organization, not only to a particular
group of users. It is used in decision support systems.

It is not used for daily operations and transaction processing but for making decisions.

A Data Warehouse can be viewed as a data system with the following attributes:

o It is a database designed for investigative tasks, using data from various applications.

o The multidimensional model of a data warehouse is known as a data cube.

o It supports a relatively small number of clients with relatively long interactions.

o It includes current and historical data to provide a historical perspective of information.

o It is based on a multidimensional model.

o Its usage is read-intensive.

o It contains a few large tables.

o Data warehousing systems are mostly used for reporting and data analysis.
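
A data cube can be pictured as the same measure totalled over every combination of dimensions. The following Python sketch is a toy illustration under assumed data: the sales rows, the dimension names, and the build_cube function are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Total `measure` for every subset of `dimensions` (one cuboid per subset)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(totals)
    return cube


sales = [
    {"region": "North", "product": "Pen", "year": 2023, "amount": 100},
    {"region": "North", "product": "Ink", "year": 2023, "amount": 40},
    {"region": "South", "product": "Pen", "year": 2023, "amount": 70},
    {"region": "South", "product": "Pen", "year": 2024, "amount": 90},
]

cube = build_cube(sales, ("region", "product", "year"), "amount")
by_region = cube[("region",)]   # totals per region
grand_total = cube[()][()]      # total over all dimensions
```

With three dimensions the cube holds eight cuboids, from the grand total up to the full region-product-year breakdown.
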

"Data Warehouse is a subject-oriented, integrated, and time-variant store of information in support
of management's decisions."

5.9 Data Warehouse Applications

A data warehouse helps business executives to organize, analyze, and use their data for decision
making. A data warehouse serves as part of a plan-execute-assess "closed-loop" feedback
system for enterprise management. The small logical units in which data warehouses hold large
amounts of data are known as data marts. Data warehouses are widely used in the following
fields −

 Financial services
 Banking services
 Consumer goods
 Retail sectors
 Controlled manufacturing

5.10 Types of Data Warehouse Applications

Information processing, analytical processing, and data mining are the three types of data
warehouse applications.

 Information Processing − A data warehouse allows the data stored in it to be processed. The
data can be processed by means of querying, basic statistical analysis, and reporting using
crosstabs, tables, charts, or graphs.
 Analytical Processing − A data warehouse supports analytical processing of the
information stored in it. The data can be analyzed by means of basic OLAP operations,
including slice-and-dice, drill down, drill up, and pivoting.
 Data Mining − Data mining supports knowledge discovery by finding hidden patterns and
associations, constructing analytical models, performing classification and prediction.
These mining results can be presented using the visualization tools.
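
The basic OLAP operations named above can be sketched on a small in-memory fact list. This is an illustration only: the sample facts and the function names slice_, dice, aggregate, and pivot are assumptions, and a real OLAP server works on far larger data with precomputed aggregates.

```python
from collections import defaultdict

COLUMNS = ("year", "quarter", "region", "product", "units")
FACTS = [
    (2023, "Q1", "North", "Pen", 10),
    (2023, "Q1", "South", "Pen", 7),
    (2023, "Q2", "North", "Ink", 4),
    (2024, "Q1", "North", "Pen", 12),
    (2024, "Q1", "South", "Ink", 5),
]
ROWS = [dict(zip(COLUMNS, fact)) for fact in FACTS]


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def aggregate(rows, dims):
    """Sum units per combination of dims: fewer dims drills up, more dims drills down."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r["units"]
    return dict(totals)


def pivot(rows, row_dim, col_dim):
    """Pivot: rotate two dimensions into a row-by-column table."""
    table = defaultdict(dict)
    for (row_value, col_value), total in aggregate(rows, (row_dim, col_dim)).items():
        table[row_value][col_value] = total
    return dict(table)


by_year = aggregate(ROWS, ("year",))               # drill up to the year level
by_quarter = aggregate(ROWS, ("year", "quarter"))  # drill down to quarters
north_only = slice_(ROWS, "region", "North")
north_pens = dice(ROWS, region={"North"}, product={"Pen"})
region_by_product = pivot(ROWS, "region", "product")
```
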


Sr. No. | Data Warehouse (OLAP) | Operational Database (OLTP)
1 | It involves historical processing of information. | It involves day-to-day processing.
2 | OLAP systems are used by knowledge workers such as executives, managers, and analysts. | OLTP systems are used by clerks, DBAs, or database professionals.
3 | It is used to analyze the business. | It is used to run the business.
4 | It focuses on Information out. | It focuses on Data in.
5 | It is based on Star Schema, Snowflake Schema, and Fact Constellation Schema. | It is based on Entity Relationship Model.
6 | It is subject oriented. | It is application oriented.
7 | It contains historical data. | It contains current data.
8 | It provides summarized and consolidated data. | It provides primitive and highly detailed data.
9 | It provides summarized and multidimensional view of data. | It provides detailed and flat relational view of data.
10 | The number of users is in hundreds. | The number of users is in thousands.
11 | The number of records accessed is in millions. | The number of records accessed is in tens.
12 | These are highly flexible. | It provides high performance.
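
The star schema mentioned in row 5 can be sketched with one fact table surrounded by dimension tables. The example below is a minimal illustration using Python's standard sqlite3 module; the table names, columns, and figures are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table holds the measures and points at each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES
    (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Lamp', 'Furniture');
INSERT INTO dim_date VALUES (1, 2023, 12), (2, 2024, 1);
INSERT INTO fact_sales VALUES
    (1, 1, 10, 50.0), (2, 1, 4, 40.0), (3, 2, 1, 30.0), (1, 2, 6, 30.0);
""")

# OLAP-style query: summarized, consolidated data across the dimensions.
revenue_by_year_and_category = conn.execute("""
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()
```

The query reads many fact rows and returns a few summarized ones, which is the access pattern the OLAP column of the table describes.
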

5.11 Types of Data Warehouse Models

Enterprise warehouse, data mart, and virtual warehouse are the three types of data warehouse
models.

Enterprise Warehouse

An Enterprise warehouse collects all of the records about subjects spanning the entire
organization. It supports corporate-wide data integration, usually from one or more operational
systems or external data providers, and it's cross-functional in scope. It generally contains detailed
information as well as summarized information and can range in size from a few gigabytes to
hundreds of gigabytes, terabytes, or beyond. The source of all data warehouse data is known as the
operational environment.

An enterprise data warehouse may be implemented on traditional mainframes, UNIX super
servers, or parallel architecture platforms. It requires extensive business modeling and may take
years to develop and build. The properties of a data warehouse are:

1. Collection from heterogeneous sources
2. Subject oriented
3. Time variant

Data Mart

A data mart includes a subset of corporate-wide data that is of value to a specific collection
of users. A data mart is defined as a subgroup of the data warehouse. Its scope is confined to
particular selected subjects. A data warehouse does not require transaction processing, recovery,
and concurrency controls, because it is physically stored separately from the operational
database.

For example, a marketing data mart may restrict its subjects to the customer, items, and sales. The
data contained in the data marts tend to be summarized.

Data marts are divided into two types:

Independent Data Mart: An independent data mart is sourced from data captured from one or more
operational systems or external data providers, or from data generated locally within a particular
department or geographic area.

Dependent Data Mart: Dependent data marts are sourced directly from enterprise data
warehouses.

Data cleaning involves finding and correcting the errors in data.

Virtual Warehouses

A virtual warehouse is a set of views over operational databases. For effective
query processing, only some of the possible summary views may be materialized. A virtual
warehouse is simple to build but requires excess capacity on operational database servers.
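
A view over an operational table can be sketched with Python's standard sqlite3 module. The orders table and the view name are invented for the example. SQLite has no materialized views, so a snapshot table stands in for a materialized summary here; that substitution is an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Operational table, updated by day-to-day transactions.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, placed_on TEXT
);
INSERT INTO orders (customer, amount, placed_on) VALUES
    ('Asha', 120.0, '2024-01-05'),
    ('Ravi', 80.0, '2024-01-20'),
    ('Asha', 50.0, '2024-02-02');

-- Virtual warehouse: a summary view computed from the operational data on demand.
CREATE VIEW monthly_sales AS
    SELECT substr(placed_on, 1, 7) AS month,
           SUM(amount) AS total,
           COUNT(*) AS order_count
    FROM orders
    GROUP BY substr(placed_on, 1, 7);

-- Stand-in for a materialized summary: stored once, not refreshed automatically.
CREATE TABLE monthly_sales_snapshot AS SELECT * FROM monthly_sales;
""")


def monthly(source):
    """Read the monthly summary from either the view or the snapshot table."""
    query = f"SELECT month, total, order_count FROM {source} ORDER BY month"
    return conn.execute(query).fetchall()


conn.execute(
    "INSERT INTO orders (customer, amount, placed_on) VALUES ('Meena', 30.0, '2024-02-10')"
)
live = monthly("monthly_sales")             # recomputed, includes the new order
stored = monthly("monthly_sales_snapshot")  # unchanged until rebuilt
```

The view always reflects the operational data but is recomputed on every query, which is where the extra load on the operational servers comes from.
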

5.12 Massive Dataset

Data processing frameworks, such as Apache Hadoop and Spark, have been powering the
development of Big Data. They can gather vast amounts of data from different data streams.
However, they need a data warehouse to analyze, manage, and query all the data.

Data Warehouse Architecture

There are three ways you can construct a data warehouse system. These approaches are classified
by the number of tiers in the architecture. Therefore, you can have a:

 Single-tier architecture
 Two-tier architecture
 Three-tier architecture

Single-tier Data Warehouse Architecture

The single-tier architecture is not a frequently practiced approach. The main goal of having
such an architecture is to remove redundancy by minimizing the amount of data stored.
Its primary disadvantage is that it doesn’t have a component that separates analytical
and transactional processing.

Two-tier Data Warehouse Architecture

A two-tier architecture includes a staging area for all data sources, before the data
warehouse layer. By adding a staging area between the sources and the storage repository, you
ensure all data loaded into the warehouse is cleansed and in the appropriate format.

This approach has certain network limitations. Additionally, you cannot expand it to support a
larger number of users.

Three-tier Data Warehouse Architecture

The three-tier approach is the most widely used architecture for data warehouse systems.
Essentially, it consists of three tiers:

1. The bottom tier is the database of the warehouse, where the cleansed and transformed data
is loaded. Back-end tools and utilities are used to feed data into it from operational databases
or other external sources.
2. The middle tier is the application layer giving an abstracted view of the database. It
arranges the data to make it more suitable for analysis. This is done with an OLAP server,
implemented using the ROLAP or MOLAP model.
3. The top tier is where the user accesses and interacts with the data. It represents the
front-end client layer, which contains query and reporting tools, analysis tools, and data mining
tools.

Data Warehouse Components

From the architectures outlined above, you notice some components overlap, while others are
unique to the number of tiers. Data warehouse database servers are known as the heart of the
warehouse.

Below you will find some of the most important data warehouse components and their roles in the
system.

ETL Tools

ETL stands for Extract, Transform, and Load. The staging layer uses ETL tools to extract the
needed data from various formats and checks the quality before loading it into the data warehouse.

The data coming from the data source layer can come in a variety of formats. Before merging all
the data collected from multiple sources into a single database, the system must clean and organize
the information.
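
The extract, transform, and load steps can be sketched in a few lines of Python. This is a minimal illustration using the standard csv and sqlite3 modules; the sample CSV text, the orders table, and the cleaning rules (skip rows with a bad amount, drop duplicate order ids, normalize text) are assumptions chosen for the example.

```python
import csv
import io
import sqlite3

RAW_CSV = """order_id,customer,amount,country
1, Asha ,120.50,in
2,Ravi,not-a-number,IN
3,Meena,80,us
3,Meena,80,us
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: check quality, drop duplicates, and normalize formats."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # quality check: skip rows with an unreadable amount
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # drop duplicate records
        seen.add(order_id)
        clean.append(
            (order_id, row["customer"].strip(), amount, row["country"].strip().upper())
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, country TEXT)"
    )
    with conn:
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
cleaned = transform(extract(RAW_CSV))
load(warehouse, cleaned)
```
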

Data

Once the system cleans and organizes the data, it stores it in the data warehouse. The data
warehouse represents the central repository that stores metadata, summary data, and raw data
coming from each source.

 Metadata is the information that defines the data. Its primary role is to simplify working
with data instances. It allows data analysts to classify, locate, and direct queries to the
required data.
 Summary data is generated by the warehouse manager. It updates as new data loads into
the warehouse. This component can include lightly or highly summarized data. Its main role
is to speed up query performance.
 Raw data is the actual data loaded into the repository, which has not been processed.
Having the data in its raw form makes it accessible for further processing and analysis.

Some of the tools used include:

 Reporting tools. They play a crucial role in understanding how your business is doing and
what should be done next. Reporting tools include visualizations such as graphs and charts
showing how data changes over time.
 OLAP tools. Online analytical processing tools which allow users to analyze
multidimensional data from multiple perspectives. These tools provide fast processing and
valuable analysis. They extract data from numerous relational data sets and reorganize it
into a multidimensional format.
 Data mining tools. Examine data sets to find patterns within the warehouse and the
correlation between them. Data mining also helps establish relationships when analyzing
multidimensional data.

Data Marts

Data marts allow you to have multiple groups within the system by segmenting the data in the
warehouse into categories. They partition the data, preparing it for a particular user group. Data in
operational systems is typically fragmented and inconsistent.

For instance, you can use data marts to categorize information by departments within the company.
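
Segmenting warehouse data by department can be sketched as a simple partition. The rows, the department key, and the function names below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def build_data_marts(warehouse_rows, key="department"):
    """Partition warehouse rows into one data mart per value of `key`."""
    marts = defaultdict(list)
    for row in warehouse_rows:
        marts[row[key]].append(row)
    return dict(marts)


def summarize(mart, measure="amount"):
    """Data marts tend to hold summarized data; here, a simple total."""
    return sum(row[measure] for row in mart)


warehouse_rows = [
    {"department": "Sales", "item": "Pen", "amount": 120},
    {"department": "Marketing", "item": "Flyer", "amount": 45},
    {"department": "Sales", "item": "Ink", "amount": 30},
]

marts = build_data_marts(warehouse_rows)
sales_total = summarize(marts["Sales"])
```
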
