DW Unit-2


Q1: Difference between OLAP and OLTP?

OLAP stands for Online Analytical Processing. OLAP systems can analyze
database information from multiple systems at the same time. The primary goal
of OLAP is data analysis, not data processing.
OLTP stands for Online Transaction Processing. OLTP systems administer the
day-to-day transactions of an organization. The main goal of OLTP is data
processing, not data analysis.

| Category | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing) |
|---|---|---|
| Definition | It is well-known as an online database query management system. | It is well-known as an online database modifying system. |
| Data source | Consists of historical data from various databases. | Consists of only operational current data. |
| Method used | It makes use of a data warehouse. | It makes use of a standard database management system (DBMS). |
| Application | It is subject-oriented. Used for data mining, analytics, decision-making, etc. | It is application-oriented. Used for business tasks. |
| Normalized | In an OLAP database, tables are not normalized. | In an OLTP database, tables are normalized (3NF). |
| Usage of data | The data is used in planning, problem-solving, and decision-making. | The data is used to perform day-to-day fundamental operations. |
| Task | It provides a multi-dimensional view of different business tasks. | It reveals a snapshot of present business tasks. |
| Purpose | It serves the purpose to extract information for analysis and decision-making. | It serves the purpose to insert, update, and delete information from the database. |
| Volume of data | A large amount of data is stored, typically in TB or PB. | The size of the data is relatively small, typically in MB or GB, because historical data is archived. |
| Queries | Relatively slow as the amount of data involved is large. Queries may take hours. | Very fast as the queries operate on only about 5% of the data. |
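
The contrast in the table can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical `sales` table (the table name, columns, and values are assumptions made up for this example). A real OLAP workload would normally run on a data warehouse, not on SQLite; the point is only the difference in the shape of the queries.

```python
import sqlite3

# In-memory database with a hypothetical `sales` table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

# OLTP-style work: short transactions that insert or update individual rows.
rows = [
    ("North", 2022, 100.0),
    ("North", 2023, 150.0),
    ("South", 2022, 80.0),
    ("South", 2023, 120.0),
]
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(
        "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)", rows
    )
    conn.execute(
        "UPDATE sales SET amount = 90.0 WHERE region = 'South' AND year = 2022"
    )

# OLAP-style work: a read-only aggregate over many rows, used for analysis.
totals_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals_by_region)
conn.close()
```

The OLTP statements touch one or a few rows each, while the OLAP query summarizes the whole table.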

Q2: List the different operations on OLTP?

OLTP (Online Transaction Processing) systems are designed for fast, real-
time transaction processing. Here are some of the common operations
involved in OLTP systems:

1. Insert: Adding new records or data entries into the database. This could
involve inserting new customer orders, financial transactions, or any other
type of data that needs to be recorded.
2. Update: Modifying existing records in the database. This operation might
involve changing customer information, updating inventory levels, or
editing transaction details.
3. Delete: Removing records from the database. This operation could involve
deleting canceled orders, purging outdated data, or removing irrelevant
information.
4. Read: Retrieving specific data from the database. This operation is crucial
for accessing information such as customer details, product availability, or
transaction history.
5. Search: Querying the database for specific data based on certain criteria.
This operation might involve searching for all orders placed by a particular
customer, finding products within a certain price range, or locating
transactions within a specific time period.
6. Commit: Finalizing a transaction and making its changes permanent in the
database. This operation ensures that changes made during a transaction
are saved and can be accessed by other users or applications.
7. Rollback: Canceling or reverting changes made during a transaction that
has not yet been committed. This operation is important for maintaining
data consistency and integrity, especially in cases where errors occur during
transaction processing.
8. Concurrency Control: Managing simultaneous access to data by multiple
users or applications to prevent conflicts and ensure data consistency. This
involves techniques such as locking, timestamping, or optimistic
concurrency control.
9. Indexing: Organizing and optimizing data access by creating indexes on
frequently queried columns or fields. This operation helps improve the
performance of read and search operations by allowing the database to
quickly locate relevant data.
10. Transaction Management: Ensuring the ACID (Atomicity, Consistency,
Isolation, Durability) properties of transactions. This involves managing
transaction boundaries, handling commit and rollback operations, and
ensuring data integrity even in the event of system failures or errors.
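
Several of the operations listed above (insert, update, read, delete, commit, rollback) can be seen together in a minimal sketch. It uses Python's built-in sqlite3 module; the `accounts` table, the `transfer` helper, and the account names are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint rejects any update that would make a balance negative.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, owner TEXT, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)

# Insert: add new records, then Commit to make them permanent.
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('alice', 100.0)")
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('bob', 50.0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        # Update: both changes belong to the same transaction.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
            (amount, src),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Rollback: undo the credit that was already applied to dst.
        conn.rollback()
        return False


ok_small = transfer(conn, "alice", "bob", 30.0)   # fits within alice's balance
ok_large = transfer(conn, "alice", "bob", 500.0)  # violates the CHECK constraint

# Read: retrieve the current balances.
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
print(ok_small, ok_large, balances)

# Delete: remove a record and commit the change.
conn.execute("DELETE FROM accounts WHERE owner = ?", ("bob",))
conn.commit()
remaining = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```

The failed transfer leaves both balances exactly as they were before it started, which is the atomicity part of ACID.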

Q3: Explain Data Mining in detail?

Data mining, also known as knowledge discovery in data (KDD), is the process of
uncovering patterns and other valuable information from large data sets. Given
the evolution of data warehousing technology and the growth of big data,
adoption of data mining techniques has rapidly accelerated over the last couple of
decades, assisting companies by transforming their raw data into useful
knowledge. However, although the technology continuously evolves
to handle data at a large scale, leaders still face challenges with scalability and
automation.

Data mining process

The data mining process involves a number of steps, from data collection to visualization, to
extract valuable information from large data sets. Data mining techniques
are used to generate descriptions and predictions about a target data set.

Data mining usually consists of four main steps: setting objectives, data gathering and
preparation, applying data mining algorithms, and evaluating results.

1. Set the business objectives
2. Data preparation
3. Model building and pattern mining
4. Evaluation of results and implementation of knowledge

Data mining techniques

Association rules: An association rule is a rule-based method for finding relationships between
variables in a given dataset.
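
Association rules are usually scored with two measures, support and confidence. The sketch below computes both for a made-up list of shopping baskets; the data and the function names are assumptions for illustration, not taken from any particular library.

```python
# Hypothetical market-basket data: each set is one transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Rule under test: "customers who buy bread also buy milk".
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Here bread and milk appear together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain milk.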

Neural networks: Primarily leveraged for deep learning algorithms, neural networks process
training data by mimicking the interconnectivity of the human brain through layers of nodes.

Decision tree: This data mining technique uses classification or regression methods to classify
or predict potential outcomes based on a set of decisions.
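
A decision tree is built from individual splits. As a rough sketch, the code below finds the single best threshold on one numeric feature using Gini impurity, which is one common splitting criterion (others exist). The data set and function names are hypothetical; a full tree would apply the same search recursively to each side of the split.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted Gini) of the best `value <= threshold` split."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must put at least one sample on each side
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Made-up example: did a customer of a given age buy the product?
ages = [22, 25, 30, 41, 47, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
print(threshold, impurity)
```

For this data the split `age <= 30` separates the two classes completely, so its weighted impurity is zero.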

K-nearest neighbor (KNN): K-nearest neighbor, also known as the KNN algorithm, is a
non-parametric algorithm that classifies data points based on their proximity and association to other
available data.
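
A minimal sketch of KNN classification follows, assuming Euclidean distance and a simple majority vote among the k nearest points. The points, labels, and the `knn_predict` name are made up for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two hypothetical clusters of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["A", "A", "A", "B", "B", "B"]

near_first_cluster = knn_predict(points, labels, (1.2, 1.5))
near_second_cluster = knn_predict(points, labels, (6.8, 6.9))
print(near_first_cluster, near_second_cluster)
```

No model is trained in advance: the algorithm keeps the data and compares each new point against it at prediction time.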

Data mining applications


Sales and marketing
Education
Operational optimization
Fraud detection

Q4: List the different operations on OLTP?
1. Data Entry: OLTP systems facilitate the entry of new data into the database. This could
include adding new records, updating existing records, or deleting records as necessary.
2. Concurrent Access: OLTP systems are designed to handle multiple transactions
simultaneously, allowing multiple users to access and modify the database concurrently
without interfering with each other.
3. Data Retrieval: OLTP systems support the retrieval of specific data from the database in
response to user queries. These queries are typically focused on retrieving small sets of
data relevant to specific transactions or inquiries.
4. ACID Compliance: OLTP systems ensure data integrity by adhering to the ACID
(Atomicity, Consistency, Isolation, Durability) properties. This ensures that transactions
are processed reliably and consistently, even in the presence of system failures or
concurrent access.
5. Indexing and Query Optimization: OLTP systems often utilize indexing and query
optimization techniques to improve the performance of data retrieval operations.
Indexes are created on frequently queried columns to speed up data access, and query
optimization techniques are employed to ensure efficient execution of database queries.
6. Transaction Management: OLTP systems manage transactions to ensure that they are
processed reliably and efficiently. This includes handling transaction concurrency,
ensuring data consistency, and maintaining data integrity throughout the transaction
lifecycle.
7. Backup and Recovery: OLTP systems implement backup and recovery mechanisms to
protect against data loss and ensure data availability in the event of system failures or
disasters.
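
The effect of indexing on data retrieval (point 5 above) can be observed with a small sketch. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the `orders` table, the index name, and the `plan` helper are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)
conn.commit()

query = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return the human-readable detail text of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Without an index, SQLite has to scan the whole table.
plan_before = plan(conn, query, (42,))

# Index the frequently queried column, then ask for the plan again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, query, (42,))

matching_rows = conn.execute(query, (42,)).fetchall()
print(plan_before)
print(plan_after)
```

The query returns the same rows either way; only the access path changes, from a full scan to a lookup through the index.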

Q5: Explain the step-by-step process of data mining?
