DW Unit-2
OLAP stands for Online Analytical Processing. OLAP systems can analyze
database information from multiple systems at the same time. The primary
goal of an OLAP service is data analysis, not data processing.
OLTP stands for Online Transaction Processing. OLTP systems administer the
day-to-day transactions of an organization. The main goal of OLTP is data
processing, not data analysis.
| Category | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing) |
|---|---|---|
| Definition | It is well known as an online database query management system. | It is well known as an online database modifying system. |
| Method used | It makes use of a data warehouse. | It makes use of a standard database management system (DBMS). |
| Application | It is subject-oriented. Used for data mining, analytics, decision making, etc. | It is application-oriented. Used for business tasks. |
| Task | It provides a multi-dimensional view of different business tasks. | It reveals a snapshot of present business tasks. |
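The contrast between the two workloads can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical `sales` table with made-up rows; a real OLAP system would normally query a data warehouse rather than the transactional database, so treat this only as an illustration of the two query styles.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, product TEXT, amount REAL)"
)
rows = [
    ("North", "Laptop", 1200.0),
    ("North", "Phone", 800.0),
    ("South", "Laptop", 1100.0),
    ("South", "Phone", 750.0),
]
conn.executemany(
    "INSERT INTO sales (region, product, amount) VALUES (?, ?, ?)", rows
)
conn.commit()

# OLTP-style access: fetch one specific record by its key.
oltp_row = conn.execute(
    "SELECT region, product, amount FROM sales WHERE id = ?", (3,)
).fetchone()

# OLAP-style access: aggregate many records along a dimension (region).
olap_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(oltp_row)   # a single transaction record
print(olap_rows)  # totals per region
```

The first query touches one row and returns quickly, which is typical of transaction processing; the second scans the table and summarizes it, which is typical of analysis.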
OLTP (Online Transaction Processing) systems are designed for fast, real-
time transaction processing. Here are some of the common operations
involved in OLTP systems:
1. Insert: Adding new records or data entries into the database. This could
involve inserting new customer orders, financial transactions, or any other
type of data that needs to be recorded.
2. Update: Modifying existing records in the database. This operation might
involve changing customer information, updating inventory levels, or
editing transaction details.
3. Delete: Removing records from the database. This operation could involve
deleting canceled orders, purging outdated data, or removing irrelevant
information.
4. Read: Retrieving specific data from the database. This operation is crucial
for accessing information such as customer details, product availability, or
transaction history.
5. Search: Querying the database for specific data based on certain criteria.
This operation might involve searching for all orders placed by a particular
customer, finding products within a certain price range, or locating
transactions within a specific time period.
6. Commit: Finalizing a transaction and making its changes permanent in the
database. This operation ensures that changes made during a transaction
are saved and can be accessed by other users or applications.
7. Rollback: Canceling or reverting changes made during a transaction that
has not yet been committed. This operation is important for maintaining
data consistency and integrity, especially in cases where errors occur during
transaction processing.
8. Concurrency Control: Managing simultaneous access to data by multiple
users or applications to prevent conflicts and ensure data consistency. This
involves techniques such as locking, timestamping, or optimistic
concurrency control.
9. Indexing: Organizing and optimizing data access by creating indexes on
frequently queried columns or fields. This operation helps improve the
performance of read and search operations by allowing the database to
quickly locate relevant data.
10. Transaction Management: Ensuring the ACID (Atomicity, Consistency,
Isolation, Durability) properties of transactions. This involves managing
transaction boundaries, handling commit and rollback operations, and
ensuring data integrity even in the event of system failures or errors.
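Several of the operations listed above can be sketched with Python's built-in sqlite3 module. The `orders` table, the customer names, and the simulated failure are hypothetical and chosen only for illustration; production OLTP systems typically run on a server-based DBMS with more elaborate concurrency control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

# Indexing: speed up searches on a frequently queried column.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# Insert: add new records, then commit to make them permanent.
cur.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("alice", 40.0))
cur.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("bob", 15.0))
conn.commit()

# Update and delete inside one transaction, then commit.
cur.execute("UPDATE orders SET total = ? WHERE customer = ?", (45.0, "alice"))
cur.execute("DELETE FROM orders WHERE customer = ?", ("bob",))
conn.commit()

# Rollback: an error before commit reverts the uncommitted insert.
try:
    cur.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("carol", 99.0))
    raise RuntimeError("simulated failure before commit")
except RuntimeError:
    conn.rollback()

# Read / search: retrieve what is actually stored.
remaining = cur.execute("SELECT customer, total FROM orders ORDER BY id").fetchall()
print(remaining)
```

Only the committed changes survive: the row for "carol" is discarded by the rollback, which is the atomicity behavior described under transaction management.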
Data mining, also known as knowledge discovery in data (KDD), is the process of
uncovering patterns and other valuable information from large data sets. Given
the evolution of data warehousing technology and the growth of big data,
adoption of data mining techniques has rapidly accelerated over the last couple of
decades, assisting companies by transforming their raw data into useful
knowledge. However, even though the technology continuously evolves to handle
data at a large scale, leaders still face challenges with scalability and
automation.
Data mining process
The data mining process involves a number of steps, from data collection to visualization, to
extract valuable information from large data sets. Data mining techniques are used to generate
descriptions and predictions about a target data set.
Data mining usually consists of four main steps: setting objectives, data gathering and
preparation, applying data mining algorithms and evaluating results.
Commonly applied data mining algorithms include the following:
Association rules: An association rule is a rule-based method for finding relationships between
variables in a given dataset.
Neural networks: Primarily leveraged for deep learning algorithms, neural networks process
training data by mimicking the interconnectivity of the human brain through layers of nodes.
Decision tree: This data mining technique uses classification or regression methods to classify
or predict potential outcomes based on a set of decisions.
K-nearest neighbor (KNN): K-nearest neighbor, also known as the KNN algorithm, is a
non-parametric algorithm that classifies data points based on their proximity and association to
other available data.
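As an illustration of the last technique, here is a minimal KNN sketch in plain Python. The function name `knn_classify`, the two-dimensional points, and the labels "A" and "B" are all made up for this example; it assumes Euclidean distance and a simple majority vote, which is one common formulation rather than the only one.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs, where each point is a tuple
    of numbers. Distance is Euclidean (an assumption for this sketch).
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical training data: two well-separated clusters.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"),
    ((6.5, 7.0), "B"),
    ((7.0, 6.5), "B"),
]

prediction_near_a = knn_classify(train, (1.8, 1.6))
prediction_near_b = knn_classify(train, (6.4, 6.2))
print(prediction_near_a, prediction_near_b)
```

No model is fitted in advance: the training points themselves are consulted at prediction time, which is what "non-parametric" refers to here.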