Data Warehousing Interview Questions and Answers
Conventional (Slow):
All the constraints and keys are validated against the data before it is
loaded; this way data integrity is maintained.
Direct (Fast):
All the constraints and keys are disabled before the data is loaded.
Once the data is loaded, it is validated against all the constraints and keys.
If data is found to be invalid or dirty, it is not included in the index, and
all future processes skip this data.
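The difference between the two load modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the `is_valid` rule (unique `id`, non-negative `amount`), and the function names are all hypothetical, and real loaders work on database tables rather than lists.

```python
def is_valid(row, seen_ids):
    """Hypothetical constraints: id must be unique, amount must be non-negative."""
    return row["id"] not in seen_ids and row["amount"] >= 0


def conventional_load(rows):
    """Validate every row BEFORE loading it; invalid rows never enter the table."""
    table, rejected, seen_ids = [], [], set()
    for row in rows:
        if is_valid(row, seen_ids):
            table.append(row)
            seen_ids.add(row["id"])
        else:
            rejected.append(row)
    return table, rejected


def direct_load(rows):
    """Load everything with checks disabled, then validate in a second pass."""
    table = [dict(row, valid=None) for row in rows]  # bulk load, no checks
    seen_ids = set()
    for row in table:  # post-load validation pass
        row["valid"] = is_valid(row, seen_ids)
        if row["valid"]:
            seen_ids.add(row["id"])
    # Only valid rows are indexed; later processing would read from the index.
    index = {row["id"]: row for row in table if row["valid"]}
    return table, index


rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": -5.0},  # violates the amount rule
    {"id": 1, "amount": 7.0},   # duplicate key
]
loaded, rejected = conventional_load(rows)
raw_table, index = direct_load(rows)
```

In the conventional sketch the two bad rows are rejected up front; in the direct sketch all three rows land in the table, but only the valid one appears in the index.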
What is OLTP?
OLTP is an abbreviation of On-Line Transaction Processing. An OLTP system is
an application that modifies data the instant it receives it and has a
large number of concurrent users.
What is OLAP?
OLAP is an abbreviation of Online Analytical Processing. An OLAP system is an
application that collects, manages, processes, and presents
multidimensional data for analysis and management purposes.
Data Source
OLTP: Operational data; the OLTP system is the original source of the data.
OLAP: Consolidated data drawn from various sources.
Process Goal
OLTP: A snapshot of ongoing business processes; it carries out fundamental
business tasks.
OLAP: Multi-dimensional views of business activities for planning and
decision making.
http://www.SQLAuthority.com v1.0
Database Design
OLTP: Normalized, small database. Speed is not an issue because the
database is small, and normalization does not degrade performance. It
adopts an entity-relationship (ER) model and an application-oriented
database design.
OLAP: De-normalized, large database. Speed is an issue because the
database is large, and de-normalizing improves performance because there
are fewer tables to scan while performing tasks. It adopts a star,
snowflake, or fact constellation model of subject-oriented database
design.
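To make the star model concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample figures are assumptions made up for illustration; a real warehouse would have many more dimensions and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive, de-normalized attributes
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER);

-- Fact table at the center of the star: measures plus keys to each dimension
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_date    VALUES (10, 2023), (11, 2024);
INSERT INTO fact_sales  VALUES (1, 10, 20.0), (1, 11, 30.0),
                               (2, 11, 50.0), (2, 11, 25.0);
""")


def sales_by_category_and_year(conn):
    """Typical OLAP-style query: aggregate the facts along two dimensions."""
    return conn.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()


result = sales_by_category_and_year(conn)
```

Each analytical query joins the fact table directly to the dimensions it needs, which is why the star layout keeps the number of tables scanned low.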
What is ER Diagram?
Entity Relationship Diagrams are a major data modelling tool and will
help organize the data in your project into entities and define the
relationships between the entities. This process enables
the analyst to produce a good database structure so that the data can
be stored and retrieved efficiently.
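As a rough sketch of how an ER diagram turns into a database structure, the example below maps two hypothetical entities (`customer` and `customer_order`) to tables and maps the one-to-many relationship between them to a foreign key. The entity names and columns are assumptions chosen for illustration, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Entity: customer
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- Entity: customer_order; the foreign key is the relationship
-- "a customer places many orders"
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, 42.5)")


def try_orphan_order(conn):
    """Attempt to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO customer_order VALUES (101, 999, 10.0)")
        return True
    except sqlite3.IntegrityError:
        return False  # the relationship defined in the model is enforced


orphan_accepted = try_orphan_order(conn)
order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

Because the relationship is declared in the schema, the database itself rejects an order that points at a missing customer.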
What is ODS?
ODS is an abbreviation of Operational Data Store. It is a database structure
that serves as a repository for near real-time operational data rather than
long-term trend data. The ODS may further become the enterprise's shared
operational database, allowing operational systems that are being
re-engineered to use the ODS as their operational database.
What is ETL?
ETL is an abbreviation of extract, transform, and load. ETL is software
that enables businesses to consolidate their disparate data while
moving it from place to place; it does not matter that the data is in
different forms or formats. The data can come from any
source. ETL is powerful enough to handle such data disparities. First,
the extract function reads data from a specified source database and
extracts a desired subset of data. Next, the transform function works
with the acquired data, using rules or lookup tables or creating
combinations with other data, to convert it to the desired state.
Finally, the load function writes the resulting data to a target
database.
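The three steps can be sketched as three small Python functions. Everything specific here is an assumption for illustration: the `customers` source table, the `dim_customer` target table, the `COUNTRY_LOOKUP` table, and the cleaning rule are hypothetical, and in-memory SQLite databases stand in for real source and target systems.

```python
import sqlite3

# Hypothetical lookup table used by the transform step
COUNTRY_LOOKUP = {"US": "United States", "DE": "Germany"}


def extract(source):
    """Read the desired subset (active customers only) from the source."""
    return source.execute(
        "SELECT id, name, country_code FROM customers "
        "WHERE active = 1 ORDER BY id"
    ).fetchall()


def transform(rows):
    """Apply a cleaning rule to names and resolve codes via the lookup table."""
    return [
        (cid, name.strip().title(), COUNTRY_LOOKUP.get(code, "Unknown"))
        for cid, name, code in rows
    ]


def load(target, rows):
    """Write the transformed rows to the target database."""
    target.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    target.commit()


# Set up a toy source and target so the sketch runs end to end.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customers (id INTEGER, name TEXT, country_code TEXT, active INTEGER)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, " alice smith ", "US", 1), (2, "BOB JONES", "DE", 1), (3, "carol", "FR", 0)],
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT, country TEXT)")

load(target, transform(extract(source)))
loaded_rows = target.execute("SELECT * FROM dim_customer ORDER BY id").fetchall()
```

Keeping the steps as separate functions mirrors the description above: each one can be swapped out (a different source, a different rule set, a different target) without touching the others.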
What is VLDB?
VLDB is an abbreviation of Very Large Database. A one-terabyte database
would normally be considered a VLDB. Typically, these are
decision support systems or transaction processing applications
serving large numbers of users.