DBMS Project
Assignment-2
Answer 1
a)
The choice between the two approaches depends on the type of company. The Bill
Inmon Approach:
Importance
b)
ETL consists of three steps: Extract, Transform and Load.
Extract
The first step of this process is extraction. In this step, data
is extracted from various sources and formats, such as
relational databases, NoSQL stores, XML and flat files, into the
staging area. The data is loaded into the staging area first,
rather than directly into the data warehouse, so that problems
in the extracted data cannot damage the warehouse.
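The extraction step described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the three in-memory sources, the field names and the function names are assumptions made for the example, not part of the original answer.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical in-memory sources standing in for real files or databases.
CSV_SOURCE = "id,name\n1,Alice\n2,Bob\n"
JSON_SOURCE = '[{"id": 3, "name": "Carol"}]'
XML_SOURCE = "<customers><customer id='4'><name>Dave</name></customer></customers>"


def extract_csv(text):
    """Read rows from a flat (CSV) source."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def extract_json(text):
    """Read rows from a JSON document, as a NoSQL store might export."""
    return [dict(row) for row in json.loads(text)]


def extract_xml(text):
    """Read rows from an XML source."""
    root = ET.fromstring(text)
    return [
        {"id": node.get("id"), "name": node.findtext("name")}
        for node in root.findall("customer")
    ]


def extract_to_staging():
    """Collect raw rows from every source into one staging list.

    Rows are tagged with their origin and otherwise left untouched.
    """
    staging = []
    for source, rows in (
        ("csv", extract_csv(CSV_SOURCE)),
        ("json", extract_json(JSON_SOURCE)),
        ("xml", extract_xml(XML_SOURCE)),
    ):
        for row in rows:
            staging.append({"source": source, **row})
    return staging


staging_area = extract_to_staging()
```

Nothing is cleaned or converted at this point: the ids from the CSV and XML sources are still strings while the JSON id is a number, which is the kind of inconsistency the transform step is meant to resolve.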
Transform
o Profiling
In this step, the data in the available data source is
examined, and statistics and information about it are
collected. Profiling can also be defined as the systematic
analysis of the content of the data source. It checks the
data quality.
o Standardization
This step puts the data into a standard, normalized
format by applying a standard rule set, for ease of use and
analysis. For example, if the majority of the dates in the
data are in the UK format and the text is in English, the
rest of the data is also converted into that format.
o Cleansing
In this step, corrupt, inaccurate or irrelevant data is
identified and resolved. This boosts the reliability,
consistency and value of the data. After cleansing, all
the good data is put in the standardized format.
o Deduplication
During the ETL process, data is extracted from various
sources, and these sources often overlap. This causes
duplicates in the data. In this step, the duplicate data is
removed before it is loaded into the data warehouse.
o Enrichment
Companies often extract data from the internet or buy
data from other companies to enrich their own data and
add more meaning to it. This step helps enhance the
quality of the company's data.
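The transformation sub-steps above can be illustrated with a short sketch. It is only one possible arrangement, written under stated assumptions: the staged rows, the field names, the `REGION_BY_ID` lookup and the rule "keep the first row per id" are all hypothetical choices made for the example.

```python
from datetime import datetime

# Hypothetical staged rows; the field names are assumptions for this sketch.
staged = [
    {"id": "1", "name": " Alice ", "joined": "2021-03-05"},
    {"id": "2", "name": "Bob", "joined": "07/04/2021"},
    {"id": "2", "name": "Bob", "joined": "07/04/2021"},  # duplicate
    {"id": "3", "name": "", "joined": "not a date"},     # corrupt
]

# Hypothetical external lookup used for enrichment.
REGION_BY_ID = {"1": "North", "2": "South"}


def profile(rows):
    """Profiling: collect simple statistics about the source."""
    return {
        "row_count": len(rows),
        "missing_names": sum(1 for r in rows if not r["name"].strip()),
        "distinct_ids": len({r["id"] for r in rows}),
    }


def standardize_date(value):
    """Standardization: convert known formats to UK style (DD/MM/YYYY)."""
    for fmt in ("%d/%m/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%d/%m/%Y")
        except ValueError:
            continue
    return None  # unparseable; the cleansing check below drops the row


def transform(rows):
    seen = set()
    result = []
    for row in rows:
        name = row["name"].strip()
        joined = standardize_date(row["joined"])
        # Cleansing: drop rows with missing or unparseable values.
        if not name or joined is None:
            continue
        # Deduplication: keep only the first row for each id.
        if row["id"] in seen:
            continue
        seen.add(row["id"])
        # Enrichment: add a field from the external lookup.
        result.append({
            "id": row["id"],
            "name": name,
            "joined": joined,
            "region": REGION_BY_ID.get(row["id"], "Unknown"),
        })
    return result


stats = profile(staged)
clean = transform(staged)
```

With the sample rows, profiling reports four rows, one missing name and three distinct ids, and the transformed output keeps only the rows for Alice and Bob, both with UK-format dates and a region attached.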
Load
Once the data has been successfully transformed, the data is
loaded into the data warehouse. Loading is the last step in the
ETL process.
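As a rough sketch of the loading step, the example below writes already-transformed rows into a table. It assumes an in-memory SQLite database as a stand-in for the data warehouse; the `customer` table, its columns and the `load` function are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical transformed rows ready for loading.
clean_rows = [
    ("1", "Alice", "05/03/2021"),
    ("2", "Bob", "07/04/2021"),
]


def load(rows, conn):
    """Load transformed rows into a warehouse table in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "id TEXT PRIMARY KEY, name TEXT NOT NULL, joined TEXT NOT NULL)"
    )
    with conn:  # commits on success, rolls back if an error is raised
        conn.executemany(
            "INSERT OR REPLACE INTO customer (id, name, joined) VALUES (?, ?, ?)",
            rows,
        )


warehouse = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(clean_rows, warehouse)
loaded_count = warehouse.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Using `INSERT OR REPLACE` keyed on the primary key means that running the same load twice does not create duplicate rows, which is one simple way to make a load repeatable.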
Answer 2
Strategic management consists of all the CXOs. They are not
concerned with day-to-day transactions, reports or queries, but with
the future and what will happen next. Hence, they want predictive
modelling and as much insight as they can get, since they are the
ones who come up with strategies. They use decision support systems
(DSS) and executive support systems (ESS).
Answer 3
a) One of the most popular types of database management system
available in the market is the relational database management
system (RDBMS), because it is easy and simple to operate. In these
systems, data is normalized and stored in tables. Data in one
table can be associated with data in the same table or in any
other table. While relational models are sometimes less efficient
than other systems, this is not a major problem, as most modern
computers have enough processing power and memory to make this
small disadvantage easy to overlook.
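The idea that data in one table can be associated with data in another can be shown with a small sketch. It uses Python's built-in `sqlite3` module; the `student` and `enrollment` tables and their contents are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course     TEXT NOT NULL
    );
    INSERT INTO student VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO enrollment VALUES (1, 'DBMS'), (1, 'Networks'), (2, 'DBMS');
""")

# Associate rows from the two tables through the shared student_id column.
rows = conn.execute("""
    SELECT s.name, e.course
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.student_id
    ORDER BY s.name, e.course
""").fetchall()
```

Each student's name is stored once in `student`, and the join brings it together with the matching `enrollment` rows at query time, which is what normalization into separate tables relies on.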
1) ENTITY
An ERD entity is a definable thing or concept within a system, such as a
person/role (e.g. Student), object (e.g. Invoice), concept (e.g. Profile) or
event (e.g. Transaction) (note: In ERD, the term "entity" is often used
instead of "table", but they are the same). When determining entities, think
of them as nouns. In ER models, an entity is shown as a rounded rectangle,
with its name on top and its attributes listed in the body of the entity shape.
The ERD example below shows an example of an ER entity.
2) ATTRIBUTE
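An attribute is a single property of an entity, represented by a
column of the table; in each row it holds one value, the cell at the
conjunction of that column and the row. It stores only one piece of
data about the object represented by the table to which the attribute
belongs. For example, in an Invoice entity each tuple (row) describes
one invoice, and each attribute holds one fact about it.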
3) RECORD
In relational databases, a record is a group of related data held within
the same structure. More specifically, a record is a grouping of fields
within a table that references one particular object. The term record is
frequently used synonymously with row.
4) DATAFILE
A data file is any file that contains information, but not code; it is only
meant to be read or viewed and not executed. For example, this web
page, a letter you write in a word processor, and a text file are all
considered data files.
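To tie the terms entity, attribute and record together, here is a minimal sketch using Python's built-in `sqlite3` module. The `invoice` table, its column names and the sample values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Entity: the invoice table. Attributes: its columns.
conn.execute(
    "CREATE TABLE invoice ("
    "invoice_no INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Record: one row describing one particular invoice.
conn.execute("INSERT INTO invoice VALUES (1001, 'Acme Ltd', 250.0)")

cursor = conn.execute("SELECT * FROM invoice")
attributes = [column[0] for column in cursor.description]  # column names
record = cursor.fetchone()                                 # first (only) row
```

Here `attributes` holds the three column names and `record` holds the single row, with one value per attribute.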