DBMS Project

The document discusses two primary approaches to data warehousing: the Inmon approach, which emphasizes a centralized data warehouse, and the Kimball approach, which focuses on data marts. It also outlines the ETL process, detailing the steps of extraction, transformation, and loading of data into a data warehouse. Additionally, it covers the hierarchy of data requirements in organizations and the significance of relational database management systems (RDBMS) and CRUD operations.

Uploaded by

AAKASH GUPTA
Copyright
© All Rights Reserved

CS 101: Introduction to Computing & Information Systems

Assignment-2
Answer 1

a)
The choice between the two approaches depends on the type of company.
The Bill Inmon Approach:

Inmon defines a data warehouse as a centralized repository for
the whole enterprise. The data warehouse stores data at the
lowest level of detail, i.e. ‘atomic data’. In this approach,
data marts are created after the data warehouse has been
created. The data warehouse provides the logical framework
for delivering business intelligence.
The Ralph Kimball Approach:
In this approach, data marts provide a thin view into the enterprise-wide
data and, whenever required, they can be combined into a larger
data warehouse. Kimball defines a data warehouse as “a copy of
transaction data specifically structured for query and analysis.”

Importance

Both approaches are important and can be used successfully, depending on
the company’s requirements; it cannot be said that one is more
important than the other. In fact, a blend of both models can be used,
i.e. a hybrid model. However, Kimball’s approach is better suited to
smaller companies, while Inmon’s approach is better suited to companies
that can afford to spend money on a centralized system.

b)

The ETL process consists of three steps: Extract, Transform and Load.
 Extract
The first step of this process is extraction. In this step, data
is extracted from various sources and formats, such as
relational databases, NoSQL stores, XML and flat files, into
the staging area. The data is loaded into the staging area
first to avoid damaging the data warehouse.
 Transform
The second step is transformation, which consists of the
following sub-steps.
o Profiling
In this step, the data in the available data source is
examined and statistics and information about it are
collected. It can also be defined as the systematic
analysis of the content of the data source. It checks the
data quality.

o Standardization
This step puts the data into a standard and normalized
format through a standard rule set, for ease of use and
analysis. For example, if the majority of the dates in the
data are in the UK format and the text is in English, then
the rest of the data will also be converted into that format.
o Cleansing
In this step, corrupt, inaccurate or irrelevant data is
identified and resolved. This boosts the reliability,
consistency and value of the data. After cleansing, all
the good data is put in the standardized format.
o Deduplication
During the ETL process the data is extracted from
various sources, and these sources often overlap. This
causes duplicates in the data. In this step the duplicate
data is removed before it is loaded into the data
warehouse.
o Enrichment
Companies often extract data from the internet or buy
data from other companies to enrich their own data and
add more meaning to it. This step helps enhance the
quality of the company's data.
 Loading
Once the data has been successfully transformed, the data is
loaded into the data warehouse. Loading is the last step in the
ETL process.
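
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the sample data, the function names and the `orders` table are all hypothetical, an in-memory SQLite database stands in for the data warehouse, and the enrichment step is left out.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical flat-file source: mixed date formats, one duplicate, one bad row.
RAW_CSV = """customer,order_date,amount
Asha,25/12/2023,100.50
Ben,2023-12-26,80
Asha,25/12/2023,100.50
Cara,not a date,55
"""

def extract(text):
    """Extract: read the flat file into a staging list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def profile(rows):
    """Profiling: collect simple statistics about the staged data."""
    return {
        "row_count": len(rows),
        "distinct_customers": len({r["customer"] for r in rows}),
    }

def standardize_date(value):
    """Standardization: convert known date formats to UK format (DD/MM/YYYY)."""
    for fmt in ("%d/%m/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%d/%m/%Y")
        except ValueError:
            continue
    return None  # unknown format; cleansing drops the row

def transform(rows):
    """Standardize, cleanse and deduplicate the staged rows."""
    seen = set()
    clean = []
    for r in rows:
        date = standardize_date(r["order_date"])
        if date is None:          # Cleansing: skip rows with unusable dates
            continue
        record = (r["customer"].strip(), date, float(r["amount"]))
        if record in seen:        # Deduplication: skip exact repeats
            continue
        seen.add(record)
        clean.append(record)
    return clean

def load(conn, records):
    """Load: write the transformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(customer TEXT, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

staged = extract(RAW_CSV)
stats = profile(staged)
records = transform(staged)
warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(warehouse, records)
loaded = warehouse.execute(
    "SELECT customer, order_date, amount FROM orders ORDER BY customer"
).fetchall()
```

Keeping the staged rows in a separate list until `load` is called mirrors the role of the staging area: bad rows are dropped before anything reaches the warehouse table.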

Answer 2

As shown in the flowchart above, there is a hierarchy in an organization,
and the data requirements are different for each level.
Operational management is more concerned with transactional data,
i.e. day-to-day operations. This data comes from ERPs, CRMs and
various other sources. This level of management uses the data to
create reports and run queries to find out what happened and what
is happening in the business.

Tactical management is more concerned with dimensional/flexible
reports and ad hoc reporting. Managers at this level want to explore
their data and hence use descriptive modelling. They want to find out
why something has happened in the business and to analyze it further.
They do a lot of data mining.

Strategic management consists of all the CXOs. They are not
concerned with day-to-day transactions, reports or queries but with
the future and what will happen next. Hence, they want predictive
modelling and as much insight as they can get, since they are the
ones who come up with strategies. They use DSS (decision support
systems) and ESS (executive support systems).

Answer 3
a) One of the most popular types of database management system
available in the market is the relational database management
system (RDBMS), because such systems are easy and simple to
operate. These systems store data in tables, which are generally
normalized. Data in one table can be related to data in the same
table or in any other table. While relational models are
sometimes less efficient than other systems, this is not a major
problem, as most modern computers have enough processing power
and memory for brands to overlook this small disadvantage.

o A Relational Database Management System (RDBMS) represents data as tables only.
o A relational database is a database that stores data in a structured
format, using rows and columns.
o In an RDBMS, data is stored in the form of tables.
o Using an RDBMS, we can create databases easily.
o We can insert data easily.
o We can modify data easily.
o Using the RDBMS, we can perform all operations on the table.
o In a relational database, a table consists of columns.
o A table may contain one column or any number of columns.
o A table must contain at least one column; without a column we cannot
create a table.
o Tables are created according to the database we are building.
o The table name should correspond to the subject of the database.
o The database may be, for example, emp, student, cell shop, etc.
o In the columns, we write the attributes of the entity being stored.

It is needed to maintain strong relationships between data

One of the most important functions of a relational database management system
is that it allows different data tables to relate to one another. When a
database holds product sales data in one table and sales employee data in
another, a relational database is well suited to managing the relationship
between them in a systematic and simple way. This in turn can help brand
managers understand important statistics, such as which salesperson sells the
most or which products are sold by a particular salesperson.
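
As a rough sketch of how two related tables answer such questions, the following Python example uses the standard `sqlite3` module. The `employee` and `sale` tables, their columns and the sample rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE sale (
    sale_id INTEGER PRIMARY KEY,
    emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
    product TEXT NOT NULL,
    amount  REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'Ravi'), (2, 'Meera');
INSERT INTO sale VALUES
    (1, 1, 'Phone', 300.0),
    (2, 2, 'Laptop', 900.0),
    (3, 1, 'Charger', 20.0);
""")

# Which salesperson sells the most (by total sales amount)?
top_seller = conn.execute("""
    SELECT e.name, SUM(s.amount) AS total
    FROM employee AS e
    JOIN sale AS s ON s.emp_id = e.emp_id
    GROUP BY e.emp_id, e.name
    ORDER BY total DESC
    LIMIT 1
""").fetchone()

# Which products are sold by a particular salesperson?
products_by_ravi = [row[0] for row in conn.execute("""
    SELECT s.product
    FROM sale AS s
    JOIN employee AS e ON e.emp_id = s.emp_id
    WHERE e.name = ?
    ORDER BY s.product
""", ("Ravi",))]
```

The `emp_id` column in `sale` is what relates the two tables; the JOIN follows that relationship to combine sales figures with employee names.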

It helps brand managers to search data in a better manner

The relational database management system also allows brand managers to maintain
and build their data over successive years. The various tables in the system
allow brand managers to search their entire system for a particular piece of
information. A company manager can easily find any information they need using
a particular criterion. This is also available to customers, who can search by
any feature they want, including price, colour and brand. By storing
information in a predictable and sequential format, the system enables users
to find the information they need with ease.
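
A search by criteria such as price, colour and brand might look like the sketch below. The `product` table, its sample rows and the `search_products` helper are hypothetical; the point is only that each criterion becomes one condition in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (name TEXT, brand TEXT, colour TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        ("Phone A", "Acme", "black", 250.0),
        ("Phone B", "Acme", "red", 400.0),
        ("Phone C", "Zenit", "black", 150.0),
    ],
)

def search_products(conn, brand=None, colour=None, max_price=None):
    """Return product names matching every criterion that was supplied."""
    clauses, params = [], []
    if brand is not None:
        clauses.append("brand = ?")
        params.append(brand)
    if colour is not None:
        clauses.append("colour = ?")
        params.append(colour)
    if max_price is not None:
        clauses.append("price <= ?")
        params.append(max_price)
    sql = "SELECT name FROM product"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    # Values are passed as parameters, never pasted into the SQL text.
    return [row[0] for row in conn.execute(sql, params)]

cheap_black = search_products(conn, colour="black", max_price=200)
acme_only = search_products(conn, brand="Acme")
```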
b)
CRUD stands for create, read, update and delete. These are the four basic
functions of persistent storage. Each letter in the acronym also refers to
a function executed in relational database applications and can be mapped
to a standard HTTP method, SQL statement or DDS operation.

 CREATE procedures: Perform the INSERT statement to create a new
record.
 READ procedures: Read the table records based on the primary
key noted within the input parameter.
 UPDATE procedures: Execute an UPDATE statement on the table
based on the primary key specified for a record within the WHERE
clause of the statement.
 DELETE procedures: Delete the row specified in the WHERE clause.
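
The four procedures can be illustrated with a small Python sketch using the standard `sqlite3` module. The `student` table and the function names are assumptions made for this example; each function wraps the SQL statement named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

def create_student(conn, name):
    """CREATE: INSERT a new record and return its primary key."""
    cur = conn.execute("INSERT INTO student (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

def read_student(conn, student_id):
    """READ: SELECT the record with the given primary key (None if absent)."""
    return conn.execute(
        "SELECT id, name FROM student WHERE id = ?", (student_id,)
    ).fetchone()

def update_student(conn, student_id, name):
    """UPDATE: change the record chosen by the WHERE clause; return rows changed."""
    cur = conn.execute(
        "UPDATE student SET name = ? WHERE id = ?", (name, student_id)
    )
    conn.commit()
    return cur.rowcount

def delete_student(conn, student_id):
    """DELETE: remove the row chosen by the WHERE clause; return rows removed."""
    cur = conn.execute("DELETE FROM student WHERE id = ?", (student_id,))
    conn.commit()
    return cur.rowcount

new_id = create_student(conn, "Asha")
first_read = read_student(conn, new_id)
updated_rows = update_student(conn, new_id, "Asha G")
second_read = read_student(conn, new_id)
deleted_rows = delete_student(conn, new_id)
final_read = read_student(conn, new_id)
```

In web applications these four operations are commonly exposed through the HTTP methods POST, GET, PUT (or PATCH) and DELETE respectively.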
c)
A schema contains schema objects, which could be tables, columns, data
types, views, stored procedures, relationships, primary keys, foreign keys,
etc.
A database schema can be represented in a visual diagram, which shows
the database objects and their relationship with each other.
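
As an illustration, the sketch below builds a tiny schema in an in-memory SQLite database and then lists its schema objects. The `customer` and `invoice` tables and the `invoice_totals` view are hypothetical; `sqlite_master`, `PRAGMA table_info` and `PRAGMA foreign_key_list` are SQLite-specific, and other database systems expose the same information in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
CREATE VIEW invoice_totals AS
    SELECT customer_id, SUM(total) AS total
    FROM invoice
    GROUP BY customer_id;
""")

# Schema objects (tables and views) recorded by SQLite.
schema_objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Column names of one table (second field of each table_info row).
invoice_columns = [row[1] for row in conn.execute("PRAGMA table_info(invoice)")]

# Foreign keys as (local column, referenced table, referenced column).
invoice_foreign_keys = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(invoice)")
]
```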

1) ENTITY
An ERD entity is a definable thing or concept within a system, such as a
person/role (e.g. Student), object (e.g. Invoice), concept (e.g. Profile) or
event (e.g. Transaction). (Note: in an ERD, the term "entity" is often used
instead of "table"; they refer to the same thing.) When determining entities,
think of them as nouns. In ER models, an entity is shown as a rounded
rectangle, with its name on top and its attributes listed in the body of the
entity shape. The ERD example below shows an ER entity.
2) ATTRIBUTE
An attribute is a property of an entity and corresponds to a column of
the table; a single attribute value is one cell, i.e. the intersection
of a column and a row. It stores only one piece of data about the
object represented by the table to which the attribute belongs. For
example, in an Invoice entity, each attribute holds one piece of data
about the invoice.

3) Record
In relational databases, a record is a group of related data held within
the same structure. More specifically, a record is a grouping of fields
within a table that reference one particular object. The term record is
frequently used synonymously with row.

4) DATAFILE
A data file is any file that contains information, but not code; it is only
meant to be read or viewed and not executed. For example, this web
page, a letter you write in a word processor, and a text file are all
considered data files.

Programs may also rely on data files to get information.
