AD Module and Assessment Handbook 2022-23-16 - 8 - 2022


Advanced Databases

Assessment and Module Handbook
2022/23 Students
Level 6, Semester A
(20 Credits)

Student Name ____________________________________________________________

Email Address ____________________________________________________________

Course _____________________________ Group ______________________________

Module tutor _________________________________

Communication Protocol: module staff will reply to student questions within a reasonable time, normally
within office hours only. Students are advised to check this Handbook, and to see whether any
online announcements or FAQ answers deal with their enquiry, before contacting staff.
Contents

1 What this Module is about
1.3 Module Learning Activities
1.4 Graduate Attributes Developed and Assessed
2 Schedule of Work
At a glance schedule
3 Key Resources to Support Learning (see reading list on VLE)
4.1 Introduction to the Assignment using the Police case study
1. The Police Reporting Crime System (PRCS) (Oracle database)
2. Police System - Wales (see Appendix A also)
3. A spreadsheet of crimes in Leeds (see Appendix A also)
Data Mart (DM) – Design and ETL considerations
Assignment 1 – Tasks 1-3
Assignment 2 – Tasks 1-2
Data Mart (DM) – OLAP, dashboard and DW approach
Appendix A
The police case study – data sets
1) The Police Reported Crime System (PRCS)
2) The Police System – Wales (PS-Wales)
4.4 Reassessment
4.5 Feedback
5 Understanding Your Assessment Responsibilities

1 What this Module is about


1.1 Welcome to Advanced Databases!
On behalf of the module team, I would like to welcome you to this module, which we hope you will
find both challenging and rewarding.

The module is based around the use of data for business decision making, business intelligence
solutions, and data analytics dashboards and applications. We will follow a data mart/
warehouse development lifecycle by firstly looking at what we want to investigate in the data mart
along with the types of reports (requirements). With this information we can identify the data
required and use it to design an appropriate data model (design). Part of the design will involve
checking (and dealing with) the data quality. The implementation will involve the creation of a data
mart data model and writing SQL based ETL scripts to ‘extract, transform and load’ the data from the
original data sources into the data warehouse model. At this point we can produce some reports and
consider the best visualisation tools for this.

Along the way topics such as data management, data integration, data quality, the data dictionary,
data maintenance, and data ethics will be discussed. The module has a good mixture of database
expertise and business understanding – you will write SQL and you will discuss business strategies!
As your module tutors, we aim to provide you with a coherent set of learning opportunities, which
will enable you to develop your skills and knowledge in databases.

We hope this will be a valuable learning experience for you.

Jackie

Module Aims

• This module aims to build understanding and practical capabilities of database
technologies for effective management and utilisation of an organisation’s data
resource.
• The principles, techniques and concepts of data integration, extraction and data
quality when implementing a database system are addressed theoretically and
practically.
• Your knowledge of technical and practical development, and of the end-purpose of
using a database system for decision support, will be developed.

1.2 Module Learning Outcomes:


On completion of this module the student should be able to:

1. Demonstrate a critical understanding of the theories underpinning a range of database
design and management issues, including methods, technologies and emerging trends
2. Critically evaluate the role of development tools in the design, development and
management of database systems
3. Assess the organisational context, roles and tools for effective utilisation of an enterprise’s
data resource.

1.3 Module Learning Activities


Keynote lectures will be used to develop knowledge and understanding and to provide a
discussion of students’ own research findings.
Tutorial sessions will be lab-based and will be mainly problem or enquiry based, allowing students
to analyse, evaluate and discuss these technologies using case study scenarios, and to use and
appraise applications and technologies. These activities will be both individual and group-based.
Students will discuss and present ideas to their peers, enabling tutor and peer feedback.
Independent research and application will be expected, as students will need to read around the
subject in order to gain a wider understanding of the theory and application of the technologies
covered.

1.4 Graduate Attributes Developed and Assessed
• The Enterprise graduate attribute is both developed and assessed via problem based
activity.
• The digital literacy graduate attribute is both assessed and developed via the investigation
and selection of appropriate tools for the activities.
• The global outlook graduate attribute is developed in that students are required to
consider the case study activities for an international market.

2 Schedule of Work
You will need to revise the material from Level 4 and Level 5 and remind yourself of database design and
modelling. An Introduction to Database Systems by C.J. Date, Chapters 1-4, is good, although others
listed in the reading list are preferred by some students. There is also material on the VLE under Unit
1.

General background for this module can be found in the book Big Data: A
Revolution That Will Transform How We Live, Work and Think by Viktor Mayer-Schönberger
and Kenneth Cukier. Copies are in the library.

At a glance schedule

Week 1 (26/9/22): Decision Support systems, Analytics and the Data Warehouse.

Week 2 (03/10/22): OLAP and OLTP – Decision support (Understand the Business and Data)
    Formative upload 1 – upload SQL queries (see assessment guide) by 10/10 – contributes to portfolio

Week 3 (11/10/22): Data Marts and Star schema (Understand the Business and Data)
    Formative upload 2 – upload draft star schema (see assessment guide) by 17/10 – contributes to portfolio

Week 4 (17/10/22): ELT - Data Integration, Data Quality, Data Prep

Week 5 (24/10/22): Data Transformation and ETL (ETL) (Data Prep and data model)

Week 6 (31/10/22): Slowly Changing Dimensions

Week 7 (07/11/22): Assignment 1 support
    Assignment 1 – tasks 1-3 due in by 14/11/2022

Week 8 (14/11/22): Using tools, Tableau for data prep and data analysis (data prep, data analysis)
    Extracting data from the Data Mart for analysis – in Access/Excel and using Oracle Apex (data analysis)
    Extracting data from the Data Mart for analysis – for dashboards
    Data Maintenance

Week 9 (21/11/22): PL/SQL – functions and Procedures

Week 10 (28/11/22): PL/SQL – Packages, applications, process and application

Week 11 (05/12/22): Data Management reflection
    Assignment support

Week 12 (12/12/22): Assignment support by appointment or email
    Assignment 2 – tasks 4-5 due in by 13/1/2023

3 Key Resources to Support Learning (see reading list on VLE)
Please see online reading list.

4 ASSIGNMENT REQUIREMENTS - Design and considerations of a Data Mart

4.1 Introduction to the Assignment using the Police case study

The Police Force has a number of stations throughout England. Each station is located in a
regional area such as ‘Yorkshire’ or ‘Lancashire’. When a crime is reported it is immediately
assigned to a ‘station’, based on the area where the reported crime occurred.
Reported crimes may be new or may be existing crimes which are ‘open’ or ‘closed’. A
reported crime tends to last between 1 month and a year, although some will last years. The
crime will be reviewed yearly after the reported date unless it has a ‘closed’ date. All crimes
belong to a specific station and have a Lead Police Officer. A Lead Police Officer could be
managing more than one crime at a time. Some crimes may be escalated to a higher level than
the ‘station’, in which case their status is marked as ‘escalated’.
You have been given a number of data sources.

1. The Police Reporting Crime System (PRCS) (Oracle database)

A database management system used by the Police in England to record crimes (see
Appendix A). The scripts to create these tables and load test data are on MyBeckett:
PRCS.sql and PRCS_insert_test_data.sql

and

2. Police System - Wales (see Appendix A also)

This is a database management system for Wales only. It records similar information, but is
slightly different. The script PS_Wales.sql sets up the tables and some data for this system.

and

3. A spreadsheet of crimes in Leeds (see Appendix A also)

A spreadsheet of summary data of crimes in Leeds.

Note: The data you have been given is very limited. This makes it easier to understand and to
verify your results. However, depending on the reports implemented, you may end up with
very little data. This is fine. If you would like, you may add extra data.

Your role is as an analyst/ developer on a Data Mart (DM) project to support the design,
analysis and collection of information relating to this Police case study.

Data Mart (DM) – Design and ETL considerations

The Police have a number of KPIs (Key Performance Indicators) they would like to consider.
These include:

KPI 1: Reduce crime


This KPI is concerned with reducing crime. Examples of the types of reports they would like
are:
• Number of crimes per year
• Number of crimes per crime type per year

KPI 2: Close crimes


This KPI is concerned with ensuring crimes are solved and complete.
• Number of closed crimes per year

KPI 3: Identify areas with crime hotspots


This KPI is concerned with understanding crime patterns. Examples of the types of reports
they would like are:
• Number of crimes per station

Choose one of the KPIs above to focus your assignment around. Address all the tasks with
respect to this KPI.
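
As a rough illustration of how report requests like these translate into grouped SQL queries, the sketch below uses Python's built-in sqlite3 module. The `crime` table, its column names and the four sample rows are assumptions invented for this example; they are not the PRCS schema, and date handling in Oracle differs from SQLite's `strftime`.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration (not the PRCS schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE crime (
        crime_id      INTEGER PRIMARY KEY,
        crime_type    TEXT,
        station       TEXT,
        reported_date TEXT,   -- ISO yyyy-mm-dd
        closed_date   TEXT    -- NULL while the crime is still open
    )
""")
conn.executemany(
    "INSERT INTO crime VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Burglary", "Leeds", "2021-03-14", "2021-06-01"),
        (2, "Burglary", "Leeds", "2022-01-09", None),
        (3, "Fraud", "York", "2022-02-20", "2022-11-30"),
        (4, "Fraud", "Leeds", "2022-07-05", None),
    ],
)

# KPI 1: number of crimes per year (year taken from the reported date).
crimes_per_year = conn.execute("""
    SELECT strftime('%Y', reported_date) AS year, COUNT(*) AS crimes
    FROM crime
    GROUP BY year
    ORDER BY year
""").fetchall()

# KPI 2: number of closed crimes per year (year in which the crime was closed).
closed_per_year = conn.execute("""
    SELECT strftime('%Y', closed_date) AS year, COUNT(*) AS closed_crimes
    FROM crime
    WHERE closed_date IS NOT NULL
    GROUP BY year
    ORDER BY year
""").fetchall()

# KPI 3: number of crimes per station.
crimes_per_station = conn.execute("""
    SELECT station, COUNT(*) AS crimes
    FROM crime
    GROUP BY station
    ORDER BY station
""").fetchall()

print(crimes_per_year)
print(closed_per_year)
print(crimes_per_station)
```

Each report is a count grouped by the attribute named in the report title, which is a useful hint about which dimensions a star schema for that KPI is likely to need.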

Assignment 1 – Tasks 1-3


Task 1: Data Mart (DM) star schema design for your chosen KPI
Identify 3-5 reports* that your star schema will support.
• Document the star schema (SS) design model to support these reports – use QSEE
• Use the data dictionary template from tutorials to document the data model for the project.
• Select one of the reports* you have suggested. Illustrate the expected data in the star
schema to support the report - use Excel (or Oracle or similar) to do this and add a few
rows.
[20 marks]
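
To make the idea of a star schema concrete, here is a minimal sketch using Python's built-in sqlite3 module: one fact table holding a measure, surrounded by two dimension tables. All table and column names (`fact_crime`, `dim_time`, `dim_crime_type`, `no_of_crimes`) and the sample rows are hypothetical, chosen only for illustration; your own design should come from your chosen KPI and reports, and should be modelled in QSEE as the task requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes, one surrogate key each.
    CREATE TABLE dim_time (
        time_id INTEGER PRIMARY KEY,
        year    INTEGER NOT NULL,
        month   INTEGER NOT NULL
    );
    CREATE TABLE dim_crime_type (
        crime_type_id   INTEGER PRIMARY KEY,
        crime_type_name TEXT NOT NULL
    );

    -- Fact table: foreign keys to each dimension plus the measure.
    -- Assumed grain: one row per month per crime type.
    CREATE TABLE fact_crime (
        time_id       INTEGER NOT NULL REFERENCES dim_time(time_id),
        crime_type_id INTEGER NOT NULL REFERENCES dim_crime_type(crime_type_id),
        no_of_crimes  INTEGER NOT NULL,
        PRIMARY KEY (time_id, crime_type_id)
    );

    -- A few made-up rows of expected data.
    INSERT INTO dim_time VALUES (1, 2021, 12), (2, 2022, 1), (3, 2022, 2);
    INSERT INTO dim_crime_type VALUES (1, 'Burglary'), (2, 'Fraud');
    INSERT INTO fact_crime VALUES (1, 1, 4), (2, 1, 3), (2, 2, 1), (3, 2, 2);
""")

# Report: number of crimes per crime type per year.
report = conn.execute("""
    SELECT t.year, c.crime_type_name, SUM(f.no_of_crimes) AS crimes
    FROM fact_crime f
    JOIN dim_time t       ON t.time_id = f.time_id
    JOIN dim_crime_type c ON c.crime_type_id = f.crime_type_id
    GROUP BY t.year, c.crime_type_name
    ORDER BY t.year, c.crime_type_name
""").fetchall()

for row in report:
    print(row)
```

The grain chosen for the fact table (here, month by crime type) decides which reports the schema can answer: a yearly report can be rolled up from monthly rows, but a per-station report would need a station dimension that this sketch deliberately leaves out.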

Task 2: Star Schema set up (DM environment) [5 marks]

Use QSEE to forward engineer the database for the star schema (SS) database you have
designed. Create and run a script to create the data mart tables (edited as appropriate).

Include QSEE generated script(s) as part of your upload along with screen shots as evidence
of the code running successfully and documentation of any changes you have made or
problems you encountered.

See marking scheme for more direction.

Task 3: Extract, Transform and Load (ETL) script to populate the Star Schema (DM) with data
The ETL is a script that puts the data into the DM tables. It does this by extracting the data
from the original sources, transforming the data as required and then loading it into the
DM tables.
Write an ETL script to:
• populate one to two dimension tables,
• the time_dimension table and
• 5-10 rows of the fact table with measure(s) – this will depend on your own project.
To do this, identify one of the reports to support (ideally the one you have already planned
the expected data for task 1). Your script should deal with 2-3 data quality issues, 1
transformation and include at least 1 measure/calculation for the FACT table. [25 marks]
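
The extract, transform and load steps described above can be sketched as follows, using Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the source table `src_crime`, the target tables, the sample rows and the three data quality issues (inconsistent capitalisation and spacing, a missing date, a duplicate row). SQLite's `AUTOINCREMENT` stands in for an Oracle sequence, so treat this as a pattern, not as a script for the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table standing in for an operational system.
    CREATE TABLE src_crime (crime_id INTEGER, crime_type TEXT, reported_date TEXT);
    INSERT INTO src_crime VALUES
        (1, ' burglary', '2022-01-09'),   -- stray space, lower case
        (2, 'BURGLARY',  '2022-03-02'),   -- upper case
        (3, 'Fraud',     NULL),           -- missing date
        (4, 'Fraud',     '2021-11-30'),
        (4, 'Fraud',     '2021-11-30');   -- duplicate row

    -- Hypothetical data mart tables.
    CREATE TABLE dim_crime_type (
        crime_type_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        crime_type_name TEXT UNIQUE
    );
    CREATE TABLE dim_time (
        time_id INTEGER PRIMARY KEY AUTOINCREMENT,
        year    INTEGER UNIQUE
    );
    CREATE TABLE fact_crime (
        time_id       INTEGER REFERENCES dim_time(time_id),
        crime_type_id INTEGER REFERENCES dim_crime_type(crime_type_id),
        no_of_crimes  INTEGER
    );

    -- Extract and clean: drop duplicates and undated rows, standardise names,
    -- and transform the reported date into a year.
    CREATE TEMP TABLE stage AS
    SELECT DISTINCT
        crime_id,
        UPPER(SUBSTR(TRIM(crime_type), 1, 1)) || LOWER(SUBSTR(TRIM(crime_type), 2))
            AS crime_type_name,
        CAST(strftime('%Y', reported_date) AS INTEGER) AS year
    FROM src_crime
    WHERE reported_date IS NOT NULL;

    -- Load the dimensions.
    INSERT INTO dim_crime_type (crime_type_name)
        SELECT DISTINCT crime_type_name FROM stage ORDER BY crime_type_name;
    INSERT INTO dim_time (year)
        SELECT DISTINCT year FROM stage ORDER BY year;

    -- Load the fact table; the measure is a count of crimes.
    INSERT INTO fact_crime (time_id, crime_type_id, no_of_crimes)
        SELECT t.time_id, c.crime_type_id, COUNT(*)
        FROM stage s
        JOIN dim_time t       ON t.year = s.year
        JOIN dim_crime_type c ON c.crime_type_name = s.crime_type_name
        GROUP BY t.time_id, c.crime_type_id;
""")

# Verify the load by reading the fact table back through its dimensions.
rows = conn.execute("""
    SELECT t.year, c.crime_type_name, f.no_of_crimes
    FROM fact_crime f
    JOIN dim_time t       ON t.time_id = f.time_id
    JOIN dim_crime_type c ON c.crime_type_id = f.crime_type_id
    ORDER BY t.year, c.crime_type_name
""").fetchall()
print(rows)
```

Loading the dimensions before the fact table matters, because the fact rows look up the surrogate keys that the dimension load has just generated.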

Perform these tasks and provide evidence that you have successfully completed them (via screen
shots which show your student id, or evidence of successfully run scripts). This work should be
done in your own University Apex account. Document any changes you have made or
issues you have encountered.

Assignment 1 upload: Please upload a Word report addressing tasks 1-3, and include
any code as an appendix to this document.

Upload to MyBeckett by Monday 14th November 23:00.

Assignment 2 – Tasks 1-2

Data Mart (DM) – OLAP, dashboard and DW approach


Task 1: Data Analysis/OLAP/Mining Investigation

Undertake some data analysis on the data from one of the data sources you have been given.
In this report, include screen shots of some key visuals together with a written interpretation of
each visual (show and tell). Either:

a) Upload the case study spreadsheet into MS Excel (or Tableau). Create a pivot table
and produce some interesting (and appropriate) reports using the charts and
visualisations functionality. Use literature on data analysis, business intelligence and
OLAP to support and drive this task. You may include (or discuss) external data as
well. [20 marks]
OR

b) Using Apex create a dashboard for The Police system. Use literature on data
analytics and dashboards to inform the design. The dashboard should support your
chosen KPI. You can use the data from the Data Mart tables or the source data
tables.
[20 marks]
Task 2: Take your design and code further by using PL/SQL techniques
a) Identify, code and test a function, procedure and package to support the ETL task.
[30 marks]

See marking scheme for more direction.

Assignment 2 upload: Please upload a Word report addressing tasks 1-2, and include
any code as an appendix to this document.

Upload to MyBeckett by Friday 13th January 2023 23:00.

Evidence of portfolio uploads

There are 2 key formative uploads:

1. SQL practice – upload by 10/10/22
Upload your code and evidence of the code running successfully in Oracle Apex for
these queries:
• Number of crimes per year per month
• Number of crimes per crime type per year
• Number of crimes per station per year
Aim to use useful titles to ensure the report is meaningful.
2. Star Schema model – upload by 17/10/22
• Upload a list of 3 reports your star schema model supports and the star
schema model to support these reports (created in QSEE and as a screen
shot).
• This can be used for your assignment. You may complete more of task 1 for
this if you like.

These will be reviewed, and formative feedback given.

Appendix A

The police case study – data sets


1) The Police Reported Crime System (PRCS)

The Police Force has a number of stations throughout England. Each station is located in a
regional area such as ‘Yorkshire’ or ‘Lancashire’. When a crime is reported it is immediately
assigned to a ‘station’, based on the area where the reported crime occurred.
Reported crimes may be new or may be existing crimes which are ‘open’ or ‘closed’. A
reported crime tends to last between 1 month and a year, although some will last years. The
crime will be reviewed yearly after the reported date unless it has a ‘closed’ date. All crimes
belong to a specific station and have a Lead Police Officer. A Lead Police Officer could be
managing more than one crime at a time. Some crimes may be escalated to a higher level than
the ‘station’, in which case their status is marked as ‘escalated’.
Attributes have not been documented. They can be inferred from the columns defined at the
logical design stage. The scripts to create these tables and load test data are on MyBeckett:
PRCS.sql and PRCS_insert_test_data.sql

The conceptual model for PRCS (England only)


Note that for the logical and physical model M:N relationships will usually include a link table.
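
As a small illustration of that note, the sketch below resolves a many-to-many relationship with a link table, using Python's built-in sqlite3 module. The entities and names (`crime`, `officer`, `crime_officer`) and the sample rows are hypothetical and are not taken from the PRCS conceptual model; they only show the general pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE officer (officer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE crime   (crime_id   INTEGER PRIMARY KEY, description TEXT NOT NULL);

    -- Link table: one row per (crime, officer) pairing. The composite
    -- primary key stops the same pairing being recorded twice.
    CREATE TABLE crime_officer (
        crime_id   INTEGER NOT NULL REFERENCES crime(crime_id),
        officer_id INTEGER NOT NULL REFERENCES officer(officer_id),
        PRIMARY KEY (crime_id, officer_id)
    );

    INSERT INTO officer VALUES (1, 'Officer A'), (2, 'Officer B');
    INSERT INTO crime VALUES (10, 'Burglary'), (11, 'Fraud');
    INSERT INTO crime_officer VALUES (10, 1), (10, 2), (11, 1);
""")

# Many officers per crime ...
officers_per_crime = conn.execute("""
    SELECT c.description, COUNT(*) AS officers
    FROM crime c
    JOIN crime_officer co ON co.crime_id = c.crime_id
    GROUP BY c.crime_id
    ORDER BY c.crime_id
""").fetchall()

# ... and many crimes per officer, both answered from the same link table.
crimes_per_officer = conn.execute("""
    SELECT o.name, COUNT(*) AS crimes
    FROM officer o
    JOIN crime_officer co ON co.officer_id = o.officer_id
    GROUP BY o.officer_id
    ORDER BY o.officer_id
""").fetchall()

print(officers_per_crime)
print(crimes_per_officer)
```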

2) The Police System – Wales (PS-Wales)
In Wales, there is a slightly different database system. See the model below:

The script to set up and run the Police System Wales is PS_wales.sql

The online version of the PS-Wales (for info only)

3) Spreadsheet of Crime_data_leeds
You have been given a spreadsheet of data, which is summarised data. You are not
expected to use this for the ETL or data mart tasks. You can use it for the dashboard task if
you wish.

ASSIGNMENT MARKING SCHEME (mark out of 100%)
Student no:                Group:                Date:

Criteria levels: 70%+ (1st), 60 – 69% (2:1), 50 – 59% (2:2), 40 – 49% (3rd), <39% (Fail)

Task 1: DM Star schema design (20 marks)

• 70%+ (1st): 3-5 reports of DM type listed and complement the chosen KPI. Star schema fully
specified and meets requirements specified; design decisions discussed and justified via the data
dictionary(s). Data from both sources. Advanced concepts included such as SCD. All attributes
correctly documented. Supports functionality and reports as documented. Design considerations
(Granularity design decisions and reflected in star schema design). The data is illustrated for each
dimension and the fact, with at least a few rows in each, M:N. The data is correct based on given
case study data. Shows full understanding. Clear evidence of application of literature to the
assignment tasks.
• 60 – 69% (2:1): Star schema fully specified and meets requirements, design decisions discussed
and justified. Mostly as for a first, some areas ambiguously defined, missing or generally simple,
however excellent understanding evident. Data from both sources. Literature included, ‘matched’
rather than applied.
• 50 – 59% (2:2): Star schema mostly specified and meets most requirements, some design
decisions discussed and justified via a data dictionary. The reports and examples are very similar to
those given in class or in assignment specification. Generally well done with no major flaws.
Literature included, described rather than used to inform the tasks.
• 40 – 49% (3rd): Star schema mostly specified and meets most requirements via a DD. There are
major flaws in the design, considerations or data illustrated. Little understanding evidenced.
Some literature.
• <39% (Fail): Star schema doesn’t support queries. Little understanding evidenced. Very little
data, or incorrect data.

Task 2: SS - FACT table and 2 other tables as a minimum (5 marks)

• 70%+ (1st): Excellent database produced. Evidence that CASE tool has been used. All aspects of
the CASE database design issues addressed & dealt with.
• 60 – 69% (2:1): Good database. Evidence that CASE tool has been used. Most aspects of the CASE
database design issues addressed & dealt with.
• 50 – 59% (2:2): Database produced. Using CASE tool. Some aspects of the database design
addressed & dealt with.
• 40 – 49% (3rd): Db produced. Little evidence that CASE tool has been used. Few aspects of the
database design addressed & dealt with.
• <39% (Fail): Little database, or evidence of CASE tool. Database design not addressed and dealt
with.

Task 3: Design of the ETL and populate star schema (25 marks)

• 70%+ (1st): All tables fully populated from both “data-bases”. Data Extract, Quality,
transformation, calculation of measures fully evidenced using mostly SQL. Competent use of
appropriate, advanced SQL (eg sequences etc). Evidence of SQL running successfully. Code is
significantly developed from that given in tutorials. Uses standards and includes comments.
Excellent understanding evidenced.
• 60 – 69% (2:1): All tables populated from both “data-bases” as specified. Some Data Extract,
Quality, Transformation, calculation of measures evidenced (mostly using SQL). Evidence of SQL
running successfully. Code is partially developed from that given in tutorials. Good understanding
evidenced.
• 50 – 59% (2:2): All tables populated from both “databases” as specified. Some ETL stages
addressed. Maybe errors in code or population. Evidence of SQL running successfully, may be very
similar to that given in class. Understanding evidenced.
• 40 – 49% (3rd): Some tables populated. Some evidence of SQL running successfully. Maybe lack of
evidence of the SQL successfully running, or output that omits information. Some understanding
evidenced.
• <39% (Fail): Little population of db. Little evidence of the SQL running successfully. Little
understanding evidenced.

Feedback:

ASSIGNMENT 2 MARKING SCHEME
Student no: Group: Date:

Criteria Level: 70%+ 1st 60 – 69% 2:1 50 – 59% 2:2 40 – 49% 3rd <39% Fail

Task 1: OLAP: An excellent pivot table OLAP: A very good pivot table OLAP: A pivot table OLAP: A pivot table OLAP: little or not useful pivot
produced with very relevant and produced with very relevant and produced with some reports produced with report – table produced.
a) OLAP suitable reports – – visualisations,

OR As before and excellent using appropriate visualisations, suitable reports – using visualisations, labels, labels, titles may not be useful.
application of tools, research, labels, titles etc. correct visualisations, titles may not be useful.
Apex dashboard understanding. labels, titles etc. An apex dashboard created to
An apex dashboard created to An apex dashboard meet very basic KPI(s), very
(20 marks) meet KPIs, excellent HCI, data An apex dashboard created to meet basic basic SQL or code generated
consideration, advanced code. created to meet KPIs, KPIs, some HCI, data by Apex.
some HCI, data consideration, basic SQL.
Evidence of research being consideration, correct Little research or bibliography.
appropriately applied to drive code. Some evidence of
the areas and types of research – described Some understanding evidenced
investigation. Bibl’y Evidence of research some rather than
application. Bibliography. applied.Bibl’y.
Excellent understanding.
Good understanding. Reasonable understanding
evidenced
Task 2: PL/SQL considerations (30 marks)

• 70%+ (1st): Excellent design of package to support the ETL process (or part of). One function and
one procedure coded as part of the package. Code is moved on from that given in class examples
and well tested.
• 60 – 69% (2:1): Excellent design of package to support the ETL process (or part of). One function
and one procedure coded as part of the package. Code is moved on from that given in class
examples and well tested. Not as advanced as for a first.
• 50 – 59% (2:2): Package designed and includes procedure and function. All appropriate. Mostly
coded, maybe some issues with fully testing.
• 40 – 49% (3rd): A procedure and function designed, coded and working. Maybe package not
entirely understood or used for testing. Code as given in class.
• <39% (Fail): Little or no understanding of code ideas submitted (Design, Code, Test, Presentation
of work).

Feedback:

DETAILS OF THE REASSESSMENT

4.4 Reassessment
Reassessment is to ‘re-do’ either or both components as required. A summary sheet listing all
changes is required; changes can be easily tracked using the Word ‘tracking’ facility.

Reassessment date: 7th April 2023

4.5 Feedback
Formative feedback will be given in tutorials. General feedback will be given approximately a
week after the hand-in date and individual feedback up to 3 weeks after the hand-in date.

Formative feedback is feedback on “what you have done already”, this gives you the opportunity to
apply the feedback to your final assignment. Students generally find this kind of feedback very useful
and gain better marks as a result.

Feedback schedule

Assignment 1 (due 14/11/2022 by 23:00)
• Formative feedback available as indicated in schedule.
• General feedback provided via the VLE by week 11.
• Individual feedback by end of week 11.

Assignment 2 (due 13/1/2023 by 23:00)
• Formative feedback available as indicated in schedule.
• General feedback provided within 3 weeks.
• Individual feedback within 3 weeks.

You are always welcome to make an appointment to discuss your feedback.



5 Understanding Your Assessment Responsibilities


Please refer to Course Handbook as appropriate

Mitigation and Extenuating Circumstances

If you are experiencing problems which are adversely affecting your ability to study (called
'extenuating circumstances'), then you can apply for mitigation. You can find full details of how to
apply for mitigation at:

http://www.leedsbeckett.ac.uk/studenthub/mitigation.htm

Late Submission

Without any form of extenuating circumstances, standard penalties apply for late submission of
assessed work. These range from 5% to 100% of the possible total mark, depending on the number
of days late. Full details (section C1.5.7) of the penalties for late submission of course work are
available at:

http://www.leedsbeckett.ac.uk/about/files/C1_Assessment_-_General_Provisions.pdf

Academic Misconduct

Academic misconduct occurs when you yourself have not done the work that you submit. It may
include cheating, plagiarism and other forms of unfair practice. What is and what is not permitted is
clearly explained in The Little Book of Cheating, Plagiarism and Unfair Practice, available at:
http://www.leedsbeckett.ac.uk/studenthub/plagiarism.htm

The serious consequences of plagiarism and other types of unfair practice are detailed in section C9
of the Academic Regulations at: http://www.leedsbeckett.ac.uk/about/academic-regulations.htm