IDA-Group Assignment Question

AICT009-4-2-Introduction to Data Analytics Group Assignment Page 1 of 6

Introduction to Data Analytics Assignment

Learning Outcomes
On completion, students should be able to demonstrate appropriate data analytics approaches
for a given problem (A3, PLO6).
This requires students to understand and exhibit the basic concepts in the field of data analytics,
knowledge discovery, data gathering and data mining techniques. Students should be able to
select, analyse and evaluate an appropriate data analytics approach for simple real-world problems.

Assessment
The total assessment mark of this group case study is 50%, with 40% of the total contributed by
an individual component. The marking criteria are attached to this assignment.

Groups
Your class will be divided into groups. Each group will contain a maximum of 3 or 4 members.
In addition to the workload matrix, each member must also attach a personal reflection
report to their documentation.

Overview
For this assignment, you will choose a business domain that interests you, frame the
analytical problems to be solved, and explore them using the data analysis techniques that we have
discussed in class and explored in the labs.
You will set your own aim and objectives and propose an analytical solution that would
hypothetically solve the problems and deliver a positive business impact.
You may choose any current topic that interests you, including but not limited to levels of wealth,
housing, education, transportation, medical, sports, manufacturing, banking, gaming and
agriculture. You shall adopt a methodology and perform all activities within its defined phases,
including data selection, data integration, data processing (cleaning and transformation) and the
analysis steps appropriate to the scope. As a result of these activities, you should present your
entire assignment with a demonstration of the models built and communicate the insights in
terms of business benefits and impact.
You are required to prepare individual documentation reflecting the effort undertaken in
completing the project.

Diploma Asia Pacific University of Technology and Innovation 2021



You may use the following as a stepwise guideline:

• Define your Group Business Domain

• Domain background information: Write a description of the selected dataset and project,
and its importance for your chosen company/domain. Information must be appropriately
referenced.

• Download required datasets.

• Combine and clean data, and prepare the data sets

• Use appropriate tools for Data analytics

• Change variable names and labels

• Check for missing data, data types and variables

• Transform any variables that you would like to use in a different form (raw numbers to
percent, etc) – if required

• Transform continuous variables into binary variables, tabulate observations, combine
variables – if required

• Perform the relevant data analysis tasks using data mining techniques such as
classification/association/time series/clustering and identify the BI reporting solution
and/or dashboards you need to develop

• Interpret the analysis

• Justify why you chose these BI reporting solutions/dashboards/data mining techniques and
why those dataset attributes are present and laid out in the fashion you proposed (feel
free to include all other relevant justifications).

• To ensure that you discuss this task properly, you must include visual samples of the
reports you produce (i.e. the screenshots of the BI report/dashboard must be presented
and explained in the written report), and also include any assumptions that you may have
made about the analysis.

• Write up your findings
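
As an illustration only, the variable-handling steps in the guideline above (renaming variables, checking for missing data, converting raw numbers to percent, and turning a continuous variable into a binary one) can be sketched in plain Python. The records, column names and the 5000 income threshold are hypothetical and not part of the assignment; any of the permitted tools can perform the same steps.

```python
# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"cust_id": 1, "inc": 4200.0, "spend": 1050.0},
    {"cust_id": 2, "inc": 3100.0, "spend": None},
    {"cust_id": 3, "inc": 8000.0, "spend": 1200.0},
]

# Assumed mapping from terse source names to descriptive variable names.
RENAMES = {"cust_id": "customer_id", "inc": "monthly_income", "spend": "monthly_spend"}


def rename_columns(rows, renames):
    """Return new rows with keys renamed; unknown keys are kept as-is."""
    return [{renames.get(key, key): value for key, value in row.items()} for row in rows]


def count_missing(rows):
    """Count None values per variable."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


def add_derived(rows, high_income_threshold=5000.0):
    """Add a percent variable and a binary variable; missing inputs stay None."""
    prepared = []
    for row in rows:
        new_row = dict(row)
        income, spend = row["monthly_income"], row["monthly_spend"]
        # Raw numbers to percent: spend as a share of income.
        if income is None or spend is None or income == 0:
            new_row["spend_pct_of_income"] = None
        else:
            new_row["spend_pct_of_income"] = round(100.0 * spend / income, 2)
        # Continuous to binary: 1 if income reaches the assumed threshold.
        new_row["is_high_income"] = None if income is None else int(income >= high_income_threshold)
        prepared.append(new_row)
    return prepared


rows = rename_columns(raw_rows, RENAMES)
missing = count_missing(rows)
prepared = add_derived(rows)
print(missing)
print(prepared[0])
```

Keeping missing values as None rather than silently replacing them makes the treatment of missing data an explicit decision that can be documented in the report.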


Components of your analysis


Your analysis must include the following and you must interpret the findings.

Descriptive Analysis:

• Histograms and scatterplots or any visual reporting to demonstrate the data
distribution.

• Data Distributions, Outlier Detections and Missing Value analysis

• Correlation: Show output for and describe correlations between variables.

• Regression: Bivariate regressions or multivariate regression, show output for and
describe (coefficients, t-stat, p-value, r-squared)

• OLAP reporting Dashboard
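
To make the correlation and regression items above concrete, the following minimal sketch computes a Pearson correlation and a bivariate least-squares regression by hand. The `ad_spend` and `sales` figures are invented for illustration. The p-value is left out because it requires the t distribution's CDF, which the Python standard library does not provide; a statistical package would normally report it alongside the t-stat.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bivariate_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals and total sum of squares.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,
        "t_stat_slope": slope / se_slope,
    }


# Hypothetical data: advertising spend versus sales.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(ad_spend, sales)
fit = bivariate_ols(ad_spend, sales)
print(round(r, 4), {name: round(value, 4) for name, value in fit.items()})
```

In a bivariate regression, r-squared equals the square of the Pearson correlation, which gives a quick consistency check on reported output.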

Predictive Analysis & Other Data Mining:

• Classification model – generate the classifier and use test data to predict the outcome,
and describe the influencing factors and values. Test the model's accuracy and
precision.

• Association Rule – use to identify the relationships and similarities among itemsets

• Text Mining – use on unstructured text data to process natural language,
analyse sentiment or perform recommendation analysis
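
As a minimal sketch of how the accuracy and precision of a classification model can be tested, the code below compares predicted labels against actual labels on a held-out test set. The "churn"/"stay" labels and both lists are hypothetical; in practice the predictions would come from the classifier you built.

```python
def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, tn, fn) for the chosen positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of positive predictions that were correct; 0.0 if none were made."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical test-set labels and model predictions.
actual = ["churn", "stay", "churn", "stay", "stay", "churn", "stay", "stay"]
predicted = ["churn", "stay", "stay", "stay", "churn", "churn", "stay", "stay"]

tp, fp, tn, fn = confusion_counts(actual, predicted, positive="churn")
model_accuracy = accuracy(tp, fp, tn, fn)
model_precision = precision(tp, fp)
print(tp, fp, tn, fn, model_accuracy, model_precision)
```

Accuracy and precision can diverge sharply when one class is rare, so reporting both, as the brief requests, gives a fuller picture than accuracy alone.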


Minimum report requirement


Business Goal & Objective
Once you have obtained the datasets for analysis, you and your group members have to specify
the ultimate purpose of mining this data. For example, when seeking patterns in your data to
help you retain good customers, you might build one model to predict customer profitability and
a second model to identify customers likely to leave.

Data Analytics Lifecycle & Methodologies


You have to adopt one of the knowledge discovery process methodologies, such as SEMMA,
CRISP-DM or Fayyad’s KDD, to guide you through the project process. It is very important
to explain and justify the methodology that has been chosen.

Dataset Preparation
Go through data selection, cleaning, formatting and exploration. The goal of exploration is to
identify the most important fields in predicting an outcome and determine which derived values
may be useful.

Type of Prediction & Modelling Techniques


The next step is deciding on the type of prediction that’s most appropriate: (1) classification:
predicting which category an observation belongs to, or (2) regression: predicting what numeric
value a variable will have (if the variable varies with time, this is called time series prediction).
Now you can choose the model type: a neural net to perform the regression, perhaps, and a
decision tree for the classification. There are also traditional statistical models to choose from,
such as logistic regression, discriminant analysis, or general linear models. The most important
thing is to choose the model type that meets your requirements and objective.
You may also use unsupervised learning models such as the k-means algorithm to cluster your datasets.
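
For the unsupervised option, a minimal k-means sketch in plain Python is shown below. It is illustrative only: the six points are made up, and the initial centroids are simply the first k points so that the run is repeatable, whereas library implementations usually choose starting centroids more carefully.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, k, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = list(points[:k])  # assumption: first k points as starting centroids
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical two-dimensional observations forming two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Because the result depends on the starting centroids and on the chosen k, both choices are worth justifying when a clustering model is presented.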

Analysis & Recommendations


Once a data mining model is built and validated, the results must be analysed to recommend
actions based on them. Discussion of social impacts and ethical issues is encouraged where
relevant to the solution. Analysis and comparison of results between group members, or
comparison with other techniques, is encouraged for better results and understanding of the
output.

Getting datasets
Every project must involve at least one dataset. There are many interesting and freely available
datasets that you can find on the internet, such as social networking datasets, airline data,
weather forecasting data and much more.
Examples of open datasets:
https://www.data.gov.my/
https://data.world/
https://www.kdnuggets.com/datasets/government-local-public.html

https://github.com/awesomedata/awesome-public-datasets
AWS Open Datasets: https://registry.opendata.aws/

Data mining software:


You can implement your project using any of the following data mining software packages or any
other tool that is deemed appropriate:

a) SAS: SAS on Demand for Academics
b) Microsoft Power BI
c) TIBCO: Spotfire
d) Tableau
e) R Programming
f) Python Programming

Deliverables

Proposal (Group Submission) – 20%


Each group is required to submit a proposal which gives a brief description of the business goal
that the group aims to achieve with its proposed data analytics solution. In addition, the proposal
should also specify the aim, objectives and scope of the solution as well as identify any datasets
that will be used. The proposal will be due sometime in week 7 of your semester. The proposal
template will be provided by the class lecturer.

Individual final report (Individual Submission) – 80%


The end result of your project is a final report that clearly and concisely describes what you did,
the results you obtained, and what they mean. This will be due sometime in week 14 of your
semester.

Presentation
Students must be able to demonstrate the deliverables using SAS or any other suitable tool(s) for
data preparation and analytics. You will be required to interpret the results as per the objectives
and scope specified in your assignment.

Marking Criteria

Business Goal 20%
(identification of analytical case study and setting the right business goal)
Chosen methodology 5%
(identify methodology used; explanation of selection)
Data preparation 20%
(effort in data pre-processing; document all steps taken in data pre-processing)
Data Mining Algorithm / Technique used 10%
(select and describe the DM technique used and identify the pros/cons; demonstrate the model during presentation)
Model evaluation 20%
(illustration of each model and technique used + final result; supporting evidence from software tools used during
presentation)
Accuracy in meeting Objectives / Business Goal & Completion of Assignment 10%
(overall achievement and effort delivered in solving problem)
Workload matrix & Personal reflection report 10%
Documentation standards & formatting 5%

Total 100%

Note: If a group cannot be formed due to insufficient student numbers or other reasons approved by the
module lecturer, the marking criteria above will be treated as a 100% individual component (all criteria
marked as individual component).
