Data Analysis For Accountants Assessment 2


DATA ANALYSIS FOR BUSINESS

Assessment 2 – PORTFOLIO

Student ID number:

PART 1: Presentation and Display of Survey Data

1 (a) Averages in the news

Provide the link to the article here:

https://www.londondaily.news/workplace-collection-pots-grow-to-an-average-of-139-as-brits-give-generously-to-coworkers/
Paste a screenshot of the headline and relevant parts of the report mentioning the average
below:

2021-11-29T13_57_06.2496584Z-Data Analysis for Business 202122 - Assessment 2 Brief - 52708.docx


Which kind of average has been used?

The average used in the report is the mean, which is the total amount collected across all
workplace collection pots divided by the number of pots.

What was the size of the sample used to find the average?

It is not specified but an analysis was done on tens of thousands of Collection Pots in
workplaces within the UK.

What is the source of the data? How was the data collected?

The source of the data is the Collection Pot, which is the biggest workplace collecting
platform in the UK. The data was collected from internal analysis of Collection Pots in
workplaces around the UK.
Do you think the average was chosen to give valid and reliable results, or to attract the
most readers?

The average was chosen to give valid and reliable results rather than to attract the most readers.

1 (b) Charts in the news

Find an example of what you consider to be a GOOD chart about the cost-of-living crisis, and
an example of a BAD one. Include the charts on this page, together with links to the source.
For each chart, give THREE reasons why you consider it to be good or bad.
“GOOD” CHART

LINK
https://www.igd.com/articles/article-viewer/t/conflict-compounds-cost-of-living-crisis/i/29621

THREE REASONS YOU THINK IT IS “GOOD”


1. The x and y axes are clearly labelled and the source of the data is shown.
2. It has a clearly highlighted and labelled legend that distinguishes the colours used for
the different series on the chart.
3. The chart is simple and straightforward. It is not overloaded with text or extra graphs
that would complicate it.

“BAD” CHART

LINK
https://en.wikipedia.org/wiki/2000s_United_States_housing_bubble

THREE REASONS YOU THINK IT IS “BAD”


1. The chart is overloaded with too many coloured series, which makes it hard to
understand.
2. The x and y axes are not labelled and the title is not coherent.
3. Poor visibility. The legend is hard to read because it contains too much text.

PART 2: Correlation and Regression

2 (a) Correlation and causation

Using examples, outline the differences between correlation and causation.

Correlation is a statistical measure of how two or more variables move together.
Causation goes further: it means that a change in one variable brings about a change in
the other (cause and effect). A good example of correlation is the relationship between a
household's income and its expenditure. If a survey is conducted across homes, it is likely to
find that richer households spend more, so we can conclude that income and expenditure are
correlated. A good example of causation is: my cat gets fatter because it is being fed more.
The cat being fed more treats is the cause, and the cat becoming heavier is the effect.

Reference:
https://saylordotorg.github.io/text_microeconomics-theory-through-applications/s21-23-correlation-and-causality.html
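
As a rough illustration of the income and expenditure example, the sketch below computes a correlation coefficient by hand. The household figures are hypothetical and only show how the measure behaves. A value close to 1 says the two variables move together; it does not show which one causes the other.

```python
import math

# Hypothetical survey values (not real data): income and spending per household
income = [20, 30, 40, 50, 60]
spending = [18, 25, 33, 41, 52]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(income, spending)
print(round(r, 3))  # close to 1: strong positive correlation
```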

2 (b) Correlation and regression

For this part of your portfolio, you are required to find some data from Statista.com (email
address and password required off campus), a leading provider of market and consumer data.
You should find some data broadly related to your degree title. (If you do not have access to
Statista, your lecturer will advise you on where to find some data).

For the correlation and regression analysis, you will need TWO columns of data so that you
can have X and Y variables. You may need to download two different files and then
combine them in Excel. You need two columns of the same length, with at least 10 pairs of
(X, Y) data.

Write down the titles of the data file(s) you have chosen to use:
Data file title: UEFA champions finance and social media Practice
Total Revenue Vs. Capacity

Write down which variable you have chosen as X and which as Y, and give a reason for
your choice.
The X variable is the Champions League teams' total revenue and the Y variable is their stadium
capacity. I chose these two variables because a higher stadium capacity means that more
people can attend a team's games, which leads to higher revenue.

Use Excel to create a fully-formatted scatter chart of your data, showing the linear
regression line, R-squared value, and the equation of the regression line.

[Scatter chart: UEFA Champions Clubs, Total Revenue vs Capacity. x-axis: Total Revenue (0 to 800); y-axis: Capacity (0 to 90000). Regression line: f(x) = 45.4819539459764 x + 36490.310726544; R² = 0.371549384071229]

What is the value of the correlation coefficient for your chosen data? What does it mean?
The correlation coefficient in this case is 0.61 (the square root of R² = 0.3715), which is a
moderate positive correlation. That is to say, clubs with higher revenue tend to have higher
capacity, and vice versa.
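
A minimal sketch of how the coefficient follows from the chart: for a simple linear regression, r is the square root of R², with the same sign as the slope. The variable names are my own; the two numbers are the ones shown on the scatter chart.

```python
import math

# Values reported on the scatter chart
r_squared = 0.371549384071229
slope = 45.4819539459764

# r has the magnitude sqrt(R^2) and the sign of the slope
r = math.copysign(math.sqrt(r_squared), slope)
print(round(r, 2))  # 0.61
```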

Using the values in the regression equation, explain what your analysis shows.
The regression equation is:
y = 45.482x + 36490
36490 is the intercept and 45.482 is the slope. The intercept is the predicted capacity when
total revenue is 0, while the slope is the estimated change in capacity for a one-unit change
in total revenue.
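
The equation can be used to predict capacity from revenue, as in the short sketch below. The function name and the input value of 400 are illustrative assumptions; 400 is simply a point inside the 0 to 800 revenue range shown on the chart.

```python
SLOPE = 45.482      # change in capacity per unit of total revenue
INTERCEPT = 36490   # predicted capacity when total revenue is 0

def predict_capacity(total_revenue):
    """Predicted stadium capacity from the fitted line y = 45.482x + 36490."""
    return SLOPE * total_revenue + INTERCEPT

# Hypothetical club with a total revenue of 400
print(round(predict_capacity(400), 1))  # 54682.8
```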

Explain the concepts of interpolation, extrapolation, validity and reliability when using a
regression model for forecasting.

INTERPOLATION:
Interpolation is predicting values that lie inside the range of the data points. In other words,
using a fitted regression model to forecast values within the range of the existing data
points is interpolation.
EXTRAPOLATION:
Extrapolation is predicting values beyond the range of the data points. Using a fitted
regression model to forecast values outside the existing range is extrapolation. Such forecasts
are less dependable because the fitted relationship may not hold outside the observed range.
To understand the concept of interpolation and extrapolation consider the graph below.
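
The distinction can also be sketched in a few lines of code. The revenue values below are hypothetical, not the Statista data; the check only compares a new X value with the smallest and largest observed X values.

```python
def prediction_type(x_new, observed_x):
    """Label a prediction by whether x_new lies inside the observed range."""
    lowest, highest = min(observed_x), max(observed_x)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"

# Hypothetical observed total revenues
observed_revenue = [120, 250, 400, 610, 720]

print(prediction_type(500, observed_revenue))  # interpolation
print(prediction_type(900, observed_revenue))  # extrapolation
```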

VALIDITY:
Validity is how well a measure measures what it is intended to measure. More
specifically, it refers to how well the collected data and their analysis support the
results or findings, and whether those findings generalize to other contexts.
RELIABILITY:
Reliability refers to the consistency of the measure. High reliability indicates that the
measurement system produces similar results under the same conditions. If you measure
the same item or person multiple times, you want to obtain comparable values.

PART 3: Index numbers and Percentages

3 (a) Excel functions and percentages


The table below shows selected price details for unleaded petrol and diesel published by the
Department for Business, Energy and Industrial Strategy.
(https://www.gov.uk/government/statistics/weekly-road-fuel-prices)
     A               B              C              D              E
1                    Petrol price   Diesel price   Duty rate      VAT
                     (pence/litre)  (pence/litre)  (pence/litre)  (% rate)
2    November 2018   125.8          135.4          57.95          20
3    November 2019   125.3          130.1          57.95          20
4    November 2020   112.4          117.4          57.95          20
5    November 2021   146.9          150.1          57.95          20

Using the example spreadsheet above, explain how the Excel VLOOKUP function can be
used to find items in an Excel table.

How the VLOOKUP function can be used to find the petrol price in November 2021
(assuming column A holds the dates as text):
1. Select the cell where you want the result to appear and start the VLOOKUP function by
typing: =VLOOKUP(
2. The first argument is the lookup value, which is the item to search for in the first column
of the table, followed by a comma: =VLOOKUP("November 2021",
3. The second argument is the table array. For this segment, select or type the range and
add a comma: =VLOOKUP("November 2021", A2:E5,
4. The third argument is the column index number, which in this case is the second
column of the range (Petrol price), so type 2 and add a comma:
=VLOOKUP("November 2021", A2:E5, 2,
5. Type FALSE for an exact match and close the bracket:
=VLOOKUP("November 2021", A2:E5, 2, FALSE)
6. Press Enter. The formula returns 146.9.
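
To show what an exact-match lookup does, here is a rough Python equivalent. The `vlookup` helper is a hypothetical function written for this illustration, not part of Excel or of any library, and the rows are typed in from the example spreadsheet.

```python
# Rows 2 to 5 of the example spreadsheet (columns A to E)
table = [
    ("November 2018", 125.8, 135.4, 57.95, 20),
    ("November 2019", 125.3, 130.1, 57.95, 20),
    ("November 2020", 112.4, 117.4, 57.95, 20),
    ("November 2021", 146.9, 150.1, 57.95, 20),
]

def vlookup(lookup_value, table_array, col_index_num):
    """Exact-match lookup on the first column, like VLOOKUP(..., FALSE)."""
    for row in table_array:
        if row[0] == lookup_value:
            return row[col_index_num - 1]  # Excel counts columns from 1
    return None  # Excel would display #N/A here

print(vlookup("November 2021", table, 2))  # 146.9
```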

The price of petrol and diesel includes fuel duty (levied at a flat rate of 57.95p per litre).
VAT at 20% is then charged on both the product price and the fuel duty. Complete columns
C and D, and rows 6 and 7 in the table below, and give an explanation of your calculations:
     A                        B              C              D
1                             Petrol price   Petrol price   Petrol price before
                              (pence/litre)  before VAT     duty and VAT
                                             (pence/litre)  (pence/litre)
2    November 2018            125.8          104.8          46.9
3    November 2019            125.3          104.4          46.5
4    November 2020            112.4          93.7           35.7
5    November 2021            146.9          122.4          64.5
6    % Change, 2018 to 2021   16.8%          16.8%          37.5%
7    % Change, 2020 to 2021   30.7%          30.7%          80.5%

Explanation:
i. Petrol price before VAT: dividing by 1.2 takes away the 20% VAT.
\[ \text{Petrol price before VAT} = \frac{\text{Petrol price}}{1.2} \]
ii. Petrol price before duty and VAT = Petrol price before VAT - fuel duty (57.95p per litre)
iii. % Change, 2018 to 2021, for each of the petrol price columns:
\[ \frac{\text{Price in November 2021} - \text{Price in November 2018}}{\text{Price in November 2018}} \times 100 \]
iv. A similar calculation is done for % Change, 2020 to 2021, using the 2020 and 2021 prices:
\[ \frac{\text{Price in November 2021} - \text{Price in November 2020}}{\text{Price in November 2020}} \times 100 \]
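
The same steps can be checked with a short script. This is only a sketch of the calculations described above; the function names are my own, and each percentage change is divided by the earlier year's price.

```python
DUTY = 57.95     # fuel duty, pence per litre
VAT_RATE = 0.20  # VAT charged on product price plus duty

petrol = {2018: 125.8, 2019: 125.3, 2020: 112.4, 2021: 146.9}

def before_vat(price):
    """Remove 20% VAT by dividing by 1.2."""
    return price / (1 + VAT_RATE)

def before_duty_and_vat(price):
    """Remove VAT first, then subtract the flat-rate duty."""
    return before_vat(price) - DUTY

def pct_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100

print(round(before_vat(petrol[2018]), 1))           # 104.8
print(round(before_duty_and_vat(petrol[2021]), 1))  # 64.5
print(round(pct_change(petrol[2018], petrol[2021]), 1))  # 16.8
```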
3 (b) Rebasing and Descriptive Statistics
The table below shows the average price of petrol and diesel in November from 2012 to
2021. Fill in all the missing values and give an explanation of your method.

Year   Petrol Price    Petrol Price     Diesel Price    Diesel Price
       (pence/litre)   Index 2012=100   (pence/litre)   Index 2012=100
2012   134.4           100.0            141.1           100.0
2013   130.0           96.7             137.7           97.6
2014   122.3           91.0             127.2           90.1
2015   107.3           79.8             110.3           78.2
2016   114.6           85.3             117.4           83.2
2017   120.1           89.4             123.7           87.7
2018   125.8           93.6             135.4           96.0
2019   125.3           93.2             130.1           92.2
2020   112.4           83.6             117.4           83.2
2021   146.9           109.3            150.1           106.4

Explanation:
The first and third columns contain the petrol and diesel prices respectively. A missing
price is calculated from its index in the same way for both fuels:
\[ \text{Price} = \frac{\text{Price in 2012}}{\text{Price Index in 2012}} \times \text{Price Index of the year being calculated} \]

The second and fourth columns contain the petrol and diesel price indexes respectively.
They are calculated as follows:
\[ \text{Price Index} = \frac{\text{Price Index in 2012}}{\text{Price in 2012}} \times \text{Price of the year being calculated} \]
The price index in 2012 is 100 because 2012 is the base year.
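
A small sketch of the rebasing method, using values from the table. The function names are assumptions made for this example; the base index of 100 is the 2012 value.

```python
BASE_INDEX = 100.0  # index value in the base year, 2012

def price_index(price, base_price):
    """Index of a price relative to the 2012 price (2012 = 100)."""
    return BASE_INDEX / base_price * price

def price_from_index(index, base_price):
    """Recover a price from its index and the 2012 price."""
    return base_price / BASE_INDEX * index

print(round(price_index(146.9, 134.4), 1))       # petrol 2021 index: 109.3
print(round(price_index(137.7, 141.1), 1))       # diesel 2013 index: 97.6
print(round(price_from_index(109.3, 134.4), 1))  # petrol 2021 price: 146.9
```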

Use the information in the table above, together with any descriptive statistics you may
calculate, to make FIVE relevant comments and/or comparisons about Petrol and Diesel
prices since 2012.
1. Petrol prices are lower than diesel prices in every year in the table above.
2. The price indexes of both petrol and diesel are 100 in 2012, the base year.
3. The average price of petrol over the period is 123.9 pence per litre, while that of diesel is
129.04 pence per litre.
4. In 2012 the petrol price was 134.4 and in 2021 it was 146.9. That is a 9.3% increase.
5. Diesel prices in 2016 and 2020 were the same (117.4 pence per litre).
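
The comments above can be checked against the table with Python's standard `statistics` module. This is a quick verification sketch; the lists are typed in from the table for 2012 to 2021.

```python
from statistics import mean

# November prices in pence per litre, 2012 to 2021
petrol = [134.4, 130.0, 122.3, 107.3, 114.6, 120.1, 125.8, 125.3, 112.4, 146.9]
diesel = [141.1, 137.7, 127.2, 110.3, 117.4, 123.7, 135.4, 130.1, 117.4, 150.1]

petrol_mean = mean(petrol)
diesel_mean = mean(diesel)
petrol_always_cheaper = all(p < d for p, d in zip(petrol, diesel))
petrol_change = (petrol[-1] - petrol[0]) / petrol[0] * 100

print(round(petrol_mean, 1))    # 123.9
print(round(diesel_mean, 2))    # 129.04
print(petrol_always_cheaper)    # True
print(round(petrol_change, 1))  # 9.3
```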

PART 4: Time Series Analysis

The table below shows information about visits to the UK by overseas residents, and visits
abroad by UK residents. The data is also available as an Excel file on Blackboard.
           Number of visitors    Spending by overseas   Number of UK
           to UK from overseas   visitors to UK         residents' visits abroad
           (Thousands)           (£ Millions)           (Thousands)
Q2 2016    10138                 6533                   21787
Q3 2016    10892                 8235                   27548
Q4 2016    9900                  6200                   17450
Q1 2017    8847                  5075                   15934
Q2 2017    11012                 7153                   23744
Q3 2017    11899                 10088                  28699
Q4 2017    9322                  6080                   18865
Q1 2018    8547                  5194                   16592
Q2 2018    10521                 6939                   24646
Q3 2018    11536                 8401                   29923
Q4 2018    9679                  5974                   19409
Q1 2019    8332                  4805                   18159
Q2 2019    10364                 6896                   25760
Q3 2019    11864                 9193                   30000
Q4 2019    10297                 7555                   19167
Q1 2020    6994                  4344                   13891
Q2 2020    398                   218                    939
Q3 2020    2322                  1037                   6191
Q4 2020    1386                  611                    2806
Q1 2021    195                   248                    774
Q2 2021    277                   386                    1000
Source: https://www.ons.gov.uk/
Select some of the data above and complete a time series analysis.
You do not need to analyse all of the data.
Clearly state and interpret all seasonal effects that you have calculated.
Use your model to make relevant predictions.
Use calculator or Excel (see your seminar on time series analysis).
Present your results on the next pages.

ADDITIVE TIME SERIES

Obs   Yr     Qtr   Spending by overseas   MA     CMA       SV      Projection   Forecast
                   visitors to UK                (TREND)
                   (£ millions)
1     2016   Q3    6533
2            Q4    8235
3     2017   Q1    6200                   6511   6588      -388
4            Q2    5075                   6666   6897      -1822
5            Q3    7153                   7129   7114      39
6            Q4    10088                  7099   7114      2974
7     2018   Q1    6080                   7129   7102      -1022
8            Q2    5194                   7075   6864      -1670
9            Q3    6939                   6654   6640      299
10           Q4    8401                   6627   6578      1823    10044
11    2019   Q1    5974                   6530                     6749
12           Q2    4805                                            6724         4805
13           Q3                                                    6699         6887
14           Q4                                                    6673         10559
15    2020   Q1                                                    6648         5749
16           Q2                                                    6623         3965
17           Q3                                                    6598         6786
18           Q4                                                    6573         10458
19    2021   Q1                                                    6548         5648
20           Q2                                                    6522         3865
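
The MA, CMA and SV columns can be reproduced with the sketch below, assuming a 4-quarter moving average that is then centred by averaging consecutive pairs. The list holds the twelve observed spending values from the table; the variable names are my own.

```python
# Observations 1 to 12: spending by overseas visitors to UK (£ millions)
spending = [6533, 8235, 6200, 5075, 7153, 10088,
            6080, 5194, 6939, 8401, 5974, 4805]

# 4-quarter moving averages (the first one sits between observations 2 and 3)
ma = [sum(spending[i:i + 4]) / 4 for i in range(len(spending) - 3)]

# Centred moving averages (trend): mean of each consecutive pair of MAs,
# so cma[0] lines up with observation 3
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]

# Additive seasonal variation = actual value minus trend
sv = [spending[i + 2] - cma[i] for i in range(len(cma))]

print([round(v) for v in cma])  # [6588, 6897, 7114, 7114, 7102, 6864, 6640, 6578]
print([round(v) for v in sv])   # [-388, -1822, 39, 2974, -1022, -1670, 299, 1823]
```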
[Line chart: Additive Time Series - Spending by overseas visitors to UK. x-axis: Observation number, 1 to 20 (2016 Q3 to 2021 Q2); y-axis: Spending by overseas visitors to UK (0 to 12000)]


Explanation

This time series covers spending by overseas visitors to the UK from Q3 2016 to Q2 2019.
In the analysis, I calculated the moving average (MA), the centred moving average (CMA, the
trend), the seasonal variations (SV), the projection and the forecast of spending by
overseas visitors to the UK. The seasonal variations were calculated as shown in the table
below, whereas the slope and the intercept of the trend line on the chart generated above
were -25 and 7026 respectively.

Average Seasonal Variation Table

Quarter   2017    2018    ASV
1         -388    -1022   -899
2         -1822   -1670   -2658
3         39      299     188
4         2974    1823    3885
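
In an additive model, each forecast is the trend projection plus the average seasonal variation for that quarter. The sketch below uses the rounded slope (-25) and intercept (7026) quoted in the explanation and the ASV values from the table, so its results differ by a few units from the Forecast column, which presumably used unrounded values. The function names and the quarter mapping (observation 1 is labelled Q3) are assumptions for illustration.

```python
SLOPE = -25       # rounded trend slope per observation
INTERCEPT = 7026  # rounded trend intercept

# Average seasonal variation by quarter, from the ASV table
ASV = {1: -899, 2: -2658, 3: 188, 4: 3885}

def quarter_of(obs):
    """Quarter label for an observation number, with observation 1 = Q3."""
    return (obs + 1) % 4 + 1

def projection(obs):
    """Trend value from the straight-line trend."""
    return INTERCEPT + SLOPE * obs

def forecast(obs):
    """Additive forecast: trend projection plus seasonal variation."""
    return projection(obs) + ASV[quarter_of(obs)]

print(projection(17), forecast(17))  # 6601 6789 (table: 6598 and 6786)
```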

From the data chosen, spending fluctuates depending on the quarter. For instance, the first
and second quarters have low spending compared to the third and fourth. In Q1 of 2018 the
spending was £6080 million, while in Q4 of the same year it was £8401 million, a change of
38.17%. This could have been caused by seasonal weather patterns. The lowest spending
between 2016 and 2019 was in Q2 of 2019 (£4805 million) and the highest was in Q4 of 2017
(£10088 million). Besides, the trend shows a gradual fall in spending between 2016 and 2019
(a slope of about -25 per quarter). This fall in spending could be attributed to global
inflation.

The forecast shows a pattern in spending similar to previous years for the end of 2019 and
the whole of 2020 and 2021. However, that is not the case when the actual figures are
examined. At the beginning of 2020, the world was hit by the coronavirus pandemic, which
halted most economic activity, including cross-border movement. This led to a drastic fall in
spending by visitors to the UK because there were almost no visitors. Therefore, the
projected and forecast figures were not achieved. For instance, the forecast for Q3 2020 was
6786, while the actual figure was 1037.
