Lecture 9


BUSN3100: Strategic Bus Intelligence

Lecture 9
Chapter 20

Pricing and product mix decisions

Performing the test plan and analyzing the results


Text 2: Chapter 3

WHERE WE ARE NOW
1. Data Analytics
2. Master the Data
3. Perform the Test and Address Results
4. Communicate Results
5. The Modern Environment
6. Audit Analytics
7. Management Analytics
8. Financial Statement Analytics
9. Tax Analytics
10. Project Chapter (Basic)
11. Project Chapter (Advanced)

In the IMPACT cycle, we’re going to look at performing the test plan.

Exhibit 1-1 The IMPACT Cycle (Identify the questions, Master the data,
Perform test plan, Address and refine results, Communicate insights,
Track outcomes)

Chapter Objectives
• Understand four categories of Data Analytics.
• Describe some descriptive analytics approaches, including
summary statistics and data reduction.
• Explain the diagnostic approach to Data Analytics, including
profiling and clustering.
• Understand predictive analytics, including regression and
classification.
• Describe the use of prescriptive analytics, including machine
learning and artificial intelligence.

© McGraw Hill 4
Learning Objective 3-1

What are the four categories of Data Analytics?

There are four main categories of data analytics.
• Descriptive analytics are procedures that summarize existing data to
determine what has happened in the past.
• Diagnostic analytics are procedures that explore the current data to
determine why something has happened the way it has, typically
comparing the data to a benchmark.
• Predictive analytics are procedures used to generate a model that can
be used to determine what is likely to happen in the future.
• Prescriptive analytics are procedures that model data to enable
recommendations for what should be done in the future.

Each stage takes additional effort but provides additional value.

Exhibit 3-1 Four Main Categories of Data Analytics

Descriptive analytics examples:
• Summary statistics describe a set of data in terms of their location
(mean, median), range (standard deviation, minimum, maximum), shape
(quartile), and size (count).
• Data reduction or filtering is used to reduce the number of
observations to focus on relevant items (that is, highest cost, highest
risk, largest impact, etc.). It does this by taking a large set of data
(perhaps the population) and reducing it to a smaller set that has the
vast majority of the critical information of the larger set.

Diagnostic analytics examples:
• Profiling identifies the “typical” behavior of an individual, group,
or population by compiling summary statistics about the data (including
mean, standard deviations, etc.) and comparing individuals to the
population.
• Clustering helps identify groups (or clusters) of individuals (such as
customers) that share common underlying characteristics; in other words,
identifying groups of similar data elements and the underlying drivers
of those groups.

More diagnostic analytics examples:
• Similarity matching is a grouping technique used to identify similar
individuals based on data known about them.
• Co-occurrence grouping discovers associations between individuals
based on common events, such as transactions they are involved in.

Predictive analytics examples:
• Regression estimates or predicts the numerical value of a dependent
variable based on the slope and intercept of a line and the value of an
independent variable.
• Classification predicts a class or category for a new observation
based on the manual identification of classes from previous
observations.

More predictive analytics examples:
• Link prediction predicts a relationship between two data items,
such as members of a social media platform.

Prescriptive analytics examples:
• Decision support systems are rule-based systems that gather data and
recommend actions based on the input.
• Machine learning and artificial intelligence are learning models or
intelligent agents that adapt to new external data to recommend a
course of action.

Learning Objective 3-2

What are some descriptive analytics approaches, including summary
statistics and data reduction?

Descriptive analytics help summarize what has happened in the past.
• A financial accountant would sum all the sales transactions within a
period to calculate the value for Sales Revenue that appears on the
income statement.
• An analyst would count the number of records in a data extract to
ensure the data are complete before running a more complex analysis.
• An auditor would filter data to limit the scope to transactions that
represent the highest risk.
In all these cases, basic analysis provides an understanding of what has
happened in the past to help decision makers achieve good results and
correct poor results.

Summary statistics
• Summary statistics describe the location, spread, shape, and
dependence of a set of observations.

Statistic                 Excel formula   Description
Sum                       =SUM()          The total value of all numerical values
Mean                      =AVERAGE()      The center value; sum of all observations divided by the number of observations
Median                    =MEDIAN()       The middle value that divides the top half of the data from the bottom half
Minimum                   =MIN()          The smallest value
Maximum                   =MAX()          The largest value
Count                     =COUNT()        The number of observations
Frequency                 =FREQUENCY()    The number of observations in each of a series of numerical or categorical buckets
Standard deviation        =STDEV()        The variability or spread of the data from the mean; a larger standard deviation means a wider spread away from the mean
Quartile                  =QUARTILE()     The value that divides a quarter of the data from the rest; indicates skewness of the data
Correlation coefficient   =CORREL()       How closely two datasets are correlated or predictive of one another

Exhibit 3-3 Description of Summary Statistics

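The statistics in Exhibit 3-3 can also be computed outside Excel. The following is a minimal Python sketch, not part of the original slides; the `sales` and `units` figures are made-up illustrative data, and the `pearson` helper is a hypothetical stand-in for what `=CORREL()` reports.

```python
import statistics

# Hypothetical sales amounts and matching units sold (illustrative data only).
sales = [120.0, 80.0, 150.0, 95.0, 110.0, 400.0, 105.0]
units = [12, 8, 15, 10, 11, 38, 10]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

summary = {
    "sum": sum(sales),
    "mean": statistics.fmean(sales),
    "median": statistics.median(sales),
    "minimum": min(sales),
    "maximum": max(sales),
    "count": len(sales),
    "stdev": statistics.stdev(sales),  # sample standard deviation
    # Three cut points (Q1, Q2, Q3) using inclusive interpolation.
    "quartiles": statistics.quantiles(sales, n=4, method="inclusive"),
    "correlation": pearson(sales, units),
}

for name, value in summary.items():
    print(name, value)
```

Note how the single large sale (400) pulls the mean well above the median, which is the kind of skew the quartile row in the exhibit is meant to reveal.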
Data reduction involves the following steps:
• Identify the attribute you would like to reduce or focus on.
• Filter the results.
• Interpret the results.
• Follow up on results.

Exhibit 3-4 Use Filters to Reduce Data

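The first two data reduction steps can be sketched in a few lines of Python. This is an illustration only: the expense records, the field names, and the 500.00 threshold are all assumptions, not values from the text.

```python
# Hypothetical expense records; field names and amounts are illustrative.
expenses = [
    {"employee": "A. Lee", "category": "Travel", "amount": 1250.00},
    {"employee": "B. Cruz", "category": "Meals", "amount": 48.50},
    {"employee": "C. Diaz", "category": "Entertainment", "amount": 730.00},
    {"employee": "D. Shah", "category": "Meals", "amount": 62.10},
]

def reduce_data(records, attribute, threshold):
    """Keep records whose attribute exceeds the threshold, largest first."""
    kept = [r for r in records if r[attribute] > threshold]
    return sorted(kept, key=lambda r: r[attribute], reverse=True)

# Step 1: focus on the "amount" attribute. Step 2: filter.
high_value = reduce_data(expenses, "amount", 500.00)

# Steps 3 and 4 (interpret, follow up) remain a human task;
# the filtered list is simply where that review would start.
for record in high_value:
    print(record["employee"], record["category"], record["amount"])
```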
Fuzzy matching locates approximate matches
• Useful for identifying relationships in imperfect data.

Exhibit 3-5 Fuzzy Matching Shows a Likely Match of an Employee and
Vendor

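One way to experiment with fuzzy matching is Python's standard `difflib` module. The sketch below is an assumption about how such a match could be scored, not the tool used for Exhibit 3-5; the addresses and vendor names are invented.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-to-1 similarity score after basic normalization."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()

# Hypothetical employee and vendor addresses (illustrative only).
employee_address = "1234 Maple Street, Springfield"
vendor_addresses = {
    "Vendor 101": "1234 Maple St., Springfield",
    "Vendor 102": "78 Harbor Road, Lakeside",
}

scores = {
    vendor: similarity(employee_address, address)
    for vendor, address in vendor_addresses.items()
}
likely_match = max(scores, key=scores.get)
print(likely_match, round(scores[likely_match], 2))
```

An exact-match join would miss "Street" versus "St."; a similarity score lets the auditor review near matches above a chosen cutoff.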
Q. Describe how the data reduction approach could be used to evaluate
employee travel and entertainment expenses.

Learning Objective 3-3

How does the diagnostic approach to Data Analytics work, including
profiling and clustering?

Diagnostic analytics
• Diagnostic analytics provide insight into why things happened or
how individual data values relate to the general population.

Profiling compares an individual to the
population
• Profiling is done primarily using structured data—data that are
stored in a database or spreadsheet and are readily
searchable.
• Profiling is used to discover patterns of behavior. In this
example, the higher the Z-score (farther away from the mean),
the more likely a customer will have a delayed shipment
(blue circle).

Profiling relies on gathering summary
statistics and identifying outliers.
• Identify the objects or activity you want to profile.
• Determine the types of profiling you want to perform.
• Set boundaries or thresholds for the activity.
• Interpret the results and monitor the activity and/or generate a
list of exceptions.
• Follow up on exceptions.

Z-Scores and box plots show spread and outliers.

Exhibit 3-7 Z-Scores Provide an Example of Profiling That Helps
Identify Outliers
Exhibit 3-8 Box Plots Provide an Example of Profiling That Helps
Identify Outliers (in This Case, Categories with Unusually High Average
Days to Ship)

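A Z-score is the number of standard deviations an observation lies from the mean. The sketch below profiles hypothetical days-to-ship figures; the data and the cutoff of 2 are assumptions chosen for illustration.

```python
import statistics

# Hypothetical days-to-ship per order (illustrative data only).
days_to_ship = {
    "Order 1": 3, "Order 2": 4, "Order 3": 3, "Order 4": 5, "Order 5": 4,
    "Order 6": 3, "Order 7": 4, "Order 8": 5, "Order 9": 4, "Order 10": 15,
}

mean = statistics.fmean(days_to_ship.values())
stdev = statistics.stdev(days_to_ship.values())  # sample standard deviation

# Z-score: distance from the mean, measured in standard deviations.
z_scores = {order: (days - mean) / stdev for order, days in days_to_ship.items()}

THRESHOLD = 2.0  # assumed boundary; set it to suit the activity being profiled
outliers = [order for order, z in z_scores.items() if abs(z) > THRESHOLD]
print(outliers)
```

In a small sample a single extreme value inflates the standard deviation, so Z-scores stay fairly small; that is one reason the threshold here is 2 rather than 3.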
Variance analysis is an example of data profiling.
• Internal auditors analyze travel and entertainment expenses for
violations of internal controls.
• Managers use profiling to compare variances from target ranges.

Exhibit 3-9 Variance Analysis Is an Example of Data Profiling

Benford’s Law is a diagnostic analytics technique that compares actual
to expected values.
• In the continuous audit, an auditor may use Benford’s Law to evaluate
the frequency distribution of the first digits from a large set of
numerical data.

Exhibit 3-10 Benford’s Law Applied to Large Numerical Data Sets
(including Employee Transactions)

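Under Benford's Law the expected share of first digit d is log10(1 + 1/d), so about 30.1% of amounts should start with 1. The sketch below compares that expectation with observed first digits; the helper names and the short `amounts` list are hypothetical, and a real test needs a far larger data set.

```python
import math
from collections import Counter

def benford_expected():
    """Expected first-digit proportions: log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(amount):
    """Leading nonzero digit of a nonzero amount."""
    # Scientific notation puts the leading digit first, e.g. 1.2345e+03.
    return int(f"{abs(amount):.15e}"[0])

def first_digit_frequencies(amounts):
    """Observed proportion of each first digit, ignoring zero amounts."""
    counts = Counter(first_digit(a) for a in amounts if a != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Hypothetical transaction amounts (far too few for a real test).
amounts = [1200, 1850, 130, 2400, 310, 19.5, 4700, 1020, 960, 150]
expected = benford_expected()
observed = first_digit_frequencies(amounts)

for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

Digits whose observed share sits well above the expected share are the ones an auditor would examine further.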
Cluster analysis shows natural groupings of data.
• Clustering is used to identify groups of similar data elements and the
underlying drivers of those groups.
• Clustering algorithms calculate the minimum distance of all
observations and group those elements.

Exhibit 3-11 Clustering Is Used to Find Three Natural Groupings of
Vendors Based on Purchase Activity

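The slide does not name a specific algorithm; k-means is one common choice and is used here as an assumption. This one-dimensional sketch groups hypothetical vendor purchase totals by repeatedly assigning each value to its nearest center and then moving each center to the mean of its group.

```python
def k_means_1d(values, k, iterations=20):
    """Minimal 1-D k-means (k >= 2) with a deterministic starting point."""
    ordered = sorted(values)
    # Spread the initial centers evenly across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to the center at minimum distance.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if empty).
        centers = [
            sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)
        ]
    return centers, groups

# Hypothetical total purchases per vendor (illustrative data only).
purchases = [100, 120, 110, 500, 520, 480, 1500, 1450, 1550]
centers, groups = k_means_1d(purchases, k=3)
print(centers)
print(groups)
```

Real vendor data would usually use several attributes at once (for example amount and frequency), which changes only the distance calculation.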
What are some examples of clustering?
• Internal auditors can use clustering to identify groups of
transactions that may indicate risk or fraud in insurance or other
payments.

Exhibit 3-12 Cluster Analysis of Insurance Payments

Hypothesis testing is used to identify how different groups are.
• Begin by setting the Null Hypothesis H0 (no relationship) and the
Alternative Hypothesis HA (expected relationship).
• Test the p-value for statistical significance.

Exhibit 3-13 T-Test Assessing for Significant Differences in Average
Shipping Times across Categories

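Exhibit 3-13 uses a t-test. Python's standard library has no t-distribution, so the sketch below swaps in an exact permutation test, a different technique that answers the same question: if H0 is true and group labels do not matter, how often would a difference in means at least this large appear? The shipping times are invented.

```python
from itertools import combinations
from statistics import fmean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(fmean(group_a) - fmean(group_b))
    total = sum(pooled)
    extreme = 0
    splits = 0
    # Try every way of relabeling the pooled data into two groups.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:  # small tolerance for float error
            extreme += 1
        splits += 1
    return extreme / splits

# Hypothetical days-to-ship for two product categories.
furniture = [5, 6, 7, 6, 8]
office_supplies = [2, 3, 3, 4, 2]

p_value = permutation_p_value(furniture, office_supplies)
print(round(p_value, 4))
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The 0.05 cutoff is the conventional significance level, used here as an assumption. Enumerating every split is only practical for small samples.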
Learning Objective 3-4

When do you use predictive analytics, including regression and
classification?

Regression helps predict expected outcomes.
• Identify the variables that might predict an outcome.
• Determine the functional form of the relationship.
• Identify the parameters of the model.
• Dependent variable = f(independent variables)

Exhibit 3-14 Regression

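For a single independent variable and a linear functional form, the parameters are the slope and intercept of the least-squares line. The sketch below estimates them by hand; the `fit_line` name and the data points are hypothetical.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: x = independent variable, y = dependent variable.
x_values = [1, 2, 3, 4, 5]
y_values = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(x_values, y_values)

def predict(x):
    """Expected value of the dependent variable for a new x."""
    return intercept + slope * x

print(round(slope, 2), round(intercept, 2), round(predict(6), 2))
```

With several independent variables, as in the turnover and loan-loss examples, the same idea applies but a statistics package would normally do the fitting.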
What are some examples of regression?
• In managerial accounting, regression may predict employee turnover:
• Employee turnover = f(current professional salaries, health of the
economy [GDP], salaries offered by other accounting firms or by
corporate accounting, etc.)
• In auditing, regression may be used to determine the appropriateness
of allowance accounts:
• Allowance for loan losses amount = f(current aged loans, loan type,
customer loan history, collections success)

The goal of classification is to predict
which class an individual belongs to.
• Identify the classes you wish to predict.
• Manually classify an existing set of records.
• Select a set of classification models.
• Divide your data into training and testing sets.
• Generate your model.
• Interpret the results and select the “best” model.
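The steps above can be sketched end to end with a very simple model. Everything here is an assumption made for illustration: the loan records, the two classes, the every-third-record test split, and the one-feature threshold models ("decision stumps") standing in for real classification models.

```python
# Steps 1-2: classes are "pay" and "default"; records are already labeled.
# Each record: (days_past_due, prior_loans, label). Illustrative data only.
records = [
    (2, 1, "pay"), (5, 3, "pay"), (40, 2, "default"), (8, 5, "pay"),
    (55, 1, "default"), (3, 4, "pay"), (35, 3, "default"), (60, 2, "default"),
    (10, 2, "pay"), (45, 4, "default"), (7, 1, "pay"), (50, 5, "default"),
]

# Step 4: hold out every third record for testing.
train = [r for i, r in enumerate(records) if i % 3 != 0]
test = [r for i, r in enumerate(records) if i % 3 == 0]

def fit_stump(rows, feature):
    """Step 5: find the single threshold on one feature that fits rows best."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        for above, below in (("default", "pay"), ("pay", "default")):
            correct = sum(
                (above if r[feature] >= threshold else below) == r[2]
                for r in rows
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, above, below)
    _, threshold, above, below = best
    return lambda r: above if r[feature] >= threshold else below

def accuracy(model, rows):
    """Share of rows whose predicted class equals the labeled class."""
    return sum(model(r) == r[2] for r in rows) / len(rows)

# Step 3: two candidate models, one per feature.
models = {
    "days_past_due": fit_stump(train, 0),
    "prior_loans": fit_stump(train, 1),
}

# Step 6: compare on the held-out test data and keep the better model.
test_scores = {name: accuracy(model, test) for name, model in models.items()}
best_model = max(test_scores, key=test_scores.get)
print(test_scores, best_model)
```

The comparison uses the test set, not the training set, because the goal is a model that classifies records it has not seen.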

Classification begins with decision boundaries.
• Training data are existing data that have been manually evaluated and
assigned a class.
• Test data are existing data used to evaluate the model.
• Decision trees are used to divide data into smaller groups.
• Decision boundaries mark the split between one class and another.

Exhibit 3-16 Example of Decision Trees and Decision Boundaries

What else do you need to know about classification? 2
Pruning removes branches from a decision tree to avoid overfitting the
model.

Exhibit 3-17 Illustration of Pruning a Decision Tree

What else do you need to know about classification? 3
• Linear classifiers are useful for ranking items rather than simply
predicting class probability.
• These are useful for determining the important values, such as
valuable customers, or which transactions are most likely fraudulent.

Exhibit 3-18 Illustration of Linear Classifiers

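A linear classifier computes a weighted sum of the inputs, and that score can be used to rank items. In the sketch below the feature names, weights, and bias are invented for illustration; in practice they would be learned from labeled training data.

```python
# Hypothetical weights; a fitted model would supply these.
weights = {"amount_thousands": 0.8, "after_hours": 1.5, "new_vendor": 2.0}
bias = -3.0

# Hypothetical transactions to be ranked by fraud risk.
transactions = [
    {"id": "T1", "amount_thousands": 0.5, "after_hours": 0, "new_vendor": 0},
    {"id": "T2", "amount_thousands": 4.0, "after_hours": 1, "new_vendor": 1},
    {"id": "T3", "amount_thousands": 2.0, "after_hours": 1, "new_vendor": 0},
]

def score(transaction):
    """Linear score: bias plus the weighted sum of the features."""
    return bias + sum(w * transaction[f] for f, w in weights.items())

# Rank from most to least suspicious instead of only labeling each one.
ranked = sorted(transactions, key=score, reverse=True)
for t in ranked:
    label = "review" if score(t) > 0 else "pass"  # score 0 is the boundary
    print(t["id"], round(score(t), 2), label)
```

The ranking lets a reviewer with limited time start at the top of the list rather than treating every flagged item as equal.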
What else do you need to know about classification? 4
• A support vector machine is a discriminating classifier that is
defined by a separating hyperplane that works first to find the widest
margin (or biggest pipe) and then works to find the middle line.

Exhibit 3-19 Support Vector Machines
Exhibit 3-20 Support Vector Machine Decision Boundaries

How do we evaluate classifiers?
Try to avoid overfitting, or models that fit the existing data too
closely. They are bad at predicting a future observation.

Exhibit 3-21 Illustration of Underfitting and Overfitting the Data with
a Predictive Model

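Overfitting can be seen by comparing accuracy on training data with accuracy on test data. In this sketch a nearest-neighbor model memorizes every training point, including two noisy labels, while a single hand-picked threshold ignores the noise. The data, the two models, and the boundary at 5 are all assumptions for illustration.

```python
# Hypothetical (value, class) pairs. The points at 3 and 7 carry noisy labels.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test = [(1.4, 0), (2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1), (8.6, 1)]

def nearest_neighbor(x):
    """Complex model: copy the label of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def threshold_rule(x):
    """Simple model: one decision boundary at x = 5 (assumed)."""
    return 1 if x >= 5 else 0

def accuracy(model, rows):
    """Share of rows where the model's class equals the labeled class."""
    return sum(model(x) == label for x, label in rows) / len(rows)

for name, model in (("nearest neighbor", nearest_neighbor),
                    ("threshold rule", threshold_rule)):
    print(name, accuracy(model, train), round(accuracy(model, test), 2))
```

The memorizing model is perfect on the training data and poor on the test data; the simpler rule misses the noisy training points but predicts new observations better.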
How do we evaluate classifiers?
Look for the sweet spot where we maximize the accuracy on the testing
data.

Exhibit 3-22 Illustration of the Trade-Off between the Complexity of
the Model and the Accuracy of the Classification

Q. If we are trying to predict the extent of employee turnover, do you
believe the health of the economy, as measured using GDP, will be
positively or negatively associated with employee turnover?

Learning Objective 3-5

What are prescriptive analytics, including machine learning and
artificial intelligence?

What do we do next?
• Once other diagnostic and predictive analyses have been performed, the
decision process can be aided by rules-based decision support systems
or machine learning models, or the results can be added to an existing
artificial intelligence model to improve future predictions.

Decision support systems use rules to guide the accountant.
• The rules are derived from past behavior to help guide the accountant
through a process.
• For example, the classification of leases is based on evaluating
several rules.

Exhibit 3-23 Lease Classification Flowchart

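A rule-based decision support system can be as simple as a function that checks each rule in turn and reports which ones fired. The rules in Exhibit 3-23 are not reproduced in the slide text, so the criteria below are an assumption based on the finance-lease tests commonly cited under ASC 842, and the 75% and 90% cutoffs are assumed bright lines, not figures from the lecture.

```python
from dataclasses import dataclass

@dataclass
class Lease:
    transfers_ownership: bool
    purchase_option_reasonably_certain: bool
    lease_term_years: float
    asset_economic_life_years: float
    present_value_of_payments: float
    asset_fair_value: float
    specialized_asset: bool

def classify_lease(lease, term_share=0.75, value_share=0.90):
    """Return (classification, reasons); any rule that fires means finance."""
    reasons = []
    if lease.transfers_ownership:
        reasons.append("ownership transfers at end of term")
    if lease.purchase_option_reasonably_certain:
        reasons.append("purchase option reasonably certain to be exercised")
    if lease.lease_term_years >= term_share * lease.asset_economic_life_years:
        reasons.append("term covers major part of economic life")
    if lease.present_value_of_payments >= value_share * lease.asset_fair_value:
        reasons.append("payments cover substantially all of fair value")
    if lease.specialized_asset:
        reasons.append("asset has no alternative use to the lessor")
    return ("finance" if reasons else "operating"), reasons

# Hypothetical lease: 8-year term on an asset with a 10-year life.
example = Lease(False, False, 8, 10, 50000.0, 100000.0, False)
classification, reasons = classify_lease(example)
print(classification, reasons)
```

Returning the reasons alongside the answer matters in practice, because the accountant still has to document why the lease was classified as it was.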
Machine learning learns from past data to predict better outcomes.
• What these models all have in common is the use of algorithms and
statistical models to generate a previously unknown model that relies on
patterns and inferences.
• For most applications of artificial intelligence models, most
companies will outsource the underlying system to companies like
Microsoft, Amazon, or Google rather than develop it themselves.
• These companies have large datasets to create more accurate prediction
and recommendation engines.

Chapter 3 Summary
In this chapter, we addressed the third and fourth steps of the IMPACT
cycle model: the “P” for “performing test plan” and “A” for “address and
refine results.” That is, how are we going to test or analyze the data
to address a problem we are facing? (LO 3-1)

We identified descriptive analytics that help describe what happened
with the data, including summary statistics, data reduction, and
filtering. (LO 3-2)

We provided examples of diagnostic analytics that help users identify
relationships in the data that uncover why certain events happen through
profiling, clustering, similarity matching, and co-occurrence grouping.
(LO 3-3)

We introduced some specific models and terminology related to these
tools, including Benford’s law, test and training data, decision trees
and boundaries, linear classifiers, and support vector machines. We
identified cases where models that overfit existing data are not very
accurate at predicting the future. (LO 3-4)

We explained examples of predictive analytics and introduced some data
mining concepts related to regression, classification, and link
prediction that can help predict future events or values. (LO 3-4)

We discussed prescriptive analytics, including decision support systems
and artificial intelligence, and provided some examples of how these
systems can make recommendations for future actions. (LO 3-5)

Homework
• Chapter 3 homework:
• DQ2; DQ3; DQ8; DQ9; DQ10;
• P2, P4, P7
