Antim Prahar Business Statistics and Analysis


ANTIM PRAHAR

The Most Important Questions


By
Dr. Anand Vyas
1 Correlation: Karl Pearson's and Spearman's Rank Correlation
Karl Pearson’s Coefficient of Correlation is a widely used mathematical
method in which a single numerical value expresses the degree and
direction of the relationship between linearly related variables.

Spearman rank correlation is a non-parametric measure of the degree of
association between two variables. It does not carry any assumptions about
the distribution of the data and is the appropriate correlation analysis
when the variables are measured on a scale that is at least ordinal.
[Figures: Karl Pearson's Correlation; Spearman's Rank Correlation]
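As a rough illustration of the two coefficients described above, the sketch below computes both in plain Python. The function names and the spend/sales data are hypothetical, and the Spearman shortcut formula used here assumes no tied ranks.

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's r: sum of cross-deviations over the root of the product of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

def ranks(values):
    """Rank each value from 1 (smallest). Assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical data: advertising spend vs. sales
spend = [10, 20, 30, 40, 50]
sales = [12, 25, 31, 48, 95]

print("Pearson r:", round(pearson_r(spend, sales), 4))
print("Spearman rho:", spearman_rho(spend, sales))
```

Because sales rise every time spend rises, the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's coefficient is below 1 because the relationship is not exactly linear.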
2 Regression
• Regression is a statistical method used in finance, investing and
other disciplines that attempts to determine the strength and form of
the relationship between one dependent variable (usually denoted by Y)
and one or more other changing variables (known as independent
variables).
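The simplest case, one independent variable X and a straight-line relationship Y = a + bX, can be sketched with the least-squares formulas. This is only a minimal illustration; the function name `fit_line` and the data are assumptions made for the example.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data that lies exactly on y = 1 + 2x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_line(x, y)
print("a =", intercept, "b =", slope)
```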
3 Probability (Card & Ball Question)
• In our day-to-day life, “probability” or “chance” is a very commonly
used term. We often say “Probably it may rain tomorrow”,
“Probably Mr. X may come for taking his class today”, or
“Probably you are right”. All these expressions of possibility and
probability convey the same meaning. In statistics, however, probability
has a specific, quantitative meaning, unlike in the layman’s usage.

• Addition and Multiplication Laws
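A typical card question and a typical ball question can be worked through with the two laws. The sketch below uses exact fractions; the specific questions (a king or a heart from a standard 52-card deck, and two red balls drawn without replacement from a bag of 5 red and 3 blue) are illustrative assumptions, not taken from the notes.

```python
from fractions import Fraction

# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
# Card question: probability of drawing a king or a heart from 52 cards.
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_of_hearts = Fraction(1, 52)   # the one card that is both
p_king_or_heart = p_king + p_heart - p_king_of_hearts

# Multiplication law: P(A and B) = P(A) * P(B given A)
# Ball question: bag with 5 red and 3 blue balls, two drawn without replacement.
p_first_red = Fraction(5, 8)
p_second_red_given_first_red = Fraction(4, 7)   # 4 red left among 7 balls
p_both_red = p_first_red * p_second_red_given_first_red

print("P(king or heart) =", p_king_or_heart)   # 4/13
print("P(both red)      =", p_both_red)        # 5/14
```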


4 Mean, Median, Mode
• MEAN (ARITHMETIC)
• The mean (or average) is the most popular and well known measure of central tendency. It can be
used with both discrete and continuous data, although its use is most often with continuous data.
The mean is equal to the sum of all the values in the data set divided by the number of values in
the data set. So, if we have n values in a data set and they have values x1, x2, …, xn, the sample
mean, usually denoted by x̄ (pronounced x bar), is: x̄ = (x1 + x2 + … + xn) / n

• MEDIAN
• The median is the middle score for a set of data that has been arranged in order of magnitude.

• MODE
• The mode is the most frequent score in our data set. It corresponds to the highest bar in a bar
chart or histogram. You can, therefore, sometimes consider the mode as being the most popular
option.
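The three measures can be checked quickly with Python's standard `statistics` module. The scores below are made-up values chosen only for illustration.

```python
import statistics

# Hypothetical data set of 7 scores
scores = [2, 3, 3, 5, 7, 8, 14]

mean = statistics.mean(scores)      # sum of values / number of values = 42 / 7
median = statistics.median(scores)  # middle value once the data is sorted
mode = statistics.mode(scores)      # most frequent value

print("mean =", mean)      # 6
print("median =", median)  # 5
print("mode =", mode)      # 3
```

The large value 14 pulls the mean above the median, which is one reason the median is often preferred for skewed data.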
5 Fisher Index Number
• Fisher's index number (Fisher's Ideal Index) is a price index
calculated as the geometric mean of the Laspeyres index, which weights
prices by base-period quantities, and the Paasche index, which weights
prices by current-period quantities.
• Fisher's index is considered ideal because it uses both
prices and quantities of base and current period and is based on
geometric mean.
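A minimal sketch of the calculation, assuming the standard Laspeyres and Paasche price index formulas and a made-up basket of two commodities (the function name `fisher_index` and all figures are hypothetical):

```python
from math import sqrt

def fisher_index(p0, q0, p1, q1):
    """Return (Laspeyres, Paasche, Fisher) price indices with base period = 100.

    p0, q0: base-period prices and quantities
    p1, q1: current-period prices and quantities
    """
    laspeyres = 100 * sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
    paasche = 100 * sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
    fisher = sqrt(laspeyres * paasche)   # geometric mean of the two
    return laspeyres, paasche, fisher

# Hypothetical basket of two commodities
p0, q0 = [10, 20], [5, 2]   # base period
p1, q1 = [12, 25], [6, 3]   # current period

L, P, F = fisher_index(p0, q0, p1, q1)
print(round(L, 2), round(P, 2), round(F, 2))
```

Because it is a geometric mean, the Fisher value always lies between the Laspeyres and Paasche values.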
6 Mean Deviation & Standard Deviation
• Mean Deviation
• To understand the dispersion of data around a measure of central tendency,
we can use mean deviation. It is an improvement over the range. It is the
average of the absolute deviations of the data from a chosen central value,
generally the mean or the median. Mean deviation about the mode can also be
calculated, but mean deviation about the mean or median is used more often.

• Standard Deviation
• As the name suggests, this quantity is a standard measure of the deviation
of the entire data in any distribution. It is usually represented by s or σ.
It uses the arithmetic mean of the distribution as the reference point and is
the square root of the average squared deviation of the data values from this mean.
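Both measures can be computed side by side. In this sketch the helper `mean_deviation` and the data are hypothetical; `statistics.pstdev` divides by n (population form, σ), while `statistics.stdev` divides by n - 1 (sample form, s).

```python
import statistics

def mean_deviation(data, center):
    """Average absolute deviation of the data from a chosen central value."""
    return sum(abs(x - center) for x in data) / len(data)

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

md_about_mean = mean_deviation(data, statistics.mean(data))
md_about_median = mean_deviation(data, statistics.median(data))
sigma = statistics.pstdev(data)   # population standard deviation
s = statistics.stdev(data)        # sample standard deviation

print("mean deviation about mean   =", md_about_mean)    # 1.5
print("mean deviation about median =", md_about_median)  # 1.5
print("population sd =", sigma)
print("sample sd     =", round(s, 4))
```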
7 Application of Business Analytics
• Companies use Business Analytics (BA) to make data-driven decisions. The
insight gained by BA enables these companies to automate and optimize
their business processes. In fact, data-driven companies that utilize
Business Analytics achieve a competitive advantage because they are able
to use the insights to:
• Conduct data mining (explore data to find new patterns and relationships)
• Complete statistical analysis and quantitative analysis to explain why
certain results occur
• Test previous decisions using A/B testing and multivariate testing
• Make use of predictive modeling and predictive analytics to forecast future
results
8 Decision Tree
• A Decision Tree may be understood as a logical tree: a range of
conditions (premises) and actions (conclusions) depicted as nodes, with
the branches of the tree linking the premises to the conclusions. It is a
decision support tool with a tree-like representation of decisions and
their consequences. It uses ‘AND’ and ‘OR’ operators to recreate the
structure of if-then rules.

• Decision Node: Represented as a square; the different courses of action
arise from the decision node as main branches.
• Chance Node: Symbolised as a circle. Chance nodes appear at the end of the
main branches from a decision node, and their sub-branches depict
probabilities and outcomes.
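One common way to evaluate such a tree, which the notes above do not spell out, is to compute the expected value at each chance node and then pick the best branch at the decision node. The sketch below assumes that approach; the options, probabilities and payoffs are entirely hypothetical.

```python
def chance_value(branches):
    """Expected value at a chance node: sum of probability * payoff over its sub-branches."""
    return sum(probability * payoff for probability, payoff in branches)

# Decision node: each main branch (course of action) leads to a chance node.
# Each chance node is a list of (probability, payoff) sub-branches.
options = {
    "launch product": [(0.6, 100_000), (0.4, -40_000)],
    "do nothing": [(1.0, 0)],
}

expected = {name: chance_value(branches) for name, branches in options.items()}
best = max(expected, key=expected.get)   # branch with the highest expected value

for name, value in expected.items():
    print(name, "->", round(value, 2))
print("best choice:", best)
```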
9 Skewness Meaning and Types
• Skewness, in statistics, is the degree of distortion from the
symmetrical bell curve, or normal distribution, in a set of data.
Skewness can be negative, positive, zero or undefined. A normal
distribution has a skew of zero, while a lognormal distribution, for
example, would exhibit some degree of right-skew.
• The three probability distributions shown below display increasing
levels of right (or positive) skewness. Distributions can also be left
(negative) skewed. Skewness is used along with kurtosis to better
judge the likelihood of events falling in the tails of a probability
distribution.
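The sign of skewness can be illustrated with the moment coefficient of skewness (third central moment divided by the 1.5 power of the second central moment). This is one of several skewness measures; the function name and the three small data sets are assumptions made for the example.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]       # long tail to the right
left_skewed = [-10, -3, -2, -2, -1]   # long tail to the left

print("symmetric:   ", skewness(symmetric))
print("right skewed:", round(skewness(right_skewed), 3))
print("left skewed: ", round(skewness(left_skewed), 3))
```

A value of zero indicates symmetry, a positive value a right (positive) skew, and a negative value a left (negative) skew.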
10 Central Tendency
• A measure of central tendency is a summary statistic that represents
the center point or typical value of a dataset. These measures indicate
where most values in a distribution fall and are also referred to as the
central location of a distribution. You can think of it as the tendency
of data to cluster around a middle value. In statistics, the three most
common measures of central tendency are the mean, median, and
mode. Each of these measures calculates the location of the central
point using a different method.
