Business Statistics Material


Introduction to Business Statistics ________________________Student Lecture Note

Chapter One
Introduction to Business Statistics
1.1 Introduction

Nowadays most executives and other decision makers make effective decisions based on research findings. Most
research in different areas of study requires data in order to generate valuable information that facilitates the
decision-making process. Data are the raw materials of research. Moreover, the quality of the collected data
greatly affects the precision of the results obtained from a specific investigation. Therefore, it is
extremely important to know the basics of data collection.

1.2 Definition of Statistics

We encounter the term statistics frequently in our everyday language. It has two meanings. In the more
common usage, statistics refers to numerical information. Examples include the average salary of instructors at
Oda Bultum University, the average number of cars sold per week, the percentage of students attending the
Cooperatives department, the number of car accidents that occurred over the last five years and the number of deaths
due to HIV/AIDS in Chiro town last year. In these examples a statistic is a number or a percentage.

Other examples: the mean time spent waiting for service at Dashen Bank is 10 minutes, and a typical
Toyota automobile in Ethiopia travels 15,000 kilometers per year.

Each of the above is an example of a statistic (singular). A collection of more than one figure is called statistics
(plural). Statistics can appear in graphic form as well as in sentence form. The subject of statistics has a much
broader meaning than just collecting and publishing numerical information.

Statistics defined as a method (Singular sense)


The second definition of Statistics refers to the science or the methods of Statistics. It is in the sense of this
second definition that we consider Statistics as a subject. In this regard, Statistics may be defined as:
“Statistics is the science which deals with the methods of collecting, classifying, presenting, comparing
(analyzing) and interpreting numerical data collected to throw some light on any sphere of enquiry.”
“Statistics is the method of judging collective, natural or social phenomenon from the results obtained from the
analysis or enumeration or collection of estimates.”
“Statistics is the application of the scientific method in the analysis of numerical data for the purpose of making
rational decisions.”

Compiled by: Ismael H. We are here to Strive for a Solution!!! Page 1


(Berenson and Levin)

“Statistics is the collection, presentation, analysis and interpretation of numerical data.” (Croxton and Cowden)
1.3 Characteristics of statistics
i. Statistics are aggregates of facts
Single and isolated figures are not Statistics for the simple reason that such figures are unrelated and can't be
compared. To be Statistics, data must be in aggregate (mass), and the individual
elements within the aggregate should relate to a common phenomenon so that they can be compared to one
another.
ii. Statistics are affected to a marked extent by multiplicity of causes
Since Statistics are most commonly used in the social sciences, it is natural that they are affected by a large variety
of factors at the same time. Put differently, Statistics are not caused by a single factor (force); rather,
they are outcomes of a number of (multiple) factors (forces) operating together.
iii. They should be numerically expressed
All Statistics are expressed in numbers. Nevertheless, the converse of this statement is not in general true. That
is, statements expressed in terms of numbers may not necessarily be Statistics as there is a possibility for them
not to meet the other requirements of the definition given above.
iv. They should be enumerated or estimated according to reasonable standards of accuracy
Numerical statements can either be enumerated, in which case they are supposed to be accurate and precise or
else they can be estimated by some expert observers, in which case 100% accuracy is unlikely to be attained. In
the process of estimation, reasonable standards of accuracy must, however, be attained.
v. They should be collected in a systematic manner
If data are collected in a haphazard manner, the results obtained are likely to lead to fallacious conclusions.
Therefore, it is essential that data be collected in a systematic manner so that they conform to
reasonable standards of accuracy.
vi. They should be collected for a predetermined purpose
Statistics collected without any predetermined purpose do not serve any useful purpose. Therefore, the purpose
of collecting Statistics should be defined clearly before they are collected. That is, figures (Statistics) should
be collected in view of some goal or target. Moreover, the data should be collected in such a manner that they
meet the predetermined needs.
vii. They should be placed in relation to each other


For numerical facts to be called Statistics they should be comparable either period-wise or region-wise, or in
reference to some other means of comparison.
As an example, suppose that the marketing head of a supermarket in Hawassa wants to know the average
expenditure of households in the city, among other things, so as to revise his marketing strategy. To achieve this
objective, the head collects data on expenditure from a sample of 1000 households selected using stratification.
Moreover, the head uses the interview approach to gather the required information. Thus, the data collected by
the marketing head are Statistics, as they fulfill all the requirements of the definition.
1.4 Limitations of statistics
The fact that Statistics is applicable in almost all fields of study is not a guarantee of its perfection. Of course,
there is no perfect science in the world. Statistical methods also have their own limitations. The following are
the major limitations:
i. Statistics does not deal with individual items
This means that Statistics deals only with aggregates of facts, and no importance is attached to individual
items. For instance, the age of a single student in a given class in a given year is not statistical data. In contrast,
the ages of all students within a given class in a given year form an aggregate and hence can be considered as
data. Alternatively, the semester GPAs of a single student over 4 semesters also form statistical data. In short,
Statistical methods are suited only to those problems or situations where group characteristics are to be
studied.
ii. Statistics deals only with quantitatively expressed items
Another limitation of Statistics is that it deals only with those subjects of inquiry that are capable of being
quantitatively measured and numerically expressed. Accordingly, qualitative characteristics such as health,
poverty, honesty and intelligence are not directly suitable for Statistical analysis. However, problems involving such
qualitative variables are treated in Statistics indirectly. For example, the variable health may be studied through
the death rate, which is a quantitative variable. These are only indirect methods.
iii. Statistical results are not universally true
As it is often said, Statistical results are true only on the average. That is, the results obtained from Statistical
data analysis are not true for each member or item within the data for which the analysis is made. Statistical
statements or conclusions are not generally true or applicable to individuals, but are applicable to the majority
of cases.
iv. Statistics is liable to be misused


Misuses of Statistics, unfortunately, are probably as common as valid uses of Statistics. In reality, Statistical
methods can be properly used only by experienced or trained people, as it requires skill to draw sensible conclusions
from data. It is this limitation that hinders the mass popularity of such a useful and
applicable science.
1.5 Classification of Statistics

The study of statistics is usually divided into two categories: descriptive statistics and inferential statistics.

Descriptive Statistics: Methods of organizing, summarizing, and presenting data in an informative way.

Descriptive statistics is a branch of statistics devoted to the accurate representation of a mass of data with graphs
and summary measures. It deals with collecting, summarizing and simplifying data to draw meaningful
conclusions. It does not attempt to use samples to predict the parameters of a population. It does not look beyond
the data at hand, but rather concentrates on how best to understand and present these data. Measures of central
tendency, dispersion, skewness and kurtosis are examples of descriptive statistics. The data can be presented
using tools like graphs, tables, averages, modes, medians, etc.

Examples:

- Unemployment rate of a country
- Total production of a nation
- Total population size of Oromia region
- Calculating the average age of students at Oda Bultum University from graduating class students
- Recording the grades of second-year cooperatives students for the previous semester and then finding the
  average of these grades
- Drawing graphs that show the difference in brands of cars sold in the year 2014
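
As a small illustration of descriptive statistics, the sketch below summarizes a set of grades with a few common measures. The grades are made-up values used only for illustration, and Python's standard `statistics` module is one possible tool among many.

```python
import statistics

# Hypothetical grades of ten second-year students (illustrative data only).
grades = [72, 85, 64, 90, 85, 78, 69, 85, 74, 88]

mean_grade = statistics.mean(grades)      # arithmetic average
median_grade = statistics.median(grades)  # middle value of the sorted data
mode_grade = statistics.mode(grades)      # most frequent value
spread = statistics.pstdev(grades)        # dispersion, treating the data as the whole group

print(mean_grade, median_grade, mode_grade, round(spread, 2))
```

Nothing here goes beyond the data at hand: the measures only describe these ten grades.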

Inferential Statistics

Another facet of statistics is inferential statistics, also called statistical inference or inductive statistics.
Our main concern in inferential statistics is finding out something about a population based on a sample
taken from the population. For example, based on a sample survey by the Ethiopian Reporter newspaper, only 60%
of young people in Ethiopia prefer to drink Coca-Cola. Since this is an inference about the population (all young
people in Ethiopia) based on sample data, we refer to it as inferential statistics.


If the Ethiopian Economic Association reports that the domestic product of Ethiopia this year is 120 million tons,
this is descriptive statistics. But if the association predicts that the domestic product will double after 10 years
based on the present information, this is inferential statistics.

Inferential Statistics: the methods used to find out something about a population, based on a sample.

Note the words “population” and “sample” in the definition of inferential statistics. We often make reference
to the population living in Ethiopia. However, in statistics the word population has a broader meaning. A
population may consist of individuals such as all the students enrolled at Hawassa University, all second year
cooperative students, or all the prisoners at Kaliti Prison. A population may also consist of a group of
measurements, such as all the heights of the cooperatives students in Awada campus. Thus, a population in the
statistical sense of the word does not necessarily refer to people.

Population: A collection of all possible individuals, objects, or measurements of interest.

A population can be finite (limited in size) or infinite (unrestricted). In a finite population, observations are
countable, at least in theory. In contrast, an infinite population is indefinitely large: its observations cannot be
counted even in theory. To infer something about a population, we usually take a sample from the population.

Sample: A portion, or part, of the population of interest.

Parameter: A measurable characteristic of the population, that is, a numerical result obtained by measuring the
population.

Statistic: A measurable characteristic of the sample. In short, it is a sample result.
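
To make the distinction between a parameter and a statistic concrete, here is a minimal sketch. The population of student ages is generated artificially (an assumption made purely for illustration), and the sample size of 50 is arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 500 students (made-up values).
population = [random.randint(18, 30) for _ in range(500)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values are usually close but not identical; the gap between them is the sampling error discussed in Chapter Two.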

1.6 Types of Variables and Measurement Scales


A variable is a characteristic of an object that can take different possible values.
There are two types of variables.
A. Quantitative variables: variables that can be quantified, that is, that take numerical values. A quantitative
variable is a variable with quantitative data. Examples: height, area, income, temperature, etc.
B. Qualitative variables: variables that cannot be quantified directly. A variable that cannot be measured
numerically but can be divided into different categories is called a qualitative or categorical variable. The
corresponding data are qualitative data. For example, the status of an undergraduate college student is a
qualitative variable, since a student can fall into any one of four categories:

Year:      1st        2nd         3rd      4th
Category:  Freshman   Sophomore   Junior   Senior


Other examples: color, beauty, sex, location. Qualitative variables are also called categorical variables. A
categorical variable is a variable with categorical data.
Hence we have two types of data: qualitative (categorical) and quantitative data.
- Data that can be grouped by specific categories are referred to as categorical data. Categorical data use
  either the nominal or ordinal scale of measurement.
- Data that use numerical values to indicate how much or how many are referred to as quantitative data.
  Quantitative data are obtained using either the interval or ratio scale of measurement.
Quantitative variables can be further classified as
- Discrete variables, and
- Continuous variables
a) Discrete variables are variables whose values are counts; all discrete variables have gaps in their
scale of measurement. Discrete variables take on only a finite or countable set of values. Typically, discrete
variables result from counting. Examples: number of students, number of households, family size, number of
pages of a book, number of people visiting a bank on any day, and number of cars in a parking lot. The number
of cars sold in any day is discrete because it must be 0, 1, 2, …; it cannot be between 0 and 1 or between
1 and 2.
b) Continuous variables are variables that can have any value within an interval. Continuous variables
take on an infinite number of values. Typically, continuous variables result from measuring.
Examples: weight, length, volume, temperature and elevation.

Chapter Two
Sampling and Sampling Distributions
Statistics is a science of inference. It is the science of drawing general conclusions about an entire group (the
population) based on information obtained from a small group, or sample. In statistics we are interested in
obtaining information about a total collection of elements, which we refer to as the population. For instance,
the population might be all the residents of a given state, or all the television sets produced last year by a
particular manufacturer. In such cases, we try to learn about the population by choosing a subgroup of its elements.
This subgroup of the population is called a sample.
2.1. Sampling Theory
Sampling theory is the study of the relationships existing between a population and samples drawn from the
population. A sample is a part of the population from which it is selected.


The process of selecting a sample is known as sampling. Complete enumeration, popularly known
as a census, may not be feasible, either because of lack of time or because of the high cost involved.
2.1.1. Basic Definitions
Sampling: The selection of some part of an aggregate or totality on the basis of which a
judgment or inference about the aggregate or totality is made.
Statistic: A measurable characteristic of the sample. It is a sample result.
Parameter: A measurable characteristic of the population. It is a
population result.
Sampling frame: A listing of the items that make up the population, from which the sample is to be chosen.
Sampling design: A definite plan for obtaining a sample from the sampling frame.
2.1.2. The need for samples
It is often not feasible to study the entire population. Some of the major reasons why sampling is necessary are
listed as follows:
A. The destructive nature of certain tests. Many experiments, especially in quality control, require
destroying the output. Consider the following tests:
- Testing wine or coffee
- Testing the strength of light bulbs
- A blood test for a patient
Without sampling, the wine taster would have to drink all the wine and all the light bulbs produced would have
to be destroyed, so nothing would remain for sale. Likewise, all of the patient's blood would have to be drawn,
and the patient would die. Here a sample is a must.
B. The physical impossibility of checking all items in the population.
The populations of fish, birds and other wildlife are large and are constantly moving, being born and dying.
There is no mechanism to contact all items or individual members of such a population.
C. The cost of studying all the items in a population is often prohibitive.
Public opinion polls and consumer testing organizations usually contact only a small number of families out of
millions. Consider a multinational corporation with 50 million customers worldwide that plans a market survey
using a sample of 2000 customers. If it costs 20 birr per customer to mail the questionnaire and tabulate the
response, the sample survey costs 2000 x 20 = 40,000 birr, whereas the same survey of all 50 million customers
would cost about one billion birr.


D. The adequacy of sample results.
Even if funds were available, it is doubtful whether the additional accuracy of a 100% sample, i.e., studying the
entire population, is essential in most problems. To determine a monthly index of food prices (bread, beans, milk,
etc.), it is unlikely that the inclusion of all grocery stores and shops would significantly affect the index, since
the prices of such commodities usually do not vary by more than a few cents from one store to another. Moreover,
100% accuracy cannot always be guaranteed by studying the entire population, because collecting and
analyzing bulk data carries its own chance of error.
E. Contacting the whole population would often be time consuming. A market survey of a sample of 2000
customers may take two or three days of field interviews using regular staff and field interviewers.
Using the same staff and interviewers and working seven days a week, it would take nearly 200 years
to contact 50 million customers.
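
The cost and time figures in reasons C and E can be checked with a few lines of arithmetic. The sketch below reads the 20 birr as a cost per customer contacted and uses three days per round of 2000 interviews; both readings are assumptions about what the text intends.

```python
customers = 50_000_000      # population size from the text
sample_size = 2_000         # sample size from the text
cost_per_contact = 20       # birr; read here as a per-customer cost (assumption)
days_per_batch = 3          # upper end of the "two or three days" in the text

sample_cost = sample_size * cost_per_contact   # cost of the sample survey
census_cost = customers * cost_per_contact     # cost of contacting everyone

batches = customers // sample_size             # number of 2,000-customer rounds
census_years = batches * days_per_batch / 365  # working seven days a week

print(sample_cost, census_cost, round(census_years))
```

Under these assumptions the census costs one billion birr against 40,000 birr for the sample, and takes a little over 200 years, in line with the "nearly 200 years" quoted above.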
Characteristics of a good sample
An ideal sample should have the following basic characteristics.
- Representativeness: An ideal sample must adequately represent the whole population. It should not lack a
  quality found in the whole population.
- Adequacy: The number of units included in the sample should be sufficient to allow conclusions to be drawn
  that apply to the whole population.
- Homogeneity: The elements included in the sample must bear a likeness to one another.
3.2 Sampling and non-sampling errors
The errors involved in the collection, processing and analysis of data may be broadly classified under the
following two heads:
i) Sampling errors, and
ii) Non-sampling errors
Sampling error is the difference between the result of a sample and the result of a census. It is the difference
between the sample estimate and the actual value of the population parameter. Sampling error decreases as the
sample size increases.
Non-sampling error arises from errors in the sampling procedure, and it cannot be
reduced or eliminated by increasing the sample size. Such error occurs because of human mistakes and not
chance variation.
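
The claim that sampling error shrinks as the sample size grows can be seen in a small simulation. The sketch below is only illustrative: the population of household expenditures is artificial, and the helper `average_sampling_error` is a hypothetical name introduced here.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 household expenditures (made-up values).
population = [random.uniform(500, 5000) for _ in range(10_000)]
true_mean = statistics.mean(population)  # what a census would report

def average_sampling_error(n, repeats=200):
    """Average absolute gap between the sample mean and the census mean."""
    gaps = []
    for _ in range(repeats):
        sample = random.sample(population, n)
        gaps.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(gaps)

errors = {n: average_sampling_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"n = {n:>4}: average sampling error = {err:.1f}")
```

A simulation like this says nothing about non-sampling error: a badly worded question or a recording mistake would distort samples of every size equally.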
2.2 Types of samples: random (probability) and non-random (non-probability) samples
There are two broad classifications of sampling techniques: random and non-random.


2.2.1 Random or probabilistic sampling techniques

Probability sampling is the scientific method of selecting samples according to some laws of chance in which
each unit in the population has some definite pre-assigned probability of being selected in the sample.
A. Simple Random Sample
In a simple random sample, every item from a frame has the same chance of selection as every other item.
- In addition, every sample of a fixed size has the same chance of selection as every other sample of that
  size.
- In simple random sampling, you use n to represent the sample size and N to represent the population
  size.
- You number every item in the frame from 1 to N.
The chance that you will select any particular member of the frame on the first selection is 1/N.
Simple random sampling is a method of selecting n units out of a finite population of size N by giving equal
probability to all units, or a sampling procedure in which all possible combinations of n units that may be
formed from the finite population of size N units have the same probability of selection.

There are $\binom{N}{n}$ (also written $^{N}C_{n}$) distinct possible samples in the case of sampling without
replacement; the chance of selecting each one of them is $1/\binom{N}{n}$.

There are $N^{n}$ possible samples in the case of sampling with replacement; the chance of selecting each one of
them is $1/N^{n}$.
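
A quick numerical check of these two counts, using a deliberately small population so the numbers are easy to verify by hand. The values N = 5 and n = 2 are chosen only for illustration.

```python
import math

N, n = 5, 2  # small illustrative population and sample sizes (assumed values)

without_replacement = math.comb(N, n)  # number of distinct unordered samples
with_replacement = N ** n              # number of ordered samples with repeats allowed

print(without_replacement, 1 / without_replacement)
print(with_replacement, 1 / with_replacement)
```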
i) Lottery method
Example: If we want to take a sample of 25 persons out of a population of 150, the procedure is to write the
names of all the 150 persons on separate slips of papers, fold these slips, mix them thoroughly and then make a
blindfold selection of 25 slips without replacement.

ii) Table of random numbers

These numbers are very widely used in all the sampling techniques and have proved to be quite reliable as
regards accuracy and representativeness.
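
Both the lottery method and a table of random numbers can be imitated on a computer. The sketch below mirrors the example of drawing 25 persons out of 150 without replacement; numbering the persons 1 to 150 and fixing the seed are assumptions made so the draw can be reproduced.

```python
import random

random.seed(7)  # fixed seed only so the draw can be reproduced

population_ids = list(range(1, 151))            # slips numbered 1..150
sample_ids = random.sample(population_ids, 25)  # draw 25 without replacement

print(sorted(sample_ids))
```

`random.sample` never returns the same slip twice, which corresponds to the blindfold selection without replacement described above.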

B. Systematic Random Sampling

The items or individuals of the population are arranged in some order (for example, alphabetically).
- A random starting point is selected, and then every kth member of the population is selected for the sample.


- A systematic random sample should not be used if there is a predetermined pattern in the population, for
  example when values are listed in ascending or descending order.
- The procedure starts by determining the first element to be included in the sample: select a unit i
  at random from the first k units of the frame, where k = N/n is the sampling interval. The second unit is the
  (i+k)th element of the frame. In total, a sample of size n from a population of size N consists of the ith,
  (i+k)th, (i+2k)th, …, (i+(n-1)k)th elements of the population.
Example:
- Suppose that N = 20 and we want to select a sample of size 4, so that k = N/n = 20/4 = 5.
- The first element in the sample is selected at random from the first 5 units, say the 3rd, which is the random
  start. Then every 5th unit is selected, and the sample contains the 3rd, 8th, 13th and 18th units of the
  population.
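
The selection rule in this example can be sketched in a few lines. The function name `systematic_sample` and its `start` argument are hypothetical, introduced only for this illustration; the sketch also assumes N is an exact multiple of n.

```python
import random

def systematic_sample(N, n, start=None):
    """Return the 1-based positions chosen by systematic sampling."""
    k = N // n                        # sampling interval
    if start is None:
        start = random.randint(1, k)  # random start within the first k units
    return [start + j * k for j in range(n)]

positions = systematic_sample(N=20, n=4, start=3)
print(positions)
```

With the random start fixed at 3, this gives positions 3, 8, 13 and 18, matching the example; leaving `start` unset picks the start at random.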
C. Stratified Random Sample
A population is first divided into subgroups called strata, and a sample is selected from each stratum. A
stratified sample can be
- proportional to the population, or
- non-proportional.
Stratified sampling has the advantage, in some cases, of more accurately reflecting the characteristics of the
population than simple random or systematic random sampling.

- Proportionate stratified sample: The size of the sample selected from each subgroup is proportional to
  the size of that subgroup in the entire population.
- Disproportionate stratified sample: The size of the sample selected from each subgroup is
  not proportional to the size of that subgroup in the population. For example, equal numbers of elements may
  be selected from each stratum regardless of how the stratum is represented in the population.
- Example: Studying the advertising expenditure of 352 large companies. Profitability percentage is used to
  stratify this population. We need to select a sample of 50 companies.


Stratum   Profitability    Number of    % of total   Number sampled
                           companies
(0)       (1)              (2)          (3)          (4) = 50 x (3)
1         30% and over       8            2            1
2         20-30%            35           10            5
3         10-20%           189           54           27
4         0 up to 10%      115           33           16
5         Deficit            5            1            1
Total                      352          100           50
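
The proportional allocation in the table can be reproduced as follows. The source does not say how fractional allocations were rounded; the sketch assumes a largest-remainder rule, which happens to give the same figures as the table.

```python
counts = [8, 35, 189, 115, 5]   # companies per stratum, from the table
total = sum(counts)             # 352 companies in all
sample_size = 50

# Exact quota for each stratum is sample_size * count / total.
# Integer arithmetic keeps the whole and fractional parts exact.
scaled = [sample_size * c for c in counts]
allocation = [s // total for s in scaled]  # whole part of each quota
remainders = [s % total for s in scaled]   # fractional part (times total)

# Hand the leftover units to the strata with the largest fractional parts
# (assumed rounding rule, not stated in the source).
leftover = sample_size - sum(allocation)
by_remainder = sorted(range(len(counts)), key=lambda i: remainders[i], reverse=True)
for i in by_remainder[:leftover]:
    allocation[i] += 1

print(allocation)
```

Simple rounding of each quota would not guarantee that the allocations add up to exactly 50, which is why a rule of this kind is needed.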
D. Cluster Sampling
The population is divided into small units, called primary units, and certain of these groups or clusters are
then selected at random.
- Often employed to reduce the cost of sampling a population scattered over a large geographic area.
- Data are arranged according to places like continents, regions, and countries.
Example: Assume a study of languages spoken in Ethiopia.

Region   Common language spoken   Sample size selected
1        Tigrigna                 200 people
2        Afar                     120 people
3        Amharic                  400 people
4        Afan Oromo               700 people

2.2.2 Non-random sampling techniques


It is a sampling scheme in which no probability is attached to the selection of a sample unit into
the sample. Rather, the sample is selected with a definite purpose in view, and the choice of the sampling units
depends entirely on the discretion and judgment of the investigator.
- Here, you select the items or individuals without knowing their probabilities of
  selection.
- Because of this, the theory of statistical inference that has been developed for probability sampling
  cannot be applied to non-probability samples.
Non-probability sampling includes the following.
A. Convenience Sampling


Convenience sampling is a non-probability sampling technique. As the name implies, the sample is identified
primarily by convenience.
- Elements are included in the sample without pre-specified or known probabilities of being selected.
B. Judgment Sampling
Another non-probability sampling technique is judgment sampling. In this approach, the person most
knowledgeable on the subject of the study selects elements of the population that he or she feels are most
representative of the population.
- Hence, you get the opinions of preselected experts in the subject matter. Although the experts may be well
  informed, you cannot generalize their results to the population.
C. Quota Sampling
In this technique, quotas are set up according to given criteria, but the sample within each prescribed quota is
selected by the personal judgment of the investigator. It is suitable for market and public opinion surveys where
stratification is very difficult.
- However, it suffers from poor representativeness, as the interviewer may select sample units that are
  convenient with regard to location.
It is a combination of judgment and stratified sampling methods, so it enjoys the merits of both.
Generally, non-probability samples can have certain advantages, such as convenience, speed, and low cost. However,
their lack of accuracy due to selection bias and the fact that the results cannot be used for statistical inference more
than offset these advantages.


Chapter Three
Hypothesis Testing
3.1. Basic concepts
- A hypothesis is a statement about the value of a population parameter developed for the purpose of testing;
  more generally, a hypothesis is an assertion or tentative solution.
- Hypothesis testing is a procedure based on sample evidence and probability distributions, used to determine
  whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and
  should be rejected.
There are two types of hypothesis:
1) The null hypothesis is an assertion that a population parameter assumes a fixed value. It always
includes the equality sign, and is denoted by H0. The null hypothesis is often established in such a way
that it states that “nothing is different” from what it is supposed to be, is claimed to be, or has been in the
past.
2) The alternative hypothesis describes what you will conclude if you reject the null hypothesis. It is
written as H1 and is read “H sub-one”. It is also referred to as the research hypothesis. The alternative
hypothesis is accepted if the sample data provide us with statistically significant evidence that the null
hypothesis is false.

3.2. Steps in Hypothesis testing

There is a five-step procedure that systematizes hypothesis testing.
Step I: Identify the null hypothesis and the alternate hypothesis
- The first step is to state the hypothesis to be tested. It is called the null hypothesis, designated H0
  and read “H sub-zero”. The capital letter H stands for hypothesis and the subscript zero implies “no
  difference” or “no change”.
- The alternate hypothesis is a statement that describes what we will believe if we reject the null hypothesis.
  It is designated H1 (“H sub-one”). The alternate hypothesis will be accepted if the sample data provide us
  with evidence that the null hypothesis is false.
Step II: Determine the level of significance
- After setting up the null hypothesis and alternate hypothesis, the next step is to state the level of
  significance. It is the probability of rejecting the null hypothesis when it is actually true.


 Level of significance is the risk we assume of rejecting the null hypothesis when it is a actually true.
 The level of significance is designated by the Greek letter alpha, ,, it is also referred to as the level of
risk
Step III: Find the Test statistic
 There are many test statistics, Z (the normal distribution), the student t test, F, and X2or the chi –square.
 Test statistic – A value, determined from sample information, used to reject or not to reject the null
hypothesis.
 The standard normal deviate, Z distribution is used as test statistic when the sample size is large, n  30.
 In hypothesis testing the test static Z is computed by
xN
Z=

n
Step IV: Determine the decision rule
 A decision rule is a statement of the conditions under which the null hypothesis is rejected and the
conditions under which it is not rejected.
 The critical value separates the critical region from the noncritical region. The symbol for critical
value is C.V.
 The critical or rejection region is the range of values of the test value that indicates that there is a
significant difference and that the null hypothesis should be rejected.
 The noncritical or non-rejection region is the range of values of the test value that indicates that
the difference was probably due to chance and that the null hypothesis should not be rejected.
Step V: Take a sample and make a decision
 At this step a decision is made to reject or not to reject the null hypothesis.
5.3. Type I and type II errors (concepts)
In hypothesis testing, there are two possible kinds of errors called type I error and type II error.
1. Type I error: the error committed in rejecting the null hypothesis when it is actually true. The
probability of a Type I error is denoted by α and is called the level of significance; i.e., the level of
significance is the probability of rejecting the null hypothesis when it is actually true.
 The level of significance is also referred to as the level of risk. This may be a more appropriate
term because it is the risk you take of rejecting the null hypothesis when it is really true.
 There is no unique level of significance; it depends upon the choice of the researcher.


 The researcher must decide on the level of significance before formulating a decision rule and
collecting sample data.
 There are two commonly used levels of significance: 0.05 and 0.01.

2. Type II error: the error committed in not rejecting (accepting) the null hypothesis when it is actually false.
 The probability of a Type II error is designated by the Greek letter beta (β).
 We often refer to these two possible errors as the alpha error, α, and the beta error, β. Alpha (α) is
the probability of making a Type I error, and beta (β) is the probability of making a Type II error.
 Notice that there are two possibilities for a correct decision and two possibilities for an incorrect
decision.
Possible outcomes of a hypothesis test

                      H0 true             H0 false
Reject H0             Type I error        Correct decision
Do not reject H0      Correct decision    Type II error
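The two error probabilities can be made concrete with a short calculation. The following is a minimal Python sketch, not part of the original note, for a right-tailed Z test with σ known. The function name and all the numbers (hypothesized mean 100, true mean 103, σ = 10, n = 36) are hypothetical and chosen only for illustration; β can be computed only after assuming a specific true mean.

```python
from math import sqrt
from statistics import NormalDist

def error_probabilities(mu_0, mu_true, sigma, n, alpha):
    """Return (P(Type I error), P(Type II error)) for a right-tailed Z test.

    mu_true is an assumed true population mean; it is needed only for beta.
    """
    se = sigma / sqrt(n)                        # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha)    # right-tail critical value
    x_crit = mu_0 + z_crit * se                 # reject H0 if the sample mean exceeds this
    type_1 = 1 - NormalDist(mu_0, se).cdf(x_crit)   # rejecting H0 although it is true
    type_2 = NormalDist(mu_true, se).cdf(x_crit)    # not rejecting H0 although it is false
    return type_1, type_2

# Hypothetical illustration values.
alpha_hat, beta_hat = error_probabilities(mu_0=100, mu_true=103, sigma=10, n=36, alpha=0.05)
print(f"P(Type I error)  = {alpha_hat:.3f}")
print(f"P(Type II error) = {beta_hat:.3f}")
```

By construction the Type I error probability equals the chosen α, while β shrinks as the assumed true mean moves further from the hypothesized mean or as n grows.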

Two-tailed and one-tailed tests


 A one-tailed test indicates that the null hypothesis should be rejected when the test value is in the
critical region on one side of the mean. A one-tailed test is either a right tailed test or left-tailed
test, depending on the direction of the inequality of the alternative hypothesis.
 In a two-tailed test, the null hypothesis should be rejected when the test value is in either of the two
critical regions.
 For hypothesis tests involving a population mean, we let μ0 denote the hypothesized value and we
must choose one of the following three forms for the hypothesis test.
 The three possible forms of the hypotheses H0 and H1 are shown here. Note that the equality always
appears in the null hypothesis H0.

Two-tailed test       Right-tailed test     Left-tailed test
H0: μ = μ0            H0: μ ≤ μ0            H0: μ ≥ μ0
H1: μ ≠ μ0            H1: μ > μ0            H1: μ < μ0

Finding critical values for α


Find critical and noncritical regions for α = 0.01 (left-tailed test)
[Figure: standard normal curve. The critical region lies to the left of z = −2.33; the noncritical region lies to its right.]

Find critical and noncritical regions for α = 0.01 (right-tailed test)
[Figure: standard normal curve. The critical region lies to the right of z = +2.33; the noncritical region lies to its left.]

Find critical and noncritical regions for α = 0.01 (two-tailed test)
For a two-tailed test, the critical region must be split into two equal parts. If α = 0.01, then one-half of the
area, or 0.005, must be in the right tail and one-half must be in the left tail.
[Figure: standard normal curve. The critical regions lie to the left of z = −2.58 and to the right of z = +2.58; the noncritical region lies between them.]
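The critical values above come from the standard normal table. As a cross-check, here is a minimal Python sketch that obtains them from the inverse of the standard normal cumulative distribution. It assumes Python 3.8 or later (for `statistics.NormalDist`); the function name `critical_values` is a hypothetical helper, not something defined in this note.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return a tuple with the z critical value(s) for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split equally between the tails
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for tail in ("left", "right", "two"):
    print(tail, [round(c, 2) for c in critical_values(0.01, tail)])
# left [-2.33]
# right [2.33]
# two [-2.58, 2.58]
```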

3.5. Hypothesis testing for population means and proportion


The mean of a population can be tested under different situations: the population may be normal or
non-normal, the sample size may be large or small, the standard deviation of the population may be
known or unknown, and the alternative hypothesis may be two-sided or one-sided.
Methods for testing hypotheses for population means and proportions
There are two methods of testing hypotheses for population means and proportions:
A. Critical value method
Steps for testing hypothesis by using critical value method
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.


Step 3 Compute the test value.


Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results (Interpretation).
B. P-value method
The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at
least as extreme as the one representing the sample data, assuming that the null hypothesis is true.
Steps for testing hypothesis by using p-value method
Step 1 State the hypotheses and identify the claim.
Step 2 Compute the test value.
Step 3 Find the P-value.
Step 4 Make the decision
Step 5 Summarize the results.
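Step 3 of the P-value method depends on the direction of the test. The following minimal Python sketch shows how the tail area could be computed from a Z test value instead of a printed table. The function name `p_value` and the test value z = 2.0 are hypothetical, chosen only to illustrate the three cases.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value for a Z test value under a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)                    # area to the left of z
    if tail == "right":
        return 1 - std_normal.cdf(z)                # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))     # one-tail area, doubled
    raise ValueError("tail must be 'left', 'right', or 'two'")

z = 2.0  # hypothetical test value
for tail in ("left", "right", "two"):
    print(f"{tail}-tailed P-value: {p_value(z, tail):.4f}")
```

In Step 4 the null hypothesis is rejected when the P-value is less than or equal to α.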

3.3. 1. Hypothesis testing for population means


When testing a hypothesis about the population mean, we consider two cases: the population standard
deviation is known, or it is unknown.

Hypothesis testing for the population mean: σ known case


 The σ known case corresponds to applications in which historical data and/or other information are
available that enable us to obtain a good estimate of the population standard deviation prior to sampling.
 In such cases the population standard deviation can, for all practical purposes, be considered known.
 In this section we show how to conduct a hypothesis test about a population mean for the σ known case.
Many hypotheses are tested using a statistical test based on the following general formula:

Test statistic = (observed value − expected value) / standard error

 The observed value is the statistic (such as the mean) that is computed from the sample data.
 The expected value is the parameter (such as the mean) that you would expect to obtain if the null
hypothesis were true; in other words, the hypothesized value.
 The denominator is the standard error of the statistic being tested (in this case, the standard error of the
mean).
The z test is defined formally as follows.
 The z test is a statistical test for the mean of a population. It can be used when n ≥ 30, or when the
population is normally distributed and σ is known. The formula for the z test is


$$ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} $$

Where: Z is the test value, referred to the standard normal distribution
       X̄ is the sample mean
       μ0 is the hypothesized mean
       σ is the population standard deviation
       n is the sample size.
Example 1: A researcher reports that the average salary of assistant professors is more than $42,000. A sample
of 30 assistant professors has a mean salary of $43,260. At α = 0.05, test the claim that assistant
professors earn more than $42,000 per year. The standard deviation of the population is $5230. Test
the hypothesis using both the critical value method and the P-value method.
Solution
Critical value method
Step 1 State the hypotheses and identify the claim.
H0: μ = $42,000 and H1: μ > $42,000 (claim)
Step 2 Find the critical value. Since α = 0.05 and the test is a right-tailed test, the critical value is z = +1.65.
Step 3 Compute the test value.

$$ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{\$43{,}260 - \$42{,}000}{\$5230 / \sqrt{30}} = 1.32 $$
Step 4 Make the decision. Since the test value, +1.32, is less than the critical value, +1.65, and is not in the
critical region, the decision is to not reject the null hypothesis.
Step 5 Summarize the results. There is not enough evidence to support the claim that assistant professors earn
more on average than $42,000 per year.
P-value method
Step 1 State the hypotheses and identify the claim.
H0: μ = $42,000 and H1: μ > $42,000 (claim)
Step 2 Compute the test value.

$$ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{\$43{,}260 - \$42{,}000}{\$5230 / \sqrt{30}} = 1.32 $$

Step 3 Find the P-value. Find the area under the standard normal curve between 0 and z = 1.32. It is 0.4066.
Subtract this area from 0.5 to find the area in the right tail: 0.5 − 0.4066 = 0.0934.


Hence the P-value is 0.0934


Step 4 Make the decision. The decision is to not reject the null hypothesis, since the P-value is greater than
0.05.
Step 5 Summarize the results. There is not enough evidence to support the claim that assistant professors earn
more on average than $42,000 per year.
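The arithmetic of Example 1 can be checked with a few lines of Python. This is a minimal sketch using only the values given in the problem statement (sample mean $43,260, hypothesized mean $42,000, σ = $5230, n = 30, α = 0.05); the variable names are illustrative. Note that software returns the critical value as about 1.645, whereas the note uses the table value 1.65; the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the statement of Example 1.
x_bar, mu_0, sigma, n, alpha = 43260, 42000, 5230, 30, 0.05

std_normal = NormalDist()
z = (x_bar - mu_0) / (sigma / sqrt(n))     # test value
z_crit = std_normal.inv_cdf(1 - alpha)     # right-tailed critical value (about 1.645)
p_value = 1 - std_normal.cdf(z)            # area in the right tail beyond z

reject_by_critical_value = z >= z_crit
reject_by_p_value = p_value <= alpha

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, P-value = {p_value:.4f}")
print("Reject H0 (critical value method)?", reject_by_critical_value)
print("Reject H0 (P-value method)?", reject_by_p_value)
```

Both methods lead to the same decision: do not reject H0.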

Example 2: A researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of 32
days has an average wind speed of 8.2 miles per hour. The standard deviation of the population is
0.6 mile per hour. At α = 0.05, is there enough evidence to reject the claim? Use the P-value
method.
Solution
Step 1 State the hypotheses and identify the claim.
H0: μ = 8 (claim) and H1: μ ≠ 8
Step 2 Compute the test value.

$$ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{8.2 - 8}{0.6 / \sqrt{32}} = 1.89 $$

Step 3 Find the P-value. Find the cumulative area for z = 1.89. It is 0.9706 (0.5 + 0.4706 = 0.9706). Subtract
this value from 1.0000: 1.0000 − 0.9706 = 0.0294.
Since this is a two-tailed test, the area of 0.0294 must be doubled to get the P-value: 2(0.0294) = 0.0588.
Step 4 Make the decision. The decision is to not reject the null hypothesis, since the P-value is greater than
0.05.
Step 5 Summarize the results. There is not enough evidence to reject the claim that the average wind speed is 8
miles per hour.
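Example 2 can be verified the same way. The sketch below is a minimal Python check using the values from the problem statement; the variable names are illustrative. Because the code does not round z to 1.89 before looking up the area, its P-value (about 0.059) differs slightly from the table-based 0.0588, but the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the statement of Example 2.
x_bar, mu_0, sigma, n, alpha = 8.2, 8.0, 0.6, 32, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))       # test value
one_tail = 1 - NormalDist().cdf(abs(z))      # area beyond |z| in one tail
p_value = 2 * one_tail                       # two-tailed test: double the tail area
reject = p_value <= alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0? {reject}")
```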


Summary of hypothesis tests about a population mean: σ known case

                          Left-tailed test          Right-tailed test         Two-tailed test
Hypotheses                H0: μ ≥ μ0                H0: μ ≤ μ0                H0: μ = μ0
                          H1: μ < μ0                H1: μ > μ0                H1: μ ≠ μ0
Test statistic            Z = (X̄ − μ0)/(σ/√n)       Z = (X̄ − μ0)/(σ/√n)       Z = (X̄ − μ0)/(σ/√n)
Rejection rule:
  P-value approach        Reject H0 if p-value ≤ α  Reject H0 if p-value ≤ α  Reject H0 if p-value ≤ α
  Critical value approach Reject H0 if z ≤ −zα      Reject H0 if z ≥ zα       Reject H0 if z ≤ −zα/2 or z ≥ zα/2
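The summary above can be collected into one small routine. The following is a minimal Python sketch under the σ known assumption; the function name `z_test`, its dictionary keys, and the illustrative numbers in the usage lines (sample mean 2.92, hypothesized mean 3.0, σ = 0.18, n = 36, α = 0.01) are hypothetical and do not come from this note.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha, tail):
    """Z test for a population mean (sigma known); tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    if tail == "left":
        p = std_normal.cdf(z)
        reject_cv = z <= std_normal.inv_cdf(alpha)               # z <= -z_alpha
    elif tail == "right":
        p = 1 - std_normal.cdf(z)
        reject_cv = z >= std_normal.inv_cdf(1 - alpha)           # z >= z_alpha
    elif tail == "two":
        p = 2 * (1 - std_normal.cdf(abs(z)))
        reject_cv = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)  # |z| >= z_(alpha/2)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return {
        "z": z,
        "p_value": p,
        "reject_by_p_value": p <= alpha,
        "reject_by_critical_value": reject_cv,
    }

# Hypothetical left-tailed illustration.
result = z_test(x_bar=2.92, mu_0=3.0, sigma=0.18, n=36, alpha=0.01, tail="left")
print(result)
```

The two rejection rules are equivalent, so the two boolean entries of the returned dictionary should always agree.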
