Business Research Methods: Unit 3 Scaling & Measurement Techniques
Unit 3
Scaling & measurement techniques
Three fresh MBAs joined a consulting company. The first assignment given to them was
to design and conduct a study to compare the perception of the patrons of Domino’s
Pizza with Pizza Hut. As the first step, they conducted exploratory research by
informally talking to the management of both the pizza joints. They also conducted
three focus groups so as to gain insight into what the consumers are actually looking for
while buying pizza. The output of the unstructured interviews and focus groups
resulted in identifying various information needs that could be used in designing the
relevant questionnaire. Some of the relevant information was on gender, age, income,
frequency and occasion of eating pizza, ranking of the attributes that are sought while
choosing pizza joints, and comparative perceptions of Domino’s and Pizza Hut. This
information was to be employed in designing the questionnaire.
One question that came into the minds of the three MBAs was how to measure the
attitude and analyse the information thus obtained from the survey. For this, it was
necessary to assign numbers or symbols to the characteristics of the objects.
Assignment of numbers permits a statistical analysis of the data. The numbers assigned
and the subsequent analysis could be different, depending upon the type of question
asked. On one hand, there can be questions used to measure different psychological
aspects such as attitude, perception, image and preference of people with the help of a
certain pre-defined set of stimuli. On the other hand, there can be questions on gender,
marital status, ranking preference for different flavors, income and age.
What is Measurement?
Situation: Situational factors may also come in the way of correct measurement. Any condition which places a
strain on the interview can have serious effects on the interviewer-respondent rapport. For instance, if someone
else is present, he can distort responses by joining in or merely by being present. If the respondent feels that
anonymity is not assured, he may be reluctant to express certain feelings.
Measurer: The interviewer can distort responses by rewording or reordering questions. His behaviour, style
and looks may encourage or discourage certain replies from respondents. Careless mechanical processing may
distort the findings. Errors may also creep in because of incorrect coding, faulty tabulation and/or statistical
calculations, particularly in the data-analysis stage.
Instrument: Error may arise because of a defective measuring instrument. The use of complex words,
beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate space for
replies, response choice omissions, etc. are a few things that make the measuring instrument defective and
may result in measurement errors. Another type of instrument deficiency is the poor sampling of the universe
of items of concern.
The researcher must know that correct measurement depends on successfully dealing with all of the problems listed
above. He must, to the extent possible, try to eliminate, neutralize or otherwise deal with all the possible
sources of error so that the final results are not contaminated.
Reliability
It is the characteristic of a set of test scores that relates to the amount of random error from the
measurement process that might be embedded in the scores.
Scores that are highly reliable are accurate, reproducible, and consistent from one testing
occasion to another.
That is, if the testing process were repeated with a group of test takers, essentially the same results
would be obtained.
Validity
The validity of a measurement tool (for example, a test in education) is considered to be the degree
to which the tool measures what it claims to measure; in this case, validity is equivalent to accuracy.
Validity is important because it can help determine what types of tests to use, and help to make sure
researchers are using methods that are not only ethical and cost effective, but also truly
measure the idea or construct in question.
Nominal scale
A Hindu may be assigned a number 1, a Sikh may be assigned a number 2, a Christian may be
assigned a number 3 and so on. Any religion which is assigned a higher number is in no way
superior to the one which is assigned a lower number. The assignment of numbers is only for
the purpose of identification.
- Nominal scale measurements are used for identifying food habits (vegetarian
or non-vegetarian), gender (male/female), caste, respondents, brands, attributes,
stores, the players of a hockey team and so on.
- The only arithmetic operation that can be carried out is a count of the members of each
category.
- Therefore, a frequency distribution table can be prepared for the nominal scale
variables and mode of the distribution can be worked out.
- One can also use chi square test and compute contingency coefficient using
nominal scale variables.
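The operations listed above for nominal data can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the ten coded responses are hypothetical, the helper name `chi_square` is an assumption, and the sample is far too small for a real chi-square test.

```python
from collections import Counter
from math import sqrt

# Hypothetical nominal data for 10 respondents (labels only, no order implied).
food_habit = ["veg", "non-veg", "veg", "veg", "non-veg",
              "veg", "non-veg", "veg", "veg", "non-veg"]
gender = ["M", "M", "F", "F", "M", "F", "M", "F", "M", "F"]

# Frequency distribution and mode: the only summaries a nominal scale supports.
freq = Counter(food_habit)
mode_category, mode_count = freq.most_common(1)[0]

def chi_square(rows, cols):
    """Pearson chi-square statistic for two nominal variables."""
    n = len(rows)
    observed = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

chi2 = chi_square(food_habit, gender)
# Contingency coefficient: C = sqrt(chi2 / (chi2 + n))
contingency_coefficient = sqrt(chi2 / (chi2 + len(food_habit)))

print(dict(freq), mode_category)
print(round(chi2, 3), round(contingency_coefficient, 3))
```

Note that the mean of the category codes is never computed: averaging labels such as 1 = Hindu and 2 = Sikh would be meaningless.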
Ordinal scale
- This is the next higher level of measurement than the nominal scale measurement.
- One of the limitations of the nominal scale measurement is that the numbers carry no order:
we cannot say whether an object assigned one number ranks higher or lower than an object
assigned another. The ordinal scale measurement takes care of this
limitation.
- An ordinal scale tells us the relative positions of the objects and not the difference
between the magnitudes of the objects. Suppose Shashi scores the highest marks
in marketing and is ranked no. 1; Mohan scores the second highest marks and is
ranked no. 2; and Krishna scores third highest marks and is ranked no. 3.
- However, from this statement we cannot say whether the difference in the marks
scored by Shashi and Mohan is the same as between Mohan and Krishna. The only
statement which can be made under ordinal scale is that Shashi has scored higher
than Mohan and Mohan has scored higher than Krishna.
Rank the following attributes while choosing a restaurant for dinner. The most important
attribute may be ranked 1, the next important may be assigned a rank 2 and so on:
Attribute       Rank
Food Quality    2
Prices          4
Menu Variety    3
Ambience        1
Service         5
- In the ordinal scale, the assigned ranks cannot be added, multiplied,
subtracted or divided.
- The other major statistical analyses which can be carried out are the rank
order correlation coefficient and the sign test.
- As interval scale data can be converted into ordinal and
nominal scale data, all the techniques applicable to
ordinal and nominal scale data can also be used for interval
scale data.
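The rank order correlation coefficient mentioned for ordinal data can be illustrated with a short sketch. It uses Spearman's formula for untied ranks. The first list reuses the restaurant-attribute ranks given earlier; the second respondent's ranks and the function name `spearman_rho` are assumptions made only for illustration.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank correlation for untied ranks:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Order: Food Quality, Prices, Menu Variety, Ambience, Service
respondent_a = [2, 4, 3, 1, 5]  # ranks from the example above
respondent_b = [1, 3, 4, 2, 5]  # hypothetical second respondent

rho = spearman_rho(respondent_a, respondent_b)
print(round(rho, 2))  # 0.8
```

The formula uses only the order of the ranks and never treats a rank as a measured quantity, which is why it is suited to ordinal data.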
Ratio scale
- All the mathematical operations can be carried out using the ratio scale data.
- In addition to the statistical analyses mentioned for the interval, the ordinal and
the nominal scale data, one can compute the coefficient of variation, geometric
mean and harmonic mean using the ratio scale measurement.
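The three ratio-scale statistics named above can be computed with Python's standard `statistics` module. The income figures are hypothetical, and the use of the sample standard deviation in the coefficient of variation is an assumption, since the text does not say which one is meant.

```python
import statistics

# Hypothetical monthly incomes (in thousand rupees); income has a true zero,
# so it is a ratio scale variable.
incomes = [20, 40, 80]

mean_income = statistics.mean(incomes)
sd_income = statistics.stdev(incomes)  # sample standard deviation (assumption)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv_percent = sd_income / mean_income * 100

geometric_mean = statistics.geometric_mean(incomes)
harmonic_mean = statistics.harmonic_mean(incomes)

print(round(cv_percent, 2), round(geometric_mean, 2), round(harmonic_mean, 2))
```

These measures rely on ratios of values, so they are meaningful only when the scale has a true zero point.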
ATTITUDE
- A company is able to sell its products or services when its customers have a
favorable attitude towards its products/services.
- In the reverse scenario, the company will not be able to sustain itself for long.
- It, therefore, becomes very important to measure the attitude of the customers
towards the company’s products/services.
An attitude has three components:
1. Cognitive
2. Affective
3. Intention
Cognitive component
- This component represents an individual’s information and knowledge
about an object.
- It includes awareness of the existence of the object, beliefs about the
characteristics or attributes of the object and judgment about the
relative importance of each of the attributes.
- In a survey, if the respondents are asked to name the companies
manufacturing plastic products, some respondents may remember
names like Tupperware, Cello and Pearl Pet. This is called unaided recall
awareness.
- More names are likely to be remembered when the investigator makes
a mention of them. This is aided recall.
- It may be noted that the knowledge may not be limited only to the
awareness.
- An individual can form beliefs or judgments about the characteristics or
attributes of the plastic products manufacturing companies through
advertisements, word of mouth, peer groups, etc.
- The examples of such beliefs could be that the products of Tupperware
are of high quality, non-toxic and can be used in parties.
Affective component
- The affective component summarizes a person’s overall feeling or
emotions towards the objects.
- The examples for this component could be: the food cooked in a
pressure cooker is tasty, taste of orange juice is good or the taste
of bitter gourd is very bad.
- If there are a number of alternatives to choose from, liking is
expressed in terms of preference for one alternative over the
other. Among the various soft drinks like Pepsi, Coke, Limca and
Sprite, the respondents might have to indicate the most
preferred soft drinks, the second preferred one and so on.
- This is an example of the affective component.
- The other example could be that the plastic products produced
by Pearl Pet are cheaper than Tupperware products; however,
the quality of Tupperware products is better than that of Pearl
Pet.
Intention or action component
- This component of an attitude, also called the behavioral component, reflects a
predisposition to an action by reflecting the consumer’s buying or purchase
intention.
- It also reflects a person’s expectations of future behavior towards an object.
- How likely a person is to buy a designer carpet may range from most likely to
not at all likely, reflecting the purchase intentions.
- However, when one is talking about the purchase intentions, a time horizon has
to be kept in mind as the intentions may undergo a change over time.
- The intentions incorporate information regarding the respondent’s willingness
to pay for the product.
- There is a relationship between attitude and behavior. If a consumer does not
have a favorable attitude towards the product, he/she will certainly not buy the
product. However, having a favorable attitude does not mean that it would be
reflected in the purchase behavior.
- E.g., Mercedes Benz: a consumer may have a favorable attitude towards the brand and still not buy the car.
- Therefore, a favorable attitude is a
necessary condition for the purchase of the product but it is not a sufficient
condition.
CLASSIFICATION OF SCALES
1. Single Item vs Multiple Item Scale
Single item scale: In the single item scale, there is only one item to measure
a given construct. For example:
Consider the following question:
How satisfied are you with your current job?
Very Dissatisfied
Dissatisfied
Neutral
Satisfied
Very satisfied
The problem with the above question is that there are many aspects to a
job, like pay, work environment, rules and regulations, security of job and
communication with the seniors.
The respondent may be satisfied with some of these factors but not with
others.
By asking a question as stated above, it will be difficult to analyse the
problem areas.
To overcome this problem, a multiple item scale is proposed.
Multiple item scale
In a multiple item scale, there are many items that play a role in forming the underlying construct that the researcher
is trying to measure. This is because each of the items forms some part of the construct (satisfaction) which the
researcher is trying to measure.
As an example, some of the following questions may be asked in a multiple item scale.
1. How satisfied are you with the pay you are getting on your current job?
Very dissatisfied
Dissatisfied
Neutral
Satisfied
Very satisfied
2. How satisfied are you with the rules and regulations of your organization?
Very dissatisfied
Dissatisfied
Neutral
Satisfied
Very satisfied
3. How satisfied are you with the job security in your current job?
Very dissatisfied
Dissatisfied
Neutral
Satisfied
Very satisfied
2. Comparative vs Non-comparative Scales
Comparative Scales
Rank order scaling: In the rank order scaling, respondents are presented with several objects
simultaneously and asked to order or rank them according to some criterion. Consider, for
example, the following question:
• Rank the following soft drinks in order of your preference, the most preferred soft drink
should be ranked one, the second most preferred should be ranked two and so on.
Limca
Coke
Pepsi
Sprite
7UP
Fanta
Like paired comparison, this approach is also comparative in nature. The problem with this
scale is that if a respondent does not like any of the above-mentioned soft drinks and is forced
to rank them in the order of his choice, then the soft drink which is ranked one should be
treated as the least disliked soft drink, and the other rankings are interpreted similarly.
This scale is very commonly used to measure preferences for brands as well as attributes.
Constant sum rating scaling: In constant sum rating scale, the respondents are asked to allocate a
total of 100 points between various objects and brands. The respondent distributes the points to
the various objects in the order of his preference. Consider the following example:
• Allocate a total of 100 points among the various schools into which you would like to admit
your child. The more points you allocate to a school, the more preferred it is considered to be.
The points should be allocated in such a way that the sum total of the points allocated to
various schools adds up to 100.
School                    Points
DAV                       30
APPEJAY                   5
Modern                    5
Mother’s International    15
Laxman Public School      4
Tagore International      6
Amity                     35
Total Points              100
This type of data is not only comparative in nature but could also result in ratio scale measurement.
This type of scale is widely used in allocating weights which the consumer may assign to the various
attributes of a product.
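A constant sum response can be checked and interpreted as in the following sketch. The point values are taken from the school example above; the function name `is_valid_allocation` and the validation rule (non-negative points that add up to exactly 100) are assumptions about how a researcher might screen responses.

```python
# Points allocated by one respondent, as in the school example above.
points = {
    "DAV": 30,
    "APPEJAY": 5,
    "Modern": 5,
    "Mother's International": 15,
    "Laxman Public School": 4,
    "Tagore International": 6,
    "Amity": 35,
}

def is_valid_allocation(allocation, total=100):
    """A response is usable only if no points are negative and they add up to the total."""
    values = allocation.values()
    return all(p >= 0 for p in values) and sum(values) == total

# Comparative reading: order the schools from most to least preferred.
ranked = sorted(points, key=points.get, reverse=True)

# Ratio reading: DAV receives twice the points of Mother's International.
ratio_dav_to_mothers = points["DAV"] / points["Mother's International"]

print(is_valid_allocation(points), ranked[0], ratio_dav_to_mothers)
```

The ratio in the last step is what distinguishes this scale from simple ranking: the points say how much more one school is preferred, not just which one comes first.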
Q-sort technique:
- The Q-sort technique was developed to discriminate among a
large number of objects quickly.
- This technique makes use of the rank order procedure in
which objects are sorted into different piles based on their
similarity with respect to a certain criterion.
- Suppose there are 100 statements and an individual is asked
to pile them into five groups in such a way that the strongly
agreed statements are put in one pile, agreed statements
in another pile, neutral statements form the third
pile, disagreed statements come in the fourth pile and
strongly disagreed statements form the fifth pile.
The data generated in this way would be ordinal in nature.
Non-comparative Scales
• The continuous rating scale is also called the graphic rating scale. In the graphic rating scale the
respondent is asked to tick his preference on a graph. Consider, for example, the following
question:
• Please put a tick mark (✓) on the following line to indicate your preference for fast food.
Itemized rating scale
- In the itemized rating scale, the respondents are provided with a scale that has a number of
brief descriptions associated with each of the response categories.
- The response categories are ordered in terms of the scale position and the respondents are
supposed to select the category that best describes the object being
rated.
- Itemized rating scales are widely used in survey research. There are certain issues that
should be kept in mind while designing the itemized rating scale.
Very happy
Happy
Neither happy nor sad
Sad
Very sad
Balanced versus unbalanced scales: A balanced scale is
one which has an equal number of favourable and unfavourable
categories. The following is an example of a balanced
scale:
• How important is price to you in buying a new car?
Very important
Relatively important
Neither important nor unimportant
Relatively unimportant
Very unimportant
In this question, there are five response categories, two of which emphasize the importance of
price and two others that do not show its importance. The middle category is neutral.
1. Likert scale:
- This is a multiple item agree–disagree five-point scale.
- The respondents are given a certain number of items (statements) on
which they are asked to express their degree of agreement /
disagreement.
- This is also called a summated scale because the scores on individual
items can be added together to produce a total score for the
respondent.
- In a typical research study, there are generally 25 to 30 items on a Likert
scale.
- To construct a Likert scale to measure a particular construct, a large
number of statements pertaining to the construct are listed. These
statements could range from 80 to 120. The identification of the
statements is done through exploratory research which is carried out by
conducting a focus group, unstructured interviews with knowledgeable
people, literature survey, analysis of case studies and so on.
For example, if a respondent has ticked (✓) statements numbering from one to
ten as shown in Table 7.5, his total score would be 3 + 5 + 4 + 4 + 5 + 4 + 4 + 5 + 4
+ 4 = 42 out of 50. Now if there are 100 respondents and 100 statements, the
score on the image of the company can be worked out for each respondent by
adding his/her scores on the 100 statements. The minimum score for each
respondent will be 100, whereas the maximum score would be 500.
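The summated score described above can be reproduced with a short sketch. The ten item scores are those given in the worked example; the function names `summated_score` and `score_range` are assumptions, and the sketch assumes every item is scored from 1 to 5 in the same direction.

```python
def summated_score(item_scores, low=1, high=5):
    """Total Likert score for one respondent: the sum of the item scores."""
    for score in item_scores:
        if not low <= score <= high:
            raise ValueError(f"item score {score} is outside {low}..{high}")
    return sum(item_scores)

def score_range(n_items, low=1, high=5):
    """Minimum and maximum possible total for a scale with n_items items."""
    return n_items * low, n_items * high

# Scores ticked on the ten statements in the example above.
responses = [3, 5, 4, 4, 5, 4, 4, 5, 4, 4]

print(summated_score(responses))  # 42
print(score_range(10))            # (10, 50)
print(score_range(100))           # (100, 500)
```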
2. Semantic differential scale:
- This scale is widely used to compare the images of competing brands, companies
or services.
- Here the respondent is required to rate each attitude or object on a number of
five- or seven-point rating scales.
- This scale is bounded at each end by bipolar adjectives or phrases.
- The difference between Likert and Semantic differential scale is that in Likert
scale, a number of statements (items) are presented to the respondents to
express their degree of agreement/disagreement.
- However, in the semantic differential scale, bipolar adjectives or phrases are
used.
- As in the case of Likert scale, the information on the phrases and adjectives is
obtained through exploratory research.
- At times the favourable descriptor (adjective) may be placed on
the right-hand side and on other occasions on the left-hand side.
- This rotation is necessary to avoid the halo effect, in which the
location of previous judgments on the scale influences subsequent
judgments because of the carelessness of the respondent. The midpoint of a
bipolar scale is a neutral point.
3. Stapel scale
- The Stapel scale is used to measure the direction and intensity of
an attitude.
- At times, it may be difficult to use semantic differential scales
because of the problem in creating bipolar adjectives.
- This scale generally has 10 categories numbered from –5 to
+5 without a neutral point and is usually presented in a vertical
form.
- The job of the respondent is to indicate how accurately or
inaccurately each term describes the object by selecting an
appropriate numerical response category. A higher positive
number means that the respondent considers the term to describe
the object more accurately.
MEASUREMENT ERROR
1. There are factors like mood, fatigue and health of the respondent which may influence
the observed response while the instrument is being administered.
2. The variations in the environment in which measurements are taken may also result in a
departure from the true value.
3. There are situations when a respondent may not understand the question being asked
and the interviewer may have to rephrase it. While rephrasing the question the
interviewer’s bias may get into the responses. Also, how the questionnaire is
administered (telephone survey, personal interview with questionnaire or mail survey)
will have its own impact on the responses.
4. At times, some of the questions in the questionnaire may be ambiguous and some may
be very difficult for the respondents to understand. Both of them can cause deviation
from the correct response, thereby giving rise to measurement error.
5. At times, the errors may be committed at the time of coding, entering of data from
questionnaire to the spreadsheet on the computer and at the tabulation stage.
Criteria for Good Measurement
There are three criteria for evaluating measurements: reliability, validity and sensitivity.
1. Reliability
• Split-half reliability method: This method is used in the case of multiple item scales.
Here the items are randomly divided into two halves and a correlation
coefficient between the scores on the two halves is obtained. A high correlation indicates high
internal consistency of the scale and hence greater reliability.
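The split-half procedure can be sketched as follows. The response matrix is hypothetical (six respondents, six items scored 1 to 5), the function names are assumptions, and the fixed seed is used only so that the random split is repeatable.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(responses, seed=0):
    """Randomly split the items into two halves, total each half per
    respondent, and correlate the two sets of totals."""
    n_items = len(responses[0])
    items = list(range(n_items))
    random.Random(seed).shuffle(items)
    half_a, half_b = items[: n_items // 2], items[n_items // 2:]
    totals_a = [sum(row[i] for i in half_a) for row in responses]
    totals_b = [sum(row[i] for i in half_b) for row in responses]
    return pearson(totals_a, totals_b)

# Hypothetical data: each row is one respondent, each column one item.
responses = [
    [5, 4, 5, 4, 5, 4],
    [4, 4, 4, 3, 4, 4],
    [3, 3, 2, 3, 3, 3],
    [2, 2, 2, 1, 2, 2],
    [1, 2, 1, 1, 1, 2],
    [4, 5, 4, 4, 5, 5],
]

r = split_half_reliability(responses)
print(round(r, 3))
```

A different seed gives a different split and therefore a slightly different coefficient, so the value depends on how the items happen to be divided.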
2. Validity
The validity of a scale refers to the question whether we are measuring what we want to measure.
Validity of the scale refers to the extent to which the measurement process is free from both
systematic and random errors. The validity of a scale is a more serious issue than reliability. There are
different ways to measure validity.
• Content validity: This is also called face validity. It involves subjective judgment by an expert for
assessing the appropriateness of the construct. For example, to measure the perception of a
customer towards Jet Airways, a multiple item scale is developed. A set of 15 items is proposed.
These items when combined in an index measure the perception of Jet Airways. In order to
judge the content validity of these 15 items, a set of experts may be requested to examine the
representativeness of the 15 items. The items covered may be lacking in the content validity if
we have omitted behavior of the crew, food quality, and food quantity, etc., from the list. In fact,
conducting the exploratory research to exhaust the list of items measuring perception of the
airline would be of immense help in such a case.
• Concurrent validity: It is used to measure the validity of the new measuring techniques by
correlating them with the established techniques. It involves computing the correlation
coefficient of two measures of the same phenomenon (for example, perception of an airline and
image of a company) which are administered at the same time.
• Predictive validity: This involves the ability of a phenomenon measured at one point of time to
predict another phenomenon at a future point of time. If the correlation coefficient between the
two is high, the initial measure is said to have a high predictive ability.
3. Sensitivity
- A dichotomous response category such as agree or disagree does not allow the
recording of any attitude changes. A more sensitive measure with numerous
categories on the scale may be required. For example, adding strongly agree,
agree, neither agree nor disagree, disagree and strongly disagree categories will
increase the sensitivity of the scale.