Introduction To Research and Psychological Testing
What is Research?
Research is a systematic, empirical, critical investigation structured to answer questions about
the behaviour and experience of individuals. Research can have educational,
occupational or clinical applications. The researcher (who more often than not is really a small
group of researchers) formulates a research question, conducts a study designed to answer the
question, analyzes the resulting data, draws conclusions about the answer to the question, and
publishes the results so that they become part of the research literature. Because the research
literature is one of the primary sources of new research questions, this process can be thought of
as a cycle. New research leads to new questions, which lead to new research, and so on. It can be
better understood by the following figure:
Figure 1
Model of Scientific Research in Psychology
This figure also indicates that research questions can originate outside of this cycle either
with informal observations or with practical problems that need to be solved. But even in these
cases, the researcher would start by checking the research literature to see if the question had
already been answered and to refine it based on what previous research had already found.
Therefore, to sum up, research is all about answering questions.
Goals in Research
All research shares the common goal of learning how things work. The goals specifically
aimed at uncovering the mysteries of human and animal behaviour are description, explanation,
prediction, control and application.
Description
The first step in understanding anything is to describe it. Description involves observing a
behaviour and noting everything about it: what is happening, where it happens, to whom it
happens and under what circumstances it seems to happen.
Explanation
Based on these observations, a researcher might try to find an explanation for the behaviour.
Finding explanations for behaviour is a very important step in the process of forming theories of
behaviour. A theory is a general explanation of a set of observations or facts. The goal of
description provides the observations, and the goal of explanation helps to build the theory.
Prediction
Determining what will happen in the future is a prediction. Knowledge gained through
research can be used to predict future behaviour, which in turn makes it possible to control or
modify the behaviour under study.
Control
The focus of control, or the modification of some behaviour, is to change a behaviour
from an undesirable one to a desirable one.
Application
The use of acquired knowledge for the betterment of human society is one of the final goals
of research. If the results obtained by research can be applied in real life, the research fulfils
the application goal.
Steps in Quantitative Research
Step 1: Identify a Question of Interest
The first step of scientific enquiry is to identify the question of interest. From personal
experiences, news events, scientific articles and books, and other sources, researchers observe
something that piques their interest and they ask a question about it.
Step 2: Gather Information and Form Hypothesis
Next, scientists examine whether any studies, theories, and other information already
exist that might help answer their question, and then they form a hypothesis. A hypothesis is a
specific prediction about some phenomenon.
Step 3: Test Hypothesis by Conducting Research
The third step is to test the hypothesis by conducting research. Research can be conducted
through various methods. Data are collected, and the hypothesis is tested against the results
drawn from those data.
Step 4: Analyse Data, Draw Tentative Conclusions, and Report Findings
At the fourth step, researchers analyse the information (called data) they collect, draw
tentative conclusions, and report their findings to the scientific community. If expert reviewers
favourably judge the quality and importance of the research, the article gets published. It allows
fellow scientists to learn about new findings, to evaluate the research, and to challenge or expand
on it.
Step 5: Build a Body of Knowledge
At the fifth step, scientists build a body of knowledge about the topic in question. They
ask further questions, formulate new hypotheses, and test those hypotheses by conducting more
research. As additional evidence comes in, scientists may attempt to build theories. A theory is a
set of formal statements that explains how and why certain events are related to one another.
Theories are broader than hypotheses. Scientists use the theory to formulate new hypotheses,
which are then tested by conducting still more research. In this manner, the scientific process
becomes self-correcting. If research consistently supports the hypotheses derived from the theory,
confidence in the theory becomes stronger. If the predictions made by the theory are not
supported, then it will need to be modified or, ultimately, discarded.
Types of Research
On the basis of Use and Audience of Research
Basic Research. Research designed to advance fundamental knowledge about how the
world works and build/test theoretical explanations by focusing on the “why” question. The
scientific community is its primary audience. Basic research lacks practical applications in the
short term, but it builds a foundation for knowledge and broad understanding that has an impact
on many issues, policy areas, or areas of study. Basic research is also the main source of the
tools—methods, theories, and ideas—that all researchers use. Almost all of the major
breakthroughs and significant advances in knowledge originated in basic research. It lays a
foundation for core understandings and may have implications for issues that do not even exist
when a study is conducted.
Applied Research. Research designed to offer practical solutions to a concrete problem
or address the immediate and specific needs of clinicians or practitioners. Only rarely in applied
research do we try to build, test, or make connections to theory. Most applied research studies are
short term and small scale. They offer practical results that we can use within a year or less. For
example, the student government of University X wants to reduce alcohol abuse. It wants,
therefore, to find out whether the number of University X students arrested for driving while
intoxicated would decline if the student government were to sponsor alcohol-free parties next
year. An applied research study would be most applicable for this situation. Businesses,
government offices, health care facilities, social service agencies, political organisations, and
educational institutions conduct applied studies and make decisions based on findings.
On the basis of Time
Cross-sectional. Any research that examines information on many cases at one point in
time. Cross-sectional research can be exploratory, descriptive, or explanatory, but it is most
consistent with a descriptive approach. It is usually the simplest and least costly alternative but
rarely captures social processes or change. For example, scientists in healthcare may use
cross-sectional research to understand how children ages 2-12 years across India are prone to
calcium deficiency.
Longitudinal Research. Any research that examines information from many units or
cases across more than one point in time. We can use longitudinal studies for exploratory,
descriptive, and explanatory purposes. Although usually more complicated and costly to conduct than
cross-sectional research, longitudinal studies are more powerful. For instance, consider a study
conducted to understand the similarities or differences between identical twins who are brought
up together versus identical twins who are not. The study observes several variables, but the
constant is that all the participants have identical twins. In this case, researchers would want to
observe these participants from childhood to adulthood, to understand how growing up in
different environments influences traits, habits, and personality. Over many years, researchers
can see both sets of twins as they experience life without intervention. Because the participants
share the same genes, it is assumed that any differences are due to environmental factors, but
only a careful study can confirm that assumption.
On the basis of Purpose of Research
Descriptive Research. Research in which the primary purpose is to “paint a picture”
using words or numbers and to present a profile, a classification of types, or an outline of steps to
answer questions such as who, when, where, and how. Descriptive research presents a picture of
the specific details of a situation, social setting, or relationship. Much of the social research
found in scholarly journals or used for making policy decisions is descriptive. A descriptive
research study starts with a well-defined issue or question and tries to describe it accurately. The
study’s outcome is a detailed picture of the issue or answer to the research question. For
example, the focused issue might be the relationship between parents who are heavy alcohol
drinkers and child abuse. Results could show that 25 percent of heavy-drinking parents had
physically or sexually abused their children compared to 5 percent of parents who never drink or
drink very little.
Exploratory Research. Research whose primary purpose is to examine a little
understood issue or phenomenon and to develop preliminary ideas about it and move toward
refined research questions. We use exploratory research when the subject is very new, we know
little or nothing about it, and no one has yet explored it. Our goal with it is to formulate more
precise questions that we can address in future research. As a first stage of inquiry, we want to
know enough after the exploratory study so we can design and execute a second, more
systematic and extensive study. Researchers who conduct exploratory research must be creative,
open minded, and flexible; adopt an investigative stance; and explore all sources of information.
For example, an expectation might be that the impact of immigration to a new nation would be
more negative on younger children than on older ones. Instead, the unexpected finding was that
children of a specific age group (between ages six and eleven) who immigrate are most
vulnerable to its disruption, more so than either older or younger children.
Explanatory Research. Research whose primary purpose is to explain why events occur
and to build, elaborate, extend, or test theory. When we encounter an issue that is already known
and described, we might wonder why things are the way they are. Addressing the “why”
is the purpose of explanatory research. It builds on exploratory and descriptive research and goes
on to identify the reason something occurs. Going beyond providing a picture of the issue, an
explanatory study looks for causes and reasons. For example, a descriptive study would
document the numbers of heavy-drinking parents who abuse their children whereas an
explanatory study would be interested in learning why these parents abuse their children. The
focus is on exactly what it is about heavy drinking that contributes to child abuse.
On the basis of Data Collection Technique
Qualitative Research. Qualitative research is expressed in words. It is used to
understand concepts, thoughts or experiences. This type of research enables you to gather
in-depth insights on topics that are not well understood. Common qualitative methods include
interviews with open-ended questions, observations described in words, and literature reviews
that explore concepts and theories.
Quantitative Research. Quantitative research is expressed in numbers and graphs. It is
used to test or confirm theories and assumptions. This type of research can be used to establish
generalizable facts about a topic. Common quantitative methods include experiments,
observations recorded as numbers, and surveys with closed-ended questions.
Table 1
Qualitative v/s Quantitative Research
Qualitative Research | Quantitative Research
Non-linear research path that permits and obligates the researcher to go in cyclical, back-and-forth, and non-successive sequences. | Clearly set linear path and successive procedures that seem to follow in a logical sequence.
May start out with a vague or poorly defined research question which may evolve as the study progresses and new insights are gained and incorporated. | Questions are finalised before the study and are used in developing steps and guiding the study.
Use semi-structured methods such as in-depth interviews, focus groups and participant observation. | Use highly structured methods such as questionnaires, surveys and structured observation.
Coefficient alpha is an index of the internal consistency of the items, that is, their
tendency to correlate positively with one another. Insofar as a test or scale with high internal
consistency will also tend to show stability of scores in a test–retest approach, coefficient alpha
is therefore a useful estimate of reliability.
Cronbach (1951) has shown that coefficient alpha is the general application of a more
specific formula developed earlier by Kuder and Richardson (1937). Their formula is generally
referred to as Kuder-Richardson formula 20 or, simply, KR-20.
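The relationship between the two indices can be illustrated with a minimal Python sketch. It assumes the commonly cited forms of the formulas: alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), and KR-20 as the same expression with each item variance written as p x q for items scored 0 or 1. The function names and the response data are hypothetical and serve only as an illustration.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one row per examinee, each row holding that person's item scores."""
    k = len(scores[0])                           # number of items
    items = list(zip(*scores))                   # transpose: one column per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

def kr20(scores):
    """KR-20 for dichotomous (0/1) items; item variance is p * q."""
    k = len(scores[0])
    n = len(scores)
    p = [sum(col) / n for col in zip(*scores)]   # proportion passing each item
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# Hypothetical data: 5 examinees, 4 items scored 0 (wrong) or 1 (right).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
kr = kr20(responses)
```

For these invented responses both functions return 0.8, which illustrates that KR-20 is the special case of coefficient alpha for items scored right or wrong.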
Interscorer. Some tests, such as projective techniques and tests of moral development and
creativity, leave a great deal of judgement to the examiner in the assignment of scores. In
this method, a sample of tests is independently scored by two or more examiners, and the scores
for pairs of examiners are then correlated.
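As a rough illustration of the interscorer method, the sketch below correlates the scores that two examiners assign to the same set of protocols. It assumes the Pearson product-moment correlation is the index used (the text says only that the scores are correlated), and the examiner scores are invented for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores given by two examiners to the same five test protocols.
examiner_a = [12, 15, 9, 20, 17]
examiner_b = [11, 16, 10, 19, 18]
interscorer_r = pearson_r(examiner_a, examiner_b)
```

A coefficient close to 1 would suggest that the two examiners rank and score the protocols in nearly the same way; with more than two examiners, the same function can be applied to each pair.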
Validity
The Standards for Educational and Psychological Testing define validity as “the degree to
which evidence and theory support the intended interpretation of test scores for the proposed
purpose” (AERA, APA, NCME, 1999, p.11). A test is valid to the extent that inferences made
from it are appropriate, meaningful, and useful. In other words, validity describes how
adequately a test measures the attribute it is designed to measure.
Traditionally, the different ways of accumulating validity evidence have been grouped
into three categories:
Content validity. Content validity is determined by the degree to which the questions,
tasks, or items on a test are representative of the universe of behaviour the test was designed to
sample. The items of a test can be visualised as a sample drawn from a larger population of
potential items that define what the researcher really wishes to measure. If the sample (specific
items on the test) is representative of the population (all possible items), then the test possesses
content validity. For example, one can ask whether each topic area of the curriculum being taught
is appropriately represented by questions on the assessment.
Criterion-related validity. Criterion-related validity is demonstrated when a test is
shown to be effective in estimating an examinee’s performance on some outcome measure. In
this context, the variable of primary interest is the outcome measure, called a criterion. The test
score is useful only insofar as it provides a basis for accurate prediction of the criterion. For
example, a college entrance exam that is reasonably accurate in predicting the subsequent grade
point average of examinees would possess criterion-related validity.
Construct validity. A construct is a theoretical, intangible quality or trait in which
individuals differ (Messick, 1995). Examples of constructs include leadership ability,
overcontrolled hostility, depression, and intelligence. A test designed to measure a construct
must estimate the existence of an inferred, underlying characteristic (e.g., leadership ability)
based on a limited sample of behaviour. Construct validity refers to the appropriateness of these
inferences about the underlying construct.
Norms
Norms indicate an examinee’s standing on the test relative to the performance of other persons of
the same age, grade, sex and so on. It is important to note that several different groups may be
used in providing normative information for interpreting test scores. There are national, local and
subgroup norms. This essentially means that no single population can be regarded as the norm
group, and that a wide variety of norm-based interpretations could be made for a given raw score,
depending on which normative group is chosen.
Provided that they are of sufficient size and fairly representative of their categories,
subgroups can be formed in terms of sex, occupation, ethnicity, socio-economic level, education
level or any other variables that may have a significant impact on test scores or yield
comparisons of interest. The types of norms can be as follows.
National Norms. National norms are derived from a normative sample that was
nationally representative of the population at the time the norming study was conducted. Norms
for group ability tests and large achievement test batteries used in school settings are usually
national in scope.
Local Norms. Typically developed by test users themselves, local norms provide
normative information with respect to the performance of a more narrowly defined population on
some test such as the employees of a particular company or the students of a certain university.
Subgroup Norms. When large samples are gathered to represent broadly defined
populations, norms can be reported in the aggregate or can be separated into subgroup norms.
Provided they are of sufficient size and fairly representative of their categories, subgroups
can be formed in terms of sex, occupation, ethnicity, socio-economic level, education level or any
other variable that may have a significant impact on test scores or yield comparisons of interest.
Age Norms. Age-equivalent scores, also known as age norms, depict the level of test
performance for each separate age group in the normative sample. The purpose of age norms is to
facilitate same-age comparisons. With age norms, the performance of an examinee is interpreted in
relation to standardisation subjects of the same age. Age norms can be developed for any
characteristic that systematically changes with age, such as vocabulary, mathematical ability,
moral reasoning, etc.
Grade Norms. Grade-equivalent scores, also known as grade norms, are conceptually
similar to age norms. A grade norm depicts the level of test performance for each separate grade
in the normative sample. Grade norms are rarely used with ability tests. However, these norms are
especially useful in school settings when reporting the achievement levels of schoolchildren.
Within-group Norms. Within-group norms can be described as a test scoring method. They are
the most common normative strategy in testing. This type of scoring is very common in
psychological and intelligence measures. A test is given to a group of individuals and their
results are used to create a normal distribution. This distribution of scores is used as the
normative group against which people who take the test are compared and scored. Within-group
norms include percentile norms, standard scores, stanines and stens.
Percentile norms express the percentage of cases in the standardisation sample who
scored below a specific raw score. For example, if 94 percent of the sample fell below a raw
score of 25, we can say that a raw score of 25 corresponds to a percentile rank of 94. This will be
denoted as P94 = 25. The 50th percentile (Q2) corresponds to the median and the 25th and the
75th percentile are known as the first and third quartile points (Q1 and Q3). Percentiles make no
assumption with regard to the characteristics of the total distribution. Thus they can be
interpreted easily when the distribution of test scores is non-normal.
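The percentile definition above can be sketched in a few lines of Python. The function name and the standardisation sample are hypothetical; the sample is constructed so that it reproduces the example in the text, where 94 percent of cases fall below a raw score of 25.

```python
def percentile_rank(sample, raw_score):
    """Percentage of cases in the sample that scored below raw_score."""
    below = sum(1 for score in sample if score < raw_score)
    return 100 * below / len(sample)

# Hypothetical standardisation sample of 100 cases:
# 94 cases score below 25, and 6 cases score 25 or higher.
standardisation_sample = [10] * 94 + [25] * 2 + [30] * 4
rank_of_25 = percentile_rank(standardisation_sample, 25)
```

Here `rank_of_25` is 94.0, matching P94 = 25. Note that this sketch counts only cases strictly below the raw score, as in the definition given above; some texts also count half of the cases tied at the score, which is a different convention.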
The standard score indicates the position of raw scores, relative to the mean of the
distribution, in standard deviation units. Unlike percentile ranks, standard scores represent
measurement on an interval scale. Standard scores are obtained by linear transformation of the
data. The distribution of standard scores has exactly the same shape as the distribution of raw
scores. Therefore, the relative magnitude of differences between successive values corresponds
exactly to that between the raw scores. One of the most familiar standard scores is the z score.
The z score has a mean of 0 and a standard deviation of 1. The z score is extremely useful
because it indicates each person’s standing as compared to the group mean. Also, when the
distribution of raw scores is reasonably normal, it can be directly converted into a percentile. The T
score is a variant of the z score, suggested by McCall (1922). It is exactly the same as the z score
except that the mean is 50 rather than 0 and the standard deviation is 10 rather than 1.
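The two linear transformations described above can be written as z = (X - M) / SD and T = 50 + 10z. The following Python sketch applies them to a small set of invented raw scores. It assumes the standard deviation is computed over the whole group (the population formula), and the percentile conversion is meaningful only when the raw scores are reasonably normal, as the text notes.

```python
from statistics import NormalDist, mean, pstdev

def z_scores(raw):
    """Convert raw scores to z scores (mean 0, standard deviation 1)."""
    m, sd = mean(raw), pstdev(raw)
    return [(x - m) / sd for x in raw]

def t_scores(raw):
    """Convert raw scores to T scores (mean 50, standard deviation 10)."""
    return [50 + 10 * z for z in z_scores(raw)]

def percentile_from_z(z):
    """Percentile for a z score, assuming a normal distribution."""
    return 100 * NormalDist().cdf(z)

# Hypothetical raw scores with mean 5 and standard deviation 2.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]
zs = z_scores(raw_scores)
ts = t_scores(raw_scores)
```

For these scores, a raw score of 9 lies two standard deviations above the mean, so its z score is 2 and its T score is 70; a raw score of 5 sits at the mean, giving z = 0, T = 50 and, under normality, the 50th percentile.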
Stanines were originally devised by the U.S. Air Force during World War II. The
“standard nine,” or stanine scale divides the distribution of scores into nine groups, and
transforms all the scores into single-digit numbers from 1 to 9. The mean of the stanine scale is 5
and its standard deviation is approximately 2. Except for stanines 1 (lowest) and 9
(highest), each unit is one half of a standard deviation wide.
The Sten (standard ten) is a standard score system that is conceptually similar to the
stanine scale. Stens divide the score scale into ten units. Each unit has a band width of half a
standard deviation, except the highest unit (Sten 10), which extends upward from 2 standard deviations
above the mean, and the lowest unit (Sten 1), which extends downward from 2 standard deviations below the
mean. The mean of the sten scale is 5.5 and its standard deviation is approximately 2.
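The band structure of the stanine and sten scales can be sketched as a lookup on z scores. The cut points below follow from the half-standard-deviation widths described above: stanine 5 is centred on the mean, and the sten scale has a boundary at the mean. How a score falling exactly on a boundary is assigned is an assumption of this sketch (it goes to the higher band), not something the text specifies.

```python
from bisect import bisect_right

# Band boundaries in z-score units, each band half a standard deviation wide.
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
STEN_CUTS = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

def stanine(z):
    """Map a z score to a stanine from 1 to 9."""
    return bisect_right(STANINE_CUTS, z) + 1

def sten(z):
    """Map a z score to a sten from 1 to 10."""
    return bisect_right(STEN_CUTS, z) + 1
```

For example, a z score of 0 falls in stanine 5, a z score of 0.1 falls in sten 6, and any z score above 2 falls in stanine 9 and sten 10.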
Types of Tests
On the Basis of Number of People
Group Tests. Group tests can be administered to a large group of people at one time.
They were designed as mass testing instruments; they not only permit the simultaneous
examination of large groups but also use simplified instruction and administration
procedures, thereby requiring a minimum of training on the part of the examiner.
Most testing today is administered in group form because of the many benefits
associated with it. Given the number of standardised tests administered each
year, it is understandable that many of them are group tests. Examples of group tests include
statewide testing of K-12 students, placement examinations for college, and placement
examinations for graduate coursework. Group tests have many advantages over
individual tests. They are much more time-efficient in many respects. For example, group
tests are administered to many people at once, whereas testing each person individually is often
unrealistic. Group tests are also much easier to score because they are predominantly multiple
choice. Short-answer or essay-based questions take more time to score, but this is still much
quicker than scoring individually administered tests. Scoring is also more objective and more
reliable because the subjectivity of the grader plays a smaller role. These tests are also more
cost-efficient since they don't require expensive materials or extensive training of administrators.
Individual Tests. An individual test is a test that can be administered to only one person
at a time. The examiner gives instructions and records the examinee’s responses using a
standardised approach outlined in the test manual. The examiner then assesses and scores the
responses. This scoring procedure usually involves considerable skill. Examples include the
Stanford-Binet Intelligence Test and the Wechsler Scales. The advantage of individual tests is that
they are often more comprehensive and valid, and have better psychometric properties, than group tests. They are
helpful in determining a person’s unique attributes and allow individualised interpretation of test
results. Individual tests, however, are more expensive due to the necessity for a qualified
administrator.
On the Basis of Degree of Difficulty
Speed Tests. A type of test used to calculate the number of problems or tasks the
participant can solve or perform in a predesignated block of time. The participant is often, but
not always, made aware of the time limit. Speed tests are designed to assess how quickly a test
taker is able to complete the items within a set time period. The primary objective of speed tests
is to measure the person's ability to process information quickly and accurately, while under
duress. Speed tests contain more items than the vast majority of applicants will be able to answer
in the time allotted, and the items are usually not high in difficulty. Scoring is based on how
many questions are answered by the applicant within the time limit. Often these tests are used by
human resource professionals and I/O psychologists during the hiring process. An example of a
speed test is the Clerical Speed and Accuracy Test.
Power Tests. A type of test intended to calculate the participant’s level of mastery of a
particular topic under conditions of little or no time pressure. The test is designed so that items
become progressively more difficult. Thus, power tests are designed to gauge the knowledge of
the test-taker. A score on a power test depends entirely on the number of items answered
correctly. Raven’s Progressive Matrices (Raven & Court, 1998) is an example of
a power test.
On the Basis of Culture Fairness
Culture Fair Tests. Culture-fair tests are those that are relatively free of the
specific cultural influences of the test designer and administrator. Items are designed to measure
innate abilities not affected by culture. Examples: maze tests and block design tests.
Culturally Loaded Tests. These types of tests are designed for a specific population and
show biased results for a specific group, culture, or population due to cultural influence. A
particular population influenced by cultural elements displays either low or high scores relative to
the test norms.
On the Basis of Attribute and Purpose
Intelligence Tests. Intelligence refers to the global mental capacities of an individual,
and tests of intelligence essentially measure rational and abstract thinking of an individual. They
are designed to measure the global mental capacities of an individual in terms of verbal
comprehension, perceptual organisation, reasoning etc. The purpose is usually to determine the
subject’s suitability for some occupation or scholastic work. One of the most commonly
used intelligence tests is the Wechsler Adult Intelligence Scale (WAIS).
Achievement Tests. Achievement refers to a person’s past learning, and achievement
tests are designed to measure a person’s past learning or accomplishment in a task. The Stanford
Achievement Test by Gardner and Madden (1969) is an example of an achievement test. The
distinction between aptitude and achievement tests is more a matter of use than content (Gregory,
1994). In fact, any test can be an aptitude test to the extent it helps in predicting future
performance. Likewise, any test can be an achievement test to the extent it measures past
learning and measures a person's degree of success, or accomplishment in a subject or task.
Aptitude Tests. Aptitude refers to an individual’s potential to learn a specified task when
training is provided. Aptitude tests are designed to measure the subject’s capability of learning
specific tasks or acquiring specific skills. The SAT (Scholastic Aptitude Test), Seashore Measure of
Musical Talent, Guilford and Zimmerman Aptitude Survey, and General Aptitude Test Battery are
some examples of aptitude tests.
Personality Tests. These tests are designed to measure a person’s individuality in terms
of unique traits and behaviour. These tests help in predicting an individual’s future behaviour.
They come in several varieties, such as checklists, inventories, subject evaluation techniques,
and inkblot and sentence completion tests. Personality tests can broadly be classified further into two
categories: structured personality tests and unstructured personality tests.
Structured Personality Tests are based on the premise that there are common dimensions
across all personalities which can be measured objectively with the help of a psychological
test. In such tests, responses are already defined and the testee has only to choose
one of the options as a response. Tests in this category include the 16PF, the MMPI, the
Maudsley Personality Inventory (MPI), and so on.
Unstructured Personality Tests, on the other hand, assume idiosyncratic, individual-specific
needs, which are discovered and measured by analysing the responses given by the
testee on the presentation of ambiguous stimuli. These tests are based on the rationale that a
test-taker reacts to a vague or ambiguous stimulus by projecting their own feelings, thoughts,
experiences and memories. The responses given by the client indicate different facets of the
personality dimensions. Examples of unstructured personality tests are projective tests such as the
Thematic Apperception Test (TAT) and the Rorschach Inkblot Test.
Interest Inventories/Tests. These measure an individual's preference for certain activities or
topics and thereby help determine occupational choice. Examples of Interest inventories include
the Strong Interest Inventory, the Campbell Interest and Skill Survey, and the Myers Briggs Type
Indicator (MBTI).
Creativity Tests. Creativity refers to a person’s ability to think of new ideas, and creativity
tests are designed to measure a person’s ability to produce new and original ideas, and the
capacity to find unexpected solutions to vaguely defined problems. Examples of creativity tests
are the Torrance Test of Creative Thinking by E. Paul Torrance (1966) and the Creativity Self
Report by Feldhusen (1965).
Neuropsychological Tests. These measure cognitive, sensory, perceptual, and motor
performance to determine the extent, locus, and behavioural consequences of brain damage.
Behavioural Procedures. These objectively describe and count the frequency of a behaviour,
identifying the antecedents and consequences of the behaviour. Some behavioural assessments
include Vineland Adaptive Behaviour Scales, Conners Parent and Teacher Rating Scales, and
Behaviour Assessment System for Children (BASC), among others.
Applications of Testing
By far the most common use of psychological tests is to make decisions about persons.
For example, educational institutions frequently use tests to determine placement levels for
students, and universities ascertain who should be admitted, in part, on the basis of test scores.
State, federal, and local civil service systems also rely heavily on tests for purposes of personnel
selection. But simple decision making or hiring is not the only function of psychological testing.
It is convenient to distinguish five uses of tests:
Classification. The term classification encompasses a variety of procedures that share a
common purpose: assigning a person to one category rather than another. Thus, classification can
have important effects such as granting or restricting access to a specific college or determining
whether a person is hired for a particular job. There are many variant forms of classification,
each emphasising a particular purpose in assigning persons to categories. We will distinguish
placement, screening, certification, and selection.
Diagnosis and Treatment Planning. Diagnosis consists of two intertwined tasks:
determining the nature and source of a person’s abnormal behaviour, and classifying the
behaviour pattern within an accepted diagnostic system. Diagnosis is usually a precursor to
remediation or treatment of personal distress or impaired performance. Psychological tests often
play an important role in diagnosis and treatment planning. For example, intelligence tests are
absolutely essential in the diagnosis of mental retardation. A proper diagnosis conveys
information about strengths, weaknesses, etiology, and best choices for remediation/treatment.
Self-knowledge. In some cases, the feedback a person receives from psychological tests
can change a career path or otherwise alter a person’s life course. Of course, not every instance
of psychological testing provides self-knowledge.
Program Evaluation. Another use for psychological tests is the systematic evaluation of
educational and social programs. We focus here on the use of tests in the evaluation of social
programs. Social programs are designed to provide services that improve social conditions and
community life.
Research. Tests also play a major role in both the applied and theoretical branches of
behavioural research. As an example of testing in applied research, consider the problem faced
by neuropsychologists who wish to investigate the hypothesis that low-level lead absorption
causes behavioural deficits in children. The only feasible way to explore this supposition is by
testing normal and lead-burdened children with a battery of psychological tests.
Limitations of Testing
Uncertainty of Measurement. Because psychological tests are attempting to measure
attributes that are not directly observable, there is always a gap between what a test is attempting
to measure and what it actually measures. Since tests often rely on indirect measures
such as an individual’s responses to hypothetical situations, the decisions made in testing
situations are not always the same as those people would make in real-life situations.
Changing Circumstances. Because of changes in psychological theories and
advancements in technology, psychological tests remain relevant only for a time. Social or
cultural changes can lead to test items becoming obsolete, or new psychological theories may
replace the founding theories of the tests. To remain valid and reliable, psychological tests must
be updated regularly.
Cultural Bias. Psychological tests often use the dominant middle-class culture as the
standard. This limits their validity for individuals from a different economic or cultural
background who may not have the same experiences that the test assumes as standard. It is
nearly impossible to create test questions that account for the different experiences of
individuals, so test administrators must use results with caution.
Language Bias. Most psychological tests are standardised in English and test results are
often not accurate for people who speak another language. Even when tests are translated into
native languages, problems occur with words that have multiple meanings and idioms specific to
one language or culture.
Inappropriate Standardisation Samples. Tests are often standardised on specific
normative groups. The representation of minorities in norming samples is often insufficient
to allow for accurate interpretation of scores from those groups.
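To see why the choice of normative group matters, consider a minimal sketch that converts one raw score to a standard (z) score against two different sets of norms. All figures here are hypothetical and chosen only for illustration: the means, the standard deviations and the raw score do not come from any real test.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standard score: distance of a raw score from the norm mean, in SD units."""
    return (raw - norm_mean) / norm_sd


# Hypothetical norms (assumed values, not taken from any real test).
general_norms = {"mean": 100.0, "sd": 15.0}   # broad standardisation sample
subgroup_norms = {"mean": 90.0, "sd": 10.0}   # sample drawn from the examinee's own group

raw = 85.0  # hypothetical raw score

z_general = z_score(raw, general_norms["mean"], general_norms["sd"])
z_subgroup = z_score(raw, subgroup_norms["mean"], subgroup_norms["sd"])

print(z_general, z_subgroup)
```

Under these assumed norms, the same raw score lies one standard deviation below the mean of the broad sample but only half a standard deviation below the mean of the subgroup sample. The interpretation of a score therefore depends on whether the examinee's group was adequately represented when the norms were built.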
Examiners’ Bias. Examiners who speak standard English may intimidate examinees who do not, and
may communicate inaccurately with them, spuriously lowering their test scores. The sex, experience, or
race of the examiner may also affect test scores.
Inequitable Social Consequences. According to some authors, the unequal results of
standardised tests produce inequitable social consequences. Low test scores relegate minority
group members, already at an educational and vocational disadvantage, to educational tracks that
lead to mediocrity and low achievement. They may also be denied employment or be subjected
to other forms of discrimination.
Stereotype Threat. Labelling or stereotyping is another example of the social consequences
of psychological testing. Stereotype threat is the threat of confirming, as self-characteristic, a
negative stereotype about one’s group. For example, based on published data and media
coverage about race and IQ scores, African Americans are stereotyped as possessing less
intellectual ability than others. As a consequence, whenever they encounter tests of intelligence
or academic achievements, individuals from this group may perceive a risk that they will confirm
the stereotype.
Ethics in Research and Psychological Testing
Rapport Formation
Ethics refers to the correct rules of conduct necessary when carrying out research. We have a
moral responsibility to protect research participants from harm. However important the issues under
investigation, psychologists need to remember that they have a duty to respect the rights and dignity of
research participants. This means that they must abide by certain moral principles and rules of
conduct. The purpose of these codes of conduct is to protect research participants, the reputation of
psychology, and psychologists themselves.
Voluntary Participation
All ethical research must be conducted using willing participants. Study volunteers should not
feel coerced, threatened or bribed into participation. This becomes especially important for researchers
working at universities or prisons, where students and inmates are often encouraged to participate in
experiments.
Informed Consent
Whenever possible, investigators should obtain the consent of participants. In practice, this
means it is not sufficient to simply get potential participants to say “Yes”. They also need to know
what it is that they are agreeing to. In other words, the psychologist should, so far as is practicable,
explain what is involved in advance and obtain the informed consent of participants.
Debriefing
After the research is over, the participant should be able to discuss the procedure and the
findings with the psychologist. They must be given a general idea of what the researcher was
investigating and why, and their part in the research should be explained. Participants must be told if
they have been deceived and given reasons why. They must be asked if they have any questions and
those questions should be answered honestly and as fully as possible.
Sharing the Results of the Study
After research results are published, psychologists do not withhold the data on which their
conclusions are based from other competent professionals who seek to verify the substantive claims
through reanalysis and who intend to use such data only for that purpose, provided that the
confidentiality of the participants can be protected and unless legal rights concerning proprietary data
preclude their release.
Confidentiality of Data
Participants and the data gained from them must be kept anonymous unless they give their full
consent. No names must be used in a lab report.