
Name: Bantayan, Gellirose S.

Subject: CSA-101

SELF-REVIEW TASK

1. Differences among test, measurement, evaluation, and assessment.

 Test - Simply put, a test refers to a tool, technique, or method that is intended to
measure students’ knowledge or their ability to complete a particular task. In this sense,
testing can be considered as a form of assessment. Tests should meet some basic
requirements, such as validity and reliability.
 Measurement - Measurement is a specific process through which a learning experience,
phenomena, or context is translated into a representative set of numerical variables.
 Evaluation - Evaluation is the process of making judgments about the quality, value, or
worth of learning, based on the information gathered through measurement and other assessment tools.
 Assessment - Assessment is thus the process of collecting information about learners
using different methods or tools (e.g., tests, quizzes, portfolios, etc.).

2. Assessment FOR, OF & AS Learning

 Assessment for learning (Diagnostic assessment) involves the use of information about
student progress to support and improve student learning, inform instructional
practices, and: is teacher-driven for student, teacher, and parent use; occurs throughout
the teaching and learning process, using a variety of tools; and engages teachers in
providing differentiated instruction, feedback to students to enhance their learning, and
information to parents in support of learning.
 Assessment as learning (formative assessment) actively involves student reflection on
learning, monitoring of his/her own progress, and: supports students in critically
analyzing learning related to curricular outcomes; is student-driven with teacher
guidance; and occurs throughout the learning process.
 Assessment of learning (summative assessment) involves teachers’ use of evidence of
student learning to make judgements about student achievement and: provides
opportunity to report evidence of achievement related to curricular outcomes; occurs at
the end of a learning cycle using a variety of tools; provides the foundation for
discussions on placement or promotion.

3. Reasons for assessing students


 To identify gaps in performance and learning needs (pre-assessment)
 To encourage and support learning (continuous assessment)
 To measure learning and improve achievement (continuous assessment)
 To prepare learners for the next step in the learning journey (post-assessment)
 To seek feedback and areas of improvement in the instructional design process
(continuous assessment)

4. Types of Educational decisions

 Instructional - This decision is normally made by the individual classroom teacher, as
necessary to meet the targets or objectives set during classroom engagement. Decisions
are reached according to the results of tests administered to a class.
 Grading - It is usually based on teacher-made tests. Grades are assigned to the students
using assessment as one of the factors.
 Diagnostic - It is made to determine a student's strengths and weaknesses and the
reason or reasons for them.
 Selection - It involves accepting or rejecting the examinee based on the results of
assessment, for admission or qualification to a program or school activity. The decisions
are made not by classroom teachers but by specialists such as guidance counselors,
administrators, or the selection committee.
 Placement - It is made after a student has been admitted to school. It involves the
process of identifying students who need remediation or may be recommended for the
enrichment program of the school.
 Guidance and Counseling - It utilizes test data to assist students in making their personal
choices for a future career and to help them know their strengths and weaknesses by
means of standardized tests. On the other hand, teachers may use the results of a
sociometric test to identify who among the students are popular or unpopular. Those
who are unpopular may be given help to gain friends and become more sociable.
 Program or Curriculum - It is made not at the level of the teachers but at a higher level,
such as the division, regional, or national level. Based on the results of assessment and
evaluation, educational decisions may be reached: to continue, discontinue, revise, or
replace a curriculum or program being implemented.
 Administrative Policy - It involves determining the implications for resources, including
financial considerations, in order to improve student learning as a result of an
assessment. It may entail acquisition of instructional materials, books, etc. to raise the
level of students' performance in academic areas, non-academic areas, or both.

5. Assessors must have the ability to interpret the evidence collected and make a confident
judgment of competence. How can I do this you may ask?

 To make a good judgment, assessors must have:


 current skills and knowledge of the broader industry practice (or access to another
person with those skills and knowledge, such as an industry expert, who will assist with
the design and/or conduct of the assessment)
 a common understanding of the assessment requirements.
 a common interpretation of the unit(s) of competency being assessed.
When making the assessment decision, the evidence must be evaluated in terms of the
principles of competency based assessments. Remember those?

 Validity
 Reliability
 Currency
 Consistency
 Practicability
 Fairness/absence of bias
 Relevance
 Authenticity
 Sufficiency

Checking the consistency of the judgement:

As an assessor, you need to be aware of the danger that your assessment decisions can be
biased by factors that should have no bearing on the assessment process or your judgement.
These factors include:

 Appearance and dress - should not be allowed to influence your decision, unless these
are explicitly stated in the standards.
 The ‘halo and horns’ effect. The halo effect can emerge when you are familiar with your
learners, and a good performance in the past leads you to assume that they are
performing well at present even though they may not be. The horns effect is where no
matter how well your candidates are currently performing, your judgment of poor
performance in the past continues to influence your assessment decisions.
6. Types of tests

 Diagnostic Testing - This testing is used to “diagnose” what a student knows and does
not know. Diagnostic testing typically happens at the start of a new phase of education,
like when students will start learning a new unit. The test covers topics students will be
taught in the upcoming lessons.
 Formative Testing - This type of testing is used to gauge student learning during the
lesson. It is used throughout a lecture and is designed to give students the opportunity to
demonstrate that they have understood the material. This informal, low-stakes testing happens in an ongoing
manner, and student performance on formative testing tends to get better as a lesson
progresses.
 Benchmark Testing - This testing is used to check whether students have mastered a unit
of content. Benchmark testing is given during or after a classroom focuses on a section
of material, and covers either part or all of the content that has been taught up to that
time. The assessments are designed to let teachers know whether students have
understood the material that’s been covered.
 Summative Testing - This testing is used as a checkpoint at the end of the year or course
to assess how much content students learned overall. This type of testing is similar to
benchmark testing, but instead of only covering one unit, it cumulatively covers
everything students have been spending time on throughout the year.

7. Instructional goals and objectives -

 Instructional Goals are broad, generalized statements about what is to be learned. Think
of them as a target to be reached, or "hit."
 Instructional objectives are the foundation upon which you can build lessons and
assessments that you can prove meet your overall course or lesson goals. Think of
objectives as tools you use to make sure you reach your goals. They are the arrows you
shoot towards your target (goal).
 The purpose of objectives is not to restrict spontaneity or constrain the vision of
education in the discipline; but to ensure that learning is focused clearly enough that
both students and teacher know what is going on, and so learning can be objectively
measured.
 Different archers have different styles, and so do different teachers. Thus, you can shoot
your arrows (objectives) many ways. The important thing is that they reach your target
(goals) and score that bullseye!
 Thus, stating clear course objectives is important because:
 They provide you with a solid foundation for designing relevant activities and
assessment. Activities, assessment and grading should be based on the objectives.
 As you develop a learning object, a course, a lesson, or a learning activity, you have to
determine what you want the students to learn and how you will know that they
learned. Instructional objectives, also called behavioral objectives or learning objectives,
are a requirement for high-quality development of instruction.
 They help you identify critical and non-critical instructional elements.
 They help remove your subjectivity from the instruction.
 They help you design a series of interrelated instructional topics.
 Students will better understand expectations and the link between expectations,
teaching and grading.

8. Taxonomy of educational objectives

 Bloom’s taxonomy of educational objectives is a hierarchical ordering of skills in
different domains whose primary use is to help teachers teach and students learn
effectively and efficiently. The meaning of Bloom’s taxonomy can be understood by
exploring its three learning domains: cognitive, affective, and psychomotor. Each of
these domains further consists of a hierarchy that denotes different levels of
learning. The nature of its domains means that it can be applied to almost anything
that requires a stage-by-stage system of learning.
 The three domains of Bloom’s taxonomy
Bloom’s taxonomy comprises three learning domains to understand different levels
of learning.

 Cognitive - The cognitive domain of Bloom’s taxonomy of learning tries to cater to
Bloom’s taxonomy objectives such as critical thinking, problem-solving, and creating
and enhancing a knowledge base. This was the first domain created by Bloom’s
original team of researchers and includes hierarchies that are concerned with
building new knowledge as well as refining previously gathered information. The
different levels of the cognitive domain are as follows:
 Remember: Concerned with all kinds of memorization techniques and optimal use of
information acquired in the past. For example, remembering the names of all the
Prime Ministers of India.
 Understand: Concerned with going into the depths of a concept or an idea in order
to comprehend it in multiple ways. For example, identifying the main challenges in
governance each Prime Minister had to deal with during their tenure.
 Apply: Concerned with applying knowledge to produce something tangible. For
example, taking a political challenge from five decades ago and applying its lessons
to a similar issue in the present.
 Analyze: Concerned with examining and scrutinizing different aspects of what is
being learnt. For example, analyzing the personalities of different Prime Ministers
and how that affected their performance.
 Evaluate: Concerned with detecting the motivations and intentions behind events,
processes, and situations. For example, assessing why certain Prime Ministers
decided to go to war at certain junctures in history.
 Create: Concerned with building something that’s original and constructive. For
example, creating a list of qualities that any modern Prime Minister of India should
possess. This particular level was known as “Synthesis” in the original model, but was
later changed to acknowledge creativity as the highest form of cognitive
achievement in the revised version of Bloom’s taxonomy.
 Affective - The affective domain of Bloom’s taxonomy of learning helps to achieve
Bloom’s taxonomy objectives in relation to the attitudes, values, and interests of learners.
Its primary focus is to trace the evolution of values and how they develop across the
entire learning process. The different levels of the affective domain are as follows:
 Receiving: Concerned with paying adequate attention to someone who is presenting
or performing. For example, listening to a lecturer and writing a summary of that
lecture.
 Responding: Concerned with producing a performance or a presentation to increase
self-confidence and technical skills. For example, delivering a lecture to an audience
on a specific subject.
 Valuing: Concerned with expressing the values that one prioritizes in life and
justifying why they do so. For example, delivering a speech highlighting any three
values that one considers to be the most important for any profession.
 Organization: Concerned with organizing a particular value system and comparing it
with other systems to better appreciate different settings and cultures. For example,
delivering a presentation that compares value systems as seen in government-funded
charities and non-governmental organizations.
 Characterization: Concerned with projecting one’s values in real time to be able to
work successfully in a team. For example, writing an essay as part of a team on how
value systems need to adapt to the world of online learning.
 Psychomotor - The psychomotor domain of Bloom’s taxonomy of learning helps to
realize Bloom’s taxonomy of educational objectives such as physically accomplishing
tasks and performing various movements and skills. The different levels of the
psychomotor domain are as follows:
 Reflex: Concerned with an instinctive response to a physical stimulus. For example,
catching a tennis ball that is thrown at learners or trying to hit a target with that
same tennis ball.
 Basic fundamental movements: Concerned with everyday actions or movements such
as walking or running. For example, participating in a relay race that tests one’s
fitness, speed, and teamwork capabilities.
 Perceptual abilities: Concerned with performing activities that integrate more than
one sensory perception. For example, playing a game of cricket that assesses one’s
ability to react to events as well as anticipate events before they occur.
 Skilled movements: Concerned with adapting oneself and one’s attributes to a
challenging environment. For example, playing a game of soccer or hockey at a
location with a high altitude where players are expected to conserve energy in order
to prevent heavy fatigue.
 Non-discursive communication: Concerned with expressing oneself through
purposeful movement and activity. For example, playing any team sport that requires
both active communication with fellow players and a display of personal skills.

 Through this breakdown of each of the domains of Bloom’s taxonomy, it is clear how
the taxonomy can cater to all kinds of learners and attempt to meet a vast
collection of learning requirements. While it is not necessary for learners to
experience all three domains, the cognitive domain is usually considered
indispensable in any learning process.

9. Principles and guidelines of test construction

The fundamental principles of test construction are: (a) validity, (b) reliability, (c)
standardisation, and (d) evaluation.

 (a) Validity:
 Tests should have validity, that is, they should actually measure what they purport to
measure. A perfectly valid test would place prospective employees in exactly the same
relationship to one another as they would stand after trial on the job.
 (b) Reliability:
 By the reliability of a test is meant the consistency with which it serves as a measuring
instrument. If a test is reliable, a person taking it at two different times should make
substantially the same score each time. Even under ideal conditions, a test can never be any more
than a sample of the ability being measured. No test is of value in personnel work unless
it has a high degree of reliability.
 (c) Standardisation:
 The process of standardisation includes:
 1. The scaling of test items in terms of difficulty, and
 2. The establishment of norms.
 More important as an element in the standardisation of personnel tests is the scaling of
test items in terms of difficulty. To be of functional value in a test, each item must be of
such difficulty as to be missed by a part of the examinees but not by all.
 (d) Evaluation:
 The evaluation of test results, involving as it does all the problems of scoring and
weighting of items and the assignment of relative weights to tests used in a battery, is
surrounded by highly technical considerations.

10. Advantages and disadvantages of different types of tests

 Advantages of Test:
 (i) Proper Assessment: Tests provide a basis for finding out the suitability of candidates
for various jobs.
 The mental capability, aptitude, liking and interests of the candidates enable the
selectors to find out whether a person is suitable for the job for which he is a candidate.
 (ii) Objective Assessment: Tests provide better objective criteria than any other method.
Subjectivity of every type is almost eliminated.
 (iii) Uniform Basis: Tests provide a uniform basis for comparing the performance of
applicants. The same tests are given to the candidates, and their scores enable selectors
to see their performance.
 (iv) Selection of Better Persons: The aptitude, temperament and adjustability of
candidates are determined with the help of tests. This enables their placement on the
jobs where they will be most suitable. It will also improve their efficiency and job
satisfaction.
 (v) Labour Turnover Reduced: Proper selection of persons will also reduce labour
turnover. If suitable persons are not selected, they may leave their job sooner or later.
Tests are helpful in finding out the suitability of persons for the jobs. Interest tests will
help in knowing the liking of applicants for different jobs. When a person gets a job
according to his temperament and interest he would not leave it.
 Disadvantages of Tests:
 Tests suffer from the following disadvantages:
 (i) Unreliable: The inferences drawn from the tests may not be correct in certain cases.
The skill and ability of a candidate may not be properly judged with the help of tests.
 (ii) Wrong Use: The tests may not be properly used by the employees. Those persons
who are conducting these tests may be biased towards certain persons. This will falsify
the results of tests. Tests may also give unreliable results if used by incompetent
persons.
 (iii) Fear of Exposure: Some persons may not submit to the tests for fear of exposure.
They may be competent but may not like to be assessed through the tests. The
enterprise may be deprived of the services of such personnel who are not willing to
appear for the tests but are otherwise suitable for the concern.

11. Assembling and administering the test

 Assembling the Test

1. Record items on index cards

 See example on p. 354 for info to include


 File them in an item bank

2. Double-check all individual test items

 Review the checklist for each item type (pp. 178, 185, 190, 214, 232, 248)

3. Double-check the items as a set

 Still follows the table of specifications?


 Enough items for desired interpretations?
 Difficulty level appropriate?
 Items non-overlapping so don’t give clues?

4. Arrange items appropriately, which usually means:

 Keep all items of one type together


 Put lowest-level item types first (true/false, matching, short-answer, multiple choice,
interpretive, restricted-response essay, and then extended-response essay)
 Within item types, put easiest learning outcomes first (knowledge, comprehension,
application, etc.)
 Administer time-consuming extended-response essays and performance-based tasks
separately
 Why put items of a type together? Clearer and more efficient.
 Why put easiest items first? Motivational.

5. Prepare directions

 How to allot time


 How to respond (pick best alternative, etc.)
 How and where to record answers (circle, etc.; same vs. separate page)
 How guessing will be treated (or whether to answer all questions)
 How extended essays will be evaluated (accuracy, organization, etc.)

6. Reproduce the test

 Leave ample white space on every page


 List multiple choice options vertically
 Keep all parts of an item on the same page
 The introduction to an interpretive item may be on a facing page
 When not using a separate answer sheet, provide spaces for answering down one side
of the page (preferably the left)
 When using a separate answer sheet, consult the example on p. 354
 Number items consecutively
 Proofread

12. Test analysis

 Test analysis is a detailed statistical assessment of a test’s psychometric properties,


including an evaluation of the quality of the test items and of the test as a whole. It
usually includes information such as the mean and standard deviation for the test scores
in the population used to develop the test, as well as data on the test’s reliability; it may
also include data on such factors as item difficulty value, item discriminability, and the
effect of item distractors.
 Item analysis also known as test analysis is a process which examines student responses
to individual test items (questions) in order to assess the quality of those items and of
the test as a whole. Item analysis is especially valuable in improving items which will be
used again in later tests, but it can also be used to eliminate ambiguous or misleading
items in a single test administration. In addition, item analysis is valuable for increasing
instructors’ skills in test construction, and identifying specific areas of course content
which need greater emphasis or clarity.
 Item Statistics -Item statistics are used to assess the performance of individual test
items on the assumption that the overall quality of a test derives from the quality of its
items. The ScorePak® item analysis report provides the following item information:
 Item Number -This is the question number taken from the student answer sheet, and
the ScorePak® Key Sheet. Up to 150 items can be scored on the Standard Answer Sheet.
 Mean and Standard Deviation- The mean is the “average” student response to an item.
It is computed by adding up the number of points earned by all students on the item,
and dividing that total by the number of students.
 The standard deviation, or S.D., is a measure of the dispersion of student scores on that
item. That is, it indicates how “spread out” the responses were. The item standard
deviation is most meaningful when comparing items which have more than one correct
alternative and when scale scoring is used. For this reason it is not typically used to
evaluate classroom tests.
 Item Difficulty- For items with one correct alternative worth a single point, the item
difficulty is simply the percentage of students who answer an item correctly. In this case,
it is also equal to the item mean. The item difficulty index ranges from 0 to 100; the
higher the value, the easier the question. When an alternative is worth other than a
single point, or when there is more than one correct alternative per question, the item
difficulty is the average score on that item divided by the highest number of points for
any one alternative.
 Item Discrimination - Item discrimination refers to the ability of an item to differentiate
among students on the basis of how well they know the material being tested. Various
hand calculation procedures have traditionally been used to compare item responses to
total test scores using high and low scoring groups of students (a small computational
sketch of difficulty and discrimination appears at the end of this section).
 Alternate Weight-This column shows the number of points given for each response
alternative. For most tests, there will be one correct answer which will be given one
point, but ScorePak® allows multiple correct alternatives, each of which may be
assigned a different weight.
 Means - The mean total test score (minus that item) is shown for students who selected
each of the possible response alternatives. This information should be looked at in
conjunction with the discrimination index; higher total test scores should be obtained by
students choosing the correct, or most highly weighted alternative. Incorrect
alternatives with relatively high means should be examined to determine why “better”
students chose that particular alternative.
 Frequencies and Distribution -The number and percentage of students who choose each
alternative are reported. The bar graph on the right shows the percentage choosing
each response; each “#” represents approximately 2.5%. Frequently chosen wrong
alternatives may indicate common misconceptions among the students.
 Difficulty and Discrimination Distributions
 At the end of the Item Analysis report, test items are listed according to their degrees of
difficulty (easy, medium, hard) and discrimination (good, fair, poor). These distributions
provide a quick overview of the test, and can be used to identify items which are not
performing well and which can perhaps be improved or discarded.
 Test Statistics- Two statistics are provided to evaluate the performance of the test as a
whole.
 Reliability Coefficient -The reliability of a test refers to the extent to which the test is
likely to produce consistent scores.
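
To make the item difficulty and item discrimination indices described above more concrete, here is a minimal Python sketch. The 0/1 response matrix, the variable names, and the upper/lower 27% grouping rule are illustrative assumptions used only for demonstration; they are not ScorePak® output.

# Minimal sketch: item difficulty and discrimination for dichotomously
# scored (0/1) items. Rows are students, columns are items; the data and
# the 27% upper/lower grouping convention are illustrative assumptions.

def item_difficulty(scores, item):
    """Percentage of students who answered the item correctly."""
    responses = [row[item] for row in scores]
    return 100.0 * sum(responses) / len(responses)

def item_discrimination(scores, item, group_fraction=0.27):
    """Proportion correct in the high-scoring group minus the low-scoring group."""
    totals = [sum(row) for row in scores]
    order = sorted(range(len(scores)), key=lambda i: totals[i])
    n_group = max(1, round(group_fraction * len(scores)))
    low, high = order[:n_group], order[-n_group:]
    p_high = sum(scores[i][item] for i in high) / n_group
    p_low = sum(scores[i][item] for i in low) / n_group
    return p_high - p_low

# Illustrative data: 6 students, 4 items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
for item in range(4):
    print("Item", item + 1,
          "difficulty:", round(item_difficulty(scores, item), 1),
          "discrimination:", round(item_discrimination(scores, item), 2))

Consistent with the definitions above, a higher difficulty value indicates an easier item, and a positive discrimination value indicates that high scorers answered the item correctly more often than low scorers did.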

13. Alternative assessment

 What is Alternative Assessment?


This is a method of evaluation that measures a student’s level of proficiency in a subject
as opposed to the student’s level of knowledge. The overall goal of alternative
assessment is to allow students to demonstrate their knowledge and execute tasks.
Alternative assessment is also called a performance test or authentic assessment
because it is deeply rooted in one’s ability to do something by leveraging newly-gained
knowledge. As part of the assessment, the student will need to perform meaningful
tasks that reflect a clear understanding of the teaching and learning objectives.
You can ask your students to create a portfolio, work with others on specific projects or
engage in any other type of activity that shows they have a full grasp of what has been
discussed in the class or training. Examples of alternative assessment include concept
maps, interviews, reports, collaborative testing, projects, portfolios, performance tests,
open tests, and crib sheets.

14. Portfolio Assessment

 What is Portfolio assessment?


A portfolio is a collection of student work that demonstrates progress and growth.
Teachers can determine if specific assessments should be present or involve students in
determining the success criteria for what is to be added. Portfolios can be paper or
digital and can provide an immense amount of insight into student learning over a
period of time.
The purpose of a portfolio is to collect student learning and demonstrate the specific
evidence of growth in a variety of standards and content. Using portfolios is an excellent
way to get students involved in the assessment process and for teachers to authentically
assess student growth. Portfolios can be used in lieu of testing or final projects.

15. Summarizing data and measures of central tendency.

 INTRODUCTION TO CENTRAL TENDENCY
 Measures of Central Tendency
 Summarizing Data: Central Tendency
 Characteristics of Mean
 Choosing an appropriate measure of central tendency

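Since the outline above only lists the topics, a brief illustration may help: the measures of central tendency most commonly used to summarize a set of scores are the mean, median, and mode. The following minimal Python sketch computes them for a made-up set of scores (the data are purely illustrative).

# Minimal sketch: mean, median, and mode of a small set of test scores.
# The scores are made-up sample data for illustration only.
from statistics import mean, median, mode

scores = [78, 85, 85, 90, 72, 85, 88, 90, 95]

print("Mean:", mean(scores))      # arithmetic average of all scores
print("Median:", median(scores))  # middle value when scores are sorted
print("Mode:", mode(scores))      # most frequently occurring score
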
16. Variability of scores

 Variability refers to how spread out scores are in a distribution; that is, it refers to the
amount of spread of the scores around the mean. For example, distributions with the
same mean can have different amounts of variability or dispersion.
There are four frequently used measures of the variability of a distribution:
 range
 interquartile range
 variance
 standard deviation

 Range - Let’s start with the range because it is the most straightforward measure of
variability to calculate and the simplest to understand. The range of a dataset is the
difference between the largest and smallest values in that dataset. For example, if
dataset 1 runs from 20 to 38 and dataset 2 runs from 11 to 52, then dataset 1 has a range
of 38 – 20 = 18 while dataset 2 has a range of 52 – 11 = 41. Dataset 2 has a broader range
and, hence, more variability than dataset 1 (a computational sketch of all four measures
appears at the end of this section).
 The Interquartile Range (IQR) and other Percentiles
 The interquartile range is the middle half of the data. To visualize it, think about the
median value that splits the dataset in half. Similarly, you can divide the data into
quarters. Statisticians refer to these quarters as quartiles and denote them from low to
high as Q1, Q2, and Q3. The lowest quartile (Q1) contains the quarter of the dataset
with the smallest values. The upper quartile (Q3) contains the quarter of the dataset
with the highest values. The interquartile range is the middle half of the data that is in
between the upper and lower quartiles. In other words, the interquartile range includes
the 50% of data points that fall between Q1 and Q3.
 Variance - Variance is the average squared difference of the values from the mean.
Unlike the previous measures of variability, the variance includes all values in the
calculation by comparing each value to the mean. To calculate this statistic, you
calculate a set of squared differences between the data points and the mean, sum them,
and then divide by the number of observations. Hence, it’s the average squared
difference.
 Standard Deviation - The standard deviation is the standard or typical difference
between each data point and the mean. When the values in a dataset are grouped
closer together, you have a smaller standard deviation. On the other hand, when the
values are spread out more, the standard deviation is larger because the standard
distance is greater.
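
As referenced above, here is a minimal Python sketch that computes all four measures for a small made-up dataset. The values, and the use of population rather than sample formulas, are illustrative assumptions; statistics.quantiles requires Python 3.8 or later.

# Minimal sketch: range, interquartile range, variance, and standard deviation.
# Population formulas (divide by N) are used; the data are illustrative only.
from statistics import pvariance, pstdev, quantiles

data = [20, 23, 25, 27, 28, 30, 31, 33, 35, 38]

data_range = max(data) - min(data)   # largest value minus smallest value
q1, q2, q3 = quantiles(data, n=4)    # quartile cut points Q1, Q2 (median), Q3
iqr = q3 - q1                        # spread of the middle half of the data
variance = pvariance(data)           # average squared deviation from the mean
std_dev = pstdev(data)               # square root of the variance

print("Range:", data_range)
print("IQR:", iqr)
print("Variance:", round(variance, 2))
print("Standard deviation:", round(std_dev, 2))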

17. Correlation

 Correlation is defined as a causal, complementary, parallel, or reciprocal relationship


found to exist between various variables examined during an investigation and based on
specific criteria. The value of a correlation coefficient can vary from minus one to plus
one. A minus one indicates a perfect negative correlation, while a plus one indicates a
perfect positive correlation.
A correlation of zero means there is no relationship between the two variables. When
there is a negative correlation between two variables, as the value of one variable
increases, the value of the other variable decreases, and vice versa. In other words, for a
negative correlation, the variables work opposite each other. When there is a positive
correlation between two variables, as the value of one variable increases, the value of
the other variable also increases. The variables move together.
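
To connect the definition above to a calculation, here is a minimal Python sketch that computes the Pearson correlation coefficient directly from its definition. The paired values (hours of study and test scores) are made up purely for illustration.

# Minimal sketch: Pearson correlation coefficient computed from its definition.
# The paired values are illustrative only.
from math import sqrt

x = [1, 2, 3, 4, 5, 6]        # e.g. hours of study (made-up data)
y = [55, 60, 68, 72, 80, 86]  # e.g. test scores (made-up data)

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Numerator: how the two variables vary together around their means.
numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
# Denominator: the product of each variable's own spread.
denominator = sqrt(sum((xi - mean_x) ** 2 for xi in x)
                   * sum((yi - mean_y) ** 2 for yi in y))

r = numerator / denominator
print(round(r, 3))  # a value near +1 indicates a strong positive correlation

Replacing y with values that decrease as x increases would drive r toward minus one, matching the description of a negative correlation above.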

18. Validity, Reliability and Error.

 Reliability means that the results obtained are consistent. Reliability is concerned with
the extent to which an experiment, test, or measurement procedure yields consistent
results on repeated trials. Reliability is the degree to which a measure is free from
random errors. But, due to the ever-present chance of random errors, we can never
achieve a completely error-free, 100% reliable measure. The risk of unreliability is
always present to a limited extent.
 Here are the basic methods for estimating the reliability of empirical measurements: 1)
Test-Retest Method, 2) Equivalent Form Method, and 3) Internal Consistency Method.
1. Test-Retest Method: The test-retest method repeats the measurement—repeats the
survey—under similar conditions. The second test is typically conducted among the
same respondents as the first test after a short period of time has elapsed.
2. Equivalent Form Method: The equivalent form method is used to avoid the problems
associated with the test-retest method. The equivalent form method
measures the ability of similar instruments to produce results that have a strong
correlation.
3. Internal Consistency and the Split-Half Method: These methods for establishing
reliability rely on the internal consistency of an instrument to produce similar results
on different samples during the same time period. Internal consistency is concerned
with equivalence.
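
As a concrete illustration of the internal consistency idea, here is a minimal split-half sketch in Python. The 0/1 item scores and the odd/even split are illustrative assumptions, and the final line applies the Spearman-Brown correction commonly used to estimate full-length reliability from a half-test correlation.

# Minimal sketch: split-half reliability with the Spearman-Brown correction.
# Rows are students, columns are item scores (1 = correct, 0 = incorrect);
# the data and the odd/even split are illustrative assumptions.
from math import sqrt

scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

# Total each student's score on the odd-numbered and even-numbered items.
odd_totals = [sum(row[0::2]) for row in scores]
even_totals = [sum(row[1::2]) for row in scores]

def pearson(x, y):
    """Correlation between the two half-test totals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

half_r = pearson(odd_totals, even_totals)
full_r = (2 * half_r) / (1 + half_r)  # Spearman-Brown step-up for the full-length test
print("Half-test correlation:", round(half_r, 3))
print("Estimated full-test reliability:", round(full_r, 3))
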
 Validity is the degree to which the researcher actually measures what he or she is trying
to measure. Validity is defined as the ability of an instrument to measure what the
researcher intends to measure. There are several different types of validity in social
science research. Each takes a different approach to assessing the extent to which a
measure actually measures what the researcher intends to measure. Each type of
validity has different meaning, uses, and limitations.
1. Face Validity: Face validity is the degree to which a measure is subjectively viewed as measuring
what it purports to measure. It is based on the researcher's judgment or the
collective judgment of a wide group of researchers. As such, it is considered the
weakest form of validity. With face validity, a measure "looks like it measures what
we hope to measure," but it has not been proven to do so.
2. Content Validity: Content validity is frequently considered equivalent to face validity.
Content or logical validity is the extent to which experts agree that the measure
covers all facets of the construct.
3. Criterion Validity: Criterion Validity measures how well a measurement predicts
outcome based on information from other variables. It measures the match
between the survey question and the criterion—content or subject area—it purports
to measure.
4. Construct Validity: Construct validity is the degree to which an instrument
represents the construct it purports to represent. It involves an understanding of the
theoretical foundations of the construct. A measure has construct validity when it
conforms to the theory underlying the construct.
 Random Errors: Random error is a term used to describe all chance or random factors
that confound—undermine—the measurement of any phenomena. Random errors in
measurement are inconsistent errors that happen by chance. They are inherently
unpredictable and transitory. Random errors include sampling errors, unpredictable
fluctuations in the measurement apparatus, or a change in a respondent's mood, which
may cause a person to offer an answer to a question that might differ from the one he
or she would normally provide. The amount of random errors is inversely related to the
reliability of a measurement instrument.[1] As the number of random errors decreases,
reliability rises and vice versa.
 Systematic Errors: Systematic or Non-Random Errors are a constant or systematic bias in
measurement. Here are two everyday examples of systematic error: 1) Imagine that
your bathroom scale always registers your weight as five pounds lighter than it actually
is and 2) The thermostat in your home says that the room temperature is 72º, when it is
actually 75º. The amount of systematic error is inversely related to the validity of a
measurement instrument.[2] As systematic errors increase, validity falls and vice versa.

19. Basic principles of guidance and counseling.

