Self Review Task - Bantayan
Test - Simply put, a test refers to a tool, technique, or method that is intended to
measure students’ knowledge or their ability to complete a particular task. In this sense,
testing can be considered a form of assessment. Tests should meet some basic requirements, such as validity and reliability.
Measurement - Measurement is a specific process through which a learning experience, phenomenon, or context is translated into a representative set of numerical variables.
Evaluation - Evaluation is the process of making judgements about the quality, worth, or value of what has been measured or assessed, based on established criteria.
Assessment - Assessment is the process of collecting information about learners using different methods or tools (e.g., tests, quizzes, portfolios).
Assessment for learning (Diagnostic assessment) involves the use of information about
student progress to support and improve student learning, inform instructional
practices, and: is teacher-driven for student, teacher, and parent use; occurs throughout
the teaching and learning process, using a variety of tools; and engages teachers in
providing differentiated instruction, feedback to students to enhance their learning, and
information to parents in support of learning.
Assessment as learning (formative assessment) actively involves student reflection on
learning, monitoring of his/her own progress, and: supports students in critically
analyzing learning related to curricular outcomes; is student-driven with teacher
guidance; and occurs throughout the learning process.
Assessment of learning (summative assessment) involves teachers’ use of evidence of
student learning to make judgements about student achievement and: provides
opportunity to report evidence of achievement related to curricular outcomes; occurs at
the end of a learning cycle using a variety of tools; and provides the foundation for discussions on placement or promotion.
A sound assessment should satisfy the following criteria:
Validity
Reliability
Currency
Consistency
Practicability
Fairness/absence of bias
Relevance
Authenticity
Sufficiency
As an assessor, you need to be aware of the danger that your assessment decisions can be
biased by factors that should have no bearing on the assessment process or your judgement.
These factors include:
Appearance and dress - should not be allowed to influence your decision, unless these
are explicitly stated in the standards.
The ‘halo and horns’ effect. The halo effect can emerge when you are familiar with your
learners, and a good performance in the past leads you to assume that they are
performing well at present even though they may not be. The horns effect is where no
matter how well your candidates are currently performing, your judgment of poor
performance in the past continues to influence your assessment decisions.
6. Types of tests
Diagnostic Testing - This testing is used to “diagnose” what a student knows and does
not know. Diagnostic testing typically happens at the start of a new phase of education,
like when students will start learning a new unit. The test covers topics students will be
taught in the upcoming lessons.
Formative Testing - This type of testing is used to gauge student learning during the lesson. It is used throughout a lecture and is designed to give students the opportunity to demonstrate that they have understood the material. This informal, low-stakes testing happens in an ongoing
manner, and student performance on formative testing tends to get better as a lesson
progresses.
Benchmark Testing -This testing is used to check whether students have mastered a unit
of content. Benchmark testing is given while or after a class focuses on a section of material, and covers either part or all of the content that has been taught up to that
time. The assessments are designed to let teachers know whether students have
understood the material that’s been covered.
Summative Testing- This testing is used as a checkpoint at the end of the year or course
to assess how much content students learned overall. This type of testing is similar to
benchmark testing, but instead of only covering one unit, it cumulatively covers
everything students have been spending time on throughout the year.
Instructional Goals are broad, generalized statements about what is to be learned. Think
of them as a target to be reached, or "hit."
Instructional objectives are the foundation upon which you can build lessons and
assessments that you can prove meet your overall course or lesson goals. Think of
objectives as tools you use to make sure you reach your goals. They are the arrows you
shoot towards your target (goal).
The purpose of objectives is not to restrict spontaneity or constrain the vision of
education in the discipline; but to ensure that learning is focused clearly enough that
both students and teacher know what is going on, and so learning can be objectively
measured.
Different archers have different styles, so do different teachers. Thus, you can shoot
your arrows (objectives) many ways. The important thing is that they reach your target
(goals) and score that bullseye!
Thus, stating clear course objectives is important because:
They provide you with a solid foundation for designing relevant activities and
assessment. Activities, assessment and grading should be based on the objectives.
As you develop a learning object, course, a lesson or a learning activity, you have to
determine what you want the students to learn and how you will know that they
learned. Instructional objectives, also called behavioral objectives or learning objectives, are a requirement for the high-quality development of instruction.
They help you identify critical and non-critical instructional elements.
They help remove your subjectivity from the instruction.
They help you design a series of interrelated instructional topics.
Students will better understand expectations and the link between expectations,
teaching and grading.
Through this breakdown of each of the domains of Bloom’s Taxonomy, it’s clear how the taxonomy can cater to all kinds of learners and attempt to meet a vast collection of learning requirements. While it isn’t necessary for learners to experience all three domains, the cognitive domain is usually considered indispensable in any learning process.
The fundamental principles of test construction are (a) validity, (b) reliability, (c) standardisation, and (d) evaluation.
(a) Validity:
Tests should have validity, that is, they should actually measure what they purport to
measure. A perfectly valid test would place prospective employees in exactly the same relationship to one another as they would stand after trial on the job.
(b) Reliability:
By the reliability of a test is meant the consistency with which it serves as a measuring instrument. If a test is reliable, a person taking it at two different times should make substantially the same score each time. Even under ideal conditions, a test can never be more
than a sample of the ability being measured. No test is of value in personnel work unless
it has a high degree of reliability.
(c) Standardisation:
The process of Standardisation includes:
1. The scaling of test items in term of difficulty, and
2. The establishment of norms.
More important as an element in the Standardisation of personnel tests is the scaling of
test items in terms of difficulty. To be of functional value in a test, each item must be of such difficulty as to be missed by some of the examinees but not by all.
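To make the scaling of items by difficulty concrete, here is a minimal sketch in Python. The response matrix is hypothetical, purely for illustration; real standardisation uses large norming samples and more elaborate item analysis.

```python
import numpy as np

# Hypothetical scored responses: rows = examinees, columns = test items,
# 1 = correct, 0 = incorrect.
responses = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
])

# Difficulty index p = proportion of examinees answering the item correctly.
difficulty = responses.mean(axis=0)

for item, p in enumerate(difficulty, start=1):
    # An item missed by nobody (p = 1.0) or by everybody (p = 0.0) tells us
    # nothing about differences between examinees, so flag it for review.
    flag = "review" if p in (0.0, 1.0) else "ok"
    print(f"Item {item}: difficulty p = {p:.2f} ({flag})")
```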
(d) Evaluation:
The evaluation of test results, involving as it does all the problems of scoring and weighting of items and the assignment of relative weights to tests used in a battery, is surrounded with highly technical considerations.
Advantages of Test:
(i) Proper Assessment: Tests provide a basis for finding out the suitability of candidates
for various jobs.
The mental capability, aptitude, liking and interests of the candidates enable the
selectors to find out whether a person is suitable for the job for which he is a candidate.
(ii) Objective Assessment: Tests provide better objective criteria than any other method.
Subjectivity of every type is almost eliminated.
(iii) Uniform Basis: Tests provide a uniform basis for comparing the performance of
applicants. The same tests are given to all candidates, and their scores enable selectors to compare their performance.
(iv) Selection of Better Persons: The aptitude, temperament and adjustability of
candidates are determined with the help of tests. This enables their placement on the
jobs where they will be most suitable. It will also improve their efficiency and job
satisfaction.
(v) Labour Turnover Reduced: Proper selection of persons will also reduce labour
turnover. If suitable persons are not selected, they may leave their job sooner or later.
Tests are helpful in finding out the suitability of persons for the jobs. Interest tests will
help in knowing the liking of applicants for different jobs. When a person gets a job
according to his temperament and interest he would not leave it.
Disadvantages of Tests:
Tests suffer from the following disadvantages:
(i) Unreliable: The inferences drawn from the tests may not be correct in certain cases.
The skill and ability of a candidate may not be properly judged with the help of tests.
(ii) Wrong Use: The tests may not be properly used by those administering them. The persons conducting these tests may be biased towards certain candidates, which will falsify the results of the tests. Tests may also give unreliable results if used by incompetent persons.
(iii) Fear of Exposure: Some persons may not submit to the tests for fear of exposure.
They may be competent but may not like to be assessed through the tests. The
enterprise may be deprived of the services of such personnel who are not willing to
appear for the tests but are otherwise suitable for the concern.
Review the checklist for each item type (pp. 178, 185, 190, 214, 232, 248)
5. Prepare directions
14. Portfolio Assessment
Variability refers to how spread out the scores in a distribution are; that is, it refers to the amount of spread of the scores around the mean. For example, distributions with the
same mean can have different amounts of variability or dispersion.
There are four frequently used measures of the variability of a distribution:
range
interquartile range
variance
standard deviation
Range - Let’s start with the range because it is the most straightforward measure of variability to calculate and the simplest to understand. The range of a dataset is the difference between the largest and smallest values in that dataset. For example, if dataset 1 spans the values 20 to 38, it has a range of 38 - 20 = 18, while a dataset 2 spanning 11 to 52 has a range of 52 - 11 = 41. Dataset 2 has a broader range and, hence, more variability than dataset 1.
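As a quick illustration, the following Python snippet computes the range of two hypothetical datasets whose endpoints match the figures above; the in-between values are made up for the example.

```python
# Range = largest value minus smallest value.
dataset1 = [20, 24, 25, 28, 31, 33, 38]   # spans 20 to 38
dataset2 = [11, 19, 26, 33, 40, 47, 52]   # spans 11 to 52

range1 = max(dataset1) - min(dataset1)    # 38 - 20 = 18
range2 = max(dataset2) - min(dataset2)    # 52 - 11 = 41

print(range1, range2)  # 18 41 -> dataset 2 is more spread out
```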
The Interquartile Range (IQR) and other Percentiles
The interquartile range is the middle half of the data. To visualize it, think about the
median value that splits the dataset in half. Similarly, you can divide the data into
quarters. Statisticians refer to these quarters as quartiles and denote them from low to high as Q1, Q2, Q3, and Q4. The lowest quartile (Q1) contains the quarter of the dataset with the smallest values. The upper quartile (Q4) contains the quarter of the dataset
with the highest values. The interquartile range is the middle half of the data that is in
between the upper and lower quartiles. In other words, the interquartile range includes
the 50% of data points that fall between Q1 and Q3.
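A short sketch of the IQR computation using NumPy's percentile function; the data values are invented for illustration.

```python
import numpy as np

data = np.array([11, 19, 26, 33, 40, 47, 52])  # hypothetical scores

q1, q3 = np.percentile(data, [25, 75])  # lower and upper quartile cut points
iqr = q3 - q1                           # spread of the middle 50% of scores

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```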
Variance - Variance is the average squared difference of the values from the mean.
Unlike the previous measures of variability, the variance includes all values in the
calculation by comparing each value to the mean. To calculate this statistic, you
calculate a set of squared differences between the data points and the mean, sum them,
and then divide by the number of observations. Hence, it’s the average squared
difference.
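The calculation described above translates directly into code. This sketch computes the population variance by hand on hypothetical data and checks the result against NumPy's built-in function.

```python
import numpy as np

data = np.array([11, 19, 26, 33, 40, 47, 52], dtype=float)

mean = data.mean()
squared_diffs = (data - mean) ** 2          # squared difference of each value from the mean
variance = squared_diffs.sum() / len(data)  # divide by the number of observations

# np.var uses the same population formula (ddof=0) by default.
assert np.isclose(variance, np.var(data))
print(variance)
```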
Standard Deviation - The standard deviation is the standard or typical difference
between each data point and the mean. When the values in a dataset are grouped
closer together, you have a smaller standard deviation. On the other hand, when the
values are spread out more, the standard deviation is larger because the standard
distance is greater.
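The standard deviation is simply the square root of the variance, which brings the measure back into the original units of the data. A minimal sketch with two invented datasets:

```python
import numpy as np

tight = np.array([28.0, 29.0, 30.0, 31.0, 32.0])   # values grouped close together
spread = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # values spread out more

# Standard deviation = square root of the (population) variance.
print(np.std(tight))   # about 1.4: small typical distance from the mean
print(np.std(spread))  # about 14.1: large typical distance from the mean
```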
17. Correlation
Reliability means that the results obtained are consistent. Reliability is concerned with the extent to which an experiment, test, or measurement procedure yields consistent results on repeated trials. Reliability is the degree to which a measure is free from random errors. But, due to the ever-present chance of random errors, we can never
achieve a completely error-free, 100% reliable measure. The risk of unreliability is
always present to a limited extent.
Here are the basic methods for estimating the reliability of empirical measurements: 1)
Test-Retest Method, 2) Equivalent Form Method, and 3) Internal Consistency Method.
1. Test-Retest Method: The test-retest method repeats the measurement (repeats the survey) under similar conditions. The second test is typically conducted among the same respondents as the first test after a short period of time has elapsed (a sketch of this computation appears after this list).
2. Equivalent Form Method: The equivalent form method is used to avoid problems with the test-retest method, such as respondents remembering their earlier answers. The equivalent form method
measures the ability of similar instruments to produce results that have a strong
correlation.
3. Internal Consistency and the Split-Half Method: These methods for establishing
reliability rely on the internal consistency of an instrument to produce similar results
on different samples during the same time period. Internal consistency is concerned
with equivalence.
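Here is a minimal sketch, in Python, of the first and third methods; all scores are invented for illustration. Test-retest reliability is estimated as the correlation between the two administrations, and split-half reliability correlates two halves of one administration and then adjusts the result upward with the Spearman-Brown formula.

```python
import numpy as np

# --- Test-retest: same respondents, same instrument, two occasions ---
first = np.array([12, 15, 9, 18, 14, 11, 16])    # hypothetical scores, time 1
second = np.array([13, 14, 10, 17, 15, 10, 17])  # same respondents, time 2

test_retest_r = np.corrcoef(first, second)[0, 1]
print(f"test-retest reliability: {test_retest_r:.2f}")

# --- Split-half: one administration, items split into two halves ---
# Hypothetical item scores: rows = respondents, columns = items (1 = correct).
items = np.array([
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
])

odd_half = items[:, 0::2].sum(axis=1)   # total score on odd-numbered items
even_half = items[:, 1::2].sum(axis=1)  # total score on even-numbered items

half_r = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown correction: estimates the reliability of the full-length
# test from the correlation between its two halves.
split_half_reliability = 2 * half_r / (1 + half_r)
print(f"split-half reliability: {split_half_reliability:.2f}")
```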
Validity is the degree to which the researcher actually measures what he or she is trying to measure. Validity is defined as the ability of an instrument to measure what the
researcher intends to measure. There are several different types of validity in social
science research. Each takes a different approach to assessing the extent to which a
measure actually measures what the researcher intends to measure. Each type of
validity has different meaning, uses, and limitations.
1. Face Validity: Face validity is the degree to which a measure is subjectively viewed as measuring what it purports to measure. It is based on the researcher's judgment or the
collective judgment of a wide group of researchers. As such, it is considered the
weakest form of validity. With face validity, a measure "looks like it measures what
we hope to measure," but it has not been proven to do so.
2. Content Validity: Content validity is frequently considered equivalent to face validity.
Content or logical validity is the extent to which experts agree that the measure
covers all facets of the construct.
3. Criterion Validity: Criterion validity measures how well a measurement predicts an outcome based on information from other variables. It measures the match
between the survey question and the criterion—content or subject area—it purports
to measure.
4. Construct Validity: Construct validity is the degree to which an instrument
represents the construct it purports to represent. It involves an understanding of the theoretical foundations of the construct. A measure has construct validity when it conforms to the theory underlying the construct.
Random Errors: Random error is a term used to describe all chance or random factors that confound (undermine) the measurement of any phenomenon. Random errors in
measurement are inconsistent errors that happen by chance. They are inherently
unpredictable and transitory. Random errors include sampling errors, unpredictable
fluctuations in the measurement apparatus, or a change in a respondent’s mood, which
may cause a person to offer an answer to a question that might differ from the one he
or she would normally provide. The amount of random errors is inversely related to the
reliability of a measurement instrument.[1] As the number of random errors decreases,
reliability rises and vice versa.
Systematic Errors: Systematic or Non-Random Errors are a constant or systematic bias in
measurement. Here are two everyday examples of systematic error: 1) Imagine that
your bathroom scale always registers your weight as five pounds lighter than it actually
is and 2) The thermostat in your home says that the room temperature is 72º, when it is
actually 75º. The amount of systematic error is inversely related to the validity of a
measurement instrument.[2] As systematic errors increase, validity falls and vice versa.
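To see why random error undermines reliability while systematic error undermines validity, here is a small simulation sketch; the true scores, noise level, and bias are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_scores = rng.normal(loc=75, scale=10, size=1000)  # hypothetical true values

# Random error: independent noise added at each measurement occasion.
measure1 = true_scores + rng.normal(0, 8, size=1000)
measure2 = true_scores + rng.normal(0, 8, size=1000)
# Consistency across repeated measurements (reliability) drops well below 1.
print(np.corrcoef(measure1, measure2)[0, 1])   # roughly 0.6

# Systematic error: a constant bias, like a scale that reads 5 units low.
biased = true_scores - 5
# Perfectly consistent, so reliability is untouched ...
print(np.corrcoef(biased, true_scores)[0, 1])  # 1.0
# ... but every reading is wrong by the same amount, so validity suffers.
print(true_scores.mean() - biased.mean())      # constant offset of 5.0
```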