
Running head: TESTING IN THE CLASSROOM 1

Testing in the Classroom and its Effectiveness in Predicting Student

Achievement and Understanding

Ecaroh Jackson

Texas A&M University



Abstract

This paper explores the validity of summative assessments in K-12 classrooms, branching off from the familiar standardized testing conversation. In many ways, standardized testing and classroom assessments are similar, but the sheer frequency of classroom assessments makes them a focal point deserving of their own research. While summative assessments are not usually used to determine accountability through the state, they can be used for school- or district-level monitoring. Given the flexibility allowed when testing students informally, the data these tests produce may be less legitimate than assumed. Additionally, factors such as dishonesty, test anxiety, and human error are common in the classroom testing environment. This paper examines the test scores of students in a high-achieving eighth-grade class and compares those results to their performances in class.

Keywords: Testing, Student Achievement, Student Understanding, Test Anxiety



Testing in the Classroom and its Effectiveness in Predicting Student

Achievement and Understanding

Through the years, testing in schools has been subject to praise and criticism that has molded the current testing model. Although criticism of testing is probably at an all-time high, testing has never been more prevalent. Starting in the third grade, students are subjected to standardized testing that holds the key to many of their futures. Additionally, because of the No Child Left Behind Act, students are tested for classroom placement, disabilities, and general performance. High school students rely on ACT and SAT scores to land them a position at top universities, and undergraduate students focus on acing the GRE for admission into graduate programs. Then, after all of this time spent in school, students must test again to gain certification in their specific fields. Countless studies have examined the validity of standardized test scores, and while that work is a crucial focal point of testing in schools, it is also important to consider the testing that happens inside classrooms. This testing involves summative assessments, ranging from unit tests to benchmarks, that are given more frequently than standardized tests. Like standardized testing, summative testing in classrooms may not be as indicative of student success and understanding as once thought and should be viewed more critically.

Background

The students involved in this study are eighth-grade students taking Algebra I at the high school level. They were individually selected for this class by their Algebra I teacher, who observed them in previous grades to see whether they had the skills needed to make the jump from seventh-grade math to Algebra I. The demographics of the class somewhat resemble those of the school, which is 42.7% White, 30.2% African American, 24.7% Hispanic, and 2.4% Two or More Races (Murphy & Daniel, 2015). The eighth-grade Algebra I class is 50% White, 42% Black, and 8% Hispanic (Murphy & Daniel, 2015). The class demographics are slightly skewed because of the small sample size. This was unavoidable, since the secondary school, grades 6-12, consists of only 255 students, with an average of 8 students per teacher (Murphy & Daniel, 2015).

During this study, I had the opportunity to observe the twelve students twice a day: once in science and once in math. The data presented were taken entirely from the math class, but the anecdotal evidence was gathered from observations throughout the day.

Methodology

Two types of measurements were taken over the course of twenty-four weeks. For the purpose of analysis, the data collected were separated into four six-week periods. Summative assessments, which included unit tests, were tracked, and formative assessments, consisting of homework, worksheets, and quizzes, were documented. The data were compiled at the conclusion of the study and used to identify trends pertaining to testing and achievement. Additionally, the students' test anxiety, dishonesty, and likelihood of making minute "human errors" were evaluated. To determine the students' levels of test anxiety, they were given a questionnaire created by Nist and Diehl (see Appendix A for the questionnaire given). I created three categories for human error mistakes and dishonesty instances: Low, Medium, and High. I defined human error mistakes as mistakes due to a calculation or transfer error; this does not include mistakes due to a lack of understanding. Low meant that a student rarely (fewer than 10 instances) made human error mistakes, Medium included 10 to 19 instances, and High signified 20 or more human error mistakes. These values were derived from the students' turned-in work and from in-class work that I assisted with. The same scale was used for dishonesty instances, which in this study are defined as instances where a sheet of homework was copied or where a student was caught cheating or attempting to cheat on a test or quiz.
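The Low/Medium/High cutoffs above can be summarized in a short helper function. This is an illustrative sketch only, not part of the study's instruments; the function name and structure are mine, but the boundaries follow the definitions given in this section.

```python
def categorize_instances(count):
    """Map a count of human-error mistakes (or dishonesty instances)
    to the Low/Medium/High scale defined in this study."""
    if count < 10:        # rarely: fewer than 10 instances
        return "Low"
    elif count < 20:      # 10 to 19 instances
        return "Medium"
    else:                 # 20 or more instances
        return "High"
```

The same function applies to both human error mistakes and dishonesty instances, since both use the same scale.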

Results

Student        Overall     Overall Assessment   Overall Summative   Overall Formative
Pseudonym      Average     Average              Average             Average
Kyle 93 90.9375 88.375 93.5
Hannah 96.75 96 95 97
Courtney 98.5 98 98.5 97.5
Katie 85.5 80.375 75 85.75
Haylie 91.25 89.75 87 92.5
John 82.5 79.8125 77.375 82.25
Bill 77.5 73.75 70.25 77.25
Bob 101.25 100.5625 100.375 100.75
Ashley 80.25 79.1875 79.875 78.5
Tim 92 90.0625 88.125 92
Reed 77.25 68.0625 59.125 77
Rylie 79 80.25 79.16666667 81.33333333
Averages 88.0625 85.75520833 83.42708333 88.08333333
Overall Average - average of the four six-week gradebook averages
Overall Assessment Average - average of the four six-week summative & formative averages

Discussion

The overall average is higher than the overall assessment average because bonus points are given for signed progress reports and the lowest two minor grades each six weeks are dropped. Additionally, students have a binder check each six weeks that serves as a major grade; as long as they take their notes, they receive a 100 as a major grade. The overall formative assessment average was 4.65625 points higher than the overall summative assessment average, meaning that the students consistently performed better on their homework assignments than on their tests. This difference could be attributed to multiple factors: the lowest two homework grades are dropped each six weeks, homework is easier to cheat on than tests, and we frequently help the students when they do their homework. For the few students whose summative assessment averages exceeded their formative assessment averages, the discrepancy can most likely be attributed to not turning in assignments or turning them in partially finished. Another interesting aspect of this data is that the students with the most instances of dishonesty tended to have the lowest formative assessment averages. This is intriguing: the students who cheated or attempted to cheat still received lower grades on their assignments than their peers who did not cheat.
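The 4.65625-point gap can be checked from the class-wide six-week averages reported in Tables 2-4. The brief sketch below is illustrative only; it recomputes the overall averages the same way the tables do, as the mean of the four six-week class averages.

```python
# Class-wide summative and formative averages for each
# six-week period, copied from Tables 2-4.
summative = [86.5, 83.54166667, 79.5, 84.16666667]
formative = [88.0, 88.16666667, 88.0, 88.16666667]

# Overall averages are the means of the four six-week class averages.
overall_summative = sum(summative) / 4   # approximately 83.42708
overall_formative = sum(formative) / 4   # approximately 88.08333
gap = overall_formative - overall_summative
print(round(gap, 5))  # 4.65625
```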

I was slightly shocked when I discovered the substantial difference between the students' major and minor grade averages. The students receive ample help when taking their tests and are able to ask an unlimited number of questions. Although not all of the students take advantage of our willingness to help, the majority do seek assistance when in need. This is where the human error aspect comes in. The students who frequently made minute errors that caused them to miss questions rarely asked about those specific problems, because they had done the steps correctly and were confident that they had not made any mistakes. It was not their understanding that was lacking, but rather their inattentiveness.

The test anxiety survey I administered to the students was brief so that I would get genuine answers from them. Students' scores fell into all three categories of test anxiety (low, moderate, high). There was no correlation between students' test anxiety scores and their summative assessment scores. Only one student scored high on the test anxiety questionnaire. This particular student also had one of the largest differentials (5.125) between her formative and summative assessment averages, which could indicate that test anxiety affected her performance on tests. The student with the highest averages in all categories and the lowest differential (0.375) between his formative and summative assessment averages scored low on the test anxiety questionnaire. While this could signify that Bob's test grades are a decent representation of his understanding and achievement, there were too many contradictions in the data for me to draw these conclusions.

Other Studies

Interestingly, as I scoured the internet for information pertaining to testing in the classroom, I found few scholarly articles on the topic. By contrast, the more notorious form of testing, standardized testing, has an overabundance of data available to review. While I agree that standardized testing should be the focal point of current studies, because it not only affects our students temporarily but also informs the curriculum our educators use, I think it is also imperative that we pay closer attention to the summative assessments our students are given in class.

Testing Action Plan (Obama)

In October of 2015, President Barack Obama announced his administration's Testing Action Plan, which sought to reexamine how tests are utilized in schools (Arnett, 2016). At one point in the announcement, President Obama said that students "should only take tests that are worth taking — tests that are high quality, aimed at good instruction, and make sure everyone is on track" (Arnett, 2016). He also said that assessments should not consume students' classroom time and should be only one of many tools used to identify student progress (Arnett, 2016). The administration went as far as saying that low-quality, redundant, and unnecessary tests should be eliminated, and it listed seven crucial points that assessments should satisfy. These points are listed below.



• They must be worth taking: "Testing should be a part of good instruction, not a departure from it."
• They must be high-quality: "High-quality assessment results in actionable, objective information about student knowledge and skills."
• They must be time-limited.
• They must be fair: "Assessments should be fair, including providing fair measures of student learning for students with disabilities and English learners. Accessibility features and accommodations must level the playing field so tests accurately reflect what students really know and can do."
• They must be "fully transparent" to students and parents: "States and districts should ensure that every parent gets understandable information about the assessments their students are taking."
• They must be just one evaluation measure: "Assessments provide critical information about student learning, but no single assessment should ever be the sole factor in making an educational decision about a student, an educator, or a school."
• They must be tied to improved learning: "While some tests are for accountability purposes only, the vast majority of assessments should be tools in a broader strategy to improve teaching and learning." (Strauss, 2015)

This plan was teacher-led and had a four-pronged approach that included "financial support for states to develop and use better, less burdensome assessments, expertise to states and school districts looking to reduce time spent on testing, flexibility from federal mandates and greater support to innovate and reduce testing, and reducing the reliance on student test scores through our rules and executive actions" (U.S. Department of Education, 2015). While the plan was admirably ambitious, reversing a culture of testing is easier said than done. The plan was not deemed successful, and standardized testing, as well as frequent classroom assessment, remains prevalent today (Strauss, 2015).

Although Obama's Testing Action Plan was not carried out fully, I think its ideas were sound and could serve as the foundation of a new plan that shifts the focus away from test-based data and toward a more holistic approach.

Negative Effects of Testing

At my current school, students are tested at least three times every six weeks, an average of one test every two weeks. This is not necessarily problematic, but it may be unnecessary if the results are not indicative of the students' actual levels of understanding. Additionally, for students, the term "test" already has a negative connotation. Every time a new test is announced, the announcement is contested and met with groans. Students do not want to be assessed at all, so over-testing them benefits neither the students nor the teachers. According to Shepard, Penuel, and Pellegrino (2018), classroom assessments should be used to support learning rather than follow a "business-as-usual" model. A shared curriculum is not in place for every school in every state, so state standards and assessments cannot possibly be fully aligned with classroom assessments. The main concern is that students may excel on classroom assessments yet fail to perform on state assessments. Neither assessment alone can verify a student's understanding of the material, and neither provides an accurate, holistic report of a student's achievement. Although neither assessment is enough to classify a student as meeting or not meeting expectations, state assessments alone are used for district, school, and teacher accountability. If there is such a discrepancy between state and local assessments, then neither should be used alone to determine student success (Shepard, Penuel, & Pellegrino, 2018).

Students are well aware of the accountability systems in place, which unfortunately likely affects the way they view and perform on tests. Since state assessments carry more weight and can affect a student's future, students will likely take them more seriously than a classroom assessment that will only be viewed by their teacher and administrator. This phenomenon can be explained using the expectancy-value theory of motivation. An excerpt explaining this theory in detail is included below:

Applied to large-scale testing, expectancy-value theory states that a test taker's motivation to engage in activities related to large-scale testing depends on their belief about experiencing success on the test and the value that they place on the content, process, and/or outcomes of the test. That is, if a test taker believes they will experience success on the large-scale test and they value it, they are more likely to be motivated and engage with the tasks to the best of their ability. (Barneveld & Brinson, 2017)

The amount of effort put into a test can predict its results. As educators, we need results based on learning rather than on the effort given on a particular day. This is one reason why testing alternatives may not be an unreasonable concept in education's future.

Classroom Testing Alternatives

The current model of educational assessment in classrooms includes summative assessments and formative assessments. Typically, formative assessments are used by teachers to inform their teaching practices and help them determine what to modify and what to keep, while summative assessments are used mostly for data collection by school districts and states. Although each type of test has its niche, both can be used interchangeably in the classroom. By varying the type of assessment given to students, the burnout often seen with typical tests may be reduced. Dixson and Worrell (2016) discuss two types of formative assessment in their article: spontaneous and planned. Spontaneous formative assessments are not reasonable data-collecting assessments, but formative assessments such as quizzes and homework can be if done correctly. Formative assessments are usually given frequently throughout the school year and can provide a more in-depth look into student success than one standardized test can. Examples of formative assessments that can be taken as grades include major projects, portfolios, worksheets, quizzes, homework assignments, and exit tickets. These assessments, while valuable, cannot and will not completely replace testing in classrooms. Instead, teachers can hopefully find a balance between formative and summative assessments so that students' achievement will not be completely dependent on tests they do not want to take.

Conclusions and Future Study

I am well aware that testing in the classroom is not going anywhere anytime soon, due to the accountability constraints created by school districts and the state, but there are steps we can take to lessen it. First and foremost, testing in the classroom needs to be evaluated and studied further. Like standardized testing, in-class testing can carry high stakes and therefore cause unnecessary stress for students and teachers. Additionally, with more studies, the accuracy and validity of classroom testing can be analyzed to see how informative it actually is regarding students' achievement and understanding. If classroom testing is not serving its purpose, then educators need to reevaluate their assessment methods and determine whether constant testing is essential to students' success. More specifically, educators should make sure that their assessment strategies are benefiting students rather than hindering them.

References

Arnett, A. A. (2016, April 18). Why testing prevails in K-12 education. Retrieved from https://www.educationdive.com/news/why-testing-prevails-in-k-12-education-1/417294/

Barneveld, C. V., & Brinson, K. (2017). The rights and responsibility of test takers when large-scale testing is used for classroom assessment. Canadian Journal of Education, 40(1), 1-22.

Dixson, D. D., & Worrell, F. C. (2016). Formative and summative assessment in the classroom. Theory Into Practice, 55(2), 153-159. doi:10.1080/00405841.2016.1148989

Murphy, R., & Daniel, A. (2015, December 08). Snook Secondary School. Retrieved from https://schools.texastribune.org/districts/snook-isd/snook-secondary-school/

Shepard, L. A., Penuel, W. R., & Pellegrino, J. W. (2018). Classroom assessment principles to support learning and avoid the harms of testing. Educational Measurement: Issues and Practice, 37(1), 52-57. doi:10.1111/emip.12195

Strauss, V. (2015, October 27). Why Obama's new plan to cap standardized testing won't work. Retrieved from https://www.washingtonpost.com/news/answer-sheet/wp/2015/10/27/why-obamas-new-plan-to-cap-standardized-testing-wont-work/?utm_term=.673fdcff2119

U.S. Department of Education. (2015, October 24). Fact sheet: Testing action plan. Retrieved from https://www.ed.gov/news/press-releases/fact-sheet-testing-action-plan

Appendix A

Test Anxiety Questionnaire from Nist and Diehl

Nist and Diehl (1990) developed a short questionnaire for determining whether a student experiences a mild or severe case of test anxiety. To complete the evaluation, read through each statement and reflect upon past testing experiences. You may wish to consider all testing experiences or focus on a particular subject (history, science, math, etc.) one at a time. Indicate how often each statement describes you by choosing a number from one to five as outlined below.

1 = Never    2 = Rarely    3 = Sometimes    4 = Often    5 = Always

1. _____ I have visible signs of nervousness such as sweaty palms, shaky hands, and so on right before a test.

2. _____ I have “butterflies” in my stomach before a test.

3. _____ I feel nauseated before a test.

4. _____ I read through the test and feel that I do not know any of the
answers.

5. _____ I panic before and during a test.

6. _____ My mind goes blank during a test.

7. _____ I remember the information that I blanked on once I get out of the testing situation.

8. _____ I have trouble sleeping the night before a test.

9. _____ I make mistakes on easy questions or put answers in the wrong places.

10. _____ I have trouble choosing answers.



DETERMINE YOUR RESULTS


• Add up your scores on all ten statements; totals will range from 10 to 50.
• A low score (10-19) indicates that you do not suffer from test anxiety. In fact, if your score was extremely low (close to 10), a little more anxiety may be healthy to keep you focused and to get your blood flowing during exams.
• Scores between 20 and 35 indicate that, although you exhibit some of the characteristics of test anxiety, the level of stress and tension is probably healthy.
• Scores over 35 suggest that you are experiencing an unhealthy level of anxiety; you should evaluate the reason(s) for the stress and identify strategies for compensating.
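The scoring bands above can be restated compactly. The helper below is an illustrative sketch, not part of Nist and Diehl's instrument; it maps a questionnaire total to the Low/Moderate/High labels used in Table 5.

```python
def interpret_anxiety_total(total):
    """Interpret a Nist & Diehl questionnaire total (range 10-50)."""
    if not 10 <= total <= 50:
        raise ValueError("questionnaire totals range from 10 to 50")
    if total <= 19:
        return "low"        # little or no test anxiety
    elif total <= 35:       # some characteristics, probably healthy
        return "moderate"
    else:                   # over 35: unhealthy level of anxiety
        return "high"
```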

Tables

Table 1

Overall Student Data

Student        Overall     Overall Assessment   Overall Summative   Overall Formative
Pseudonym      Average     Average              Average             Average
Kyle 93 90.9375 88.375 93.5
Hannah 96.75 96 95 97
Courtney 98.5 98 98.5 97.5
Katie 85.5 80.375 75 85.75
Haylie 91.25 89.75 87 92.5
John 82.5 79.8125 77.375 82.25
Bill 77.5 73.75 70.25 77.25
Bob 101.25 100.5625 100.375 100.75
Ashley 80.25 79.1875 79.875 78.5
Tim 92 90.0625 88.125 92
Reed 77.25 68.0625 59.125 77
Rylie 79 80.25 79.16666667 81.33333333
Averages 88.0625 85.75520833 83.42708333 88.08333333

Note: This table shows all of the students' overall data from the first 24 weeks of school; the last 12 weeks of school are not included. Pseudonyms were assigned to each student to maintain privacy.



Tables 2-4

Six Weeks Data


1st 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 94 91.75 87.5 96
Hannah 97 96 96 96
Courtney 96 95.25 95.5 95
Katie 86 82 78 86
Haylie 92 89.25 85.5 93
John 77 75.75 81.5 70
Bill 82 78.25 73.5 83
Bob 101 101.75 102.5 101
Ashley 84 80.25 78.5 82
Tim 95 93.75 93.5 94
Reed 77 73.5 75 72
Rylie T T T T
Averages 89 87.25 86.5 88

2nd 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 93 92.25 91.5 93
Hannah 98 97.75 98.5 97
Courtney 99 99.5 101 98
Katie 87 87 91 83
Haylie 90 87.5 84 91
John 89 86.75 86.5 87
Bill 76 69 60 78
Bob 101 101.25 102.5 100
Ashley 77 75.5 79 72
Tim 89 85.5 82 89
Reed 80 69.75 50.5 89
Rylie 76 78.5 76 81
Averages 87.91666667 85.85416667 83.54166667 88.16666667

3rd 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 91 87.75 84.5 91
Hannah 95 91.5 85 98
Courtney 99 99.25 101.5 97
Katie 79 69.5 55 84
Haylie 91 87.75 81.5 94
John 90 82.25 66.5 98
Bill 73 69 69 69
Bob 100 99.5 100 99
Ashley 81 80.25 79.5 81
Tim 93 90.75 87.5 94
Reed 71 65 64 66
Rylie 80 82.5 80 85
Averages 86.91666667 83.75 79.5 88

4th 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 94 92 90 94
Hannah 97 98.75 100.5 97
Courtney 100 98 96 100
Katie 90 83 76 90
Haylie 92 94.5 97 92
John 74 74.5 75 74
Bill 79 78.75 78.5 79
Bob 103 99.75 96.5 103
Ashley 79 80.75 82.5 79
Tim 91 90.25 89.5 91
Reed 81 64 47 81
Rylie 81 79.75 81.5 78
Averages 88.41666667 86.16666667 84.16666667 88.16666667

Table 5

Anecdotal Evidence

Anecdotal Evidence (Jan 8th - April 5th)

Student      Test Anxiety   Test Anxiety   Human Error   Dishonesty
Pseudonym    Score**        Level          Mistakes*     Instances*
Kyle         38             High           M             M
Hannah       20             Moderate       M             L
Courtney     16             Low            M             L
Katie        23             Moderate       M             M
Haylie       15             Low            M             L
John         N/A            N/A            L             H
Bill         27             Moderate       H             M
Bob          19             Low            L             L
Ashley       25             Moderate       M             M
Tim          21             Moderate       M             L
Reed         N/A            N/A            L             H
Rylie        14             Low            L             H
Average      21.8           Moderate       M             M

* Human error mistakes are mistakes made due to a calculation or transfer error; dishonesty instances are instances where a sheet of homework was copied or where the student was caught cheating or attempting to cheat on a test or quiz. Both are rated Low (x < 10), Medium (10 ≤ x < 20), or High (x ≥ 20).
** Test anxiety scores: 10-19 = Low, 20-35 = Moderate, 36-50 = High.

This table contains data pertaining to the students' test anxiety, human error mistakes, and dishonesty instances. These items were tracked over a period of thirteen weeks and, aside from the test anxiety scores, came entirely from my observations.
