
Running head: TESTING IN THE CLASSROOM 1

Testing in the Classroom and its Effectiveness in Predicting Student

Achievement and Understanding

Ecaroh Jackson

Texas A&M University



Abstract

This paper explores the validity of summative assessments in K-12 classrooms, branching off from the familiar standardized testing conversation. In many ways, standardized testing and classroom assessments are similar, but the sheer frequency of classroom assessments makes them a focal point deserving of their own research. While summative assessments are not usually used to determine accountability through the state, they can be used for school- or district-level monitoring. Given the flexibility allowed when testing students informally, the data these tests produce may be less legitimate than assumed. Additionally, factors such as dishonesty, test anxiety, and human error are common in the classroom testing environment. This paper examines the test scores of students in a high-achieving eighth-grade class and compares those results to their performances in class.

Keywords: Testing, Student Achievement, Student Understanding, Test Anxiety



Testing in the Classroom and its Effectiveness in Predicting Student

Achievement and Understanding

Through the years, testing in schools has been subject to praise and criticism that has molded the current testing model. Although criticism of testing is probably at an all-time high, testing has never been more prevalent. Starting in the third grade, students are subjected to standardized testing that holds the key to many of their futures. Additionally, because of the No Child Left Behind Act, students are tested for classroom placement, disabilities, and general performance. High school students rely on ACT and SAT scores to land them a position at top universities, and undergraduate students focus on acing the GRE for admission into graduate programs. Then, after all of this time spent in school, students must test again to gain certification in their specific fields. Countless studies have examined the validity of standardized test scores, and while that work is a crucial focal point of testing in schools, it is also important to consider the testing that happens inside classrooms. This testing involves summative assessments, ranging from unit tests to benchmarks, that are given more frequently than standardized tests. Like standardized testing, summative testing in classrooms may not be as indicative of student success and understanding as once thought and should be viewed more critically.

Background

The students involved in this study are eighth-grade students taking Algebra I at the high school level. They were individually selected for this class by their Algebra I teacher, who observed them in previous grades to see whether they had the skills needed to make the jump from seventh-grade math to Algebra I. The demographics of the class somewhat resemble those of the school, which is 42.7% White, 30.2% African American, 24.7% Hispanic, and 2.4% Two or More Races (Murphy & Daniel, 2015). The eighth-grade Algebra I class is 50% White, 42% Black, and 8% Hispanic (Murphy & Daniel, 2015). The class demographics are slightly skewed because of the small sample size. This was unavoidable, since the secondary school, grades 6-12, consists of only 255 students, with an average of 8 students per teacher (Murphy & Daniel, 2015).

During this study, I had the opportunity to observe the twelve students twice a day: once in science and once in math. The data presented were taken entirely from the math class, but the anecdotal evidence was gathered from observations throughout the day.

Methodology

Two types of measurements were taken over the course of twenty-four weeks. For the purpose of analysis, the data collected were separated into four six-week periods. Summative assessments, which included unit tests, were tracked, and formative assessments, consisting of homework, worksheets, and quizzes, were documented. The data were compiled at the conclusion of the study and used to identify trends pertaining to testing and achievement. Additionally, the students' test anxiety, dishonesty, and likelihood of making minute "human errors" were evaluated. To determine the students' levels of test anxiety, they were given a questionnaire created by Nist and Diehl (see Appendix A for the questionnaire given). I created three categories for human error mistakes and dishonesty instances: Low, Medium, and High. I defined human error mistakes as mistakes due to a calculation or transfer error; this does not include mistakes due to a lack of understanding. Low meant that a student rarely (fewer than 10 instances) made human error mistakes, Medium included 10 to 19 instances, and High signified 20 or more human error mistakes. These values were derived from the students' turned-in work and from in-class work that I assisted with. The same scale was used for dishonesty instances, which in this study are defined as instances where a sheet of homework was copied or where a student was caught cheating or attempting to cheat on a test or quiz.
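The Low/Medium/High cutoffs above can be summarized in a short helper function. This is an illustrative sketch only, not part of the study's instruments; the function name and structure are mine, but the boundaries follow the definitions given in this section.

```python
def categorize_instances(count):
    """Map a count of human-error mistakes (or dishonesty instances)
    to the Low/Medium/High scale defined in this study."""
    if count < 10:        # rarely: fewer than 10 instances
        return "Low"
    elif count < 20:      # 10 to 19 instances
        return "Medium"
    else:                 # 20 or more instances
        return "High"
```

The same function applies to both human error mistakes and dishonesty instances, since both use the same scale.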

Results

Student        Overall     Overall Assessment   Overall Summative   Overall Formative
Pseudonym      Average     Average              Average             Average
Kyle 93 90.9375 88.375 93.5
Hannah 96.75 96 95 97
Courtney 98.5 98 98.5 97.5
Katie 85.5 80.375 75 85.75
Haylie 91.25 89.75 87 92.5
John 82.5 79.8125 77.375 82.25
Bill 77.5 73.75 70.25 77.25
Bob 101.25 100.5625 100.375 100.75
Ashley 80.25 79.1875 79.875 78.5
Tim 92 90.0625 88.125 92
Reed 77.25 68.0625 59.125 77
Rylie 79 80.25 79.16666667 81.33333333
Averages 88.0625 85.75520833 83.42708333 88.08333333
Overall Average - average of the four six-week gradebook averages
Overall Assessment Average - average of the four six-week summative & formative averages

Discussion

The overall average is higher than the overall assessment average because bonus points are given for signed progress reports and the lowest two minor grades each six weeks are dropped. Additionally, students have a binder check each six weeks that serves as a major grade; as long as they take their notes, they receive a 100 as a major grade. The overall formative assessment average was 4.65625 points higher than the overall summative assessment average, meaning that the students consistently performed better on their homework assignments than on their tests. This difference could be attributed to multiple factors: the lowest two homework grades are dropped each six weeks, homework is easier to cheat on than tests, and we frequently help the students when they do their homework. For the few students whose summative assessment averages exceeded their formative assessment averages, the discrepancy can most likely be attributed to not turning in assignments or turning them in partially finished. Another interesting aspect of this data is that the students with the most instances of dishonesty tended to have the lowest formative assessment averages. This is intriguing: the students who cheated or attempted to cheat still received lower grades on their assignments than their peers who did not cheat.
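The 4.65625-point gap can be checked from the class-wide six-week averages reported in Tables 2-4. The brief sketch below is illustrative only; it recomputes the overall averages the same way the tables do, as the mean of the four six-week class averages.

```python
# Class-wide summative and formative averages for each
# six-week period, copied from Tables 2-4.
summative = [86.5, 83.54166667, 79.5, 84.16666667]
formative = [88.0, 88.16666667, 88.0, 88.16666667]

# Overall averages are the means of the four six-week class averages.
overall_summative = sum(summative) / 4   # approximately 83.42708
overall_formative = sum(formative) / 4   # approximately 88.08333
gap = overall_formative - overall_summative
print(round(gap, 5))  # 4.65625
```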

I was slightly shocked when I discovered the substantial difference between the students' major and minor grade averages. The students receive ample help when taking their tests and are able to ask an unlimited number of questions. Although not all of the students take advantage of our willingness to help, the majority do seek assistance when in need. This is where the human error aspect comes in. The students who frequently made minute errors that caused them to miss questions rarely asked about those specific problems, because they had done the steps correctly and were confident that they had not made any mistakes. It was not their understanding that was lacking, but rather their inattentiveness.

The test anxiety survey I administered to the students was brief so that I would get genuine answers from them. Students' scores fell into all three categories of test anxiety (low, moderate, high). There was no correlation between students' test anxiety scores and their summative assessment scores. Only one student scored high on the test anxiety questionnaire. This particular student also had one of the largest differentials (5.125) between her formative and summative assessment averages, which could indicate that test anxiety affected her performance on tests. The student with the highest averages in all categories and the lowest differential (0.375) between his formative and summative assessment averages scored low on the test anxiety questionnaire. While this could signify that Bob's test grades are a decent representation of his understanding and achievement, there were too many contradictions in the data for me to draw these conclusions.

Other Studies

Interestingly, as I scoured the internet for information pertaining to testing in the classroom, I found few scholarly articles on the topic. By contrast, the more notorious form of testing, standardized testing, has an overabundance of data available to review. While I agree that standardized testing should be the focal point of current studies, because it not only affects our students temporarily but also informs the curriculum our educators use, I think it is also imperative that we pay closer attention to the summative assessments our students are given in class.

Testing Action Plan (Obama)

In October of 2015, President Barack Obama announced his administration's Testing Action Plan, which sought to reexamine how tests are utilized in schools (Arnett, 2016). At one point in the announcement, President Obama said that students "should only take tests that are worth taking — tests that are high quality, aimed at good instruction, and make sure everyone is on track" (Arnett, 2016). He also said that assessments should not consume students' classroom time and should be only one of many tools used to identify student progress (Arnett, 2016). The administration went as far as saying that low-quality, redundant, and unnecessary tests should be eliminated, and it listed seven crucial points that assessments should satisfy. These points are listed below.



• They must be worth taking: "Testing should be a part of good instruction, not a departure from it."
• They must be high-quality: "High-quality assessment results in actionable, objective information about student knowledge and skills."
• They must be time-limited.
• They must be fair: "Assessments should be fair, including providing fair measures of student learning for students with disabilities and English learners. Accessibility features and accommodations must level the playing field so tests accurately reflect what students really know and can do."
• They must be "fully transparent" to students and parents: "States and districts should ensure that every parent gets understandable information about the assessments their students are taking."
• They must be just one evaluation measure: "Assessments provide critical information about student learning, but no single assessment should ever be the sole factor in making an educational decision about a student, an educator, or a school."
• They must be tied to improved learning: "While some tests are for accountability purposes only, the vast majority of assessments should be tools in a broader strategy to improve teaching and learning." (Strauss, 2015)

This plan was teacher-led and had a four-pronged approach that included "financial support for states to develop and use better, less burdensome assessments, expertise to states and school districts looking to reduce time spent on testing, flexibility from federal mandates and greater support to innovate and reduce testing, and reducing the reliance on student test scores through our rules and executive actions" (U.S. Department of Education, 2015). While the plan was admirably ambitious, reversing a culture of testing is easier said than done. The plan was not deemed successful, and standardized testing, as well as frequent classroom assessment, remains prevalent today (Strauss, 2015).

Although Obama's Testing Action Plan was not carried out fully, I think its ideas were sound and could serve as the foundation of a new plan that shifts the focus away from test-based data and toward a more holistic approach.

Negative Effects of Testing

At my current school, students are tested at least three times every six weeks, an average of one test every two weeks. This is not necessarily problematic, but it may be unnecessary if the results are not indicative of the students' actual levels of understanding. Additionally, for students, the term "test" already has a negative connotation. Every time a new test is announced, the announcement is contested and met with groans. Students do not want to be assessed at all, so over-testing them benefits neither the students nor the teachers. According to Shepard, Penuel, and Pellegrino (2018), classroom assessments should be used to support learning rather than follow a "business-as-usual" model. A shared curriculum is not in place for every school in every state, so state standards and assessments cannot possibly be fully aligned with classroom assessments. The main concern is that students may excel on classroom assessments yet fail to perform on state assessments. Neither assessment alone can verify a student's understanding of the material, and neither provides an accurate, holistic report of a student's achievement. Although neither assessment is enough to classify a student as meeting or not meeting expectations, state assessments alone are used for district, school, and teacher accountability. If there is such a discrepancy between state and local assessments, then neither should be used alone to determine student success (Shepard, Penuel, & Pellegrino, 2018).

Students are well aware of the accountability systems in place, which unfortunately likely affects the way they view and perform on tests. Since state assessments carry more weight and can affect a student's future, students will likely take them more seriously than a classroom assessment that will only be viewed by their teacher and administrator. This phenomenon can be explained using the expectancy-value theory of motivation. An excerpt explaining this theory in detail is included below:

Applied to large-scale testing, expectancy-value theory states that a test taker's motivation to engage in activities related to large-scale testing depends on their belief about experiencing success on the test and the value that they place on the content, process, and/or outcomes of the test. That is, if a test taker believes they will experience success on the large-scale test and they value it, they are more likely to be motivated and engage with the tasks to the best of their ability. (Barneveld & Brinson, 2017)

The amount of effort put into a test can predict its results. As educators, we need results based on learning rather than on the effort given on a particular day. This is one reason why testing alternatives may not be an unreasonable concept in education's future.

Classroom Testing Alternatives

The current model of educational assessment in classrooms includes summative assessments and formative assessments. Typically, formative assessments are used by teachers to inform their teaching practices and help them determine what to modify and what to keep, while summative assessments are used mostly for data collection by school districts and states. Although each type of test has its niche, both can be used interchangeably in the classroom. By varying the type of assessment given to students, the burnout often seen with typical tests may be reduced. Dixson and Worrell (2016) discuss two types of formative assessment in their article: spontaneous and planned. Spontaneous formative assessments are not reasonable data-collecting assessments, but formative assessments such as quizzes and homework can be if done correctly. Formative assessments are usually given frequently throughout the school year and can provide a more in-depth look into student success than one standardized test can. Examples of formative assessments that can be taken as grades include major projects, portfolios, worksheets, quizzes, homework assignments, and exit tickets. These assessments, while valuable, cannot and will not completely replace testing in classrooms. Instead, teachers can hopefully find a balance between formative and summative assessments so that students' achievement will not be completely dependent on tests they do not want to take.

Conclusions and Future Study

I am well aware that testing in the classroom is not going anywhere anytime soon, due to the accountability constraints created by school districts and the state, but there are steps we can take to lessen it. First and foremost, testing in the classroom needs to be evaluated and studied further. Like standardized testing, in-class testing can carry high stakes and therefore cause unnecessary stress for students and teachers. Additionally, with more studies, the accuracy and validity of classroom testing can be analyzed to see how informative it actually is regarding students' achievement and understanding. If classroom testing is not serving its purpose, then educators need to reevaluate their assessment methods and determine whether constant testing is essential to students' success. More specifically, educators should make sure that their assessment strategies are benefiting students rather than hindering them.

References

Arnett, A. A. (2016, April 18). Why testing prevails in K-12 education. Retrieved from https://www.educationdive.com/news/why-testing-prevails-in-k-12-education-1/417294/

Barneveld, C. V., & Brinson, K. (2017). The rights and responsibility of test takers when large-scale testing is used for classroom assessment. Canadian Journal of Education, 40(1), 1-22.

Dixson, D. D., & Worrell, F. C. (2016). Formative and summative assessment in the classroom. Theory Into Practice, 55(2), 153-159. doi:10.1080/00405841.2016.1148989

Murphy, R., & Daniel, A. (2015, December 08). Snook Secondary School. Retrieved from https://schools.texastribune.org/districts/snook-isd/snook-secondary-school/

Shepard, L. A., Penuel, W. R., & Pellegrino, J. W. (2018). Classroom assessment principles to support learning and avoid the harms of testing. Educational Measurement: Issues and Practice, 37(1), 52-57. doi:10.1111/emip.12195

Strauss, V. (2015, October 27). Why Obama's new plan to cap standardized testing won't work. Retrieved from https://www.washingtonpost.com/news/answer-sheet/wp/2015/10/27/why-obamas-new-plan-to-cap-standardized-testing-wont-work/?utm_term=.673fdcff2119

U.S. Department of Education. (2015, October 24). Fact sheet: Testing action plan. Retrieved from https://www.ed.gov/news/press-releases/fact-sheet-testing-action-plan

Appendix A

Test Anxiety Questionnaire from Nist and Diehl

Nist and Diehl (1990) developed a short questionnaire for determining whether a student experiences a mild or severe case of test anxiety. To complete the evaluation, read through each statement and reflect upon past testing experiences. You may wish to consider all testing experiences or focus on a particular subject (history, science, math, etc.) one at a time. Indicate how often each statement describes you by choosing a number from one to five as outlined below.

1 = Never    2 = Rarely    3 = Sometimes    4 = Often    5 = Always

1. _____ I have visible signs of nervousness such as sweaty palms, shaky hands, and so on right before a test.

2. _____ I have “butterflies” in my stomach before a test.

3. _____ I feel nauseated before a test.

4. _____ I read through the test and feel that I do not know any of the
answers.

5. _____ I panic before and during a test.

6. _____ My mind goes blank during a test.

7. _____ I remember the information that I blanked on once I get out of the testing situation.

8. _____ I have trouble sleeping the night before a test.

9. _____ I make mistakes on easy questions or put answers in the wrong places.

10. _____ I have trouble choosing answers.



DETERMINE YOUR RESULTS


• Add up your scores on all ten statements; totals will range from 10 to 50.
• A low score (10-19) indicates that you do not suffer from test anxiety. In fact, if your score was extremely low (close to 10), a little more anxiety may be healthy to keep you focused and to get your blood flowing during exams.
• Scores between 20 and 35 indicate that, although you exhibit some of the characteristics of test anxiety, the level of stress and tension is probably healthy.
• Scores over 35 suggest that you are experiencing an unhealthy level of anxiety; you should evaluate the reason(s) for the stress and identify strategies for compensating.
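The scoring bands above can be restated compactly. The helper below is an illustrative sketch, not part of Nist and Diehl's instrument; it maps a questionnaire total to the Low/Moderate/High labels used in Table 5.

```python
def interpret_anxiety_total(total):
    """Interpret a Nist & Diehl questionnaire total (range 10-50)."""
    if not 10 <= total <= 50:
        raise ValueError("questionnaire totals range from 10 to 50")
    if total <= 19:
        return "low"        # little or no test anxiety
    elif total <= 35:       # some characteristics, probably healthy
        return "moderate"
    else:                   # over 35: unhealthy level of anxiety
        return "high"
```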

Tables

Table 1

Overall Student Data

Student        Overall     Overall Assessment   Overall Summative   Overall Formative
Pseudonym      Average     Average              Average             Average
Kyle 93 90.9375 88.375 93.5
Hannah 96.75 96 95 97
Courtney 98.5 98 98.5 97.5
Katie 85.5 80.375 75 85.75
Haylie 91.25 89.75 87 92.5
John 82.5 79.8125 77.375 82.25
Bill 77.5 73.75 70.25 77.25
Bob 101.25 100.5625 100.375 100.75
Ashley 80.25 79.1875 79.875 78.5
Tim 92 90.0625 88.125 92
Reed 77.25 68.0625 59.125 77
Rylie 79 80.25 79.16666667 81.33333333
Averages 88.0625 85.75520833 83.42708333 88.08333333

Note: This table shows all of the students' overall data from the first 24 weeks of school; the last 12 weeks of school are not included. Pseudonyms were assigned to each student to maintain privacy.



Tables 2-4

Six Weeks Data


1st 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 94 91.75 87.5 96
Hannah 97 96 96 96
Courtney 96 95.25 95.5 95
Katie 86 82 78 86
Haylie 92 89.25 85.5 93
John 77 75.75 81.5 70
Bill 82 78.25 73.5 83
Bob 101 101.75 102.5 101
Ashley 84 80.25 78.5 82
Tim 95 93.75 93.5 94
Reed 77 73.5 75 72
Rylie T T T T
Averages 89 87.25 86.5 88

2nd 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 93 92.25 91.5 93
Hannah 98 97.75 98.5 97
Courtney 99 99.5 101 98
Katie 87 87 91 83
Haylie 90 87.5 84 91
John 89 86.75 86.5 87
Bill 76 69 60 78
Bob 101 101.25 102.5 100
Ashley 77 75.5 79 72
Tim 89 85.5 82 89
Reed 80 69.75 50.5 89
Rylie 76 78.5 76 81
Averages 87.91666667 85.85416667 83.54166667 88.16666667

3rd 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 91 87.75 84.5 91
Hannah 95 91.5 85 98
Courtney 99 99.25 101.5 97
Katie 79 69.5 55 84
Haylie 91 87.75 81.5 94
John 90 82.25 66.5 98
Bill 73 69 69 69
Bob 100 99.5 100 99
Ashley 81 80.25 79.5 81
Tim 93 90.75 87.5 94
Reed 71 65 64 66
Rylie 80 82.5 80 85
Averages 86.91666667 83.75 79.5 88

4th 6 Weeks

Student        Gradebook   Summative & Formative   Summative Assessment   Formative Assessment
Pseudonym      Average     Average                 Average                Average
Kyle 94 92 90 94
Hannah 97 98.75 100.5 97
Courtney 100 98 96 100
Katie 90 83 76 90
Haylie 92 94.5 97 92
John 74 74.5 75 74
Bill 79 78.75 78.5 79
Bob 103 99.75 96.5 103
Ashley 79 80.75 82.5 79
Tim 91 90.25 89.5 91
Reed 81 64 47 81
Rylie 81 79.75 81.5 78
Averages 88.41666667 86.16666667 84.16666667 88.16666667

Table 5

Anecdotal Evidence

Anecdotal Evidence (Jan 8th - April 5th)

Student      Test Anxiety   Test Anxiety   Human Error   Dishonesty
Pseudonym    Score**        Level          Mistakes*     Instances*
Kyle         38             High           M             M
Hannah       20             Moderate       M             L
Courtney     16             Low            M             L
Katie        23             Moderate       M             M
Haylie       15             Low            M             L
John         N/A            N/A            L             H
Bill         27             Moderate       H             M
Bob          19             Low            L             L
Ashley       25             Moderate       M             M
Tim          21             Moderate       M             L
Reed         N/A            N/A            L             H
Rylie        14             Low            L             H
Average      21.8           Moderate       M             M

* Human error mistakes are mistakes made due to a calculation or transfer error; dishonesty instances are instances where a sheet of homework was copied or where the student was caught cheating or attempting to cheat on a test or quiz. Both are rated Low (x < 10), Medium (10 ≤ x < 20), or High (x ≥ 20).
** Test anxiety scores: 10-19 = Low, 20-35 = Moderate, 36-50 = High.

This table contains data pertaining to the students' test anxiety, human error mistakes, and dishonesty instances. These items were tracked over a period of thirteen weeks and, aside from the test anxiety scores, came entirely from my observations.
