THE EFFECTIVENESS OF TIERED SUMMATIVE ASSESSMENTS IN A GEORGIA HIGH SCHOOL MATH CLASS
Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisor. This thesis does not include proprietary
or classified information.
Scott W. Barnett
Certificate of Approval
_____________________________ ______________________________
Donald R. Livingston, Ed.D. Sharon M. Livingston, Ph.D. Thesis Co-Chair Thesis Co-Chair Education Department Education Department
THE EFFECTIVENESS OF TIERED SUMMATIVE ASSESSMENTS IN A GEORGIA
HIGH SCHOOL MATH CLASS
A thesis submitted
by
Scott William Barnett
to
LaGrange College
in partial fulfillment of
the requirement for the
degree of
MASTER OF EDUCATION
In
Curriculum and Instruction
LaGrange, Georgia
May 10, 2011
Tiered Summative Assessment iii
Abstract
This study explores the effectiveness of a tiered assessment model in a regular
education, multi-leveled Georgia high school mathematics classroom. Students from two
classes were pretested and instructed in exactly the same manner. The treatment class was
offered a choice of one of three summative assessments tiered by difficulty; the control
class was unaware any option existed. Data were gathered and interpreted using the three
focus questions guiding the study, and statistical and qualitative data were triangulated in
determining the results. This study found that students did in fact benefit from having a
tiered test as an option for their summative assessment, a method that is transferable to
other subjects and disciplines. Keywords: student choice, alternative assessments,
multi-leveled tests.
Table of Contents
Abstract
Table of Contents
List of Tables and Figures
Chapter 1: Introduction
    Statement of the Problem
    Significance of the Problem
    Theoretical and Conceptual Frameworks
    Focus Questions
    Overview of Methodology
    Human as Researcher
Chapter 2: Review of the Literature
    Constructivism
    Differentiation
    Assessments
    Student Outcomes
    Teacher's Reaction
Chapter 3: Methodology
    Research Design
    Setting
    Subjects
    Procedures and Data Collection Methods
    Validity, Reliability, Dependability, and Bias
    Analysis of Data
    Validation
    Credibility
    Transferability
    Transformational
Chapter 4: Results
Chapter 5: Analysis and Discussion of Results
    Analysis of Results
    Discussion
    Implications
    Impact on Student Learning
    Recommendations for Future Research
References
Appendixes
List of Tables and Figures
Tables
Table 3.1 Data Shell
Table 4.1 Chi Square for Treatment and Control Student Surveys
Figures
Figure 2.1 The Assessment Equation
Figure 2.2 Grade Ranges on Tiered GCSE
CHAPTER ONE -- INTRODUCTION
Statement of the Problem
Currently there is an outcry from the education community that learning in classrooms
across the United States is not taking place at its full potential. Students are not grasping the
material as they should. Many scholars argue that mastery of the curriculum is not the
problem; rather, it is the method of assessing that curriculum that is failing. Wood (2005)
says, “We have to establish assessments designed to reflect the variety of achievement targets
that underpin standards: mastery of content knowledge, the ability to use knowledge to
reason, demonstration of performance skills and product development capabilities” (p. 89).
The notion of information in, information out of these empty vessels called students is
outdated and not working. High-stakes testing has led many to conclude that summative
assessment has no place in today’s education arena; that it is a false indicator of just how
much a particular student really knows.
Education is in a transition period, and anyone who has been involved in the
education process during the last twenty-five years can clearly see it. However, students are
not ready to drop pencil and paper, break free from the classroom and teacher, and begin
personal pilgrimages searching for their own educational Zen. In the meantime, educators
push their students toward greatness using any strategy they can imagine that makes students
not just pass the curriculum but learn it. In doing so, with the abundance of differing
assessment methods flooding today’s education arena, educators need to know which one or
few will give a true depiction of just how much of the material a given student learned. With
alternate learning styles and differentiation here to stay, how do educators know if the
assessment style they use is the correct indicator of what their students are learning?
Significance of the Problem
Wood (2005) says, “The entire method of evaluating what our high school students
have learned is unique to the school setting itself. Nowhere else in our society, will one’s
worth or abilities be measured by a paper-and-pencil test of short-term memory” (p.84).
Statements like Wood’s are plentiful. Many are opinions and many have scientific backing.
Summative assessments under No Child Left Behind (NCLB) have not only been brought to
the forefront of discussion in virtually every school system in the United States over the last
eight years, but they have been, according to Cizek (2010), designated as an end-all, be-all
decision maker for whether a student graduates, transitions to the next level or course, or obtains
a license or credentials from a course. Now, in some areas, proposals are under consideration
to base teachers’ pay on students’ scores from these types of assessments. With so much
riding on these summative assessments, educators need to explore different methods of
assessment, so that promoting or retaining, rewarding or reprimanding a student actually
reflects whether that student learned the curriculum, understood the material, and mastered
the lessons. This thesis will explore
alternative summative assessments to investigate whether differing assessment styles will
have an impact on students’ grades.
Measuring a student’s mastery with the right assessment tool is vital to that student’s
overall success in education. Failing grades can cause some students to throw in the towel on
particular courses and even on finishing school altogether. Sprick (2002) echoes, “The way
you organize instructional content and evaluate student mastery of that content can play a
major role in whether students’ expectancy of success is high or low” (p. 27). Poor test
scores can give students a poor outlook on a class; behavior can deteriorate, snowballing
toward a poor outlook on school in general. This drowning in poor grades and test scores can
increase the dropout rate, thus affecting graduation rates (Sprick, 2002). Educators must find
a way to assess what students know without damaging their drive to succeed.
Theoretical and Conceptual Frameworks
Constructivism is a guiding philosophy adopted by the Education Department of
LaGrange College. In constructivism, a teacher or educator acts as a facilitator to education.
Students are charged with constructing their own education on their own terms, grounded in
their own backgrounds, with guidance from the teacher. This makes their education
something they own, not something borrowed from a lecturer and riddled with unfamiliar
terminology that means nothing to the pupil. Piaget describes constructivism as a method of teaching
where the student owns his or her knowledge and learning (Ackermann, 2001). Students
need to interpret what they are learning to make a personal connection with the material that
is being presented. An interaction must take place from what is being taught to life and the
world around the student. And finally, once personal connections are established, students
will seek more knowledge (Ackermann, 2001).
Tenet 1 of the LaGrange College Education Department’s [LCED] (2008) Conceptual
Framework calls for enthusiastic engagement in learning and is termed the “professional
knowledge tenet.” In this tenet, candidates (educators) will understand the concepts and
structure of a given discipline and use that to create learning experiences that make the
subject matter meaningful to students. Furthermore, educators will journey across the
curriculum with the subject matter, linking it to real world applications making it more
relevant to the student. Educators will employ a range of instructional tools and techniques
while meeting state, national, and professional association content standards. Lastly,
educators will understand their students from learning styles and developmental growth to
diversity and culture and how they along with outside influences affect student learning and
engagement. (LCED, 2008)
This study will align with LCED’s Conceptual Framework in every tenet and its core
philosophy. The study will be conducted in the area of mathematics and summative
assessment. The study will take place in an actual active classroom during the regular school
year in a Georgia high school. Keeping students’ culture and learning styles on the forefront,
differentiated instruction will be used to teach new material as well as to review old material. A
variety of methods of summative assessments will be used to gauge the progress of the
student subjects and participants used in the study. This study will follow the current
curriculum put forth by the state of Georgia and the county in which the study will take
place. This study will be testing the effectiveness of summative assessments in a high school
mathematics classroom; however, this study does involve real students during a real school
year. At no time will the education of these students suffer or be jeopardized for the
advancement of this study or its completion.
Tenet 2 of LCED’s Conceptual Framework states that educators will use professional
teaching practices while working with and preparing for students in the classroom. For this
study, backwards design will be used for instruction and assessment. Wiggins and McTighe
(1999) define backwards design as a process where “One starts with the end – the desired
results (goals or standards) –and then derives the curriculum from the evidence of learning
(performances) called for by the standard and the teaching needed to equip students to
perform” (p. 8). It is a process by which desired goals are determined, the assessment
method by which mastery of those goals will be measured is created, and then instruction
methods are derived, planned, and delivered to the students.
SSTs and IEPs will be considered, and modifications will be made as appropriate
during the planning, delivery, and assessment of the subject matter during this study. The
classroom management will be held to the highest standards, in terms of behavior plans, on-
task engagement, and educational integrity. Students will be given high quality hands-on
tasks suitable to the understanding of the curriculum and presented in such a way that is
respectful and applicable to students and their needs.
Tenet 3 of the Conceptual Framework calls for caring and supportive classrooms and
learning communities. This tenet requires the educator to be informed of a student’s
struggles during and away from school, and to take those struggles, along with the student’s
culture, into account during instruction, assessment, and remediation. The educator is
charged with adapting teaching methods to better fit the student’s background and learning
style.
This study will take place in an environment conducive to learning. Remediation and
support, group and individual, will be provided for those students who need it. The idea of
student learning and success will remain paramount throughout the study. This study will
mimic an everyday classroom in a real school. To be true to that aim, the study will be
carried out during the normal interactions of the school day, where multiple instances of
collaboration with other students, teachers, and even administrators take place. As
administrators of the school are aware of and have approved this study, collaboration will
continue and the findings of this study will be shared with them.
Focus Questions
This study will pinpoint the effectiveness of a tiered assessment model of summative
assessments in a single multi-ability math classroom by focusing on the following three
questions that will guide the research.
1. How can tiered assessments be infused into the curriculum?
2. What is the process by which tiered assessment effectiveness can be measured?
3. How do students respond attitudinally to tiered assessments?
Overview of Methodology
This study was completed using action research. Hendricks (2009) says action
research is a systematic, step-by-step approach that produces findings through structured
experimentation and ongoing reflection. The research process is not steered toward a
desired outcome: data are collected and evaluated, and conclusions are drawn. Hendricks
(2009) goes on to explain that action research is implemented in such a way that it is
ongoing, whether continued by the individual performing the research or by future
researchers.
The study was performed using two classes of the same subject matter in a high
school mathematics classroom. It consisted of approximately 50 high school juniors in
Georgia’s Math 3, an equivalent to Trigonometry, in the spring of 2011. Some of the
students had Individualized Education Programs (IEP) and others had Student Support
Teams (SST). Most were regular education students and the groups were quasi-randomly
selected based on the fact that they had me as their teacher and which period they were
placed in, neither of which I could control. For this study, the students were assessed over
the same material by alternative methods to see if a tiered assessment model presented a
more accurate account of a given student’s mastery of the curriculum. In order to do this, a
baseline needed to be determined. A pretest was given prior to any instruction in a given
unit. The same method of pretesting was given to all students in all classes participating in
the study. All classes were instructed for exactly the same amount of time and in the same
manner. The treated class was given the opportunity to choose one of three tiered summative
assessments aligned by difficulty to the GPS. The control class was not offered an option,
nor were they even aware there were multiple tests. Posttest scores were compared with
pretest scores to establish a baseline of natural learning and progress, and each group’s gains
were then compared against that baseline to determine whether the treatment showed a
different level of progress or decline. Finally, the posttest scores from each group were
compared to determine whether student-chosen tiered summative testing positively impacted
student outcomes.
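The baseline comparison described above can be sketched in code. This is only an illustrative sketch: the score lists are invented for demonstration and are not the study's data, and the simple mean-gain comparison is an assumption about how such a baseline might be computed.

```python
# Hypothetical pretest/posttest scores (percent correct) for illustration only.
control_pre  = [52, 61, 48, 70, 55, 63]
control_post = [68, 74, 60, 82, 70, 75]
treat_pre    = [50, 58, 47, 72, 54, 60]
treat_post   = [72, 80, 66, 90, 75, 78]

def mean_gain(pre, post):
    """Average pretest-to-posttest gain for one class."""
    return sum(b - a for a, b in zip(pre, post)) / len(pre)

# The control class's gain serves as the baseline of natural learning;
# the treatment class's gain is compared against it.
baseline = mean_gain(control_pre, control_post)
treatment = mean_gain(treat_pre, treat_post)

print(f"Control (baseline) mean gain: {baseline:.1f}")
print(f"Treatment mean gain:          {treatment:.1f}")
print(f"Difference: {treatment - baseline:.1f}")
```

With real class data, the difference between the two mean gains is what the final group comparison would examine.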
For qualitative data, two surveys were given to the study subjects in order to collect
data on which testing method they preferred and why. Finally, I, as the administrator of the
assessments and data collection, kept a reflective journal of my findings and observations
prior to, throughout, and after the completion of the study.
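Survey responses such as these are often compared with a chi-square test, the analysis named in Table 4.1. The following is a minimal sketch with hypothetical response counts, not the study's data; the three response categories and the counts are assumptions for illustration.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a 2-D contingency table."""
    rows = [sum(r) for r in observed]
    cols = [sum(c) for c in zip(*observed)]
    total = sum(rows)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = rows[i] * cols[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical counts (rows: treatment, control; columns: agree, neutral, disagree).
observed = [[18, 5, 2],
            [9, 8, 8]]
stat = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {stat:.2f} with df = {df}")
# With df = 2, a statistic above the 5.99 critical value is significant at alpha = 0.05.
```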
In the planning stages, a unit plan and a rubric for critiquing the unit plan were
developed. The rubric was used by a highly qualified third party, otherwise unassociated
with the study, to ensure the validity and alignment of the curriculum for the study.
Human as Researcher
I have been teaching high school math at a Title One school in Georgia for 5 years. I
believe that some tests can be “beaten”; that is to say, multiple-choice test questions can be
answered correctly without the student actually knowing the answer. True/false tests are the
same in nature, a guess, with 50-50 odds that a student will get the right answer while having
no idea what the question is even asking. My greatest fear, because mathematics is a
discipline that builds from one course to the next, is whether my students will learn what
they need to know not only to reach the next course in the math track, but also whether I will
have given them the proper foundation to succeed at that next level.
The purpose of testing is for the teacher to measure how much of the taught and
sometimes background or foundational material is learned by the student. The test has to be a
true and accurate measurement tool; otherwise, the teacher will get a false impression of the
learning that took place during that particular unit. As a teacher, I have to trust the tests
(measuring tools) that I am using. Because I use tests and other forms of assessment to
monitor my students’ progress, I need to know which assessment will yield the best
correlation between what the students know and the scores on their tests.
CHAPTER TWO – REVIEW OF THE LITERATURE
This study was accomplished using the Constructivist philosophy of learning along
with an action research approach at its core, driving its execution, data collection, and
findings. Constructivism, as defined by Crotty (1998), is knowledge constructed, not
discovered, by individuals through experiences. Later, in 2004, Maclellan and Soden defined
Constructivism as “knowledge, not passively received from the world or from authoritative
sources, but constructed by individuals or groups making sense of their experimental
worlds” (p. 255). From these definitions one can deduce that Constructivism, at the heart of
its meaning, is an active process of learning in which the student is partly the teacher,
building on his or her frame of knowledge from his or her own experiences and
interpretations of those experiences. Simply put, it is akin to a child learning not to touch a
red-hot eye on a stove by touching it. The child, intrigued by the eye, actively decided to
touch it, and learned by being burned not to touch a red-hot stove eye in the future, because
the child experienced the effects of the touch and deduced that it hurt and might hurt again
next time. That child created, or constructed, his or her own knowledge by experiment: it is
definitely not a good idea to touch a red-hot eye on a stove. That, in essence, is
Constructivism.
The premise of Constructivism is that this form of learning advances the
meaning-making of knowledge gained by the learner or student if that knowledge is mostly
self-led and self-taught with the aid of an authoritative teacher or mentor (Yilmaz, 2008).
Yilmaz (2008) believes that a student who has learned under the philosophy of
Constructivism will make better use of what was learned, because the knowledge gained will
have a personal meaning and connection to the learner and will be personalized to fit the
specific needs of the student who learned the material. In other words, the knowledge will
have a larger impact on the student and, in turn, be internalized better, because the student
played a role in how the material was taught or delivered. If a student were going to devise a
research method for experimentation under the Constructivist philosophy, the best fit would
be action research. Because action research and Constructivism are allies at their cores, it
just fits that one cannot work without the other.
Differentiation
The notion of differentiation has reshaped education to conform to today’s student.
With the recognition of different learning styles and how they affect students’ learning
abilities, differentiation is a tool developed to reach students who do not perform well in a
traditional classroom setting. Within the realm of the classroom, a practitioner of
differentiation will attempt differing, even revolutionary, tactics and strategies to reach all
students with differing methods of instruction and assessment. Differentiating the delivery of
instruction and the methods of assignments and assessments is designed to reach each
student through his or her own learning style, so that learning can occur with the clutter of
how material was delivered or assessed removed (Tomlinson, 1995).
An example of differentiating an assessment for a student would be giving an oral
examination to an English as a Second Language [ESL] student. The ESL student may not
speak English fluently enough to take a traditional test, but knows the material well enough
to pass the test. Giving the student an alternative form of assessment in this case would
benefit the student and the teacher by providing the student a fair chance to show what he or
she knows, and the teacher will receive a more accurate picture of what that student knows.
Carol Tomlinson (1995) has blazed the trail of modern differentiation in today’s
schools as one of the most published experts on the subject. She defines it as, “At its most
basic level, differentiating means ‘shaking up’ what goes on in the classroom so that students
have multiple options for taking in information, making sense of ideas, and expressing what
they learn” (p. 3). She argues that differentiation is not only necessary in today’s classroom,
but vital for an increase in student outcomes. She says, “Acknowledging that students learn
at different speeds and that they differ widely in their ability to think abstractly or understand
complex ideas is like acknowledging that students at any given age are not the same height”
(1995, p. 2).
Differentiation is crucial here because in Tomlinson’s words, “teachers can create a
‘user-friendly’ environment, one in which they flexibly adapt pacing, approaches to learning,
and channels for expressing learning in response to their students’ differing needs” (1995, p.
2). Tomlinson believes that education has been and needs to be reformed, replacing
instruction and teaching of yesterday with differentiation. Because students are individually
different, they should be taught individually differently, as much as is possible within the
teacher’s resources. Tomlinson (2000a) maintains that differentiation is not just an
instructional strategy, nor is it a recipe for teaching, rather it is an innovative way of thinking
about teaching and learning. She echoes that notion in 1995, “Differentiated instruction is so
powerful because it focuses on concepts and principles instead of predominantly on facts” (p.
47). Students given alternative options for expressing what they know will show that the
intended concepts of the material were internalized, without the medium of assessment
getting in the way and causing mistranslation from student to teacher.
As differentiation gets a stronger hold on educators today, techniques expand and
become specialized. Differentiated instruction can be broken down into subgroups:
assessment, classroom instruction and delivery, curricula, classroom management, and
planning. Assessment, along with delivery, is at the forefront right now, especially for
students with special needs and ESL students. Most experts argue that when differentiation
in terms of planning and instruction occurs, assessment will naturally follow. Tomlinson,
Kaplan, Renzulli, Purcell, Leppien, Burns, Strickland, and Imbeau (2009) say, “An
assessment usually involves the demonstration of a behavior or product that results from the
student’s interaction with content” (p. 45). Tomlinson et al. (2009) go on to chart, in
Figure 2.1, how a student’s brain internalizes an assessment question, processes it, and
answers it. Differentiated planning and instruction lead to differentiated tasks and
assignments. From there, differentiated assessments complete the differentiated process.
Differentiating assessments in a differentiated classroom is a further attempt by the teacher
and curriculum planner to teach and assess toward students’ individual struggles and
readiness levels (Tomlinson, 1995).
Participant + Content + Task + Cognitive Processing = Assessment

Figure 2.1 The Assessment Equation
Assessments
Assessments in a traditional classroom setting are the culminating activity that shows
what students have learned throughout that testing period. Under differentiation, the
definition does not change as much as one might think. Tomlinson et al. (2009) define
assessments, with and without differentiation, as “assessments are tasks assigned to
students in order to determine the extent to which they have acquired the knowledge and/or
skills embedded within a performance standard or content goal” (p.44). This is true with any
assessment, whether it is a pre-assessment, informative progress assessment, an intermediate
formative assessment, or a summative or culminating assessment. Tomlinson et al. (2009)
continue in terms of summative assessments, “More specifically, summative assessments
help teachers understand who has mastered content and skills objectives at a designated
‘ending point’ of instruction” (p. 45).
As defined earlier, assessments are supposed to be clear measurement tools that
illustrate to the teacher what the test taker knows about a given topic or list of topics. With
a better understanding of the way students learn today, several problems with traditional
assessment methods have arisen hindering the ability of the teacher to receive an accurate
picture of what that student actually knows. One problem that is widely overlooked is a
student’s interest in the test method itself. The assessment needs to be something tangible
that the student can visualize before instruction begins. The student needs to feel that he or
she has a personal stake in the assessment whether he or she actually does or not. Wormeli
(2006) reports, “Students are likely to do the homework assignment if they have a clear
picture of the finished product. If the assessment is fuzzy, they won’t” (p. 22). If the student
has no concept of accomplishment, or at least an initial grasp of what success looks like, at
the beginning of the unit, then most likely the student will not perform well on the
culminating assessment. One of the most common assessment practices, especially in a math
classroom, is for the teacher to secretly choose test questions based on examples worked in
class. This secrecy is sometimes so well guarded that teachers will not even allow
colleagues to view their tests. Wormeli (2006) says this is hindering student learning,
“Nothing in the post-school world is kept a secret, so we shouldn’t play games with students,
coyly declaring that we maintain the right to choose anything we want from the chapter text
when they ask what’s on the test” (p.22).
Giving students a choice in their assessment method can add confidence to the
student that the assignment can be completed successfully. That feeling alone can increase
student outcomes. Without alternative assessments, this achievement can never be tapped. A
study by Scouller (1998), examining students’ performance outcomes based solely on their
preferred assessment method, found the following: “When performance outcome
was analyzed in terms of preference for assessment method, highly significant differences
were found between the two groups in terms of their assignment essay marks. Those, who
preferred assignment essays as the assessment method, were significantly more likely to be
successful in their assignment” (p. 465).
Assessments are supposed to measure knowledge. Most traditional tests are built to
test for facts and memorization of those facts. Just knowing those facts is not enough.
Today’s student needs to be able to decide, process, and infer information, and
knowledge-based tests are just not measuring that. Schwartz and Arena (2009) argue, “Knowledge
assessments are inherently retrospective, but past knowledge is a small slice of what matters.
Current knowledge assessments miss critical factors relevant to learning that include
motivations to learn, responses to feedback and change, tacit understandings, and abilities to
learn when no longer being told what to do” (p. 12).
Student Outcomes
In August of 2010, Wheadon and Beguin published an article in Assessment in
Education: Principles, Policy & Practice testing the notion of multi-stage tiered tests,
investigating whether tiering a test using Item Response Theory [IRT] would increase
student outcomes for learning in the British high school standards called the General
Certificate of Secondary Education [GCSE], the British equivalent of a high school diploma
in the United States. The experiment grouped students of like abilities and labeled them by
grade, A* through G as the passing grades and a failing grade simply labeled ‘fail’. In the
model, A* is the highest-achieving student and G is the lowest yet still passing student, or in
this case, groups of students. This label is not the score on the test; instead it is the student’s
cumulative type of performance, much as an average-ability student is labeled a ‘C’ student
and the highest-achieving student an ‘A’ student.
The model from the article is depicted in Figure 2.2.

Figure 2.2: Grade ranges available on tiered GCSE papers (highest to lowest: A*, A, B, C, D, E, F, G, fail).

Since the experiment was testing two levels of tests, and groups C, D, and E took both
versions of the test, the study only discusses the outcomes of those groups. These groups
took two versions of the same test, one more difficult than the other, but both covering the
same standards. The standards set forth by the GCSE were not compromised during the
study. Wheadon and Beguin found that in the treatment group, 8% of the C-level test takers
failed to achieve a grade higher than C on the higher-tier test,
and 4% failed to achieve a grade higher than C on the lower leveled test. That implies a 4%
pass/fail difference for the C students than if the group took only the higher leveled test, or if
tiered testing was not offered (Wheadon & Beguin, 2010). The D and E student groups also
showed an increase within the tiered system. According to Wheadon and Beguin (2010),
16% of the students who would have made a D on the higher-level test made a
C on the lower-level test. Similarly, the same proportion of students who would have made an E on the
higher-level test made a D on the lower one. That is about one in six students
in the D group and one in six in the E group; in addition, one in twenty-five students
in the C group benefited from the tiered system of tests.
In terms of fairness, Wheadon and Beguin (2010) note that a maximum grade of C was
placed on the lower-level test. With that cap in place, they found that 25% of
the students in the C group had their grades limited by the maximum-grade rule: those
students scored in the B range, but because they took the lower-level test their final grade
was recorded as a C. That is an additional one in four students who showed improved
outcomes under the tiered system. Wheadon and Beguin (2010) warn that for a tiered system
to be successful and relevant, the standards cannot be fluid; they must not be altered in any
way. They caution that it is easy to alter the tests in ways that diminish the standards and
urge practitioners of tiered testing to guard against that.
A similar study occurred in Australia in 2004, aimed at eight- and nine-year-old
swimmers as part of a physical education class. Whipp (2004) argues, “Readiness
gaps were seen to negatively impact on a student’s level of concentration, involvement,
potency, achievement, motivation and self-worth” (p. 4). The study consisted of twenty-eight
students in three different physical education classes in the Perth area of Western Australia.
The students were given swimming tasks to complete based on readiness of the individual
student and that student’s potential for growth. Each instructor chose the task for the
individual students without the student’s input based on teacher observations and past
performance. Whipp (2004) “believed the low ability swimmers improved their swimming,
and he thought that some of the middle ability girls also improved. [He] conceded a failure to
extend the higher ability swimmers, thoughts echoed by the students with 58.9% agreeing
that their swimming had improved” (p. 10). He continued to explain that the low and
middle ability students showed an increased interest in completing the tasks set before them
and improving on their previous scores. This information was obtained through a non-participant
observer and through student surveys. Whipp (2004) concludes by explaining
that the highest-level students' improvement was too small to measure. He did not expect
the highest ability students to improve much because those students were already performing
at the highest possible levels, and their range for improvement was small, especially given
the nature of assessment in physical education, where preparation for examination is vastly
different from that of a regular cognitive course of study.
In July of 1997, Herman, Klein, and Wakai studied student attitudes towards
alternative assessments. The study began in 1993 among 13 schools and over 800 eighth
grade students in California. The alternative assessments were designed to encourage critical
thinking and performance. The traditional test was a state mandated multiple-choice test.
The research group showed that 14% of students performed better on the alternative
assessment than the traditional assessment. However, 67% of the students preferred the
multiple-choice test method to the alternative method. In terms of alternative assessments, “students
try harder on these items; and they recognize that open-ended items require them to think
harder, explain their thinking, and communicate their understanding of mathematical
knowledge” (p. 16). Herman et al. (1997) explain the students’ perceptions of multiple-choice
questions: “students express a preference for multiple-choice items. They find
multiple-choice items easier to understand and believe that they perform better on such
items” (p. 16).
Teachers’ Reaction
Tomlinson (2000b) said, “What we call differentiation is not a recipe for teaching. It
is not an instructional strategy. It is not what a teacher does when he or she has time” (p. 1).
She stresses that many educators struggle with differentiation simply because they do not
know what it is, do not have time, or face some other obstacle that hinders their
execution of the philosophy. She continues by explaining that another factor preventing
teachers from implementing differentiation in the classroom is standards-based learning.
Teachers have been pushed by local administrators, curriculum developers, and state
standards to teach exclusively toward high-stakes tests. These tests are almost exclusively
multiple-choice assessments that test students for trivia-type knowledge, rewarding the recall
of facts rather than the processing of information into inferences; the latter is what
differentiation in the classroom promotes. Soloman (1998) reports, “unfortunately, a multiple choice test is out of sync with
the more constructive demand of real life” (p. 110). She continues by explaining that
multiple-choice tests are easy to standardize, norm, and validate, which is why states use
them to measure the learning of such standards. They are also easier and quicker to grade.
Oberg (2009) says, “Finding adequate and appropriate assessments is a constant
challenge for teachers. Purpose, time, results, and how results will be used contribute to
determine the type of assessment that best fit teachers’ needs” (p. 3). Teachers find it
difficult to locate assessments that meet standards pushing all students to be alike
while at the same time assessing students' individual needs and abilities.
Teachers who embrace differentiation in the classroom follow its natural flow into
differentiated assessment. If a teacher has freedom from authorities and time to
prepare, he or she can become an effective practitioner of differentiation (Tomlinson, 2000b).
Tomlinson (2000b) quotes one teacher: “I feel as if I'm a better teacher. I understand
what I'm teaching better, and I certainly have come to understand the students I teach more
fully. I no longer see my curriculum as a list to be covered” (p. 6). Differentiation comes
from collaboration among the school board, the teacher, the parent, and the student; without
any one of those, differentiation loses its effectiveness. If differentiation is in regular practice
in the classroom, assessment will naturally be performance based and differentiated.
Tomlinson sums it all up by saying,
Teaching is hard. Teaching well is fiercely so. Confronted by too many
students, a schedule without breaks, a pile of papers that regenerates daily,
and incessant demands from every educational stakeholder, no wonder we
become habitual and standardized in our practices. Not only do we have
no time to question why we do what we do, but we also experience the
discomfort of change when we do ask the knotty questions. (2000b, p. 7)
In 2005, Watt conducted a study on teachers and the use of alternative assessment.
She studied three math classes in schools in New South Wales and Sydney, Australia. The
purpose of the study was to examine teachers' acceptance of alternative methods of
assessment. She found that teachers on the whole are beginning to embrace alternative
forms of assessment: 71% of the teachers studied were using some form of alternative
assessment in the classroom. However, 68% of the teachers with more than three years’
experience regarded the assessment method poorly. The number one reason given
for the lack of acceptance was time to plan. Teachers felt that creating and implementing
alternative measures was time consuming, and for an already time starved curriculum,
alternative assessments were not feasible (Watt, 2005). The next reason for teachers’ poor
acceptance of alternative assessment was that, in the opinions of the seasoned teachers, the
grading of the assessments was unstructured; they also felt there was little room to
make an alternative assessment ‘fit’ most lessons. Overall, seasoned teachers saw no
reason to change from traditional ways of assessing. That reluctance concerned only
assessment, not alternative instruction and classroom procedures. They felt that their
traditional assessments did not need to be overhauled. Watt (2005) summarizes, “Teachers
were relatively satisfied with traditional mathematics tests as a measure of students’
mathematical ability” (p. 28).
Conversely, Watt (2005) explains, teachers with three years or less of teaching
experience were more eager to embrace alternative assessments and showed more
enthusiasm for planning and implementing the assessments. They had the same
complaint, that creating alternative assessments was very time consuming, but they felt it
was worth the effort. She explains the thinking of newer teachers by noting that
alternative assessments and instruction were part of their college curriculum, so newer
teachers received their training with the philosophy of alternative methods already
embedded. The culture of newer teachers has differentiation and alternative methods among
its foundations, so the move away from tradition meets less resistance.
The significance of the problem points to a need for reform in assessment. With the
strong hold differentiation has had, and continues to have, on teaching structure and
techniques, assessment is naturally affected by these new trends in education. Countless
experts have emerged urging educators when to test, how to test, and even what to test. To
complicate matters further, according to Linn (1998), high-stakes testing has
become the determining factor in whether students are promoted or retained, so these
methods of assessment have the education world’s attention right now. This study
examines the effectiveness of a multi-leveled, or tiered, assessment model in a regular
education, total inclusion, high school math classroom.
CHAPTER THREE - METHODOLOGY
Research Design
Hendricks (2009) says, “Educational research is conducted to advance our
understanding of a variety of issues…” (p. 1). She continues to explain that in education,
research is used to develop theory, test hypotheses, study relationships among variables,
describe educational phenomena, and determine whether actions are based on results. In its
infancy, action research was described by Kurt Lewin in the 1930s as “a spiraling process
that included reflection and inquiry on the part of its stakeholders for the purposes of
improving work environments and dealing with social problems” (Burns, 1999). This
definition originated in the context of Lewin's work; he was charged with improving the
production of factory workers as he studied them in the workplace (Burns, 1999). Lewin's
research was based on conducting an experiment with real workers in a real
environment in real time. This concept was new in those days, as most research of the time
was done by thinkers theorizing about outcomes based on intellectual perceptions. As time
passed and action research evolved, it moved into the classroom, but its essence remains,
actively doing an experiment or study with real students in the actual classroom in real time,
interpreting real results. Hendricks (2009) argues, “The purpose of action research is for
practitioners to investigate and improve their practices” (p. 3). She wants practicing
teachers to study how the teaching process can be improved for the purpose of producing
better educated students. This research is done by the teacher as a self-study, so the teacher
can take the findings and improve his or her future teaching in the classroom (Hendricks, 2009).
This study was conducted in a high school math classroom in January of 2011 in the
metro-Atlanta area of Georgia. The students involved were regular education and inclusion
special education students of mixed abilities in a regular education Math 3 classroom. Math 3 is
Georgia’s equivalent to Advanced Algebra, or post-Algebra 2. I was the teacher for the
classroom, and I wanted to test tiered assessments for the mixed-ability students in my class
because I was looking for a better way to assess my students within the philosophy of
differentiation. The study was supervised by the Education Department at LaGrange College
in LaGrange, Georgia, with permission from my county’s school board and my school’s
principal. To protect the integrity of the study and the interests of the students involved,
Institutional Review Board [IRB] approval was also obtained in order to conduct this study.
Setting
The study took place in a metropolitan Atlanta suburban high school in the spring of
2011. The school is located in the county seat and is deeply rooted in the town’s culture. The
school housed 2,200 students, 16.4% of whom were special education students in a total
inclusion environment. The school makeup was 28.8% black, 0.7% Hispanic, and 0.4%
Asian, according to the U.S. Census Bureau's 2000 Census and the National Center for
Education Statistics, U.S. Department of Education. Additionally, the school was 49% male
and 36.3% economically disadvantaged. The school was a Title 1 distinguished school and had made
Adequate Yearly Progress [AYP] every year from its inception. The school had a 75.5%
graduation rate and a 17.4 student to teacher ratio.
Subjects
For the spring semester in 2011, I taught two Math 3 classes. The students in each
class, sixteen to seventeen years of age, were mixed-ability students in a traditional regular
education classroom. The sophomores and juniors had some mainstream special education
students mixed in each class. Those students were usually part of regular education
classrooms; as my school system practices total inclusion for special education students and
has for some time. The students, referred to as subjects, did not know they were being studied.
The subjects were not chosen at random but were grouped by their class schedules. One
Math 3 class served as the untreated group and the other as the treated group: I designated
my first block class the treated class and my second block class the untreated class. That
designation was made blindly, before I had even seen the rosters for the upcoming semester.
Because I tested one entire class against another entire class, subgroups were not necessary
for this study. The untreated group consisted of 21 students: 17 regular education students,
3 mainstream special education students, and 1 gifted student. The treated group consisted
of 28 students: 23 regular education students, 2 mainstream special education students, and
3 gifted students.
Procedures and Data Collection Methods
Because my study deals with real grades, and in the interest of fairness and protection
for my students, an agreement was made between my principal and me that any student
involved in this one-unit study could retake his or her test, at his or her discretion, after the
study. This was to ensure that all students of the school had the same opportunity for success
and that no student was given an advantage over another. That is not to imply that the
assessments used in the study are not aligned to the Georgia Performance Standards [GPS]. I
offered a tiered assessment for one group and no tiering for the other. My principal felt that
some students in the non-tiered class may benefit from the tiered assessment, and he wanted
the opportunity to be available without tarnishing the study. The scores on the retake
examination were not used for the study as it may have caused the data to be skewed in one
direction or another. Furthermore, the students were not aware of the retesting possibility at
the time of their assessment, so as not to compromise the integrity of the original assessment.
They thought that the study’s assessment was their one and only summative grade for that
unit until after the conclusion of data collection.
This study examined student outcomes in a tiered assessment system on a unit in an
upper level mathematics classroom from start to finish. This study was conducted using
action research. Action research was the best type of research for this particular study
because the research was actually done in a real world setting with real outcomes. Because I
am the researcher and I am a practicing teacher trying to improve my methods, the study fits
an action research model best. Hendricks (2009) says that Classroom Action Research is a
form of action research used by active teachers in their classrooms to hone their skills. The
results will have an impact on the future of my methods of practice inside the classroom. I
was the administrator of the study, and data were collected and observations were recorded
by me in real time as events happened. This method provided the best detail about the
feelings of the subjects and the researcher, since observations were interpreted and recorded
at the time they were made. Hendricks
(2009) says, “Observational data are the most important source of information in an action
research study” (p. 90). It is crucial that the correct interpretation of the observations is
recorded as soon as possible to protect the validity of the information retrieved. The study
was guided using a data shell, a table containing the focus questions and a data collection
summary (see Table 3.1).
Table 3.1. Data Shell

Focus Question 1: How can tiered assessments be infused into the curriculum?
    Literature sources: Wormeli, R.; Scouller, K.; Tomlinson, C.
    Data sources: 1) rubric from unit plan; 2) archival; 3) instructional plan.
    Why these data answer the question: 1) peer reviewed for validity; 2) content validity for the study; 3) implementation of treatment.
    How data are analyzed: qualitatively, coded for themes aligned with the focus questions.

Focus Question 2: What is the process by which tiered assessment effectiveness can be measured?
    Literature sources: Wheadon, C. & Beguin, A.; Oberg, C.; Tomlinson, C.
    Data sources: test scores obtained 1) pre-pre between the treated and untreated classes; 2) pre-post in the treated class; 3) post-post between classes.
    Why these data answer the question: 1) scores will or will not show a significant difference between the treated and untreated classes; 2) scores will or will not show increased student outcomes; 3) scores will or will not show significant gains in the treated class.
    How data are analyzed: quantitatively, 1) pre-pre: independent t-test with unequal variances; 2) pre-post: dependent t-test; 3) post-post: independent t-test with unequal variances.

Focus Question 3: How do students respond attitudinally to tiered assessments?
    Literature sources: Watt, H.; Herman, J., Klein, C. & Wakai, S.; Tomlinson, C.
    Data sources: 1) subject survey; 2) reflective journal; 3) subject survey.
    Why these data answer the question: 1) subjects will take a survey on which level of test they felt would best express their understanding of the material; 2) I will record observations about the tiered tests that I find significant to the process; 3) an additional survey will be conducted pre-treatment in order to draw conclusions on which types of students choose which test.
    How data are analyzed: 1) quantitatively, chi-square on the survey with descriptive statistics; qualitatively, 2) and 3) coded for themes aligned with the focus questions.
As the study embarked, a pretest was given to both groups of students. One group
served as a control group, separated from the treated group only by class membership. The
construction of the two groups was semi-random: the groups were formed by the students'
schedules, and I had neither control over who was in each group nor prior knowledge of who
was in each class. The pretest was intended to determine whether the two groups
showed significant differences in test scores prior to treatment.
After the pretest, instruction for the unit was exactly the same for both groups. The
students were introduced to topics in a given unit of Math 3. The students were assigned
classwork and homework, quizzed, remediated, and
lectured in exactly the same manner for the purposes of the study. The one difference
between the groups was the treated group was aware from the day after the pretest before any
instruction took place that they would have a choice of the summative assessment in terms of
level of difficulty. The process was explained to them along with how the assessment would
be administered and what level each assessment aligned with the Georgia High School
Graduation Test [GHSGT]. In addition, the score values for the three levels of assessments
were explained.
The explanation went as follows: at the end of the unit, each student would have a
choice of assessment, Meets, Exceeds, or Excels. The Meets test is aligned with a Meets
Standards score of 500 on the GHSGT, the minimum passing score. In addition, the Meets
test carries a maximum score of 80% in the classroom. Students who choose this test can
score no higher, as its questions test for only a
basic knowledge of the concepts from the unit. The Exceeds test is aligned with the GHSGT
with an Exceeds Standards score of 516. This test expects a higher level of learning from the
student and awards appropriately. This test carries a maximum score of 100% in the
classroom. This score is representative of a traditional ‘B’ student, above average but not
tops in the class. The Excels test is the final level in the tier, assessing students at the
highest level of learning. Students who choose this test are expected to have excelled in
every concept from the unit. This test has difficult questions aligned above the standards set
forth by the GHSGT. Some of the questions on this test were not covered in class, as they
involve concepts reached through transfer of knowledge or interpretation of the concepts
presented in class. Students will have to make inferences and predictions using the concepts taught. A
maximum grade of 110% is allowed for this test. The students were also informed that the
homework is based on the Meets test, quizzes were based on the Exceeds test, and the
delivery of instruction without the inferences and predictions was based on the Excels test.
The untreated group took only the Exceeds test with a maximum score of 100%.
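The scoring rules above amount to a small lookup table, sketched below in code. Only the tier names, GHSGT alignment scores, and classroom score caps come from the text; the data structure and the `classroom_score` function are hypothetical illustrations, not the instrument used in the study.

```python
# Hypothetical sketch of the tier rules described above. The tier names,
# GHSGT alignments, and classroom score caps are taken from the text; the
# data structure and function names are illustrative only.

TIERS = {
    "Meets":   {"ghsgt_alignment": 500,  "max_score": 80},   # minimum passing GHSGT score
    "Exceeds": {"ghsgt_alignment": 516,  "max_score": 100},  # Exceeds Standards score
    "Excels":  {"ghsgt_alignment": None, "max_score": 110},  # aligned above GHSGT standards
}

def classroom_score(tier: str, raw_percent: float) -> float:
    """Cap a raw test percentage at the maximum allowed for the chosen tier."""
    return min(raw_percent, TIERS[tier]["max_score"])
```

Under this sketch, a student who chose the Meets test could answer every question correctly and still record no more than 80%, which is the ceiling the text describes for that tier.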
Instruction, assignments, quizzes, and remediation proceeded identically for both
classes throughout the unit. If one class spent twenty minutes on a quiz, then the other class
spent twenty minutes on the same quiz. The unit test review was also delivered in exactly the
same way. As day-to-day work was conducted, observational notes were taken on both
groups. Student attitudes were watched closely here, since one group knew it had a choice of
assessment and the other had no choice.
On test day, after questions were taken from students seeking to understand the
process of tiered assessments, each student in the treated group informed me one by one
which test he or she would take. The untreated group was simply given the test; they took it
and turned it in. The treated group was given the test that each student had individually and
anonymously chosen. I wanted the test choice kept secret to eliminate peer pressure. The tests were
graded and the scores recorded. Later, the scores from the untreated group were compared to
the scores of the treated group looking for a significant difference. The pretest scores from
each group were compared to the posttest scores respectively looking for significant gains in
student outcomes from pretest to posttest. I expected gains because teaching occurred
between the two tests, but if the treated group’s scores showed a greater increase than the
non-treated group’s scores, then the validity of a tiered system could be argued.
Validity, Reliability, Dependability, and Bias
Popham (2011) defines validity as “not simply a synonym for test-related goodness.
Rather, validity refers to the accuracy of test-based inferences” (p. 437). Popham (2011) feels
that validity of a study is the most significant concept of the study, and if a study is not valid
then its findings are not valid. Steps were taken during this study to ensure that the
inferences drawn from the findings are valid.
Focus question one queries, how can tiered assessments be infused into the
curriculum? First, an instructional plan (see Appendix A) was developed and peer-reviewed
by the school’s Title 1 math coach using a designed rubric (see Appendix B) to ensure
alignment of instruction and, more importantly, the assessments used to Georgia’s
educational standards and for validity. Once the plan was in place and aligned with the
content and curriculum, research was gathered focusing on other scholars that had attempted
similar studies. These studies were compared and used as a guide for this study. Similar
studies include Watt (2005), Whipp (2004), and Wheadon and Beguin (2010). These
scholars’ works can be found in the reference section of this thesis, and specifics of each of
these studies, along with others not mentioned here, can be found in Chapter Two.
Content validity, as Popham (2011) defines, “refers to the adequacy with which the
content of the test represents the content of the curricular aim” (p. 89), was ensured in that all
lessons, practice questions, notes, instruction, and even quizzes were identical for both
groups, treated and untreated throughout the study. That was also reinforced by the math
coach’s critique of the instructional plan as explained earlier. Because this study was
focusing on the summative assessment, there was no reason to alter the instruction and day to
day teaching and exercises of the students between the groups. Because of this, both groups
were assigned exactly the same practice exercises and quizzes throughout the study. In
addition, both groups were presented the same notes and instruction throughout the execution
of the study. This was enabled by the use of PowerPoint presentations as notes and
instructional guides to ensure consistency. This also serves as a tool for the dependability of
the study.
Golafshani (2003) illustrates that dependability of qualitative data is akin to the
reliability of quantitative data. Since focus question one was measured with qualitative data,
reliability will not be discussed here; rather, dependability was the goal for focus
question one. In addition to the example of dependability at the end of the previous
paragraph, dependability was also maintained through the math coach’s critique of the
instructional plan and her alignment of the assessments with the Georgia education standards
and with each other. Popham (2011) explains the importance of the reliability/dependability of
alternate forms of assessment when comparing student outcomes from two distinct groups.
As for bias in the content portion of this study, there is inherently minimal risk of
bias infecting the study because of how the content was presented and scored. The
content and curriculum, having been aligned with the state’s curriculum, left me little room
to alter the lessons taught. This keeps much of the bias, at least as far as content is
concerned, out of the picture.
Focus question two asks, what is the process by which tiered assessment effectiveness
can be measured? These data were collected using the subjects' pretest and posttest scores
for the unit in the study. The posttest refers to the tiered assessment option for the treated
group and the non-tiered option for the control group. Those scores were used in multiple
quantitative tests that are discussed in further detail in Chapter Four. These methods are
strong, as they are time-proven statistical forms of comparison, and as Salkind (2010) states, “tools developed specifically to
understand the world around us” (p. 9). Popham (2011) describes this type of validity as
Criterion Validity, using measurements between two groups as a basis of a predictive
inference. In addition, care was taken in the grading process to ensure that the first
assessment was scored in the same manner and under the same scrutiny as the last. If one set
of assessments were scored more harshly than another set, validation of the findings would
be questionable.
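The three score comparisons named above could be carried out as in the following sketch, which uses SciPy. The score arrays are fabricated placeholders standing in for the real pretest and posttest scores; they are not the study's data, and the group means chosen here are arbitrary.

```python
# Illustrative sketch of the three score comparisons described above, using
# SciPy. The score arrays are fabricated placeholders, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treated_pre  = rng.normal(60, 10, 28)   # treated class pretest, n = 28
control_pre  = rng.normal(60, 10, 21)   # control class pretest, n = 21
treated_post = treated_pre + rng.normal(15, 5, 28)
control_post = control_pre + rng.normal(10, 5, 21)

# 1) Pre-pre: independent t-test with unequal variances (Welch's t-test)
t1, p1 = stats.ttest_ind(treated_pre, control_pre, equal_var=False)

# 2) Pre-post within the treated class: dependent (paired) t-test
t2, p2 = stats.ttest_rel(treated_pre, treated_post)

# 3) Post-post between classes: Welch's t-test again
t3, p3 = stats.ttest_ind(treated_post, control_post, equal_var=False)

# Test/retest correlation within the treated class (reliability check)
r, _ = stats.pearsonr(treated_pre, treated_post)
```

A significant pre-pre result would mean the groups differed before treatment; a non-significant one supports comparing them directly on the posttest, which is the logic the data shell describes.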
Reliability was shown by correlations: test/retest reliability within each group. This
gave me a clear picture of where each group stood before and after the study. A correlation
here can show the progress of a particular group, and inferences can be made from the
correlation. Next, parallel-forms reliability is illustrated by correlating the control group
with the treated group; inferences can be made from those correlations as well
(Salkind, 2010).
Students’ identities were unknown during scoring to prevent the so-called “halo
effect,” the overlooking of errors by certain students based on the expectation that those
students had answered the question correctly (Nisbett & Wilson, 1977). The study was also
done at the beginning of the semester, before the scorer had the opportunity to learn which
students or groups would stand out or lag; this also aided in preventing the halo effect.
Focus question three probes, how do students respond attitudinally to tiered
assessments? The study was kept secret from its subjects so as not to taint the students’
efforts. If they had thought they were part of a study and that their grades on the assessments
might not count as real grades, they might not have given the assessment their best effort,
thus invalidating the findings for those students. The students were unaware a study was
taking place for that reason.
Data for this focus question were collected from surveys given to the students in the
treated group and control group. The surveys for each group were not the same. These data
were converted into numbers, and a chi-square quantitative analysis was performed. This
form of data collection concerned me because the students might have had no interest
in the study. Rogelberg, Fisher, Maynard, Hakel, and Horvath (2001) warn of making
surveys mandatory, arguing that the responses given may be invalid due to the respondents
being forced to participate in the survey itself. Because the subjects had no idea they were
being studied, they may not have taken the surveys seriously. They also may not have
thought through their responses before answering. Since this was a concern prior to
the distribution of the surveys, the subjects were prompted that the information gathered was
important and needed to be taken seriously. They were also told that the survey was not required and
only the subjects who intended to answer the survey seriously should participate. In addition,
the subjects were urged to think thoroughly about the question before answering each
question. Not all surveys distributed were returned, but more than enough were completed to
infer from the data. For criterion validity, a chi square test was calculated to
determine significance. Cronbach’s Alpha showed internal consistency on the surveys,
demonstrating the reliability of the data collected (Salkind, 2010).
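The internal-consistency check mentioned above can be illustrated with a short computation of Cronbach's Alpha. The Likert responses below are hypothetical placeholders, not the study's survey data, and this is only a minimal sketch of the statistic, not the software actually used in the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's Alpha for rows of respondents, one column per survey item."""
    n_items = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]   # sample variance of each item
    total_var = variance([sum(row) for row in scores])    # variance of respondents' totals
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical Likert responses (5 respondents x 3 items), not the study's data
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 4, 3], [1, 2, 2]]
alpha = cronbach_alpha(responses)
```

Alpha near 1 indicates high internal consistency; values such as the 0.44 and 0.19 reported later in this thesis would indicate weaker consistency.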
The qualitative portion of this focus question was in the form of a reflective journal
kept by me to record day-to-day observations of the treated and control groups. The
journal was written daily, just after each group’s departure from class, and the observations
were kept consistent by using the same writing prompts each day (see Appendix C). This
consistency illustrates the dependability of the journal data.
Bias for the third focus question centers on the researcher. As I indicated in
Chapter One, I had some knowledge of tiered assessing and had formed some opinions
prior to this study. However, the writing prompts used for the journal were adhered to
throughout its writing to keep my personal feelings from getting in the way.
For fairness, the negative aspects of tiered assessment were researched. Given the
limited research on this aspect of tiered assessments, Oberg (2009) and
Wheaton and Beguin (2010) reported findings that I would consider negative outcomes.
Oberg (2009) reported poor teacher attitudes toward tiered assessments, as teachers felt they
lacked the time to prepare for such endeavors; the process does take an exorbitant amount
of planning time to accomplish. Wheaton and Beguin (2010) reported that only certain
students were given a tiered assessment, and that assessment was chosen by the instructor,
not the student. This study, by contrast, concentrated on tiered assessments with a student choice.
Popham (2011) defines offensiveness as, “[something] that contains elements that
would insult any group of students on the basis of their personal characteristics, such as
religion or race” (p. 503). This study had no elements of offensiveness: the instruction was
aligned with the Georgia Performance Standards [GPS], and both the instructional
techniques and the assessments used were peer reviewed and tested for offensiveness.
Popham (2011) also explains that if one group of students’ scores is decidedly different
from that of the rest of the test takers, then disparate impact has occurred. The groups of
students can be socioeconomic, religious, cultural, racial, or gender based. The student
outcomes from this study yielded no disparate impact, as no group or subgroup of students
showed markedly different scores on the assessments.
Analysis of Data
How can tiered assessments be infused into the curriculum? Focus question one
examines the pedagogy of tiered assessments. The data collected for this question were
analyzed qualitatively and coded for themes. An instructional plan was developed and
implemented. The plan was designed for dependability and consistency as a guide throughout
the study, ensuring that both groups received the same treatment throughout except for the
treatment itself. This reduced the variables and margin of error so that the results had merit.
A rubric of that plan was peer reviewed for validity and to ensure that the methods adhered
to the curriculum and content of the course. The rubric was also examined for fairness and
to ensure that no unintended or unwanted variables or byproducts
arose. Next, archival data were collected by examining the methods of other scholars. This
information was used to structure the study in a professional, research-oriented manner.
The data were coded for themes, looking for common threads and consistency. Portions
of other scholars’ works, cited in Chapter Two, were used to fine-tune this study to ensure
validity.
What is the process by which tiered assessment effectiveness can be measured?
This second focus question deals specifically with the scores of the subjects’ assessments.
This is the essence of the study, as it focuses on test scores and how to improve them. The
data for this portion are the actual assessment scores from each group. First, with a null
hypothesis that there is no significant difference between the scores of each group, an
independent t-test with unequal variances at the P < 0.05 significance level was done on the
pretest of the control and the pretest of the treated group to determine if there were
significant differences between each group. Next, with a null hypothesis that there is no
significant difference between the pretest versus the posttest scores within each group, a
dependent t-test at a P < 0.05 significance level was done. This was to account for the normal
learning curve that took place between pre and posttests. Third, with a null hypothesis that
there is no significant difference between the scores of each group, an independent t-test with
unequal variances at a P < 0.05 significance level was done on the posttests between each
group to determine if there were significant differences between them. The effect size for
each analysis was also calculated.
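The testing sequence described above can be sketched with scipy's t-test routines. The score lists below are hypothetical stand-ins, not the study's data, and scipy is an assumed tool; the thesis does not state which software performed its calculations.

```python
from scipy import stats

# Hypothetical score lists standing in for the study's real data
control_pre  = [12, 0, 20, 35, 10, 72, 8]
control_post = [55, 35, 60, 80, 71, 95, 62]
treated_pre  = [18, 6, 25, 45, 12, 30, 20, 22]
treated_post = [72, 52, 75, 95, 70, 88, 73, 76]

# 1. Independent t-test with unequal variances (Welch) on the two pretests
t_pre, p_pre = stats.ttest_ind(treated_pre, control_pre, equal_var=False)

# 2. Dependent (paired) t-tests, pretest vs. posttest within each group
t_ctrl, p_ctrl = stats.ttest_rel(control_pre, control_post)
t_trt,  p_trt  = stats.ttest_rel(treated_pre, treated_post)

# 3. Independent t-test with unequal variances on the two posttests
t_post, p_post = stats.ttest_ind(treated_post, control_post, equal_var=False)
```

The `equal_var=False` argument requests the unequal-variances (Welch) form used in the study's independent tests, while `ttest_rel` pairs each student's pretest with his or her posttest.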
How do students respond attitudinally to tiered assessments? Focus question three
questioned the attitudes of the students and me as the researcher. A reflective journal was
kept throughout the study guided by writing prompts. The journal was coded for themes
looking for categorical, repeating data that formed patterns of behavior. One survey
was given only to the treated group (see Appendix D) while another survey was given to both
groups (see Appendix E). Cronbach’s Alpha was done on the results from each survey for
internal consistency reliability. A Chi Square was calculated for each survey question to find
which questions were significant and which ones were not.
Validation
In terms of consensual validation of the study, my goal was to contribute to the
conversation about the use of a tiered assessment model in a multi-level high school classroom
and its influence on student outcomes and learning. In addition, this work was reviewed by
the faculty at LaGrange College. As Eisner (1991) states, “’Consensual Validation’ is an
agreement among competent others that the description, interpretation, evaluation, and
thematic are right” (p. 112). Kvale (1995) echoes Eisner in saying that a “consensual theory of
truth aims at universally valid truths as an ideal.” This means that the analysis of this study
should be consistent with similar analyses of like studies by other competent scholars. That is
to say, this study and its analysis of the data were considered with the whole, and the impact
of the results, in mind.
Carberry, Ohland, and Swan (2010) define epistemology as “a branch of philosophy
that concerns the nature and scope of knowledge and the process(es) by which knowledge is
gained.” Epistemological validation is validation conferred on a piece of research by the
manner in which the research was constructed, executed, and concluded. A study is valid if
these aspects were adhered to with the nature of the research kept at the forefront of the
study’s intentions, for the
benefit of the whole research community as well as its findings. This study, in keeping with
the spirit of valid research, was constructed from a montage of the works of other scholars.
Credibility
Eisner (1991) says structural corroboration is a confluence of multiple data sources
coming together to make an argument concerning the whole. From the data shell (Table 3.1)
multiple data collection techniques and devices have given rise to the inferences made in this
study. In Chapter Two, the fairness of the study was illustrated by the example of
Watt (2005), who found that many teachers wanted to use neither differentiation in the
classroom nor a tiered testing model. Conversely, Tomlinson (2000a) urges the necessity
of such techniques for student achievement. For rightness of fit, great care was
taken to ensure precision and accuracy in this study. Records were kept with the integrity of
the data collected in mind so that a tight argument could be made.
Transferability
Trochim (2006) offers this perspective on transferability: “Transferability refers to the degree
to which the results of qualitative research can be generalized or transferred to other contexts
or settings. The qualitative researcher can enhance transferability by doing a thorough job of
describing the research context and the assumptions that were central to the research.” This
study was constructed in the spirit of other studies, with their merit and credibility, and its
original portion is likewise true to the spirit of research that can be used by future scholars
and researchers. This work is qualified to stand beside the works of others as credible and
transferable.
For Referential Adequacy, this study was completely assessment based. Since
assessments are virtually universal to all disciplines, this study can easily be replicated. Care
was taken to reduce the variables that might skew the data in this study; with the exception
of the assessments themselves, no differences occurred between the control and treated
groups. Since most classrooms consist of instruction followed by assessment, a researcher
could easily reproduce this study.
Transformational
Catalytic Validity is the degree to which the researcher anticipates this study will
transform the subjects, participants, and the school (Lather as cited in Kinchloe & McLaren,
1998). Because of this study, I was approached by colleagues interested in the concept of
tiered assessment models for their own fields, and by administrators interested in how
student outcomes increased. Since differentiation now has a strong hold on today’s
education, I expect this study to generate interest in this researcher’s general area and,
hopefully, in anyone this study reaches. Being a math teacher, I consulted the science
department at my school, and they chose to roll out this model in the fall of 2011.
The students involved in this study seemed to develop an ownership of their grades
and learning from having the option to decide at what level they would express it under the
tiered model. At this time it is not possible to say whether the students’ renewed interest in
their learning was linked to the choice or to the tier; further research is needed to develop
those inferences. I have implemented this model full time in my classroom, and it has seen
success holistically, not only in grades but in education on the whole.
CHAPTER FOUR – RESULTS
Focus question 1 investigates the pedagogy of the study, the design if you will. The
data shell, Table 3.1, on page 26 of this thesis lists the focus questions and the data
collection methods for each. Focus question 1 presents three methods of data collection:
the unit plan for the study, the peer-reviewed rubric for the unit plan, and the archival data
collected for the study. The unit plan and rubric are located in Appendices A and B,
respectively. The archival data appear throughout Chapter Two of this thesis.
The unit plan was written with the state of Georgia’s Department of Education
standards, called the Georgia Performance Standards [GPS], as a resource for alignment.
The validity for this resource and data collection method is discussed in Chapter Three of this
thesis. Furthermore, the rubric for the unit plan was peer reviewed by the Title I mathematics
instructional coach at my high school. This review was to ensure alignment to the GPS and
the mathematics content of the course while putting a highly qualified and trained eye on
the study’s pedagogy. The validity of this method of data collection is also discussed in
Chapter Three of this thesis.
The review of the literature was conducted to locate previous academic studies to ensure
consistency and reliable findings for this study. Once again, the validity of this method is
also discussed in detail in Chapter Three of this thesis. From the archival data, the emerging
theme of the literature is that educators and education designers are probing for anything that
will increase student learning and mastery of the concepts. In doing so, some educators
and designers of education have found success with tiered alternative assessment within
the classroom. Those researchers have run into problems with their own studies,
such as Watt (2005), who had trouble with veteran teachers not buying into the ideas of
tiered assessments; the teachers’ attitudes toward the testing method were poor. This
same notion was echoed by Tomlinson (2000b), who urged that teachers must accept that
change is necessary to teach today’s students and that assessment change and differentiation
are a natural progression of instructional differentiation; some teachers were resisting the
change. The literature also illustrates the emerging theme that, at least for many of the
researchers spotlighted in Chapter Two of this thesis, tiered assessments within the
classroom have had some success, and student outcomes have increased.
Focus question 2 of this research deals with the student outcomes of my study. As
explained in Chapter Three of this thesis, the untreated, or control, group consisted of 21
students: 17 regular education students, 3 mainstreamed special education students, and
1 gifted student. The treated group consisted of 28 students: 23 regular education students,
2 mainstreamed special education students, and 3 gifted students.
The students in both groups were given an identical pretest on the content prior to any
instruction. This took place on day 1 of the study. The students had never been exposed to the
material prior to the pretest, as the content is not covered in any prerequisite course. The
highest grade on the pretest in the untreated group was 72%; the lowest was 0%. The class
mean of the untreated group on the pretest was 19.5%, and the median score was 12%. For the
treated group, the highest pretest score was 45% and the lowest was 6%. The class mean for
the treated group was 21.8%, with a median score of 18%.
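Descriptive statistics like those above (high, low, mean, median) can be reproduced for any score list with Python's standard library. The scores below are illustrative placeholders, not the study's actual pretest data.

```python
from statistics import mean, median

# Placeholder pretest scores; the study's actual score lists are not reproduced here
untreated_pretest = [72, 45, 20, 12, 12, 8, 5, 0]

high, low = max(untreated_pretest), min(untreated_pretest)
class_mean = mean(untreated_pretest)      # 21.75 for this made-up list
class_median = median(untreated_pretest)  # 12 (mean of the two middle scores)
```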
The two classes’ pretest scores were compared using an independent t-test with unequal
variances at an alpha (significance level) of P < 0.05. The null hypothesis for the t-test was
that there was no significant difference between the groups’ pretest scores. For the hypothesis
to be rejected, the obtained value [OV], calculated from the data, must be larger than the
critical value [CV], set by the alpha of 0.05. From the independent t-test of the pretest
scores between the treated and untreated groups, the CV was 1.693889 and the result of
the t-test was t(32) = –0.51458, p > 0.05. The purpose of this t-test was to show that both
groups were relatively equal in ability and prior knowledge of the material from the onset of
the study. Rejecting the null would quantitatively show a significant difference between the
groups and not rejecting the null would show no significant difference in the students’ ability
at the beginning of the study.
Posttests were given to both groups at the end of the unit, the last day of the study.
The highest grade on the untreated group’s posttest was 95% and the lowest was 35%. The
mean for this group was 67.3% and the median score was 71%. This posttest was untiered
and without student choice; the students simply completed the assessment they were handed
on the final day of the study. A
dependent t-test was conducted using a 0.05 alpha, with the null hypothesis being no
significant difference between the two sets of scores. This dependent t-test was used to
determine the natural learning growth normally expected on the posttests: as instruction
was delivered throughout the unit, natural learning was going to occur, and the t-test helped
determine how much of it to expect in the final score comparisons. The results from this
t-test of the untreated group’s pretest and posttest scores were t(20) = –9.4381, p < 0.05,
and the CV was 1.724718.
The treated group’s posttest scores were also dependently t-tested against the pretest
scores for that group. The treated group’s highest grade on the posttest was 95% and the
lowest was 52%. The mean score was 75.1% and the median score was 72.5%. The
dependent t-test was conducted using a 0.05 alpha, with the null hypothesis being no
significant difference between the two sets of scores. This t-test was compared with
the dependent t-test of the untreated group’s pretest and posttest scores to determine whether
student outcomes were higher in the treated or the untreated group, so that conclusions could
be drawn on the effectiveness of tiered assessment in the classroom. This test could also show
a similarity in the normal learning curve for each group. The results for this t-test were
t(27) = –17.183, p < 0.05, and the CV was 1.703288.
The final t-test was an independent t-test with unequal variances on the treated
group’s posttest scores versus the untreated group’s posttest scores. The alpha for calculating
the critical value of the t-test was 0.05, and the null hypothesis was no significant
difference between the scores. This test was used to determine the effectiveness of a tiered
assessment model in a multi-level/ability classroom, giving a more accurate account of
how well, or not so well, the assessment model performed in the study. The results for this
t-test were t(30) = –1.89661, p < 0.05, and the CV was 1.695519.
An effect size test was run on the posttests of the treated group against the posttests of
the non-treated group. Since the test compared two different classes from two different
populations, a Cohen’s d was calculated on these data. The treated group had a mean score of
75.07143% with a standard deviation of 11.98434. The non-treated group had a mean score
of 66% with a standard deviation of 18.84005; the Cohen’s d = 0.57. In addition, an effect
size statistic was run on the pretest and posttest scores of the treated group and of the control
group. The effect size for the treated group’s pre/posttest scores was 0.91, and the effect size
for the control group’s pre/posttest scores was 0.82.
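The reported Cohen's d can be reproduced from the listed means and standard deviations, assuming the pooled standard deviation is taken as the root mean square of the two group SDs. That is a common convention; the study does not state which form it used.

```python
from math import sqrt

def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d using the root mean square of the two standard deviations."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd

# Means and standard deviations as reported for the two posttest groups
d = cohens_d(75.07143, 11.98434, 66.0, 18.84005)  # rounds to the reported 0.57
```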
Focus question 3 deals with the attitudes of the subjects and the researcher throughout
the study. As part of the data collection for this thesis, a reflection journal was kept by this
researcher to record the day-to-day observations and attitudes of the students as well as of
this researcher. Even though the journal was more of a summary piece dealing with the study
as a whole, it will be discussed first here but last in Chapter Five, where it will be analyzed.
Writing prompts for the journal were used to ensure consistency throughout the process. The
emerging themes were recorded and interpreted; the interpretation of these themes will be
revealed in Chapter Five of this thesis. For now, the emerging themes are listed from the
journal in two parts: the students’ attitudes toward the study and its execution, and this
researcher’s attitude toward the study and its pedagogy.
From the reflective journal kept by this researcher throughout the duration of this
study, several themes emerged. First, the students seemed to be a little bewildered with the
process of tiered assessments in the beginning of the study, but as the study progressed, this
confusion began to subside. The students frequently asked questions about how each test
was structured and aligned with the standards of the class. They were also curious about how
many points each test counted toward their overall grade so that they could plan which test
to choose for the points they felt they could earn.
Another observation made during the process was the students seemed very aware of
what they understood and what they needed to learn in terms of the content; they would
frequently ask questions such as, “If I understand addition, subtraction, and multiplication,
but not inverses, which test should I take to make the best grade I can?” Then follow up
questions would be, for example, “If I learn inverses, should I take the Excels test?”
Towards the end of the study, when the assessment was within reach, students would also say
things like, “I don’t believe I know enough to take the Excels test.” Another student was
overheard saying, “I made a 100 on the first quiz and a 92 on the last one, so I am definitely
taking the ‘X’ test.” X refers to the Excels test.
Pedagogical observations also became apparent during the study. It was noted that
structure, in the form of a study guide for each test, seemed to be needed. The students
wanted to know what they needed to understand, as a checkpoint for choosing the appropriate
test and for determining how well they would do on that chosen assessment. This was asked for
by many throughout the study, but mostly toward the end when the students started planning
for which assessment they would choose.
It was also recorded that my school’s administration and Title I content coach began
taking notice and asking questions about how and why the study was conducted. Some other
content department leaders also became interested in the study and its workings. It was
even mentioned at our school system’s monthly math meeting, highlighted in the
differentiation portion of the meeting. Teachers, colleagues, and other education
professionals were taking interest in the study pedagogically, and they wanted the results of
the study, whether it worked or not, “worked” meaning whether it increased student learning
and outcomes.
Two surveys were completed by the subjects of the study. One survey was
administered to both groups (see Appendix E) as a hypothetical, baseline-developing survey
after the completion of the unit. It consisted of eight questions answered on a Likert scale
from 1 to 5, 1 being “Strongly Disagree” and 5 being “Strongly Agree.” This anonymous
survey was designed to give the researcher an overall assessment of the attitudes of the
subjects with regards to the content and their attitudes about the possibility of a tiered
assessment program at some unknown point in the future. The control group knew nothing
of the study or of a tiered assessment policy at the time of this survey. A Cronbach’s Alpha
was run on this survey for internal consistency reliability, with an obtained Alpha of 0.44 for
the treated group and 0.19 for the control group.
Table 4.1: Chi Square Values for Treatment (n = 28) and Control (n = 21) Student Surveys

Q1: I like math. | Treatment: 2.1 | Control: 3.7
Q2: I feel that a test grade shows my teacher how much I really know about a unit. | Treatment: 7.6 | Control: 3.1
Q3: I feel that having a choice on what level of test I take will improve my chances to pass. | Treatment: 18.8*** | Control: 18.6***
Q4: I like having an option on which level test I take. | Treatment: 34.4*** | Control: 32***
Q5: I feel that taking one version of a test will increase my chances of failure. | Treatment: 26.3*** | Control: 2
Q6: I feel that if I know the material and I am properly prepared, the type of test I take will not affect my grade. | Treatment: 5.6 | Control: 5.3
Q7: The tiered test options gave me confidence that I could pass the test. | Treatment: 19*** | Control: 6.4
Q8: All students should take the same tests. | Treatment: 11.6* | Control: 10.9*

* P < 0.05, ** P < 0.01, *** P < 0.001
From this survey, a chi squared statistical value was obtained for each question of the
Likert scale formatted survey. In the treated group, from Table 4.1, question 1 showed
χ²(4) = 2.09, p > 0.05. Question 2 gave χ²(4) = 7.64, p > 0.05; question 3,
χ²(4) = 18.80, p < 0.001; and question 4, χ²(4) = 34.40, p < 0.001. The remaining questions,
5, 6, 7, and 8, yielded χ²(4) = 26.31, p < 0.001; χ²(4) = 5.58, p > 0.05; χ²(4) = 19.00,
p < 0.001; and χ²(4) = 11.60, p < 0.05, respectively. Each question will be listed in Chapter
Five of this thesis, along with its chi squared value and the interpretation of each question.
The non-treated group’s chi squared values for this survey are as follows: question 1,
χ²(4) = 3.67, p > 0.05; question 2, χ²(4) = 3.11, p > 0.05; question 3, χ²(4) = 18.59,
p < 0.001; and question 4, χ²(4) = 32.00, p < 0.001. The values for questions 5 through 8
are: question 5, χ²(4) = 2.00, p > 0.05; question 6, χ²(4) = 5.33, p > 0.05; question 7,
χ²(4) = 6.44, p > 0.05; and question 8, χ²(4) = 10.89, p < 0.05.
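Each χ²(4) value above is consistent with a goodness-of-fit test of one Likert item's five response counts against a uniform expectation, hence df = 5 − 1 = 4. Below is a sketch with hypothetical counts, assuming scipy is available; the counts are not the study's actual responses.

```python
from scipy.stats import chisquare

# Hypothetical counts for one Likert item, Strongly Disagree .. Strongly Agree
observed = [2, 3, 4, 9, 10]                    # 28 respondents in total
expected = [sum(observed) / 5] * 5             # uniform expectation: 5.6 per category

stat, p_value = chisquare(observed, expected)  # df = 5 - 1 = 4
```

For these made-up counts the statistic works out to 9.5, just above the 0.05 critical value of 9.488 for four degrees of freedom.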
Another survey, actually the first of the two sequentially, was given to the treated
group just prior to the beginning of the unit, before any instruction or preparation for the
assessment had begun. The completely anonymous survey consisted of five open-ended
questions that were coded for emerging themes. In question 1, the students simply listed
which level of assessment they chose for the posttest. Of the three choices, three students
chose the Meets, or lowest level, test; thirteen chose the Exceeds, or middle level, test;
and seven chose the Excels, or highest level, test.
Question 2 asked the students to explain why they chose that particular test. The three
factors that drove their decisions were security from failing, confidence in the material, and
the points offered for each test. Question 3 posed the counter question: why not choose one
of the other posttests? The points offered for each test, confidence in the material, and the
students’ feeling about whether the test was a true measure of what they actually knew were
the most common reasons given for question 3.
Tiered Summative Assessment 47
Question 4 asked the students why having a choice in their post assessment was
important to them. The most common answers were that the choice gave them confidence
that they could do well on the assessment, and that it gave them flexibility and control of
their learning, putting it more on their terms. The students also felt that the multiple levels
gave them a sense of preparation: they knew either how prepared they were or how prepared
they needed to be prior to the test. One student felt that having a choice did not matter.
Question 5 asked the students to explain why a tiered assessment program would
help or hinder his or her grade holistically. The emergent themes here were the familiar
ones: confidence, preparation, and flexibility. The option gave the students confidence to do
well holistically if continued throughout the course, gave them a clear picture of how prepared
they were or needed to be from each unit to the next, and gave them the flexibility to change
their level from unit to unit depending on how well they felt they understood each unit. And
once again, one student felt that the tiered assessment program did not matter and would not
change his or her final grade.
Tiered Summative Assessment 48
CHAPTER FIVE – ANALYSIS AND DISCUSSION OF RESULTS
Analysis of Results
For focus question one, how can tiered assessments be infused into the curriculum, data
were gathered on three aspects of the question. A unit plan was devised by the researcher
with detailed plans for implementation of the tiered assessment. This plan was aligned with
the content of the course as laid out by the county’s department of education, which was in
turn structured on the state of Georgia’s curriculum through the
Georgia Performance Standards [GPS]. A rubric was also developed based on the unit plan to
be used solely for grading the unit plan by the school’s Title I Mathematics Coach. The
Math Coach has more than 20 years’ experience in the classroom.
The purpose of the rubric was to have an experienced set of eyes, unrelated to the
study, examine the unit plan checking for validity and alignment to the content and GPS
standards. In addition, the Math Coach, whose training is based on student success with
increased use of differentiation in the classroom, was used as the grader of the rubric for the
purpose of aligning the tiered assessments with the content under the umbrella of the use of
differentiated assessments in the classroom. With the training, experience, and title, the
Math Coach was well qualified to grade the rubric and unit plan accurately and with merit.
The results of the rubric from the Math Coach revealed that the unit plan was well
aligned not only with the county’s curriculum but also with the GPS. In addition, the unit
plan and rubric show that each assessment in the tier was aligned by difficulty with the
Georgia High School Graduation Test [GHSGT]: the Meets assessment, the easiest test,
was aligned in question difficulty with a minimum passing grade on the GHSGT. The
Exceeds assessment was successfully aligned with a score of 516 on the GHSGT, which is
equivalent to an Exceeds Standards score on the GHSGT. The Excels assessment was
aligned in difficulty above the GHSGT, as its questions are deemed too difficult to appear
on the GHSGT but are still aligned with the content put forth by the GPS.
In addition to having the unit plan scrutinized, via a rubric specifically designed to test
for such alignment and validity, by personnel experienced in teaching, in county and state
standards, and in hands-on work with the GHSGT, authors’ works were researched for
specifics on how to integrate a tiered assessment program seamlessly and successfully into
the curriculum. Wormeli (2006) explains the importance of
students having a personal connection with the assessment when possible. A student
choosing which level of assessment to take is that student owning his or her test and having
a say in the final product. Scouller (1998) explains that infusing instruction with a
choice-based tiered assessment will give the students a personal stake in the assessment and
their learning, and Scouller’s (1998) study showed significant differences in students who
were offered different levels of assessment as measured against a group of students who
were offered only one test.
For focus question two, what is the process by which tiered assessment effectiveness
can be measured, data were also gathered on three aspects of the question. First, the two
groups were given the same pretest over the material. This pretest was to establish that the
groups were on an equal playing field in terms of prior knowledge before the unit began.
From the scores of the pretests, an independent t-test with unequal variances was run on the
two groups, because the sizes of the two groups were different. The t-test showed a critical
value [CV] of 1.693889, and the result of the t-test was t(32) = –0.51, p > 0.05. Since the
absolute value of the Obtained Value [OV] of the t-test was smaller than the CV, statistically
the researcher fails to reject the null hypothesis that there was no significant difference
between the groups’ pretest scores. Therefore, one can conclude that there were no
significant differences between the two groups’ prior knowledge of the material at the onset
of the study, and the aforementioned even playing field was established.
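The decision rule described above, reject the null only when the obtained value exceeds the critical value in magnitude, can be checked by reproducing the critical value from the t distribution. Here scipy's `t.ppf` is an assumed tool; the thesis does not name the software that produced its critical values.

```python
from scipy.stats import t

alpha = 0.05
df = 32                    # degrees of freedom from the pretest t-test
cv = t.ppf(1 - alpha, df)  # one-tailed critical value, ~1.693889

# Decision rule: reject the null only if the obtained value exceeds the CV in magnitude
obtained = -0.51458
reject_null = abs(obtained) > cv  # no significant pretest difference here
```

The value agrees with the 1.693889 reported in the study for the pretest comparison.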
The second part of focus question two, the pre-post test scores of the two groups
independently from one another, was designed to show the natural learning curve as a direct
result of instruction, and any larger gain in student outcomes for the treated group as a result
of the study. Because this study focuses on the summative assessment, measuring for an
increase in student outcomes in the treated group rests on the students knowing beforehand
that they would have a choice of which test to take at the end of the unit. Any increase here
can be attributed to that advance knowledge; the untreated group did not know what to
expect and had no choice in the matter. This is not to say that the treated group saw test
questions prior to their assessment; they did not. They only had a choice on the level of
difficulty of the test. Scouller (1998), cited in Chapter Two, echoes this notion by saying
that students perform better when they have a choice in the assessment process.
Separate t-tests were then run within each group: one comparing the untreated
group’s pretest scores with its posttest scores, and another comparing the treated group’s
pretest scores with its posttest scores. For the untreated group, the t-test revealed a CV of
1.724 and an OV of t(20) = – 9.4, p < 0.05. With the absolute value of the OV greater than
the CV, the null hypothesis that there were no significant differences between the scores is
statistically rejected. That is to be expected, because one score was taken before instruction
and one after.
An effect size was calculated for the untreated group to determine whether the group
was large enough to yield valid results. The effect size for the control group’s pre/posttest
scores was 0.82. According to Salkind (2010), any effect size larger than 0.50 is considered
large. This means that the two sets of scores have little overlap, which makes for a stronger
argument for the validity of the findings; the closer the effect size is to 2, the stronger the
argument that the findings are valid because the testing group was large enough to yield
accurate results. In addition, the effect size for the treated group’s pre/posttest scores was
0.91. Because this effect size is also larger than 0.5 and close to 1, the treated group is
likewise considered large enough to produce valid findings.
In the treated group, a t-test was run for the same reasons as the t-test for the control
group. The t-test of their pretest and posttest scores rendered t(27) = – 17.183, p < 0.05 and
the CV was 1.7. Once again, the null hypothesis should be rejected because the absolute
value of the OV is greater than the CV. Again this is to be expected.
When compared to the learning curve of the untreated group, with its effect size of
0.82, it can be argued that because the students were aware they had a say in the final
assessment, their learning curve increased and they ultimately retained more of the material
during the unit. The treated group had an effect size of 0.91, which makes a stronger
argument than the 0.82 effect size of the untreated group, and it has already been established
that both groups started the unit in the same place in terms of knowledge. It can be said that
the treated group learned more of the same material, almost double by comparison, in the
same amount of time and from the same instruction, differing only in the final assessment
and in the knowledge that the students would decide which assessment they would
ultimately take. Herman et al. (1997) attributed the increased student outcomes of their
study directly to the students knowing, prior to instruction, that they had a choice in the
assessment; significant gains were measured.
The third part of focus question two compared the posttests of the two groups for
significant differences. An independent t-test with unequal variances was run on the two
groups’ posttest scores to test the null hypothesis that there was no significant difference
between them. For this thesis to have viability, there should be a significant difference
between these two groups. The result of this t-test was t(30) = – 1.89, p < 0.05, and the CV
was 1.69. Because the absolute value of the OV is greater than the CV, the null hypothesis is
statistically rejected, and this test showed that there were significant differences in the
scores. Wheadon and Beguin (2010) and Whipp (2004) both recorded significant gains in
middle achieving and low achieving students simply by introducing alternative levels of
assessment for those students to take. Ultimately, the researchers all showed improvements
in student scores by tiering the tests for their respective groups.
As previously discussed in the second part of focus question two, the treated group’s
scores were significantly better than the untreated group’s scores. Thus, the posttest scores
of the treated group increased by a greater magnitude than those of the untreated group. A
Cohen’s d was run on these data to determine whether the groups’ sizes were large enough
to validate the study. The treated group had a mean score of 75.07143% with a standard
deviation of 11.98434. The non-treated group had a mean score of 66% with a standard
deviation of 18.84005; the Cohen’s d = 0.57. The strength of Cohen’s d is read the same
way as effect size: if the number is larger than 0.5, then the group is large enough to argue a
strong case for validity. Since this Cohen’s d is above that mark, this test has validity.
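The Cohen's d above can be reproduced directly from the reported means and standard deviations. The sketch below uses the root-mean-square of the two standard deviations as the pooling method; the thesis does not state which pooling formula it used, but this common variant recovers the reported 0.57.

```python
import math

def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d using the root-mean-square of the two standard
    deviations (one simple pooling; an assumption, since the study
    does not specify its formula)."""
    pooled = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled

# Posttest statistics reported in the study
d = cohens_d(75.07143, 11.98434, 66.0, 18.84005)
# d comes out to about 0.57, matching the reported value
```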
Focus question three, how do students respond attitudinally to tiered assessments,
deals with the students’ and the teacher’s feelings and attitudes about the tiered assessment
model posed in this study. For this part of the study, surveys were administered and a
reflective journal was kept to record the attitudes and observations of the students and the
teacher. These findings were in turn coded for themes and interpreted. In the initial stages
of the study, the treated group was given a survey that asked about their attitudes toward a
tiered testing model and math in general. Second, at the conclusion of the study, a survey
was given to both groups asking how they felt about having a choice in a tiered assessment
program. The control group’s answers were theoretical because this was their first exposure
to the process, while the treated group’s answers were more of a reflection from the
students’ point of view. Lastly, a reflective journal was kept throughout the study by the
researcher, recording observations about the process, the attitudes of the students and
teacher, and any other information the researcher deemed relevant to the study.
The first part of focus question three, the initial survey given to the treated group,
was coded for emerging themes. This survey was discussed last in chapter four, but it is
discussed first here because it came first chronologically. At the time of the survey, the
concept of tiered testing, and specifically the program installed for the study, had been
explained to the students. All of the specifics were covered, the students were given a
handout to read, and all student questions were answered. However, the unit itself had not
yet begun. The survey questions were open-ended and the results were gathered
qualitatively.
For the first three questions, the survey asked which test the student would ultimately
choose, why they would choose that one, and why they would avoid the other tests offered.
Most students chose the hardest, X, or the middle, E, test due to the rewards (points) offered
for taking the harder tests. Some shied away from the hardest test, stating that a lack of
confidence in the upcoming material scared them off. However, for the most part, students
did not “take the easy road” as some might expect.
The students felt that they were rewarded for taking the harder tests through a tiered
points option that was part of the model. As the level of difficulty increased, the number of
points they could earn increased respectively. This motivated the students to attempt the
more difficult test for the chance to earn the most points, translating into a higher grade.
Question four of the survey dealt with student choice. It asked students to discuss
what it meant to them to have a choice of assessment. Herman et al. (1997) showed in their
study that students responded positively when given a choice in assessment: scores
increased and more material was retained. The students in this study echoed the responses
of Herman et al.’s students, reporting that having a say in their assessment motivated them
and gave them confidence in the impending assessment. The notion of choice emerged as
one of the vital components of a successful tiered model; without it, the model would not
have been as successful.
Question five of the survey deals with the students’ holistic view of a full-time tiered
assessment system in the classroom. Again, the vast majority of the students enjoyed being
able to choose the difficult test for one unit, change to a less difficult one for the next if they
struggled, and go back to the harder test for the following unit if their struggles diminished.
Confidence in how well they were learning the material was another frequent response.
Students felt that by having an option of which test to take, and by making that choice at the
end of the unit, there was a lower risk in stretching their levels of learning to reach for a
higher learning threshold.
Part two of focus question three was the second survey, given to both groups after
the conclusion of the unit and the posttest. This survey was not open-ended but provided
answer options on a Likert scale from 1 to 5, 1 being “Strongly Disagree” and 5 being
“Strongly Agree.” The eight questions asked both groups their opinions about tiered
assessments and were analyzed using chi-square and Cronbach’s alpha statistics. The
Cronbach’s alpha for the control group, 0.19, and for the treated group, 0.44, indicated
unacceptable internal consistency according to George and Mallery (2003). This could be
due to confusion over some of the questions, which was reported by some subjects, or to
subjects not taking the survey seriously.
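Cronbach's alpha for a set of Likert items can be computed from the item variances and the variance of the respondents' totals. The sketch below is a minimal illustration; the responses shown are hypothetical, not the study's survey data.

```python
def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent rows (one score per
    survey item), using sample variances."""
    k = len(rows[0])            # number of items
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    items = [[row[i] for row in rows] for i in range(k)]
    item_var_sum = sum(var(col) for col in items)
    total_var = var([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical Likert responses (5 respondents x 8 items)
responses = [
    [4, 5, 4, 3, 4, 5, 4, 4],
    [2, 1, 2, 3, 2, 1, 2, 2],
    [5, 4, 5, 5, 4, 5, 5, 4],
    [3, 3, 2, 3, 3, 2, 3, 3],
    [1, 2, 1, 2, 1, 2, 1, 1],
]
alpha = cronbach_alpha(responses)
# Values below roughly 0.5 are usually read as unacceptable
# internal consistency (George & Mallery, 2003).
```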
Neither group’s answers to question one, I like math, were concentrated enough to
be significant. Although this question was not really expected to be significant, the result
does show that the studied groups were ordinary high school classrooms filled with students
of varying likes and dislikes toward school and particular subjects. Question two, I feel that
a test grade shows my teacher how much I really know about a unit, was designed to see
how much students understood about the reason for testing. The results were also not
significantly concentrated toward one answer. Again, this is not a surprise, for many
students never analyze why they are tested beyond the fact that the teacher assigned it.
For question 3, I feel that having a choice on what level of test I take will improve
my chances to pass, both groups answered in the Strongly Agree direction to a significance
of ***, or p < 0.001, meaning there is less than a one-in-a-thousand probability that this
response pattern was a random, chance event. The students felt very strongly that having an
option would contribute to increasing their test scores; having a choice was obviously very
important to them. Question 4, I like having an option on which level test I take, also had
major significance. The students strongly agreed with this question, again to *** or
p < 0.001. This question, along with question 3, shows that the students felt that having an
option was a good thing and that it had the potential to increase their scores.
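The chi-square test used to flag concentrated Likert answers can be sketched as a goodness-of-fit statistic against a uniform expectation. The answer counts below are hypothetical, not the study's data; the critical value 18.47 is the standard chi-square cutoff for p < 0.001 with 4 degrees of freedom.

```python
def chi_square_stat(observed):
    """Chi-square goodness-of-fit statistic against a uniform
    expectation, as a rough test of whether Likert answers are
    concentrated on one option."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical answer counts for one question, options 1..5
counts = [1, 0, 2, 6, 19]          # heavily toward "Strongly Agree"
stat = chi_square_stat(counts)
# With 4 degrees of freedom, the p < 0.001 critical value is 18.47,
# so a statistic this large would be flagged ***.
```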
Question 5, I feel that taking one version of a test will increase my chances of
failure, was a problem question; many students complained during the administration of the
survey that it was confusing, and they did not understand what specifically it was asking.
The treated group used the undecided vote for “I don’t know,” so the question became
significant in the middle, or undecided, direction. The control group’s answers were spread
across the option list, so the question was not significant for them. Because of the issue with
this question, I did not use the information obtained from it in my interpretation or analysis
of the data.
Question 6, I feel that if I know the material and I am properly prepared, the type of
test I take will not affect my grade, was not significant for either group. Many students were
undecided here. This could have stemmed from a question clarity problem, or perhaps the
students simply did not have an opinion on the question.
For question 7, the tier tests option gave me confidence that I could pass the test, the
treated group’s answers were sharply significant toward Strongly Agree, to *** or
p < 0.001. This question really shows that those students felt the model gave them
confidence and ease in taking the test because it was tiered. The control group’s answers
were not significant, but 10 out of the 18 who took the survey answered agree or strongly
agree, only 2 of the 18 marked disagree or strongly disagree, and six answered undecided. I
am not sure why they answered this way, but even with the answers scattered, it still appears
that the majority of the students felt the tiered model gave them confidence to pass the test.
Question 8, all students should take the same tests, was made significant by both
groups to *, or p < 0.05. This question was a direct result of the tiered assessment
conversation, so it was clearly asked in the context of tiered versus non-tiered tests. From
the significance in the Strongly Disagree direction, it mattered to the students that they have
the option of taking a tiered assessment of the material. In all, the students showed in this
survey that having a choice of a tiered assessment was important to them because they felt it
gave them confidence in passing the test. They also showed that, in their minds, tests are not
one size fits all.
The third part of focus question three was the reflective journal that was kept
throughout the study. The journal was an ongoing record of observations and attitudes of
the students and teacher during the process. The most prominent theme from the journal
was that the students were continually trying to understand the process of the tiered model;
it was obvious that they had not been exposed to a tiered assessment model before. As the
students got used to the idea, they became very engaged and aware of how, and how well,
they needed to learn the material in order to take a particular test. The journal also recorded
that the students took a great interest in the test, and that interest translated into the students
engaging the material with purpose throughout the study. This journal finding corroborated
the finding from the student surveys that the students felt having a choice of test was
beneficial to their learning and test scores.
Discussion
This study gave students a choice within a tiered summative assessment model that
was pre-organized and prepared. The model offered easy, medium, and harder summative
assessments over the same material for the students to choose from. It is important that the
choice and the different levels go hand in hand. Without the choice, I feel that the students
would not have embraced the different levels of assessment the way they did, or reached
beyond their comfort zones to pass the hardest test; and of course, without any options, the
choice is a moot point. As the study progressed, through the survey questions and the
observations made, it became increasingly apparent that the choice was just as important as
the multiple levels. This study could have been done without student choice; research was
found in which scholars chose the test for the students based on past and present ability and
performance, with many finding positive results. But I felt that giving the students the
opportunity to choose for themselves would tap into their motivation and accountability for
their own learning, and in the process the students’ outcomes increased.
The results obtained from this study came directly from the students. All I really did
was prepare the lessons and the structure of the tiered program. Other than teaching the
material and doing the work any normal teacher does, the students drove the study, and
especially the results. The students felt like participants in their learning because they were
part of the planning for it. They had a say in the process, at least for the final test, and they
took ownership of that from very early in the study. Further, they prepared for the hard test
or the middle test from the onset. They understood that there was a failsafe test, the easiest
one, so they did not feel they were at risk in trying to learn the more difficult concepts.
I must admit that I was surprised by the findings. I thought students would not step
up to the challenge and would instead take the easy test to get it over with; I am not a
pessimistic teacher by any means, but that was my expectation. That is not at all what
happened. The students, once they understood how the tiered program worked, ran with it.
They asked questions I never thought they would come up with. They showed interest in
their own education, which is increasingly rare, especially in a middle to low level high
school math classroom. They cared about their grades again. They worked toward
understanding the material well enough to take the hardest test. Of the 49 students tested
across both groups, only three M (easiest) tests were taken, one in the treated group and two
in the control group. That surprised me; I never thought the numbers would be that low on
the easiest test. It suggests that the students strive to be better, and that they want to be
more than basic.
Tomlinson (2000a) says of assessment, “with differentiated instruction in full
swing, differentiating the assessment is a natural progression” (p. 28). Two aspects of this
study are very relevant to today’s educational trends. First, the method of assessment is a
form of differentiated instruction and assessment, and no one in the education field has to
be told how important that is. Teachers are constantly trying to find and implement new
ways to motivate, educate, and graduate students by changing up the pace, delivery, and
method of instruction, and now of testing. Second, it is an assessment, specifically a
summative assessment.
Testing drives the education world right now. There are tests for everything, and
there seems to be a different high stakes test every month in our classrooms these days.
Students in this study became excited about the material and even the test itself, and with
this study, students’ test scores increased. That, I am sure, is what every school
administrator in the country is preaching right now.
Choosing the test became a big deal to the students. They gave great thought to
which test they would take. Several students who had prepared to take the middle test
requested, at the last minute, to take the hardest one. That might drive some to tears of joy;
for me, it brought pride. I was proud of my students for striving to be better than average,
and for deciding, for no one but themselves, which test to take without fearing the worst.
As teachers, if we could get most or all of our students, especially the middle and low level
students, to work that hard and care that much, then it would all be worth it.
Structural corroboration was carefully planned throughout this study, and the notion
of triangulation was paramount in its development. Each of the three focus questions was
researched from at least three different aspects, with all of them pointing back to the same
answer. Triangulation was achieved in this study, and therefore the study has credibility.
For the first focus question, how can tiered assessments be infused into the
curriculum, a detailed unit plan, a rubric critique of the unit plan by a highly qualified third
party, and archival methods already published by other researchers guided the focus
question toward one answer. The pedagogy of the study was addressed, and therefore
triangulation for focus question one was achieved.
For focus question two, what is the process by which tiered assessment effectiveness
can be measured, multiple statistical tests were run on the data. Four t-tests, two effect size
statistics, and one Cohen’s d statistic all showed that the groups started in the same place
and that the treated group retained more and yielded higher test scores than the control
group. For good measure, archival data were introduced from other successful researchers,
corroborating that my findings were not uncommon.
For focus question three, how do students respond attitudinally to tiered
assessments, two surveys were used along with a reflective journal for triangulation. Again,
archival data were introduced from scholars who had already recorded their findings on
student attitudes, corroborating my own. As student attitudes toward tiered assessments
became more prevalent during the study, this focus question and its data gathering methods
became the most important part of the study. This was a surprise: because the study was
focused on comparing test scores, focus question two seemingly would have been the
driving force, but the students’ attitudes toward the study are really what make it worth
repeating and implementing as the norm in the classroom.
With the three focus questions approached from at least three different directions, all
pointing toward the same outcome, and with statistical data showing validity and strong
results, this study is indeed strong enough to argue from and make judgments from. It is
credible and has rightness of fit. The study was accomplished with no altering of any data,
and the research done during it points toward the same findings. The study has impressed
this researcher enough that it will become my testing policy for the near future, ready for
full implementation in my classroom next term. Already, two other departments at my
school have shown interest in using this testing method in their classes, and I am currently
“teaching” those colleagues how to implement the program in their classrooms.
Implications
Even though this study was too small to generalize to all classrooms, all students
who are taught will eventually take some sort of summative assessment during their
education. No matter the discipline, because all courses have standards that guide educators
on what is considered basic and advanced understanding of a given topic, there will be an
assessment. When creating the assessment for a given topic, it is easy to build a test bank of
questions ranging from a basic understanding of the concepts taught to a more complex
understanding, even higher than the standards require. From that bank of questions, tests
can be created by grouping the questions by like difficulty; once more than one such
leveled test is formed, a tiered assessment has been created. The testing is summative and
appears at the end of the testing period, so there is little need to alter already constructed
lesson plans. Further, because the testing choice is an individual decision made by each
student, the size of the class or group of students involved in the testing model is limitless.
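The test-bank approach described above can be sketched in a few lines. The questions and structure here are illustrative assumptions, not the study's actual bank; the tier labels follow the study's naming (M easiest, E middle, X hardest).

```python
import random

# Hypothetical question bank: each question tagged with a difficulty tier
bank = [
    {"id": 1, "text": "Solve x + 3 = 7", "tier": "M"},         # basic
    {"id": 2, "text": "Solve 2x - 5 = 9", "tier": "M"},
    {"id": 3, "text": "Solve x^2 - 5x + 6 = 0", "tier": "E"},  # middle
    {"id": 4, "text": "Solve |2x - 1| < 7", "tier": "E"},
    {"id": 5, "text": "Solve x^2 < 4 and graph", "tier": "X"}, # hardest
    {"id": 6, "text": "Prove the quadratic formula", "tier": "X"},
]

def build_tiered_tests(bank, per_test):
    """Group questions by difficulty tier and draw a fixed number of
    questions for each tiered test over the same material."""
    tiers = {}
    for q in bank:
        tiers.setdefault(q["tier"], []).append(q)
    return {tier: random.sample(qs, min(per_test, len(qs)))
            for tier, qs in tiers.items()}

tests = build_tiered_tests(bank, per_test=2)
# tests["M"], tests["E"], and tests["X"] are the easy, middle,
# and hard versions; the student chooses which one to take.
```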
It was discovered that students enjoyed having power over their learning by being
masters of which assessment they would take to show what they had learned. Choosing
their test gave the students a sense of ownership over what they learned, and it motivated
them to work harder because they felt a sense of collaboration with the teacher instead of
subordination. With that said, if teachers can find a way to make students take ownership
of their learning, it will motivate them to learn to the top of their potential and sometimes
push the bar farther, no matter the course description. These students engaged their
education in a way that I have never seen before. This increased engagement came about
because the students found a personal connection to the material, even if just for a good
grade; the students made the learning process their own because they had more control of
the outcome than in a normal situation.
For anyone in the education field who wants their students to achieve higher grades
while taking accountability for their own learning, this process can help. This study has
shown that my students achieved higher test grades and retained more information while
challenging themselves to process the most difficult aspects of the topic, all of their own
volition and without any pressure from the teacher. Moreover, for referential adequacy, this
study is easy to replicate because it is simply a test modification of the normal classroom
environment. The study was simple and focused on one thing, the summative assessment.
Great strides were taken to ensure that no other variables were introduced that might taint
the findings, so the findings would reflect only the alternative testing method. As a
byproduct, reducing the variables made the study simple in design and easy to replicate.
Since the study ended, my students have not wanted to return to having no say in
how they are assessed. They reported that they felt more confident in what they learned and
suffered less test anxiety because they chose the test; it was their decision, not someone
else’s. The students requested that all of their assessments be tiered from now on, and in
trying to help them learn in any way I can, I have accommodated them. A couple of units
after the study, I noticed that grades and the students’ attitudes were dropping, so I tiered
the last two unit tests; grades shot back up and morale increased. So, tiered assessments are
the method by which all of my students will be assessed from here on.
As I reported earlier, many departments in our school and throughout the county
that have become aware of the testing model have shown interest in implementing it in
their departments and classrooms. Tiered assessment in practice will be the norm for my
department next fall, as all the teachers in my department have agreed to implement the
model across the board for math. The social studies and science departments have
scheduled meetings with me and my administrator to develop their own pilot testing
programs. The school has embraced the notion of alternative assessment and is willing to
hear from those trying to help students achieve more.
As for me, I am sold. Starting immediately, all of my classes, at the students’
request, are tiered testing classes for summative assessments. I have learned that I can
become a better teacher by listening to my students and working with them as “colleagues”
in their education, as a facilitator of education instead of a provider of it. My whole
philosophy of teaching has changed: I no longer feel that I am giving students knowledge,
but that I am assisting them in creating their own.
Impact on Student Learning
As I have said in previous paragraphs, the major emerging theme from this study
was the students’ self-imposed accountability for the curriculum. That alone is enough to
change a syllabus. Getting students to care about their own education is a struggle I have
heard voiced by just about every teacher I have ever met. “If I could just get them [the
students] to care about their grade as much as I [the teacher] do…” is something that even I
have said. But taking a statistical look at what this model did for students is another,
related story.
Administrators, schools, systems, states, and even the federal government seemingly
care about one thing: test scores. If test scores are an indication of what a student has
learned, then a higher test score means the student learned more. These scores increased
because of this study. Looking at the pre/post t-tests for each group, the treated group
yielded a higher effect size than the natural learning curve of the control group. The
natural learning curve increased due to the study, and test scores improved. Classroom
means increased and the median score was higher.
Recommendations for Future Research
Fortunately for me, my statistical data all pointed toward the same outcome, so I
really did not have any data that I could not explain. In addition, the qualitative data were
also very concentrated in terms of the emerging themes. However, I do not believe that I
eliminated all of the variables I set out to eliminate; it is ludicrous to believe that all the
variables in a classroom of thirty students could be controlled or eliminated. As I worked
on the study, I realized that there is one condition I did not take into consideration.
I did not examine the effect of test anxiety on students and how it can reduce test
scores. In addition, I did not research how test anxiety can be reduced, or whether student
choice is considered a reliever of test anxiety. As a spinoff of this study, an investigation of
tiered assessment as a reducer of test anxiety would, if favorable outcomes were obtained,
make a strong argument for implementing tiered assessments in the classroom.
The largest part of this thesis that will need further investigation is the idea of
student choice. I realized during the study, from my students, that student choice was the
major contributing factor in the success of this tiered model; without it, I do not feel the
study would have been such a success. I plan to continue this thesis and expand it into a
dissertation with further inquiry into how student choice affects tiered assessments and
how, working together, they can increase student outcomes.
References
Ackermann, E. (2001). Piaget’s constructivism, Papert’s constructionism: What’s the
difference? Future of Learning Group Publication (MIT). 4(3). 438 – 442.
http://learning.media.mit.edu/content/publications/EA.Piaget%20_%20Papert.pdf
Burns, A. (1999). Collaborative action research for English language teachers. Cambridge,
England: Cambridge University Press.
Carberry, A., Ohland, M., & Swan, C. (2010). A pilot validation study of the epistemological
beliefs assessment for engineering (EBAE): First-year engineering student beliefs.
American Society for Engineering Education. 9(1).
Cizek, G.J. (2010). An introduction to formative assessment. In H. L. Andrade, & G. J.
Cizek (Eds.), Handbook of formative assessment. 3 – 17. New York, NY: Routledge.
Crotty, M. (1998). The Foundations of social research: Meaning and perspective in the
research process. Thousand Oaks, CA: Sage Publications. ISBN 0761961054
Eisner, E.W. (1991). The enlightened eye. New York, NY: Macmillan.
George, D., & Mallery, P. (2003). SPSS for Windows step by step: A simple guide and
reference. 11.0 update. (4th ed.). Boston: Allyn & Bacon.
Golafshani, N. (2003). Understanding reliability and validity in qualitative research. The
Qualitative Report. 8(4). 597 – 607.
http://www.nova.edu/ssss/QR/QR8-4/golafshani.pdf
Hendricks, C. (2009). Improving schools through action research: A comprehensive guide
for educators. (2nd Ed.). Upper Saddle River, NJ: Pearson Education, Inc.
Herman, J., Klein, C. & Wakai, S. (1997). American students’ perspectives on alternative
assessment: do they know it’s different? CSE Technical Report 439.
CRESST/University of California, Los Angeles, CA.
Tiered Summative Assessment 67
http://www.nova.edu/ssss/QR/QR8-4/golafshani.pdf
Kinchloe, J., & McLaren, P. (1998). Rethinking critical theory and qualitative research. In N.
Denzin & Y. Lincoln (Eds.), The landscape of qualitative research: Theories and
issues (pp. 260 – 299). Thousand Oaks, CA: Sage Publications.
Kvale, S. (1995). The social construction of validity. Qualitative Inquiry. 1(1). 19 – 40.
Lagrange College Education Department. (2008). Conceptual framework. Lagrange, GA:
Lagrange College.
Linn, R. (1998 November). Assessment and accountability. CSE Technical Report 490.
National Center for Research on Evaluation. Los Angeles, CA.
http://research.cse.ucla.edu/Reports/TECH490.pdf
Maclellan, E. & Soden, R. (2004). The importance of epistemic cognition in student-centered
learning. Instructional Science. 32(3). 253–268. DOI:
10.1023/B:TRUC.0000024213.03972.ce
Nisbett, R.E. & Wilson, T. D. (1977). The halo effect: Evidence for unconscious alteration of
judgments. Journal of Personality and Social Psychology. 35(4). 250-256.
Oberg, C. (2009). Guiding classroom instruction through performance assessment. Online
Journal of Case Studies in Accreditation and Assessment. 1(1). 1–11. ISSN: 1941–
3386. http://www.aabri.com/manuscripts/09257.pdf
Popham, W. J. (2011). Classroom assessment what teachers need to know. (6th Ed.). Boston,
MA: Pearson Education Inc.
Rogelberg, S. G., Fisher, G. G., Maynard, D. C., Hakel M. D., & Horvath, M. (2001).
Attitudes toward surveys: Development of a measure and its relationship to
Tiered Summative Assessment 68
respondent behavior. Organizational Research Methods. 4(3). DOI:
10.1177/109442810141001. http://orm.sagepub.com/cgi/content/abstract/4/1/3
Salkind, N. J. (2010). Statistics for people who (think they) hate statistics: Excel 2007
Edition. (2nd Ed.). Thousand Oaks, CA: Sage Publications, Inc. ISBN 978-1-4129-
7102-7.
Schwartz D. & Arena D. (2009, August). Choice-based assessments for the digital age.
Stanford University, School of Education. Stanford, CA. White paper for the
MacArthur Foundation
http://aaalab.stanford.edu/papers/ChoiceSchwartzArenaAUGUST232009.pdf
Scouller, K. (1998). The influence of assessment method on students’ learning approaches:
multiple choice question examination versus assignment essay. Higher Education.
35(4). 453–472. Netherlands: Kluwer Academic Publishers. DOI:
10.1023/A:1003196224280
Soloman, P. (1998). The Curriculum Bridge: From Standards to Actual Classroom Practice.
Los Angeles, CA: Corwin Press.
Sprick, R. S. (2002). Discipline in the secondary classroom: A positive approach to behavior
management. San Francisco, CA: John Wiley & Sons.
Tomlinson, C. (1995). How to differentiate instruction in mixed-ability classrooms.
Alexandria, VA: Association for Supervision and Curriculum Development.
Tomlinson, C. (2000a). The differentiated classroom: Responding to the needs of all
learners. Alexandria, VA: Association for Supervision and Curriculum Development.
Tomlinson, C. (2000b). Reconcilable differences? Standards-based teaching and
differentiation. Educational Leadership. 58(1). 6 – 11.
Tiered Summative Assessment 69
Tomlinson, C., Kaplan, S., Renzulli, J., Purcell, J., Leppien, J., Burns, D., Strickland, C., &
Imbeau, M. (2009). The parallel curriculum: A design to develop learner potential
and challenge advanced learners. (2nd Ed.). Thousand Oaks, CA: Corwin Press.
ISBN 978-1-4129-6131-8 {pbk.}
Trochim,W. (2006). Research methods knowledge base. Social Research Methods. Online
journal. http://www.socialresearchmethods.net/kb/qualval.php
Watt. H. (2005). Attitudes to the use of alternative assessments methods in mathematics: a
study with secondary mathematics teachers in Sydney, Australia. Educational
Studies in Mathematics. 58(1). 21 – 44.
Wheadon, C., & Beguin, A. (2010). Fears for tiers: are candidates being appropriately
rewarded for their performance in tiered examinations? Assessment in Education:
Principles, Policy, & Practice. 17(3). 287 – 300. ISSN: 0969594X. DOI:
10.1080/0969594X.2010.496239
Whipp. P. (2004). Differentiation in outcomes focused physical education: pedagogical
rhetoric and reality. The University of Western Australia. Paper presented at the
AARE International Educational Research Conference, Melbourne, Nov-Dec 2004.
Wiggins, G. & McTighe J. (1999). Understanding by design. Alexandria, VA: Association
for Supervision and Curriculum Development. http://www.flec.ednet.ns.ca/staff/What
%20is%20Backward%20Design%20etc.pdf
Wood, G. H. (2005). Time to learn: How to create high schools that serve all students. (2nd
Ed.). Portsmouth, NH: Heinemann.
Wormeli, R. (2006). Fair isn’t always equal: Assessing and grading in the differentiated
classroom. Portland, ME: Stenhouse Publishers. ISBN 1-57110-424-0.
Tiered Summative Assessment 70
Yilmaz, K. (2008). Constructivism: Its theoretical underpinnings, variations, and implications
for classroom instruction. Educational Horizons. 86(3). 161 – 172. (EJ798521).
http://www.eric.ed.gov/PDFS/EJ798521.pdf
Appendix A
Lesson Plan: Math 3 Matrix Operations Unit
Name: Scott Barnett
Stage 1 – Desired Results
GPS and/or Elements (use only the elements that you teach in THIS lesson!):
MM3A4. Students will perform basic operations with matrices.
a. Add/subtract, multiply, and invert matrices, when possible, choosing appropriate methods including technology.
b. Find the inverses of two-by-two matrices using pencil and paper, and find inverses of larger matrices using technology.
c. Examine the properties of matrices, contrasting them with properties of real numbers.
MM3A5. Students will use matrices to formulate and solve problems.
a. Represent a system of linear equations as a matrix equation.
b. Solve matrix equations using inverse matrices.
c. Represent and solve realistic problems using systems of linear equations.
Enduring Understandings:
Students will understand that…
Matrices have many properties; students will be able to answer questions concerning determinants, addition/subtraction, multiplication, Cramer's rule, and inverses.
Real World Understandings (What might transfer to their world?):
Students will answer real world questions concerning matrices, such as encryption. They will use formulas to answer questions about inverses and solving systems.
Essential Question(s):
What are the properties of matrices?
How do you use the determinant to find the inverse of a 2 x 2 matrix?
How do you use Cramer’s rule to find the solution to a linear system?
What do the dimensions of a matrix have to do with how two matrices are related?
What kinds of matrices are commutative?
How do the dimensions of a matrix determine how matrices can be multiplied?
How is scalar multiplication different from matrix multiplication?
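For reference, the standard formulas these essential questions build toward (stated here as a brief sketch; they are not part of the original unit materials) can be written as follows. For a 2 x 2 matrix, the determinant and inverse are:

```latex
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
\det A = ad - bc, \qquad
A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\quad \text{(provided } ad - bc \neq 0\text{)}.

% Cramer's rule for the linear system ax + by = e, \; cx + dy = f:
x = \frac{\begin{vmatrix} e & b \\ f & d \end{vmatrix}}
         {\begin{vmatrix} a & b \\ c & d \end{vmatrix}}, \qquad
y = \frac{\begin{vmatrix} a & e \\ c & f \end{vmatrix}}
         {\begin{vmatrix} a & b \\ c & d \end{vmatrix}}.
```

Both results require a nonzero determinant, which is why the unit treats the determinant before the inverse and before Cramer's rule.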
Knowledge (NOUNS for the GPS): Skills (VERBS from the GPS):
Students will know…
Properties of matrix addition/subtraction
Properties of matrix scalar/ matrix multiplication
Formula for determinant/ 2x2 matrix inverse
Real World knowledge (Where do they use this KNOWLEDGE in their real world):
Students will solve problems involving matrices in mock situations including computer email encryption.
Students will be able to…
Understand
Solve
Justify/ verify/ show
Apply
Determine
Real World Applications (Where do they use these SKILLS in their real world):
Use of these skills is evidenced by students’ ability to solve problems and work through tasks effectively.
Stage 2 – Assessment Evidence
Performance Task(s) and Product(s) to be assessed (What will they put in my hand to be assessed that they created individually):
Daily concept worksheets
1 performance task
Formal Assessment Grading Format(s) (How will I grade it, letting them know in advance how to receive every point in my grading scale):
1 unit pre test
3 homework concept checks
2 quizzes covering individual lessons
1 unit post test
Stage 3 – Learning Plan
Procedures/Sequence:
Day 1:
Students will complete the pre-test before beginning the Matrix unit
Students will collect and define terms that will become part of their word wall (a cumulative collection of vocabulary terms needed for Math 3).
Day 2-3:
Students will learn theorems and processes associated with matrix dimension and equality and the properties of matrix addition and subtraction. After a PowerPoint lesson, students will use properties of matrices from their notes to complete 20 questions. Answers will be checked by comparing student work to the worksheet key along with teacher checkpoints throughout the assignment with individual students. A student homework concept check will be taken after day 3.
Day 4:
Students will learn theorems and processes associated with matrix and scalar multiplication. After a PowerPoint lesson, students will use their notes and individual teacher guidance to answer 18 questions. Answers will be checked by comparing student work to the worksheet key along with teacher checkpoints throughout the assignment with individual students. A student homework concept check will be taken after day 4.
Day 5:
Students will learn formulas and processes associated with matrix inverse of a 2 x 2 matrix. After a PowerPoint lesson, students will use properties of matrices from their notes to complete 20 questions. Answers will be checked by comparing student work to the worksheet key along with teacher checkpoints throughout the assignment with individual students.
Day 6 & 7:
Day 6: Quiz #1: concepts: matrix dimension, equality, addition/subtraction, matrix/scalar multiplication. 10 questions.
Day 6 & 7: Students will use a performance task to discover and reinforce concepts and properties of matrices and to transfer those concepts to real-life situations. Answers will be checked by comparing student work to the worksheet key along with teacher checkpoints throughout the assignment with individual students.
Day 8 & 9:
Students will learn concepts and processes associated with solving linear systems of equations with matrix operations (day 8) and Cramer’s rule (day 9). After a PowerPoint lesson, students will use properties of matrices from their notes to complete 15 questions. Answers will be checked by comparing student work to the worksheet key along with teacher checkpoints throughout the assignment with individual students. A student homework concept check will be taken after day 8.
Day 10:
Quiz #2: concepts: 2 x 2 matrix inverse, solving linear systems, Cramer's rule; 10 questions.
Unit test review / flex-grouping remediation.
Day 11:
Post-Test
Enrichment, Hands-On, Student–Centered Activity:
(outlined above)
Materials:
Power Point lessons; worksheets for each day; performance task; tiered post-tests; pre-tests; quizzes; remediation assignments; homework concept checks.
1. Student LD: (i.e. Process, Product, Content)
Students may come in before or after school or during their study hall to receive more individual help from the teacher. Collaborative classes will utilize the collaborative teacher to assist in all activities, instruction and smaller group activities.
2. Student ESL: (Process, Product, Content)
Students will receive an outline of the unit in their native language.
Appendix B
Unit Plan Rubric for: Scott Barnett
Scoring scale: 3 (highest) to 0 (lowest); each criterion also carries a Score and a Comments column.

Standards/Learning Objectives
3: Curriculum standards and learning objectives are specific and clearly stated, linked to each concept.
2: Curriculum standards and learning objectives are specific but vaguely stated and linked to each concept.
1: Curriculum standards and learning objectives are included but not specific nor linked to each concept.
0: No presence of curriculum standards and learning objectives.

Check Points for Mastery
3: Check points for mastery are frequent and varied for student redirection and remediation.
2: Check points for mastery are infrequent OR too similar for student redirection and remediation.
1: Check points for mastery are infrequent AND too similar for student redirection and remediation.
0: No presence of adequate check points for mastery.

Assessment Practices
3: Student product assessed on content and application of the content in a variety of ways.
2: Student product assessed on content and application of the content but not in a variety of ways.
1: Student product poorly assessed on content and application of the content and not enough variety.
0: There is no evidence of assessment of the student.

Summative Assessment
3: Assessment adheres directly to the lesson's standards and is well designed, testing all parts of the unit.
2: Assessment adheres loosely to the lesson's standards OR tests most parts of the unit.
1: Assessment adheres loosely to the lesson's standards AND tests some or few parts of the unit.
0: Assessment has little to do with the standards and the unit.

Overall Focus on Student Outcomes
3: Formative assessments are well placed throughout the unit and are geared toward success on the summative assessment.
2: Formative assessments are well placed throughout the unit but are poorly geared toward success on the summative assessment.
1: Formative assessments are poorly placed throughout the unit and are poorly geared toward success on the summative assessment.
0: Formative assessments are too few and poorly placed and have no correlation to the summative assessment.
Appendix C
Reflective Journal Prompts
1. What did we do today?
2. What went well?
3. What went wrong?
4. How did the students feel about what we did?
5. How do I feel about what we did?
6. Observations?
Appendix D
Student Assessment Survey
Treatment Group
DO NOT PUT YOUR NAME ON THIS
Test Option 1: Meets
Mastery Level: Meets Standards (GHSGT)
Basic Content Mastery (Score 500 on GHSGT)
Maximum Grade: 80/100

Test Option 2: Exceeds
Mastery Level: Exceeds Standards (GHSGT)
Proficient Content Mastery (Score 516 on GHSGT)
Maximum Grade: 95/100

Test Option 3: Excels
Mastery Level: Advanced Standards (GHSGT)
Exemplary Content Mastery (Score > 516 on GHSGT)
Maximum Grade: 110/100
Other information necessary for your test decision:
1. Homework checks are aligned with the Meets Test.
2. Quizzes are aligned with the Exceeds Test.
3. Class notes and instruction are aligned with the Excels Test.
1. Which test would you choose?
2. Why would you choose the test from the previous question?
3. Why would you not choose the other tests?
4. Explain why having a choice on which test you take does/does not matter to you.
5. Explain why a tiered testing program will improve/deteriorate your final grade for this course.
Appendix E
Student Assessment Survey
DO NOT PUT YOUR NAME ON THIS
Rate each statement on a 5-point scale: 1 = Strongly Disagree, 3 = Undecided, 5 = Strongly Agree.

1. I like math.
2. I feel that a test grade shows my teacher how much I really know about a unit.
3. I feel that having a choice on what level of test I take will improve my chances to pass.
4. I like having an option on which level of test I take.
5. I feel that taking one version of a test will increase my chances of failure.
6. I feel that if I know the material and I am properly prepared, the type of test I take will not affect my grade.
7. The tiered test option gave me confidence that I could pass the test.
8. All students should take the same tests.