Independent Alignment Review of the Florida Standards Alternate … · 2019-08-21 · Independent...
Florida Standards Alternate Assessment Alignment Study i
2017 No. 041
Independent Alignment Review of the Florida Standards Alternate Assessment – Performance Task (FSAA-PT): Civics, US History, and the Writing Prompts
Final Report
Prepared for:
Vince Verges
Florida Department of Education
Turlington Building
325 West Gaines Street
Tallahassee, Florida 32399
Prepared under:
Florida Department of Education
Turlington Building
325 West Gaines Street
Tallahassee, Florida 32399
Authors: Yvette M. Nemeth, Justin Purl, Tatiana Longabach, Elizabeth Patton, Caroline Wiley
Date: December 26, 2017
Independent Alignment Review of the FSAA-PT: Civics, US History, and the Writing Prompts i
Table of Contents
Executive Summary
  Overview
  Methodology
  Alignment Study Workshop
  Access Point to Standards Alignment Summary
  FSAA-PT to Access Points Alignment Summary
  Recommendations
Chapter 1: Introduction
  Organization and Contents of the Report
Chapter 2: Alignment Study Design and Methodology
  Alignment of Assessments and Standards on Content and Performance
    AP and FSAA-PT Overview
  Content Alignment and Accessibility
  Scope of Alignment Evaluations
    Training
    Panelists
    Materials
    Procedures
    Workshop Progress
Chapter 3: Alignment of Access Points to Standards
  Overview of Access Points
  LAL Criteria
    Criterion 1: Age Appropriateness
    Criterion 2a: Content Centrality
    Criterion 2b: Performance Centrality
    Criterion 3: Content Coverage – HumRRO Alignment Method
    Criterion 4: Content Differentiation
    Criterion 5: Achievement
    Criterion 6: Performance Accuracy
Chapter 4: Alignment of FSAA-PT Tasks to APs
  LAL Criteria
    Criterion 1: Age Appropriateness
    Criterion 2a: Content Centrality
    Criterion 2b: Performance Centrality
    Criterion 3a: Tasks Represent Intended Content
    Criterion 3b: Tasks Represent Intended Categories
    Criterion 3c: Task DOK Represent Alternate Standards
    Criterion 4: Content Differentiation
    Criterion 5: Achievement
    Criterion 6: Performance Accuracy
Chapter 5: Summary and Recommendations
  Access Point to Standards Alignment Summary
  FSAA-PT Alignment Summary
  Recommendations
References
Appendix A. Panelist Alignment Review Materials Samples
List of Tables
Table 1. Percent of Grade-Level APs Which Met Each LAL Criterion
Table 2. Percent of Grade-Level Tasks Which Met Each LAL Criterion
Table 3. Grade/Content Areas Included in Alignment Study
Table 4. LAL Criteria for AP and FSAA-PT Alignment Evaluation
Table 5. Professional and Demographic Characteristics of Panelists
Table 6. Alignment Steps for Panelists’ Ratings
Table 7. Alignment Steps Completed by Each Panel Group June 21-22, 2017
Table 8. Number of Blueprint Standards Compared to APs for Writing
Table 9. Number of Blueprint Standards Compared to APs for Social Studies
Table 10. Percent of Writing APs Rated as Age Appropriate
Table 11. Percent of Social Studies APs Rated as Age Appropriate
Table 12. Percent of Writing APs Linked to On-Grade Level Writing LAFS
Table 13. Percent of Social Studies APs Linked to On-Grade Level NGSSS for Social Studies
Table 14. Percent of Writing APs at Lower, Same, or Higher Levels of Complexity Compared to Related Writing Standards
Table 15. Percent of Social Studies APs at Lower, Same, or Higher Levels of Complexity Compared to Related NGSSS for Social Studies
Table 16. Percent of Linked APs at Various Levels of Performance Centrality – Writing
Table 17. Percent of Linked APs at Various Levels of Performance Centrality – Social Studies
Table 18. Consensus AP Content Differentiation – Writing
Table 19. Percent of APs Rated as Accessible to Different Disability Groups – Writing
Table 20. Percent of APs Rated as Accessible to Different Disability Groups – Social Studies
Table 21. Percent of Writing Tasks Rated as Age Appropriate
Table 22. Percent of Social Studies Tasks Rated as Age Appropriate
Table 23. Percent of Writing Tasks at Various Levels of Performance Centrality
Table 24. Percent of Social Studies Tasks at Various Levels of Performance Centrality
Table 25. Writing Task Alignment Ratings
Table 26. Social Studies Task Alignment Ratings
Table 27. Mean Number of Aligned Writing Items by Content Category
Table 28. Mean Number of Aligned Social Studies Items by Content Category
Table 29. Percent of Writing Tasks at Lower, Same, or Higher Levels of Complexity
Table 30. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Complexity
Table 31. Distribution of Panelist DOK Ratings – Writing
Table 32. Distribution of Panelist DOK Ratings – Social Studies
Table 33. Percent of Writing Tasks at Lower, Same, or Higher Levels of Complexity Compared to Related APs
Table 34. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Complexity Compared to Related APs
Table 35. Percent of Writing Tasks at Lower, Same, or Higher Levels of Volume of Information
Table 36. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Volume of Information
Table 37. Percent of Writing Tasks at Lower, Same, or Higher Levels of Vocabulary
Table 38. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Vocabulary
Table 39. Percent of Writing Tasks at Lower, Same, or Higher Levels of Context
Table 40. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Context
Table 41. Consensus Content Differentiation Across Grades – Writing Prompt Portion of the ELA FSAA-PT
Table 42. Consensus Content Differentiation – Social Studies FSAA-PT Item Set
Table 43. Consensus Student Learning – Writing Prompt Portion of the ELA FSAA-PT
Table 44. Consensus Student Learning – Social Studies FSAA-PT
Table 45. Percent of FSAA-PT Tasks as Accessible to Different Disability Groups – Writing
Table 46. Percent of FSAA-PT Tasks as Accessible to Different Disability Groups – Social Studies
Table 47. Percent of FSAA-PT Tasks as Amenable to Accommodations or Supports – Writing
Table 48. Percent of FSAA-PT Tasks as Amenable to Accommodations or Supports – Social Studies
Table 49. Consensus Whole Test Barriers to Demonstrating Student Knowledge
Table 50. Consensus Whole Test Barriers to Demonstrating Student Knowledge for Certain Disability Groups
Table 51. Percent of Grade-Level APs Which Met Each LAL Criterion
Table 52. Percent of Grade-Level Tasks Which Met Each LAL Criterion
Executive Summary
Overview
The Florida Department of Education (FDOE) requested an external, independent alignment study (review and analysis) of the Florida Standards Alternate Assessment – Performance Task (FSAA-PT) in civics and US history. In addition, the writing prompt portion of the English/Language Arts (ELA) assessment¹ was evaluated. In general, the ELA assessment includes Reading and Writing selected response items as well as a writing prompt. An alignment review provides one form of evidence supporting the validity of the state assessment system. All aspects of the state assessment system must coincide, including the grade-level standards, academic content standards, and each assessment. The FSAA-PT is an alternate assessment designed for students with significant cognitive disabilities. As a result of their cognitive disabilities, these students would not be appropriately assessed by the general statewide assessment program. The assessment is designed to evaluate the Florida Standards Access Points (AP) for Language Arts and the Next Generation Sunshine State Standards for Social Studies Access Points (AP)², a reduced and marginally simplified version of the Florida content standards.
The Human Resources Research Organization (HumRRO) was contracted to complete the alignment of the FSAA-PT for FDOE. Our alignment approach was designed to indicate the extent to which the reviewed APs, associated with the FSAA-PT blueprint for the ELA writing prompt section, civics, and US history, are related to the Language Arts Florida Standards (LAFS) and the Next Generation Sunshine State Standards (NGSSS) for Social Studies. In addition, we evaluated whether the APs are age appropriate, whether they differ in breadth and depth across grade levels, and whether they are accessible to a wide range of students with varying disabilities. Our approach is flexible: it evaluates whether items on the FSAA-PT are related not only to assigned APs, reporting categories, and cognitive complexity, but also to age appropriateness, differing levels of complexity across tasks, and accessibility for students with varying disabilities.
Methodology
HumRRO used the Links for Academic Learning alignment method (LAL) developed by the National Alternate Assessment Center as a basis to conduct the content alignment reviews and analyze the results (Flowers, Wakeman, Browder, & Karvonen, 2007). The original LAL method includes Webb’s methodology for Criterion 3: Content Coverage. HumRRO adapted the LAL method³ to best fit FDOE’s data analysis needs and substituted the HumRRO alignment methodology for Webb’s methodology in Criterion 3. The criteria considered in this study are listed below:
Criterion 1: Age Appropriate – The content is referenced to the student’s assigned grade-level (based on chronological age).
Criterion 2: Standards Fidelity
- 2a: Content Centrality – The target content of the APs maintains fidelity with the content of the original grade-level standards.
- 2b: Performance Centrality – The focus of achievement of the APs maintains fidelity with the specified performance in the grade-level standards.
Criterion 3: Content Coverage (using the HumRRO Alignment Method)
- 3a: Content Representation – A basic measure of alignment between APs and item content. Simply stated, this criterion is a check of the AP assigned to each item by item writers.
- 3b: Category Representation – This is a measure of how well items represent reporting categories as indicated in the test blueprint.
- 3c: Depth of Knowledge (DOK) Representation – This is a measure of the cognitive complexity of tasks and whether that represents the cognitive complexity of the content in the APs.
- 3d: Category Reporting – Reporting categories are sufficiently measured.
Criterion 4: Content Differentiation – The level of differentiation of content across grade-levels within a grade span panel group.
Criterion 5: Achievement – The expected achievement provides the students an adequate opportunity to show learning of grade referenced academic content.
Criterion 6: Performance Accuracy – The potential barriers to demonstrating what students know and can do are minimized in the assessment to increase measurement accuracy of student performance.
¹ An alignment study of the ELA FSAA-PT was conducted in 2016 and results can be found in Nemeth, Purl, and Smith (2016 No. 029). For that study, the full ELA FSAA-PT was evaluated; however, the writing prompts were at the field test stage while the rest of the assessment was operational. Thus, the current alignment study only focuses on the now operational writing prompts and the associated writing APs.
² Downloadable versions of the Florida Standards and Next Generation Sunshine State Standards Access Points can be found at: http://www.cpalms.org/Downloads.aspx
The LAL method is appropriate for alignment of the APs to the corresponding LAFS and NGSSS for Social Studies, as well as for alignment of the FSAA-PT to APs. Criteria 1, 2, 4, and 6 are appropriate for the alignment of APs to Standards. For the alignment of the FSAA-PT to APs, all six criteria are applicable. The methodology described above meets or exceeds prior requirements for Federal peer review.
Alignment Study Workshop
Five panel groups (writing grades 4-5, 6-8, and 9-10; civics EOC; and US history EOC) were recruited from a database of Florida educators, both general education and Exceptional Student Education (ESE) teachers, provided by FDOE and Measured Progress. HumRRO conducted the alignment workshop on June 21-22, 2017 at a hotel in Jacksonville, Florida. The workshop began with a general session to introduce HumRRO staff, review reimbursement logistics, read and sign affidavits of nondisclosure for the secure materials panelists would review, and conduct general training. Throughout the workshop, panelists were told and reminded that the alignment ratings and other evaluative information they provided were independent of both FDOE and Measured Progress.
³ The full LAL method contains an additional criterion. Criterion 1: Academic evaluates whether the content is academic and includes the major domains/strands of the content area. As alternate assessments have progressed, this criterion is no longer of added value. Thus, we did not ask panelists to rate tasks on this criterion and do not refer to it in the report.
Panelists received paper copies of the FSAA-PT to review first. They were also provided paper and electronic copies of various resource materials such as the APs, presentation rubric, DOK definitions, and Panelist Instructions to support their evaluation. Panelists used electronic Microsoft Excel rating forms for their data entry and notes.
The HumRRO project director oriented all of the panelists to the work they would conduct at the workshop and supported the facilitators by answering questions and providing further guidance if needed. Because the project director oversaw all groups, she made certain that process decisions and information were shared among the rooms.
Following the general session, panelists began working in their assigned groups. US history and civics panel groups were located in a separate room free from other groups and distractions. Writing 4-5, 6-8, and 9-10 panel groups were located in one room, since their instructions were similar and could be provided to them at the same time. A HumRRO facilitator was assigned to each of the panel groups. The facilitators reviewed how panelists should use materials and provided detailed training on rating procedures. They also answered questions and guided the pace of the workshop.
Before each alignment step was conducted, facilitators trained panelists on the purpose of the step, the rating code definitions, and entering data in the appropriate rating form. Before allowing panelists to work independently on certain tasks, facilitators had panelists complete the first two to three ratings as a group to ensure everyone understood the task and rating code definitions. Additionally, facilitators conducted periodic consistency checks to ensure panelists were continuing to understand the task.
Access Point to Standards Alignment Summary
For this alignment evaluation, panelists reviewed APs, associated with the FSAA-PT blueprints, for the writing prompt section of the ELA assessment, civics, and US history in multiple ways. First, they evaluated the content centrality (Criterion 2) between the blueprint APs and the corresponding LAFS, and NGSSS for Social Studies. Second, panelists evaluated the progression of content (Criterion 4) from one grade to the next only for the blueprint identified writing APs. Lastly, panelists rated the appropriateness and accessibility (Criteria 1 and 6) of the AP content for this population of students.
The rules for the LAL criteria applied to the alignment between blueprint identified APs and the LAFS and NGSSS for Social Studies are as follows:
Criterion 1: Age Appropriateness (individual panelist rating)
- 90% or more of the APs were rated as ‘adapted’ or ‘neutral’
Criterion 2a: Content Centrality (individual panelist rating)
- 90% or more of the APs were linked to the LAFS or NGSSS for Social Studies
Criterion 2b: Performance Centrality (individual panelist rating)
- 90% or more of the APs were comparable in complexity to the LAFS or NGSSS for Social Studies
Criterion 4: Content Differentiation (consensus group rating)
- Dimension ratings were ‘clear’ or ‘partial’ and the Identical dimension is ‘no’
Criterion 6: Performance Accuracy (individual panelist rating)
- 90% or more of the APs were accessible to different disability groups
Criterion 3: Content Coverage is not included because the content coverage criterion focuses on the relationship between items and APs regarding content, category, and DOK representation and is not applicable to the AP to Standards evaluation. Criterion 5: Achievement focuses on the degree to which the assessment provides evidence of a student’s ability to demonstrate what they know and can do on grade referenced academic content. Thus, this criterion is not applicable to the evaluation of the AP to Standards relationship.
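The decision rules listed above for the AP to Standards evaluation lend themselves to a simple computational check. The Python sketch below is illustrative only and is not the analysis code used in this study. The function names, the dimension keys, the rating labels other than ‘clear’, ‘partial’, and ‘no’, and the example inputs are assumptions introduced for illustration. It also assumes that a result such as "2 out of 5" counts how many of the five Criterion 4 dimensions (breadth, depth, prerequisite, new learning, and identical) satisfied the rule.

```python
# Illustrative sketch of the AP-to-Standards decision rules; not HumRRO's code.
# Function names, dictionary keys, and example inputs are hypothetical.

PERCENT_THRESHOLD = 90.0  # Criteria 1, 2a, 2b, and 6 require 90% or more


def meets_percent_rule(percent_meeting: float) -> bool:
    """Criteria 1, 2a, 2b, 6: 90% or more of the APs received a qualifying rating."""
    return percent_meeting >= PERCENT_THRESHOLD


def count_differentiation_dimensions_met(ratings: dict) -> int:
    """Criterion 4: count the dimensions that satisfy the consensus rule.

    Breadth, depth, prerequisite, and new learning count when rated
    'clear' or 'partial'; the Identical dimension counts when rated 'no'.
    """
    content_dims = ("breadth", "depth", "prerequisite", "new_learning")
    met = sum(1 for dim in content_dims if ratings.get(dim) in ("clear", "partial"))
    if ratings.get("identical") == "no":
        met += 1
    return met


# Hypothetical consensus ratings; labels such as 'limited' and 'none' are assumed.
example_ratings = {
    "breadth": "none",
    "depth": "partial",
    "prerequisite": "partial",
    "new_learning": "limited",
    "identical": "yes",
}

print(meets_percent_rule(92.5))                               # True
print(count_differentiation_dimensions_met(example_ratings))  # 2
```

Keeping the percentage rule and the consensus rule in separate functions mirrors the distinction the report draws between individual panelist ratings and group consensus ratings.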
Table 1 provides summary conclusions on the alignment of the blueprint identified APs to their respective LAFS and NGSSS for Social Studies. As a reminder, only the writing APs and LAFS are of interest in this alignment study. For non-writing APs and LAFS, refer to the Nemeth et al. (2016 No. 029) report. If APs met the criterion, a green highlighted box containing a ‘✓’ is assigned. For results falling slightly below a criterion, a yellow highlighted box containing the criterion results is assigned. Finally, a red highlighted box contains results that fell well below the criterion.
Table 1. Percent of Grade-Level APs Which Met Each LAL Criterion
Grade/Subject | Criterion 1: Age Appropriate | Criterion 2: Content Centrality | Criterion 2: Performance Centrality | Criterion 4: Content Differentiation | Criterion 6: Performance Accuracy
 | Is the content of the APs age appropriate? | Does the AP content link with the associated LAFS or NGSSS? | Are the APs comparable in complexity to the LAFS & NGSSS? | Does content differ across grade-levels within a grade span?⁴ | Are barriers to demonstrating student knowledge minimized?
 | Tables 10-11 | Tables 12-13 | Tables 16-17 | Table 18 | Tables 19-20
W4 | ✓ | ✓ | ✓ | 0 out of 5 | ✓
W5 | ✓ | ✓ | ✓ | 0 out of 5 | ✓
W6 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W7 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W8 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W9 | ✓ | ✓ | ✓ | NA | ✓
W10 | ✓ | ✓ | ✓ | NA | ✓
Civ | ✓ | ✓ | ✓ | NA | ✓
USH | ✓ | ✓ | ✓ | NA | ✓
⁴ For Writing grades 4-8, a comparison between this study and the 2016 alignment study (see Nemeth, et al. [2016 No. 029]) reveals drastically different results. In the 2016 alignment study, panelists evaluated all the blueprint identified APs for Language Arts associated with the ELA FSAA-PT. However, the current alignment study required panelists to review the blueprint identified APs for Language Arts associated with only the writing prompt section of the ELA FSAA-PT.
In general, the blueprint identified APs exhibited high content linkage with the grade-level standards. Specifically, the APs across all grades and subjects were rated by panelists as age appropriate (Criterion 1) and were found to assess the same content and performance expectations as the grade-level standards (Criterion 2) for all grades and subjects. Panelists felt that the blueprint identified APs were accessible to different disability groups (Criterion 6).
Criterion 4 (content differentiation) was assessed only for writing grades 4-5 and 6-8 and was conducted as a group consensus activity within each grade span. The civics assessment and the US history assessment were not intended to have content differentiation between grades. Similarly, writing grades 9-10 APs were the same between these two grades. Content differentiation appears to be an area in need of improvement. For writing grades 4-5, panelists found content differentiation to be low in all areas (breadth, depth, prerequisite, new learning), and consequently rated the APs to be identical between the grades. For writing grades 6-8, the panelists found no differentiation in breadth between any of the three grades, low differentiation in new learning between grades 7 and 8, and partial differentiation in depth and prerequisite. As a result, they concluded that one of the APs (AP 2.4) is identical across the grades.
FSAA-PT to Access Points Alignment Summary
Table 2 provides summary conclusions on the alignment of the FSAA-PT writing prompt section of the ELA, civics, and US history assessments to the associated LAFS and NGSSS for Social Studies APs, respectively. If tasks met the criterion, a green highlighted box containing a ‘✓’ was assigned. For results falling slightly below a criterion, a yellow highlighted box containing the criterion results was assigned. Finally, a red highlighted box contains results that fell well below the criterion.
The rules for the LAL and HumRRO criteria applied to the alignment between FSAA-PT tasks and APs are as follows:
Criterion 1: Age Appropriateness (individual panelist rating)
- 90% or more of the tasks were rated as ‘adapted’ or ‘neutral’
Criterion 2b: Performance Centrality (individual panelist rating)
- 90% or more of the tasks were rated as ‘some’ or ‘all’
Criterion 3a: Content Representation (individual panelist rating)
- 90% or more of the tasks were rated as ‘partial’ or ‘fully’ aligned
Criterion 3b: Category Representation (based on individual panelist rating)
- Tasks match the FSAA-PT Test Specifications targets
Criterion 3c: DOK Representation (individual panelist rating)
- 50% or more of the prompt tasks and task 3 of an item set were at the same or higher DOK level as the AP
- 90% or more of the assigned complexity ratings are confirmed by panelists for DOK, Volume of Information, Vocabulary, and Context
Criterion 4: Content Differentiation (consensus group rating)
- Dimension ratings were ‘clear’ or ‘partial’ and the Identical dimension is ‘no’
Criterion 5: Achievement (consensus group rating)
- 6 of the 7 dimensions have some level of inference, either low or high
- At least 4 dimensions have a high level of inference
Criterion 6: Performance Accuracy (individual panelist rating)
- 90% or more of the tasks were accessible to different disability groups
- 90% or more of the tasks were amenable to accommodations or supports
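Two of the rules listed above combine more than one condition and may be easier to follow as a short sketch. The Python below is illustrative only and is not the analysis code used in this study; the function names and example inputs are assumptions. It treats "6 of the 7 dimensions" as "at least 6", and it assumes that a Criterion 5 result written as "6 out of 7; 3" gives the number of dimensions with some level of inference followed by the number with a high level of inference.

```python
# Illustrative sketch of the Criterion 3c and Criterion 5 rules; not HumRRO's code.
# Function names and the reading of the inputs are assumptions.


def criterion_3c_dok_range_met(percent_same_or_higher: float) -> bool:
    """50% or more of the evaluated tasks are at the same or higher DOK level as the AP."""
    return percent_same_or_higher >= 50.0


def criterion_3c_agreement_met(percent_confirmed: float) -> bool:
    """90% or more of the assigned complexity ratings are confirmed by panelists."""
    return percent_confirmed >= 90.0


def criterion_5_met(dims_with_inference: int, dims_with_high_inference: int) -> bool:
    """At least 6 of 7 dimensions show some inference and at least 4 show high inference."""
    return dims_with_inference >= 6 and dims_with_high_inference >= 4


print(criterion_3c_dok_range_met(17.0))  # False
print(criterion_3c_agreement_met(80.0))  # False
print(criterion_5_met(6, 3))             # False: only 3 dimensions with high inference
print(criterion_5_met(7, 4))             # True
```

Under these assumptions, a result of "6 out of 7; 3" satisfies the first Criterion 5 condition but not the second, which is why both conditions are checked together.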
Table 2. Percent of Grade-Level Tasks Which Met Each LAL Criterion
Criterion 1, Age Appropriate: Is the content of the tasks age appropriate? (Tables 21-22)
Criterion 2, Performance Centrality: Is the item set task comparable in complexity to the AP? (Tables 23-24)
Criterion 3, Content Coverage, Item Alignment: Are tasks fully aligned with APs? (Tables 25-26)
Criterion 3, Content Coverage, Represent Intended Categories: Do tasks adequately represent reporting categories? (Tables 27-28)
Criterion 3, Content Coverage, Task Complexity: Do tasks reflect the range of DOK in the APs?⁵ (Tables 33-34)
Criterion 3, Content Coverage, Task Complexity: Do panelists agree with DOK? (Tables 29-30)
Criterion 3, Content Coverage, Task Complexity: Do panelists agree with Volume of Information? (Tables 35-36)
Criterion 3, Content Coverage, Task Complexity: Do panelists agree with Vocabulary? (Tables 37-38)
Criterion 3, Content Coverage, Task Complexity: Do panelists agree with Context? (Tables 39-40)
Criterion 4, Content Differentiation: Writing: Do prompts increase in complexity across grade levels?⁶ Civics & USH: Do tasks within an item set increase in complexity? (Tables 41-42)
Criterion 5, Achievement: Student achievement demonstrates learning. (Tables 43-44)
Criterion 6, Performance Accuracy: Are tasks accessible to different disability groups? (Tables 45-46)
Criterion 6, Performance Accuracy: Are tasks amenable to accommodations or supports? (Tables 47-48)
W4 17%
80% 70% 75% 75% 2 out of 5
6 out of 7; 3
75%
W5 13% 6 out of 7; 3
W6 88% 17%
0 out of 5
W7 88% 0%
W8 0%
W9 21% 65% 85% 75%
W10 33%
Civ 3 out of 7; 3
USH
⁵ For Writing grades 4-10, a comparison between this study and the 2016 alignment study (see Nemeth, et al. [2016 No. 029]) reveals different results. In the 2016 alignment study, panelists evaluated the field test writing prompts still under development. Also, panelists from last year and this year were not the same educators.
⁶ In the 2016 alignment study, panelists evaluated all tasks and prompts on the ELA FSAA-PT. However, the current alignment study required panelists to review only the writing prompts of the ELA FSAA-PT.
In general, the civics and US history FSAA-PT exhibited good overall alignment with the fewest areas for improvement. The writing prompts associated with the ELA FSAA-PT showed more areas for improvement. Panelists found the APs and assessment tasks for all subjects and grades to be age appropriate (Criterion 1). They determined that, for the most part, the assessment tasks maintain fidelity with the performance expectations in the APs for civics and US history, and for writing grades 4-5, 8, and 9-10. For writing grades 6 and 7, 88% of tasks were found to call for performance levels comparable to those in the standards (Criterion 2).
There were mixed results on Criterion 3. Panelists found the tasks for each grade and subject to be fully aligned with the standards, and the percent of aligned tasks matches test specifications. However, panelists found the task cognitive complexity to be substantially lower than the AP complexity in writing for all grades. In civics and US history, the cognitive complexity of tasks was found to match the AP cognitive complexity. For the most part, panelists agreed with the assigned DOK. There was some disagreement in writing grades 4 and 9, but the overall cognitive complexity assigned by the panelists was either the same or higher. For writing grade 4, panelists agreed with 80% of assigned DOK levels, rating 10% of tasks as requiring a higher DOK, and 10% of tasks as requiring a lower DOK. For writing grade 9, panelists agreed with only 65% of assigned DOK levels, rating other tasks as lower (25%) and higher (10%). Similarly, panelists agreed with most of the Volume of Information levels, except for writing grades 4 and 9. They agreed with 70% of grade 4 writing tasks, and rated the other 30% as having a higher Volume of Information. For grade 9, on the other hand, panelists agreed with 85% of the tasks, and rated the other 15% as having a lower Volume of Information. For the most part, panelists agreed with the Vocabulary rating, with the exception of writing grade 4 and grade 9, where they agreed with 75% of the tasks. For grade 4, the other 15% of the tasks were rated as having a higher Vocabulary level, and for grade 9, 10% of the tasks were rated as having a lower Vocabulary level while 15% were rated as having a higher Vocabulary level. Panelists agreed with the rating of Context in most cases, with the exception of grade 4. In this case, they agreed with 75% of the tasks, and rated the Context of the other 25% of the tasks as higher.
Criterion 4 was evaluated differently for the writing and social studies assessments; however, the criterion was evaluated as a group consensus rating for all panel groups. Since the writing tasks, unlike the tasks for civics and US history, were not ordered from easiest to hardest, these tasks were evaluated for content differentiation in the following way: Is there a progression in breadth, depth, prerequisite, new learning from lower grade prompt 1 and prompt 2 to higher grade prompt 1 and prompt 2? Content differentiation ratings at the prompt level agree with the overall AP content differentiation ratings for these grades. The progression of prompts for grades 4-5 writing was judged to have no differentiation in new learning, limited prerequisite differentiation, and partial differentiation in the breadth and depth. As a result, panelists concluded that content differentiation was limited for these grades. However, for writing grades 6-8, panelists evaluated depth as limited between grades 6-7 and 8, and prerequisite as limited between grades 6 and 7. They stated that breadth and new learning were absent across all three grades, and consequently determined that the tasks were identical between grades 6 and 7, but not between grade 8 and the other two grades. For writing grades 9-10, panelists found clear content differentiation. For civics and US history, panelists were asked to evaluate whether the content differentiation existed from task 1 to task 3 in an item set. Overall, panelists found clear content differentiation for civics and US history.
Criterion 5, a group consensus rating within each grade span, is an evaluation of whether the assessment system, in general, allows students to demonstrate learning. Here, as well, some dimensions were rated by panelists as providing ‘no inference’ of student learning. For example, in civics panelists stated that little inference can be made about the presence of new learning, and that the assessment results may be challenging to generalize across people and settings, and materials and activities. In grade 4-5 writing, panelists stated as a group that the assessment includes tasks where hand-over-hand teacher guidance may reduce the level of inference about student knowledge; therefore, the level of independence was judged to provide no inference about student knowledge. For the most part, across the subjects, panelists felt the FSAA-PT provides an assessment in which student learning can be demonstrated.
One thing we found in the course of this study is that even after we discussed with panelists the allowable accommodations and modifications as described in the test administration manual, panelists tend to think back to how these and similar assessments are being administered in the field. In some cases, for example, if a teacher is unable to elicit a response from a student by the means specified in the test administration manual, they are going to implement some solutions that are not explicitly prohibited, but also not explicitly endorsed in the manual. While the manual does not explicitly endorse hand over hand assistance, the participants observed it being implemented in the field, and mentioned it in the discussion. We value these statements by the teachers, even though they diverge from the test administration manual instructions, since they come from their expertise. To make ratings more consistent, it may be helpful to state more explicitly in the test administration manual not only which accommodations/modifications are allowed, but also which ones are prohibited.
For Criterion 6, the ratings provided by panelists for all grades and subjects except for writing grade 4 found 100% of the tasks to be accessible to different disability groups. For writing grade 4, only 75% of the tasks were rated as accessible to different disability groups. Panelists voiced concerns about the tasks translating to ASL, and students with visual impairments having trouble with some tasks.
Recommendations7
HumRRO makes the following recommendations to strengthen the alignment between the components of the Florida assessment system:
Review the cognitive complexity of writing tasks. Tasks should assess APs at the same or higher complexity level. This ensures that each task appropriately assesses the AP content on which the student is asked to demonstrate knowledge and ability. The majority of writing tasks associated with prompt 1 did not assess students at a cognitive complexity level similar to that of the AP; instead, the tasks were judged to be too low. It is recommended that the writing tasks, particularly for prompt 1, be reviewed to ensure that their cognitive complexity level is in accordance with the assessment design and, if needed, that additional writing tasks measuring a wider range of complexity be developed to better match the cognitive complexity of the APs.
Review content differentiation of writing APs and tasks across grades. APs should increase in content breadth, depth, and newer knowledge, as well as growth on prerequisite skills. Similarly, for assessments in which tasks are structured to increase in cognitive complexity between grades (writing grades 4-10), there should be a progression of breadth and depth in the tasks from grade to grade. However, in writing grades 4-5 and 6-8 no content differentiation was found between APs across grades within grade spans, and little task differentiation was found between grades within grade spans among the prompts. It is recommended, especially for writing grades 4-5 and 6-8, that the APs and tasks be reviewed to ensure appropriate content differentiation within and across grade spans. If the content differentiation between APs, and thus prompts, is not meant to be reflected in the AP itself but in the complexity of the reading passage associated with the writing prompt, then additional training or communication to educators in the field on this point is recommended.
7 A supplemental appendix identifying specific items and tasks that FDOE and Measured Progress may want to review will be provided. It is not for public dissemination because it contains item information.
Review the DOK, Volume of Information, Vocabulary, and Context assigned to tasks. For writing grades 4 and 9, panelists agreed with fewer than 90% of the DOK, Volume of Information, Vocabulary, and Context (writing grade 4 only) levels assigned to tasks. It is recommended, especially for these grades and subjects, that the tasks be reviewed to ensure they reflect the appropriate DOK, Volume of Information, Vocabulary, and Context.
Review the degree to which the assessment provides evidence of a student’s ability to demonstrate what they know and can do. For civics and writing grades 4-5, panelists expressed concern that the tasks may not generalize to people and settings, and materials and activities, and that a student’s responses may not be sufficiently independent from the teacher. It is recommended that the accommodations and modifications that are allowed and not allowed be explicitly stated in the test administration manual.
Review the accessibility of tasks to different disability groups. For grade 4 writing, panelists rated only 75% of the tasks as accessible to all disability groups. Their specific concerns were the accessibility of tasks for deaf students, deaf/blind students, and students who communicate nonverbally with pictures. While only grade 4 writing did not meet the criterion for accessibility of tasks, concerns about these population groups were voiced by panelists in other subject groups as well. It is recommended that accommodations for these groups be provided and/or outlined in a clearer and more specific fashion.
Chapter 1: Introduction
The Florida Department of Education (FDOE) requested an external, independent alignment study (review and analysis) of the Florida Standards Alternate Assessment – Performance Task (FSAA-PT) End of Course (EOC) in civics and US history. In addition, the writing prompt portion of the English/Language Arts (ELA) assessment was evaluated. In general, the ELA assessment includes Reading and Writing selected response items as well as a writing prompt. An alignment review provides one form of evidence supporting the validity of the state assessment system. Alignment results demonstrate that assessments represent the full range of the content standards, and they measure student knowledge in the same manner and at the same level of complexity as expected in the content standards. All aspects of the state assessment system must coincide, including the grade-level standards, academic content standards, and each assessment.
FDOE requested the alignment study to meet both state and federal requirements. The federal requirement of the U.S. Department of Education (USDE) stems from the No Child Left Behind (NCLB) Act of 2001 and most recently the Every Student Succeeds Act (ESSA) of 2015. The federal government has established regulations for students with significant cognitive disabilities, often referred to as the “1% rule” (U.S. Department of Education, 2005). This rule allows the state to accommodate students with significant cognitive disabilities by setting different performance expectations for up to 1% of their student population. States can develop alternate academic standards, achievement standards, and assessments that more fairly and accurately demonstrate the achievement of these students. However, states must show that the alternate academic standards and achievement standards for these students link to the general, statewide grade-level expectations, although the breadth and depth of these expectations can be reduced (USDE, 2005).
The FSAA-PT is an alternate assessment designed for students with significant cognitive disabilities. Because of their cognitive disabilities, these students would not be appropriately assessed by the general statewide assessment program. The FSAA-PT EOC in civics and US history consists of 16 operational item sets, with each item set containing three tasks ranging from low to high complexity. A student’s teacher can scaffold the first task, if needed, by reducing the response options if the student does not respond correctly. The second and third tasks do not allow for scaffolding if the student responds incorrectly. Students are also provided with appropriate stimuli, as needed, to demonstrate parts of the question. The FSAA-PT writing prompt portion of the ELA assessment consists of two prompts that represent two levels of complexity. Prompt 1 consists of five selected-response questions in response to text. These questions are not written to increase in complexity, but are intended to lead a student to a full writing product. All five questions must be administered to the student; there is no scaffolding allowed. Prompt 2 is an open response format that requires a student to create a writing product (e.g., an essay). The assessment is designed to evaluate the Florida Standards Access Points (AP) for Language Arts and the Next Generation Sunshine State Standards for Social Studies Access Points (AP)8, a reduced and marginally simplified version of the Florida content standards.
Therefore, in accordance with federal requirements, Florida must demonstrate that: (1) the Language Arts Florida Standards (LAFS) and the Next Generation Sunshine State Standards (NGSSS) for Social Studies link to the corresponding APs; and (2) the FSAA-PT writing prompt portion of ELA, civics, and US history link to the corresponding APs.
Organization and Contents of the Report
This report contains five chapters. Chapter 2 explains alignment methodologies, including general methods used to evaluate alignment of alternate assessments. Subsequent chapters provide alignment results for comparison between the components of the assessment system: (a) Chapter 3 presents results of the alignment comparison between the APs and the LAFS and NGSSS for Social Studies; (b) Chapter 4 presents results on the content review of the FSAA-PT writing prompts and tasks relative to the corresponding LAFS and NGSSS for Social Studies; and (c) Chapter 5 provides recommendations for FDOE to strengthen alignment over time. Appendix A includes examples of rating forms and training materials used in the alignment workshop.
8 Downloadable versions of the Florida Standards and Next Generation Sunshine State Standards Access Points can be found at: http://www.cpalms.org/Downloads.aspx
Chapter 2: Alignment Study Design and Methodology
In this section, we discuss key concepts related to alignment research, followed by a description of the alignment evaluations and methods used as part of the study.
Alignment of Assessments and Standards on Content and Performance
Alignment studies answer one vital question related to the validity of an assessment, “Does the assessment content adequately reflect the content that students are expected to learn as provided in the state standards?” For Florida, the content is found in the APs associated with the LAFS and NGSSS for Social Studies. Assessments must measure only the content specified in the standards, and student scores generated from these assessments should adequately reflect student knowledge of the content standards. The FSAA-PTs were built based on the assessable APs listed in the Florida Standards Alternate Assessment Performance Task: Test Design, Blueprint, and Item Specifications for English Language Arts and Social Studies (FSAA-PT Test Specifications).
In general, alignment evaluations for an assessment reveal the breadth, or scope, of knowledge as well as the depth-of-knowledge, or cognitive processing, expected of students by the state’s content standards. Alignment analyses for alternate assessments help to answer questions such as the following:
How much and what type of content is covered by the FSAA-PT?
Is the content in the alternate assessment or alternate standards sufficiently similar to the expectations of Florida’s content standards?
Are students asked to demonstrate this knowledge at the same level of rigor as expected in the full content standards?
Does the assessment accurately measure student knowledge of content standards?
Is the alternate assessment accessible to all students in the targeted population?
These questions can be grouped into two categories: content alignment and performance alignment. However, all alignment evaluations tie back to the state content standards.
AP and FSAA-PT Overview
The FSAA-PT is available for students with significant cognitive disabilities for whom the general assessment is not suitable, even with accommodations. A student’s knowledge and understanding of the APs in writing, civics, and US history are measured by the set of prompts and item tasks on the FSAA-PT.
The FSAA-PT is administered one-on-one between the student and the student’s teacher or other licensed professional who has worked with the student. The FSAA-PT in civics and US history is designed such that an item set, containing three tasks, measures one AP. The first task is written to the Participatory AP, the second task is written to the Supported AP, and the third task is written to the Independent AP. A student’s teacher can scaffold the first task by reducing the response options if the student does not respond correctly. If scaffolding is used on the first task, then the student does not receive the second and third tasks. Subsequently, the second and third tasks do not allow for scaffolding if the student responds incorrectly.
The exception to the structure of the tasks outlined above is the writing prompt portion of the ELA assessment. To begin, scaffolding is not allowed for the writing prompt section of the ELA assessment. The writing prompt section consists of two different prompts. The first prompt consists of five selected-response questions associated with a passage, and the second prompt consists of a single open-response format prompt associated with another passage. Table 3 shows the grade levels in which each FSAA-PT subject area is administered as well as all the assessments reviewed in this alignment study. The civics and US history EOC assessments are administered to the majority of students at the indicated grade level but not exclusively.
Table 3. Grade/Content Areas Included in Alignment Study
Grade         Writing   Civics EOC   US History EOC
4             X
5             X
6             X
7             X         X
8             X
9             X
10            X
High School                          X
Content Alignment and Accessibility
Alignment methodologies can be used on general and alternate assessments. Several methods of alignment (e.g., Porter, 2002; Webb, 1997, 1999, 2005) are in current use and involve rating a number of different aspects of assessment items relative to the content standards. In particular, alignment studies of alternate assessments often require review of additional aspects of alignment unique to the design of the alternate assessments. These dimensions include the extent to which the alternate benchmarks link to the general content standards and the accessibility of the assessment system to students with a variety of disabilities. Alternate assessments differ from general state assessments in form and structure; thus, an alignment methodology must be responsive to these differences. Approaches outlined in the Links for Academic Learning (LAL) Alignment Method (Flowers, Wakeman, Browder, & Karvonen, 2007) and HumRRO’s alignment methodology (Nemeth, Purl, & Smith, 2016; Smith, Deatz, Wen, & Nemeth, 2014; Smith, Wen, Nemeth, Levinson, & Deatz, 2014) provide an overall model for evaluating alternate assessments. The methodology we used meets or exceeds prior requirements for Federal peer review.
Links for Academic Learning Alignment Method. For this alignment study, HumRRO used the Links for Academic Learning alignment method (referred to in this report as LAL) developed by the National Alternate Assessment Center as a basis to conduct the content alignment reviews and analyze the results (Flowers et al., 2007). The original LAL method includes Webb’s methodology for Criterion 3: Content Coverage. HumRRO adapted the LAL method9 to best fit FDOE’s data analysis needs and substituted the HumRRO alignment methodology for Webb’s methodology in Criterion 3. The criteria are listed below:
9 The full LAL method contains an additional criterion. Criterion 1: Academic evaluates whether the content is academic and includes the major domains/strands of the content area. As alternate assessments have progressed, this criterion is no longer of added value. Thus, we did not ask panelists to rate tasks on this criterion and do not refer to it in the report.
Criterion 1: Age Appropriate – The content is referenced to the student’s assigned grade-level (based on chronological age).
Criterion 2: Standards Fidelity
- 2a: Content Centrality – The target content of the APs maintains fidelity with the content of the original grade-level standards.
- 2b: Performance Centrality – The focus of achievement of the APs maintains fidelity with the specified performance in the grade-level standards.
Criterion 3: Content Coverage – (HumRRO Alignment Method, described in more detail below)
- 3a: Content Representation – Items represent AP content.
- 3b: Category Representation – Items represent content categories.
- 3c: Depth of Knowledge (DOK) Representation – Item DOK represents content APs.
- 3d: Category Reporting – Reporting categories are sufficiently measured.
Criterion 4: Content Differentiation – The level of differentiation of content across grade-levels within a grade span panel group.
Criterion 5: Achievement – The expected achievement provides the students an adequate opportunity to show learning of grade referenced academic content.
Criterion 6: Performance Accuracy – The potential barriers to demonstrating what students know and can do are minimized in the assessment to increase measurement accuracy of student performance.
The LAL method is appropriate for alignment of the APs to the corresponding LAFS, and NGSSS for Social Studies, as well as for alignment of the FSAA-PT to APs. Table 4 shows which of the LAL criteria are appropriate for each evaluation. An IR denotes the criterion data was obtained from individual panelist ratings while a CR denotes the criterion data was obtained through a consensus group rating where panelists collectively determined the response.
Table 4. LAL Criteria for AP and FSAA-PT Alignment Evaluation
Alignment Criterion 1 Criterion 2 Criterion 3
Criterion 4 Criterion 5 Criterion 6 2a 2b 3a 3b 3c 3d
APs to Standards
IR IR IR CR IR
FSAA-PT Items to APs
IR IR IR IR IR CR CR IR
Criterion 3: Content Coverage using HumRRO Alignment Method. HumRRO has used this method in previous alignments of alternate assessments for the Minnesota Department of Education in 2013 and 2014 as well as the Indiana Department of Education in 2015. The method borrows much from the Webb (1997, 1999, 2005) alignment methodology, but diverges in key ways that include:
Instructing reviewers to determine whether they agree with item writers’ link to the standards instead of having reviewers provide independent ratings (allowing comparison of reviewers’ judgements to those of item writers).
Instructing reviewers to assign an overall degree of alignment rating to ascertain if assessments adequately capture the intended content.
Criterion 3a: Tasks Represent Intended Content. This is a basic measure of alignment between APs and tasks. Simply stated, this criterion is a check of the AP, assigned to each task during the item writing process, by a group of independent panelists that were not involved in the item writing process. Using a previously developed rating scale, panelists rated task alignment as (1) not aligned, (2) partially aligned, or (3) strongly aligned. For ratings of 1 or 2, panelists provided an explanation for why the task is poorly aligned or unrepresented within the indicated content and identified another AP to which the task is better aligned, if applicable. We reported the proportion of tasks with each rating. Tasks with ratings of 1 or 2 were identified for scrutiny by FDOE and/or the testing contractor in a confidential, supplemental appendix.
In addition to the ratings, the total number of APs indicated in the test specifications were compared to the task results to verify that a range of APs was being assessed by the test.
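The Criterion 3a tabulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual analysis: the function names, the task identifiers, and the example ratings are hypothetical, and the rule that a task is flagged when any panelist gives it a 1 or 2 is an assumption, since the report does not say how ratings were combined across panelists.

```python
from collections import Counter

# Rating scale from the text: 1 = not aligned, 2 = partially aligned,
# 3 = strongly aligned.
RATING_LABELS = {1: "not aligned", 2: "partially aligned", 3: "strongly aligned"}


def rating_proportions(ratings):
    """Return the share of ratings at each scale point, keyed by label."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    unknown = set(ratings) - set(RATING_LABELS)
    if unknown:
        raise ValueError(f"unexpected rating codes: {sorted(unknown)}")
    counts = Counter(ratings)
    return {label: counts[code] / len(ratings)
            for code, label in RATING_LABELS.items()}


def tasks_to_flag(ratings_by_task):
    """List task IDs with at least one rating of 1 or 2 (assumed flag rule)."""
    return sorted(task for task, ratings in ratings_by_task.items()
                  if any(r in (1, 2) for r in ratings))


if __name__ == "__main__":
    # Hypothetical ratings from four panelists on three tasks.
    example = {
        "task_01": [3, 3, 3, 3],
        "task_02": [3, 2, 3, 3],
        "task_03": [1, 2, 3, 3],
    }
    all_ratings = [r for ratings in example.values() for r in ratings]
    print(rating_proportions(all_ratings))
    print(tasks_to_flag(example))
```

Keeping the proportions and the flagged list as separate functions mirrors the two outputs named in the text: the reported proportion of tasks at each rating and the list of tasks referred for further scrutiny.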
Criterion 3b: Tasks Represent Intended Categories. For this criterion, we compared the expected distribution of tasks by reporting category (e.g., Origin and Purposes of Law and Government; Roles, Rights, and Responsibilities of Citizens; Government Policies and Political Processes; Organization and Function of Government), as presented in the test specifications, to the actual proportion found on each test. We report acceptability in terms of meeting test blueprint requirements.
Criterion 3c: Task DOK Represent Alternate Standards. This measure is a comparison of the DOK ratings assigned by panelists to FSAA-PT tasks and to the APs linked to those tasks (HumRRO Criterion 1). Panelists assigned DOK ratings to the APs as one of their first alignment process steps.
Since the FSAA-PT Test Specifications do not contain ranges for the proportion of tasks at each DOK level, the recommended level of cognitive complexity is taken from Webb’s alignment criteria (2005). For a rating of acceptable, at least 50% of tasks must be rated at the same or higher DOK level as the APs.
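As a rough illustration of the Criterion 3c check, the sketch below computes the proportion of tasks rated at the same or a higher DOK level than their linked AP and compares it with the 50% threshold. The function name, the pair layout, and the example values are hypothetical, and treating the 50% figure as a minimum (at least 50%) is an assumption about how the threshold is applied.

```python
def dok_consistency(task_ap_dok_pairs, threshold=0.5):
    """Return (proportion, acceptable) for a list of (task_dok, ap_dok) pairs.

    proportion is the share of tasks whose DOK is the same as or higher than
    the DOK of the linked AP; acceptable is True when that share meets the
    threshold (assumed here to be a minimum of 50%).
    """
    if not task_ap_dok_pairs:
        raise ValueError("at least one (task_dok, ap_dok) pair is required")
    at_or_above = sum(1 for task_dok, ap_dok in task_ap_dok_pairs
                      if task_dok >= ap_dok)
    proportion = at_or_above / len(task_ap_dok_pairs)
    return proportion, proportion >= threshold


if __name__ == "__main__":
    # Hypothetical DOK levels: (task DOK, linked AP DOK).
    pairs = [(2, 2), (1, 2), (3, 2), (1, 3)]
    proportion, acceptable = dok_consistency(pairs)
    print(f"{proportion:.0%} of tasks at or above AP DOK; acceptable: {acceptable}")
```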
Criterion 3d: Item Sufficiency for Category Reporting. This is a measure of the extent to which reporting categories are sufficiently measured. In contrast to the other criteria, student assessment data is used to inform this criterion. Specifically, we conduct psychometric analyses to determine if the category reporting practices can be supported by evidence of factor structure and reliability estimates rather than simply requiring a minimum number of items per reporting category. Criterion 3d is not included in this alignment study due to the smaller student sample, reduced number of items, and potential variances in assessment administration for the alternate assessment.
Scope of Alignment Evaluations
Two different types of alignment evaluations were performed for this study: (a) the APs linked to the LAFS and NGSSS for Social Studies and (b) the writing prompt portion of the ELA, civics, and US history FSAA-PT tasks linked to the APs in writing, civics, and US history. Both alignment evaluations were conducted using Florida educators and HumRRO staff familiar with alignment studies.
Training
An essential aspect of alignment is training for both HumRRO facilitators and panelists so they are familiar with the methodology. Alignment workshops do not occur weekly nor are all studies exactly the same, so it is important to train even experienced alignment facilitators and panelists for the nuances of each study.
Facilitators attended a 2-hour training session that included a presentation of the Florida assessment system, the alignment process steps, and examples of the rating documents panelists would use. The alignment steps for facilitators were summarized in a Facilitator Instructions document. Facilitators participated in a detailed walk-through of the document and specific procedural and anecdotal guidance that could/should be provided to panelists was highlighted.
Panelists’ training was conducted in two ways at the workshop: (1) alignment familiarization training on Day 1 of the workshop as a full group, and (2) targeted procedural training in their panel groups prior to starting each alignment task. The full group training focused on the Florida assessment system and included information specific to the FSAA-PT requirements, the APs, and recent changes that required the current alignment study. The training also covered the roles of FDOE, Measured Progress, HumRRO, and panelists; the definition of alignment; why alignment is important; the alignment process; cognitive complexity; and the rating forms used in the study. The in-group training focused on specific task processes, rating definitions, and calibration activities to reinforce panelists’ shared understanding.
During the general and targeted training, panelists were reminded that their role was to provide their independent judgements using their expert knowledge.
Panelists
Panelists were recruited by FDOE, Measured Progress, and HumRRO from a database of Florida educators, both general education and Exceptional Student Education (ESE) teachers, provided by FDOE and Measured Progress. Each of the five panels (writing grades 4-5, writing grades 6-8, writing grades 9-10, and civics and US history EOC) consisted of a combination of special education teachers and general education teachers or content specialists; each group had at least one special education teacher and at least one general education teacher. Panelists were assigned to groups based on their experience in the subject area and grade level. Table 5 presents the characteristics of the panelists.
Table 5. Professional and Demographic Characteristics of Panelists
Panel; Experience (Less than 1 year, 1-5 years, 6-15 years, More than 15 years); Gender (Female, Male); Ethnicity (White, Black, Hispanic); Current Position (ESEb Teacher, Gen Ed Teacher)
Writing Gr 4-5 1 2 1 3 1 2 1 1 2 2
Writing Gr 6-8a 1 2 1 4 3 1 3
Writing Gr 9-10a 2 2 3 1 2 2 1 3
Civics 2 2 2 2 4 1 3
US History 1 2 1 3 1 3 1 2 2
Total: 1 6 9 4 15 5 14 4 1 7 13
a One panelist did not provide ethnicity.
b Exceptional Student Education (ESE).
Materials
Panelists received paper copies of the FSAA-PT to review. They were also provided paper and electronic copies of various resource materials such as the APs, presentation rubric, DOK definitions, and Panelist Instructions to support their evaluation. The panelists used electronic rating forms, in Microsoft Excel. Examples of rating forms and panelist instructions are presented in Appendix A.
Test Forms. There were four alternate forms (Form A – Form D) of the FSAA-PT, which included a common set of items and a set of field test items some of which were on multiple forms. Panelists reviewed all the common FSAA-PT tasks in civics and US history and only the two writing prompts associated with the ELA assessment.
Panelist Instructions and Rating Forms. Panelists were given a Panelist Instructions document listing their alignment tasks, as well as rating codes and code definitions (see Appendix A). The rating forms were Excel documents. During the 2-day workshop, panelists completed two tasks individually and the remaining tasks by consensus (see Appendix A).
Procedures
HumRRO conducted the alignment workshop on June 21-22, 2017 in Jacksonville, Florida. The workshop began with a general session to introduce HumRRO staff, review reimbursement logistics, read and sign affidavits of nondisclosure for the secure materials panelists would review, and conduct 30 minutes of general training. In both the general session and in each panel group, panelists were informed that the alignment reviews were independent from FDOE and Measured Progress, the testing vendor.
Following the general session, panelists began working in their panel groups. US history and civics panel groups were located in a separate room free from other groups and distractions. Writing 4-5, 6-8, and 9-10 panel groups were located in one room, since their instructions were similar and could be provided to them at the same time. A HumRRO facilitator was assigned to each of the panel groups, and the HumRRO project director supported the facilitators by answering questions and providing further guidance if needed. The project director also made certain that the different groups retained their shared understanding of the alignment method and tasks. Panelists received detailed training on rating procedures from the facilitator responsible for leading the group through each alignment step, as listed in Table 6.
Table 6. Alignment Steps for Panelists’ Ratings
Step Alignment Step Description
1 LAFS & NGSSS DOK (consensus)
2 LAFS & NGSSS AP DOK (consensus)
3 AP to LAFS/NGSSS alignment
4 AP content differentiation (if applicable) (consensus)
5 FSAA-PT item review
6 FSAA-PT content differentiation (consensus)
7 Whole test (consensus)
8 Student learning review (consensus)
To begin, the facilitator gave a brief introduction and had all panelists introduce themselves, including where they are from and what they teach. The facilitator provided panelists with the Panelist Instructions document (see Appendix A), the writing-specific LAFS or NGSSS for Social Studies specific to the panel group, a Depth of Knowledge reference guide specific to the subject area and provided by Measured Progress (see Appendix A), the APs specific to the panel group, electronic Excel files on individual computers, all assessment materials for each grade/subject being reviewed (Test Booklet, Response Booklet, and Passage Booklets [writing only]), and a Presentation Rubric reference sheet provided by Measured Progress (see Appendix A). A single copy of the Test Administration Manual was available in each panel group.
Throughout the workshop, facilitators offered general suggestions and comments on procedural concerns when appropriate; however, they emphasized that they would not get involved in determining the ratings, since the panelists are valued as the content experts. Before each alignment step was conducted, facilitators trained panelists on the purpose of the step, the rating code definitions, and how to enter data in the appropriate rating form. Before allowing panelists to work independently on certain tasks, facilitators had panelists complete the first two to three ratings as a group to ensure that everyone understood the task and rating code definitions. Additionally, facilitators conducted periodic consistency checks to ensure that panelists continued to understand the task. If ratings varied widely across panelists, the facilitator reviewed the task and rating code definitions and told panelists to alter their ratings only if they felt they had misinterpreted the task or the rating code definitions.
The first alignment step was to assign a DOK level to the writing-specific LAFS or NGSSS for Social Studies that were linked to the corresponding APs. The second step was similar: assigning a DOK level to the APs for each grade/subject listed in the FSAA-PT Test Specifications. Both steps were completed as consensus ratings. For each step, panelists first assigned DOK ratings independently. Panelists then discussed their ratings and determined a consensus DOK rating for each writing-specific LAFS or NGSSS for Social Studies and the corresponding APs. If full group consensus could not be reached, the DOK level agreed upon by the majority of panelists was recorded as the consensus rating.
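The majority fallback described in this paragraph can be illustrated with a short Python sketch. It is an illustration only, not the procedure as HumRRO implemented it: the function name `consensus_dok`, the strict-majority rule, and the choice to return `None` when no level has a majority are all assumptions, since the report does not describe how ties were resolved.

```python
from collections import Counter

def consensus_dok(ratings):
    """Return the DOK level recorded as consensus for one standard or AP.

    ratings: list of DOK levels (ints), one per panelist.
    Hypothetical helper: full agreement or a strict majority yields a
    level; anything else returns None (assumed to need more discussion).
    """
    level, count = Counter(ratings).most_common(1)[0]
    if count > len(ratings) / 2:
        return level
    return None

# Made-up ratings from a four-person panel
print(consensus_dok([2, 2, 2, 2]))  # full agreement
print(consensus_dok([2, 2, 3, 2]))  # three of four agree
print(consensus_dok([1, 2, 2, 1]))  # split panel, no majority
```

In the workshop itself, discussion preceded any majority decision, so a rule like this would apply only after panelists had talked through their independent ratings.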
Next, panelists evaluated the APs on a variety of factors. The first rating was to indicate if the AP content was fully aligned with the linked writing LAFS or NGSSS for Social Studies listed. If not, panelists were asked to explain what content was missing and provide another standard if it was better linked. Additional factors for rating APs included: (a) whether the AP matched the measure of student performance expected in the writing LAFS or NGSSS for Social Studies, (b) whether the AP was appropriate for the chronological age at which it was measured, and (c) whether the content expectation of the AP was accessible to various disability groups. These ratings were made individually; no consensus ratings were obtained.
Panelists then evaluated the APs for differentiation of breadth, depth, prerequisite knowledge, and new knowledge across grades (step 4). This step was applicable only to the panel groups evaluating writing grades 4-5 and writing grades 6-8. Panelists indicated whether they found clear, limited, partial, or no differentiation across the grades they reviewed and provided comments explaining their reasoning, with supporting evidence. This task was completed as a consensus rating among panelists.
For step 5, panelists evaluated the FSAA-PT tasks on several factors similar to those in the AP review. FSAA-PT tasks were linked to APs during the item development process, and reviewers were asked to rate how well the task content was aligned (not, partially, fully) with the assigned AP. If they indicated that the task was partially aligned or not aligned, panelists were asked to describe their reasoning and provide another AP they felt was better linked with the task. Panelists continued the FSAA-PT task review by (a) verifying the complexity levels (DOK, Volume of Information [VI], Vocabulary [V], and Context [C]) assigned to the task, (b) judging whether the task measured student performance of the AP, (c) judging whether the task was appropriate for the chronological age at which it was measured, and (d) judging whether the task could be modified or supported without changing its meaning or difficulty.
Steps 6, 7, and 8 were completed as consensus ratings. In step 6, content differentiation was evaluated using the same dimensions and rating levels as the AP review in step 4. However, the civics and US history panel groups were asked to complete this step with a focus on the progression of complexity levels from task 1 to task 3 within an item set. For writing, since there was no progression in difficulty from task 1 to task 5 in prompt 1, the panelists were asked to evaluate content differentiation between grades. Step 7 provided a ‘Whole Test’ rating in which panelists were asked to determine whether, overall, barriers existed for some students (e.g., students who are blind or deaf) to demonstrate learning on the FSAA-PT. Lastly, in step 8, panelists evaluated student learning by rating the level of inference that can be made about students based on the score they receive, or whether the score may instead reflect the teachers or the assessment program. As with all the alignment steps, panelists were encouraged to provide comments if they rated a task low on any dimension.
Workshop Progress
On the first day, panelists provided consensus DOK ratings for all the writing LAFS and NGSSS for Social Studies, as well as the corresponding APs. All APs were reviewed and rated for step 3. On day 2, panelists completed the remaining steps (4-8). Table 7 shows the steps that were completed by each panel group.
Table 7. Alignment Steps Completed by Each Panel Group June 21-22, 2017
Panel Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Step 7 Step 8
Writing Gr 4-5
Writing Gr 6-8
Writing Gr 9-10 NA
Civics NA
US History NA
Chapter 3: Alignment of Access Points to Standards
Overview of Access Points
The first challenge in evaluating the alignment of any alternate assessment to traditional standards is to define what the alternate assessment purposefully measures, versus what is intentionally omitted from the assessment. The FSAA-PT is designed from a test blueprint specifying the standards that items should measure. Items are written to address these blueprint standards, and the blueprint standards are a subset of APs. In this alignment study, panelists evaluated the APs associated with the writing LAFS and NGSSS for Social Studies in the FSAA-PT Test Specifications.
The assessable APs listed in the FSAA-PT Test Specifications are a subset of the grade-level APs, as indicated in Tables 8 and 9 below. Roughly 21 – 44% of the available writing APs are eligible for use in the writing assessments, while 58% and 30% are represented on civics and US history, respectively. The assessable APs are typically selected to represent the most important or key aspects of the content, to be accessible to the widest possible group of students, and to provide the most actionable test results for alternate assessment students and educators.
Table 8. Number of Blueprint Standards Compared to APs for Writing
Grade Number of Assessable APs Total Number of Writing APs Percent of APs Represented
Grade 4 7 34 20.59%
Grade 5 15 38 39.47%
Grade 6 17 41 41.46%
Grade 7 9 43 20.93%
Grade 8 18 43 41.86%
Grade 9 20 45 44.44%
Grade 10 17 45 37.78%
Table 9. Number of Blueprint Standards Compared to APs for Social Studies
Grade Number of Assessable APs Total Number of APs Percent of APs Represented
Civics 23 40 57.50%
US History EOC 25 82 30.49%
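The percentages in Tables 8 and 9 follow directly from the two count columns. As a minimal sketch, the Python below recomputes them from the reported counts; the dictionary layout and the function name `percent_represented` are illustrative choices, not part of the study's materials.

```python
# (assessable APs, total APs) as reported in Tables 8 and 9
ap_counts = {
    "Writing Grade 4": (7, 34),
    "Writing Grade 5": (15, 38),
    "Writing Grade 6": (17, 41),
    "Writing Grade 7": (9, 43),
    "Writing Grade 8": (18, 43),
    "Writing Grade 9": (20, 45),
    "Writing Grade 10": (17, 45),
    "Civics": (23, 40),
    "US History EOC": (25, 82),
}

def percent_represented(assessable, total):
    """Share of available APs that are assessable, as a percentage."""
    return round(100 * assessable / total, 2)

for label, (assessable, total) in ap_counts.items():
    print(f"{label}: {percent_represented(assessable, total):.2f}%")
```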
LAL Criteria
For the alignment of APs to Standards, four of the six LAL criteria are suitable: age appropriateness, standards fidelity, content differentiation, and performance accuracy. The remainder of this chapter will highlight the results of these four LAL criteria.
Criterion 1: Age Appropriateness
Criterion 1 pertains to the developmental level of the content included in the APs. For this evaluation, panelists were asked to individually determine whether the content of the APs is appropriate for the age and grade-level indicated. Several response options were possible:
Adapted = Linked to grade-level content
Neutral = Content is not age-bound and is appropriate at any age
Inappropriate = Content is off-grade level
For this criterion, at least 90% of the APs should be rated as ‘adapted’ or ‘neutral’.10 As seen in Tables 10 and 11, 100% of the APs were rated as ‘adapted’ or ‘neutral’ for all subjects and grade levels.
Table 10. Percent of Writing APs Rated as Age Appropriate
Grade N Raters N APs % Inappropriate % Neutral % Adapted
Grade 4 4 7 0.00 75.00 25.00
Grade 5 4 13 0.00 75.00 25.00
Grade 6 4 17 0.00 100.00 0.00
Grade 7 4 9 0.00 100.00 0.00
Grade 8 4 18 0.00 100.00 0.00
Grade 9 4 20 0.00 0.00 100.00
Grade 10 4 17 0.00 0.00 100.00
Table 11. Percent of Social Studies APs Rated as Age Appropriate
Grade N Raters N APs % Inappropriate % Neutral % Adapted
Civics 4 23 0.00 0.00 100.00
US History 4 25 0.00 44.00 56.00
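The acceptability rule for this criterion (at least 90% of ratings ‘adapted’ or ‘neutral’) can be checked mechanically against Tables 10 and 11. The sketch below is illustrative only; the names `THRESHOLD`, `age_ratings`, and `meets_criterion` are hypothetical, and the 90% minimum is the level set by HumRRO rather than by the LAL method.

```python
THRESHOLD = 90.0  # minimum set by HumRRO, not by the LAL method

# (% inappropriate, % neutral, % adapted) as reported in Tables 10 and 11
age_ratings = {
    "Writing Grade 4": (0.00, 75.00, 25.00),
    "Writing Grade 5": (0.00, 75.00, 25.00),
    "Writing Grade 6": (0.00, 100.00, 0.00),
    "Writing Grade 7": (0.00, 100.00, 0.00),
    "Writing Grade 8": (0.00, 100.00, 0.00),
    "Writing Grade 9": (0.00, 0.00, 100.00),
    "Writing Grade 10": (0.00, 0.00, 100.00),
    "Civics": (0.00, 0.00, 100.00),
    "US History": (0.00, 44.00, 56.00),
}

def meets_criterion(neutral, adapted, threshold=THRESHOLD):
    """True when neutral plus adapted ratings reach the minimum level."""
    return neutral + adapted >= threshold

results = {
    label: meets_criterion(neutral, adapted)
    for label, (_, neutral, adapted) in age_ratings.items()
}
print(results)
```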
Criterion 2a: Content Centrality
Panelists were asked to indicate, individually, whether the AP content fully linked to the writing LAFS or NGSSS for Social Studies associated with the AP. For every AP not fully linked to the content standard, panelists provided an explanation of the content missing from the standard and identified an alternate content standard, if applicable. For this criterion, at least 90% of the APs should be rated as ‘yes, the AP content fully links to the writing LAFS or NGSSS for Social Studies’.11
Tables 12 and 13 show the relationship between the APs and the writing LAFS and NGSSS for Social Studies. For writing across all grades, 100% of the blueprint APs contained content fully linked to the writing LAFS. In civics and US history, approximately 99% of the blueprint APs contained content fully linked to the NGSSS for Social Studies.
10 The LAL method does not specify a minimum for Criterion 1. This minimum level was established by HumRRO.
11 The LAL method does not specify a minimum for Criterion 2a. This minimum level was established by HumRRO.
Table 12. Percent of Writing APs Linked to On-Grade Level Writing LAFS
Grade N Raters N APs % Yes % No
Grade 4 4 7 100.00 0.00
Grade 5 4 13 100.00 0.00
Grade 6 4 17 100.00 0.00
Grade 7 4 9 100.00 0.00
Grade 8 4 18 100.00 0.00
Grade 9 4 20 100.00 0.00
Grade 10 4 17 100.00 0.00
Table 13. Percent of Social Studies APs Linked to On-Grade Level NGSSS for Social Studies
Grade N Raters N APs % Yes % No
Civics 4 23 98.91 1.09
US History 4 25 99.33 0.67
Criterion 2b: Performance Centrality
The APs should link to the writing LAFS and NGSSS for Social Studies in performance expectations as well as content, although the depth of these expectations can be reduced for the alternate assessment. Several analyses were conducted to compare the performance levels specified in the APs to the writing LAFS and NGSSS for Social Studies. One analysis focused on the depth of knowledge (DOK) ratings. Panelists worked together to achieve consensus DOK ratings on the APs and the writing LAFS and NGSSS for Social Studies separately. These ratings were analyzed for comparability.
We compared the DOK ratings of the APs from the FSAA-PT Test Specifications to the ratings given to the corresponding writing LAFS and NGSSS for Social Studies. Tables 14 and 15 present the percentage of APs per grade-level/subject rated as expecting performance at the same, a lower, or a higher level than the writing LAFS and NGSSS for Social Studies. Although there is no minimum level of acceptable overlap in DOK established by the LAL criteria, there is an assumption that APs should be skewed to lower cognitive complexity than the state standards (Flowers et al., 2007). It may be reasonable, then, to expect that as many as half of the APs would require students to demonstrate performance at a lower level than the state standards. On the other hand, it would be problematic to find several APs with performance expectations at a higher level than the writing LAFS and NGSSS for Social Studies.
Across all content areas, at least 84% of the APs were given a DOK level at the same or lower level than the corresponding writing LAFS and NGSSS for Social Studies. In civics, 16% of APs were assigned higher levels of cognitive complexity than the corresponding NGSSS. Some of the writing APs were also assigned a higher level of complexity than the state standards, most notably in grade 7, where 11% of the writing APs were assigned higher levels of cognitive complexity than the corresponding writing standards.
Table 14. Percent of Writing APs at Lower, Same, or Higher Levels of Complexity Compared to Related Writing Standards
Grade N APs % Lower % Same % Higher % Same or Lower
Grade 4 7 57.14 42.86 0.00 100.00
Grade 5 15 46.67 53.33 0.00 100.00
Grade 6 17 76.47 17.65 5.88 94.12
Grade 7 9 77.78 11.11 11.11 88.89
Grade 8 18 88.89 5.56 5.56 94.45
Grade 9 20 75.00 20.00 5.00 100.00
Grade 10 17 75.00 25.00 0.00 100.00
Table 15. Percent of Social Studies APs at Lower, Same, or Higher Levels of Complexity Compared to Related NGSSS for Social Studies
Grade N APs % Lower % Same % Higher % Same or Lower
Civics 23 50.72 33.33 15.94 84.05
US History 25 82.67 16.00 1.33 98.67
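The lower/same/higher breakdown in Tables 14 and 15 amounts to comparing two consensus DOK levels per AP and tallying the outcomes. The following sketch shows one way to do that in Python; the function name `compare_dok` and the example DOK pairs are hypothetical (the pairs were made up to match the Grade 4 percentages, and the actual ratings are not reproduced in this section).

```python
from collections import Counter

def compare_dok(pairs):
    """Summarize AP DOK relative to the linked standard's DOK.

    pairs: list of (ap_dok, standard_dok) consensus ratings.
    Returns percentages for lower, same, higher, and same_or_lower.
    """
    counts = Counter()
    for ap_dok, standard_dok in pairs:
        if ap_dok < standard_dok:
            counts["lower"] += 1
        elif ap_dok == standard_dok:
            counts["same"] += 1
        else:
            counts["higher"] += 1
    n = len(pairs)
    pct = {key: 100 * counts[key] / n for key in ("lower", "same", "higher")}
    # Computed from counts, not by adding rounded percentages
    pct["same_or_lower"] = 100 * (counts["lower"] + counts["same"]) / n
    return pct

# Hypothetical data: 7 APs, 4 rated below the standard and 3 at the same level
example_pairs = [(1, 2)] * 4 + [(2, 2)] * 3
summary = compare_dok(example_pairs)
print({key: round(value, 2) for key, value in summary.items()})
```

Deriving the combined "same or lower" figure from the underlying counts avoids the small discrepancies that can arise when two already-rounded percentages are added together.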
We also asked panelists to directly compare the written performance expectations in the APs with the associated writing LAFS and NGSSS for Social Studies. Panelists, individually, evaluated the language of each AP to decide whether the expectations are the same, partially similar, or differ entirely from what is expected in the corresponding writing LAFS and NGSSS for Social Studies. For example, if an NGSSS for Social Studies expects students to ‘identify and explain’, while the AP asks students to ‘identify’ only, these expectations are rated as partially similar. When students are asked to ‘distinguish between’ in the writing LAFS, but the AP requires students to ‘recognize’, then the expectation for demonstrating knowledge is different. Tables 16 and 17 show the results of this comparison. At least 90% of the APs should be rated as ‘Some’ or ‘All’ compared with the state standards.
In all grades and subjects, panelists rated 99% or more of the APs as having some or all of the same performance expectations as the corresponding writing LAFS and NGSSS for Social Studies.
Table 16. Percent of Linked APs at Various Levels of Performance Centrality – Writing
Grade N Raters N APs % None % Some % All
Grade 4 4 7 0.00 0.00 100.00
Grade 5 4 13 0.00 0.00 100.00
Grade 6 4 17 0.00 79.41 20.59
Grade 7 4 9 0.00 91.67 8.33
Grade 8 4 18 0.00 83.33 16.67
Grade 9 4 20 0.00 0.00 100.00
Grade 10 4 17 0.00 0.00 100.00
Table 17. Percent of Linked APs at Various Levels of Performance Centrality – Social Studies
Grade N Raters N APs % None % Some % All
Civics 4 23 1.09 94.57 4.35
US History EOC 4 25 1.00 57.67 41.33
Criterion 3: Content Coverage – HumRRO Alignment Method
Since the content coverage criterion focuses on the relationship between items and APs regarding content, category, and DOK representation, Criterion 3: Content Coverage is not applicable to the AP to Standards evaluation.
Criterion 4: Content Differentiation
This criterion focuses on whether the content expectations change appropriately between grade-levels within a grade span panel group. For this reason, the evaluation of content differentiation involves a comparison between grade-level content expectations. Panelists in the writing grade 4-5 and grade 6-8 panel groups were asked to review the APs of the grades they were evaluating, and determine a consensus rating of the extent to which higher grade-levels evidenced broader, deeper, and newer knowledge, as well as growth on prerequisite skills (see Appendix A for a more detailed explanation of the categories). For each category in Table 18, panelists came to a consensus as to whether the content differentiation of the APs between grades was clear, partial, limited, or there was none. According to the LAL method, content expectations should show evidence of at least partial differences in content between grades on the dimensions of Broader, Deeper, Prerequisite, and New. After panelists evaluated the four categories, they were asked to give an overall yes/no rating of whether the content expectations between grades were identical. A rating of ‘yes’ (they are identical) would suggest there are generally no increases or changes in the expectations between grade-levels. Thus, a rating of ‘No’ would be preferable.
As Table 18 shows, the degree of content differentiation varies across dimensions and grade-levels. The LAL method suggests that any rating indicating that differentiation exists (clear, partial, or limited) is acceptable for each category. Because the standards are identical for writing grades 9 and 10, and civics and US history are EOC assessments, this evaluation step was completed for writing grades 4-5 and 6-8 only. For writing grades 4-5, panelists found content differentiation to be limited in all areas (breadth, depth, prerequisite, new learning), and consequently rated the APs to be identical between the grades. For writing grades 6-8, the panelists found no differentiation in breadth between any of the three grades, limited differentiation in new learning between grades 7 and 8, and partial differentiation in depth and prerequisite. They concluded that AP 2.4 is identical across the grades.
Table 18. Consensus AP Content Differentiation – Writing
Grades Reviewed Category Rating Rating Support
4 – 5
Broader L There is limited increase in breadth between grades 4 and 5. We have taken the same concept and added another component to it. This applies to one access point (AP LAFS5.W.1.AP.2b and AP LAFS4.W.1.AP.2b) but not to the majority of the access points. Organizing is the deeper concept. The addition of entertainment in AP LAFS5.W.1.AP.4b is abstract - moving away from purely concrete ideas.
Deeper L There is an increase in complexity between grades 4 and 5. We have taken the same concept and added another component to it. This applies to one access point (AP LAFS5.W.1.AP.2b and AP LAFS4.W.1.AP.2b) but not to the majority of the access points. Organizing is the deeper concept. The addition of entertainment in AP LAFS5.W.1.AP.5b is abstract - moving away from purely concrete ideas.
Prerequisite L They are not prerequisite skills because they are the same skills from grade 4 APs to grade 5.
New L There are new skills and strategies mentioned at grade 5 that are not in grade 4 (AP LAFS5.W.1.AP.2b and AP LAFS4.W.1.AP.2b). Otherwise, the access points are nearly identical.
Identical Y The standards are identical, with the exception of added complexity between AP LAFS5.W.1.AP.2b and AP LAFS4.W.1.AP.2b, and the added complexity of entertainment in LAFS5.W.1.AP.4b.
6 – 8
Broader N For standard 2.4 in grade 6, 7, & 8 there is no differentiation - they are identical; other APs only get deeper and not broader - additional text types are not added
Deeper P in 1.1 the claims and counter claims become more specific across the grades; in 1.2 transitions are deeper because there is mastery of transitions as the standards move up the grades
Prerequisite P 1.1 and 1.2 build on each other as move up in grade level
New L there is an added task from 6th - 7th grade, "identify claims" to "identify and acknowledge claims" in 1.1; this doesn't occur between 7th and 8th grades
Identical Y 2.4 is identical across grades 6 - 8
Criterion 5: Achievement
Criterion 5: Achievement focuses on the degree to which the assessment provides evidence of a student’s ability to demonstrate what they know and can do on grade referenced academic content. Thus, this criterion is not applicable to the evaluation of the AP to Standards relationship.
Criterion 6: Performance Accuracy
Panelists, individually, evaluated whether students could reasonably demonstrate the content and performance expected in the APs. In general, for alternate standards and assessments, it is expected that teachers and test administrators can modify the content to instruct and assess students at the appropriate level based on their Individual Education Plans (IEPs). Panelists rated the general accessibility to students based on various types of disabilities. For example, can students with visual impairments, an inability to follow instructions, or need for assistive technology demonstrate the knowledge expected by the APs? Panelists provided a simple ‘yes’ (accessible to all) or ‘no’ (not accessible to some groups) response to indicate their judgments. Tables 19 and 20 include the percent of APs judged as accessible to all groups. At least 90% of the APs should be rated as ‘Yes.’
Across all grades and subjects, panelists rated nearly 100% of the APs as accessible by a wide range of students with different physical and cognitive disabilities.
Table 19. Percent of APs Rated as Accessible to Different Disability Groups – Writing
Grade N Raters N APs % Yes % No
Grade 4 4 7 100.00 0.00
Grade 5 4 13 100.00 0.00
Grade 6 4 17 100.00 0.00
Grade 7 4 9 100.00 0.00
Grade 8 4 18 100.00 0.00
Grade 9 4 20 100.00 0.00
Grade 10 4 17 100.00 0.00
Table 20. Percent of APs Rated as Accessible to Different Disability Groups – Social Studies
Grade N Raters N APs % Yes % No
Civics 4 23 99.64 0.36
US History 4 25 100.00 0.00
Chapter 4: Alignment of FSAA-PT Tasks to APs
In this chapter, we report the results of panelists’ ratings of the FSAA-PT tasks in the writing prompt portion of the ELA assessment at each grade, as well as in the civics and US history End of Course (EOC) assessments. We present the results for LAL Criteria 1 through 6. In general, and unless otherwise specified, at least 90% of FSAA-PT tasks must achieve acceptable ratings to demonstrate linkage to grade-level content for each LAL criterion.
As a reminder, the FSAA-PT for civics and US history consists of 16 item sets, containing three tasks each and purportedly measuring one AP. The first task is written to the Participatory AP, the second task is written to the Supported AP, and the third task is written to the Independent AP. The student’s teacher has the ability to scaffold the first task by reducing the response options if the student does not respond correctly. The second and third tasks do not allow for scaffolding if the student responds incorrectly. The writing section of the ELA assessment consists of two different prompts. The first prompt consists of five selected-response questions associated with a passage, and the second prompt consists of a single open-response format prompt associated with another passage. Scaffolding is not allowed in the writing assessment. Unless otherwise stated, the results presented are across all tasks regardless of the item set or prompt.
Throughout this chapter, the column ‘N Raters’ will denote the total number of panelists used in the analyses, while the ‘N Tasks’ column shows the range of tasks that panelists evaluated. If a panelist was not able to review a task or skipped a task, the total number of tasks, for a particular panelist, equals the number of tasks actually evaluated by the panelist and not all of the tasks.
LAL Criteria
Criterion 1: Age Appropriateness
Panelists, individually, evaluated the FSAA-PT tasks on whether the content and task assessed students at an appropriate level linked to their assigned grade. Tables 21 and 22 display the percentage of tasks judged as adapted (linked on-grade level), inappropriate (off-grade), and neutral (not age-bound). For acceptable linkage, at least 90% of tasks must be judged ‘adapted’ or ‘neutral.’ In this case, all of the FSAA-PT tasks across subjects and grades were rated by panelists as being either adapted or neutral.
Table 21. Percent of Writing Tasks Rated as Age Appropriate
Grade N Raters N Tasks % Inappropriate % Neutral % Adapted
Grade 4 4 6 0.00 29.17 70.83
Grade 5 4 6 0.00 25.00 75.00
Grade 6 4 6 0.00 100.00 0.00
Grade 7 4 6 0.00 100.00 0.00
Grade 8 4 6 0.00 100.00 0.00
Grade 9 4 6 0.00 0.00 100.00
Grade 10 4 6 0.00 0.00 100.00
Table 22. Percent of Social Studies Tasks Rated as Age Appropriate
Grade N Raters N Tasks % Inappropriate % Neutral % Adapted
Civics 4 48 0.00 0.00 100.00
US History 4 48 0.00 52.60 47.40
Criterion 2a: Content Centrality
Since panelists were provided the AP linked to the FSAA-PT task, a content centrality rating was not made. Instead, the task content match to the assigned AP was evaluated as part of criterion 3 below.
Criterion 2b: Performance Centrality
In addition to the targeted content, the FSAA-PT tasks should retain the performance intended by the APs to some extent. For example, if the AP requires students to ‘compare and contrast’ content, the task should require students to make some type of distinction. Tables 23 and 24 show the percentage of tasks rated, individually by panelists, as retaining all (same performance), some, or none of the performance expectations of the corresponding APs. At least 90% of tasks should receive ratings of ‘some’ or ‘all.’
For the majority of grades and subjects, panelists rated the number of FSAA-PT tasks as surpassing the 90% minimum level of acceptability for performance centrality. For civics and US history, panelists rated all tasks as measuring the same performance level of the AP. However, panelists rated 12.5% of grade 6 and 7 writing tasks as not having the same performance expectation as the corresponding AP. Panelists in the writing group stated that the tasks did not require students to perform to the full extent of the associated AP.
Table 23. Percent of Writing Tasks at Various Levels of Performance Centrality
Grade N Raters N Tasks % None % Some % All % All or Some
Grade 4 4 6 0.00 8.33 91.67 100.00
Grade 5 4 6 4.17 4.17 91.67 95.84
Grade 6 4 6 12.50 20.83 66.67 87.50
Grade 7 4 6 12.50 16.67 70.83 87.50
Grade 8 4 6 4.17 20.83 75.00 95.83
Grade 9 4 6 0.00 0.00 100.00 100.00
Grade 10 4 6 0.00 0.00 100.00 100.00
Table 24. Percent of Social Studies Tasks at Various Levels of Performance Centrality
Grade N Raters N Tasks % None % Some % All % All or Some
Civics 4 48 0.00 0.00 100.00 100.00
US History 4 48 0.00 0.00 100.00 100.00
Criterion 3a: Tasks Represent Intended Content
Panelists did not identify an AP for each FSAA-PT task. Instead, panelists, individually, verified that the AP assigned to the task by item writers was an accurate match. Panelists gave each task and matching AP a rating of (1) not aligned, (2) partially aligned, or (3) fully aligned. The cross-tabulation of the ratings is presented below in Tables 25 and 26. To estimate the approximate number of tasks assigned a specific rating, we divided by the number of panelists who provided ratings at the task level. At least 90% of tasks should receive ratings of ‘partially’ or ‘fully’ aligned.
More than 90% of tasks were rated as either partially or fully aligned to the indicated AP. In general, these results indicate that the FSAA-PT tasks are assessing the intended APs.
Table 25. Writing Task Alignment Ratings
Grade N Raters N Tasks % Not Aligned % Partially Aligned % Fully Aligned % Fully or Partially Aligned
Grade 4 4 6 0.00 4.17 95.83 100.00
Grade 5 4 6 4.17 0.00 95.83 95.83
Grade 6 4 6 0.00 16.67 83.33 100.00
Grade 7 4 6 0.00 16.67 83.33 100.00
Grade 8 4 6 0.00 16.67 83.33 100.00
Grade 9 4 6 0.00 0.00 100.00 100.00
Grade 10 4 6 0.00 0.00 100.00 100.00
Table 26. Social Studies Task Alignment Ratings
Grade N Raters N Tasks % Not Aligned % Partially Aligned % Fully Aligned % Fully or Partially Aligned
Civics 4 48 2.08 0.00 97.92 97.92
US History 4 48 1.04 0.52 98.44 98.96
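The percentages in Tables 25 and 26 are consistent with pooling every panelist-by-task rating (for example, 4 panelists x 6 writing tasks = 24 ratings) and, as described above, dividing rating counts by the number of panelists to approximate a number of tasks. The Python sketch below illustrates that arithmetic under those assumptions; the function name `rating_summary` and the pooled rating list are hypothetical, with the list constructed only to match the Grade 4 writing row.

```python
from collections import Counter

def rating_summary(ratings, n_raters):
    """Summarize pooled alignment ratings.

    ratings: flat list of labels, one per (panelist, task) rating.
    n_raters: number of panelists who rated at the task level.
    """
    counts = Counter(ratings)
    total = len(ratings)
    return {
        label: {
            "percent": round(100 * count / total, 2),
            # Approximate number of tasks receiving this rating
            "approx_tasks": count / n_raters,
        }
        for label, count in counts.items()
    }

# Hypothetical pooled ratings: 4 panelists x 6 tasks = 24 ratings
ratings = ["fully"] * 23 + ["partially"] * 1
summary = rating_summary(ratings, n_raters=4)
print(summary)
```

Read this way, a single ‘partially aligned’ rating from one of four panelists accounts for the 4.17% shown for Grade 4, which is roughly a quarter of one task.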
Criterion 3b: Tasks Represent Intended Categories
To address this criterion, we examined the distribution of FSAA-PT item sets aligned by AP and compared it to the target stated in the FSAA-PT Test Specifications.
Data for this analysis was provided through individual panelists’ evaluation of the task to AP alignment. Each AP is associated with a content category; thus, when panelists agreed with the AP paired with a task, or they proposed an alternate AP that better assessed the task, they also aligned tasks to content categories.
Table 27 below shows that the mean number of aligned writing tasks (partially or fully aligned rating) across panelists matched the target criterion stated in the FSAA-PT Test Specifications for the content category in all writing grades. Even though Table 25 shows that 4% of the tasks in writing grade 5 were rated as not aligned, the alternate AP that panelists assigned still placed the task in the same content category as the originally assigned AP.
Table 27. Mean Number of Aligned Writing Items by Content Category
Grade Reporting Category Genre Criterion (N of Tasks) Mean SD
Grade 4 Text-based Writing Informative 6 6.00 0.00
Grade 5 Text-based Writing Informative 6 6.00 0.00
Grade 6 Text-based Writing Informative 6 6.00 0.00
Grade 7 Text-based Writing Informative 6 6.00 0.00
Grade 8 Text-based Writing Informative 6 6.00 0.00
Grade 9 Text-based Writing Informative 6 6.00 0.00
Grade 10 Text-based Writing Informative 6 6.00 0.00
Table 28 shows that the mean number of aligned item sets (partially or fully aligned rating) across panelists generally matched the criterion for each content category in civics and US history. In civics, panelists’ assignment of alternate APs moved item sets from the reporting category “Origin and Purposes of Law and Government” (mean of 9 against a criterion of 12) to the reporting category “Roles, Rights, and Responsibilities of Citizens” (mean of 15 against a criterion of 12). In US history, there was a slight variation in the reporting category “Late Nineteenth and Early Twentieth Century, 1860-1910” (mean of 14.75 against a criterion of 15). Overall, though, the item sets generally matched the target criterion in the FSAA-PT Test Specifications.
Table 28. Mean Number of Aligned Social Studies Items by Content Category
Grade / Reporting Category  Criterion (N of Item Sets)  Mean  SD
Civics
Origin and Purposes of Law and Government  12  9.00  0.00
Roles, Rights, and Responsibilities of Citizens  12  15.00  0.00
Government Policies and Political Processes  12  12.00  0.00
Organization and Function of Government  12  12.00  0.00
Total Mean Number of Linked Item Sets  -  48.00  0.00
US History
Late Nineteenth and Early Twentieth Century, 1860-1910  15  14.75  0.50
Global Military, Political, and Economic Challenges, 1890-1940  18  18.00  0.00
The United States and the Defense of the International Peace, 1940-present  12  12.00  0.00
Introduced in all Reporting Categories  3  3.00  0.00
Total Mean Number of Linked Item Sets  -  47.75  0.50
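The comparison behind Table 28 can be sketched in a few lines of Python. The criterion and mean values below are copied from Table 28; the function and variable names are illustrative only and are not part of the study's materials.

```python
# (criterion N of item sets, panelist mean), copied from Table 28.
TABLE_28 = {
    "Civics": {
        "Origin and Purposes of Law and Government": (12, 9.00),
        "Roles, Rights, and Responsibilities of Citizens": (12, 15.00),
        "Government Policies and Political Processes": (12, 12.00),
        "Organization and Function of Government": (12, 12.00),
    },
    "US History": {
        "Late Nineteenth and Early Twentieth Century, 1860-1910": (15, 14.75),
        "Global Military, Political, and Economic Challenges, 1890-1940": (18, 18.00),
        "The United States and the Defense of the International Peace, 1940-present": (12, 12.00),
        "Introduced in all Reporting Categories": (3, 3.00),
    },
}


def deviations_from_criterion(categories):
    """Return {category: mean - criterion} for categories that are off target."""
    return {
        name: round(mean - criterion, 2)
        for name, (criterion, mean) in categories.items()
        if mean != criterion
    }


for subject, categories in TABLE_28.items():
    print(subject, deviations_from_criterion(categories))
```

In civics the two deviations offset each other (3 item sets below target in one category, 3 above in another), which is consistent with the unchanged total of 48 linked item sets.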
Criterion 3c: Task DOK Represent Alternate Standards
The tasks on each assessment should reflect the range of cognitive complexity in the APs, as interpreted by the state. Since the FSAA-PT Test Specifications do not indicate an intended DOK target, this criterion will be assessed by evaluating the assigned DOK of a task, evaluating the distribution of DOK levels, and comparing the DOK level of the aligned tasks and APs.
Data for these analyses was provided through panelists’ DOK evaluations. Panelists used the following DOK levels while evaluating the tasks (see Appendix A for the complete LAL DOK level descriptions).
DOK 1 = Attention
DOK 2 = Memorize/recall
DOK 3 = Performance
DOK 4 = Comprehension
DOK 5 = Application
DOK 6 = Analysis, Synthesis, Evaluation
As panelists reviewed FSAA-PT tasks, they, individually, determined whether the DOK level assigned to each task during the item writing process matched the task, or whether it was too low or too high. Tables 29 and 30 summarize the percent of tasks (across panelists) assigned DOK levels lower, the same, or higher than the DOK level assigned to the task. It is reasonable to expect panelists to agree with 90% of the DOK levels assigned to tasks.
Most grades and subjects met the expectation of 90% agreement between panelists and assigned DOK levels. However, writing grades 4 and 9 did not meet this expectation. Panelists were able to confirm the cognitive complexity of only 80% of the tasks in grade 4 and only 65% in grade 9. For all subjects, panelists reported that the lower DOK level resulted mainly from the task requiring a simple inference or recall and not a further extension of drawing a conclusion. Note that while there were 2 prompts for writing in all grades (prompt 1 includes 5 selected-response tasks and prompt 2 is an open-response task), prompt 2 was not assigned a DOK by the prompt writer; therefore, for prompt 2 we could not compare the DOK levels assigned by the prompt writers and the panelists. Only prompt 1 (the 5 selected-response tasks) is included in these comparisons for writing. When the panelists rated the task DOK lower, they usually cited lack of inference in the task as the reason.
Table 29. Percent of Writing Tasks at Lower, Same, or Higher Levels of Complexity
                             % of Linked Tasks with
Grade     N Raters  N Tasks  Lower Complexity  Same Complexity  Higher Complexity
Grade 4   4         5        10.00             80.00            10.00
Grade 5   4         5        5.00              90.00            5.00
Grade 6   4         5        0.00              90.00            10.00
Grade 7   4         5        0.00              95.00            5.00
Grade 8   4         5        0.00              100.00           0.00
Grade 9   4         5        25.00             65.00            10.00
Grade 10  4         5        0.00              100.00           0.00
Table 30. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Complexity
                               % of Linked Tasks with
Grade       N Raters  N Tasks  Lower Complexity  Same Complexity  Higher Complexity
Civics      4         48       1.56              98.44            0.00
US History  4         48       1.56              96.88            1.56
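The 90% agreement check applied to Tables 29 and 30 can be illustrated as follows. This is a minimal sketch that assumes each percentage is pooled over all rater-by-task ratings (for example, 4 raters x 5 tasks = 20 ratings for a writing grade); the report describes the figures as percentages across panelists but does not spell out the pooling, so that reading is an assumption. All names are illustrative.

```python
AGREEMENT_THRESHOLD = 90.0  # expectation stated in the report

# (N raters, N tasks, % of ratings at the same DOK level), from Tables 29 and 30.
ROWS = {
    "Writing Grade 4": (4, 5, 80.00),
    "Writing Grade 5": (4, 5, 90.00),
    "Writing Grade 6": (4, 5, 90.00),
    "Writing Grade 7": (4, 5, 95.00),
    "Writing Grade 8": (4, 5, 100.00),
    "Writing Grade 9": (4, 5, 65.00),
    "Writing Grade 10": (4, 5, 100.00),
    "Civics": (4, 48, 98.44),
    "US History": (4, 48, 96.88),
}


def agreeing_ratings(n_raters, n_tasks, percent_same):
    """Convert a percentage back into a count of individual ratings (assumes pooling)."""
    return round(n_raters * n_tasks * percent_same / 100)


def below_threshold(rows, threshold=AGREEMENT_THRESHOLD):
    """Names of assessments whose agreement falls short of the threshold."""
    return [name for name, (_, _, pct) in rows.items() if pct < threshold]


print(below_threshold(ROWS))
print(agreeing_ratings(*ROWS["Writing Grade 4"]), "of", 4 * 5, "ratings agreed in grade 4")
```

Under the pooling assumption, a single dissenting rating in a writing grade moves the agreement figure by 5 percentage points, so the writing results are more sensitive to individual panelists than the civics and US history results.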
To examine the distribution of DOK levels across tasks, we used the DOK level that each panelist rated as the best fit for the task. If a panelist agreed with the assigned DOK level, the task was given that DOK level for that panelist. If the panelist felt the assigned DOK level was too low or too high, we asked the panelist to identify the DOK level that was more appropriate for the task. As a result, the DOK levels behind each panelist's distribution are a mix of originally assigned levels (where the panelist agreed) and panelist-assigned levels (where the panelist disagreed).
In writing (Table 31), most tasks were rated at DOK levels 2, 3, and 4. None of the tasks were given a DOK level of 1 or 5, but a few tasks were given a DOK level of 6 in grades 9 and 10. Grades 8 and 9 writing had 50% of tasks at DOK level 3. In grades 4 through 7 the distribution was approximately even among DOK levels 2, 3, and 4, while grade 10 had a third of its tasks at each of DOK levels 2 and 3, with the remainder split between DOK levels 4 and 6.
Table 31. Distribution of Panelist DOK Ratings – Writing
Grade Statistic DOK 1 DOK 2 DOK 3 DOK 4 DOK 5 DOK 6
Grade 4
Mean 0.00 2.00 2.25 1.75 0.00 0.00
SD NA 1.63 2.06 0.50 0.00 NA
Percent 0.00 33.33 37.50 29.17 0.00 0.00
Grade 5
Mean 0.00 2.00 2.00 2.00 0.00 0.00
SD NA 0.00 0.82 0.82 NA NA
Percent 0.00 33.33 33.33 33.33 0.00 0.00
Grade 6
Mean 0.00 1.75 2.00 2.25 0.00 0.00
SD NA 0.50 0.00 0.50 NA NA
Percent 0.00 29.17 33.33 37.50 0.00 0.00
Grade 7
Mean 0.00 1.75 2.25 2.00 0.00 0.00
SD NA 0.50 0.50 0.00 NA NA
Percent 0.00 29.17 37.50 33.33 0.00 0.00
Grade 8
Mean 0.00 1.00 3.00 2.00 0.00 0.00
SD NA 0.00 0.00 0.00 NA NA
Percent 0.00 16.67 50.00 33.33 0.00 0.00
Grade 9
Mean 0.00 1.50 3.00 1.00 0.00 0.50
SD NA 0.58 0.82 0.82 NA 0.58
Percent 0.00 25.00 50.00 16.67 0.00 8.33
Grade 10
Mean 0.00 2.00 2.00 1.00 0.00 1.00
SD NA 0.00 0.00 0.00 NA 0.00
Percent 0.00 33.33 33.33 16.67 0.00 16.67
In civics and US history (Table 32), the majority of tasks were given a DOK level of 2, 3, or 4, with an approximately even distribution of tasks among those DOK levels. No items were rated as DOK level 1 or 6, and a few items were rated as DOK 5.
Table 32. Distribution of Panelist DOK Ratings – Social Studies
Grade Statistic DOK 1 DOK 2 DOK 3 DOK 4 DOK 5 DOK 6
Civics
Mean 0.00 16.00 15.00 14.75 2.25 0.00
SD NA 0.00 0.00 1.50 1.50 NA
Percent 0.00 33.33 31.25 30.73 4.69 0.00
US History
Mean 0.00 16.00 15.00 16.00 1.00 0.00
SD NA 0.00 1.15 1.15 0.00 NA
Percent 0.00 33.33 31.25 33.33 2.08 0.00
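The Mean, SD, and Percent rows in Tables 31 and 32 can be derived from per-panelist counts of tasks at each DOK level. The sketch below shows the arithmetic. The per-panelist counts are HYPOTHETICAL: the report does not publish them, and these values were chosen only so that the output matches the Grade 4 row of Table 31. The use of the sample standard deviation is likewise an assumption.

```python
from statistics import mean, stdev

N_TASKS = 6  # the means in each writing row of Table 31 sum to 6

# HYPOTHETICAL counts: how many of the 6 tasks each of 4 panelists placed
# at a given DOK level. Not taken from the report.
PANELIST_COUNTS = {
    "DOK 2": [0, 2, 2, 4],
    "DOK 3": [5, 2, 2, 0],
    "DOK 4": [1, 2, 2, 2],
}


def summarize(counts, n_tasks=N_TASKS):
    """Mean count, sample SD, and mean expressed as a percent of all tasks."""
    avg = mean(counts)
    return {
        "mean": round(avg, 2),
        "sd": round(stdev(counts), 2),  # sample SD (n - 1), an assumption
        "percent": round(100 * avg / n_tasks, 2),
    }


for level, counts in PANELIST_COUNTS.items():
    print(level, summarize(counts))
```

Because the Percent row is the mean count divided by the number of tasks, the percentages in each row sum to 100 whenever every panelist rated every task.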
In addition to determining the agreement between panelists’ DOK ratings and the DOK levels assigned to tasks, we compared the DOK ratings panelists provided for the APs and FSAA-PT tasks to evaluate the degree of alignment between the cognitive expectations. Tables 33 and 34 summarize the percent of tasks (across panelists) which were assigned DOK levels that were lower, the same, or higher than the DOK level of the aligned AP. It is reasonable to expect 50% of the tasks to be at the same or higher complexity level as the corresponding AP.
Table 33. Percent of Writing Tasks at Lower, Same, or Higher Levels of Complexity Compared to Related APs
                             % of Linked Tasks with
Grade     N Raters  N Tasks  Lower Complexity  Same Complexity  Higher Complexity  Same or Higher Complexity
Grade 4   4         6        83.33             16.67            0.00               16.67
Grade 5   4         6        87.50             12.50            0.00               12.50
Grade 6   4         6        83.33             12.50            4.17               16.67
Grade 7   4         6        100.00            0.00             0.00               0.00
Grade 8   4         6        100.00            0.00             0.00               0.00
Grade 9   4         6        79.17             12.50            8.33               20.83
Grade 10  4         6        66.67             33.33            0.00               33.33
Table 34. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Complexity Compared to Related APs
                               % of Linked Tasks with
Grade       N Raters  N Tasks  Lower Complexity  Same Complexity  Higher Complexity  Same or Higher Complexity
Civics      4         48       4.17              54.17            41.67              95.83
US History  4         48       10.42             70.83            18.75              89.58
In civics and US history, the majority of tasks were rated as the same or higher complexity than the AP. However, the majority of writing tasks for all grades were rated as having lower complexity than the AP. In fact, none of the writing grades met the 50% criterion, with grades 7 and 8 having no tasks with the same or higher complexity.
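The 50% criterion can be checked directly from the Lower, Same, and Higher columns of Tables 33 and 34. The following is a minimal sketch with illustrative names. Because it sums rounded percentages, a computed value can differ from the published "Same or Higher" column by 0.01 (for civics, 54.17 + 41.67 gives 95.84 rather than 95.83).

```python
CRITERION = 50.0  # expectation stated in the report

# (% lower, % same, % higher) relative to the aligned AP, from Tables 33 and 34.
ROWS = {
    "Writing Grade 4": (83.33, 16.67, 0.00),
    "Writing Grade 5": (87.50, 12.50, 0.00),
    "Writing Grade 6": (83.33, 12.50, 4.17),
    "Writing Grade 7": (100.00, 0.00, 0.00),
    "Writing Grade 8": (100.00, 0.00, 0.00),
    "Writing Grade 9": (79.17, 12.50, 8.33),
    "Writing Grade 10": (66.67, 33.33, 0.00),
    "Civics": (4.17, 54.17, 41.67),
    "US History": (10.42, 70.83, 18.75),
}


def same_or_higher(lower, same, higher):
    """Percent of linked tasks at the same or higher complexity than the AP."""
    return round(same + higher, 2)


def meets_criterion(row, criterion=CRITERION):
    return same_or_higher(*row) >= criterion


passing = [name for name, row in ROWS.items() if meets_criterion(row)]
print(passing)
```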
Besides DOK, FSAA-PT tasks are assigned three additional complexity ratings, Volume of Information, Vocabulary, and Context, according to the Presentation Rubric (see Appendix A).
Panelists, individually, evaluated whether the additional complexity ratings assigned to each task during the item writing process matched the task, or whether they were too low or too high. We would expect panelists to agree with at least 90% of each additional complexity rating associated with the FSAA-PT tasks to achieve acceptability. Note that while there were 2 prompts for writing in all grades (prompt 1 includes 5 selected-response tasks and prompt 2 is an open-response task), prompt 2 was not assigned a Volume of Information, Vocabulary, or Context level by the prompt writer; therefore, for prompt 2 we could not compare the levels assigned by the prompt writers and the panelists. Only prompt 1 (the 5 selected-response tasks) is included in these comparisons for writing. Tables 35 and 36 present average panelist agreement with the assigned task complexity rating for Volume of Information. The number of tasks with each rating was averaged across panelists and presented as a percentage.
As seen in Tables 35 and 36, tasks in civics and US history met the expectation that panelists rate 90% of the tasks at the same Volume of Information level as was assigned to the task. All of the writing grades except for grade 4 (70%) and grade 9 (85%) met or exceeded the 90% agreement expectation.
Table 35. Percent of Writing Tasks at Lower, Same, or Higher Levels of Volume of Information
                             % of Linked Tasks with
Grade     N Raters  N Tasks  Lower Volume of Information  Same Volume of Information  Higher Volume of Information
Grade 4   4         5        0.00                         70.00                       30.00
Grade 5   4         5        0.00                         90.00                       10.00
Grade 6   4         5        0.00                         100.00                      0.00
Grade 7   4         5        0.00                         100.00                      0.00
Grade 8   4         5        0.00                         100.00                      0.00
Grade 9   4         5        15.00                        85.00                       0.00
Grade 10  4         5        0.00                         100.00                      0.00
Table 36. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Volume of Information
                               % of Linked Tasks with
Grade       N Raters  N Tasks  Lower Volume of Information  Same Volume of Information  Higher Volume of Information
Civics      4         48       0.00                         100.00                      0.00
US History  4         48       0.52                         98.44                       1.04
Tables 37 and 38 show the average panelist agreement with the assigned task complexity ratings for Vocabulary. The number of tasks with each rating was averaged across panelists and presented as a percentage.
The majority of grades and subjects met the expectation that panelists rate 90% of the tasks at the same Vocabulary level as was assigned to the task. In civics and US history, panelists rated more than 90% of the tasks at the same Vocabulary level as assigned. All of the writing grades except for grade 4 and grade 9 (75% each) exceeded the 90% agreement expectation.
Table 37. Percent of Writing Tasks at Lower, Same, or Higher Levels of Vocabulary
                             % of Linked Tasks with
Grade     N Raters  N Tasks  Lower Vocabulary  Same Vocabulary  Higher Vocabulary
Grade 4   4         5        0.00              75.00            25.00
Grade 5   4         5        0.00              100.00           0.00
Grade 6   4         5        0.00              100.00           0.00
Grade 7   4         5        0.00              100.00           0.00
Grade 8   4         5        0.00              100.00           0.00
Grade 9   4         5        10.00             75.00            15.00
Grade 10  4         5        0.00              100.00           0.00
Table 38. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Vocabulary
                               % of Linked Tasks with
Grade       N Raters  N Tasks  Lower Vocabulary  Same Vocabulary  Higher Vocabulary
Civics      4         48       0.52              99.48            0.00
US History  4         48       3.65              95.83            0.52
Tables 39 and 40 show the average panelist agreement with the assigned task complexity ratings for Context. The number of tasks with each rating was averaged across panelists and presented as a percentage.
As seen in Tables 39 and 40, tasks in civics, US history, and writing grades 5-10 exceeded the expectation that panelists rate 90% of the tasks at the same Context level as was assigned to the task. In grade 4 writing, 75% of the tasks were rated as having the same Context level while the other 25% of tasks were rated as written to a higher Context level.
Table 39. Percent of Writing Tasks at Lower, Same, or Higher Levels of Context
                             % of Linked Tasks with
Grade     N Raters  N Tasks  Lower Context  Same Context  Higher Context
Grade 4   4         5        0.00           75.00         25.00
Grade 5   4         5        0.00           100.00        0.00
Grade 6   4         5        0.00           100.00        0.00
Grade 7   4         5        0.00           100.00        0.00
Grade 8   4         5        0.00           100.00        0.00
Grade 9   4         5        0.00           100.00        0.00
Grade 10  4         5        0.00           100.00        0.00
Table 40. Percent of Social Studies Tasks at Lower, Same, or Higher Levels of Context
                               % of Linked Tasks with
Grade       N Raters  N Tasks  Lower Context  Same Context  Higher Context
Civics      4         48       0.00           99.48         0.52
US History  4         48       3.13           95.31         1.56
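The three additional complexity ratings can be summarized together. The sketch below copies the "Same" percentages from Tables 35 through 40 and lists, for each assessment, the dimensions on which panelist agreement fell below the 90% expectation. The names are illustrative and not drawn from the study's materials.

```python
THRESHOLD = 90.0  # agreement expectation stated in the report

ASSESSMENTS = [
    "Writing Grade 4", "Writing Grade 5", "Writing Grade 6", "Writing Grade 7",
    "Writing Grade 8", "Writing Grade 9", "Writing Grade 10", "Civics", "US History",
]

# % of ratings at the same level as assigned, in ASSESSMENTS order (Tables 35-40).
SAME_PERCENT = {
    "Volume of Information": [70.00, 90.00, 100.00, 100.00, 100.00, 85.00, 100.00, 100.00, 98.44],
    "Vocabulary": [75.00, 100.00, 100.00, 100.00, 100.00, 75.00, 100.00, 99.48, 95.83],
    "Context": [75.00, 100.00, 100.00, 100.00, 100.00, 100.00, 100.00, 99.48, 95.31],
}


def shortfalls(same_percent, assessments=ASSESSMENTS, threshold=THRESHOLD):
    """Map each assessment to the dimensions where agreement is below threshold."""
    result = {}
    for dimension, percents in same_percent.items():
        for name, pct in zip(assessments, percents):
            if pct < threshold:
                result.setdefault(name, []).append(dimension)
    return result


print(shortfalls(SAME_PERCENT))
```

Viewed this way, the shortfalls are concentrated in writing grades 4 and 9, the same two grades that fell short on DOK agreement.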
Criterion 4: Content Differentiation
This criterion focuses on whether the content increases in depth, breadth, and complexity at higher grade-levels for FSAA-PT tasks. For the writing prompt portion of the ELA assessment, the comparison was made across grades. However, we modified this criterion to focus instead on the three tasks within each item set for civics and US history. The FSAA-PT is structured such that each item contains three tasks which are written in increasing complexity on at least one of the complexity levels (DOK, Volume of Information, Vocabulary, or Context). In the civics and US history panel groups, panelists were asked to review the item sets and rate the amount of content differentiation evident between the tasks. This comparison required a more global judgment of each set of item tasks. Tables 41 and 42 show consensus ratings among panelists across the categories using the rating scheme of clear, partial, limited, or no differentiation. Although no minimum level of differentiation has been established, the LAL method suggests that all ratings of differentiation (clear, partial, or limited) are acceptable across grade-levels for each category so this same principle will be applied in evaluating the three tasks within an item set.
For writing, the content differentiation was determined by looking at the prompts across grade levels. As seen in Table 41, panelists found clear task differentiation for the writing prompts in grades 9-10. However, only partial, limited, or no content differentiation was found for the writing prompts in grades 4-5 and 6-8. For writing grades 6 and 7, panelists found no content differentiation in the FSAA-PT tasks.
Table 41. Consensus Content Differentiation Across Grades – Writing Prompt Portion of the ELA FSAA-PT
Grade Category Consensus Rating Consensus Rating Support
Grade 4-5
Broader Partial Grade five introduces "reasons" versus grade four "details" in the teacher script (Q1 T1). However, this does not impact the breadth or depth of the question. Otherwise processes remain the same, "link" and "link".
Deeper Partial Grade four Q1 T4 is simpler than a comparable question and grade five demonstrating a requirement of deeper mastery. The reading passage in grade five is longer, with more distractors and complexity. Sentences are longer in Q1 T5 grade five versus four.
Prerequisite Limited Grade four Q1 T4 is simpler than a comparable question and grade five thus building upon a skill.
New No differentiation Vocabulary is familiar in both passages, comparable processes used, some complexity is added.
Identical No Different reading passages, some additional complexity in wording, length of responses.
Grade 6-8
Broader No differentiation each item sticks with one source text for all grades; a simple 2 - 3 paragraph with a graphic is used at each grade level; there's no additional text, graphs, etc. that are being included as the grades increase - don't ask for making connections
Deeper Limited in 6th/7th grade there is textual scaffolding in the writing prompt seen through the tasks but in 8th grade the scaffolding is done through the teacher presentation of the tasks
Prerequisite Limited 6th/7th grade provide the foundation for 8th grade but 6th & 7th grade do not build on each other
New No differentiation same concept is presented across all grades
Identical Yes 6th & 7th are identical regarding the presentation of tasks, the format of tasks, and same concepts of assessment; however, 8th grade is not identical to 6th/7th because there is a different presentation of tasks and the format of tasks is different
Grade 9-10
Broader Clear Going from informative to argumentative writing provides progression; the task in 9th grade only involves previewing and summarizing, but in 10th grade the task includes developing an argument.
Deeper Clear Grade 10 task 5 states that a significant support for idea development is needed, which is deeper than the requirements of task 5 in grade 9. Instead of connecting different ideas, you are finding relationships between ideas.
Prerequisite Clear Informative writing is a prerequisite for argumentative writing; you have to have basic skills down before you create an argument.
New Clear Increased and new skills are needed to create an argument.
Identical No Clear content differentiation is present.
For civics and US history, the consensus content differentiation was determined by looking at the set of tasks associated with each item. Table 42 shows that panelists found clear task differentiation in the set of tasks for each item.
Table 42. Consensus Content Differentiation – Social Studies FSAA-PT Item Set
Grade Category Consensus Rating Consensus Rating Support
Civics
Broader Clear The higher tasks clearly reflect a broader application of the target skill.
Deeper Clear The higher tasks clearly reflect a deeper mastery of the target skill.
Prerequisite Clear The higher tasks clearly reflect a target prerequisite for mastery of the AP.
New Clear Task 1 plus Task 2 plus task 3 clearly reflects a new skill.
Identical No Each level of the task clearly builds to the performance level of the AP. Without having the tasks available the understanding of the Access points is not as clear. Specifically, the verb interpretation in reference to the performance task.
US History
Broader Clear Overall it does get broader. Independent are the most broad.
Deeper Clear The first one is general and then more specific. More rigorous
Prerequisite Clear Participatory tells the answer, then later ones require the first ones.
New Clear Have to know more than what is in the first one to get the later one right. They build to the last one.
Identical No Very definitely different
Criterion 5: Achievement
The fifth LAL criterion pertains to inferences that can be made about a student based on their FSAA-PT score. The alternate assessment should allow students with disabilities to demonstrate academic skills or knowledge acquired from their coursework on the assessment, free from teacher or program influence. To determine the extent to which the FSAA-PT enables students to demonstrate this learning, panelists evaluated the scoring rubrics, scoring guidelines, assessment administration manual, FSAA-PT tasks, and FSAA-PT Test Specifications. Panelists worked together to form consensus ratings regarding the level of inference (high, low, or no evidence) of student learning provided by the alternate assessment system. The ratings were made across several learning dimensions, which are described below (adapted from Flowers et al, 2007):
Level of accuracy – extent to which scoring makes clear distinctions in student responses (minimal leeway for teacher interpretation of student response).
Level of independence – extent to which student performance is based on independent response without teacher supports.
New learning – extent to which evidence of new learning is demonstrable based on use of baseline or pretest OR clear content differentiation between grade tests.
Generalization across people and settings – extent to which students demonstrate knowledge regardless of people (test administrator) or assessment setting.
Generalization across materials and activities – extent to which students demonstrate knowledge across different types of materials (i.e., objects) or activities.
Standard setting – extent to which achievement standards are distinct and based on demonstration of independent student performance.
Program quality indicators – extent to which the inclusion of program characteristics (e.g., quality of the task, completeness or accuracy of IEP) are not part of a student’s score.
Ratings of ‘no inference’ suggest an assessment may not allow students to adequately demonstrate their knowledge, while ratings of ‘high inference’ indicate that students’ scores clearly reflect their level of learning. It is reasonable to expect some inference (low or high) on at least six of the seven dimensions and high inference on at least four of those dimensions. Tables 43 and 44 contain the group consensus ratings on the degree of inference on student learning evident in the FSAA-PT, along with rationales for their ratings.
As seen in Table 43, the majority of dimensions in each writing grade were rated as having some level of inference. The panelists’ main concerns were that some tasks (where hand over hand teacher guidance is allowed) may provide the lowest level of inference; that some task completion may only be understood by a teacher very familiar with the student; and that some tasks may be more material-dependent than others. For writing grades 4 and 5, while panelists rated most dimensions as having some level of inference, only 3 out of 7 dimensions had a high level of inference.
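The expectation stated above (some inference on at least six of the seven dimensions, and high inference on at least four) can be applied mechanically to the consensus ratings reported in Tables 43 and 44. The sketch below does so. Ratings are listed in the order the dimensions appear in those tables. Treating the civics "NA" rating for Standard Setting as providing no evidence of inference is an assumption of this sketch, not a rule stated in the report, and the names are illustrative.

```python
# Consensus ratings from Tables 43 and 44, ordered as: Accuracy, Independence,
# New Learning, People/Settings, Materials/Activities, Standard Setting,
# Program Quality Indicators.
RATINGS = {
    "Writing Grade 4": ["High", "No", "Low", "High", "Low", "Low", "High"],
    "Writing Grade 5": ["High", "No", "Low", "High", "Low", "Low", "High"],
    "Writing Grade 6": ["Low", "Low", "No", "High", "High", "High", "High"],
    "Writing Grade 7": ["Low", "Low", "No", "High", "High", "High", "High"],
    "Writing Grade 8": ["Low", "Low", "Low", "High", "High", "High", "High"],
    "Writing Grade 9": ["High"] * 7,
    "Writing Grade 10": ["High"] * 7,
    "Civics": ["High", "High", "No", "No", "No", "NA", "High"],
    "US History": ["Low", "High", "High", "Low", "High", "High", "High"],
}


def meets_expectation(ratings, min_some=6, min_high=4):
    """True if enough dimensions show some (Low/High) and High inference."""
    some = sum(r in ("Low", "High") for r in ratings)  # "NA" counts as none (assumption)
    high = sum(r == "High" for r in ratings)
    return some >= min_some and high >= min_high


not_met = [name for name, r in RATINGS.items() if not meets_expectation(r)]
print(not_met)
```

Writing grades 4 and 5 reach six dimensions with some inference but only three with high inference, which matches the observation in the text above.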
Table 43. Consensus Student Learning – Writing Prompt Portion of the ELA FSAA-PT
Grade Dimension Consensus Rating Consensus Rating Support
Grade 4
Level of Accuracy High Prompt two may have less inference.
Level of Independence No Hand over hand is a current practice per administration manual.
New Learning Low Limited differentiation between grades four and five, no pretest.
Generalizations Across People and Settings High A person with student familiarity would be able to administer test.
Generalizations Across Materials and Activities Low Some items are material dependent.
Standard Setting Low Any student has the ability to respond correctly by chance.
Program Quality Indicators High Scoring is straight forward.
Grade 5
Level of Accuracy High Prompt two may be more subjective to teacher.
Level of Independence No Hand over hand is a current practice per administration manual.
New Learning Low Limited differentiation between grades four and five, no pretest.
Generalizations Across People and Settings High A person with student familiarity would be able to administer test.
Generalizations Across Materials and Activities Low Some items are material dependent.
Standard Setting Low Any student has the ability to respond correctly by chance.
Program Quality Indicators High Scoring is straight forward.
Grade 6
Level of Accuracy Low There are some students where teacher interaction is needed to obtain a response (i.e., hand drop, sign language)
Level of Independence Low Across all kids, it depends on the level of support and level of guidance the student requires as to the level of inference - in general it is low
New Learning No The writing tasks are similar across grade levels.
Generalizations Across People and Settings High For most tasks this is true for students.
Generalizations Across Materials and Activities High Every standard is tested using a scaffolding approach, building up the full standard
Standard Setting High need more comprehension as mastery is not likely to be shown by chance especially for prompt 2
Program Quality Indicators High student's score is indicative of what they can do
Grade 7
Level of Accuracy Low There are some students where teacher interaction is needed to obtain a response (i.e., hand drop, sign language)
Level of Independence Low Across all kids, it depends on the level of support and level of guidance the student requires as to the level of inference - in general it is low
New Learning No The writing tasks are similar across grade levels.
Generalizations Across People and Settings High For most tasks this is true for students.
Generalizations Across Materials and Activities High Every standard is tested using a scaffolding approach, building up the full standard
Standard Setting High need more comprehension as mastery is not likely to be shown by chance especially for prompt 2
Program Quality Indicators High student's score is indicative of what they can do
Grade 8
Level of Accuracy Low There are some students where teacher interaction is needed to obtain a response (i.e., hand drop, sign language)
Level of Independence Low Across all kids, it depends on the level of support and level of guidance the student requires as to the level of inference - in general it is low
New Learning Low The level of student independence is related to student guidance
Generalizations Across People and Settings High For most tasks this is true for students.
Generalizations Across Materials and Activities High Every standard is tested using a scaffolding approach, building up the full standard
Standard Setting High need more comprehension as mastery is not likely to be shown by chance especially for prompt 2
Program Quality Indicators High student's score is indicative of what they can do
Grade 9
Level of Accuracy High Student has to get items correct to receive credit.
Level of Independence High mostly independent responses are acceptable. Dependent responses are occasionally acceptable when teacher prompt is specified.
New Learning High there is task differentiation across grades; the tasks increase in breadth, depth, and are often prerequisites for higher grade.
Generalizations Across People and Settings High students are expected to demonstrate knowledge across different testers who are familiar with the student.
Generalizations Across Materials and Activities High students are expected to demonstrate knowledge across materials and activities.
Standard Setting High students are expected to demonstrate high level of knowledge to be able to pass.
Program Quality Indicators High Program quality indicators are not used according to the participant; only the student knowledge influences the score.
Grade 10
Level of Accuracy High Student has to get items correct to receive credit.
Level of Independence High mostly independent responses are acceptable. Dependent responses are occasionally acceptable when teacher prompt is specified.
New Learning High there is task differentiation across grades; the tasks increase in breadth, depth, and are often prerequisites for higher grade.
Generalizations Across People and Settings High students are expected to demonstrate knowledge across different testers who are familiar with the student.
Generalizations Across Materials and Activities High students are expected to demonstrate knowledge across materials and activities.
Standard Setting High students are expected to demonstrate high level of knowledge to be able to pass.
Program Quality Indicators High Program quality indicators are not used according to the participant; only the student knowledge influences the score.
The civics panelists did not view several dimensions as having a high level of inference; their comments reflect concern over a lack of generalization across people and materials (Table 44). Panelists in the US history group viewed all dimensions as having some level of inference. They believed that the US history EOC score provides information about what a student knows and can do independently of the teacher or the assessment system.
Table 44. Consensus Student Learning – Social Studies FSAA-PT
Grade Dimension Consensus Rating Consensus Rating Support
Civics
Level of Accuracy High There is one answer they get credit for.
Level of Independence High Hand over hand may be challenging to interpret.
New Learning No County by county each has their own EOC prerequisites, but not state wide
Generalizations Across People and Settings No Test admin dictates training and familiarity with the student.
Generalizations Across Materials and Activities No The tasks are specific to the access point and then tied to the specific standard.
Standard Setting NA Not established.
Program Quality Indicators High Program quality indicators are reflective of the score not outside indicators.
US History
Level of Accuracy Low They try to make sure that everyone gives it the same way, but in reality there is some room for inference by the administrator. No way to fix it. In a perfect world we would have said high student inference. A lot of room for human error.
Level of Independence High You only get down to a 50/50 chance. Used to be the middle because you would scaffold twice.
New Learning High Some questions are prior knowledge, but some are from this grade level. 25% is from middle school but the rest are grade level differentiated
Generalizations Across People and Settings Low Because the students are hard to understand, so someone else might not understand them. Someone else giving the test would affect the score. Have to be familiar with the student. You would need to have in the IEP that the person needs to be someone who understands the test. Some would be fine but some students would not.
Generalizations Across Materials and Activities High You can use other materials and other tasks could get at the same standard equally well.
Standard Setting High In the independent level items they are held to a high standard. That standard might even be too high for this group of students.
Program Quality Indicators High No connection between an IEP and a score.
Criterion 6: Performance Accuracy
Criterion 6 is intended to evaluate the degree of accessibility of the FSAA-PT for all student groups who take it. Reduced access to the tasks would decrease accurate measurement of students’ skills. Panelists, individually, rated tasks on whether accommodations or supports can be provided for different types of students without substantially altering the target content.
Tables 45 and 46 display the mean percent of tasks rated as accessible to all students. At least 90% of the tasks should be rated as accessible for the whole assessment to be considered accessible. The ratings for all grades and subjects except for writing grade 4 indicate that panelists found 100% of the tasks to be accessible to different disability groups. For writing grade 4, they found only 75% of the tasks to be accessible to different disability groups. The panelists commented that some tasks may not translate well to ASL, that students with disabilities may not be able to relate to some content, and that some items may not be accessible to visually impaired students.
Table 45. Percent of FSAA-PT Tasks as Accessible to Different Disability Groups – Writing
Grade | N Raters | N Tasks | % Yes | % No
Grade 4 | 4 | 6 | 75.00 | 25.00
Grade 5 | 4 | 6 | 100.00 | 0.00
Grade 6 | 4 | 6 | 100.00 | 0.00
Grade 7 | 4 | 6 | 100.00 | 0.00
Grade 8 | 4 | 6 | 100.00 | 0.00
Grade 9 | 4 | 6 | 100.00 | 0.00
Grade 10 | 4 | 6 | 100.00 | 0.00
Table 46. Percent of FSAA-PT Tasks as Accessible to Different Disability Groups – Social Studies
Grade | N Raters | N Tasks | % Yes | % No
Civics | 4 | 48 | 100.00 | 0.00
US History | 4 | 48 | 100.00 | 0.00
The second rating required panelists to evaluate whether tasks could be modified or supports offered without altering the meaning or purpose of the task. A common approach to administering an alternate assessment is for teachers to offer accommodations based on student IEPs or supports (e.g., assistive technology, scaffolding) as appropriate for a given student.
Tables 47 and 48 include the mean percent of tasks panelists found amenable to these types of changes. Panelists found the majority of tasks could be altered appropriately for individual students.
Table 47. Percent of FSAA-PT Tasks as Amenable to Accommodations or Supports – Writing
Grade | N Raters | N Tasks | % Yes | % No
Grade 4 | 4 | 6 | 100.00% | 0.00%
Grade 5 | 4 | 6 | 100.00% | 0.00%
Grade 6 | 4 | 6 | 100.00% | 0.00%
Grade 7 | 4 | 6 | 100.00% | 0.00%
Grade 8 | 4 | 6 | 100.00% | 0.00%
Grade 9 | 4 | 6 | 95.83% | 4.17%
Grade 10 | 4 | 6 | 100.00% | 0.00%
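As a rough illustration of how a value such as the 95.83% for grade 9 can arise, the sketch below pools all panelist-by-task ratings and computes the percent rated "Yes". It assumes that every rater rated every task and that exactly one of the 24 grade 9 ratings was "No"; the report does not state the underlying counts, so both the function name percent_yes and the count of "No" ratings are assumptions chosen only because they reproduce the reported figures.

```python
def percent_yes(n_raters: int, n_tasks: int, n_no: int) -> float:
    """Percent of pooled ratings that are 'Yes', rounded to two decimals."""
    total = n_raters * n_tasks  # assumes every rater rated every task
    return round(100 * (total - n_no) / total, 2)


# Grade 9 writing: 4 raters x 6 tasks = 24 pooled ratings.
# A single 'No' rating is an assumption, not a count given in the report.
grade9_yes = percent_yes(n_raters=4, n_tasks=6, n_no=1)
grade9_no = round(100 - grade9_yes, 2)

print(grade9_yes, grade9_no)  # 95.83 4.17
```

Under these assumptions the result clears the 90% threshold, which is why grade 9 is still treated as amenable to accommodations or supports.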
Table 48. Percent of FSAA-PT Tasks as Amenable to Accommodations or Supports – Social Studies
Grade | N Raters | N Tasksᵃ | % Yes | % No
Civics | 4 | 47-48 | 100.00% | 0.00%
US History | 4 | 47-48 | 100.00% | 0.00%
a A range of values denotes at least one panelist did not provide a rating on all tasks.
To further evaluate the FSAA-PT on accessibility and accommodations, panelists were asked as a group to provide a consensus rating on four questions across nine disability groups. Whereas the ratings above addressed accessibility and amenability to accommodations across all disabilities in general, this evaluation allowed panelists to judge whether students with particular disabilities may have difficulty accessing the FSAA-PT or whether accommodations for them are difficult to provide.
Table 49 shows that panelists believed there are sufficient provisions in the assessment to capture responses for students without clear, intentional communication in civics and writing grades 4-5 and 9-10, but not in US history or writing grades 6-8. Panelists felt that accommodations, modifications, and supports were defined sufficiently to maintain standardized administration for all grades and subjects except for writing grades 6-8.
Table 49. Consensus Whole Test Barriers to Demonstrating Student Knowledge
Question | Yes | No | Consensus Comments
Are there provisions in the assessment to capture responses for students without clear, intentional communication (even at non-symbolic level)? | Civics | US History | If in the classroom you never know if you are getting to them, then the test doesn't have anything to help with that.
 | Writing 4-5 | Writing 6-8 | Writing Prompt 2 is too in-depth to capture an answer for those students who do not have clear, intentional communication
 | Writing 9-10 | |
Are accommodations, modifications, and supports defined sufficiently to maintain standardized administration? | Civics, US History, Writing 4-5, Writing 9-10 | Writing 6-8 | ASL script is not standardized and if it becomes standardized, needs to allow flexibility to take into account the sign vocabulary of the student
Table 50 indicates that, overall, panelists felt that the FSAA-PT is accessible to many different disability groups. The main concerns were with writing across all grades, specifically regarding deaf/blind students. Panelists stated that for students who are deaf, deaf/blind, or communicate nonverbally with pictures, the accommodations would result in the meaning being changed.
Table 50. Consensus Whole Test Barriers to Demonstrating Student Knowledge for Certain Disability Groups
Question | Disability Group | Consensus Comments
Disability group columns: Visually Impaired/Legally Blind; Hearing Impaired; Deaf/Blind; Nonverbal – Printed Words; Nonverbal – Pictures; Nonverbal – Manual Signs; Nonverbal – Eye Gaze; Verbal but no use of hands; Communicates with objects or by indicating yes/no
Does the FSAA-PT contain provisions for students with these characteristics?
Writing gr 9-10
Writing gr 6-8
The instructions to teacher do not state specifically which items will be brailleable. Unclear what accommodations will be made for deaf/blind students.
Student can do the FSAA-PT tasks as designed with flexibility built into tasks?
Writing gr 9-10
Writing gr 6-8
The instructions to teacher do not state specifically which items will be brailleable. Unclear what accommodations will be made for deaf/blind students.
Student can do the FSAA-PT tasks with accommodations (no change to meaning)?
Writing gr 4-5
Writing gr 4-5, Writing gr 9-10
Writing gr 6-8
For students who routinely communicate utilizing ASL concepts rather than exact English there could be changes to meaning. It is not clear what accommodations are provided to deaf/blind students.
Student can do the FSAA-PT tasks with modifications/supports (may change meaning)?
Writing gr 9-10
It is not clear what accommodations may be provided to deaf/blind students. We need to know exactly what modifications are allowed for deaf/blind students to be able to evaluate this question.
Chapter 5: Summary and Recommendations
In this section, we summarize the results of the alignment study and provide recommendations to strengthen portions of the Florida alternate assessment system.
Access Point to Standards Alignment Summary
For this alignment evaluation, panelists reviewed APs, associated with the FSAA-PT blueprints, for the writing prompt section of the ELA assessment, civics, and US history in multiple ways. First, they evaluated the content centrality (Criterion 2) between the blueprint APs and the corresponding LAFS and NGSSS for Social Studies. Second, panelists evaluated the progression of content (Criterion 4) from one grade to the next only for the blueprint identified writing APs. Lastly, panelists rated the appropriateness and accessibility (Criteria 1 and 6) of the AP content for this population of students.
The rules for the LAL criteria applied to the alignment between blueprint identified APs and the LAFS and NGSSS for Social Studies are as follows:
Criterion 1: Age Appropriateness (individual panelist rating) - 90% or more of the APs were rated as ‘adapted’ or ‘neutral’
Criterion 2a: Content Centrality (individual panelist rating) - 90% or more of the APs were linked to the LAFS or NGSSS for Social Studies
Criterion 2b: Performance Centrality (individual panelist rating) - 90% or more of the APs were comparable in complexity to the LAFS or NGSSS for Social Studies
Criterion 4: Content Differentiation (consensus group rating) - Dimension ratings were ‘clear’ or ‘partial’ and the Identical dimension is ‘no’
Criterion 6: Performance Accuracy (individual panelist rating) - 90% or more of the APs were accessible to different disability groups
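The Criterion 4 rule can be read as a simple tally, sketched below for the writing AP ratings summarized later in this chapter. This is one plausible reading, not a documented procedure: it assumes the "N out of 5" figures count the four differentiation dimensions (breadth, depth, prerequisite, new learning) plus the Identical dimension, and the function name and the rating labels "none" and "low" are hypothetical stand-ins for the panelists' wording.

```python
# Ratings that satisfy Criterion 4 according to the rule stated above.
MEETS = {"clear", "partial"}


def criterion4_tally(dimensions, identical):
    """Return (dimensions meeting the rule, dimensions considered).

    Assumption: the tally covers each differentiation dimension plus the
    Identical dimension, which meets the rule only when rated 'no'.
    """
    met = sum(1 for rating in dimensions.values() if rating in MEETS)
    if identical == "no":
        met += 1
    return met, len(dimensions) + 1


# Writing grades 6-8 APs: partial depth and prerequisite, one identical AP.
w6_8 = criterion4_tally(
    {"breadth": "none", "depth": "partial",
     "prerequisite": "partial", "new learning": "low"},
    identical="yes",
)

# Writing grades 4-5 APs: low in all areas, rated identical between grades.
w4_5 = criterion4_tally(
    {"breadth": "low", "depth": "low",
     "prerequisite": "low", "new learning": "low"},
    identical="yes",
)

print(w6_8, w4_5)  # (2, 5) (0, 5)
```

Under this reading the tallies agree with the "2 out of 5" and "0 out of 5" entries reported for Criterion 4 in Table 51.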
Table 51 provides summary conclusions on the alignment of the blueprint identified APs to their respective LAFS and NGSSS for Social Studies. As a reminder, only the writing APs and LAFS are of interest in this alignment study. For non-writing APs and LAFS, refer to the Nemeth et al. (2016 No. 029) report. If APs met the criterion, a green highlighted box containing a ‘✓’ is assigned. For results falling slightly below a criterion, a yellow highlighted box containing the criterion results is assigned. Finally, a red highlighted box contains results that fell well below the criterion.
As illustrated in Table 51, in general, the blueprint identified APs exhibited high content linkage with the grade-level standards. Specifically, the APs across all grades and subjects were rated by panelists as age appropriate (Criterion 1) and were found to assess the same content and performance expectations as the grade-level standards (Criterion 2) for all grades and subjects. Panelists felt that the blueprint identified APs were accessible to different disability groups (Criterion 6).
Table 51. Percent of Grade-Level APs Which Met Each LAL Criterion
Grade/Subject | Criterion 1: Age Appropriate | Criterion 2: Content Centrality | Criterion 2: Performance Centrality | Criterion 4: Content Differentiation | Criterion 6: Performance Accuracy
Question | Is the content of the APs age appropriate? | Does the AP content link with the associated LAFS or NGSSS? | Are the APs comparable in complexity to the LAFS & NGSSS? | Does content differ across grade-levels within a grade span?¹² | Are barriers to demonstrating student knowledge minimized?
Source | Tables 10-11 | Tables 12-13 | Tables 16-17 | Table 18 | Tables 19-20
W4 | ✓ | ✓ | ✓ | 0 out of 5 | ✓
W5 | ✓ | ✓ | ✓ | 0 out of 5 | ✓
W6 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W7 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W8 | ✓ | ✓ | ✓ | 2 out of 5 | ✓
W9 | ✓ | ✓ | ✓ | NA | ✓
W10 | ✓ | ✓ | ✓ | NA | ✓
Civ | ✓ | ✓ | ✓ | NA | ✓
USH | ✓ | ✓ | ✓ | NA | ✓
12 For Writing grades 4-8, a comparison between this study and the 2016 alignment study (see Nemeth, et al. [2016 No. 029]) reveals different results. In the 2016 alignment study, panelists evaluated all the blueprint identified APs for Language Arts associated with the ELA FSAA-PT. However, the current alignment study required panelists to review the blueprint identified APs for Language Arts associated with only the writing prompt section of the ELA FSAA-PT.
Criterion 4 (content differentiation) at the grade level was assessed only for writing grades 4-5 and 6-8. The civics assessment and US history assessment were not intended to have content differentiation between grades. Similarly, writing grades 9-10 APs were the same between these two grades. Content differentiation appears to be an area in need of improvement. For writing grades 4-5, panelists found content differentiation to be low in all areas (breadth, depth, prerequisite, new learning), and consequently rated the APs to be identical between the grades. For writing grades 6-8, the panelists found no differentiation in breadth between any of the three grades, low differentiation in new learning between grades 7 and 8, and partial differentiation in depth and prerequisite. As a result, they concluded that one of the APs (AP 2.4) is identical across the grades.
FSAA-PT Alignment Summary
Table 52 provides summary conclusions on the alignment of the FSAA-PT writing prompt section of the ELA, civics, and US history assessments to the LAFS and NGSSS for Social Studies APs, respectively. If tasks met the criterion, a green highlighted box containing a ‘✓’ is assigned. For results falling slightly below a criterion, a yellow highlighted box containing the criterion results is assigned. Finally, a red highlighted box contains results that fell well below the criterion.
The rules for the LAL and HumRRO criteria applied to the alignment between FSAA-PT tasks and APs are as follows:
Criterion 1: Age Appropriateness (individual panelist rating) - 90% or more of the tasks were rated as ‘adapted’ or ‘neutral’
Criterion 2b: Performance Centrality (individual panelist rating) - 90% or more of the tasks were rated as ‘some’ or ‘all’
Criterion 3a: Content Representation (individual panelist rating) - 90% or more of the tasks were rated as ‘partial’ or ‘fully’ aligned
Criterion 3b: Category Representation (based on individual panelist rating) - Tasks match the FSAA-PT Test Specifications targets
Criterion 3c: DOK Representation (individual panelist rating) - 50% or more of the item set task 3 were at the same or higher DOK level as the AP - 90% or more of the assigned complexity ratings are confirmed by panelists for DOK, Volume of Information, Vocabulary, and Context
Criterion 4: Content Differentiation (consensus group rating) - Dimension ratings were ‘clear’ or ‘partial’ and the Identical dimension is ‘no’
Criterion 5: Achievement (consensus group rating) - 6 of the 7 dimensions have some level of inference, either low or high - At least 4 dimensions have a high level of inference
Criterion 6: Performance Accuracy (individual panelist rating) - 90% or more of the tasks were accessible to different disability groups - 90% or more of the tasks were amenable to accommodations or supports
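The two-part Criterion 5 rule can be sketched as follows, using the US History consensus ratings from Table 44 as input. The function names are hypothetical, and the sketch assumes that "6 of the 7 dimensions" means at least six; it is an illustration of the stated rule rather than the procedure HumRRO actually used.

```python
from typing import Mapping, Tuple


def criterion5_met(n_with_inference: int, n_high: int) -> bool:
    """Rule as stated above: 6 of 7 dimensions with some inference, at least 4 high."""
    return n_with_inference >= 6 and n_high >= 4


def criterion5_summary(ratings: Mapping[str, str]) -> Tuple[int, int, bool]:
    """Count dimensions rated 'Low' or 'High', count those rated 'High', apply the rule."""
    with_inference = sum(1 for r in ratings.values() if r in ("Low", "High"))
    high = sum(1 for r in ratings.values() if r == "High")
    return with_inference, high, criterion5_met(with_inference, high)


# US History consensus ratings as reported in Table 44.
us_history = {
    "Level of Accuracy": "Low",
    "Level of Independence": "High",
    "New Learning": "High",
    "Generalizations Across People and Settings": "Low",
    "Generalizations Across Materials and Activities": "High",
    "Standard Setting": "High",
    "Program Quality Indicators": "High",
}

print(criterion5_summary(us_history))  # (7, 5, True)
print(criterion5_met(6, 3))            # False
print(criterion5_met(3, 3))            # False
```

The last two calls use the "6 out of 7; 3" and "3 out of 7; 3" results reported in Table 52: the first falls short only on the number of dimensions with high inference, while the second misses both parts of the rule.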
Table 52. Percent of Grade-Level Tasks Which Met Each LAL Criterion
Column | Question | Source
Criterion 1: Age Appropriate | Is the content of the tasks age appropriate? | Tables 21-22
Criterion 2: Performance Centrality | Is the item set task comparable in complexity to the AP? | Tables 23-24
Criterion 3: Content Coverage, Item Alignment | Are tasks fully aligned with APs? | Tables 25-26
Criterion 3: Content Coverage, Represent Intended Categories | Do tasks adequately represent reporting categories? | Tables 27-28
Criterion 3: Content Coverage, Task Complexity | Do tasks reflect the range of DOK in the APs?¹³ | Tables 33-34
Criterion 3: Content Coverage, Task Complexity | Do panelists agree with DOK? | Tables 29-30
Criterion 3: Content Coverage, Task Complexity | Do panelists agree with Volume of Information? | Tables 35-36
Criterion 3: Content Coverage, Task Complexity | Do panelists agree with Vocabulary? | Tables 37-38
Criterion 3: Content Coverage, Task Complexity | Do panelists agree with Context? | Tables 39-40
Criterion 4: Content Differentiation | Writing: Do prompts increase in complexity across grade levels?¹⁴ Civics & USH: Do tasks within an item set increase in complexity? | Tables 41-42
Criterion 5: Achievement | Student achievement demonstrates learning. | Tables 43-44
Criterion 6: Performance Accuracy | Are tasks accessible to different disability groups? | Tables 45-46
Criterion 6: Performance Accuracy | Are tasks amenable to accommodations or supports? | Tables 47-48
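Cells with reported results below the criterion, by grade or subject:
W4 | Range of DOK in the APs: 17%; Agreement with DOK: 80%; Agreement with Volume of Information: 70%; Agreement with Vocabulary: 75%; Agreement with Context: 75%; Content Differentiation (grades 4-5): 2 out of 5; Achievement: 6 out of 7; 3; Accessible to different disability groups: 75%
W5 | Range of DOK in the APs: 13%; Achievement: 6 out of 7; 3
W6 | Performance Centrality: 88%; Range of DOK in the APs: 17%; Content Differentiation (grades 6-8): 0 out of 5
W7 | Performance Centrality: 88%; Range of DOK in the APs: 0%
W8 | Range of DOK in the APs: 0%
W9 | Range of DOK in the APs: 21%; Agreement with DOK: 65%; Agreement with Volume of Information: 85%; Agreement with Vocabulary: 75%
W10 | Range of DOK in the APs: 33%
Civ | Achievement: 3 out of 7; 3
USH | none
13 For Writing grades 4-10, a comparison between this study and the 2016 alignment study (see Nemeth, et al. [2016 No. 029]) reveals different results. In the 2016 alignment study, panelists evaluated the field test writing prompts which were still under development. Also, the panelists participating last year and this year were not the same educators.
14 In the 2016 alignment study, panelists evaluated all tasks and prompts on the ELA FSAA-PT. However, the current alignment study required panelists to review only the writing prompts of the ELA FSAA-PT.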
In general, the civics and US history FSAA-PT exhibited good overall alignment with the fewest areas for improvement. The writing prompts associated with the ELA FSAA-PT showed more areas for improvement. Panelists found the APs and assessment tasks for all subjects and grades to be age appropriate (Criterion 1). They determined that for the most part, the assessment tasks maintain fidelity with the performance expectations in the APs for civics and US history, and for writing grades 4-5, 8, and 9-10. For writing grades 6 and 7, 88% of tasks were found to call for comparable performance levels as the standards (Criterion 2).
There were mixed results on Criterion 3. Panelists found the tasks for each grade and subject to be fully aligned with the standards, and the percent of aligned tasks matches test specifications. However, panelists found the task cognitive complexity to be substantially lower than the AP complexity in writing for all grades. In civics and US history, the cognitive complexity of tasks was found to match the AP cognitive complexity. For the most part, panelists agreed with the assigned DOK. There was some disagreement in writing grades 4 and 9, but the overall cognitive complexity assigned by the panelists was either the same or higher. For writing grade 4, panelists agreed with 80% of assigned DOK levels, rating 10% of tasks as requiring a higher DOK, and 10% of tasks as requiring a lower DOK. For writing grade 9, panelists agreed with only 65% of assigned DOK levels, rating other tasks as lower (25%) and higher (10%). Similarly, panelists agreed with most of the Volume of Information levels, except for writing grades 4 and 9. They agreed with 70% of grade 4 writing tasks, and rated the other 30% as having a higher Volume of Information. For grade 9, on the other hand, panelists agreed with 85% of the tasks, and rated the other 15% as having a lower Volume of Information. For the most part, panelists agreed with the Vocabulary rating, with the exception of writing grade 4 and grade 9, where they agreed with 75% of the tasks. For grade 4, the other 15% of the tasks were rated as having a higher Vocabulary level, and for grade 9, 10% of the tasks were rated as having a lower Vocabulary level and 15% as having a higher Vocabulary level. Panelists agreed with the rating of Context in most cases, with the exception of grade 4. In this case, they agreed with 75% of the tasks, and rated the Context of the other 25% of the tasks as higher.
Criterion 4 was evaluated differently for the writing and social studies assessments. Since the writing tasks, unlike the tasks for civics and US history, were not ordered from easiest to hardest, these tasks were evaluated for content differentiation in the following way: Is there a progression in breadth, depth, prerequisite, new learning from lower grade prompt 1 and prompt 2 to higher grade prompt 1 and prompt 2? Content differentiation ratings at the prompt level agree with the overall AP content differentiation ratings for these grades. The progression of prompts for grades 4-5 writing was judged to have no differentiation in new learning, limited prerequisite differentiation, and partial differentiation in the breadth and depth. As a result, the panelists concluded that content differentiation was limited for these grades. However, for writing grades 6-8, panelists evaluated depth as limited between grades 6-7 and 8, and prerequisite as limited between grades 6 and 7. They stated that breadth and new learning were absent across all three grades, and consequently determined that the tasks were identical between grades 6 and 7, but not between grade 8 and the other two grades. For writing grades 9-10, panelists found clear content differentiation. For civics and US history, panelists were asked to evaluate whether the content differentiation existed from task 1 to task 3 in an item set. Overall, panelists found clear content differentiation for civics and US history.
Criterion 5 is an evaluation of whether the assessment system, in general, allows students to demonstrate learning. Here, as well, some dimensions were rated by panelists as providing ‘no inference’ of student learning. For example, in civics panelists stated that little inference can be made about the presence of new learning, and that the assessment results may be challenging to generalize across people and settings, and across materials and activities. In grade 4-5 writing, the assessment was seen to include tasks where hand over hand teacher guidance may be reducing the level of inference about student knowledge; therefore, the level of independence was judged to provide no inference about student knowledge. For the most part, across the subjects, panelists felt the FSAA-PT provides an assessment in which student learning can be demonstrated.
One thing we found in the course of this study is that even after we discussed with panelists the allowable accommodations and modifications as described in the test administration manual, panelists tended to think back to how these and similar assessments are administered in the field. In some cases, for example, if a teacher is unable to elicit a response from a student by the means specified in the test administration manual, they will implement solutions that are neither explicitly prohibited nor explicitly endorsed in the manual. While the manual does not explicitly endorse hand over hand assistance, the participants observed it being implemented in the field, and mentioned it in the discussion. We value these statements by the teachers, even though they diverge from the test administration manual instructions, since they come from their expertise. To make ratings more consistent, it may be helpful to state more explicitly in the test administration manual not only which accommodations/modifications are allowed, but also which ones are prohibited.
For Criterion 6, the ratings provided by panelists for all grades and subjects except for writing grade 4 found 100% of the tasks to be accessible to different disability groups. For writing grade 4, only 75% of the tasks were rated as accessible to different disability groups. Panelists voiced concerns about the tasks translating to ASL, and students with visual impairments having trouble with some tasks.
Recommendations¹⁵
HumRRO makes the following suggestions to strengthen the alignment between the components of the Florida assessment system:
Review the cognitive complexity of writing tasks. Tasks should assess APs at the same or higher complexity level. This ensures that each task appropriately assesses the AP content on which a student is asked to demonstrate knowledge and ability. The majority of writing tasks associated with prompt 1 did not assess students at a cognitive complexity level similar to that of the AP; instead, the tasks were judged to be too low. It is recommended that the writing tasks, particularly for prompt 1, be reviewed to ensure the cognitive complexity level of the tasks is in accordance with the assessment design and, if needed, that additional writing tasks be developed measuring a wider range of complexity to better match the cognitive complexity of the APs.
Review content differentiation of writing APs and tasks across grades. APs should increase in content breadth, depth and newer knowledge, as well as growth on prerequisite skills. Similarly, for assessments in which tasks are structured in such a way that they increase in cognitive complexity between grades (writing grades 4-10), there should be a progression of breadth and depth in the tasks across grades. However, in writing grades 4-5 and 6-8 no content differentiation was found between APs across grades within grade spans, and little task differentiation between grades within grade spans among the prompts. It is recommended, especially for writing grades 4-5 and 6-8, that the APs and tasks be reviewed to ensure appropriate content differentiation within and across grade spans. If the content differentiation between APs and thus prompts is not meant to be reflected in the AP, per se, but in the complexity of the reading passage associated with the writing prompt, then additional training or communication to educators in the field regarding such is recommended.
15 A supplemental appendix identifying specific items and tasks that FDOE and Measured Progress may want to review will be provided. Because it contains item information, it is not for public dissemination.
Review the DOK, Volume of Information, Vocabulary, and Context assigned to tasks. For writing grades 4 and 9, panelists agreed with fewer than 90% of the DOK, Volume of Information, Vocabulary, and Context (writing grade 4 only) levels assigned to tasks. It is recommended, especially for these grades and subjects, that the tasks be reviewed to ensure they reflect the appropriate DOK, Volume of Information, Vocabulary, and Context.
Review the degree to which the assessment provides evidence of a student’s ability to demonstrate what they know and can do. For civics and writing grades 4-5, panelists expressed concern that the tasks may not generalize across people and settings, or across materials and activities, and that a student’s responses may not be sufficiently independent from the teacher. It is recommended that the accommodations and modifications that are allowed and not allowed be explicitly stated in the test administration manual.
Review the accessibility of tasks to different disability groups. For grade 4 writing, panelists rated only 75% of the tasks as accessible to all disability groups. Their specific concerns were the accessibility of tasks for deaf students, deaf/blind students, and students who communicate nonverbally with pictures. While only grade 4 writing did not meet the criterion for accessibility of tasks, concerns about these population groups were voiced by panelists in other subject groups as well. It is recommended that accommodations for these groups be provided and/or outlined more clearly and specifically.
References
Flowers, C., Wakeman, S., Browder, D., & Karvonen, M. (2007). Links for academic learning: An alignment protocol for alternate assessments based on alternate achievement standards. Charlotte, NC: University of North Carolina at Charlotte. Retrieved from: http://www.naacpartners.org/LAL/documents/NAAC_AlignmentManualVer8_3.pdf
Nemeth, Y. M., Purl, J., & Smith, E. A. (2016). Independent alignment review of the Florida Standards Assessment (FSA) in English Language Arts and Mathematics (2016 No. 029). Alexandria, VA: Human Resources Research Organization.
Porter, A. C. (2002, October). Measuring the content of instruction: Uses in research and practice. Educational Researcher, 31(7), 3-14.
Smith, E. A., Deatz, R. C., Wen, Y., & Nemeth, Y. M. (2014). Independent alignment review of the mathematics grade 11 Minnesota Test of Academic Skills (MTAS) (2014 No. 048). Alexandria, VA: Human Resources Research Organization.
Smith, E. A., Wen, Y., Nemeth, Y. M., Levinson, H., & Deatz, R. C. (2014). Independent alignment review of the mathematics grade 11 Minnesota Comprehensive Assessment (MCA-III) (2014 No. 058). Alexandria, VA: Human Resources Research Organization.
Webb, N. L. (1997). Criteria for alignment of expectations and assessments in mathematics and science education (Research Monograph No. 6). Washington, DC: Council of Chief State School Officers.
Webb, N. L. (1999). Alignment of science and mathematics standards and assessments in four states (Research Monograph No. 18). Madison, WI: National Institute for Science Education and Council of Chief State School Officers. (ERIC Document Reproduction Service No. ED440852)
Webb, N. L. (2005). Webb alignment tool: Training manual. Madison, WI: Wisconsin Center for Education Research. Available: http://www.wcer.wisc.edu/WAT/index.aspx
Appendix A. Panelist Alignment Review Materials Samples
Panelists received the following instruction sheet as a reference guide to accompany verbal instructions from HumRRO facilitators.
FSAA-PT Panelist Instructions
Rating Task | Documents Needed | File Format
1 FSA Standards DOK (Consensus) | (1) FSA Standards; (2) FSA_1_DOKConsensus_subject Grade x – x; (3) Panelist Instructions; (4) Depth of Knowledge_rev_nov21 subject ONLY.docx | Print copy; Print copy; Print copy; Print copy
2 FSAA Access Points (AP) DOK (Consensus) | (1) FSAA Access Points; (2) FSAA_2_APDOKConsensus_ subject Grade x – x; (3) Panelist Instructions; (4) Depth of Knowledge_rev_nov21 subject ONLY.docx | Print copy; Print copy; Print copy; Print copy
3 AP Review (Individual) | (1) FSA Standards; (2) FSAA Access Points; (3) FSAA_3_APReview_subject Grade x – x; (4) Panelist Instructions; (5) Depth of Knowledge_rev_nov21 subject ONLY.docx | Print copy; Print copy; Excel spreadsheet; Print copy; Print copy
4 AP Content Differentiation (Consensus) Writing 4-5, 6-8 ONLY | (1) FSAA Access Points; (2) FSAA_4_AP Content Diff_ subject Grade x – x | Print copy; Excel spreadsheet
5 FSAA Task Review (Individual) | (1) FSAA Tasks (Prompts and Responses); (2) Item Workbook – subject Grade x – x; (3) Panelist Instructions; (4) FSAA Test Administration Manual; (5) Presentation Rubric_rev_nov21.pdf; (6) Depth of Knowledge_rev_nov21 subject ONLY.docx; (7) FSAA Access Points | Print copy; Excel spreadsheet; Print copy; Print copy; Print copy; Print copy; Print copy
6 Task Content Differentiation (Consensus) | (1) FSAA Tasks; (2) FSAA_6_ContentDiff_ subject Grade x – x | Print copy; Excel spreadsheet
7 Whole Test (Consensus) | (1) FSAA Tasks (Prompts and Responses); (2) FSAA Test Administration Manual; (3) FSAA_8_WholeTestCon_ subject Grade x – x | Print copy; Print copy; Excel spreadsheet
8 Student Learning Review (Consensus) | (1) FSAA Test Specs for writing, civics, and US history; (2) FSAA Assessment Manual excerpts | Print copy; Print copy; Excel spreadsheet
Rating Form Excel files: Access HumRRO item rating forms:
a. Locate folder on desktop, double click to open.
b. Open file specified by facilitator (example - FSAA_3_APReview_subject Grade x – x).
c. File, Save As, same file name with an underscore and your 3-initial extension (e.g., FSAA_3_APReview_subject Grade x – x _eas).
d. Autosave will be set to every “1” minute; however, please save often as this doesn’t work all the time.
e. Repeat for each rating form.
1 Rate FSA Standard DOK (Consensus)
(1) Use rating sheet, FSA Standards, and Depth of Knowledge
(2) Calibration: Rate 5 Standards independently, answer on rating sheet handout, and discuss as group to reach consensus. Note: if unable to reach consensus, majority rules, then tie break is higher DOK rating.
(3) The facilitator may repeat before you start entering your independent ratings.
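The fallback rule in the calibration note (majority rules, then the tie break is the higher DOK rating) can be sketched as below. This is a hedged reading rather than the facilitators' documented procedure: it treats "majority" as the most frequently chosen rating, represents DOK levels as integers, and uses a hypothetical function name.

```python
from collections import Counter
from typing import Sequence


def fallback_dok(ratings: Sequence[int]) -> int:
    """Pick the most frequent DOK rating; break ties with the higher level."""
    counts = Counter(ratings)
    top_count = max(counts.values())
    # Among the ratings chosen most often, the higher DOK level wins.
    return max(level for level, n in counts.items() if n == top_count)


print(fallback_dok([2, 2, 3, 1]))  # 2 (most frequent rating)
print(fallback_dok([2, 2, 3, 3]))  # 3 (tie resolved to the higher level)
```

The rule applies only when discussion fails to produce consensus, so a sketch like this describes the fallback, not the main rating process.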
2 Rate FSAA Extended Standard (AP) DOK (Consensus)
(1) Use rating sheet, FSAA Access Points, and Depth of Knowledge (2) Calibration: Rate 5 Access Points independently, answer on rating sheet handout, and
discuss as group to reach consensus. Note: if unable to reach consensus, majority rules, then tie break is higher DOK rating.
(3) The facilitator may repeat before you start entering your independent ratings. 3 Access Point (AP) Review
(1) Open FSAA_3_APReview_subject Grade x – x.xls and save with initial extension. (2) Review rating categories (codes on following pages). Reminder: White cells are for your
data; however, this file does have slightly greyed-out columns that you may need to ender data in. They are greyed-out because using them will likely be rare.
a. Content Centrality: Is all of the content in AP in the indicated standard? If you rate other than “Fully aligned”, you must explain in the second column what content the AP covers that is not part of the standard. Column 3 is available to suggest another standard if you feel there is one that is more appropriate.
b. Performance Centrality: Are students called upon to perform similarly between the AP and FSA standard? For example, do both standards require the student to select, identify, compare, analyze, or evaluate? If there are differences, then rate accordingly.
c. Age Appropriateness: Are the content and context of the AP indicative of age/grade-level content?
d. Barriers to Demonstrating Knowledge. There are two ratings.
Symbolic (This is asking the level of communication required by the AP for this student population to reasonably demonstrate knowledge):
Awareness (pre-symbolic): 10% of alternate assessment students have minimal intentional communication skills, and may not respond to any on-demand assessment. Their physical challenges may make it difficult to judge responses. Teachers who know the student may make inferences. In reading, objects and pictures are more concrete, and the student may make some limited connections to printed text, Braille, or raised symbols.
Early Symbolic: Intentional communication that can be interpreted more clearly but students lack abstract symbolic language; also described as emergent symbolic. This student can convey what they know and can do, but may not always communicate clearly or consistently. In reading, the student relies on picture discrimination and read-alouds with simplified text for listening comprehension.
Symbolic: Communicates with verbal or written words, sign language, braille, or language-based augmentative and alternative communication systems. Like the emergent symbolic student, however, this student can communicate what they know and can do, but communication is not always clear or consistent. Reading may be limited to pictures and sight words, and writing responses with sight words only.
Accessibility (This is outside of communication abilities, such as students with visual impairments, an inability to follow instructions, or a need for assistive technology):
Yes, all FSAA eligible students can demonstrate the knowledge required by this AP.
No, some FSAA eligible students cannot demonstrate the knowledge required by this AP.
(3) Calibration: Rate 5 Access Points independently and discuss as group. This is NOT consensus and is only to ensure everyone is comfortable with the ratings.
(4) The facilitator may repeat before you start entering your independent ratings.
4 Content Differentiation for AP – Writing Grades 4-5 & 6-8 ONLY
This criterion focuses on whether the content expectations (access points) change appropriately between grade levels. NOTE: THIS IS ONLY FOR WRITING GRADES 4-5 & 6-8
(1) Open FSAA_4_APContentDiff_subject Grade x – x.xls and save with initial extension.
(2) Review rating categories (codes on following pages).
a. Use FSAA Access Points.
b. Review APs for adjacent grades.
c. Always specify an example(s) when explaining rating.
5 FSAA Task Review
(1) Panelists access Item Workbook - subject Grade x - x.xls and save with initial extension.
(2) Review rating categories (codes on following pages)
a. Task Complexity Ratings (columns C-N) (Use Presentation Rubric_rev_nov21.pdf for definitions of VI, V, and C):
i. DOK: Does the assigned DOK indicate the correct level of complexity for this task? If not, then provide an alternate DOK rating and an explanation for the change.
ii. Volume of Information: Does the assigned VI indicate the correct level of volume of information? If not, then provide an alternate VI rating and an explanation for the change.
iii. Vocabulary: Does the assigned V indicate the correct level of vocabulary? If not, then provide an alternate V rating and explanation for the change.
iv. Context: Does the assigned C indicate the correct level of context? If not, then provide an alternate C rating and an explanation for the change.
b. Content Centrality: Does the content in the task match with content indicated in AP? If you rate other than “Fully Aligned” explain what content is missing from EB and provide another benchmark if you feel there is one that is more appropriate.
c. Performance Centrality: Do the tasks allow students to demonstrate content at a similar performance level as the AP? Performance types include: select, identify, compare, analyze, or evaluate.
d. Age Appropriateness: Are the content and context of the task age/grade-level appropriate?
e. Barriers to Demonstrating Knowledge. This has three ratings, symbolic, accessibility, and modification. See Task #3, AP Review, for information for Symbolic and Accessibility.
Modification (This is asking if there are supports teachers can provide, such as assistive technology or additional prompts of some type (ask for suggestions from the special ed teachers), as appropriate for a given student):
i. Yes, the task could be modified to be more accessible for some students without changing meaning.
ii. No, modifying the task further would change the meaning or difficulty.
(3) Calibration: Rate 5 tasks independently and discuss as group. This is NOT consensus and is only to ensure everyone is comfortable with the ratings.
(4) The facilitator may repeat before you start entering your independent ratings.
6 Content Differentiation for Tasks
This criterion focuses on whether the content presented in items changes appropriately between task levels.
(1) Open FSAA_6_ContentDiff_ subject Grade x – x.xls and save with initial extension.
(2) Review rating categories (codes on following pages).
a. Use all items.
b. Rate based on task levels for items on the test.
c. Explain ratings for each category by citing specific example(s).
7 Rate ‘Whole Test’ (Consensus) by grade level assessment form
The purpose of this step is to determine if barriers exist for some students to demonstrate learning per test form, similar to the AP and task ratings earlier, only as a consensus discussion.
(1) Open FSAA_7_WholeTestCon_ subject Grade x - x for reference only. The facilitator will record the group’s discussion.
(2) Focus on the assessment form as a whole, but use task examples as evidence in support of the rating.
(3) Use FSAA materials, FSAA Administration Manual, and FSAA Tasks.
8 Student Learning (Consensus)
This criterion is to identify if inferences can be made about a student from their scores. In other words, are the scores indicative of student learning and knowledge, or are the scores entirely, or partly, the result of the teacher or program?
(1) Open FSAA_7_subject Grade x – x.xls as a reference only. The facilitator will record the group’s discussion.
a. You will need FSAA materials, FSAA Assessment Manual, FSAA tasks.
b. Discuss each criterion; the facilitator records “y” or “n” in one of the 3 available cells for each criterion and documents the discussion (key points).
(2) Additional clarification support:
a. Accuracy: From the tasks and assessment administration guidance, does it appear the teacher has wide latitude in interpreting student responses, or is it clear that the student response shows learning has occurred?
b. Independence: From assessment administration guidance, what level of assistance is the teacher allowed to provide their student? For example: Hand-over-hand assistance is the teacher physically helping the student indicate the response selection.
c. New learning: Ask them to think about their responses to the Content Differentiation steps provided for Access Points and Tasks in addition to looking through test specs and assessment administration manual for indication of baseline or pretesting.
d. Generalize across people/settings: This is asking if the tasks are designed to work across people and settings, or if they are designed at the lowest end (no student inference) as being answerable if one person gives them to a student.
Can more than one particular person present a task item and record responses for any given student?
e. Generalize across materials/activities: This is similar to above, only with regard to the materials and activities. Are the tasks designed to fit only one standard and in only one context, with no options for using different materials?
f. Standard setting: Have panelists review Access Points, scoring guidelines, and rubrics to determine if at the lowest option students could make proficiency by chance. Are the APs written such that students would not be proficient without having to show some independent learning?
g. Program quality: Does FL use program quality indicators and do those indicators influence a student’s score? Example of indicators that could impact a student score would be if the task prompt was part of the evaluation of the student’s score or if the completeness and accuracy of the student’s IEP were part of the scoring (program indicator).
Training Support Materials
Depth of Knowledge_rev_nov21 ELA ONLY.docx
Depth of Knowledge_rev_nov21 SOCIAL STUDIES ONLY.docx
Presentation Rubric_rev_nov21.pdf

Step 3 and 5 Access Point and Task Reviews
Category / Code / Description
Content Centrality
  1-Not aligned: AP/Task does not match standard content at all
  2-Partially aligned: AP/Task is not fully aligned to the standard content
  3-Fully aligned: AP/Task is a good match to standard content
Age Appropriateness
  I-Inappropriate: Content is off-grade level
  N-Neutral: Content is not age-bound, it is appropriate at any age or grade
  A-Adapted: Adapted from, or linked to, age/grade-level content
Performance Centrality
  N-None: AP/Task has no similar performance types
  S-Some: AP/Task has some similar performance types
  A-All: AP/Task has the same performance types
Symbolic Communication
  A-Awareness: Minimal intentional communication; teacher inferred
  E-Early symbolic: Recognizes some symbol-object relationships
  S-Symbolic: Has a broad knowledge of symbols, communicates picture or words through speech, assistive technology, signing
Accessibility
  Y-Yes: AP is accessible to all students
  N-No: Some students cannot access content (explain who & why)
Modifications or Supports
  Y-Yes: Modifications and supports can be provided for this Task.
  N-No: This Task is not amenable to supports or modifications without changing meaning or difficulty.
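The code table above lends itself to a simple validity check on a completed rating row. The sketch below is illustrative only: the dictionary layout, the function name check_rating_row, and the idea of checking rows programmatically are assumptions, since the study describes panelists entering codes by hand in Excel. The codes themselves, and the rule that a Content Centrality rating other than "Fully aligned" needs an explanation, come from the instructions.

```python
# Allowed codes per category, copied from the Step 3 and 5 code table.
ALLOWED_CODES = {
    "Content Centrality": {"1", "2", "3"},
    "Age Appropriateness": {"I", "N", "A"},
    "Performance Centrality": {"N", "S", "A"},
    "Symbolic Communication": {"A", "E", "S"},
    "Accessibility": {"Y", "N"},
    "Modifications or Supports": {"Y", "N"},
}


def check_rating_row(row, explanation=""):
    """Return a list of problems found in one rating row (hypothetical helper).

    `row` maps a category name to the code a panelist entered.
    """
    problems = []
    for category, code in row.items():
        allowed = ALLOWED_CODES.get(category)
        if allowed is None:
            problems.append(f"unknown category: {category}")
        elif str(code).upper() not in allowed:
            problems.append(f"invalid code {code!r} for {category}")
    # Ratings other than 'Fully aligned' (3) must be explained.
    centrality = str(row.get("Content Centrality", ""))
    if centrality in {"1", "2"} and not explanation.strip():
        problems.append("Content Centrality below 'Fully aligned' needs an explanation")
    return problems
```

An empty list means the row uses only codes from the table and carries an explanation wherever the instructions require one.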
Step 4 and 6 Content Differentiation (across grades) for APs and Tasks
Category / Description
Broader
  APs: Higher-grade APs reflect broader application of target skill/knowledge.
  Tasks: Higher tasks reflect broader application of target skill/knowledge (AP).
Deeper
  APs: Higher-grade APs reflect deeper mastery of the target skill/knowledge.
  Tasks: Higher tasks reflect deeper mastery of the target skill/knowledge (AP).
Prerequisite
  APs: Lower-grade APs target a prerequisite skill for mastery of the higher grade AP.
  Tasks: Lower tasks target a prerequisite skill for mastery of the AP.
New
  APs: The higher grade has a new skill or knowledge unrelated to skill/knowledge covered at prior grades.
  Tasks: The higher task has a new skill or knowledge that combined with the lower tasks allows for the complete AP.
Identical
  APs: Higher-grade APs appear identical to one of the lower-grade APs.
  Tasks: Higher tasks appear identical to one of the lower tasks in what a student is being asked to know/do.
Depth of Knowledge – ELA Reference Sheet revised 11/21/14
All items should be assigned a Depth of Knowledge level based on the information presented in the table below. Content clarification examples are not exhaustive and general performance verbs are not the defining criteria for Depth of Knowledge classification.
1 Attention
General Performance Verbs: touch, look, vocalize, repeat, attend
• Simple commands that require no answer—only require doing the command.
• Generally not assessed as a skill. Used to focus the student on a task.
Examples: Look at me. Listen while I read this story.
2 Rote Knowledge, Memorize & Recall
General Performance Verbs: list, identify, state, label, recognize, record, match, recall, retell
• Habitual response: recalls previously heard or learned information.
• Practiced, rote behavior.
• No inferences are required for correct answer.
• Habitual response of common day to day activities or objects.
English Language Arts
• Matches picture/word to picture/word.
• Identifies rhyming words.
• Identifies letters by phonics/sounds or sight.
• Identifies detail of text of 2-3 simple sentences using verbatim wording.
• Identifies correct spelling of misspelled word.
• Identifies misspelled common words.
• Identifies letters and phonetically regular, high frequency words (self-read).
Examples: Show me/tell me…
…which can you drink from? (book, cup, pen)
…what do you read? (book, desk, stapler)
…which pair of words rhyme?
3 Use of Knowledge and Information
General Performance Verbs: perform, tell, demonstrate, follow, count, locate, name, read, describe, define, spell
• Engagement of some mental processing beyond habitual response.
• Simple inferences may be needed.
• Uses information from a chart or graph to make simple inferences in order to correctly respond.
• Chooses what comes next in a sequence.
English Language Arts
Indicates comprehension of basic/common words or two to three word sentences.
Identifies main idea by applying information gained from text.
Identifies detail by making simple inferences.
Identifies a relevant or best sentence to add to passage.
Self-reads materials/passages.
Identifies best word to complete sentence.
Identifies initial word in sentence in need of capitalization.
Identifies the correct spelling of grade appropriate words presented in sentence.
Identifies prefixes/suffixes in words.
Identifies incorrectly used common punctuation.
Identifies basic punctuation including periods, commas, and question marks.
Examples: Show me/tell me… …what is the main idea?
…who is this story about?
…what fits in the blank of this sentence?
…what happens next in the story?
…which word in this sentence is misspelled?
…which word uses the prefix…
…which group of words has a comma?
…which word describes sound?
…which piece of evidence supports this claim?
4 Comprehension
General Performance Verbs: explain, conclude, group, categorize, restate, review, translate, describe, paraphrase, infer, summarize, illustrate, compute, classify, solve
• Strategic thinking: requires reasoning, planning a sequence of steps.
• Answer choices summarize and are not verbatim from passage.
English Language Arts
FROM INFORMATION THAT IS INFERRED:
• Identifies theme or message of a story.
• Identifies main idea by drawing conclusions or making inferences.
• Identifies elements of a story without definition of the element.
• Identifies purpose of writing passage.
• Selects best sentence(s) for middle or end of passage (correct order required).
• Orders three or more sentences to communicate logical sequence of events.
• Sorts or groups words or items with categories given.
• Identifies sentence that best supports topic.
• Identifies two or more sentences to complete a composition.
• Identifies correct meaning of words from context sentence.
• Edits for correct use of subject and verb agreement.
• Edits for correct use of singular and plural nouns.
• Identifies proper nouns and pronouns within sentences, and book titles in need of capitalization.
• Identifies correct usage of punctuation.
Examples: Show me/tell me…
…what is the main idea?
…who is this story about?
…what is the “plot” of this story?
…which of these is found inside a house and which are found outside a house? (bed, swing set, trees, car, computer)
Bed becomes a plural (more than one bed) by adding an “s”. …what would more than one tree be? (tree, treeses, trees)
…which sentence shows commas used correctly?
…which sentence provides the best conclusion by stating why the claim is significant?
5 Application
General Performance Verbs: organize, collect, apply, construct, use, develop, generate, interact with text, implement, compare, contrast
• Extended thinking: making connections within and between subject domains, non routine problem solving.
• Student generates answer without cues.
English Language Arts
• Makes connections between multiple sources.
• Compares events in two passages.
• Generates response.
• Implements a plan.
Examples: Show me/tell me…
…how the poem and the story are the same.
…how the structure of both passages is the same.
…how to revise this sentence using fewer words. (no response options)
6 Analysis Evaluation
General Performance Verbs: pattern, analyze, compose, predict, extend, plan, judge, evaluate, interpret, cause/effect, investigate, examine, distinguish, differentiate, generate
• Requires investigation.
• Student predicts based on information given.
• Student creates possible alternative outcomes.
• Student uses multiple sources to answer question without cues/supports.
• Generally, DOK levels of 6 will not be found on the assessment unless open response items that require investigation using two or more texts are assessed.
Examples:
…tell me another possible ending to the story (no options provided).
…what kind of science experiment can you do to find out how many hours of sun a seed needs to sprout?
Depth of Knowledge Rubric - Social Studies revised 6/2017
All items should be assigned a Depth of Knowledge level based on the information presented in the table below. Content clarification examples are not exhaustive and general performance verbs are not the defining criteria for Depth of Knowledge classification.
1 Attention
General Performance Verbs: Touch, look, vocalize, repeat, attend
Simple commands that require no answer—only require doing the command.
Generally not assessed as a skill. Used to focus the student on a task.
Examples: Look at me. Listen while I read this story.
2 Rote Knowledge, Memorize & Recall
General Performance Verbs: List, identify, state, label, recognize, record, match, recall, retell
Habitual response: recalls previously heard or learned information.
Practiced, rote behavior.
No inferences are required for correct answer.
Habitual response of common day to day activities or objects.
Social Studies
Matches pictures and/or words.
Identifies details from text (1-2 simple sentences) using verbatim wording.
Identifies familiar characteristics of time periods or situations.
Recognizes simple definitions of social studies related terms when definition is provided.
Examples
…what is something else that is built by people? (ship, rock, leaf)
…what is a manufactured good? (cats, shoes, trees)
What is a [law, rule, right, constitution, amendment]?
3 Use of Knowledge and Information
General Performance Verbs: Perform, tell, demonstrate, follow, count, locate, name, read, describe, define, spell
Engagement of some mental processing beyond habitual response.
Simple inferences may be needed.
Uses information from a chart or graph to make simple inferences in order to correctly respond.
Chooses what comes next in a sequence.
Social Studies
Identifies detail of text with 2-4 sentences requiring a slight inference or connection of ideas.
Indicates comprehension of common social studies content words or concepts.
Identifies the how, who, what, and/or why of governmental processes.
Identifies reasons or importance of events and/or actions.
Examples:
Why did (name of person) build a (name of structure or invention)?
What was one reason why the (name of event or situation) took place?
What is the process for making a (law, rule, constitutional amendment)?
Why is (law, rule, right, constitution, amendment) important?
4 Comprehension
General Performance Verbs: Explain, conclude, group, categorize, restate, review, translate, describe, paraphrase, infer, summarize, illustrate, compute, classify, solve
Strategic thinking—requires reasoning, planning a sequence of steps. Answer choices summarize and are not verbatim from passage.
Social Studies
Draws conclusions based on information provided in a chart, table, or diagram.
Uses information to complete a chart.
Identifies trends and/or changes in processes or in ways of life.
Identifies reasons and/or consequences of changes.
Examples:
Based on information in the chart, how has (process, occupation, way of living, law, constitution) changed over the years?
Which sentence best completes the chart?
What was one result of the change in (event, people living in area, law, economic situation, invention)?
5 Application
General Performance Verbs: Organize, collect, apply, construct, use, develop, generate, interact with text, implement, compare, contrast
Extended thinking—making connections within and between subject domains, non routine problem solving.
Student generates answer without cues.
Social Studies
Explains cause and effect relationships. Explain similarities. Explain differences.
Examples:
Based on the agreements, what would have happened if. . . ?
In what way are these two (people, organizations, laws, events, governmental programs) alike?
What is one difference between. . . ?
6 Analysis Evaluation
General Performance Verbs: pattern, analyze, compose, predict, extend, plan, judge, evaluate, interpret, cause/effect, investigate, examine, distinguish, differentiate, generate
Requires investigation.
Student predicts based on information given.
Student creates possible alternative outcomes.
Student uses multiple sources to answer question without cues/supports.
Generally, DOK levels of 6 will not be found on the assessment unless open response items that require investigation using two or more texts are assessed.
Examples:
…tell me another possible ending to the story (no options provided).
…what kind of science experiment can you do to find out how many hours of sun a seed needs to sprout?
Panelists reviewed the individual FSAA item tasks using the following rating form in electronic format. The format of the rating form was identical for each subject grade-level.