
Systems of Assessments for Measuring Multidimensional Science Standards


By Matt Silberglitt and Kevin King of WestEd’s Assessment Design and Development team

The conceptual framework for multidimensional science education standards has been central to the science education landscape for about 10 years now. In schools and classrooms responding to these standards, significant shifts have changed the nature of science teaching and learning. More than ever, science teaching in these classrooms is grounded in real-world scientific phenomena, in learning that connects ideas across topics and domains of science, and in students actively “doing science” to:

  • Ask and answer questions,
  • Investigate and solve problems,
  • Communicate scientific ideas, and
  • Gather evidence
    • Use that evidence to develop models of science systems
    • Explain the mechanisms behind scientific phenomena
    • Engage in argumentation that supports or refutes scientific claims

Assessments aligned to these multidimensional standards have mostly been developed in the last five to eight years, including classroom assessments that are embedded into curricula or at the end of units, various interpretations of interim assessments, and large-scale assessments developed at the state level. Additionally, evaluation tools and open-source instructional materials have been developed, and groups working to advance various components of balanced assessment systems have emerged.

This post focuses on a balanced assessment system and one state’s efforts to build such a system from the bottom up, as recommended by the National Research Council’s Board on Testing and Assessment (BOTA) and Board on Science Education (BOSE).

Assessment timeline

Assessment is much more than just a test

Assessment includes a wide range of activities, from the moment-by-moment monitoring that teachers use to assess student learning informally during a lesson or activity, to the largest-scale national and international assessment programs such as NAEP and PISA. A system of assessments should be built to cross this range in a way that is purposefully and intentionally coherent. And each part of the system should reinforce the others through shared, developmentally appropriate goals and expectations while requiring a variety of strategies to serve different purposes.

The state of Delaware, taking the “bottom-up” approach recommended by BOTA and BOSE, has developed an assessment system that has three main components:

  • Classroom-embedded instructional assessments coupled with assessment literacy professional learning
  • End-of-unit assessments (EoUs), which are interim-focused
  • A state-level summative integrative transfer assessment (ITA)

At each level, the assessments are aligned to the Next Generation Science Standards (NGSS) and provide information about student knowledge and abilities related to those standards. Additionally, each level incorporates different strategies for the development and use of the assessments.

Classroom-embedded assessments. At this first level, classroom-embedded assessments were developed for teachers and by teachers, with guidance from state and national experts in science education and assessment. The assessments are shared within a community of science educators across Delaware, and teachers select which assessments they will use and when they will use them. Teachers also score the assessments and use the data to make instructional decisions, such as adjusting the content and timing of subsequent lessons and assessments.

End-of-unit assessments. At the next level, EoUs were developed by experts in science assessment. However, they were informed by instructional content used across Delaware and by the structure of summative test components, vetted by teachers, and field-tested in their classrooms. The results of these field tests are just now being collected and will be used by committees of teachers to evaluate the quality of the assessments and to recommend improvements.

Like the classroom-embedded instructional assessments, using EoUs is entirely in the hands of the classroom teacher. With this flexibility, EoUs can be used to evaluate learning at the end of a unit, monitor progress across units, and—at the teacher’s discretion and in response to the needs of their classroom—inform instructional decision-making for the unit the EoU addresses, for subsequent units, or for practices and crosscutting concepts across instructional units.

Information about student knowledge and skills is available at the performance expectation (PE) level (for the PEs assessed by the EoU) and at the practice and crosscutting concept levels. This level of the system provides a view of a student’s progress in their ability to make sense of scientific phenomena by engaging in science practices and applying understandings that cut across topics and domains. By design, the EoUs provide the most detailed, fine-grained picture of student understanding.

State-level summative assessments. Design features of the ITA, the highest level of the assessment system, inform design features of the EoUs, but the two types of assessments have distinctive and purposeful differences. Like the EoUs, the ITAs are developed by experts and vetted by Delaware’s educators. However, each ITA is designed to assess the breadth of standards at benchmark grade levels across the full instructional range (grades 5 and 8, high school biology) rather than focus on smaller units of instruction at each grade level. For security reasons common to large-scale assessments, the ITA also has a continuous development cycle, with different forms each year that are equated to one another.

Perennial results at this level help educators recognize areas of continued strength in the implementation of curricula and in instruction. The results also help guide shifts from year to year that address areas of weakness at the school, district, or state level. ITAs focus on the transfer of core scientific understandings, and their results are reported only as scale scores and proficiency levels for students (and then aggregated to other levels).

A thoughtful theory of action for the Delaware science assessment system

Since Delaware first envisioned this system in 2014, the state has had a thoughtfully crafted theory of action that has guided decision-making as the system components have been developed over time.

  • Developing a theory of action for the assessment system began with understanding that the primary role of the assessment system is to provide timely feedback within the larger science education system.
  • Within the system and with its overall role in mind, each of the components and their roles were identified. The role of each component reflects its intended use within feedback cycles at different time scales. Developing the system with these roles and uses in mind has ensured alignment between instructional goals and assessment claims.
  • Teachers develop capacity by developing tools as well as using them. Three things together will help teachers better understand instructional standards and the applicability of reporting: having a system of assessments; measuring the appropriate information at the right time and providing only the information that is truly actionable at the right aggregate levels; and involving teachers in the development, review, and scoring of assessments in the right ways. Teachers will then improve their instruction and their response to assessment data in order to improve student learning and transfer understanding of complex skills and content (as represented by practices, core ideas, and crosscutting concepts) as intended by the standards.
  • Having a thoughtfully constructed system supported by all levels of the instructional system (state, district, school, teachers) will improve educators’ understanding of expectations in order to best facilitate students’ deep learning and transfer understanding.

An assessment system is core to Delaware’s vision for its science education system, so professional learning related to the building and implementation of each component of the system has been central. In addition, the work of Delaware’s assessment team has intersected with the state’s instructional and curriculum teams to make sure that NGSS training for educators and administrators is designed and implemented coherently.

Successful instruction and learning for students in Delaware will continue to be enhanced through the implementation of the state’s effective, coherent, and integrated system of assessments.



Matt Silberglitt, Manager of Science Assessment with WestEd’s Assessment Design and Development team, manages activities for science assessment development projects.

Kevin King is a Senior Project Leader in WestEd’s Assessment Design and Development team and a co-lead for the State Science Assessment Solutions team.
