Exploring the affordances of 3D assessments in measuring three-dimensional science learning
Abstract
The introduction of the Next Generation Science Standards (NGSS) calls for three-dimensional learning in K–12 science, defined as the integration of disciplinary core ideas, science and engineering practices, and crosscutting concepts to foster higher proficiency in science. This marks a shift away from standards that equate ‘knowing’ science with the recitation of isolated facts and toward the contextualized application of essential ideas, so that students develop robust knowledge of complex scientific ideas and use scientific practices to construct and defend explanations and to solve problems. The NGSS promises to bring about meaningful change in American K–12 science education. However, the full adoption of three-dimensional standards, curriculum, and instruction also necessitates the development of assessments to evaluate the efficacy of these reform efforts. Unfortunately, traditional science assessments rely heavily on selected-response items (e.g., multiple-choice questions), which are generally regarded as inadequate for evaluating proficiency in something as complex as three-dimensional learning. Alternative assessment formats (e.g., constructed-response, essay) allow for more complex constructed responses from students. Tests that require hands-on manipulation of materials (e.g., practical assessments) are better suited to providing evidence of student proficiency in three-dimensional learning, but they require investments of time, staffing, and other resources that render them impractical for uniform classroom use or for scaling to large-scale (e.g., statewide) implementation. Assessment development is therefore moving toward technology-based platforms as a way to develop and deliver assessments that provide more complex user (student) interactions. Because these platforms can collect and analyze large amounts of data quickly and efficiently, the resulting assessment data could serve as effective evidence supporting claims about student proficiency in three-dimensional learning.
This study represents a preliminary effort to explore a sample of items developed using Evidence-Centered Design and created specifically to evaluate three-dimensional learning as represented by the NGSS. Semi-structured cognitive interviews were used to evaluate the appropriateness of the data collected from these sample items. The results indicated that the new assessment items were at least as accurate as multiple-choice questions in sorting students: students who demonstrated high proficiency in the topics scored highly on the test items, and vice versa. Furthermore, individual items proved capable of producing appropriate evidence for inferences about student proficiency. Students tended to respond correctly to items tied to specific content when they demonstrated proficiency in that content, and to respond incorrectly to items aligned with content in which they demonstrated misconceptions or a general lack of proficiency. This study provides preliminary evidence that technology can be used to deliver assessment items offering more complex interactions than simple selected-response items, and that these items can quickly and efficiently generate appropriate data for evidentiary arguments about student proficiency in three-dimensional science learning. Further research is needed to examine larger samples of more varied item types with greater numbers and a wider variety of students. Additionally, the methodology employed here, using cognitive interviews to evaluate and improve the quality of assessment items, could be expanded and refined through additional, larger studies.