
AI for STEM Projects

PASTA: Collaborative Research: Supporting Instructional Decision Making: Potential of An Automatically Scored Three-dimensional Assessment System

Funder: National Science Foundation


This project studies the utility of a machine learning-based assessment system for supporting middle school science teachers in making instructional decisions based on automatically generated student reports (AutoRs). The assessments target three-dimensional (3D) science learning, requiring students to integrate scientific practices, crosscutting concepts, and disciplinary core ideas to make sense of phenomena or solve complex problems. The project will develop computer scoring algorithms, a suite of AutoRs, and an array of pedagogical content knowledge supports. These products will assist middle school science teachers in using 3D assessments, making informed instructional changes, and improving students’ 3D learning. The project will generate knowledge about teachers’ uses of 3D assessments and examine the potential of automatically scored 3D assessments.

International Conference for AI-based Assessment in STEM Education

Funder: National Science Foundation


Achieving the three-dimensional learning goals of the Framework for K-12 Science Education requires transforming assessment practices from reliance on multiple-choice items to performance-based, knowledge-in-use tasks. However, these performance-based constructed-response items often preclude timely feedback, which in turn has hindered science teachers from adopting such assessments. Artificial intelligence (AI) has demonstrated great potential to meet this assessment challenge. To tackle it, experts in assessment, AI, and science education gathered for a two-day conference at the University of Georgia to generate knowledge about integrating AI into science assessment.

Does AI Have a Bias? A Critical Examination of Scoring Bias of Machine Algorithms on Students from Underrepresented Groups in STEM (SURSs)

Funder: NAEd/Spencer Research Development Award


The NAEd/Spencer project will answer two questions: (a) Are artificial intelligence (AI) algorithms more biased than humans when scoring the drawn models and written explanations that students from underrepresented groups in STEM (SURSs) produce in scientific modeling practice? (b) Are AI algorithms more sensitive to the linguistic and cultural features of the assessments than human experts? Two sets of assessments aligned with the Next Generation Science Standards, with varying critical cultural features, will be developed. Student responses will be collected from schools where almost half of the students are SURSs. I will compare machine severity in scoring SURSs’ responses against standard scores (e.g., human consensus scores), and examine how item cultural features interact with machine scoring capacity as compared to human raters. The findings will inform understanding of the potential bias introduced by AI algorithms. Using knowledge gained in this project, educators can identify strategies to improve culturally responsive assessments and justify the use of AI to develop more inclusive and equitable science learning.
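As a rough illustration of the kind of severity comparison this project proposes (a minimal sketch, not the project’s actual analysis), the hypothetical example below computes, for each student group, the gap between machine-assigned scores and human consensus scores and tests whether the machine is systematically more severe for one group. All column names and data values are invented for illustration.

```python
# Hypothetical sketch: compare machine vs. human-consensus scoring severity
# across student groups. Data and column names are illustrative assumptions.
import pandas as pd
from scipy import stats

# Each row: one student response with a human consensus score,
# a machine score, and a group flag (SURS vs. non-SURS).
df = pd.DataFrame({
    "human_consensus": [3, 2, 4, 1, 3, 2, 4, 3],
    "machine_score":   [3, 1, 4, 1, 2, 2, 4, 3],
    "surs":            [1, 1, 1, 1, 0, 0, 0, 0],
})

# Severity gap: positive values mean the machine scored lower than humans did.
df["severity"] = df["human_consensus"] - df["machine_score"]

surs = df.loc[df["surs"] == 1, "severity"]
non_surs = df.loc[df["surs"] == 0, "severity"]

# Welch's t-test: does machine severity differ between the two groups?
t, p = stats.ttest_ind(surs, non_surs, equal_var=False)
print(f"mean severity SURS={surs.mean():.2f}, "
      f"non-SURS={non_surs.mean():.2f}, t={t:.2f}, p={p:.3f}")
```

A real analysis would use far more responses and would also model item-level cultural features, as the project description indicates.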

siSTEMas: Stimulating Immersive Science Through Engaging Multilingual and Authentic Scenarios

Funder: National Institutes of Health


The siSTEMas project aims to support and equip underrepresented students, with a particular focus on Latinx Multilingual Learners (LMLs), to enter and persist in the STEM pipeline. The project is organized around four specific aims: (1) creating two new versions of Virtual Vet, a narrative-rich read-aloud version and a Spanish version, to reach a more diverse student population with the international, award-winning serious game; (2) developing a new immersive environment, Virtual Vet Middle Grades, that targets a deeper understanding of the human body through the study of genetics; (3) creating a responsive, customized environment in Virtual Vet that leverages deep learning approaches to provide timely feedback to students and teachers; and (4) developing a five-day STEM camp, hosted by the genetics department on the University of Georgia’s campus, that provides inclusive and ambitious science learning experiences and specifically supports LML students through instruction in English and Spanish. Through partnerships with six school districts, more than 6,000 students will gain access to Virtual Vet and Virtual Vet Middle Grades.

Visit the siSTEMas website

ArguLex – Applying Automated Analysis to a Learning Progression for Argumentation

Funder: National Science Foundation


This NSF-funded project applies lexical analysis and machine learning technologies to develop an efficient, valid, reliable, and automated measure of middle school students’ abilities to engage in scientific argumentation. The project builds upon prior work that developed high-quality assessments for a learning progression for argumentation. These assessments are time- and resource-intensive to score, but paired with automated scoring they will allow measurement of argumentation to be taken to scale with rapid formative feedback, making them an invaluable resource for STEM teachers, researchers, and teacher educators. The project brings together BSCS researchers with experience measuring argumentation; Michigan State University’s Automated Analysis of Constructed Response research group, which provides expertise in refining analysis for formative educational purposes across a range of scientific disciplines; Stanford University, which provides expertise in learning progressions for argumentation in science; and Western Michigan University, which serves as the external evaluator.
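To give a flavor of what lexical-analysis-based automated scoring can look like (a minimal sketch under assumed data, not ArguLex’s actual pipeline), the example below trains a bag-of-words classifier on human-scored constructed responses and uses it to score a new one. The responses and rubric labels are invented.

```python
# Minimal sketch of automated scoring of constructed responses:
# TF-IDF lexical features feeding a logistic-regression classifier.
# Training data here is invented; a real system would train on
# thousands of human-scored responses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = [
    "The claim is supported because the data show the temperature rose.",
    "I think it happened because it did.",
    "The evidence from trial two contradicts the claim, so the claim is weak.",
    "Plants grow.",
]
human_scores = [2, 0, 2, 0]  # rubric levels assigned by human raters

scorer = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # word and bigram features
    LogisticRegression(max_iter=1000),
)
scorer.fit(responses, human_scores)

new_response = ["The data in the graph support the claim that mass is conserved."]
print("predicted rubric level:", scorer.predict(new_response)[0])
```

Once trained against human scores, such a model can return a rubric level in milliseconds, which is what makes the rapid formative feedback described above feasible at scale.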

Examining AI Scoring Bias in PISA Tests of Both Germany and the US: Does Non-Native Background Matter?

Funder: Alexander von Humboldt Foundation, Germany


The increasing demand for a workforce capable of critical thinking in STEM requires learners to develop competence in science. Supporting students in developing such competence is widely acknowledged to require high-quality assessments, such as constructed-response tasks, that go beyond multiple-choice items. This transformation is reflected in the Program for International Student Assessment (PISA). To facilitate timely feedback and automatic scoring, this project employs artificial intelligence (AI), focusing on examining scoring bias. While AI has demonstrated great potential to automate scoring, equity and ethical issues arise that are particularly pronounced for students from non-native backgrounds (non-NBSs), who may be disadvantaged by lower proficiency in the test language and by cultural differences. Human-assigned scores on their constructed responses may be biased and reflect these disadvantages, and because AI algorithms are usually trained on human-coded responses, AI is likely to exhibit similar bias, if not worse.

Can AI decrease the potential bias against non-NBSs given its known reliability, or will AI generate more biased scores? To answer this question, we will apply many-facet item response theory (IRT) models to the PISA 2015 science data for both Germany and the U.S. The findings will indicate whether AI algorithms are more severe than “ideal human experts” when scoring non-NBSs’ work. We will also analyze the linguistic and cultural features of the PISA assessment tasks and examine whether AI algorithms are more sensitive to students’ cultural backgrounds than human experts. Using the knowledge learned in this project, educators and policymakers can justify the use of AI in high-stakes tests such as PISA and identify strategies to improve the culturally responsive features of assessments so that AI algorithms can score items accurately.
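As a simplified illustration of the many-facet idea (a sketch under invented data, not the project’s analysis; real many-facet Rasch/IRT estimation would use dedicated psychometric software), the example below models a dichotomous score as a logistic function of person ability, item difficulty, and rater severity, and estimates all facets jointly by logistic regression on dummy-coded facets. A rater facet with a more negative coefficient depresses the probability of a correct score, i.e., that rater is more severe.

```python
# Simplified many-facet sketch: P(correct) = sigmoid(ability - difficulty - severity).
# Facets are dummy-coded and estimated jointly via logistic regression.
# All data are simulated; real analyses use dedicated MFRM/IRT software.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
persons, items, raters = 50, 4, 3
ability = rng.normal(0, 1, persons)
difficulty = np.array([-0.5, 0.0, 0.5, 1.0])
severity = np.array([0.0, 0.3, -0.2])  # rater 0 is the reference

rows = []
for p in range(persons):
    for i in range(items):
        for r in range(raters):
            logit = ability[p] - difficulty[i] - severity[r]
            score = rng.random() < 1 / (1 + np.exp(-logit))
            rows.append({"person": p, "item": i, "rater": r, "score": int(score)})
df = pd.DataFrame(rows)

# Dummy-code each facet, dropping one reference level per facet for identifiability.
X = pd.get_dummies(df[["person", "item", "rater"]].astype(str), drop_first=True)
model = LogisticRegression(max_iter=2000, C=1e6)  # near-unpenalized fit
model.fit(X, df["score"])

# Coefficients on rater dummies estimate severity relative to rater 0
# (more negative coefficient -> lower P(correct) -> more severe rater).
cols = list(X.columns)
for col in [c for c in cols if c.startswith("rater_")]:
    print(col, round(model.coef_[0][cols.index(col)], 2))
```

In the project itself, the rater facet would contrast AI algorithms with human experts, and a non-native-background indicator would enter the model to test whether machine severity differs for non-NBSs.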
