
LARC 2025 Conference Presentation: Comparability of Computer-Based and Print-Based Testing Modalities of a Scripted Oral English Language Proficiency Assessment

Date & Time: April 3, 2025 | 11:00–11:30 a.m. PDT
Location: California State University, Fullerton
Presenter(s):
- Yage Leah Guo (Center for Applied Linguistics)
- Reshmi Kompakha (Center for Applied Linguistics)
- Elyssa Sun (Center for Applied Linguistics)
- Anna Zilberberg (Center for Applied Linguistics)
Description:
A robust validity argument for test score interpretations must include a careful examination of potential sources of construct-irrelevant variance (CIV), i.e., variance introduced by processes extraneous to the test's intended purpose (AERA, APA, & NCME, 2014; Guidelines for Technology-Based Assessments, 2022). When different assessment modalities (such as print-based and computer-based) are used concurrently, it is essential to ensure that the modality itself does not become a source of CIV and that scores obtained via the different modalities remain comparable (Lottridge et al., 2010).
The current study examined score comparability between two versions of an Oral English Language Proficiency Assessment designed for adults learning English in the U.S.: a computer-adaptive multi-stage test (MST) version and a semi-adaptive print-based test (PBT) version. Both versions are currently used by adult education ESL programs throughout the nation to track examinee progress, make placement decisions, and report results via the National Reporting System (NRS). The assessment is a complex performance test of integrated listening and speaking skills and currently exists in two forms, with two versions (MST and PBT) of each form. Both versions of the same form contain the same tasks, measuring NRS Educational Functioning Levels 1 through 4. The main difference lies in how routing decisions are made: in the MST, an underlying IRT-based algorithm decides which level of tasks (Level 1, 2, 3, or 4) is appropriate for the examinee after each stage, based on the current estimate of the examinee's ability. In the PBT, the test administrator makes that decision only once, after administering the seven locator questions drawn from MST Stage 1 and Stage 2. A sketch of this routing logic appears below.
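To make the stage-routing idea concrete, here is a minimal, illustrative Python sketch. The Rasch-model response function, the grid-search theta estimate, and the `LEVEL_CUTS` values are all assumptions made for illustration; the operational test's IRT model, estimation method, and cut points are not described in this abstract.

```python
import math

# Hypothetical theta cut points separating task Levels 1-4 (assumed values):
# theta <= -1.0 -> Level 1, ..., theta > 1.0 -> Level 4.
LEVEL_CUTS = [-1.0, 0.0, 1.0]

def rasch_prob(theta: float, b: float) -> float:
    """Probability of a creditable response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, difficulties, lo=-4.0, hi=4.0, steps=200):
    """Crude maximum-likelihood theta estimate over a grid (illustrative only)."""
    best_theta, best_ll = lo, float("-inf")
    for i in range(steps + 1):
        theta = lo + (hi - lo) * i / steps
        ll = 0.0
        for x, b in zip(responses, difficulties):
            p = rasch_prob(theta, b)
            ll += math.log(p) if x else math.log(1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

def route_next_stage(responses, difficulties):
    """After a completed stage, pick the task level (1-4) for the next stage."""
    theta = estimate_theta(responses, difficulties)
    level = 1 + sum(theta > cut for cut in LEVEL_CUTS)
    return level, theta

# Example: five scored responses from a completed stage (hypothetical data).
resp = [1, 1, 0, 1, 0]
diffs = [-0.5, 0.0, 0.5, 0.2, 0.8]
level, theta = route_next_stage(resp, diffs)
print(f"estimated theta = {theta:.2f}, route to Level {level} tasks")
```

In the PBT, the equivalent of `route_next_stage` is performed once by the administrator after the seven-question locator, rather than by an algorithm after every stage.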
A total of 46 adult learners of English, representing a range of English proficiency levels, took the test twice: once with MST Form 1 and once with PBT Form 2. The demographic breakdown of the sample was similar to that of ESL programs nationwide (U.S. Department of Education, 2020a). Two trained test administrators participated in the study, and all tests were administered and recorded via Zoom. Results indicated no effect of form administration order: examinees performed similarly regardless of whether they took a given form first or second. The average scale scores across the two test versions were very close, the correlation between the two sets of scores was very high (0.97), and the paired-difference mean was small and non-significant. Furthermore, 100% of examinees were classified into the same or an adjacent NRS level on the basis of their MST and PBT performances. In summary, the results of this comparability study suggest that examinees were not disadvantaged by taking one version or the other and that the MST and PBT versions produce comparable results.
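For readers who want to run this kind of comparability analysis on their own paired scores, the sketch below shows the core computations reported above (Pearson correlation, paired-difference t-test, and same/adjacent NRS-level agreement) in Python. The simulated scores, the 500-point scale, and the `score_to_nrs_level` cut points are hypothetical stand-ins; the study's actual data and cut scores are not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical paired scale scores for 46 examinees (simulated here).
mst = rng.normal(500, 40, 46)
pbt = mst + rng.normal(0, 10, 46)  # highly correlated, small differences

r, _ = stats.pearsonr(mst, pbt)    # cross-modality correlation
t, p = stats.ttest_rel(mst, pbt)   # paired-difference test
mean_diff = np.mean(mst - pbt)

def score_to_nrs_level(score):
    """Map a scale score to an NRS level using assumed cut scores (1-4)."""
    cuts = [460, 500, 540]
    return 1 + sum(score > c for c in cuts)

levels_mst = np.array([score_to_nrs_level(s) for s in mst])
levels_pbt = np.array([score_to_nrs_level(s) for s in pbt])
pct_adjacent = np.mean(np.abs(levels_mst - levels_pbt) <= 1) * 100

print(f"r = {r:.2f}, mean diff = {mean_diff:.2f}, t = {t:.2f}, p = {p:.3f}")
print(f"% same or adjacent NRS level: {pct_adjacent:.0f}%")
```

With highly correlated paired scores such as these, the analysis yields a near-zero mean difference and high classification agreement, mirroring the pattern the study reports.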