What hinders practicing English in virtual reality: a data-driven analysis of performance and errors among engineering students
Abstract
The study aims to identify the factors that hinder engineering students with intermediate English proficiency (A2-B1) from completing a learning dialogue in English as a foreign language within a virtual reality (VR) environment, using an analysis of error patterns based on scenario performance statistics. It examines user results for the scenario “Basic information about the company” in two modes (demo testing and final exam) and compares the performance distributions using descriptive statistics and visual analytics (histograms, kernel density estimation, and box plots). The error analysis distinguishes two key error types and describes them quantitatively for each dialogue step: response selection errors (choosing an incorrect reply option) and speech recognition errors (failure of the system to accept or recognise a correctly spoken answer). The scientific novelty of the study is that it proposes a Recognition Dominance Index (RDI) and applies it to real VR dialogue interaction logs, collected from engineering students at RUDN during an English language training course, in order to estimate the contribution of the technical factor (automatic speech recognition, ASR) to overall failure and to separate it from learning-related difficulties. The results show that overall task difficulty is comparable between the demo and the exam (median performance around 65-67%); however, exam scores are more concentrated in the 60-75% range, while demo outcomes show higher variability and include low-score outliers. Speech recognition proved to be the main bottleneck of the scenario: on average, recognition errors occurred more frequently than selection errors (approximately 41.8% versus 30.2%), and the RDI indicated that failure was predominantly recognition-driven (approximately 76% on average).
At the level of individual utterances, the highest non-recognition rates were observed in the closing expressions of gratitude and in questions about the company’s specialisation, whereas the highest incorrect-selection rates were associated with steps that required strict adherence to academic etiquette and precise wording. The findings suggest that successful completion of the VR dialogue is hindered less by gaps in users’ content knowledge than by speech recognition limitations and response design features, which highlights the need to improve the ASR component and scenario annotation when assessing communicative skills.
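The abstract does not give the formula for the RDI, so the following is only an illustrative sketch of one plausible reading: for each dialogue step, the share of all observed errors that are recognition errors. The function name, the step names, and the error rates are hypothetical and are not data from the study.

```python
from statistics import mean


def recognition_dominance_index(recognition_rate: float, selection_rate: float) -> float:
    """Share of a step's errors attributable to speech recognition.

    Assumed definition (not taken from the article):
    recognition_rate / (recognition_rate + selection_rate).
    """
    total = recognition_rate + selection_rate
    if total == 0:
        raise ValueError("RDI is undefined for a step with no errors")
    return recognition_rate / total


# Hypothetical per-step error rates: (recognition error rate, selection error rate).
steps = {
    "greeting": (0.50, 0.10),
    "company_specialisation": (0.60, 0.20),
    "closing_thanks": (0.30, 0.30),
}

# Per-step index, then the average across steps.
rdi_by_step = {name: recognition_dominance_index(r, s) for name, (r, s) in steps.items()}
mean_rdi = mean(rdi_by_step.values())

# Pooled index: computed from the summed rates rather than step by step.
pooled_rdi = recognition_dominance_index(
    sum(r for r, _ in steps.values()),
    sum(s for _, s in steps.values()),
)
```

Averaging per-step ratios and taking the ratio of pooled rates generally give different values. Under this assumed definition, the average rates reported above (41.8% and 30.2%) would give a pooled value of about 58%, so the reported RDI of approximately 76% presumably rests on a different definition or aggregation than this sketch.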
About this article
Publication history
- Received: March 1, 2026.
- Published: April 8, 2026.
Keywords
- virtual reality (VR) language learning
- dialogue-based training
- learning analytics
- speech recognition (ASR) errors
- performance assessment
Copyright
© 2026 The Author(s)
© 2026 Gramota Publishing, LLC