A Comparative Analysis of ChatGPT-4/4.5 and Human-written Summaries in Linguistic Research

Aleksa Stošić1[0009-0005-4601-0401], Marija Milojković1[0000-0003-4787-1017], and Aleksandra Branković1[0009-0007-7810-5581]
1 Belgrade Metropolitan University, Tadeuša Košćuška 63, Belgrade 11158, Serbia
aleksa.stosic@metropolitan.ac.rs
marija.milojkovic@metropolitan.ac.rs
aleksandra.brankovic@metropolitan.ac.rs
DOI: 10.46793/eLearning2025.205S

 

Abstract. This study evaluates the potential of ChatGPT-4 and ChatGPT-4.5 as research assistants in applied linguistics (AL) by examining their ability to generate annotated bibliographies of research articles. Five AL papers on technology in English pronunciation and speaking instruction were summarized by both models and by human researchers, producing 25 summaries. Fourteen expert raters assessed the summaries for quality and judged their authorship. Results show that both models produced factually accurate and structurally faithful summaries. However, both models lacked critical selectiveness, could only provide generalized statements on relevance, and relied on surface-level markers to assess credibility. Quantitative analysis indicated that ChatGPT summaries were rated as comparable in quality to human-authored ones, though inter-rater agreement was low and a bias against texts perceived as AI-generated was observed. Qualitative findings revealed that experts distinguished AI from human summaries based on information density, word choice, stylistic naturalness, and evaluative engagement. Overall, ChatGPT proved advantageous in accuracy, structural consistency, and efficiency, but its weaknesses in evaluative depth and authenticity suggest that, while it can accelerate the early stages of literature review, it cannot substitute for the nuanced judgment and interpretive reasoning required in applied linguistics.

Keywords: ChatGPT, applied linguistics, research assistant, research summarization, annotated bibliography.

 


Source: Proceedings of the 16th International Conference on e-Learning (ELEARNING2025)