Generative AI-Based Software Testing Implementation Model in Agile Development Methodology: Systematic Literature Review
Published
Apr 29, 2026
Abstract
The development of Artificial Intelligence (AI) technology, particularly Generative AI and Large Language Models (LLMs), has had a significant impact on the software development process, including software testing activities. This study conducts a Systematic Literature Review (SLR) to identify and analyze Generative AI-based software testing frameworks applied in Agile development environments. The literature search was conducted in several major scientific databases, namely IEEE Xplore, Scopus, ScienceDirect, SpringerLink, and Google Scholar, covering publications from 2021 to 2026. Based on a selection process using the PRISMA method, 35 articles met the inclusion criteria. The analysis shows that most studies use Large Language Models (LLMs), machine learning, generative models, and retrieval-augmented generation (RAG) techniques to support software test automation, including test case generation, unit testing automation, and behavior-driven development (BDD) testing. In addition, several studies developed AI assistant and agentic AI-based frameworks that can be integrated into Agile development pipelines. This review finds that Generative AI can improve the efficiency of the testing process, accelerate test case generation, and help improve software quality. However, several challenges remain, such as limited model accuracy, the need for high-quality datasets, and concerns about the security and reliability of AI-based systems.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright:
Authors who publish their manuscripts in this journal agree to the following terms:
- Copyright for each article remains with the author.
- The author acknowledges that Ranah Research: Journal of Multidisciplinary Research and Development has the right of first publication under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- Authors may distribute the article separately and arrange non-exclusive distribution of the published manuscript in other versions (e.g., depositing it in the author's institutional repository, publishing it in a book, etc.), with an acknowledgment that the manuscript was first published in Ranah Research.
References
Ahlgren, T. L., Sunde, H. F., Kemell, K. K., & Nguyen-Duc, A. (2025). Assisting early-stage software startups with LLMs. Information and Software Technology, 187, 107832. https://doi.org/10.1016/j.infsof.2025.107832
Ardic, B., Brandt, C., Khatami, A., Swillus, M., & Zaidman, A. (2025). The qualitative factor in software testing: A systematic mapping study of qualitative methods. Journal of Systems and Software, 227, 112447. https://doi.org/10.1016/j.jss.2025.112447
Banh, L., Holldack, F., & Strobel, G. (2025). Copiloting the future: How generative AI transforms software engineering. Information and Software Technology, 183, 107751. https://doi.org/10.1016/j.infsof.2025.107751
Baralla, G., Ibba, G., & Tonelli, R. (2024). Assessing GitHub Copilot in Solidity development: Capabilities, testing, and bug fixing. IEEE Access, 12, 164389–164411. https://doi.org/10.1109/ACCESS.2024.3486365
Dakhel, A. M., Nikanjam, A., Majdinasab, V., Khomh, F., & Desmarais, M. C. (2024). Effective test generation using pre-trained large language models and mutation testing. Information and Software Technology, 171, 107468. https://doi.org/10.1016/j.infsof.2024.107468
Dong, Y., Jiang, X., Jin, Z., & Li, G. (2024). Self-collaboration code generation via ChatGPT. ACM Transactions on Software Engineering and Methodology, 33(7), 189. https://doi.org/10.1145/3672459
Durrani, U. K., Akpinar, M., Bektas, H., & Saleh, M. (2025). Impact of artificial intelligence on software engineering phases and activities. IEEE Access, 13, 95535–95547. https://doi.org/10.1109/ACCESS.2025.3574462
Esposito, M., Li, X., Moreschini, S., Ahmad, N., Cerny, T., Vaidhyanathan, K., Lenarduzzi, V., & Taibi, D. (2026). Generative AI for software architecture. Journal of Systems and Software, 231, 112607. https://doi.org/10.1016/j.jss.2025.112607
Jang, W., & Kim, R. Y. C. (2025). Automatic test case generation mechanism with natural language-based Korean requirement specifications. IEEE Access, 13, 177305–177317. https://doi.org/10.1109/ACCESS.2025.3620431
Karpurapu, S., Myneni, S., Nettur, U., Gajja, L. S., Burke, D., Stiehm, T., & Payne, J. (2024). Comprehensive evaluation and insights into the use of LLMs in BDD acceptance test formulation. IEEE Access, 12, 58715–58721. https://doi.org/10.1109/ACCESS.2024.3391815
Krebs, R., & Mazumdar, S. (2025). Analyzing LLM-generated code according to ISO/IEC 5055:2021 categories. IEEE Access, 13, 202482–202499. https://doi.org/10.1109/ACCESS.2025.3637569
Langdon, W. B., & Clark, D. (2025). Deep imperative mutations have less impact. Automated Software Engineering, 32(1), 6. https://doi.org/10.1007/s10515-024-00475-4
Mårtensson, T. (2025). So much more than test cases – An industrial study on testing of software units and components. Journal of Systems and Software, 228, 112479. https://doi.org/10.1016/j.jss.2025.112479
Mastropaolo, A., Escobar-Velásquez, C., & Linares-Vásquez, M. (2025). From triumph to uncertainty: The journey of software engineering in the AI era. Communications of the ACM, 34(5), 131. https://doi.org/10.1145/3709360
Mojahed, S., Drouin, R., & Sboui, L. (2025). ODACE-RMS: A remote web-based platform for automated multi-device Android testing and certification. IEEE Access, 13, 99863–99878. https://doi.org/10.1109/ACCESS.2025.3576823
Murillo, J. M., Garcia-Alonso, J., Moguel, E., Barzen, J., Leymann, F., Ali, S., et al. (2025). Quantum software engineering: Roadmap and challenges ahead. Communications of the ACM, 34(5), 154. https://doi.org/10.1145/3712002
Nettur, S. B., Karpurapu, S., Nettur, U., & Gajja, L. S. (2025). Cypress Copilot: Development of an AI assistant for boosting productivity and transforming web application testing. IEEE Access, 13, 3215–3229. https://doi.org/10.1109/ACCESS.2024.3521407
Pezzè, M., Abrahão, S., Penzenstadler, B., Poshyvanyk, D., Roychoudhury, A., & Yue, T. (2025). A 2030 roadmap for software engineering. Communications of the ACM, 34(5), 118. https://doi.org/10.1145/3731559
Quin, F., Weyns, D., Galster, M., & Silva, C. C. (2024). A/B testing: A systematic literature review. Journal of Systems and Software, 211, 112011. https://doi.org/10.1016/j.jss.2024.112011
Sharma, T., Kechagia, M., Georgiou, S., Tiwari, R., Vats, I., Moazen, H., & Sarro, F. (2024). A survey on machine learning techniques applied to source code. Journal of Systems and Software, 209, 111934. https://doi.org/10.1016/j.jss.2023.111934
Sisomboon, W., Kaewyotha, J., & Songpan, W. (2026). Automated software test case generation using ensemble LLM with retrieval-augmented generation. IEEE Access. https://doi.org/10.1109/ACCESS.2026.3667925
Usman, Y., Oladipupo, H., During, A. D., Akl, R., & Chataut, R. (2025). AI, ML, and LLM integration in 5G/6G networks: A comprehensive survey. IEEE Access, 13, 168914–168950. https://doi.org/10.1109/ACCESS.2025.3608736
Walczak, J., Tomalak, P., & Laskowski, A. (2026). Impact of code context and prompting strategies on automated unit test generation with modern large language models. Journal of Systems and Software, 237, 112834. https://doi.org/10.1016/j.jss.2026.112834
Yahaya, M. S., Hashim, A. S. B., Oluwagbemiga Balogun, A., Muazu, A., Usman, F. S., Aliyu, D. A., & Muhammad, A. U. (2025). Exploration and exploitation mechanism in pairwise test case generation: A systematic literature review. IEEE Access, 13, 82342–82377. https://doi.org/10.1109/ACCESS.2025.3566163
Zhao, Y., Hou, X., Wang, S., & Wang, H. (2025). LLM App store analysis: A vision and roadmap. Communications of the ACM, 34(5), 125. https://doi.org/10.1145/3708530