‘As generative AI (GenAI) continues to permeate academia, distinguishing between student-authored essays and those written by Large Language Models (LLMs) becomes crucial for maintaining academic integrity. This study surveyed a group of STEM students (n=156) on their ethical awareness of using generative AI tools. We also empirically evaluated the effectiveness of state-of-the-art LLM detectors in identifying essays written by LLMs versus those written independently by students. We separated students into two groups and assigned them either to use LLMs or to write their essays independently. The essays were collected, anonymised, and analysed using quantitative methods. The detection tools were assessed on their ability to classify the essays correctly. The findings highlight limitations in deploying LLM detection tools in writing courses. Based on the outcomes, recommendations are provided for educators to enhance the ethical use of detection technologies. Our code is available on GitHub.’
Link: https://www.tandfonline.com/doi/full/10.1080/14703297.2025.2511062?ui=w7xas&af=T&ai=p5woq#abstract