AI detection is now more important than ever. So why are we ditching GPTZero?


GPTZero was an incredible product when it first launched. At a time when ChatGPT and other large language models (LLMs) were rapidly growing in popularity, and students were beginning to use them to do their classwork, GPTZero was a tool that could protect academic integrity while minimizing false positives. It could alert a teacher that a given submission might have been created by AI. Consequently, GPTZero garnered significant press and investor attention, which helped the company raise money, build a large team, and enter into partnerships with larger learning management systems (LMSs). All of this was supposed to have a significant positive impact on education…

As a startup in the EdTech space, we knew about GPTZero from the beginning, and we were one of its early adopters in the first half of 2023. We used it with great success for over a year. And then GPTZero launched model 2024-08-02-base, presumably on August 2nd. Not long after, we started receiving complaints that perfectly legitimate, human-written text was being flagged as AI-generated. After doing some research internally, we found that many of these submissions were not only flagged as AI-generated, but had confidence scores of 100%! This means that GPTZero was seemingly 100% confident that perfectly legitimate submissions, which anyone could tell were unlikely to have been written by AI (because of grammatical mistakes, formatting, and the like), were AI-generated.

In our communications with GPTZero, we enquired about the issue and were met with a level of disregard that is unacceptable when discussing false accusations of plagiarism directed at students. As we wrote in one email, “Falsely accusing a student of plagiarism can ruin their academic career. Falsely accusing MANY students of plagiarism is educational malpractice.” When we enquired about using an older version of the model through their API, we were told we would need to commit to “$5-10k” of annual spend “due to the engineering effort” required. This is despite the fact that the API documentation clearly stated how to specify an older version of the model (an option that, when we tried it, simply did not work).

GPTZero's lack of interest in doing the right thing and addressing the false positives that are occurring is something that any stakeholder in the field of education should be demanding answers about. Because GPTZero has not provided us with those answers, we have made the difficult decision to cease our use of their software, and we encourage others to do the same. While we search for an alternative provider that can meet the high bar for accuracy and precision expected from LX Aer solutions, we will temporarily disable AI plagiarism detection in our platform, and we will remove all results stored on our servers that were produced by the 2024-08-02-base GPTZero model.

This is not a question of features in EdTech products, but one of ethics and morals. We refuse to allow our software to be complicit in falsely accusing students of plagiarism, and we condemn all those who prioritize the profits they can earn from such technology while intentionally disregarding its accuracy problems.