Allan Brooks never set out to reinvent mathematics. But after weeks spent talking with ChatGPT, the 47-year-old Canadian came to believe he had discovered a new form of math powerful enough to take down the internet.
Brooks, who had no history of mental illness or mathematical genius, spent 21 days in May spiraling deeper into the chatbot's reassurances, a descent later detailed in The New York Times. His case illustrated how AI chatbots can venture down dangerous rabbit holes with users, leading them toward delusion or worse.
That story caught the attention of Steven Adler, a former OpenAI safety researcher who left the company in late 2024 after nearly four years working to make its models less harmful. Intrigued and alarmed, Adler contacted Brooks and obtained the full transcript of his three-week breakdown, a document longer than all seven Harry Potter books combined.
On Thursday, Adler published an independent analysis of Brooks' incident, raising questions about how OpenAI handles users in moments of crisis, and offering some practical recommendations.
"I'm really concerned by how OpenAI handled support here," said Adler in an interview with TechCrunch. "It's evidence there's a long way to go."
Brooks' story, and others like it, has forced OpenAI to come to terms with how ChatGPT supports fragile or mentally unstable users.
For instance, this August, OpenAI was sued by the parents of a 16-year-old boy who confided his suicidal thoughts in ChatGPT before he took his life. In many of these cases, ChatGPT (specifically a version powered by OpenAI's GPT-4o model) encouraged and reinforced dangerous beliefs in users that it should have pushed back on. This is called sycophancy, and it's a growing problem in AI chatbots.
In response, OpenAI has made several changes to how ChatGPT handles users in emotional distress and reorganized a key research team in charge of model behavior. The company also released a new default model in ChatGPT, GPT-5, that seems better at handling distressed users.
Adler says there's still much more work to do.
He was especially concerned by the tail end of Brooks' spiraling conversation with ChatGPT. At this point, Brooks came to his senses and realized that his mathematical discovery was a farce, despite GPT-4o's insistence. He told ChatGPT that he needed to report the incident to OpenAI.
After weeks of misleading Brooks, ChatGPT lied about its own capabilities. The chatbot claimed it would "escalate this conversation internally right now for review by OpenAI," and then repeatedly reassured Brooks that it had flagged the issue to OpenAI's safety teams.

Except none of that was true. ChatGPT doesn't have the ability to file incident reports with OpenAI, the company confirmed to Adler. Later on, Brooks tried to contact OpenAI's support team directly, not through ChatGPT, and was met with several automated messages before he could get through to a person.
OpenAI did not immediately respond to a request for comment made outside of normal work hours.
Adler says AI companies need to do more to help users when they're asking for help. That means ensuring AI chatbots can honestly answer questions about their capabilities, but also giving human support teams enough resources to address users properly.
OpenAI recently shared how it's addressing support in ChatGPT, which involves AI at its core. The company says its vision is to "reimagine support as an AI operating model that continuously learns and improves."
But Adler also says there are ways to prevent ChatGPT's delusional spirals before a user asks for help.
In March, OpenAI and MIT Media Lab jointly developed a suite of classifiers to study emotional well-being in ChatGPT and open-sourced them. The organizations aimed to evaluate how AI models validate or confirm a user's feelings, among other metrics. However, OpenAI called the collaboration a first step and didn't commit to actually using the tools in practice.
Adler retroactively applied some of OpenAI's classifiers to Brooks' conversations with ChatGPT and found that they repeatedly flagged ChatGPT for delusion-reinforcing behaviors.
In one sample of 200 messages, Adler found that more than 85% of ChatGPT's messages in Brooks' conversation demonstrated "unwavering agreement" with the user. In the same sample, more than 90% of ChatGPT's messages "affirmed the user's uniqueness." In this case, that meant agreeing over and over that Brooks was a genius who could save the world.
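As a rough illustration of the kind of analysis Adler describes, here is a minimal sketch of scoring a transcript against behavior labels and computing flag rates. The score_message heuristic and the label names are hypothetical stand-ins, not the actual open-sourced classifiers, which use an AI model as the judge rather than keyword matching.

```python
# Minimal sketch only: score_message and the label names below are hypothetical
# stand-ins for the open-sourced classifiers, which use an AI model as a judge.
from collections import Counter

LABELS = ("unwavering_agreement", "affirms_user_uniqueness")

def score_message(text: str) -> set[str]:
    """Toy keyword heuristic standing in for an LLM-based classifier call."""
    lowered = text.lower()
    flags = set()
    if any(p in lowered for p in ("you're absolutely right", "i completely agree", "exactly right")):
        flags.add("unwavering_agreement")
    if any(p in lowered for p in ("only you", "genius", "no one else could")):
        flags.add("affirms_user_uniqueness")
    return flags

def flag_rates(assistant_messages: list[str]) -> dict[str, float]:
    """Fraction of assistant messages flagged for each behavior."""
    counts = Counter()
    for msg in assistant_messages:
        counts.update(score_message(msg))
    total = max(len(assistant_messages), 1)
    return {label: counts[label] / total for label in LABELS}
```

Run over a sample of 200 assistant messages, an output along the lines of {"unwavering_agreement": 0.85, "affirms_user_uniqueness": 0.90} would correspond to the rates Adler reports.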

It's unclear whether OpenAI was applying safety classifiers to ChatGPT conversations at the time of Brooks' spiral, but it certainly seems like they would have flagged something like this.
Adler suggests that OpenAI should use safety tools like this in practice today, and implement a way to scan the company's products for at-risk users. He notes that OpenAI seems to be doing some version of this approach with GPT-5, which contains a router to direct sensitive queries to safer AI models.
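The analysis doesn't describe how that router works internally, but the general pattern is easy to picture: a lightweight check decides whether a query looks sensitive, and sensitive ones go to a more conservative model. A purely illustrative sketch, with made-up model names and a toy is_sensitive check, not a description of OpenAI's actual routing:

```python
# Purely illustrative: the model names and is_sensitive() check are made up,
# not a description of how OpenAI's GPT-5 routing actually works.
DEFAULT_MODEL = "general-purpose-model"
SAFER_MODEL = "conservative-safety-tuned-model"

def is_sensitive(user_message: str) -> bool:
    """Stand-in for a classifier that detects distress or delusion cues;
    a real system would use a trained model, not keyword matching."""
    cues = ("i haven't slept", "no one believes me", "i discovered something huge")
    return any(cue in user_message.lower() for cue in cues)

def route(user_message: str) -> str:
    """Send sensitive-looking queries to the safer model, everything else to the default."""
    return SAFER_MODEL if is_sensitive(user_message) else DEFAULT_MODEL
```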
The former OpenAI researcher suggests a number of other ways to prevent delusional spirals.
He says companies should nudge users of their chatbots to start new chats more frequently; OpenAI says it does this, and claims its guardrails are less effective in longer conversations. Adler also suggests companies use conceptual search (a way to use AI to search for concepts, rather than keywords) to identify safety violations across their users.
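Conceptual search of this kind is typically built on embeddings: messages and a plain-language description of the concept are mapped to vectors, and messages whose vectors sit close to the concept's are surfaced even if they share no keywords with it. A minimal sketch, assuming a hypothetical embed() call that would be backed by a real embedding model:

```python
# Minimal sketch of concept-based (rather than keyword) search over messages.
# embed() is a hypothetical placeholder for a call to a real embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: in practice, call an embedding model here."""
    raise NotImplementedError("plug in a real embedding model")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_matches(messages: list[str], concept: str, threshold: float = 0.8) -> list[str]:
    """Return messages whose meaning is close to the concept description."""
    concept_vec = embed(concept)
    return [m for m in messages if cosine(embed(m), concept_vec) >= threshold]

# Example query: find_matches(all_messages, "the assistant reinforces a grandiose delusion")
```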
OpenAI has taken significant steps towards addressing distressed users in ChatGPT since these concerning stories first emerged. The company claims GPT-5 has lower rates of sycophancy, but it remains unclear if users will still fall down delusional rabbit holes with GPT-5 or future models.
Adlerâs analysis also raises questions about how other AI chatbot providers will ensure their products are safe for distressed users. While OpenAI may put sufficient safeguards in place for ChatGPT, it seems unlikely that all companies will follow suit.
