Randomized Trial of a Generative AI Chatbot for Mental Health Treatment.

Heinz MV, Mackin DM, Trudeau BM, et al. Randomized Trial of a Generative AI Chatbot for Mental Health Treatment. NEJM AI. 2025;2(4). doi:10.1056/AIoa2400802

This study examined the feasibility, acceptability, and effectiveness of a generative artificial intelligence (Gen-AI) chatbot for mental health treatment. In this randomized controlled trial, researchers evaluated symptom improvement and user experiences with a new Gen-AI–powered text-based chatbot, Therabot, among individuals with major depressive disorder (MDD), generalized anxiety disorder (GAD), or clinically high risk for feeding and eating disorders (CHR-FED). Adults (N = 210) were recruited through a national Meta Ads campaign, screened, assigned to one of the three diagnostic groups, and randomly assigned to either a waitlist control condition or the Therabot intervention. During the 4-week treatment phase, participants in the intervention group were prompted daily to engage with Therabot. During the 4-week postintervention follow-up phase, they retained access to the chatbot but received no prompts. Primary outcomes assessed clinical symptom improvement, while secondary outcomes included therapeutic alliance, engagement, and user satisfaction. The use of Therabot was associated with statistically significant reductions in symptoms across the MDD, GAD, and CHR-FED groups compared with controls at both the 4-week and 8-week time points. Among participants, 95% interacted with Therabot, sending an average of 260 messages, using the system for 24 days (min. = 1, max. = 60), and accumulating an average of 6.18 hours of use over the 4-week study period. Participants also reported a therapeutic alliance comparable to that reported in traditional outpatient psychotherapy (overall mean Working Alliance Inventory [WAI] score = 3.59, SD = 1.27). Participants reported high satisfaction with Therabot and willingness to use it independently. Safety concerns, including expressions of suicidal ideation, required staff intervention 15 times, and inappropriate responses (e.g., providing medical advice) required correction 13 times. This study demonstrates promising clinical outcomes for Therabot.
Future research should include an active control condition, comparisons with time-comparable in-person therapy, and further investigation into the optimal role and duration of Gen-AI chatbots as adjuncts to psychotherapy.

Posted in: Cutting Edge Literature // Tagged: acceptability; algorithm; artificial intelligence; CTBH; mental health
