Randomized Trial of a Generative AI Chatbot for Mental Health Treatment

Authors

Heinz MV, Mackin DM, Trudeau BM, et al.

Purpose

This study examined the feasibility, acceptability, and effectiveness of a generative artificial intelligence (Gen-AI) chatbot for mental health treatment.

Methods

This randomized controlled trial (RCT) assessed symptom improvement and user experience with a new Gen-AI-powered, text-based chatbot, Therabot, among individuals with major depressive disorder (MDD), generalized anxiety disorder (GAD), or clinically high risk for feeding and eating disorders (CHR-FED). Adults (N=210), recruited through a national Meta Ads campaign, were screened, assigned to the MDD, GAD, or CHR-FED group, and then randomized to either the Therabot intervention or a waitlist control. Participants receiving the intervention were prompted daily for four weeks to engage with Therabot (treatment phase). Therabot users retained access for an additional four weeks without prompts (postintervention follow-up phase). Primary outcomes were changes in clinical symptoms. Secondary outcomes included therapeutic alliance, engagement, and user satisfaction.

Findings

Use of Therabot was associated with statistically significantly greater reductions in clinical symptoms across the MDD, GAD, and CHR-FED groups compared with controls at the 4-week and 8-week timepoints.
The 95% of participants who interacted with Therabot sent an average of 260 messages, used Therabot for an average of 24 days (min.=1, max.=60), and averaged 6.18 hours of use over the study period (4 weeks).
Participants developed a therapeutic alliance, measured with the Working Alliance Inventory (WAI), similar to that seen in traditional outpatient psychotherapy (overall mean WAI score=3.59, SD=1.27).
On a seven-point Likert scale, with seven being the highest, participants were generally satisfied with Therabot (5.30) and indicated it was a tool they would use on their own (5.12). Overall, participants found Therabot easy to use (6.42), intuitive (5.58), and similar to a real therapist (4.90), with a good interface (5.46) and design (5.53).
Safety concerns (e.g., expressions of suicidal ideation) required staff intervention 15 times, and inappropriate responses (e.g., providing medical advice) required staff correction 13 times.

Relevance

This study is the first RCT of the safety and effectiveness of the Therabot Gen-AI chatbot for mental health treatment, using clinical measures of mental health symptoms.
Therabot’s development, which used expert therapist-patient dialogues rooted in cognitive behavioral therapy, resulted in a Gen-AI chatbot that was clinically effective and rated well by its intended end users.
Future research should include an active control condition and a comparison of Therabot with time-comparable in-person therapy. Future research will also consider the specific roles of Gen-AI chatbots as an adjunct to in-person psychotherapy.
Additional research will be required to determine the ideal duration users should engage with Gen-AI-driven adjunctive therapies to reduce clinical symptoms.

Heinz MV, Mackin DM, Trudeau BM, et al. (2025). Randomized Trial of a Generative AI Chatbot for Mental Health Treatment. NEJM AI, 2(4). doi:10.1056/AIoa2400802

This work was supported by funding from Dartmouth College.
