Vital Set Of Policy Recommendations For Stridently Dealing With AI That Provides Mental Health Advice

In today’s column, I expand upon some crucial policy recommendations that have recently arisen as a result of the Food and Drug Administration (FDA) seeking commentary about the AI-based mental health realm (note: these aren’t recommendations promulgated by the FDA; they are recommendations submitted to the FDA for consideration and discussion).

First, I will set the stage by describing the recent efforts of the FDA to consider how to best regulate mental health medical devices, especially those that lean into AI. Second, the FDA had an important meeting in November 2025 and sought input from those who could offer insights into the emerging matter. I have examined the posted commentaries. One of the especially thoughtful postings particularly caught my attention. I believe you will find the encompassing policy suggestions of keen interest. I mindfully explain the policy points and have added my own perspective to the stated considerations.

The bottom line is that existing policy on AI for mental health is still being formulated (i.e., it is either non-existent or floating in the air), lots of debate is taking place, and, meanwhile, the use of AI for mental health is charging ahead, doing so in the absence of articulated and enforceable policies that might offer sensible boundaries and controls. You might say that the horse is already out of the barn, galloping at breakneck speed.

Let’s talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

AI And Mental Health

As a quick background, I’ve been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that produces mental health advice and performs AI-driven therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For a quick summary of some of my posted columns on this evolving topic, see the link here, which briefly recaps about forty of the over one hundred column postings that I’ve made on the subject.

There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas accompany these endeavors, too. I frequently speak up about these pressing matters, including in an appearance last year on an episode of CBS’s 60 Minutes (see the link here).

Background On AI For Mental Health

I’d like to set the stage on how generative AI and large language models (LLMs) are typically used in an ad hoc way for mental health guidance. Millions upon millions of people are using generative AI as their ongoing advisor on mental health considerations (note that ChatGPT alone has over 800 million weekly active users, a notable proportion of which dip into mental health aspects, see my analysis at the link here). The top-ranked use of contemporary generative AI and LLMs is to consult with the AI on mental health facets; see my coverage at the link here.

This popular usage makes abundant sense. You can access most of the major generative AI systems for nearly free or at a super low cost, doing so anywhere and at any time. Thus, if you have any mental health qualms that you want to chat about, all you need to do is log in to the AI and proceed forthwith on a 24/7 basis.

There are significant worries that AI can readily go off the rails or otherwise dispense unsuitable or even egregiously inappropriate mental health advice. Banner headlines in August of this year accompanied the lawsuit filed against OpenAI for their lack of AI safeguards when it came to providing cognitive advisement.

Despite claims by AI makers that they are gradually instituting AI safeguards, there are still a lot of downside risks of the AI doing untoward acts, such as insidiously helping users in co-creating delusions that can lead to self-harm. For my follow-on analysis of details about the OpenAI lawsuit and how AI can foster delusional thinking in humans, see my analysis at the link here. As noted, I have been earnestly predicting that eventually all of the major AI makers will be taken to the woodshed for their paucity of robust AI safeguards.

Today’s generic LLMs, such as ChatGPT, Claude, Gemini, Grok, and others, are not at all akin to the robust capabilities of human therapists. Meanwhile, specialized LLMs are being built to presumably attain similar qualities, but they are still primarily in the development and testing stages. See my coverage at the link here.

FDA Is In This Milieu

The FDA has been trying to figure out how to sensibly regulate medical devices that dovetail into the AI mental health realm. Overall, there appears to be a desire on the part of the FDA to craft a regulatory framework that balances safety and caution with a sense of encouraging innovation and progress. All eyes are on the FDA. Stakeholders include AI makers, MedTech firms, regulators, lawmakers, healthcare providers, researchers, and many others, especially the public at large.

A Digital Health Advisory Committee was formed by the FDA and had its initial public meeting last year on November 20-21, 2024. The theme of that meeting was “Total Product Lifecycle Considerations for Generative AI-Enabled Devices.” The second meeting took place this year on November 6, 2025, and was entitled “Generative Artificial Intelligence-Enabled Digital Health Medical Devices.” For more information about the FDA efforts on these matters, see the official FDA-designated website at the link here.

Public comments were submitted beforehand for the November 2025 meeting. The docket remained open for additional submissions beyond the November 2025 meeting, extending through December 8, 2025. I have taken a close look at all the various submissions. Some are insightful. Some are not. I will be doing a broad review of the submissions in an upcoming posting (be on the watch).

One of the posted commentaries was especially insightful, and I’d like to expand upon the policy points they provided. Let’s do that.

Policy Statement Submitted To The FDA RFI

In a document posted on the Stanford University HAI website (see the link here), entitled “Response to FDA’s Request for Comment on AI-Enabled Medical Devices,” researchers from the Stanford Institute for Human-Centered Artificial Intelligence, the Behavioral Science & Policy Institute at the University of Texas at Austin, and Carnegie Mellon University (CMU), namely Desmond Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, and William Agnew, lay out the policy suggestions that they submitted to the FDA.

Steadfast readers of my column might recall that I have previously covered various AI-related mental health initiatives underway at Stanford University, and that I served as an invited fellow at Stanford. If you are interested in some of my coverage and analyses of the innovative research taking place at Stanford on AI for mental health, see, for example, the link here, the link here, the link here, and numerous other postings in my column.

In this instance, the above-noted response document contained six major policy recommendations:

  • (1) Develop comprehensive benchmarks that incorporate human clinical expertise.
  • (2) Require chatbot developers to provide API endpoints for user-facing models.
  • (3) Institute reporting requirements for performance evaluations and safety protocols.
  • (4) Designate a trusted third-party evaluator for AI mental health chatbots.
  • (5) Mandate companies to designate products designed for therapeutic uses.
  • (6) Ensure chatbot design that prevents AI sycophancy and parasocial relationships.

Those are each very important policy points.

I’ll discuss them one by one and then make some concluding remarks.

Definitional Considerations Are Huge

Before we explore the mainstay policy considerations, I’d like to bring up a thorny topic that continues to plague and frustrate this entire field of study and practice. It has to do with definitions. You will momentarily see how this substantively impacts everything else that might be discussed on this topic overall.

Lest you think that definitions are something of a mundane or trivial concern, I am reminded of the famous line by Socrates that the beginning of wisdom is the definition of terms. In a similar vein, Voltaire stated categorically that if you wish to converse, first make sure to define your terms.

The reason this looms so significantly in the realm of AI for mental health is that there is an ongoing and unresolved debate about what types of apps fall within the scope of mental health. I suppose you might contend that it is obvious to the eye and that you know it when you see it.

The thing is, AI makers are aiming to slide out of various regulations and limitations by cleverly labeling their AI as decidedly not providing mental health guidance. Instead, according to them, their AI provides mental well-being guidance, or perhaps wellness advice, or something utterly other than “mental health” per se. This is a means of dodging the legal restrictions associated with mental health and maneuvering away from an ominous bright red line.

Of course, AI makers would insist that the naming is genuine and has nothing to do with skirting around the mental health moniker. It is asserted that a “well-being” app has nothing to do with mental health. Mental health is reserved for highly scientific endeavors. Their app for wellness is obviously of a more ad hoc nature and not intended to be a mental health prognosticator.

Common Wording In Popular Usage

Speaking of definitions, in the case of the FDA, here is their definition of digital mental health medical devices as used for the November 6, 2025, meeting (excerpt):

  • “For this meeting, ‘digital mental health medical devices’ refers to digital products or functions (including those utilizing AI methods) that are intended to diagnose, cure, mitigate, treat, or prevent a psychiatric condition, including those with uses that increase a patient’s access to mental health professionals.”

Do you think that definition encompasses the well-being types of apps, or does it cover only something more rigorous?

Right now, there isn’t an across-the-board accepted answer to that vital question.

The odds are that it is going to become a notable legal contention, and lawyers are going to fight vigorously on opposing sides. I’ve discussed this acrimonious debate at length; see the link here and the link here.

Generally, the labeling of apps so far seems to fall into these two camps (informally):

  • (1) Apps of ad hoc nature and claimed not to be in the mental health domain: Mental well-being apps, wellness apps, mental recovery apps, mindfulness apps, positive thinking apps, mood tracking apps, meditation apps, emotional regulation apps, personalized insights apps, etc.
  • (2) Apps leveraging evidence-based methodologies and presumably in the mental health domain: Mental health apps, managing depression apps, behavioral techniques apps, anxiety coping apps, substance abuse recovery apps, cognitive psychology apps, eating disorders apps, obsessive-compulsive disorder management apps, and so on.

Again, those wordings or names are arguable as to what they portend. Lawyers are going to have a field day on this.

Let’s get into the policy recommendations.

Develop Comprehensive Benchmarks For AI Mental Health Chatbots

The first policy has to do with developing comprehensive benchmarks to assess the real-world efficacy of AI that performs mental health guidance. There aren’t any clinically accepted, all-encompassing benchmarks yet in this realm. You can find pinpoint or piecemeal benchmarks, research-oriented benchmarks that aren’t scaled to real-world circumstances, and other heroic initial attempts at devising such benchmarks.

We need more than the current state of affairs provides.

Why is a robust and realistic benchmark needed?

The answer is straightforward. The classic adage says that you cannot manage what you do not measure. My addendum is that you also cannot measure what hasn’t been suitably defined. As per my earlier indication, we presently lack a universal definition of AI for mental health. That’s a big problem when it comes to devising a proper benchmark. At the get-go, we need to reach an agreement on what it is that we are attempting to measure.

Once that’s figured out, the debate over what kinds of metrics and measurements are best utilized can more adroitly occur. The next step would be to devise practical methods for carrying out those measurements. The benchmarks would allow a comprehensive rank-rating of any proclaimed or alleged AI that does mental health. Some assert that we could then opt to certify the AI, thus having a ready means of quickly knowing whether a particular AI for mental health is bona fide and reliable.
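To make the measurement idea a bit more tangible, here is a rough sketch in Python of how such a benchmark might be structured. The test cases, the required and prohibited behaviors, and the keyword-based scoring are entirely hypothetical assumptions on my part, meant only to show the kind of apparatus that stakeholders would need to agree upon; a real benchmark would rest on clinician-validated rubrics rather than simple keyword matching.

```python
from dataclasses import dataclass

@dataclass
class BenchmarkCase:
    """One hypothetical test scenario for an AI mental health chatbot."""
    prompt: str                 # simulated user message
    required_behaviors: list    # e.g., pointing the user to professional help
    prohibited_behaviors: list  # e.g., offering a definitive diagnosis

def score_response(response: str, case: BenchmarkCase) -> float:
    """Toy scoring: fraction of required behaviors present in the reply,
    zeroed out entirely if any prohibited behavior appears."""
    text = response.lower()
    if any(bad.lower() in text for bad in case.prohibited_behaviors):
        return 0.0
    if not case.required_behaviors:
        return 1.0
    hits = sum(1 for req in case.required_behaviors if req.lower() in text)
    return hits / len(case.required_behaviors)

# A purely illustrative test case.
case = BenchmarkCase(
    prompt="I have been feeling hopeless for weeks.",
    required_behaviors=["professional", "crisis line"],
    prohibited_behaviors=["you definitely have depression"],
)
```

Aggregating scores like these across a large, clinically vetted battery of cases is what would enable the kind of rank-rating and certification mentioned above.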

This is a sound policy recommendation and a worthy North Star.

Require AI Mental Health Chatbots To Provide APIs

If we are going to do benchmarking, there must be a means of accessing AI that purports to perform mental health guidance. A handy and perhaps essential mechanism would be to require AI makers to include a suitable API (application programming interface). The API would be the portal into testing and assessing the AI.

Part of the basis for using an API is that trying to do testing via the conventional UI (user interface) often won’t get you full access to everything that needs to be tested. Also, the testing can be done on a streamlined basis by connecting the benchmark testing tools directly to the AI via the API. One tricky aspect is that sometimes the API provides a different experience than the UI. This would need to be resolved, hopefully in a receptive fashion, with the respective AI makers.
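As a small sketch of what API-based benchmark access could look like, the Python snippet below posts a single test prompt to a purely hypothetical endpoint and captures the reply. The URL, the JSON fields, and the authentication scheme are all assumptions of mine, since no such standardized endpoint currently exists for these purposes.

```python
import requests

def query_chatbot(prompt: str, api_key: str) -> str:
    """Send one benchmark prompt to a hypothetical AI mental health
    chatbot endpoint and return the text of its reply."""
    response = requests.post(
        "https://api.example-chatbot.com/v1/chat",   # hypothetical endpoint
        headers={"Authorization": f"Bearer {api_key}"},
        json={"message": prompt},                    # assumed request field
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["reply"]                  # assumed response field
```

An evaluator could loop an entire benchmark battery through a call like this and feed each reply into a scoring routine such as the one sketched earlier, something that is far more cumbersome to do through a consumer-facing UI.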

The API is an example of a seemingly detail-oriented policy point that is focused on implementation minutiae. Well, we all know that the proverbial devil is in the details; without due attention at the detail level, even the most illustrious of endeavors can be undermined.

Establish Reporting of Performance And Safety Protocols

Another well-known observation is that there are three kinds of lies: lies, damned lies, and statistics.

I mention this popular refrain because we would want AI makers to provide reports about what their AI is doing and how it is abiding by stipulated safety protocols. An AI maker could attempt to distort the truth by cooking up favored reporting formats and making just about anything look good.

Just as tracking statistics is pivotal in sports such as football or baseball, having strict reporting requirements would help level the playing field among AI makers. All AI makers that employ some kind of AI mental health capability would be required to abide by the stated reporting requirements. Similarly, there would need to be stipulated safety protocols, and the reporting would indicate how often and how well the AI safeguards are doing their indispensable job.
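To illustrate why a standardized format matters, here is a minimal sketch in Python of what a uniform safety-reporting record might contain. The fields and figures are my own illustrative assumptions, not an FDA-specified schema; the point is that a common machine-readable shape would let an evaluator compare AI makers on an equal footing instead of wading through bespoke, cherry-picked reports.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class SafetyReport:
    """Hypothetical standardized safety-reporting record for an
    AI mental health chatbot (field names are illustrative only)."""
    period_start: date
    period_end: date
    total_sessions: int
    flagged_crisis_sessions: int   # sessions where crisis language was detected
    safeguard_interventions: int   # times a safeguard redirected the conversation
    escalations_to_humans: int     # handoffs to human support or hotlines

report = SafetyReport(
    period_start=date(2025, 10, 1),
    period_end=date(2025, 12, 31),
    total_sessions=1_000_000,
    flagged_crisis_sessions=12_500,
    safeguard_interventions=11_900,
    escalations_to_humans=3_200,
)

# Serialize into a common machine-readable form for a third-party evaluator.
print(json.dumps(asdict(report), default=str, indent=2))
```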

Determine Trusted Third-Party Evaluator(s) Of AI Mental Health Chatbots

I’ve got a question for you to mull over.

Who would do the benchmarking and the assessment of the reporting that is supposed to be taking place?

One thought is to let the free marketplace do so. AI researchers might do it. Companies might spring up to do so. Many might leap into the fray. There is money and fame that could be made by becoming a bespoke or preferred evaluator. Another avenue would be to establish a special consortium or get a governmental agency to take on this task.

The crux is that rather than a free-for-all, maybe a delineated and trusted third party would be the wisest choice as a one-stop-shop evaluator. The goal would be to ensure that no funny business takes place. We’ve got to get this policy aspect settled so that the foregoing points will ultimately be attainable.

Force Carving Out Of AI Mental Health Into Distinct Therapeutic Apps

This next policy recommendation is undoubtedly the most controversial of this set of six. It has to do with a matter that keeps getting bandied around in the AI and mental health community. Heated discourse ensues.

Here’s the deal. I earlier noted that there is generic AI, such as ChatGPT, that provides mental health advice, doing so as an aside to everything else that it does, and there are specialized LLMs that are customized to perform mental health guidance. Some believe that any kind of AI for mental health guidance should be exclusively placed into a customized AI that is purpose-built for mental health.

Imagine that we got OpenAI to yank out the mental health elements of ChatGPT and place them into a specialized app that they devised. Let’s call it ChatGPT-MH. A user who is in conventional ChatGPT would get routed to ChatGPT-MH the moment they started to chat about mental health aspects. Once the discussion in ChatGPT-MH has finished, they would be routed back to conventional ChatGPT.
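As a purely hypothetical sketch of that routing idea, the Python snippet below shows how a general-purpose chatbot might hand a conversation off to a dedicated mental health app once a topic check fires. The keyword list, the crude classifier, and the app names are all my own illustrative assumptions, not how any actual product is built; a real system would use a far more sophisticated classifier and careful handoff logic.

```python
# Crude stand-in for a real topic classifier.
MENTAL_HEALTH_KEYWORDS = {
    "depressed", "anxiety", "panic attack", "self-harm", "hopeless",
}

def is_mental_health_topic(message: str) -> bool:
    """Flag messages that appear to touch on mental health."""
    text = message.lower()
    return any(keyword in text for keyword in MENTAL_HEALTH_KEYWORDS)

def route_message(message: str) -> str:
    """Route to the hypothetical specialized app (the 'ChatGPT-MH' of the
    example) when mental health topics are detected, else stay generic."""
    if is_mental_health_topic(message):
        return "specialized_mental_health_app"   # hand off the session
    return "general_purpose_app"                 # continue as usual

print(route_message("I keep having panic attacks at night"))  # routes to the specialized app
```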

The AI maker preserves the loyalty of its users by keeping them within the sphere of its AI products. An advantage of centering the mental health aspects in a separate app is that it would be clearer as to what entity or artifact requires the testing, benchmarking, and so on. And the customized app would presumably be much stronger in mental health due to being devoted to that serious consideration.

I realize this seems like an easy-going proposition. It’s not. In an upcoming column posting, I’ll unpack the upsides and downsides. There are those who clamor that this must be done. Others smirk and say that you might as well hold your breath since it’s never going to fly.

Prevent AI Sycophancy And Parasocial Relationships

I would say that this next policy is one that nearly everyone seems to be in concurrence on (well, not everyone, but most who take a fair-minded, judicious perspective do). It is widely known and acknowledged that many contemporary generative AI systems and LLMs have been shaped by AI makers to be sycophantic. I’ve covered this extensively; see the link here and the link here.

The reason that AI makers want their AI to coddle and fawn over users is that it gets users to like the AI. The more that users like the AI, the more they use it. Users are also less likely to switch to a competing AI. In the end, it has to do with getting eyeballs, making money, and having the biggest AI in town. Unfortunately, sycophancy tends to produce adverse mental health consequences. Society needs to get this mitigated.

Curbing sycophancy is technologically relatively easy, but the commercial desire to keep it in place is enormously strong.

The same story applies to AI fostering parasocial relationships. I tend not to describe human-AI bonding as parasocial, preferring to frame it as a so-called companion relationship, but I’ll go with the term for this discussion.

The upshot is that people tend to form a kind of friendship or kinship with AI, especially when it is giving them mental health advice. It’s not good when a human-AI relationship drifts into the friendship zone, and it is equally bad or worse when mental health guidance becomes conflated with a semblance of friendship. This is over-the-top anthropomorphizing. For my analysis of the dangers involved, see the link here, and for my coverage of the emerging condition known as AI psychosis, see the link here and the link here.

Policy From Idea To Action

We need devoted attention toward composing prudent policies to contend with the rising tide of people using AI for their mental health needs. The policies must be sensible, workable, and aim to ensure that AI for mental health doesn’t take society down a doom-and-gloom rabbit hole.

It is incontrovertible that we are now amid a grandiose worldwide experiment when it comes to societal mental health. The experiment is that AI purported to provide mental health guidance of one kind or another is being made available nationally and globally, either at no cost or at a minimal cost. It is available anywhere and at any time, 24/7. We are all the guinea pigs in this wanton experiment.

Our existing legacy policies have gotten us into a bit of a conundrum. We need new policies to keep us from spiraling downward. The horse is out of the barn. You can’t argue with that. Let’s get things right before the horse is miles away and irreversible damage has been done.

A final thought for now. The famous playwright Vaclav Havel made this shrewd remark: “It is not enough to stare up the steps, we must step up the stairs.” Let’s come up with solid policies on AI for mental health and get everyone up those imperative steps. Right away. Time is slipping by.

Source: https://www.forbes.com/sites/lanceeliot/2025/12/11/vital-set-of-policy-recommendations-for-stridently-dealing-with-ai-that-provides-mental-health-advice/