Blog - Sanctuary Health

AI will be one of the greatest new technological foundations from which we can begin to reimagine building products in healthcare. But right now it feels messy. In this blog, I want to answer three questions: 1) What are LLMs? 2) How does ChatGPT fit in? 3) What are the implications for healthcare? Let's start with what LLMs are:

What are language models, and what are the large ones?

Wikipedia says a language model is a “probability distribution over a sequence of words.” I can translate it as: language models predict the next word in a sentence by understanding what came before it. If you open up your phone and type “I”, one of the keyboard suggestions will likely be “am” as the next word; this is a very rudimentary form of a language model.
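For the curious, here is a minimal sketch of that keyboard-suggestion idea in Python. It counts which word most often follows another in a tiny corpus and turns the counts into a probability. The corpus and the function names (`train_bigram_model`, `predict_next`) are made up for illustration; real keyboards and LLMs are far more sophisticated than this.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers

def predict_next(model, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    counts = model.get(word.lower())
    if not counts:
        return None, 0.0  # word never seen, so nothing to suggest
    best_word, best_count = counts.most_common(1)[0]
    return best_word, best_count / sum(counts.values())

# Toy corpus, invented for illustration only.
corpus = "i am tired . i am hungry . i am late . i was early ."
model = train_bigram_model(corpus)
print(predict_next(model, "I"))  # ('am', 0.75): "am" follows "i" in 3 of 4 cases
```

The "probability distribution" in the Wikipedia definition is just those normalized counts: given "I", the model puts 75% on "am" and 25% on "was".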

So how do some complex statistics turn into a tool that can write a blog about integrating with Epic in the style of Shakespeare? What AI teams figured out was that the more data you give the model, the better it becomes at completing requested tasks. Large language models, therefore, work off the same principles as above but consume a gigantic data set (the large bit); GPT-3 has 175 billion parameters and was trained on 570 gigabytes of text.

For those who are interested, the most recent LLMs are built on the transformer, a deep learning architecture that provides the next piece of the puzzle. You can read more here: Transformer (machine learning model) - Wikipedia

What is GPT and how does it fit in?

GPT stands for Generative Pre-trained Transformer and is a series of large language models developed by OpenAI. OpenAI added to the AI cooking pot 175 billion parameters and the “internet” as the source of the training data. The final concoction is GPT-3 and now ChatGPT (GPT-3.5), which has a human-like ability to perform a wide range of tasks. It has been asked to develop scripts, write code, and do my brother's History essays for university; it feels like the bottleneck of capability will lie with the person writing the prompt rather than the AI itself. OpenAI’s GPT models are the most famous LLMs, but OpenAI is not the only player in the game. Google and Microsoft are developing their own versions behind closed doors.

What is the impact on healthcare?

To know how GPT-3, ChatGPT, or other iterations of LLMs can disrupt healthcare, we need to know what LLMs are good at; we can then apply those strengths to the never-ending list of problems we have to address in healthcare. So what can LLMs do really well?

Well, the first answer is we don’t really know yet. Stanford got a bunch of clever people in a room to discuss the impact of GPT-3 when it was released and one of the discussants said

That such a large “capability surface” makes it challenging to both scope the full array of uses (because GPT-3 can take in arbitrary inputs, it is a priori impossible to anticipate all potential behaviors of the model) and to ensure their safety to people and societies - Source

But that isn’t a satisfying answer, so here is someone else’s fantastic attempt at breaking it down. Meor Amer has a great two-part blog that summarises the seven main jobs LLMs are currently very good at:

Summarisation
Generation
Rewrite (we won’t cover this!)
Extract
Search
Cluster
Classify

The next sections I wrote with the help of Johnathan Klaus, MBA, RN. I have broken them down into 3 main sections:

Summarisation/Extraction

Summarisation is the model's ability to take long, convoluted text and boil it down. Extraction is a close sister: the model's ability to pull out key pieces of info. Think of summarisation as boiling down Romeo and Juliet to 2 pages and extraction as pulling out all the scenes where “Romeo” is mentioned. Where is this useful in healthcare? Well, here are a couple of places:

a) Medical research and medications - Understanding and decoding new research is a high-effort task. It takes time to filter through new research papers, understand their methods, and break down their findings. I think LLMs can massively reduce the time-to-value of finding the right info quickly. For medications physicians aren’t used to prescribing, they can summarize the relevant information into an easier-to-read format and highlight what is important.

b) Patient records and health information - This entire stack is a huge source of pain for anyone in healthcare. The lack of interoperability between systems, or the fact that there isn’t one source of patient information, is an administrative challenge that costs lives. I also think humans are notoriously bad at weighing the importance of different pieces of information. The “availability bias” means we can miscalculate the importance of different aspects of patient records based on what we remember from them, which is doubly hard when the information is in 900 different places.

So within the hospital environment, summarization can be useful for gathering the information that is currently spread throughout the EHR into a coherent patient story. Some opportunities for this include nursing shift hand-offs, inter/intra-facility transfer summaries, procedure summaries, physician hand-offs, and rounding summaries (these may be something of a combination of Generation and Summarization).

Some cool companies already operating in this area:

Abstractive Health

Synapse Medicine

Generation

Generation is the LLM’s ability to write something from any given prompt. This is the most obvious use case and the one that has proliferated around the internet. It will likely democratize a tranche of content creation for the entire healthcare space.

a) Patient comms and education - Individuals with low health literacy have a 2.6x higher mortality rate (source) than those with high health literacy. Engaging patients in their own care and health through education is a tough problem because it needs to be personalized for the approach to work. The goal is to deliver the right information, at the right time, in the most engaging format possible (i.e. media type, level of complexity, etc.). This has long been too big a mountain for any hospital comms team to climb, and so for decades we have had boring content. LLMs provide the opportunity for a step in the right direction. The ability to write content and target it at the right reading age will empower marketing and comms teams to increase the velocity of content production whilst also making it more engaging. Sanctuary Health actually built a tool on top of ChatGPT to make this easier - https://demo.sanctuaryhealth.io/writing
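One way to make "the right reading age" concrete is to score a draft with a readability formula before sending it out. The sketch below uses the Flesch-Kincaid grade level, which is my choice for illustration and not necessarily what any particular tool uses. The syllable counter is a rough vowel-group heuristic (it overcounts silent e's, for example), and both sample sentences are invented.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (minimum one)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """Approximate US school grade needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade level formula.
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

# Two invented versions of the same instruction.
plain = "Take one pill each day. Eat food with it."
dense = "Administer the medication orally once daily, concomitantly with nutritional intake."

print(round(flesch_kincaid_grade(plain), 1))
print(round(flesch_kincaid_grade(dense), 1))
```

The plain version scores far lower than the dense one, so a comms team could ask an LLM to rewrite a draft and use a check like this to confirm it actually got simpler.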

b) Care plans, insurance claims, and chatbots - In what other workflows can LLMs be powerful? Let’s start with constructing care plans. What do I mean by this? Depending on the training data, LLMs will begin to be able to construct aspects of care journeys instead of the doctor having to do so. To illustrate this, let’s take the goal of weight loss as part of some treatment for obesity. Alex Cohen (Product director at Carbon Health) used a few inputs to build a food and workout plan that was likely produced faster than any physician could manage alone. See the anecdotal example below:

https://twitter.com/anothercohen/status/1599531037570502656?s=20&t=WDT8q_wgHk8FeLZGLRcJCA

I mean this sincerely when I say that ChatGPT might be the most incredible tech to emerge in the last decade.

Here's how I got it to create a weight loss plan, complete with calorie targets, meal plans, a grocery list, and workout plan 🧵:
— Alex Cohen (@anothercohen) December 4, 2022

The same is true for writing insurance claims. See this shockingly brilliant example recently shared on Twitter:

You: There’s no ChatGPT use case in healthcare
Docs: Watch this 👇 pic.twitter.com/2hX6vT4ncn
— Stuart Blitz (@StuartBlitz) December 14, 2022

I think the thread that is worth pulling on here is about enhancing the work of physicians and other healthcare providers. Healthcare providers waste a huge amount of time on admin, and this reduces their ability to deliver care. LLMs have the potential to streamline this in a way not seen before. Imagine a chatbot that responds directly to a patient who has a question about their condition or the side effects of their medications. Or, if you email a doctor’s office, an automatic reply that gives a customized response to a non-critical question. I don’t think we are there yet, but this is a very real possibility in the near future. Is it too much to say that in the long run, AI will likely be more accurate than healthcare providers?

Search, cluster, and classify

As Meor outlines, these talents of LLMs are less spoken about. Whilst what we went through above is about text generation, this section is about text representation in healthcare and the opportunities it opens up. So what is this and how does it work? Text representation takes existing text-based data and turns it into a form the model can compare, so it can search, group, or label that text rather than write new text. So what can be done here? Well, this is a more general prediction, but Google is a powerful tool for a lot of physicians. Sam Altman and I (ngl I think he got this from me) think that LLMs will pose the first real challenge to Google, as they can build more personalized search experiences with more powerful semantic connections. Expanding beyond Google, where can semantic search engines be more useful in bespoke areas of healthcare?
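To give a feel for the mechanics of search over text representations, here is a minimal sketch: every document becomes a vector, the query becomes a vector, and results are ranked by cosine similarity. Big assumption: the `embed` function below is a plain word-count vector standing in for a real LLM embedding, so it only matches on shared words and is not truly semantic. The documents and function names are invented for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in for an LLM embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, documents):
    """Return documents ranked by similarity to the query, best first."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)

# Invented example documents.
documents = [
    "Common side effects of metformin include nausea and stomach upset.",
    "Visiting hours on the surgical ward are from two to eight in the evening.",
    "How to book a follow-up appointment with the diabetes clinic.",
]
results = search("metformin side effects", documents)
print(results[0])
```

With real embeddings the ranking step stays the same, but a query like "tummy trouble from my diabetes tablets" could still land on the metformin document even though it shares almost no words with it. That is the "semantic connection" part.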

To build on this, there is a world where LLMs can clean and cluster hospital data sets. Clustering is an LLM's ability to group related pieces of data into pools. It tries to answer the question ‘how are these things connected?’. Meor’s blog gives an example in which an LLM analyzed 3000 posts from Hacker News.

In healthcare, this could be an incredibly powerful tool. Imagine having the ability to cluster things like types of patient questions or misdiagnoses, or to better connect a large number of documents in the health system (patient records, billing documents, literally anything).
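Here is a toy sketch of what clustering patient questions could look like. It is heavily simplified: instead of LLM embeddings it compares questions by word overlap (Jaccard similarity), and it uses a greedy pass with a hand-picked threshold of 0.3 rather than a proper clustering algorithm. The questions, the threshold, and the function names are all assumptions for illustration.

```python
import re

def tokens(text):
    """Lowercased set of words, a crude stand-in for an LLM embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster(texts, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed text is similar enough."""
    clusters = []  # each cluster is a list of texts; item 0 is the seed
    for text in texts:
        for group in clusters:
            if jaccard(tokens(text), tokens(group[0])) >= threshold:
                group.append(text)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([text])
    return clusters

# Invented patient questions.
questions = [
    "when is my next appointment",
    "can i change my next appointment",
    "what are the side effects of my medication",
    "are there side effects of this medication",
]
for group in cluster(questions):
    print(group)
```

On this toy input the two appointment questions end up in one pool and the two medication questions in another. A real system would swap `tokens` and `jaccard` for embeddings and a distance measure, which is what lets it group questions that mean the same thing but use different words.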

What could go wrong?

The biggest risk for mishaps is in the generative part of LLMs. The problem here is that tools like ChatGPT are confidently wrong. That is to say, these systems have no true concept of what accurate “looks like” but are able to be very precise. This messes with our brains because we frequently think precision is the same as accuracy. Why is this a problem? Well, to state the obvious, if generative models like chatbots interact directly with patients and get something wrong (symptoms, potential treatments, lifestyle changes), patients can be harmed. Here is a tweet from Sam Altman that nicely sums it up:

ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness.

it's a mistake to be relying on it for anything important right now. it’s a preview of progress; we have lots of work to do on robustness and truthfulness.
— Sam Altman (@sama) December 11, 2022

Moving forward

I think that LLMs provide a new foundation that could genuinely change care delivery from the ground up and, most importantly, empower the people that actually deliver care. Over the next decade, there will be a shift away from health providers doing admin and towards providing care to the people that need it. This will result in better outcomes and happier interactions with and within healthcare. Providers don’t want to do admin and patients want more time with their doctors. LLMs need to be introduced into healthcare cautiously, but in a way that does not fall victim to the industry’s inertia. There is a world where AI can enhance the delivery of care in a way not seen before; we just need to build it.

Sources

https://txt.cohere.ai/llm-use-cases/

https://arxiv.org/pdf/2102.02503.pdf

https://en.wikipedia.org/wiki/GPT-3

https://en.wikipedia.org/wiki/Language_model
