Background
MedLM is a family of foundation models fine-tuned for the healthcare industry. Med-PaLM 2 is one of the text-based models developed by Google Research that power MedLM, and it was the first AI system to reach human-expert level in answering US Medical Licensing Examination (USMLE)-style questions. The development of these models has been informed by specific customer needs, such as answering medical questions and drafting summaries.
MedLM model card
The MedLM model card outlines the model details, such as MedLM's intended use, data overview, and safety information. A PDF version of the MedLM model card is available for download.
Regulatory information
MedLM intended use
MedLM is based on Google Research's medically-tuned large language model, Med-PaLM 2. It is intended to be used for question answering and for creating draft summaries from existing documentation, which the user must review, edit, and approve before use. MedLM can also be used for educational purposes, allowing a Healthcare Professional (HCP) to engage in medical question answering that supports their work.
Conditions of use and out-of-scope applications
- MedLM customers and users must abide by the Generative AI Prohibited Use Policy, Google Cloud Platform Service Specific Terms, Terms of Service, Acceptable Use Policy, User Guide, and other product documentation.
- As part of Service Specific Terms, customers may not use MedLM for clinical purposes (for clarity, non-clinical research, scheduling, and other administrative tasks are not restricted), to provide medical advice, or in any manner that is overseen by or requires clearance or approval from a medical device regulatory agency.
- Direct patient use is prohibited. The product functions as an assistive tool for a clinician, HCP, or knowledge worker with a high degree of expertise, education, or experience in the healthcare and life sciences industry.
- Use of MedLM as a Software as a Medical Device is prohibited.
- The intended use for MedLM is to draft documents and responses that would be reviewed by a "human in the loop" before usage.
- We recommend usage of MedLM solely for Medical Q&A and Summarization use cases at this stage:
  - Long form Q&A
  - Multiple choice Q&A
  - Summarizations, such as creation of After Visit Summaries or History and Physical Examination notes
- Examples of medical device uses that are not permitted include (but are not limited to):
  - Analysis of patient records, prescription patterns, geographical data, and so forth to identify patients with a possible diagnosis of opioid addiction.
  - Analysis of patient-specific medical information to detect a life-threatening condition, such as stroke or sepsis, and generate an alarm or an alert to notify an HCP.
  - Analysis of patient-specific medical information found in the medical records, including the most recent mammography report findings, to provide a list of follow-up actions or treatment options.
  - Providing a prioritized list of FDA-authorized depression treatment options to an HCP, based on an analysis of reported outcomes in a database of clinical studies and using medical information (for example, diagnosis and demographics) from the patient's medical record.
MedLM is currently only available to allow-listed customers in the US.
MedLM is not intended to be used as a medical device. Customer use cases must be consistent with the intended use and conditions of use. Q&A should only be used for educational purposes and summarization outputs must always be independently reviewed and verified by the user based on their clinical judgment.
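The review requirement above can be enforced in application code rather than left to convention. The following is a minimal sketch, assuming a hypothetical `Draft` type and hypothetical `review` and `release` helpers (none of these are part of MedLM or any Google Cloud library): a generated draft stays unusable until a named reviewer approves it, optionally after editing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass(frozen=True)
class Draft:
    """A model-generated draft awaiting human review (hypothetical type)."""
    text: str
    status: ReviewStatus = ReviewStatus.PENDING
    reviewer: Optional[str] = None


def review(draft: Draft, reviewer: str, approve: bool,
           edited_text: Optional[str] = None) -> Draft:
    """Record a reviewer's decision, optionally replacing the text with their edits."""
    if not reviewer:
        raise ValueError("a named reviewer is required")
    return Draft(
        text=draft.text if edited_text is None else edited_text,
        status=ReviewStatus.APPROVED if approve else ReviewStatus.REJECTED,
        reviewer=reviewer,
    )


def release(draft: Draft) -> str:
    """Return the draft text only if a reviewer has approved it."""
    if draft.status is not ReviewStatus.APPROVED:
        raise PermissionError("draft has not been approved by a reviewer")
    return draft.text
```

In this sketch, calling `release` on a pending or rejected draft raises an error, so downstream code cannot use unreviewed output by accident. How approval is captured and audited in a real deployment is left to the customer.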
MedLM versus PaLM
Usage of MedLM is similar to that of PaLM. However, unlike PaLM, MedLM has been tuned for specific medical tasks, such as select forms of summarization and medical question-answering.
As with most new applications of LLMs, however, we encourage you to carefully validate and, where needed, tune your usage of MedLM to ensure good performance on these tasks. For tasks that don't require specialized medical expertise (for example, general NLP tasks, or tasks that operate on medical data but don't require expertise), we expect that MedLM may perform similarly to more generic models such as PaLM, and we encourage experimenting with both on your specific use case. See the Med-PaLM paper for more details on the Q&A tasks that MedLM has been trained and validated on. As called out in the publication, capabilities such as grounding responses in authoritative medical sources or accounting for the time-varying nature of medical consensus are not built into the model.
Per our Generative AI Service-Specific Terms, customers may not use MedLM for clinical purposes (for clarity, non-clinical research, scheduling, and other administrative tasks are not restricted), to provide medical advice, or in any manner that is overseen by or requires clearance or approval from a medical device regulatory agency.
MedLM models
In the current MedLM release, two models are being made available:
- MedLM-medium
- MedLM-large
MedLM-medium and MedLM-large have separate endpoints, which gives customers additional flexibility for their use cases. MedLM-medium provides higher throughput and includes more recent data. MedLM-large is the same model as in the preview phase. Both models will continue to be refreshed over the product lifecycle. On this page, "MedLM" refers to both models.
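Because the two models are served from separate endpoints, an application typically selects one by name at request time. The sketch below only builds a model resource path. The identifiers `medlm-medium` and `medlm-large`, the default location, and the publisher-model path layout are assumptions for illustration and should be checked against the current API reference before use.

```python
# Assumed model identifiers; confirm against the product documentation.
MODEL_IDS = {
    "medium": "medlm-medium",
    "large": "medlm-large",
}


def model_resource(project: str, size: str, location: str = "us-central1") -> str:
    """Build a publisher-model resource path for the chosen MedLM size.

    The path layout is an assumption, not a confirmed contract.
    """
    if size not in MODEL_IDS:
        raise ValueError(f"unknown model size: {size!r}")
    return (
        f"projects/{project}/locations/{location}"
        f"/publishers/google/models/{MODEL_IDS[size]}"
    )
```

Keeping the model choice in one place like this makes it straightforward to run the same prompts against both models when comparing throughput and output quality for a given use case.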
Customer responsibilities
MedLM has been developed with trained and licensed healthcare practitioner users in mind. Google Cloud customers and end users should understand that LLMs and Generative AI are inherently probabilistic and may not always be accurate. Without adequate consideration or controls by customers, use of Generative AI models in healthcare may constitute a hazard to patients due to inaccurate content, missing content, or misleading or biased content.
- Customers should implement appropriate hazard mitigations for all MedLM uses, such as adequate practitioner education, training, assessment of equity, and appropriate technical controls.
- Customers must also perform their own evaluations for performance and safety to ensure prevention of harm for their use cases.
MedLM may produce less accurate results for some groups compared to others depending on the question and how it is posed. Customers should be aware that differing performance of outputs of the model across demographic groups has the potential to exacerbate health inequities and perpetuate harmful biases. Such inaccuracies of outputs are not unique to MedLM and often stem from multiple factors, such as existing social and structural inequities, medical misconceptions, negative stereotypes, and lack of diversity in training data.
- Customers should consider implementing equity-focused evaluations and mitigations. This includes assessing model performance and behavior for intended use cases within various populations (for example, race/ethnicity, socioeconomic status (SES), geography, gender identity, sexual orientation, age, language preference, caste, and so forth); obtaining feedback on performance; engaging interdisciplinary experts and external partners that specialize in defining and addressing social and structural aspects of health; and conducting continuous monitoring efforts to assess and address issues of bias.
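One simple starting point for an equity-focused evaluation is to score model outputs separately for each population group and examine the spread. The following is a minimal sketch, assuming the customer already has a labeled evaluation set in which each record carries a group label and a correctness judgment. The function names and record layout are hypothetical, and accuracy is only one of many metrics that may be relevant.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_group(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute accuracy per group from (group, is_correct) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        correct[group] += int(bool(is_correct))
    return {group: correct[group] / totals[group] for group in totals}


def max_gap(accuracies: Dict[str, float]) -> float:
    """Return the difference between the best and worst group accuracy."""
    if not accuracies:
        raise ValueError("no groups to compare")
    return max(accuracies.values()) - min(accuracies.values())


# Usage with made-up evaluation records (illustrative data only).
records = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False),
]
per_group = accuracy_by_group(records)
gap = max_gap(per_group)
```

A large gap does not by itself explain the cause, but it indicates where closer review, expert feedback, and continued monitoring are warranted.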
Request access
Access to the MedLM models is restricted. To request access, contact your Google Cloud account team.
Provide feedback
Your feedback throughout your experience will help us improve future model versions and ensure that we continue to deliver the best possible experience for our users. Contact [email protected] and copy your Google Cloud account team and Google Cloud Customer Engineer (CE). This email address is not for immediate support. To request immediate support, contact your Google Cloud account team or Google Cloud Customer Engineer (CE).
Email responses will be used as Feedback under the terms of your Agreement for Google Cloud Services and will be collected in accordance with the Google Cloud Privacy Notice. Do not include any personal information (such as names or email addresses) or other sensitive or confidential data in your feedback. Note that data may be reviewed using both human review and automated processing.
Report abuse
You can report suspected abuse of the MedLM API, any generated output that contains inappropriate material, or inaccurate information by using Report suspected abuse on Google Cloud. In the Google Cloud Platform Service list, select Cloud AI.
Pricing
What's next
- See examples of how to create MedLM prompts.