A recent study indicates that large language models (LLMs) can provide treatment recommendations for early-stage hepatocellular carcinoma (HCC) that align with established clinical guidelines. Led by Ji Won Han of The Catholic University of Korea and published in the open-access journal PLOS Medicine, the research highlights both the potential and the limitations of AI in managing liver cancer.
Choosing the appropriate treatment for liver cancer is inherently complex. International guidelines exist, but clinicians must tailor their approach to each patient based on factors such as cancer stage, liver function, and comorbidities. The study aimed to determine whether LLMs could generate treatment suggestions that reflect real-world clinical practice.
To explore this, researchers compared the recommendations provided by three LLMs—ChatGPT, Gemini, and Claude—with the actual treatments received by over 13,000 newly diagnosed patients with HCC in South Korea. The findings revealed a substantial alignment between LLM recommendations and treatment decisions for patients with early-stage HCC.
Findings Highlight AI’s Limitations in Advanced Cases
In early-stage patients, higher agreement between AI-generated recommendations and actual treatments correlated with improved survival outcomes. Conversely, in patients diagnosed with advanced-stage HCC, greater alignment was linked to worse survival rates. This suggests that while LLMs may assist in straightforward treatment decisions, their efficacy diminishes in more complex scenarios.
The study noted that LLMs tended to emphasize tumor characteristics, such as size and number, while physicians prioritized liver function. This difference illustrates the nuanced clinical judgment required for advanced cases, where factors beyond tumor metrics are critical in treatment planning.
The authors urged caution when considering LLM advice, emphasizing that it should serve as a supplementary tool rather than a replacement for clinical expertise.
“Our study shows that large language models can help support treatment decisions for early-stage liver cancer, but their performance is more limited in advanced disease. This highlights the importance of using LLMs as a complement to, rather than a replacement for, clinical expertise,” they stated.
As AI technologies continue to develop, understanding their strengths and weaknesses will be crucial in integrating them effectively into healthcare. The findings underscore the potential for AI to aid clinical decision-making but also highlight the importance of human judgment in complex medical cases.
This research is a step toward evaluating the clinical utility of LLMs in oncology and could inform future studies that refine their application in real-world settings. The study appears in PLOS Medicine with a 2026 publication date, contributing to the discourse on AI’s role in healthcare.
