Search Results (1 to 10 of 127 Results)


Automated Extraction of Mortality Information From Publicly Available Sources Using Large Language Models: Development and Evaluation Study

LLM: large language model. For example, in the post, “Jane Smith died from a severe infection following surgery. She also had diabetes and hypertension, which contributed to her deteriorating health,” the main cause would be noted as “severe infection following surgery,” and the secondary causes as “diabetes” and “hypertension.” The initial prompt engineering stage ensures that the LLM properly formulates the type of information to extract or predict.

Mohammed Al-Garadi, Michele LeNoue-Newton, Michael E Matheny, Melissa McPheeters, Jill M Whitaker, Jessica A Deere, Michael F McLemore, Dax Westerman, Mirza S Khan, José J Hernández-Muñoz, Xi Wang, Aida Kuzucan, Rishi J Desai, Ruth Reeves

J Med Internet Res 2025;27:e71113
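
The Jane Smith example in the excerpt above can be pictured as a small structured record. The following is only an illustrative sketch: the class name `MortalityRecord` and its field names are assumptions made for this example, not the schema used in the study.

```python
from dataclasses import dataclass, field


@dataclass
class MortalityRecord:
    """Hypothetical container for one extracted post (names are illustrative)."""
    name: str
    main_cause: str
    secondary_causes: list[str] = field(default_factory=list)


# Values copied from the example post quoted in the excerpt.
record = MortalityRecord(
    name="Jane Smith",
    main_cause="severe infection following surgery",
    secondary_causes=["diabetes", "hypertension"],
)
```

Keeping the main cause separate from the list of secondary causes mirrors the distinction the excerpt draws between the two.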

Quo Vadis, AI-Empowered Doctor?

The inadequacy of standard LLM evaluation metrics as grounds for physician workforce reduction has been comprehensively examined previously [14-18]. For example, the performance of medical LLMs is still dependent on the provision of pertinent clinical history information and salient features of the physical examination, and it is still not clear that this critical initial step in successfully identifying the nature of a medical condition can be adequately performed by an LLM.

Gary Takahashi, Laurentius von Liechti, Ebrahim Tarshizi

JMIR Med Educ 2025;11:e70079

Artificial Intelligence (AI) and Emergency Medicine: Balancing Opportunities and Challenges

The current research highlights the need for rigorous validation of LLM outputs, particularly when used for direct patient management. Because AI models are trained on large datasets, their performance is strongest for common presentations. Infrequent conditions such as rare genetic disorders or atypical manifestations of common pathologies are prone to misclassification.

Félix Amiot, Benoit Potier

JMIR Med Inform 2025;13:e70903

Assessing ChatGPT’s Educational Potential in Lung Cancer Radiotherapy From Clinician and Patient Perspectives: Content Quality and Readability Analysis

LLM chatbots like ChatGPT offer a promising solution to mitigate this burden by answering routine patient inquiries and reducing the workload on health care professionals. Furthermore, their ability to simulate conversations enables interactive patient education, improving comprehension and fostering a more informed and empowered patient community [4,8,9]. However, ChatGPT has limitations.

Cedric Richlitzki, Sina Mansoorian, Lukas Käsmann, Mircea Gabriel Stoleriu, Julia Kovacs, Wulf Sienel, Diego Kauffmann-Guerrero, Thomas Duell, Nina Sophie Schmidt-Hegemann, Claus Belka, Stefanie Corradini, Chukwuka Eze

JMIR Cancer 2025;11:e69783

Evaluating a Customized Version of ChatGPT for Systematic Review Data Extraction in Health Research: Development and Usability Study

This accessible LLM software is proficient at understanding and processing human language with high speed and accuracy. As the tool can effectively interpret information from complex texts, it may have strong potential for application in data extraction for systematic reviews. In addition, OpenAI recently released the functionality to build customizable versions of ChatGPT.

Jayden Sercombe, Zachary Bryant, Jack Wilson

JMIR Form Res 2025;9:e68666

Role of Artificial Intelligence in Surgical Training by Assessing GPT-4 and GPT-4o on the Japan Surgical Board Examination With Text-Only and Image-Accompanied Questions: Performance Evaluation Study

ChatGPT has achieved conversational interactivity and a human-level or better correct-answer rate across various fields, including the medical field [6], suggesting that LLM applications could be beneficial in clinical, educational, and research settings [7]. GPT-4, released in March 2023, achieved an excellent correct-answer rate on United States Medical Licensing Examination (USMLE)-style questions, exceeding the passing threshold of 60% [8].

Hiroki Maruyama, Yoshitaka Toyama, Kentaro Takanami, Kei Takase, Takashi Kamei

JMIR Med Educ 2025;11:e69313

Improving Large Language Models’ Summarization Accuracy by Adding Highlights to Discharge Notes: Comparative Evaluation

Therefore, LLM-generated summaries may sometimes omit crucial information or lack proper structure. In this paper, we present the first step in the process of simplifying discharge notes by harnessing the summarization capabilities of LLMs. In this study, we aim to generate more accurate, structured summaries from discharge notes where headers provide clear orientation and make the content easier to understand [25].

Mahshad Koohi Habibi Dehkordi, Yehoshua Perl, Fadi P Deek, Zhe He, Vipina K Keloth, Hao Liu, Gai Elhanan, Andrew J Einstein

JMIR Med Inform 2025;13:e66476

Feasibility of a Randomized Controlled Trial of Large AI-Based Linguistic Models for Clinical Reasoning Training of Physical Therapy Students: Pilot Randomized Parallel-Group Study

After randomization, each student in the experimental group (LLM group) received a personal ChatGPT account (version 3.5) for a period of 1 month. Using this account, the participants solved a total of 4 clinical cases over 4 weeks, one per week, in which the LLM served as a virtual patient and answered the student's questions; based on a physical therapy diagnosis, participants then proposed a treatment for the virtual patient.

Raúl Ferrer-Peña, Silvia Di-Bonaventura, Alberto Pérez-González, Alfredo Lerín-Calvo

JMIR Form Res 2025;9:e66126

Implementing Large Language Models in Health Care: Clinician-Focused Review With Interactive Guideline

To help clinicians and health practitioners select LLMs, we proposed an interactive guideline with a clinical LLM selector tool that relies on a large-scale decision tree containing hundreds of nodes (general description in Figure 1). Using LLM names as keys, we recorded the number of appearances of 330 identified LLMs and their frequency of performing best by clinical task and input and output modalities.

HongYi Li, Jun-Fen Fu, Andre Python

J Med Internet Res 2025;27:e71916
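
The bookkeeping described in the excerpt above (LLM names as keys, with counts of appearances and of best performances by clinical task and input and output modality) could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function `record_result`, the dictionary layout, and the model and task names are all hypothetical and are not taken from the authors' tool.

```python
from collections import defaultdict


def new_entry():
    # One entry per LLM: total appearances, plus how often it performed best
    # for each (task, input modality, output modality) combination.
    return {"appearances": 0, "best_by_setting": defaultdict(int)}


# Keyed by LLM name, as the excerpt describes.
stats = defaultdict(new_entry)


def record_result(llm_name, task, input_modality, output_modality, performed_best):
    """Register one reported evaluation of an LLM (hypothetical helper)."""
    entry = stats[llm_name]
    entry["appearances"] += 1
    if performed_best:
        entry["best_by_setting"][(task, input_modality, output_modality)] += 1


# Made-up observations, for illustration only.
record_result("model_a", "diagnosis", "text", "text", performed_best=True)
record_result("model_a", "diagnosis", "image", "text", performed_best=False)
record_result("model_b", "diagnosis", "text", "text", performed_best=False)
```

A selector built on such a table could then look up, for a given task and modality pair, which model name has the highest best-performance count.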