Award places York U among world leaders in medical AI research, innovation

A four-person team from York University joined hundreds of competitors in a prestigious AI challenge with one goal: to be recognized among those leading advancements for medical decision-making.

PhD students Israt Jahan and Md Tahmid Rahman Laskar, along with professors Jimmy Huang and Chun Peng, have been tackling a critical dilemma in health care by designing AI systems that help practitioners make safer, faster and more informed treatment decisions.

The challenge reflects the growing complexity of modern health care, notes Jahan, whose ongoing doctoral research focuses on AI-powered biomedical systems. Clinicians must manage vast amounts of medical data, keep up with rapidly evolving research and treat patients with multiple conditions requiring carefully tailored care – often under intense time pressure.

Artificial intelligence has the potential to address this, notably through an approach called therapeutic reasoning, in which health care professionals ask AI systems to think through treatment options, flag unsafe drug interactions and organize medical evidence.

Israt Jahan (left) and Md Tahmid Rahman Laskar (right) holding their certificates for placing third in the CURE-Bench competition.

The goal is to develop the technology as a tool to aid complex medical decision-making for clinicians, pharmacists, researchers and health care systems.

Despite its immense potential, therapeutic reasoning is still largely at the investigational and experimental stage because it must be designed to be secure and reliable. “Existing AI systems often generate fluent answers but lack the reliability and safety guarantees needed for clinical use,” says Jahan.

The interdisciplinary team of York researchers brings together expertise from the Department of Biology in the Faculty of Science (Jahan and Peng), the Department of Electrical Engineering and Computer Science in the Lassonde School of Engineering (Laskar) and the School of Information Technology (Huang). Their collaboration combines knowledge of clinical practice and biomedical science with expertise in AI, machine learning and software systems, allowing them to develop solutions that are both technically sophisticated and clinically trustworthy.

The team's previous studies, reported in peer-reviewed publications, have developed AI systems that can support – not replace – health care professionals. “By structuring how AI systems reason about treatments, drug safety and patient-specific factors, we want to reduce avoidable errors and help clinicians work more efficiently under pressure,” says Jahan.

In summer 2025, the team entered the prestigious CURE-Bench challenge, an initiative aimed at advancing research in therapeutic reasoning and more trustworthy medical AI. “We wanted to participate as a way to rigorously evaluate our research ideas in a realistic, high-stakes setting against strong global competition,” says Jahan.

Participants are challenged to submit AI systems that are evaluated on clinically grounded tasks, such as reasoning through treatment options, assessing drug safety, designing care plans and identifying new uses for existing drugs – all practical clinical tasks that health care professionals face every day.

Instead of building a new AI model from the ground up, the team developed prompts to guide existing AI models – including GPT-5 – to answer medical questions reliably and systematically. The prompts instruct the AI to identify the patient, disease and medications involved; check for safety concerns such as drug interactions or contraindications; rule out unsafe or ineffective options; and produce a structured rationale explaining why the final answer was the best choice.
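To make the idea concrete, the four-stage prompt described above might be sketched in Python as follows. This is an illustration only, not the team's actual prompt: the step wording paraphrases the article, the multiple-choice layout is an assumption, and the names `REASONING_STEPS` and `build_prompt` are hypothetical.

```python
from typing import Sequence

# Hypothetical step wording that paraphrases the four stages in the article;
# the team's real prompts are not published here.
REASONING_STEPS = (
    "Identify the patient, the disease and the medications involved.",
    "Check for safety concerns such as drug interactions or contraindications.",
    "Rule out options that are unsafe or ineffective.",
    "Give a structured rationale explaining why the final answer is the best choice.",
)


def build_prompt(question: str, options: Sequence[str]) -> str:
    """Assemble a step-by-step prompt for a question with lettered options.

    The multiple-choice format is an assumption made for this sketch.
    """
    lines = ["Answer the question by working through these steps in order:"]
    # Number the reasoning steps 1..4 so the model can follow them in sequence.
    lines += [f"{i}. {step}" for i, step in enumerate(REASONING_STEPS, start=1)]
    lines += ["", f"Question: {question}", "Options:"]
    # Label the candidate answers A, B, C, ...
    lines += [f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(options)]
    lines += ["", "Final answer (one letter):"]
    return "\n".join(lines)
```

Calling `build_prompt("Which option is safest for this patient?", ["Option one", "Option two"])` returns a single text block that could then be sent to whichever existing model is being evaluated; the call to the model itself is deliberately left out of the sketch.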

Using biomedical expertise, they designed effective prompts to ensure that the AI outputs were clinically sound and accurate; their computer science skills helped process datasets and iterate experiments quickly.

This AI solution was submitted to the competition cycle in September 2025 in the Internal Reasoning Track, a category designed to test how well an AI model can think through complex medical problems on its own, rather than simply retrieving information from external sources.

Submissions are tested automatically on benchmark medical scenarios and scored based on accuracy, reliability, reasoning quality and alignment with real-world clinical practice. Winners are announced at the Annual Conference on Neural Information Processing Systems (NeurIPS), one of the most prestigious and influential conferences in AI.

This year’s event saw approximately 1,467 individual entrants representing about 322 teams, with more than 2,700 total submissions of models and systems.

York's innovation placed third.

The achievement marks a meaningful milestone for the York team. “It showed that careful design and attention to safety can significantly improve the performance of AI models in critical clinical tasks,” says Jahan.

Cross-disciplinary collaboration was also key to the achievement. Huang notes, “The team’s success demonstrates how combining biological expertise with cutting-edge artificial intelligence techniques can lead to impactful innovations in healthcare. It reflects the talent of our graduate students and the strength of interdisciplinary research at York.”

Following this success, the team plans to: improve how AI grounds its answers in medical evidence; test the system across a broader range of clinical scenarios and edge cases; and refine outputs to be clearer and more trustworthy for clinicians. They are also preparing this work for submission to peer-reviewed journals to share findings with the broader research community.

“We hope the result will encourage further research into safer and more reliable therapeutic reasoning systems,” says Jahan.
