
“Our world has changed dramatically in the last few months,” stated Dr. Michael Pencina, Director of Duke AI Health and Vice Dean for Data Science at the Duke School of Medicine, during his introduction. He was referring in part to the impact that the release of ChatGPT has recently had on society. A few days later, that impact was described by the renowned philosopher and thinker Yuval Noah Harari, who argued that storytelling AI “has hacked the operating system of human civilization and will change the course of human history.” Emerging AI and machine learning technologies can have harmful and unpredictable consequences for privacy, confidentiality, unjustified surveillance, and discrimination. These challenges also apply to research and innovation, where AI tools are already widely used, from medical research, health innovation, and public health practice to engineering, the social sciences, and the fundamental sciences.

In his overview, Dr. Pencina contrasted predictive AI, such as diagnostic tools, with generative AI, such as ChatGPT, which creates new content from existing data, with potentially unintended and unethical consequences. At the same time, Pencina added, generative AI has opened many exciting windows for medical research and practice. AI could support the clinical process by drafting clinical notes from the conversation between doctor and patient, building registries from large amounts of data, or drafting a grant application. “A clever way of using the GPT bot tool that I’ve seen is to ask the chatbot to read a scientific paper and then ask it to answer a question from the paper. I experimented with this myself and it turned out very well,” Pencina said.
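The read-then-answer pattern Pencina describes can be sketched in a few lines. This is an illustrative sketch only: the function name `build_paper_qa_prompt`, the prompt wording, and the `ask_llm` client mentioned in the comment are assumptions, not anything shown in the talk.

```python
def build_paper_qa_prompt(paper_text: str, question: str) -> str:
    """Combine a paper's full text and a question into one prompt,
    following the read-the-paper-then-answer pattern from the talk."""
    return (
        "Read the following scientific paper carefully.\n\n"
        f"--- PAPER ---\n{paper_text}\n--- END PAPER ---\n\n"
        f"Based only on the paper above, answer this question: {question}"
    )

# The assembled prompt would then be sent to whatever chat-model client
# the reader has available, e.g. answer = ask_llm(prompt).
prompt = build_paper_qa_prompt(
    "Example paper text goes here.",
    "What was the primary outcome?",
)
print(prompt.startswith("Read the following"))  # → True
```

Grounding the model in the pasted paper text, and asking it to answer “based only on the paper,” is what distinguishes this use from asking the chatbot an open-ended question it may answer from unreliable memory.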

Despite its many promises, Dr. Pencina emphasized the need for humans to evaluate the technology, which is “like a blind machine reading without recognition.” “We want a human envelope for evaluation of the technology,” said Pencina. Even though this would slow innovation, safeguards are needed to manage bias and other crucial challenges. While many entities recognize the need to regulate AI, the resulting “regulatory offensive” poses problems of its own.

For developing predictive AI, he recommended eight considerations from a paper he published in 2020 in the New England Journal of Medicine: population at risk, outcome of interest, time horizon, predictors, mathematical model, model evaluation, translation to clinical decision support (CDS), and clinical implementation.
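The eight considerations amount to a checklist that a model specification should answer before development begins. The sketch below captures them as a simple data structure; the class, its field names (which paraphrase the headings above), and the chronic-kidney-disease example are all hypothetical illustrations, not an official schema from the paper.

```python
from dataclasses import dataclass, fields

@dataclass
class PredictiveModelSpec:
    """One field per consideration in the 2020 NEJM checklist (paraphrased)."""
    population_at_risk: str       # who the model applies to
    outcome_of_interest: str      # what event is being predicted
    time_horizon: str             # over what period the prediction holds
    predictors: str               # candidate input variables
    mathematical_model: str       # modeling approach
    model_evaluation: str         # discrimination, calibration, validation plan
    translation_to_cds: str       # how output reaches clinical decision support
    clinical_implementation: str  # workflow integration and monitoring

    def is_complete(self) -> bool:
        """True only if every consideration has been filled in."""
        return all(getattr(self, f.name).strip() for f in fields(self))

# Hypothetical example for illustration only.
spec = PredictiveModelSpec(
    population_at_risk="adults with stage 3-4 chronic kidney disease",
    outcome_of_interest="progression to kidney failure",
    time_horizon="5 years",
    predictors="eGFR, albuminuria, age, sex",
    mathematical_model="Cox proportional hazards",
    model_evaluation="external validation; discrimination and calibration",
    translation_to_cds="risk score surfaced in the EHR",
    clinical_implementation="referral pathway with ongoing monitoring",
)
print(spec.is_complete())  # → True
```

Treating the checklist as a required structure, rather than prose, makes it easy to flag a model proposal that leaves any of the eight considerations blank.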

Laura Barisoni, MD, Professor of Pathology and Medicine and Co-director of the Division of AI & Computational Pathology at the Duke School of Medicine, discussed the opportunities and challenges of AI in digital pathology. The big opportunity is the potential for enhanced pathological analysis, with the goal of more precise diagnoses, relevant disease stratification, and data-driven selection of patient treatments. But there are challenges to manage, Barisoni explained. In terms of infrastructure, for example, training new personnel in the proper use of AI technologies can be time-consuming. In terms of image analysis, standardizing analytics and validating and interpreting algorithms can be difficult. And there are further barriers related to informatics, data integration, and visualization (see the slide deck for more details).

Christina Silcox, PhD, Research Director for Digital Health and Adjunct Assistant Professor at the Duke-Margolis Center for Health Policy, focused her talk on the regulatory challenges surrounding AI. Software that analyzes data can be difficult to regulate when the underlying data change rapidly and when there are concerns about bias. Other challenges concern the limited generalizability of AI. “AI technologies built with data from very clean data sources may not work with real-world data,” Dr. Silcox said. A third major challenge is the speed at which software is updated, which creates problems for the regulatory workforce, for communicating changes to users, and for surveillance. A fourth challenge is the lack of machine-learning explainability: like a black box, “a machine cannot tell you how the software works and there is no humanly comprehensible explanation of how the inputs are combined to come to a result,” she explained. Dr. Silcox presented some of the regulatory initiatives launched by the Food and Drug Administration, including the Predetermined Change Control Plan, a methodology for developing, validating, and implementing modifications to software-enabled medical devices and for assessing the risks and benefits of those changes. Even with detailed regulations and methodologies, however, questions remain around liability and oversight (see the slide deck for more details).

Nicoleta Economou, PhD, Director of Algorithm-Based Clinical Decision Support (ABCDS) Oversight, leads the operations and framework-design effort for the governance, evaluation, and monitoring of AI algorithms and tools deployed at Duke. “There are inherent risks we need to be aware of when developing algorithmic tools,” said Economou. To address these challenges and build trust in AI, she helped stand up the ABCDS Oversight framework, under which AI algorithms and other algorithmic tools developed both externally and internally at Duke are monitored and put through a series of evaluation checkpoints to “understand their risk-benefit as it relates to bias, safety, and efficacy.”

“Our primary focus is on patient safety and high-quality care, with the mission to guide all algorithms that are to be deployed in the health system through their life cycle by providing governance, evaluation, and monitoring,” Economou said. According to her, ABCDS Oversight is a continuous-improvement initiative that follows the evolving landscape for AI-enabled technologies and constantly learns from the implementation of its framework. In the two years since the ABCDS Oversight Committee began implementing its framework, it has observed an overall enhancement of the quality of tools deployed at Duke and a better understanding of how these tools are integrated into the health system to ensure they are fair, safe, efficacious, and meeting patient needs. “We will continue to learn and improve our process and expand it as we strive to improve overall the care we're delivering to our patients,” said Economou.

From the audience, Jeffrey Swanson, MD, asked the speakers about managing the challenges of an algorithm his research team wants to create to predict suicide risk in veterans who use non-VA health systems. (Watch the event recording to view this conversation.)

Edward Tian, an undergraduate student in Computer Science and Journalism at Princeton University who created GPTZero, invited Duke students and researchers to contact him (watch the event recording for his contact information) if they are interested in joining a research team studying AI challenges and ways to counteract them.