Ensuring Equity in Artificial Intelligence and Machine Learning

June 08, 2022

This week, I am pleased to introduce Susan K. Gregurick, Ph.D., Associate Director for Data Science and Director of the NIH Office of Data Science Strategy (ODSS).

Dr. Gregurick coordinates and oversees the NIH-wide strategic vision for data science. She was instrumental in creating ODSS in 2018 and was a senior advisor to ODSS until being appointed to her current role in 2019. She began her career at the NIH in 2013 as Division Director for Biophysics, Biomedical Technology, and Computational Biosciences at the National Institute of General Medical Sciences (NIGMS).

Under Dr. Gregurick’s leadership, the NIH launched the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program in 2021. AIM-AHEAD is a national initiative to address the lack of diversity in artificial intelligence and machine learning (AI/ML) research and data. I recently spoke with Dr. Gregurick about AIM-AHEAD’s objectives and how it will enhance scientific workforce diversity.

Q. What are the main goals of the AIM-AHEAD program?

The rapid increase in the volume of data generated through electronic health records (EHR) and other biomedical research presents exciting opportunities for developing AI and machine learning approaches for biomedical research and improving healthcare—including health equity. Yet, a lack of diversity in the data used to train AI/ML models and among AI/ML researchers runs the risk of creating and perpetuating harmful biases, thus fostering continued health disparities and inequities.

AIM-AHEAD seeks to enhance the participation of researchers and communities underrepresented in AI/ML model development, use AI/ML to address health disparities and inequities, and improve AI/ML capabilities, beginning with EHR data.

The program will achieve these goals by targeting four related areas: partnerships, research, infrastructure, and data science training. Establishing mutually beneficial partnerships will integrate AI/ML data science research networks with community and clinical research networks to form collaborations and engage underrepresented scientists at various career stages.

Partners will use new and existing datasets to develop and enhance AI/ML algorithms and apply AI/ML approaches to address health inequities and disparities. AIM-AHEAD will also enable a coordinated data and computing infrastructure that enhances the interoperability of data resources. Finally, AIM-AHEAD will provide training opportunities in data science and health disparities research, among other areas, to increase AI/ML analytics capabilities and foster diversity in the discipline.

Q. What are the gaps that AIM-AHEAD will fill?

Biomedical and clinical research use AI/ML in numerous ways, from identifying patterns in patient data to recommending treatment strategies. However, the lack of diversity in AI/ML research and data can perpetuate biases in applications, algorithm development and training, and interpretation of findings. For example, AI algorithms are often trained with data that does not represent the full diversity of a population, which can perpetuate racial, ethnic, and gender biases, leading to health disparities and inequities.

Underrepresented researchers and communities in AI/ML have untapped potential to contribute expertise, data, recruitment strategies, and innovation to this rapidly advancing discipline. In addition, research suggests a relationship between increased workforce diversity and improved patient outcomes, so a diverse AI/ML research community has the potential to better anticipate and detect biases in AI/ML systems and address health disparities.

Q. How will AIM-AHEAD enhance biomedical data science workforce development?

As a national program with regional hubs across the United States, AIM-AHEAD is well-positioned to develop and invest in a diverse scientific workforce. For example, AIM-AHEAD partnerships and evidence-informed training opportunities will enhance researcher diversity at all career stages, including leadership positions. The result will be an inclusive research community poised to advance biomedical data science.

AIM-AHEAD will also catalyze the development of an inclusive and engaged community in biomedical data science by establishing strong criteria for a thorough training needs evaluation, a systematic approach to execution, and accountability for broad inclusion—principles often absent in AI/ML.

Q. Tell me about the new NIH-funded AIM-AHEAD consortium.

In September 2021, the NIH awarded an initial $50 million to the University of North Texas Health Science Center (UNTHSC) to lead the AIM-AHEAD Coordinating Center (A-CC). UNTHSC has partnered with 15 other institutions; together they make up the A-CC and bring expertise in community engagement, health equity research, AI and data science training, and data infrastructure. The A-CC is developing a consortium of institutions with a core mission to serve health disparity populations and an interest in building a more inclusive basis for AI/ML. The consortium will engage in various activities to enhance AI/ML research while developing and sustaining relationships with groups impacted by health disparities, and new collaborators are welcome.

Q. How does AIM-AHEAD fit into the larger NIH vision for equity in biomedical research?

AIM-AHEAD will help the NIH accelerate the pace of biomedical innovation and employ the promise of AI/ML to improve the health of everyone in the nation. Ensuring that the benefits of AI/ML are equitably shared across all groups and populations is a multifactor grand challenge for biomedical and health research.

AIM-AHEAD is a significant NIH investment to address this challenge, with the potential to transform AI/ML by creating a more inclusive foundation for advancing AI/ML systems fairly. It also furthers broader NIH goals to promote the ethical creation and use of data and AI/ML models across the research ecosystem while broadening and diversifying participation in biomedical data science.