AI Health Data Studio: Hands-On Digital Pathology

[Image from the CAMELYON16 ISBI challenge on cancer metastasis detection]

Friday, October 28, 2022 | 12:00 – 4:00 PM (Eastern time)

Live in person at Gross Hall 230E Seminar Room

Presented by: 

  • Ricardo Henao, PhD; Associate Professor, Department of Biostatistics and Bioinformatics; Chief AI Scientist, Duke AI Health
  • Akhil Ambekar, MS; Fellow, AI Health Data Science Fellowship Program
  • with Shelley Rusincovitch, MMCi; Managing Director, Duke AI Health

Optional: Register to receive links to the coding notebooks and other instructions. Drop-ins are also welcome!

Do Machine Learning in Just One Afternoon!

This in-person workshop will give you hands-on experience in working with medical digital pathology images using machine learning.

Our use case is whole-slide images of lymph node sections. We will use the CAMELYON16 dataset, which consists of 400 hematoxylin and eosin-stained whole-slide images. During the workshop, you will learn how to retrieve, manage, and process these images, then apply a machine learning model based on a neural network architecture to classify image regions as normal or malignant. The techniques you learn will also be broadly applicable to other types of medical imaging.
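To give a feel for what "classifying image regions" means in practice, here is a minimal sketch in Python. It is not the workshop's notebook code. It assumes a slide is split into fixed-size square patches and that a trained model returns a malignancy probability per patch. The names tile_coordinates, classify_regions, and toy_model are hypothetical, and toy_model is only a stand-in for a real neural network.

```python
from typing import Callable, Iterator, List, Tuple


def tile_coordinates(
    width: int, height: int, patch_size: int, stride: int
) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) corner of every full patch that fits in the image."""
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            yield x, y


def classify_regions(
    width: int,
    height: int,
    patch_size: int,
    stride: int,
    predict_malignant: Callable[[int, int, int], float],
    threshold: float = 0.5,
) -> List[Tuple[int, int, str]]:
    """Label each patch as 'normal' or 'malignant' from a per-patch probability."""
    results = []
    for x, y in tile_coordinates(width, height, patch_size, stride):
        probability = predict_malignant(x, y, patch_size)
        label = "malignant" if probability >= threshold else "normal"
        results.append((x, y, label))
    return results


def toy_model(x: int, y: int, size: int) -> float:
    """Hypothetical stand-in for a trained network: flags the right half of the image."""
    return 0.9 if x >= 512 else 0.1


# A small 1024 x 512 "slide" cut into non-overlapping 256 x 256 patches.
labels = classify_regions(1024, 512, 256, 256, toy_model)
for x, y, label in labels:
    print(f"patch at ({x}, {y}): {label}")
```

In a real pipeline, the stand-in model would be replaced by a network that reads the pixels of each patch, and patches containing only background would typically be skipped before classification.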

The workshop is divided into 4 sections:

  • 12:00-12:45: Concept of the analysis and goals for data processing
  • 1:00-1:45: Hands-on studio: Data processing for digital pathology
  • 2:00-2:45: Conceptual basis of the machine learning
  • 3:00-3:45: Hands-on studio: Implementation of the ML model

You are welcome to attend all 4 parts, or just drop in for part of the afternoon. All are welcome, and no prior experience is required. We especially encourage graduate students, medical trainees, or faculty/staff interested in these methods.

We will use pre-built software containers running PyTorch and Jupyter Notebooks, hosted by OIT’s Container Manager. The containers are accessible from your web browser, so you will not need to install any software to implement the data processing and ML modeling steps. The coding notebooks will also be shared on GitHub for anyone to use.

About the Series

The Data Studios are designed to engage the Duke community in learning how to work with medical data, including campus-based methodology researchers who are interested in medical applications but may be unfamiliar with real-world medical datasets. We use publicly available data so that anyone can run the code. The Data Studios are part of the Health Data Science (HDS) program, AI Health’s experiential learning and research hub.