Mitigating Feature Bias in DL Models for Cervical Cytology

1 Silicon Institute of Technology, Bhubaneswar; 2 Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center
WiML, NeurIPS 2024

TL;DR

Cervical cancer, the fourth most common cancer among women globally, poses significant health challenges. Deep learning (DL) aids cervical cytology diagnostics, but feature bias inherent in clinical datasets limits its effectiveness. This bias arises from variability in feature representations within classes (e.g., Squamous Cell Carcinoma) and causes inconsistent model performance across samples of the same class. Our study introduces a bias-mitigating approach that categorizes data points into high- and low-density feature cohorts based on feature probability distributions. Using DeiT as a feature extractor on the CRIC dataset for an 8-way classification task, we demonstrate that oversampling low-density cohorts during training reduces the AUC disparity between cohorts by 73.42%. The result is improved diagnostic accuracy and consistency in DL-based cervical cytology workflows.
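The cohort split and oversampling step can be sketched as follows. This is a minimal illustration, not the study's implementation: the summary above does not specify the density estimator, the cutoff, or the oversampling ratio, so the diagonal Gaussian, the `low_fraction` cutoff, and the `factor` parameter are all assumptions, as are the function names. In practice the feature vectors would be embeddings from the feature extractor; toy 2-D vectors for a single class stand in for them here.

```python
import math
import random
from statistics import mean, pvariance


def log_density(x, mus, variances):
    """Log-density of x under a diagonal Gaussian (assumed estimator)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mus, variances)
    )


def split_cohorts(features, low_fraction=0.5):
    """Split one class's samples into (high, low) density index lists.

    Samples whose log-density is at or below the cutoff score form the
    low-density cohort. The cutoff rule is an assumption of this sketch.
    """
    dims = list(zip(*features))
    mus = [mean(d) for d in dims]
    # Small constant avoids division by zero for constant dimensions.
    variances = [pvariance(d) + 1e-6 for d in dims]
    scores = [log_density(x, mus, variances) for x in features]
    cutoff = sorted(scores)[int(low_fraction * (len(scores) - 1))]
    high = [i for i, s in enumerate(scores) if s > cutoff]
    low = [i for i, s in enumerate(scores) if s <= cutoff]
    return high, low


def oversample_low(high, low, factor=2, seed=0):
    """Return training indices with low-density samples drawn `factor` times as often."""
    rng = random.Random(seed)
    extra = rng.choices(low, k=len(low) * (factor - 1)) if low else []
    return high + low + extra


# Toy features for one class: a tight cluster plus two atypical samples.
features = [
    [0.0, 0.0], [0.1, -0.1], [-0.1, 0.1], [0.05, 0.0],
    [3.0, 3.0], [-3.0, 2.5],
]
high, low = split_cohorts(features)
train_idx = oversample_low(high, low, factor=2)
```

Splitting per class rather than over the whole dataset keeps the cohorts from simply mirroring class frequency. The same split can be reused at evaluation time to report AUC separately for each cohort and measure the disparity between them.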
