
Cambridge Mathematics of Information in Healthcare

The area under the receiver operating characteristic curve (AUROC) is a staple of machine learning, used both to report model performance and to assess model generalisability. However, CMIH researchers demonstrate that reporting the AUROC alone for a test set not only masks domain shift between validation and test data but also obscures model instability and gives optimistic performance estimates. The researchers highlight the utility of the test AUROC for understanding model concordance and propose several complementary scores that disentangle the effects of domain shift and model instability.
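
As a rough illustration of why the AUROC on its own can hide a shift, the sketch below computes the AUROC as a concordance probability (the fraction of positive/negative pairs that the model ranks correctly) on two toy score sets. The scores, the 0.5 threshold, and the function names `auroc` and `sensitivity_specificity` are all hypothetical and chosen for illustration; this is not the set of complementary scores proposed in the paper, only a minimal example of a ranking metric staying constant while threshold-based behaviour changes.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as concordance: P(positive score > negative score), ties count 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Sensitivity and specificity when predicting positive for score >= threshold."""
    sens = sum(s >= threshold for s in scores_pos) / len(scores_pos)
    spec = sum(s < threshold for s in scores_neg) / len(scores_neg)
    return sens, spec


# Hypothetical validation scores for positive and negative cases.
val_pos = [0.9, 0.8, 0.7, 0.4]
val_neg = [0.6, 0.3, 0.2, 0.1]

# Hypothetical "shifted" test scores: every score is halved, so the ranking
# of cases is unchanged but the score distribution has moved.
test_pos = [0.5 * s for s in val_pos]
test_neg = [0.5 * s for s in val_neg]

threshold = 0.5  # assumed operating point, fixed on the validation data

print("validation AUROC:", auroc(val_pos, val_neg))
print("test AUROC:      ", auroc(test_pos, test_neg))
print("validation sens/spec:", sensitivity_specificity(val_pos, val_neg, threshold))
print("test sens/spec:      ", sensitivity_specificity(test_pos, test_neg, threshold))
```

In this toy case both AUROC values are identical because the AUROC depends only on the ordering of scores, yet the sensitivity and specificity at the fixed threshold differ sharply between the two sets. A single test AUROC would therefore give no hint that the score distribution had moved.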

Read more about this work in Nature Machine Intelligence.

Roberts, M., Hazan, A., Dittmer, S. et al. The curious case of the test set AUROC. Nat Mach Intell (2024). https://doi.org/10.1038/s42256-024-00817-7
