Fig. 4
From: Measuring the bias of incorrect application of feature selection when using cross-validation in radiomics

Mean bias in AUC-ROC for each combination of feature selection and classification method over all datasets. To obtain these values, for each of the 10 datasets, the largest difference in AUC-ROC between correct and incorrect application of CV is computed for a given combination, resulting in 10 differences. The mean of these differences is shown, with the corresponding range in parentheses below it. Because the displayed mean averages over all datasets, it does not allow conclusions about the bias on any single dataset. For example, using SVM-RFE with a random forest shows almost no bias in the mean (−0.01), but the difference for a single dataset can be as high as 0.07 in AUC-ROC.
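
The aggregation described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `summarize_bias`, the dataset labels, and the numbers are all invented for the example, and it assumes that one signed AUC-ROC difference per dataset has already been computed (the sign convention, incorrect minus correct, is also an assumption). It only shows how a mean close to zero can coexist with a noticeably larger difference on a single dataset.

```python
from statistics import mean


def summarize_bias(diff_by_dataset):
    """Return (mean, lowest, highest) of per-dataset AUC-ROC differences.

    diff_by_dataset maps a dataset label to one signed difference in
    AUC-ROC between incorrect and correct application of CV
    (assumed convention: incorrect minus correct).
    """
    values = list(diff_by_dataset.values())
    return mean(values), min(values), max(values)


# Hypothetical values for four made-up datasets (not taken from the paper).
example = {
    "dataset_a": 0.06,
    "dataset_b": -0.04,
    "dataset_c": -0.03,
    "dataset_d": 0.01,
}

avg, lowest, highest = summarize_bias(example)
print(f"mean bias: {avg:.2f} (range {lowest:.2f} to {highest:.2f})")
```

In this made-up case the mean is essentially zero while one dataset differs by 0.06, which is why the ranges in parentheses need to be read alongside the means.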