Machine learning combined with radiomics and deep learning features extracted from CT images: a novel AI model to distinguish benign from malignant ovarian tumors
Insights into Imaging volume 14, Article number: 68 (2023)
To develop an artificial intelligence (AI) model with radiomics and deep learning (DL) features extracted from CT images to distinguish benign from malignant ovarian tumors.
We enrolled 149 patients with pathologically confirmed ovarian tumors. A total of 185 tumors were included and divided into training and testing sets in a 7:3 ratio. All tumors were manually segmented from preoperative contrast-enhanced CT images. CT image features were extracted using radiomics and DL. Five models with different combinations of feature sets were built. Benign and malignant tumors were classified using machine learning (ML) classifiers. The model performance was compared with five radiologists on the testing set.
Among the five models, the best-performing model was the ensemble model, which combined the radiomics, DL, and clinical feature sets. The model achieved an accuracy of 82%, a specificity of 89%, and a sensitivity of 68%. Compared with the junior radiologists’ averaged results, the model had higher accuracy (82% vs 66%) and specificity (89% vs 65%) with comparable sensitivity (68% vs 67%). With the assistance of the model, the junior radiologists achieved higher average accuracy (81% vs 66%), specificity (80% vs 65%), and sensitivity (82% vs 67%), approaching the performance of senior radiologists.
We developed a CT-based AI model that can differentiate benign and malignant ovarian tumors with high accuracy and specificity. This model significantly improved the performance of less-experienced radiologists in ovarian tumor assessment, and may potentially guide gynecologists to provide better therapeutic strategies for these patients.
CT-based radiomics and deep learning features could differentiate ovarian tumors.
Radiomics, deep learning features, and clinical data provided complementary tumor information.
The ensemble model improved the radiologists’ performance in assessing ovarian tumors.
Ovarian cancer is the leading cause of gynecological cancer-related deaths, and a misdiagnosis may delay treatment and worsen the prognosis. Expedited referral of patients with ovarian cancer to a gynecologic oncologist for complete surgical staging and optimal cytoreduction correlates with better survival rates. In contrast, patients with benign ovarian tumors need only conservative treatment or laparoscopic cystectomy. Therefore, accurate distinction between benign and malignant ovarian tumors is of paramount importance in guiding treatment, and it remains a great challenge in clinical practice.
Currently, the distinction between benign and malignant ovarian tumors is largely based on imaging appearance [4,5,6]. Ultrasound is typically the first-line imaging tool for screening. Owing to its excellent spatial resolution and wide availability, computed tomography (CT) is often ordered for further tumor characterization. However, definitive differentiation between benign and malignant ovarian tumors by CT remains challenging, especially when excluding malignancy in multiseptated cystic tumors. Given that benign ovarian tumors greatly outnumber malignant ones, it is not uncommon for patients with tumors of indeterminate imaging features to undergo surgery, only for the tumors to be proven benign later. It is estimated that approximately 28% of oophorectomies are performed for benign tumors. These unnecessary surgeries are a major clinical concern, with long-term consequences of decreased fertility and premature menopause [8, 9]. Therefore, a noninvasive method that can accurately distinguish benign from malignant ovarian tumors, preventing delayed treatment in malignant cases and sparing patients with benign tumors from unnecessary surgery, would have significant clinical impact.
Artificial intelligence (AI) has been shown to improve the performance of tumor detection, tumor classification, and treatment monitoring in cancer imaging [10,11,12,13]. In contrast with subjective radiological evaluation by humans, image feature extraction using radiomics or deep learning (DL) can provide quantitative image information that is undetectable by the human eye and has shown promising results in tumor analysis [14,15,16,17,18,19,20,21,22,23,24,25]. Several recent studies used radiomics on CT images and applied machine learning (ML) classifiers to differentiate ovarian tumors [26,27,28]. However, there is limited research on applying DL to differentiate ovarian tumors using CT images. Christiansen et al. and Wang et al. applied DL for ovarian tumor differentiation using ultrasound and magnetic resonance imaging (MRI), respectively. Apart from studies that directly applied DL networks for ovarian tumor differentiation, there have been few studies using DL networks for feature extraction from CT images to predict ovarian cancer recurrence or classify pulmonary nodule subtypes [24, 25]. To the best of our knowledge, the performance of ML based on combined radiomics and DL features extracted from CT images in differentiating ovarian tumors remains unknown.
In this study, we aimed to develop a CT-based AI model with feature extraction using radiomics and DL to distinguish benign from malignant ovarian tumors. We applied classifiers with radiomics and DL features extracted from CT images to classify benign and malignant ovarian tumors. The performance of various combinations of classifiers and feature sets was compared with that of radiologists on the classification task, using pathologic diagnosis as the gold standard. Moreover, the improvement in radiologists’ performance with the assistance of the optimal model was also assessed.
In this institutional review board-approved study, we retrospectively collected 245 consecutive patients with suspected ovarian tumors from the MacKay Memorial Hospital between July 2018 and December 2019. Patients meeting the following criteria were included: (1) pathologically confirmed ovarian tumor resected by surgery, (2) contrast-enhanced CT scan performed prior to surgery, (3) clear CT images without artifacts and fit for analysis. The final cohort consisted of 149 patients with 185 ovarian tumors (Fig. 1).
The data were divided into training and testing sets in a 7:3 ratio. The training set was used to develop five models with different combinations of feature sets: radiomics model, DL model, clinical model, combined radiomics and DL model, and ensemble model (combined radiomics, DL, and clinical feature sets). The models were then tested on the unseen testing set. Figure 2 illustrates the flowchart of study design.
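The 7:3 division described above can be sketched as a seeded shuffle followed by a cut. This is only an illustration: the function name `split_7_3`, the seed, the placeholder IDs, and the choice to split at the tumor level with floor rounding are all assumptions, since the source does not state how the split was drawn or the exact set sizes.

```python
import random


def split_7_3(item_ids, seed=0):
    """Shuffle the IDs reproducibly and return (train, test) in a 7:3 ratio."""
    ids = list(item_ids)
    random.Random(seed).shuffle(ids)
    # Integer arithmetic avoids floating-point surprises when computing 70%.
    n_train = (len(ids) * 7) // 10
    return ids[:n_train], ids[n_train:]


# Placeholder IDs, one per tumor (185 tumors in the final cohort).
tumor_ids = list(range(185))
train_ids, test_ids = split_7_3(tumor_ids)
print(len(train_ids), len(test_ids))
```

When one patient contributes two tumors, splitting by patient ID instead of tumor ID would keep both tumors in the same set; whether the study did so is not stated.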
Image acquisition and segmentation
CT examinations were performed on 4 different multidetector CT scanners: Siemens Somatom Definition Flash, Siemens Somatom Definition AS, Toshiba Aquilion ONE (TSX-301C), Toshiba Aquilion PRIME (TSX-303A). The scanning parameters were as follows: tube voltage, 120 kVp; tube current, 200–230 mA; gantry rotation time, 0.5 s; beam pitch, 1.0; reconstruction thickness, 2 mm; reconstruction interval, 1.5 mm. Contrast medium (Iodine concentration: 300 mg/mL) 80–100 mL was injected using a mechanical injector at a rate of 2.5–3.5 mL/sec. The time delay from contrast agent injection to image acquisition was 70 s.
The preoperative contrast-enhanced CT images were collected from the PACS. Tumors were manually segmented by an experienced radiologist using 3D Slicer (IEEE Cat No. 04EX821). The boundary of the whole tumor was manually defined on each axial CT slice.
Feature extraction, selection, and tumor classification
After resolution and intensity normalization, radiomics features were extracted from the tumor images. A total of 129 radiomics features were extracted from each tumor, including 12 histogram features, 9 gray-level co-occurrence matrix (GLCM) features, 96 wavelet features, and 12 Laplacian of Gaussian (LoG) features (Additional file 1: Table S1).
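To make the GLCM feature family concrete, the following minimal sketch builds a symmetric, normalized co-occurrence matrix for a tiny 2D image and derives one texture feature (contrast) from it. The 4-level toy image, the single horizontal offset, and the function names are illustrative assumptions; the study's actual gray-level binning, offsets, and 3D handling are listed only in Additional file 1 and are not reproduced here.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared gray-level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Toy image with gray levels 0..3 (not CT data).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy_image, levels=4)
contrast = glcm_contrast(p)
print(round(contrast, 4))
```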
In addition to radiomics, a 3D U-Net convolutional neural network (CNN) was applied as a feature extractor. Figure 3 illustrates the architecture of the U-Net applied in this study, which consists of an encoder and a decoder. The basic idea of using the U-Net as a feature extractor is that the features extracted by the encoder from an input tumor image can represent the tumor if the image reconstructed by the decoder from those features is similar to the input image [31,32,33,34]. In this study, the U-Net was trained on 90% and validated on 10% of the training set using the Adam optimizer with a half mean squared error loss function. A batch size of 1 was used because of the limited memory of the graphics card. The learning rate and the number of training epochs were adjusted based on the average root mean squared error (RMSE) between the input and reconstructed images, so that the images reconstructed by the decoder were as similar as possible to the input images. After the tumor images were input to the trained U-Net, the features output by the last activation layer of the encoder were adopted as the DL features of the tumor. For each tumor, 224 DL features were extracted.
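The two reconstruction-error quantities mentioned above can be illustrated on flattened voxel intensities. This sketch assumes both are averaged over voxels; the exact normalization used by the training framework in the study may differ, and the sample values are made up.

```python
import math


def half_mse(original, reconstructed):
    """Half of the mean squared error between two equally sized voxel lists."""
    n = len(original)
    return 0.5 * sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n


def rmse(original, reconstructed):
    """Root mean squared error between two equally sized voxel lists."""
    n = len(original)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n)


# Made-up intensities standing in for an input image and its reconstruction.
input_voxels = [10.0, 20.0, 30.0, 40.0]
reconstructed_voxels = [12.0, 18.0, 33.0, 40.0]

loss_value = half_mse(input_voxels, reconstructed_voxels)
rmse_value = rmse(input_voxels, reconstructed_voxels)
print(loss_value, round(rmse_value, 4))
```

The loss is what the optimizer minimizes, while RMSE is in the same units as the image intensities and is therefore easier to interpret when tuning the learning rate and epoch count.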
Using radiomics and the U-Net, 353 features were extracted from each tumor. However, classification performance with such a large number of features could be poor because of multicollinearity and overfitting. We used least absolute shrinkage and selection operator (LASSO) regression with tenfold cross-validation to eliminate irrelevant features. Features with regression coefficients > 0.1 were selected for the classification.
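The selection step can be sketched with a small coordinate-descent LASSO on standardized features, followed by the coefficient threshold. Several things here are assumptions rather than the authors' procedure: the penalty `lam` is fixed instead of being chosen by tenfold cross-validation, the threshold is applied to the absolute value of each coefficient, and the toy data (one informative feature, two uninformative ones) are invented for illustration.

```python
def standardize(X):
    """Center each column and scale it to unit (population) standard deviation."""
    n, p = len(X), len(X[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        for i in range(n):
            out[i][j] = (col[i] - mean) / sd
    return out


def soft_threshold(value, lam):
    """Shrink a value toward zero by lam; values within [-lam, lam] become zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - mean(y) - Xb||^2 + lam*||b||_1 for centered columns of X."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residual while all coefficients are zero
    coef = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * coef[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                coef[j] = new
    return coef


# Toy data: feature 0 tracks the label exactly, features 1 and 2 do not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
raw = [[labels[i], i % 2, (i // 2) % 2] for i in range(8)]
coefficients = lasso_coordinate_descent(standardize(raw), labels, lam=0.05)
selected = [j for j, b in enumerate(coefficients) if abs(b) > 0.1]
print(coefficients, selected)
```

The L1 penalty drives the coefficients of uninformative features to exactly zero, which is why a simple threshold on the fitted coefficients can serve as a feature filter.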
After feature selection, benign and malignant tumors were classified using four classifiers, namely K-nearest neighbor (KNN), support vector machine (SVM), logistic regression (LR), and random forest (RF), with five types of feature sets: radiomics features, DL features, clinical features, combined radiomics and DL features, and ensemble features (all features combined). Each classifier output a probability (0–100%) of malignancy for each tumor. The performance of the different combinations of classifiers and feature sets was evaluated and compared on the training data with tenfold cross-validation. In this study, feature extraction, feature selection, and classifier training and evaluation were implemented in MATLAB R2020a (MathWorks, Natick, MA).
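Tenfold cross-validation on the training data can be sketched as follows. The generator name, the seed, and the sample count of 100 are placeholders; the study itself used MATLAB, and whether its folds were stratified by class is not stated.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 100 is a placeholder sample count, not the study's training-set size.
splits = list(k_fold_indices(100, k=10))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, len(train), len(validation))
```

Each sample is used for validation exactly once, so averaging a metric over the ten folds gives a performance estimate that does not touch the held-out testing set.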
Based on the years of experience reading abdominal CT images, radiologists were divided into two groups, including juniors (3 radiologists, experience < 10 years) and seniors (2 radiologists, experience > 10 years). All radiologists were blinded to patients’ pathologic diagnoses. They were asked to independently interpret the CT images of the testing set and record each tumor as benign or malignant with the given information of patients’ age and CA-125 level. After one month, they were asked to interpret the images again with the assistance of the best performing model.
To evaluate the performance of the AI models and radiologists, the following indices were calculated: accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), and F1 score. Interobserver reliability was assessed using Krippendorff’s alpha coefficient. When comparing clinical characteristics between groups, differences in continuous and categorical variables were examined using the independent-samples t-test and the chi-squared test, respectively. A p value < 0.05 was considered statistically significant. Statistical analysis was performed using SPSS version 24.0 (IBM Corporation, Armonk, NY, USA).
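As an illustration of the AUC, the sketch below uses its rank interpretation: the probability that a randomly chosen malignant tumor receives a higher score than a randomly chosen benign one, with ties counted as one half. The labels and scores are made up, and this is not the SPSS procedure used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = malignant, 0 = benign; scores are predicted probabilities.
example_labels = [1, 1, 1, 0, 0, 0, 0]
example_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(example_labels, example_scores)
print(round(auc, 4))
```

Because the AUC depends only on how cases are ranked, it can be high even when a fixed decision threshold yields low sensitivity.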
The final cohort consisted of 149 patients with 185 ovarian tumors, 112 benign and 73 malignant. The patients’ age ranged from 18 to 80 years old (mean 46.4 ± 12.4 years). There were 78 patients (52.3%) with elevated CA-125 and 36 patients (24.2%) with bilateral tumors. There were significant differences in age (p < 0.0001), tumor volume (p < 0.0001), and CA-125 (p = 0.0003) between the benign and malignant groups (Table 1). The training and testing sets were balanced in terms of all clinical variables (Additional file 1: Table S2). Tumor histological subtypes are summarized in Table 2. For classification purposes, borderline and malignant tumors were grouped into a single category and referred to as malignant.
Feature selection and tumor classification
The features selected by the LASSO method are detailed in Table 3. In the radiomics model, 4 features were selected from the initial 129 radiomics features. For the DL features, the feature-extraction DL model (U-Net) was trained with a learning rate of 0.001 for 25 epochs. The average RMSE between the input and reconstructed images was 25.45 ± 39.05. For the DL model, four features were selected from the initial 224 DL features. In the combined radiomics and DL model, 6 features were selected from the total of 353 radiomics and DL features: one radiomics feature and five DL features. The clinical model had four clinical features: age, CA-125, tumor volume, and tumor side. The ensemble model consisted of 10 features: the 4 clinical features and the 6 features used in the combined radiomics and DL model. The detailed model performance on the training and testing sets using the different classifiers (KNN, SVM, LR, and RF) can be found in Additional file 1: Tables S3–S4. Because the LR classifier performed better overall than the other classifiers on the testing set, its results are presented for the rest of the study.
Performance of AI models
The performance metrics of the AI models and radiologists on the testing set are summarized in Table 4. The accuracies of the models, in descending order, were: ensemble model 82%, DL model 73%, clinical model 73%, combined radiomics and DL model 71%, and radiomics model 61%. The best-performing model was the ensemble model, with the highest accuracy (82%), sensitivity (68%), negative predictive rate (85%), and F1 score (0.72). The ensemble model achieved a specificity of 89%, an AUC of 0.83, and a positive predictive rate of 77%. The DL model had the highest AUC (0.89), specificity (100%), and positive predictive rate (100%) but the lowest sensitivity (21%).
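The relationship between these metrics can be worked through from a single confusion matrix. The counts below are hypothetical: the source does not report the confusion matrix, and these values were chosen only because they are consistent with the reported accuracy, sensitivity, specificity, negative predictive rate, and F1 score of the ensemble model (the positive predictive rate comes out at about 76.5%, close to the reported 77%).

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_rate": tp / (tp + fp),
        "negative_predictive_rate": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts (malignant = positive); NOT taken from the source.
metrics = classification_metrics(tp=13, fn=6, tn=33, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The example shows how a model can reach high accuracy and specificity with only moderate sensitivity when benign cases outnumber malignant ones in the testing set.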
Performance of radiologists
The senior radiologists achieved higher accuracy, specificity, AUC, positive predictive rate, and F1 score than all junior radiologists (Table 4). With AI model assistance, all junior radiologists showed an overall improvement in performance metrics, while the senior radiologists had only mild improvement in accuracy, AUC, and F1 score. The interobserver reliability of junior radiologists (Krippendorff’s alpha, 0.4757 vs 0.6333) and senior radiologists (Krippendorff’s alpha, 0.4806 vs 0.7331) also revealed improvement with AI assistance. The averaged performance results of radiologists are summarized in Table 5. With the assistance of ensemble model, the junior radiologists achieved a significant improvement in averaged accuracy (81% vs 66%), sensitivity (82% vs 67%), and specificity (80% vs 65%) that were comparable with senior radiologists. The senior radiologists only displayed a mild improvement in average accuracy (85% vs 83%) and specificity (87% vs 83%) and the same sensitivity (82%) with AI assistance. Aided by the ensemble-produced probabilities, junior radiologists also achieved an improvement in AUC that showed no statistically significant difference from senior radiologists. Comparisons of AUC between radiologists can be found in Additional file 1: Tables S5–S7.
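Krippendorff’s alpha for nominal ratings (benign vs malignant) can be computed from a coincidence matrix, as sketched below. The function name and the toy ratings are illustrative assumptions, and the sketch expects each tumor's ratings as a list with missing ratings simply omitted; it is not the software used in the study.

```python
from collections import Counter


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: one list per rated item, holding the labels assigned by each rater.
    """
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # items with fewer than two ratings carry no pair information
        counts = Counter(ratings)
        for c, n_c in counts.items():
            for k, n_k in counts.items():
                pairs = n_c * (n_c - 1) if c == k else n_c * n_k
                coincidences[(c, k)] += pairs / (m - 1)
    totals = Counter()
    for (c, _k), value in coincidences.items():
        totals[c] += value
    n = sum(totals.values())
    observed = sum(v for (c, k), v in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected


# Toy ratings from three raters on four tumors (1 = malignant, 0 = benign).
toy_ratings = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 0, 1]]
alpha = krippendorff_alpha_nominal(toy_ratings)
print(round(alpha, 4))
```

An alpha of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is the scale on which the with- and without-AI values above are compared.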
Performance comparison of ensemble model and radiologists
Figure 4 shows the ROC curves of the ensemble model and the radiologists. The AUC of the ensemble model (0.83) was comparable with that of the senior radiologists (0.82–0.83) and better than that of the junior radiologists (0.61–0.73). Compared with the junior radiologists’ averaged results (Table 5), the ensemble model had higher accuracy (82% vs 66%) and specificity (89% vs 65%) with comparable sensitivity (68% vs 67%). Compared with the senior radiologists’ averaged results, the ensemble model had comparable accuracy (82% vs 83%), higher specificity (89% vs 83%), but lower sensitivity (68% vs 82%). A comparison of AUC between the ensemble model and the radiologists can be found in Additional file 1: Table S8.
Sample misclassified by AI model and/or radiologists
Figure 5 shows examples of tumors misclassified by the AI model and/or the radiologists under three scenarios. Figures 5a and b depict ovarian tumors that were misclassified by the AI model but correctly differentiated by all junior radiologists, selected from the 4 cases of this scenario (2 malignant and 2 benign tumors). Figure 5c shows the only tumor that was misclassified by both the AI model and all junior radiologists. Figure 5d depicts an ovarian tumor that was misclassified by all 3 junior radiologists but correctly classified by the AI model, selected from the 9 cases of this scenario (1 malignant and 8 benign tumors).
In this study, we developed a CT-based AI model incorporating radiomics and DL features with clinical data to classify benign and malignant ovarian tumors using ML classifiers. The model can distinguish benign from malignant ovarian tumors with high accuracy (82%) and specificity (89%) and fair sensitivity (68%). The model performed better than the junior radiologists on average. With the probabilities provided by the model, the junior radiologists showed a significant improvement in performance, approaching that of the senior radiologists. These results demonstrate that the AI model can assist less-experienced radiologists in assessing ovarian tumors, providing evidence of the clinical validity of this model.
This is the first study applying ML combined with radiomics and DL features extracted from CT images to differentiate between benign and malignant ovarian tumors. There is limited research on applying DL to differentiate ovarian tumors using CT images. Christiansen et al. and Wang et al. applied DL for ovarian tumor differentiation using ultrasound and MRI, respectively. Both studies used a CNN to build an end-to-end classification model, which needs to be trained with a larger dataset. However, in common clinical settings, collecting a large, uniform tumor image dataset with pathological diagnoses is very difficult. DL features, which are quantified image features extracted through an encoder-decoder CNN [31,32,33,34], may provide an alternative way to analyze tumor images on a relatively small dataset. Wang et al. extracted DL features from CT images to predict tumor recurrence in high-grade serous ovarian cancer. Xia et al. developed a CT-based scheme to classify ground-glass lung nodules by fusing radiomics and DL features. So far, no study has used DL features or combined radiomics with DL features to differentiate ovarian tumors. Since radiomics, DL features, and clinical data represent different characteristics of a tumor, we hypothesized that an AI model integrating these features could accurately distinguish benign from malignant ovarian tumors. The better performance of the ensemble model supports our hypothesis that radiomics, DL features, and clinical data may provide complementary information on ovarian tumors and work better together in distinguishing benign from malignant tumors.
ML is often considered a black box. To understand the decisions and mistakes made by the AI model and the radiologists, we analyzed three scenarios of misclassification. In the first scenario, in which the tumors were misclassified by the AI model but correctly differentiated by all junior radiologists, the malignant tumor (Fig. 5a) had an obvious solid portion, while the benign one (Fig. 5b) was a hypoattenuating tumor without a solid portion or mural nodule. In traditional radiological evaluation, a solid portion, mural nodule, or thick septa in an ovarian tumor are clues to malignancy. Tumors with typical CT image features, such as these two tumors (Fig. 5a and b), would not be misdiagnosed by radiologists even though they were misclassified by the AI model. In the second scenario, in which both the AI model and all junior radiologists were wrong, the benign tumor (Fig. 5c) was a multiseptated cystic tumor with unevenly thick septa that might raise the suspicion of malignancy in radiological evaluation. In the third scenario, in which the AI model was correct but all junior radiologists were wrong, the tumor (Fig. 5d) was a benign multiseptated cystic tumor with a thick septum. As mentioned before, it is challenging for radiologists to exclude the possibility of malignancy in such multiseptated cystic ovarian tumors. The AI model may do better than radiologists in identifying subtle features that cannot be explained by traditional radiological evaluation, and may help radiologists make correct decisions in difficult cases like the one in Fig. 5d.
The proposed model may potentially assist radiologists and gynecologists to assess ovarian tumors and guide therapeutic strategies for these patients, especially in hospitals that lack experienced radiologists. With the growing global physician shortage problem, the availability of an AI-assistance system is very important. Although MRI may provide better performance than CT in tumor differentiation due to its superior tissue contrast [36, 37], we believe a CT-based AI model would benefit more patients, especially those in remote areas. Although the sensitivity of our model is relatively low, its intended clinical application is not for screening. High specificity of the model is considerably more important than sensitivity since CT study usually serves as a confirmation modality for workup of indeterminate tumors on sonogram.
There are several limitations in this study. First, the dataset is relatively small, there is no external validation cohort, and the study design is retrospective. Future prospective studies using larger datasets from different institutions are essential to improve and validate the performance of the model. Second, manual segmentation of the ovarian tumors by a single radiologist can bias the results. However, because accurate tumor segmentation is important for radiomics and DL feature extraction, we decided to use manual segmentation by an experienced radiologist. Third, recall of cases from the first session may be a concern when the radiologists were asked to reevaluate the CT images with AI assistance. To address this issue, we arranged a delay of at least one month between the two sessions. Fourth, we chose CT as our imaging tool because it is far more widely available than MRI. However, this remains a potential weakness for the applicability of the developed tool, since an MRI-based model might outperform the proposed CT-based model. Fifth, we applied ML classifiers rather than a DL method for tumor classification because of the small dataset.
In this study, we developed a CT-based AI model incorporating radiomics and DL features with clinical data to distinguish benign from malignant ovarian tumors using ML classifiers. The model can distinguish benign from malignant ovarian tumors with high accuracy and specificity. In addition, the model can improve the performance of less-experienced radiologists in assessing ovarian tumors, and may potentially guide gynecologists to provide better therapeutic strategies for these patients.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
AUC: Area under the ROC curve
CA-125: Cancer antigen 125
CNN: Convolutional neural network
GLCM: Gray-level co-occurrence matrix
LASSO: Least absolute shrinkage and selection operator
LoG: Laplacian of Gaussian
MRI: Magnetic resonance imaging
RMSE: Root mean squared error
ROC: Receiver operating characteristic curve
SVM: Support vector machine
Siegel RL, Miller KD, Jemal A (2019) Cancer statistics. CA Cancer J Clin 69:7–34
Hand R, Fremgen A, Chmiel JS et al (1993) Staging procedures, clinical management, and survival outcome for ovarian carcinoma. JAMA 269:1119–1122
American College of Obstetricians and Gynecologists’ Committee on Practice Bulletins—Gynecology (2016) Practice bulletin no. 174: evaluation and management of adnexal masses. Obstet Gynecol 128(5):e210–e226.
Jeong YY, Outwater EK, Kang HK (2000) Imaging evaluation of ovarian masses. Radiographics 20:1445–1470
Iyer VR, Lee SI (2010) MRI, CT, and PET/CT for ovarian cancer detection and adnexal lesion characterization. AJR Am J Roentgenol 194:311–321
Kinkel K, Lu Y, Mehdizade A, Pelte MF, Hricak H (2005) Indeterminate ovarian mass at US: incremental value of second imaging test for characterization–meta-analysis and Bayesian analysis. Radiology 236:85–94
Moore BJ, Steiner CA, Davis PH, Stocks C, Barrett ML (2006) Trends in hysterectomies and oophorectomies in hospital inpatient and ambulatory settings, 2005–2013: statistical brief #214. Healthcare cost and utilization project (HCUP) statistical briefs. Agency for healthcare research and quality (US), Rockville (MD)
Lass A (1999) The fertility potential of women with a single ovary. Hum Reprod Update 5:546–550
Parker WH, Broder MS, Liu Z, Shoupe D, Farquhar C, Berek JS (2005) Ovarian conservation at the time of hysterectomy for benign disease. Obstet Gynecol 106:219–226
Bi WL, Hosny A, Schabath MB et al (2019) Artificial intelligence in cancer imaging: clinical challenges and applications. CA Cancer J Clin 69:127–157
Zhou J, Zeng ZY, Li L (2020) Progress of artificial intelligence in gynecological malignant tumors. Cancer Manage Res 12:12823–12840
Akazawa M, Hashimoto K (2021) Artificial intelligence in gynecologic cancers: current status and future challenges – a systematic review. Artif Intell Med 120:102164
Shrestha P, Poudyal B, Yadollahi S et al (2022) A systematic review on the use of artificial intelligence in gynecologic imaging - background, state of the art, and future directions. Gynecol Oncol. https://doi.org/10.1016/j.ygyno.2022.07.024
Sun R, Limkin EJ, Vakalopoulou M et al (2018) A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol 19:1180–1191
Chiappa V, Interlenghi M, Salvatore C et al (2021) Using rADioMIcs and machine learning with ultrasonography for the differential diagnosis of myometRiAL tumors (the ADMIRAL pilot study). Radiomics and differential diagnosis of myometrial tumors. Gynecol Oncol 161:838–844
Chaudhary K, Poirion OB, Lu L, Garmire LX (2018) Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res 24:1248–1259
Chiappa V, Interlenghi M, Bogani G et al (2021) A decision support system based on radiomics and machine learning to predict the risk of malignancy of ovarian masses from transvaginal ultrasonography and serum CA-125. Eur Radiol Exp 5:28
Newtson AM, Mattson JN, Goodheart MJ et al (2019) Prediction of optimal surgical outcomes with radiologic images using deep learning artificial intelligence. Gynecol Oncol 154:156
Rizzo S, Botta F, Raimondi S et al (2018) Radiomics of high-grade serous ovarian cancer: association between quantitative CT features, residual tumour and disease progression within 12 months. Eur Radiol 28:4849–4859
Song XL, Ren JL, Zhao D, Wang L, Ren H, Niu J (2021) Radiomics derived from dynamic contrast-enhanced MRI pharmacokinetic protocol features: the value of precision diagnosis ovarian neoplasms. Eur Radiol 31:368–378
Vargas HA, Veeraraghavan H, Micco M et al (2017) A novel representation of inter-site tumour heterogeneity from pre-treatment computed tomography textures classifies ovarian cancers by clinical outcome. Eur Radiol 27:3991–4001
Jian J, Ya Li, Pickhardt PJ et al (2021) MR image-based radiomics to differentiate type Ι and type ΙΙ epithelial ovarian cancers. Eur Radiol 31:403–410
Zhang H, Mao Y, Chen X et al (2019) Magnetic resonance imaging radiomics in categorizing ovarian masses and predicting clinical outcome: a preliminary study. Eur Radiol 29:3358–3371
Wang S, Liu Z, Rong Y et al (2019) Deep learning provides a new computed tomography-based prognostic biomarker for recurrence prediction in high-grade serous ovarian cancer. Radiother Oncol 132:171–177
Xia X, Gong J, Hao W et al (2020) Comparison and fusion of deep learning and radiomics features of ground-glass nodules to predict the invasiveness risk of stage-I lung adenocarcinomas in CT scan. Front Oncol 10:418
Yu XP, Wang L, Yu HY et al (2021) MDCT-based radiomics features for the differentiation of serous borderline ovarian tumors and serous malignant ovarian tumors. Cancer Manage Res 13:329–336
An H, Wang Y, Wong EMF et al (2021) CT texture analysis in histological classification of epithelial ovarian carcinoma. Eur Radiol 31:5050–5058
Park H, Qin L, Guerra P, Bay CP, Shinagare AB (2021) Decoding incidental ovarian lesions: use of texture analysis and machine learning for characterization and detection of malignancy. Abdom Radiol (NY) 46:2376–2383
Christiansen F, Epstein EL, Smedberg E, Åkerlund M, Smith K, Epstein E (2021) Ultrasound image analysis using deep neural networks for discriminating between benign and malignant ovarian tumors: comparison with expert subjective assessment. Ultrasound Obstet Gynecol 57:155–163
Wang R, Cai Y, Lee IK et al (2020) Evaluation of a convolutional neural network for ovarian tumor differentiation based on magnetic resonance imaging. Eur Radiol. https://doi.org/10.1007/s00330-020-07266-x
Masci J, Meier U, Cireşan D, Schmidhuber J (2011) Stacked convolutional auto-encoders for hierarchical feature extraction. In: Honkela T, Duch W, Girolami M, Kaski S (eds) Artificial neural networks and machine learning – ICANN 2011. Springer, Berlin Heidelberg, pp 52–59
Huang G, Liu Z, Maaten LVD, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 2261–2269
Dara S, Tumma P (2018) Feature extraction by using deep learning: a survey. In: 2018 second international conference on electronics, communication and aerospace technology (ICECA), pp 1795–1801
Vununu C, Lee S-H, Kwon K-R (2019) A deep feature extraction method for HEp-2 cell image classification. Electronics 8:20
Fonti V, Belitser E (2017) Feature selection using lasso. VU Amsterdam Res Paper Business Anal 30:1–25
Hricak H, Chen M, Coakley FV et al (2000) Complex adnexal masses: detection and characterization with MR imaging–multivariate analysis. Radiology 214:39–46
Foti PV, Attinà G, Spadola S et al (2016) MR imaging of ovarian masses: classification and differential diagnosis. Insights Imaging 7:21–41
The study was supported by the National Science and Technology Council (MOST 111-2314-B-039-042) and China Medical University (CMU111-MF-62).
Ethics approval and consent to participate
The Institutional Review Board approved this study and waived the requirement for patient consent.
Consent for publication
The authors declare that they have no competing interests.
Additional file 1. Table S1. Radiomics features extracted in this study. Table S2. Patient and tumor characteristics for the training and testing sets. Table S3. Performance metrics of AI models on training set. Table S4. Performance metrics of AI models on testing set. Table S5. Comparison of AUC between radiologists with and without AI assistance. Table S6. Comparison of AUC between junior radiologists and senior radiologists. Table S7. Comparison of AUC between junior radiologists with AI and senior radiologists. Table S8. Comparison of AUC between ensemble model and radiologists.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Jan, YT., Tsai, PS., Huang, WH. et al. Machine learning combined with radiomics and deep learning features extracted from CT images: a novel AI model to distinguish benign from malignant ovarian tumors. Insights Imaging 14, 68 (2023). https://doi.org/10.1186/s13244-023-01412-x
- Ovarian tumor
- Deep learning
- Machine learning
- Computed tomography