Diagnostic machine learning model for preoperative stratification of patients with benign ovarian tumor-like lesions
Toneeva S.N., Toneev E.A., Volkova N.A., Klinysheva S.Yu., Safiullina A.N., Pisklyukov D.R.
Objective: To develop and evaluate a diagnostic model for stratifying patients with ovarian tumor-like lesions to optimize treatment strategies and reduce the risk of overtreatment.
Materials and methods: This study included 288 patients who underwent laparoscopic surgery for ovarian tumor-like lesions. According to histological findings, 44 (15.3%) patients had functional cysts, while 244 (84.7%) had non-functional benign lesions. The model incorporated the following predictors: HE4, CA125, neutrophil-to-lymphocyte ratio (NLR), body mass index (BMI), lesion size on ultrasound, and the number of years since menopause. A Decision Tree Classifier algorithm was used to construct the model. Seventy percent of the dataset (202 patients) was allocated for training, and the remaining 30% (86 patients) served as an independent test set.
Results: In the training set, the model achieved an AUC of 0.852. In the test set, the AUC was 0.835, with a sensitivity and specificity of 81.1% and 84.6%, respectively. The application of the model improves the accuracy of stratifying patients by the likelihood of a functional lesion and reduces the risk of unnecessary surgical intervention.
Conclusion: The developed diagnostic model may serve as an effective clinical decision-support tool for managing patients with ovarian tumor-like lesions. External validation of this model is required.
Authors' contributions: Toneeva S.N. – conception and design of the study, data analysis, editing of the manuscript; Toneev E.A., Safiullina A.N. – collection of materials, statistical analysis; Volkova N.A. – checking of critical content; Klinysheva S.Yu. – data analysis, editing of the manuscript; Pisklyukov D.R. – drafting of the manuscript.
Conflicts of interest: The authors have no conflicts of interest to declare.
Funding: There was no funding for this study.
Ethical Approval: The study was reviewed and approved by the Research Ethics Committee of the Ulyanovsk Regional Clinical Hospital.
Patient Consent for Publication: All patients provided informed consent for the publication of their data.
Authors' Data Sharing Statement: The data supporting the findings of this study are available upon request from the corresponding author after approval from the principal investigator.
For citation: Toneeva S.N., Toneev E.A., Volkova N.A., Klinysheva S.Yu., Safiullina A.N., Pisklyukov D.R.
Diagnostic machine learning model for preoperative stratification of patients with benign ovarian tumor-like lesions.
Akusherstvo i Ginekologiya/Obstetrics and Gynecology. 2026; (3): 112-118 (in Russian)
https://dx.doi.org/10.18565/aig.2025.230
Keywords
Benign ovarian cysts are among the most common gynecological findings in women of reproductive and postmenopausal ages. Most of these lesions are functional and tend to undergo spontaneous regression, which means that surgical intervention is not always necessary [1]. However, in recent years, the frequency of surgical procedures for ovarian cysts has steadily increased. This trend can be attributed to significant advances in diagnostic capabilities, particularly in high-sensitivity imaging modalities such as high-resolution ultrasonography (US), magnetic resonance imaging, and computed tomography [2].
Modern diagnostic protocols enable the more frequent detection of even small or asymptomatic benign lesions. While this facilitates the early diagnosis of potentially malignant tumors, it also increases the risk of overtreatment and unnecessary pelvic surgeries [3]. Such interventions may lead to postoperative complications and, in the case of oophorectomy, result in loss of ovarian function accompanied by relevant endocrine disturbances [4].
A key objective in contemporary gynecologic practice is personalized patient stratification, which is essential to distinguish between women who require active surgical management and those for whom expectant (dynamic) monitoring may be appropriate [5]. Therefore, developing reliable diagnostic models, including those based on machine learning methods, is critical for optimizing the management of patients with benign ovarian cysts [2].
This study aimed to develop a model capable of predicting the likelihood that an ovarian lesion is functional in nature during the preoperative evaluation stage.
Materials and methods
The study included data from patients who underwent surgical treatment for ovarian tumor-like lesions at the Gynecology Department of Ulyanovsk Regional Clinical Hospital between January 1, 2023, and December 31, 2024. Data were collected retrospectively based on medical documentation.
A total of 288 patients were included, meeting the following inclusion criteria: age ≥18 years, presence of an ovarian tumor-like lesion detected on ultrasonography, scheduled surgical treatment (laparoscopic approach), and availability of complete laboratory and clinical-instrumental data, including HE4 (pmol/L), CA125 (U/mL), body mass index (BMI, kg/m²), neutrophil-to-lymphocyte ratio (NLR), leukocyte intoxication index (LII), platelet-to-lymphocyte ratio (PLR), lesion size on ultrasonography (mm), and number of years since menopause (for postmenopausal patients).
The exclusion criteria were as follows: histologically confirmed malignant neoplasms of the ovaries or other organs, emergency surgeries for complicated cysts, and absence of complete data for key laboratory or clinical indicators.
After receiving the histological report, tumor-like formations were classified into two groups: functional cysts, which included follicular cysts, corpus luteum cysts, and paraovarian cysts, and non-functional (other) formations, which included benign epithelial tumors (serous and mucinous cystadenomas), germ cell tumors (mature teratomas), stromal tumors (fibromas and thecomatosis), and chronic inflammatory formations.
All laboratory tests were performed as part of the standard preoperative examination on the first day of hospitalization. The following indices were used in the study: NLR (the ratio of absolute neutrophil count to absolute lymphocyte count), PLR (the ratio of platelet count to lymphocyte count), and LII according to Ostrovsky (calculated using the formula: ((stab neutrophils + segmented neutrophils + young neutrophils + myelocytes) × monocytes)/lymphocytes).
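The two ratio indices defined above can be illustrated with a minimal sketch. The function names and the sample blood counts are hypothetical and are not taken from the study data; the LII is deliberately omitted.

```python
def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio from absolute counts in the same units."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def plr(platelets: float, lymphocytes: float) -> float:
    """Platelet-to-lymphocyte ratio from absolute counts in the same units."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets / lymphocytes


# Hypothetical complete blood count, all values in 10^9 cells/L.
example_nlr = nlr(neutrophils=4.0, lymphocytes=2.0)
example_plr = plr(platelets=250.0, lymphocytes=2.0)
print(f"NLR = {example_nlr:.2f}, PLR = {example_plr:.1f}")
```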
Statistical analysis
Statistical analysis was performed using StatTech v. 4.8.5 (StatTech LLC, Russia). The normality of the distribution of continuous variables was tested using the Shapiro–Wilk test (n<50) or the Kolmogorov–Smirnov test (n>50). In cases of deviation from a normal distribution, the data were described using the median (Me) and quartiles (Q1; Q3). Categorical variables are presented as counts and percentages, with 95% confidence intervals (95% CI) calculated using the Clopper–Pearson method. The Mann–Whitney U test was used to compare the two groups on continuous variables. Pearson's χ² test (for expected values >10) or Fisher's exact test (for expected values <10) was used to analyze 2×2 tables, with odds ratios (OR) and 95% CI calculated. The χ² test was applied to multi-way contingency tables. Differences were considered statistically significant at p<0.05.
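As an illustration of the exact (Clopper–Pearson) interval for a proportion, the sketch below finds the bounds by bisection on the binomial distribution using only the Python standard library. It is not the StatTech implementation; the function names are hypothetical, and the input 44/288 is the proportion of functional cysts reported in this study.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_increasing(f, target: float) -> float:
    """Solve f(p) = target on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _bisect_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _bisect_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


low, high = clopper_pearson(44, 288)
print(f"44/288 = {44 / 288:.3f}, 95% CI {low:.3f} to {high:.3f}")
```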
Construction of a diagnostic model
The model was constructed using the Decision Tree Classifier algorithm implemented in the scikit-learn library (v. 1.4.2) in Python (v. 3.11.8). The target variable was defined as the presence of a functional cyst: a value of 1 indicated a functional cyst, whereas a value of 0 indicated a non-functional benign formation. Thus, in the model, a “positive” outcome was a functional cyst, reflecting the clinical interest in identifying patients who potentially do not require surgical intervention. The data were divided into training and test samples in a 70/30 ratio using stratified random sampling to maintain the original class ratio (approximately 1:5.5). Special methods for correcting imbalance (class weighting, oversampling, or undersampling) were not employed, preserving the actual distribution of tumor types in clinical practice. The model was evaluated on an independent test sample by constructing an ROC curve and determining the AUC. The optimal classification probability threshold was identified using the Youden index. For this threshold, the sensitivity, specificity, overall accuracy, positive predictive value (PPV), negative predictive value (NPV), and diagnostic odds ratio (DOR) were calculated with 95% CI. In this context, the sensitivity of the model reflects the algorithm's ability to identify patients who potentially do not require surgical treatment (functional cysts), whereas the specificity indicates the ability to correctly exclude non-functional formations.
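The workflow described above (stratified split, unweighted decision tree, ROC analysis, Youden threshold) can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the authors' code: the feature matrix is random, and `max_depth=3` and the random seeds are assumptions that the text does not specify.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the study data: 288 rows, 6 predictors, 44 positives.
rng = np.random.default_rng(0)
y = np.zeros(288, dtype=int)
y[:44] = 1                      # 1 = functional cyst, 0 = non-functional lesion
X = rng.normal(size=(288, 6))
X[:, 0] -= 2.0 * y              # make column 0 lower in the positive class

# Stratified split; an integer test_size of 86 gives the 202/86 partition.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=86, stratify=y, random_state=42
)

# No class_weight and no resampling, matching the text. Depth is an assumption.
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Predicted probability of the positive class on the held-out sample.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)

# Youden index J = sensitivity + specificity - 1 = TPR - FPR at each threshold.
fpr, tpr, thresholds = roc_curve(y_test, proba)
best = int(np.argmax(tpr - fpr))
threshold = thresholds[best]

pred = (proba >= threshold).astype(int)
sensitivity = float(np.mean(pred[y_test == 1] == 1))
specificity = float(np.mean(pred[y_test == 0] == 0))
print(f"AUC={auc:.3f} threshold={threshold:.2f} "
      f"sens={sensitivity:.3f} spec={specificity:.3f}")
```

Because a decision tree outputs only as many distinct probabilities as it has leaves, the ROC curve has few points and the Youden-optimal threshold coincides with one of the leaf probabilities.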
To assess the quality of the diagnostic model, the positive predictive value (PPV), negative predictive value (NPV), and DOR were calculated. PPV was defined as the ratio of true-positive classifications to the sum of true-positive and false-positive cases, whereas NPV was defined as the ratio of true-negative classifications to the sum of true-negative and false-negative cases. DOR was calculated as the ratio (TP/FN)/(FP/TN), where TP is the number of true positives, FP is the number of false positives, FN is the number of false negatives, and TN is the number of true negatives. All metrics were evaluated using an independent test sample.
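The definitions above translate directly into code. In the sketch below, the confusion-matrix counts are hypothetical (chosen only to resemble an 86-patient test sample; they are not the study's data), and the log-based Wald interval for the DOR is an assumption, since the text does not state which interval method was used.

```python
from math import exp, log, sqrt


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, NPV, and DOR from a 2x2 confusion matrix."""
    dor = (tp / fn) / (fp / tn)
    # Wald interval on the log scale (assumed method, not stated in the text).
    se_log_dor = sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "dor": dor,
        "dor_ci": (exp(log(dor) - 1.96 * se_log_dor),
                   exp(log(dor) + 1.96 * se_log_dor)),
    }


# Hypothetical counts: positive class = functional cyst.
metrics = diagnostic_metrics(tp=6, fp=5, fn=7, tn=68)
for name, value in metrics.items():
    print(name, value)
```

Note that the DOR is undefined when any cell is zero; a continuity correction (for example, adding 0.5 to each cell) is a common workaround in that case.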
Methods for correcting imbalance (oversampling, undersampling, class weighting) were not used, as the existing disproportion reflected the actual clinical distribution of tumor types. Artificial class equalization would distort PPV/NPV estimates and compromise the model’s clinical applicability.
Results
Based on the results of histological examination, the lesions were classified as follows: functional cysts, 44/288 (15.3%); non-functional lesions, 244/288 (84.7%). A comparison of the clinical characteristics of patients with functional and nonfunctional lesions is presented in Table 1.

Additionally, odds ratios (OR) with 95% CI were calculated for categorical variables, and a post-hoc assessment of the power of the χ² test was performed. Menopause was not significantly associated with the likelihood of non-functional lesions (OR=0.53 (95% CI 0.26–1.06), χ²=3.31; p=0.069; power 0.44). The same was true for arterial hypertension (OR=0.49 (95% CI 0.22–1.07), χ²=3.33; p=0.068; power 0.45) and diabetes mellitus (OR=0.36 (95% CI 0.05–2.76), χ²=1.07; p=0.300; power 0.18). These results are consistent with the absence of a pronounced association between these characteristics and the type of neoplasm, although the low post-hoc power (0.18–0.45) limits this conclusion. They suggest that these categorical variables contribute less to differential diagnosis than the objective laboratory and instrumental indicators included in the diagnostic model.
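For readers who wish to check such figures, the sketch below converts a Pearson χ² statistic with one degree of freedom into a p-value and computes an odds ratio with a Wald 95% CI from a 2×2 table. The function names and the 2×2 counts are hypothetical (the underlying cell counts are not given in the text), and the Wald interval is an assumed method. Applied to the reported χ² values of 3.31, 3.33, and 1.07, the p-value conversion gives approximately 0.069, 0.068, and 0.30.

```python
from math import erfc, exp, log, sqrt


def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio_ci(a: int, b: int, c: int, d: int) -> tuple[float, float, float]:
    """Odds ratio (a*d)/(b*c) with a Wald 95% CI on the log scale."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, exp(log(or_) - 1.96 * se), exp(log(or_) + 1.96 * se)


# p-values recomputed from the chi-square statistics reported in the text.
for chi2 in (3.31, 3.33, 1.07):
    print(f"chi2={chi2:.2f} -> p={chi2_p_value_1df(chi2):.3f}")

# Hypothetical 2x2 table: rows = factor present/absent, columns = group.
or_, lo, hi = odds_ratio_ci(12, 32, 100, 144)
print(f"OR={or_:.2f} (95% CI {lo:.2f} to {hi:.2f}), "
      f"chi2={pearson_chi2_2x2(12, 32, 100, 144):.2f}")
```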
Statistically significant differences between the groups were found in terms of the duration of surgery (p=0.001) and blood loss (p=0.008). No statistically significant differences were observed for the other indicators (Table 1).
Statistically significant differences were observed in HE4 (p<0.001) and the ROMA index (p<0.001). No significant intergroup differences were observed in other laboratory indicators (Table 2).
A diagnostic model based on the decision tree algorithm was then constructed (Fig. 1).

Decision tree structure. In the trained model, the root split was on the level of the tumor marker HE4: lower HE4 values were associated with a higher probability of a functional lesion. The second most significant feature was the NLR, which reflects the contribution of the systemic inflammatory response, and BMI emerged as a predictor at the third level of the tree. Thus, the model combines a tumor marker with general clinical indicators in a sequence that is clinically interpretable and pathophysiologically plausible.
For practical application of the model, the physician needs to determine the values of key indicators for a specific patient and sequentially correlate them with the threshold values specified in the tree nodes, starting from the root node until the leaf node is reached. Moving along the corresponding branches, the physician obtains a probabilistic conclusion about the nature of the lesion—functional or nonfunctional—which can be used to justify the patient-management strategy.
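The node-by-node procedure described above can be sketched as a simple traversal. The tree below is purely hypothetical: the feature order (HE4, then NLR, then BMI) follows the text, but every threshold and leaf probability is a placeholder invented for illustration and is not taken from Fig. 1. The sketch must not be used clinically.

```python
# Placeholder tree: thresholds and probabilities are NOT the values from Fig. 1.
HYPOTHETICAL_TREE = {
    "feature": "HE4", "threshold": 50.0,
    "low": {
        "feature": "NLR", "threshold": 2.0,
        "low": {"p_functional": 0.60},
        "high": {
            "feature": "BMI", "threshold": 27.0,
            "low": {"p_functional": 0.40},
            "high": {"p_functional": 0.20},
        },
    },
    "high": {"p_functional": 0.05},
}


def walk_tree(node: dict, patient: dict) -> tuple[float, list[str]]:
    """Follow the branches from the root to a leaf, recording each decision."""
    path = []
    while "p_functional" not in node:
        value = patient[node["feature"]]
        # Values at or below the threshold take the "low" branch.
        branch = "low" if value <= node["threshold"] else "high"
        path.append(f"{node['feature']}={value} vs {node['threshold']}: {branch}")
        node = node[branch]
    return node["p_functional"], path


# Hypothetical patient values.
probability, steps = walk_tree(
    HYPOTHETICAL_TREE, {"HE4": 40.0, "NLR": 3.0, "BMI": 25.0}
)
print(probability)
for step in steps:
    print(step)
```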
The model was evaluated using an independent test sample (30%) that was not used for training. The model demonstrated high diagnostic ability, with an area under the ROC curve (AUC) = 0.835 on the test sample (Fig. 2).
Figure 2 shows the ROC curve for the decision tree model. The optimal classification probability threshold, determined using the Youden index, was ≥0.91. At this threshold, the model provided a sensitivity of 81.1%, specificity of 84.6%, and overall accuracy of 81.6% for the test sample.
When using a standard probability threshold of 0.5 on the same independent test sample (n=86), the model demonstrated a sensitivity of 46.2% and a specificity of 93.2%. The PPV was 54.5% (95% CI 28.8–78.5%), and the NPV was 90.8% (95% CI 82.2–96.1%). The DOR was 11.83 (95% CI 2.98–46.98), indicating that the odds of a positive model result were approximately 12 times higher in patients with functional cysts than in patients with non-functional lesions.
Thus, the proposed model demonstrated high diagnostic performance on the independent test sample.
Discussion
In recent years, there has been a steady increase in surgical interventions for ovarian tumor-like masses, largely due to improvements in diagnostic accuracy and the availability of high-sensitivity imaging techniques [2]. However, several studies have indicated that a significant proportion of these detected masses are functional and may undergo spontaneous regression during active monitoring [6, 7]. In the absence of clear stratification algorithms, the risk of unnecessary surgical procedures increases, leading to specific intraoperative and postoperative complications [3, 8]. Therefore, developing reliable predictive tools to assess the likelihood of functional lesions more accurately at the preoperative stage is crucial. In our study, we constructed a diagnostic model based on a decision tree algorithm to stratify patients with ovarian tumor-like masses.
The inclusion of HE4 as a primary predictor is consistent with the current literature. Numerous studies have indicated that HE4 is one of the most specific biomarkers for differentiating benign from malignant ovarian lesions [6, 9, 10]. Furthermore, the current ACOG (2016) and ESGO (2021) guidelines incorporate HE4 into risk stratification algorithms [7, 11].
The significance of NLR identified in the model reflects the role of systemic inflammatory responses. Previous research has shown that NLR may complement tumor markers for risk stratification [3, 9]. While certain variables (e.g., NLR) did not show statistically significant between-group differences in univariate analyses, these features contributed to stratification within the decision tree model when combined with other variables. This can be attributed to the ability of machine learning algorithms to account for complex inter-feature interactions and nonlinear relationships that traditional univariate statistical methods may not detect. This approach aligns with contemporary practices of applying machine learning to clinical risk stratification tasks [2, 6].
BMI was identified as a significant predictor of the outcome. The impact of metabolic disorders on hormonal homeostasis, and consequently on the development of functional cysts, has been widely discussed in the current literature.
Other authors have also attempted to construct diagnostic models for stratifying patients with ovarian tumor-like masses. For instance, Sahu et al. (2023) [6], in a review of modern stratification methods, highlighted the potential of machine learning models incorporating HE4 and CA125 to support clinical decision-making. Jing B. et al. (2023) [12] developed a multicenter predictive model for assessing ovarian mass risk using HE4, CA125, and NLR, demonstrating an AUC ranging from 0.81 to 0.88. Li Y. et al. (2025) [13] applied a decision tree algorithm for preoperative differentiation between malignant and benign ovarian tumors, achieving an AUC of 0.86 in the test sample. Zhang T. et al. (2023) [14] also showed that combining HE4, CA125, and NLR improves differentiation between benign and malignant processes (AUC = 0.83). These findings confirm the relevance and potential of prognostic (diagnostic) models based on modern machine learning algorithms in gynecological practice.
The “overtreatment” of benign cysts remains a pressing issue in contemporary gynecology [3, 8]. Despite advances in imaging techniques, the proportion of unnecessary surgeries remains high, as emphasized by international guidelines [7, 11]. The use of machine learning tools to optimize management strategies for benign lesions is a current trend [6, 15]. In our study, the proposed model demonstrated high interpretability, which facilitates its integration into clinical practice. Recent reviews have supported the promise of artificial intelligence (AI) and machine learning methods for stratifying benign gynecologic lesions. For example, in a systematic review, Moro et al. (2025) [16] demonstrated the effectiveness of AI in analyzing ultrasound images and constructing risk models for managing patients with ovarian cysts and other benign gynecological conditions.
Conclusion
The developed diagnostic model enables the effective stratification of patients with ovarian tumor-like masses and can optimize patient management strategies, potentially reducing the risk of unnecessary surgical interventions.
References
- Seguin C.L., Lietz A.P., Wright J.D., Wright A.A., Knudsen A.B., Pandharipande P.V. Surveillance in older women with incidental ovarian cysts: maximal projected benefits by age and comorbidity level. J. Am. Coll. Radiol. 2021; 18(1PtA): 10-8. https://dx.doi.org/10.1016/j.jacr.2020.09.048
- Zeng S., Wang X.L., Yang H. Radiomics and radiogenomics: extracting more information from medical images for the diagnosis and prognostic prediction of ovarian cancer. Mil. Med. Res. 2024; 11(1): 77. https://dx.doi.org/10.1186/s40779-024-00580-1
- Srivastava S., Koay E.J., Borowsky A.D., De Marzo A.M., Ghosh S., Wagner P.D. Cancer overdiagnosis: a biological challenge and clinical dilemma. Nat. Rev. Cancer. 2019; 19(6): 349-58. https://dx.doi.org/10.1038/s41568-019-0142-8
- Feeney L., Harley I.J.G., McCluggage W.G., Mullan P.B., Beirne J.P. Liquid biopsy in ovarian cancer: catching the silent killer before it strikes. World J. Clin. Oncol. 2020; 11(11): 868-89. https://dx.doi.org/10.5306/wjco.v11.i11.868
- Vlăduţ C., Bilous D., Ciocîrlan M. Real-life management of pancreatic cysts: simplified review of current guidelines. J. Clin. Med. 2023; 12(12): 4020. https://dx.doi.org/10.3390/jcm12124020
- Sahu S.A., Shrivastava D. A comprehensive review of screening methods for ovarian masses: towards earlier detection. Cureus. 2023; 15(11): e48534. https://dx.doi.org/10.7759/cureus.43225
- Timmerman D., Planchamp F., Bourne T., Landolfo C., du Bois A., Chiva L. et al. ESGO/ISUOG/IOTA/ESGE Consensus Statement on pre-operative diagnosis of ovarian tumors. Int. J. Gynecol. Cancer. 2021; 31(7): 961-82. https://dx.doi.org/10.1136/ijgc-2021-002565
- Radwan S.M.A.A. The role of interventional radiology in the management of malignant and benign gynecological diseases. Kaunas; 2024. Available at: https://search.proquest.com/openview/a3fb1eaf1cb33f89fae40dc561208fc8/1
- Rai Talapadi N. Biochemical markers and combination testing for the diagnosis of ovarian cancer in women with symptoms or signs suspicious of ovarian cancer. University of Birmingham. Thesis. 2021. Available at: https://etheses.bham.ac.uk/id/eprint/11145/7/RaiTalapadi2021MD.pdf
- Kholova S.Kh., Khushvakhtova E.Kh. The role of tumor markers in the diagnosis of women with benign ovarian tumors. Herald of the Medical and Social Institute of Tajikistan. 2024; 4: 66-72 (in Russian).
- American College of Obstetricians and Gynecologists’ Committee on Practice Bulletins—Gynecology. Practice Bulletin No. 174: Evaluation and Management of Adnexal Masses. Obstet. Gynecol. 2016; 128(5): e210-e226. https://dx.doi.org/10.1097/AOG.0000000000001768
- Jing B., Chen G., Yang M., Zhang Z., Zhang Y., Zhang J. et al. Development of prediction model to estimate future risk of ovarian lesions: a multi-center retrospective study. Prev. Med. Rep. 2023; 35: 102296. https://dx.doi.org/10.1016/j.pmedr.2023.102312
- Li Y., Zhao X., Zhou Y., Gong L., Peng E. Decision tree model for predicting ovarian tumor malignancy based on clinical markers and preoperative circulating blood cells. BMC Med. Inform. Decis. Mak. 2025; 25(1): 94. https://dx.doi.org/10.1186/s12911-025-02934-8
- Zhang T., Pang A., Lyu J., Ren H., Song J., Zhu F. et al. Application of nonlinear models combined with conventional laboratory indicators for the diagnosis and differential diagnosis of ovarian cancer. J. Clin. Med. 2023; 12(3): 844. https://dx.doi.org/10.3390/jcm12030844
- Liu H., Ai H., Liu Y. Exploring the current state and research innovation in endometrial cancer screening. Oncol. Adv. 2025; 3(1): 50-60. https://dx.doi.org/10.14218/OnA.2024.00034
- Moro F., Giudice M.T., Ciancia M., Zace D. et al. Application of artificial intelligence to ultrasound imaging for benign gynecological disorders: systematic review. Ultrasound Obstet. Gynecol. 2025; 65(3): 295-302. https://dx.doi.org/10.1002/uog.29171
Received 25.08.2025
Accepted 28.11.2025
About the Authors
Svetlana N. Toneeva, obstetrician-gynecologist, Gynecological Department, Ulyanovsk Regional Clinical Hospital, 432017, Russia, Ulyanovsk, Third International str., 7, s.toneeva@inbox.ru, https://orcid.org/0009-0003-3101-881X
Evgeniy A. Toneev, PhD, thoracic surgeon at the Department of Thoracic Oncology, Regional Clinical Oncology Dispensary; Associate Professor, Department of Hospital Surgery, Faculty of Medicine named after T.Z. Biktimirov, Institute of Medicine, Ecology and Physical Culture, Ulyanovsk State University, 432000, Russia, Ulyanovsk, L. Tolstoy str., 42, e.toneev@inbox.ru, SPIN-code: 2236-3277, AuthorID: 1043371, https://orcid.org/0000-0001-8590-2350
Natalia A. Volkova, Deputy Chief Physician for Obstetric and Gynecological Care, Ulyanovsk Regional Clinical Hospital, 432017, Russia, Ulyanovsk, Third International str., 7; Senior lecturer at the Department of Obstetrics and Gynecology, Faculty of Medicine named after T.Z. Biktimirov, Institute of Medicine, Ecology and Physical Culture, Ulyanovsk State University, 432000, Russia, Ulyanovsk, L. Tolstoy str., 42, n_volkova2010@mail.ru, https://orcid.org/0009-0009-3018-4852
Svetlana Yu. Klinysheva, 2nd year resident at the Faculty of Postgraduate Medical and Pharmaceutical Education, Ulyanovsk State University, 432000, Russia, Ulyanovsk, L. Tolstoy str., 42, klinyshevazs99@list.ru, https://orcid.org/0009-0007-7686-8593
Aliia N. Safiullina, 6th year student at the Faculty of Medicine named after T.Z. Biktimirov, Institute of Medicine, Ecology and Physical Culture, Ulyanovsk State University, 432000, Russia, Ulyanovsk, L. Tolstoy str., 42, awesome.mukhutdinova@yandex.ru, SPIN-code: 6344-1106, https://orcid.org/0009-0009-1455-8287
Daniil R. Pisklyukov, 2nd year student at the Faculty of Medicine named after T.Z. Biktimirov, Institute of Medicine, Ecology and Physical Culture, Ulyanovsk State University, 432000, Russia, Ulyanovsk, L. Tolstoy str., 42, danilovdaniil1999@yandex.ru, https://orcid.org/0009-0002-7967-4528
Corresponding author: Aliia N. Safiullina, awesome.mukhutdinova@yandex.ru



