Impact of canny edge detection preprocessing on performance of machine learning models for Parkinson’s disease classification
Abstract
This study investigates the classification of individuals as healthy or at risk of Parkinson’s disease using machine learning (ML) models, focusing on the impact of dataset size and preprocessing techniques on model performance. Four datasets are created from an original dataset: (normal dataset), ( subjected to Canny edge detection and Hessian filtering), (augmented ), and (augmented ). We evaluate a range of ML models, namely Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), XGBoost (XGB), Naive Bayes (NB), Support Vector Machine (SVM), and AdaBoost (AdB), on these datasets, analyzing prediction accuracy, model size, and prediction latency. The results show that larger datasets lead to increased model memory footprints and prediction latencies, and that Canny edge detection preprocessing supplemented by Hessian filtering (used in and ) degrades the performance of most models. Random Forest maintains a stable memory footprint of 61 KB across all datasets, whereas k-nearest neighbors (KNN) and SVM show significant increases in memory usage, from 5.7-7 KB on to 102-220 KB on , with similar increases in prediction time. Logistic Regression, Decision Tree, and Naive Bayes show stable memory footprints and fast prediction times across all datasets. XGBoost’s prediction time increases from 180-200 ms on to 700-3000 ms on . Statistical analysis using the Mann-Whitney U test with 100 prediction accuracy observations per model (98 degrees of freedom) reveals significant differences in performance between models trained on and (p-values < 1e-34 for most models), while the effect sizes, measured by Cliff’s delta (values approaching ), indicate large shifts in performance, especially for SVM and XGBoost.
These findings highlight the importance of selecting lightweight models like LR and DT for deployment in resource-constrained healthcare applications, as models like KNN, SVM, and XGBoost show significant increases in resource demands with larger datasets, particularly when Canny preprocessing is applied.
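The statistical comparison mentioned in the abstract can be illustrated with a minimal sketch. It computes the Mann-Whitney U statistic and Cliff’s delta from pairwise comparisons of two samples of accuracy values. The function name and the accuracy values are hypothetical and are not taken from the paper; the sketch omits the p-value and does not reproduce the study’s 100 observations per model.

```python
from itertools import product


def mann_whitney_u_and_cliffs_delta(a, b):
    """Return (U statistic of sample a, Cliff's delta of a versus b)."""
    pairs = len(a) * len(b)
    # Count pairs where the value from a is larger or smaller than the value from b.
    greater = sum(1 for x, y in product(a, b) if x > y)
    less = sum(1 for x, y in product(a, b) if x < y)
    ties = pairs - greater - less
    # U counts wins for a, with each tie counted as half a win.
    u_a = greater + 0.5 * ties
    # Cliff's delta ranges from -1 to 1; values near the extremes mean little overlap.
    delta = (greater - less) / pairs
    return u_a, delta


# Hypothetical accuracy values for illustration only (not results from the paper).
acc_without_canny = [0.91, 0.93, 0.92, 0.94]
acc_with_canny = [0.71, 0.74, 0.70, 0.73]

u, delta = mann_whitney_u_and_cliffs_delta(acc_without_canny, acc_with_canny)
print(u, delta)  # 16.0 1.0
```

The two quantities are linked by delta = 2U/(nm) - 1, where n and m are the sample sizes, so a Cliff’s delta near its extremes corresponds to a U statistic near 0 or nm.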
Detailed information
- Category: Journal publication
- Type: journal article
- Published in: Scientific Reports, no. 15, ISSN: 2045-2322
- Language: English
- Year of publication: 2025
- Bibliographic description: Bhat S. A., Szczuko P.: Impact of canny edge detection preprocessing on performance of machine learning models for Parkinson’s disease classification // Scientific Reports -, iss. 1 (2025), pp. 1-38
- DOI: 10.1038/s41598-025-98356-7
- Funding sources: Statutory activity/subsidy
- Verified by: Politechnika Gdańska