4 skills found
TsLu1s / Mlimputer
MLimputer: Missing Data Imputation Framework for Machine Learning
UrbsLab / AutoMLPipe BC
An automated, rigorous, and largely scikit-learn-based machine learning analysis pipeline for binary classification. It adopts current best practices to avoid bias, optimize performance, ensure replicability, capture complex associations (e.g., interactions and heterogeneity), and enhance interpretability. It includes (1) exploratory analysis, (2) data cleaning, (3) partitioning, (4) scaling, (5) imputation, (6) filter-based feature selection, (7) collective feature selection, (8) modeling with 'optuna' hyperparameter optimization across 13 implemented ML algorithms (including three rule-based machine learning algorithms: ExSTraCS, XCS, and eLCS), (9) testing evaluations with 16 classification metrics and model feature importance estimation, (10) automatic saving of all results, models, and publication-ready plots (including proposed composite feature importance plots), (11) non-parametric statistical comparisons across ML algorithms and analyzed datasets, and (12) automatically generated PDF summary reports.
TSLTO2025 / TSLTO
TSLTO (Tucker decomposition-based Sparse Low-Rank high-Order Tensor Optimization model), a model for tensor imputation and anomaly diagnosis.
kamruleee51 / Diabetes Classification Dataset
In this article, we proposed a new labeled diabetes dataset from a South Asian country (Bangladesh). We also recommended an automated classification pipeline that introduces a weighted ensemble of several machine learning (ML) classifiers: Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), XGBoost (XGB), and LightGBM (LGB). The critical hyperparameters of these ML models were tuned using grid search. Missing-value imputation, feature selection, and K-fold cross-validation were also incorporated into the framework.
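
The weighted ensemble mentioned in the last entry can be illustrated with a minimal soft-voting sketch. This is not the repository's code: the function names `weighted_soft_vote` and `predict_labels`, the weights, and the probabilities are all hypothetical, and the listing does not say how the actual weights are derived. The sketch only shows the general idea of combining per-classifier positive-class probabilities with a weighted average before thresholding.

```python
from typing import List, Sequence


def weighted_soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of per-classifier positive-class probabilities.

    probabilities[i][j]: classifier i's probability that sample j is positive.
    weights[i]: hypothetical non-negative weight assigned to classifier i.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("all classifiers must score the same samples")
    # Normalize by the weight total so the result stays in [0, 1].
    return [
        sum(w * p[j] for w, p in zip(weights, probabilities)) / total
        for j in range(n_samples)
    ]


def predict_labels(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn ensemble scores into binary labels (1 = positive)."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up probabilities from three classifiers for three samples.
probs = [
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.3],
    [0.7, 0.1, 0.7],
]
weights = [0.5, 0.25, 0.25]  # assumed weights, for illustration only

scores = weighted_soft_vote(probs, weights)
labels = predict_labels(scores)
```

In practice the weights would typically be chosen from validation performance (for example, within each cross-validation fold), but that choice is an assumption here rather than something the listing specifies.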