Abdul1028 / Whatsapp Reality
A comprehensive Python library for analyzing WhatsApp chat exports. This library provides tools for preprocessing WhatsApp chat data and performing various analyses including sentiment analysis, user activity patterns, conversation patterns, and more.
YueCui-Labs / FFCM MRF
MATLAB GUIs for ToF-MRA data preprocessing, cerebrovascular segmentation, and quantification
mr-easy / Badminton Stroke Classification
Classifying badminton strokes based on data from accelerometer and gyroscope sensors attached to the player's wrist. An end-to-end machine learning project, from data collection and preprocessing to final model evaluation.
muyouhang / 3DFR
A pipeline for 3D face recognition (FR), including data preprocessing, feature extraction, and face recognition. Suited to consumer RGB-D cameras, for example, Kinect V2.
AlexanderSouthan / PyPreprocessing
Especially useful for preprocessing datasets such as Raman, infrared, and UV/Vis spectra, but also HPLC data and many other types of data. pyPreprocessing includes baseline correction, smoothing, filtering, normalization, and transformation.
IfremerUnderwater / Matisse
Matisse is user-friendly software that makes structure-from-motion (3D reconstruction from optical images) easily accessible to non-experts. Data preprocessing is mainly optimized for underwater images, but it can also be used with any type of optical image.
DIGVISHRAYALA / SUBDIVISION RAINFALL
This repository contains a Jupyter Notebook developed as part of the CA-2 project. It focuses on data analysis using Python with libraries like Pandas, NumPy, Matplotlib, and others. The project demonstrates data loading, preprocessing, exploration, and visualization techniques suitable for academic submissions or real-world data insights.
osm-search / Wikipedia Wikidata
Preprocessing Wikidata and Wikipedia inlink data for the Nominatim geocoder
yucqSUSTech / Crazyseismic
Crazyseismic: A MATLAB GUI-based software package for passive seismic data preprocessing
xucong-zhang / Data Preprocessing Gaze
Updated code for the paper "Revisiting Data Normalization for Appearance-Based Gaze Estimation"
utkarshshukla2912 / PyEEGpipeline
A complete pipeline for reading, preprocessing, and classifying EEG data using different statistical ML models and deep learning models
ELHoussineT / AutoDataCleaner
Simple and automatic data cleaning in one line of code! It performs one-hot encoding, casts date and time columns to datetime dtype, detects binary columns, safely converts non-numeric columns to numeric dtypes, cleans dirty or empty values, normalizes values, and removes unwanted columns, all in one line of code. Get your data ready for model training and fitting quickly.
gyrdym / Ml Preprocessing
Implementation of popular data preprocessing algorithms for machine learning
dataset-jp / Datakit
A collection of methods for preprocessing open data.
Devang-C / AutoEDA
AutoEDA: An Automated Exploratory Data Analysis (EDA) Toolkit. Simplify and automate your data exploration process with AutoEDA. This open-source Python application streamlines data preprocessing, missing data handling, visualization, and more. Easily discover insights and patterns in your datasets without the hassle of manual EDA.
LTU-Machine-Learning / Inner Speech EEG FMRI
This repository contains the code used to preprocess the EEG and fMRI data, along with the stimulation protocols used to generate the Bimodal Inner Speech dataset.
reddyprasade / Machine Learning Interview Preparation
Prepare technical skills. Here are the essential skills that a Machine Learning Engineer needs, as mentioned in the README files. Within each group are topics that you should be familiar with. Study tip: copy and paste this list into a document and save it to your computer for easy reference.
Computer Science Fundamentals and Programming Topics
  Data structures: Lists, stacks, queues, strings, hash maps, vectors, matrices, classes & objects, trees, graphs, etc.
  Algorithms: Recursion, searching, sorting, optimization, dynamic programming, etc.
  Computability and complexity: P vs. NP, NP-complete problems, big-O notation, approximate algorithms, etc.
  Computer architecture: Memory, cache, bandwidth, threads & processes, deadlocks, etc.
Probability and Statistics Topics
  Basic probability: Conditional probability, Bayes rule, likelihood, independence, etc.
  Probabilistic models: Bayes Nets, Markov Decision Processes, Hidden Markov Models, etc.
  Statistical measures: Mean, median, mode, variance, population parameters vs. sample statistics, etc.
  Proximity and error metrics: Cosine similarity, mean-squared error, Manhattan and Euclidean distance, log-loss, etc.
  Distributions and random sampling: Uniform, normal, binomial, Poisson, etc.
  Analysis methods: ANOVA, hypothesis testing, factor analysis, etc.
Data Modeling and Evaluation Topics
  Data preprocessing: Munging/wrangling, transforming, aggregating, etc.
  Pattern recognition: Correlations, clusters, trends, outliers & anomalies, etc.
  Dimensionality reduction: Eigenvectors, Principal Component Analysis, etc.
  Prediction: Classification, regression, sequence prediction, etc.; suitable error/accuracy metrics.
  Evaluation: Training-testing split, sequential vs. randomized cross-validation, etc.
Applying Machine Learning Algorithms and Libraries Topics
  Models: Parametric vs. nonparametric, decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.
  Learning procedure: Linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods; regularization, hyperparameter tuning, etc.
  Tradeoffs and gotchas: Relative advantages and disadvantages, bias and variance, overfitting and underfitting, vanishing/exploding gradients, missing data, data leakage, etc.
Software Engineering and System Design Topics
  Software interface: Library calls, REST APIs, data collection endpoints, database queries, etc.
  User interface: Capturing user inputs & application events, displaying results & visualization, etc.
  Scalability: Map-reduce, distributed processing, etc.
  Deployment: Cloud hosting, containers & instances, microservices, etc.
Move on to the final lesson of this course to find lots of sample practice questions for each topic!
manajalali / Voltage Regulation Using SVM
This code implements the following paper: M. Jalali, V. Kekatos, N. Gatsis and D. Deka, "Designing Reactive Power Control Rules for Smart Inverters Using Support Vector Machines," in IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1759-1770, March 2020, doi: 10.1109/TSG.2019.2942850.
The main code:
1) Loads the data. The data is not included here, so the code should be modified accordingly.
2) Finds the hyperparameters using cross-validation. The main file includes the optimization using the l2 loss function; the functions for l1 optimization are included with the suffix "2".
3) Solves the voltage regulation optimization problem using the Mosek solver.
4) Solves the voltage regulation problem using the optimal power flow problem and the local control rules as well.
The main function includes the following functions:
1) Preprocessing: the scaling, oversizing, centering, and normalizing of the data.
2) KFCrossvalid_SVM: finds the hyperparameters using cross-validation.
3) mosek_crossValid (mosek2_crossValid): located inside KFCrossvalid_SVM; solves the optimization problem.
4) SVM_gauss_mosek and SVM_lin_mosek: solve the actual optimization problems for finding the parameters a and b of the reactive power control rules.
5) localControl: finds the reactive power local control rules.
6) eval_SVM_gauss, eval_SVM_lin: evaluate the reactive power control rules given the measurements and obtained parameters.
7) optimalGlobal (SOCP): solves the central optimal power flow problem.
shinezyy / Gem5 Data Proc
Data preprocessing scripts for gem5 output
Dhairya10 / Medium Data Preprocessing
Repository for the Medium article about data preprocessing