yuyou-dev / ChatGPT Fine Tuning: Quick-start guide to fine-tuning ChatGPT using Python. Includes scripts for data preprocessing, model training, and evaluation.
connor-hawley / AIS Vessel Data Pipeline: Preprocessing data based on https://marinecadastre.gov/ais/
r2llab / Wrangl: Parallel data preprocessing for NLP and ML.
rohanmistry231 / Parkinsons Disease Classification: A Python-based machine learning project for classifying Parkinson's disease using patient data and algorithms like XGBoost and Random Forest. Includes data preprocessing, feature analysis, and model evaluation with Scikit-learn and Pandas for accurate predictions.
sduprey / Optimal Transaction Execution: This entry covers two topics. The first is based entirely on the following paper: http://sfb649.wiwi.hu-berlin.de/papers/pdf/SFB649DP2011-056.pdf It contains two MATLAB demonstration scripts: DATA_preprocessing.m and VAR_modeling_script.m. DATA_preprocessing.m uses the LOBSTER framework (https://lobster.wiwi.hu-berlin.de/) to preprocess high-frequency data from the NASDAQ TotalView-ITCH feed (CSV files), which allows the order book to be reconstructed exactly at each point in time, up to ten levels deep. See the published script for details. VAR_modeling_script.m models the whole order book as a VEC/VAR process, using the Johansen cointegration framework. After calibrating the VAR model, you assess the impact of an order by applying a shock scenario (sensitivity analysis) to the VAR process. Three scenarios are covered: normal limit order, aggressive limit order, and normal market order. Run the script section by section to open the figures, which contain many graphs. A PowerPoint presentation is included to help you present this complex topic. The second item is based entirely on the following paper: http://www.courant.nyu.edu/~almgren/papers/optliq.pdf It contains a MuPAD document, symbolic_demo.mn. I struggled to get clean results with the symbolic toolbox: I could not drive a continuous workflow and had to recode some equations myself. I nevertheless obtained a closed-form solution for the simplified linear cost model. It also contains a MATLAB demonstration script, working_script.m. For more sophisticated cost models there is no closed-form solution, so the script highlights MATLAB's numerical optimization capabilities (fmincon). An optimization app that you can install is also included. Launch the optimization with the default parameters, then move the slider between volatility risk and liquidation costs to see the trading strategies evolve along the efficient frontier. A PowerPoint presentation is included to help you present this complex topic.
MW55 / Natrix: Open-source bioinformatics pipeline for the preprocessing of raw sequencing data.
ffilipponi / Sentinel 1 GRD Preprocessing: Standard workflow for the preprocessing of Sentinel-1 GRD satellite data
wmichalska / EEG Emotions: Application that prepares data for the learning process, including preprocessing, cleaning, reformatting, and feature extraction using the PyEEG library, followed by learning with scikit-learn.
elena-roff / Price Modeling Rf: Data Analysis and Machine Learning with Python: EDA with ECDF and Correlation analysis, Preprocessing and Feature engineering, L1 (Lasso) Regression and Random Forest Regressor with scikit-learn backed up by cross-validation, grid search and plots of feature importance.
ChaitanyaK77 / Building A Small Language Model SLM: This repository provides a Jupyter Notebook for building a small language model from scratch using the 'TinyStories' dataset. Covers data preprocessing, BPE tokenization, binary storage, GPU memory management, and training a Transformer in PyTorch. Generate sample stories to test your model. Ideal for learning NLP and PyTorch.
Wuito / Estimation Of Residual Life Of Particle Filter Lithium Ion Battery: Uses a particle filtering algorithm to estimate the residual life of lithium-ion batteries, based on the University of Maryland public data set. Preprocessing applies a logarithm in Python. The particle filter is provided in Python and MATLAB. The relevant packages are uploaded together.
ziweiWWANG / EFR: Code and dataset for the paper "A Linear Comb Filter for Event Flicker Removal", ICRA 2022. An asynchronous linear filter to preprocess event data to remove unwanted flicker events from an event stream.
TsLu1s / Atlantic: Automated Data Preprocessing Framework for Machine Learning
poldracklab / Ds003 Post FMRIPrep Analysis: An exemplary task analysis workflow to run on OpenNeuro's ds000003 data, after preprocessing with fMRIPrep
vlivashkin / GPUParallel: Joblib-like interface for parallel GPU computations (e.g. data preprocessing)
tsyoshihara / Alzheimer S Classification EEG: Alzheimer’s Disease (AD) is the most common neurodegenerative disease. It is typically late onset and can develop substantially before diagnosable symptoms appear. Electroencephalogram (EEG) could potentially serve as a noninvasive diagnostic tool for AD. Machine learning can be helpful in making inferences about changes in frequency bands in EEG data and how these changes relate to neural function. The EEG data was sourced from the 2014 paper "Alzheimer’s disease patients classification through EEG signals processing" by Fiscon et al. The subjects were patients with AD, patients with mild cognitive impairment (MCI), and healthy controls. The data had already been preprocessed using a fast Fourier transform (FFT) to take it from the time domain to the frequency domain. Classification effectiveness varied across methods, but generally Fisher’s discriminant analysis (FDA), relevance vector machine, and random forest approaches were most successful. Because feature importances were inconsistent across models, no conclusions could be drawn about which frequency bands matter for classification. Similarly, different frequencies could not be localized to different regions of the brain. Further research is necessary to develop more interpretable models for classification.
NandhanaRameshkumar / Data Preprocessing: No description available
Madhuarvind / Data Preprocessing: No description available
NITHISHKUMAR-C / CODSOFT CREDIT CARD FRAUD DETECTION: Build a machine learning model to identify fraudulent credit card transactions. Preprocess and normalize the transaction data, handle class imbalance issues, and split the dataset into training and testing sets.
Nidhi-Satyapriya / AutoEDA Automated Data Preprocessing Toolkit: The Automated Data Preprocessing Toolkit streamlines the data preprocessing stage in machine learning by automating tasks like handling missing values, encoding categorical features, and normalizing data. With a user-friendly interface for easy dataset uploads, it enhances data quality and improves model performance efficiently.
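
Several entries above name the same three preprocessing steps: handling missing values, encoding categorical features, and normalizing data. The sketch below illustrates those steps in plain Python. It is not taken from any listed repository; the function names (impute_mean, one_hot, min_max) and the sample data are hypothetical, and mean imputation, one-hot encoding, and min-max scaling are assumed here as one common choice for each step.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def one_hot(labels):
    """Encode labels as one-hot rows; columns follow sorted category order."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def min_max(values):
    """Rescale values linearly to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample columns, used only for illustration.
ages = [20, None, 40]
colors = ["red", "blue", "red"]

ages_filled = impute_mean(ages)      # missing value replaced by the mean
ages_scaled = min_max(ages_filled)   # values rescaled to the range [0, 1]
colors_encoded = one_hot(colors)     # columns are ["blue", "red"]
```

In practice, libraries such as scikit-learn and pandas, which several of the listed projects mention, provide equivalents of these steps; the stdlib version above is only meant to show what each step does to a column.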