# CrackDetect (Machine-learning approach for real-time assessment of road pavement service life based on vehicle fleet data)
<img src="https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge&logo=PyTorch&logoColor=white"> <img src="https://img.shields.io/badge/Weights_&_Biases-FFBE00?style=for-the-badge&logo=WeightsAndBiases&logoColor=white"> <img src="https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue">
Repository containing code for the project Machine-learning approach for real-time assessment of road pavement service life based on vehicle fleet data. Complete pipeline including data preprocessing, feature extraction, model training and prediction. An overview of the project can be found in the user manual.
## Results
Our results are found in `reports/figures/our_model_results/`.
<p align="center"> <img align="center" src="reports/figures/jupyter/POI/iri_kpis_map.png" alt="drawing" width=90%/> </p>

## Quickstart
- Clone this repository:
  `git clone https://github.com/rreezN/CrackDetect.git`
- (Optional) Create a virtual environment (see Virtual environment in powershell below). Note: this project requires python >= 3.10.
- Install requirements:
  `pip install -r requirements.txt`
- Download the data from sciencedata.dk, unzip it, and place it in the `data` folder (see the data section).
- Call `wandb disabled` if you have not set up a suitable wandb project. (The project and entity information has been hard-coded into `src/train_hydra_mr.py` in the `wandb.init()` call.)
- Run `python src/main.py all`
This will run all steps of the pipeline, from data preprocessing to model prediction. At the end, a plot will appear showing our (FleetYeeters) results alongside those of the newly trained model. Features are extracted using a Hydra model from all signals except location signals. The `main.py` script is set up to recreate our results, so all arguments are pre-specified.
It is possible to call `main.py` with individual steps, or to begin from a certain step. To call an individual step, replace `all` with the desired step. Possible steps are:

```
python src/main.py [all, make_data, extract_features, train_model, predict_model, validate_model]
```
Additionally, if you wish to start from a specific step, skipping the steps before it, add the `--begin-from` argument. For example, to start from `predict_model` you would call:

```
python src/main.py --begin-from predict_model
```
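As an illustration, a `--begin-from` dispatcher over an ordered step list can be sketched as follows. This is a hypothetical sketch, not the actual code in `src/main.py`; the `runner` parameter is an assumption standing in for the real step functions:

```python
# Hypothetical sketch of a --begin-from pipeline dispatcher.
# The real src/main.py may be structured differently.
STEPS = ["make_data", "extract_features", "train_model",
         "predict_model", "validate_model"]

def run_pipeline(step="all", begin_from=None, runner=print):
    """Run a single step, all steps, or everything from `begin_from` onwards."""
    if begin_from is not None:
        # Slice the ordered step list from the requested step to the end.
        selected = STEPS[STEPS.index(begin_from):]
    elif step == "all":
        selected = list(STEPS)
    else:
        selected = [step]
    for name in selected:
        runner(name)  # the real script would call the step's function here
    return selected
```

The ordering of `STEPS` encodes the pipeline dependencies, so starting later assumes the earlier steps' outputs already exist on disk.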
If you wish to go through each step manually with your own arguments, call each script directly with its own arguments:

- Create the dataset with `python src/data/make_dataset.py all`
- Extract features with `python src/data/feature_extraction.py`
- Train the model with `python src/train_hydra_mr.py`
- Predict using the trained model with `python src/predict_model.py`
- See the results in `reports/figures/model_results`
## Table of Contents
- CrackDetect (Machine-learning approach for real-time assessment of road pavement service life based on vehicle fleet data)
- Results
- Quickstart
- Table of Contents
- Installation
- Usage
- Credits
- License
## Installation

- Clone this repository:
  `git clone https://github.com/rreezN/CrackDetect.git`
- Install requirements. Note: this project requires python >= 3.10 to run.

There are two options for installing requirements. If you wish to set up a dedicated python virtual environment for the project, follow the steps in Virtual environment in powershell. If not, simply run the following command and all python modules required to run the project will be installed:

```
python -m pip install -r requirements.txt
```
### Virtual environment in powershell

Requires python >= 3.10.

- `cd CrackDetect` -- change directory to the repository
- `python -m venv fleetenv` -- create the environment
- `Set-ExecutionPolicy -Scope CurrentUser RemoteSigned` -- change the execution policy if necessary (to be executed in powershell)
- `.\fleetenv\Scripts\Activate.ps1` -- activate the venv
- `python -m pip install -U pip setuptools wheel`
- `python -m pip install -r requirements.txt`
- Profit

To activate the venv in powershell:

```
.\fleetenv\Scripts\Activate.ps1
```
## Usage
There are several steps in the pipeline of this project. Detailed explanations of each step, and how to use it in code, can be found in the notebooks in `notebooks/`.
### Downloading the data
The data is made available at sciencedata.dk.
Once downloaded it should be unzipped and placed in the empty data/ folder. The file structure should be as follows:
- data
  - raw
    - AutoPi_CAN
      - platoon_CPH1_HH.hdf5
      - platoon_CPH1_VH.hdf5
      - read_hdf5_platoon.m
      - read_hdf5.m
      - readme.txt
      - visualize_hdf5.m
    - gopro
      - car1
        - GH012200
          - GH012200_HERO8 Black-ACCL.csv
          - GH012200_HERO8 Black-GPS5.csv
          - GH012200_HERO8 Black-GYRO.csv
          - ...
        - ...
      - car3
        - ...
    - ref_data
      - cph1_aran_hh.csv
      - cph1_aran_vh.csv
      - cph1_fric_hh.csv
      - cph1_fric_vh.csv
      - cph1_iri_mpd_rut_hh.csv
      - cph1_iri_mpd_rut_vh.csv
      - cph1_zp_hh.csv
      - cph1_zp_vh.csv
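A quick sanity check that the data landed in the right place can be sketched as below. This is a hypothetical helper (the `missing_files` function and the `EXPECTED` subset of paths are assumptions drawn from the tree above, not an official project script):

```python
# Hypothetical helper to verify the expected raw-data layout.
# EXPECTED lists a representative subset of the paths shown in the tree above.
from pathlib import Path

EXPECTED = [
    "raw/AutoPi_CAN/platoon_CPH1_HH.hdf5",
    "raw/AutoPi_CAN/platoon_CPH1_VH.hdf5",
    "raw/ref_data/cph1_aran_hh.csv",
    "raw/ref_data/cph1_iri_mpd_rut_hh.csv",
]

def missing_files(data_dir="data"):
    """Return the expected paths that are not present under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    for rel in missing_files():
        print(f"missing: data/{rel}")
```

An empty result means the checked subset of the download is in place; anything printed points at a file that still needs to be unzipped or moved.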
### Preprocessing the data
The data goes through several preprocessing steps before it is ready for use in the feature extractor.
- Convert
- Validate
- Segment
- Matching
- Resampling
- KPIs
To run all preprocessing steps:

```
python src/data/make_dataset.py all
```
A single step can be run by replacing `all` with the desired step (e.g. `matching`). You can also run from a given step to the end, e.g. from (and including) `validate`:

```
python src/data/make_dataset.py --begin-from validate
```
The main data preprocessing script is found in `src/data/make_dataset.py`. It has the following arguments and default parameters:

- `mode` (default: `all`)
- `--begin-from` (default: `False`)
- `--skip-gopro` (default: `False`)
- `--speed-threshold` (default: `5`)
- `--time-threshold` (default: `10`)
- `--verbose` (default: `False`)
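For readers extending the preprocessing, the options above correspond to a parser along these lines. This is a sketch under the assumption that the script uses standard `argparse`; the real `make_dataset.py` may define its arguments differently:

```python
# Sketch of an argument parser matching the options listed above.
# Assumed to mirror, not reproduce, the real make_dataset.py.
import argparse

def build_parser():
    p = argparse.ArgumentParser(description="Preprocess the raw platoon data.")
    p.add_argument("mode", nargs="?", default="all",
                   help="preprocessing step to run, or 'all'")
    p.add_argument("--begin-from", dest="begin_from", default=None,
                   help="run from this step (inclusive) to the end")
    p.add_argument("--skip-gopro", action="store_true")
    p.add_argument("--speed-threshold", type=float, default=5)
    p.add_argument("--time-threshold", type=float, default=10)
    p.add_argument("--verbose", action="store_true")
    return p
```

Note that `--begin-from` maps to the attribute `begin_from`, which is why both spellings appear in argparse-based CLIs.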
### Feature extraction
There are two feature extractors implemented in this repository: HYDRA and MultiRocket. They are found in `src/models/hydra` and `src/models/multirocket`.

The main feature extraction script is found in `src/data/feature_extraction.py`. It has the following arguments and default parameters:

- `--cols` (default: `acc.xyz_0 acc.xyz_1 acc.xyz_2`)
- `--all_cols` (default: `False`)
- `--all_cols_wo_location` (default: `False`)
- `--feature_extractor` (default: `both`; choices: `multirocket`, `hydra`, `both`)
- `--mr_num_features` (default: `50000`)
- `--hydra_k` (default: `8`)
- `--hydra_g` (default: `64`)
- `--subset` (default: `None`)
- `--name_identifier` (default: empty string)
- `--folds` (default: `5`)
- `--seed` (default: `42`)
To extract features using HYDRA and MultiRocket, call

```
python src/data/feature_extraction.py
```
The script will automatically set up the feature extractors based on the number of columns (1 = univariate, >1 = multivariate). The features will be stored in `data/processed/features.hdf5`, along with the statistics used to standardize during training and prediction. Features and statistics will be saved under the feature extractors' names as defined in the model scripts.
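The idea of storing standardization statistics with the features can be sketched as follows. This is a minimal illustration of the concept, not the project's actual implementation; the function names are assumptions:

```python
# Minimal sketch of standardizing features with stored statistics,
# as described above. Not the project's actual code.
import numpy as np

def fit_statistics(features):
    """Compute per-feature mean/std on the training split only."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # guard against constant (zero-variance) features
    return mean, std

def standardize(features, mean, std):
    """Apply the stored statistics to any split (train, val, or test)."""
    return (features - mean) / std
```

Persisting `mean` and `std` next to the features is what lets prediction reuse the exact same scaling that was fit at training time.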
The structure of the HDF5 features file can be seen below
<div style="text-align:center"> <img src="reports/figures/features_tree.png" style="width:50%"> </div>

You can print the structure of your own `features.hdf5` file with `src/data/check_hdf5.py` by calling

```
python src/data/check_hdf5.py
```
`check_hdf5.py` has the following arguments and defaults:

- `--file_path` (default: `data/processed/features.hdf5`)
- `--limit` (default: `3`)
- `--summary` (default: `False`)
### Model training
A simple model has been implemented in `src/models/hydramr.py`. The model training script is implemented in `src/train_hydra_mr.py`. It has the following arguments and default parameters:

- `--epochs` (default: `50`)
- `--batch_size` (default: `32`)
- `--lr` (default: `1e-3`)
- `--feature_extractors` (default: `HydraMV_8_64`)
- `--name_identifier` (default: empty string)
- `--folds` (default: `5`)
- `--model_name` (default: `HydraMRRegressor`)
- `--weight_decay` (default: `0.0`)
- `--hidden_dim` (default: `64`)
- `--project_name` (default: `hydra_mr_test`, for wandb)
- `--dropout` (default: `0.5`)
- `--model_depth` (default: `0`)
- `--batch_norm` (default: `False`)
To train the model using Hydra features on a multivariate dataset, call

```
python src/train_hydra_mr.py
```
The trained model will be saved in `models/`, along with the be
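Both the feature extraction and training scripts expose `--folds` and `--seed` options. A seeded k-fold split of sample indices, of the kind those options suggest, can be sketched as below; this is a generic illustration, not the project's actual cross-validation code:

```python
# Hypothetical illustration of seeded k-fold splitting, in the spirit of
# the --folds / --seed options above. Not the project's actual code.
import random

def kfold_indices(n_samples, folds=5, seed=42):
    """Return (train_idx, val_idx) pairs over a seeded shuffle of indices."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # fixed seed makes folds reproducible
    fold_size = n_samples // folds
    splits = []
    for k in range(folds):
        start = k * fold_size
        # Last fold absorbs the remainder when n_samples % folds != 0.
        end = (k + 1) * fold_size if k < folds - 1 else n_samples
        val = idx[start:end]
        val_set = set(val)
        train = [i for i in idx if i not in val_set]
        splits.append((train, val))
    return splits
```

Fixing the seed matters here: it guarantees that the folds used for feature extraction and those used later during training refer to the same partition of trips.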
