TactileACT
Incorporating Tactile Signals into the ACT framework for peg insertion tasks
Visuo-Tactile Pretraining for Cable Plugging
This repo contains the code for the paper available at https://arxiv.org/abs/2403.11898
Repo Structure
imitate_episodes.py: Trains ACT, using either pretrained or non-pretrained encoders.
clip_pretraining.py: Pretrains the vision and tactile encoders using a CLIP-style contrastive loss.
robot_operation.py: Executes a trained policy on a Franka robot.
policy.py: Creates the ACT policy.
clip_tsne.py: Plots t-SNE graphs of the pretrained embedding space.
data_collection: Folder containing data collection/processing scripts.
inspect_hdf5_file.py: Contains helper functions for inspecting collected data.
utils.py: Dataloader and additional utility functions.
visualization_utils.py: Helper functions to visualize trajectories during training.
base_config.json: Base config for training. Reduces the number of command line arguments needed. All values can be overridden on the command line.
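For readers unfamiliar with the CLIP-style contrastive loss mentioned for clip_pretraining.py, the idea can be sketched as follows. This is a minimal illustration in plain Python, not the repo's implementation (which uses PyTorch and may differ in detail). The function names and the temperature value of 0.07 are assumptions made for the example. Each vision embedding is paired with the tactile embedding at the same index, and the loss is low when matching pairs are more similar than mismatched ones.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(vision_emb, tactile_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    vision_emb[i] and tactile_emb[i] are assumed to come from the same
    timestep, so they form the positive pair; all other pairings in the
    batch act as negatives. The temperature is a hypothetical default.
    """
    v = [l2_normalize(e) for e in vision_emb]
    t = [l2_normalize(e) for e in tactile_emb]
    # Cosine similarity between every vision/tactile pair, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t] for vi in v]
    logits_transposed = [list(col) for col in zip(*logits)]
    # Average the vision-to-tactile and tactile-to-vision directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed))


# Toy batch: matching pairs point in the same direction.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]])
# Same embeddings, but the tactile batch is swapped so the pairs mismatch.
shuffled = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 3.0], [2.0, 0.0]])
```

In this toy batch the aligned pairing gives a loss near zero, while the swapped pairing gives a much larger loss, which is the signal that pulls matching vision and tactile embeddings together during pretraining.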
Installation
conda create -n TactileACT python=3.8
conda activate TactileACT
pip install torchvision
pip install torch
pip install pyyaml
pip install pexpect
pip install opencv-python
pip install matplotlib
pip install einops
pip install packaging
pip install h5py
pip install ipython
pip install tqdm
cd detr && pip install -e .
Example Usages
To train ACT:
python imitate_episodes.py --config base_config.json --save_dir data/data_dir --name pretrained_vision_tactile --batch_size 4 --kl_weight 10 --z_dimension 32 --num_epochs 4000 --dropout 0.025 --chunk_size 30 --backbone clip_backbone --gelsight_backbone_path data/clip_models/gelsight_encoder.pth --vision_backbone_path data/clip_models/vision_encoder.pth
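The command above combines a base config file with command line overrides. One way such a pattern can work is sketched below. This is an illustration under stated assumptions, not the parsing logic in imitate_episodes.py: the function load_config and the config values shown are hypothetical, and the sketch handles only int, float, and string values.

```python
import argparse
import json


def load_config(config_text, cli_args):
    """Merge JSON config defaults with command line overrides.

    Hypothetical helper: every key in the JSON config becomes an optional
    --key flag, and any flag given on the command line replaces the
    config value. Only int, float, and str values are handled here.
    """
    config = json.loads(config_text)
    parser = argparse.ArgumentParser()
    for key, default in config.items():
        # Reuse the type of the config value to parse the override.
        parser.add_argument(f"--{key}", type=type(default), default=None)
    overrides = vars(parser.parse_args(cli_args))
    # Keep the config value unless the flag was actually supplied.
    config.update({k: v for k, v in overrides.items() if v is not None})
    return config


# Illustrative values only; these are not the contents of base_config.json.
base_config_text = '{"batch_size": 8, "kl_weight": 10, "chunk_size": 20, "backbone": "resnet18"}'
merged = load_config(base_config_text, ["--batch_size", "4", "--chunk_size", "30"])
```

Here batch_size and chunk_size take the command line values, while kl_weight and backbone keep the values from the config text.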
Notes:
As the paper is under review, this repo is still under development and may change, and the code may not be fully documented. If you have any questions about the repo, or want any advice on using visuo-tactile pretraining for your own project, please do not hesitate to reach out to aigeorge@andrew.cmu.edu. Enjoy!
