Ggcnn
Generative Grasping CNN from "Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach" (RSS 2018)
Note: This is a cleaned-up, PyTorch port of the GG-CNN code. For the original Keras implementation, see the RSS2018 branch.
The main changes are major code clean-ups and documentation, an improved GG-CNN2 model, the ability to use the Jacquard dataset, and simpler evaluation.
Generative Grasping CNN (GG-CNN)
The GG-CNN is a lightweight, fully-convolutional network which predicts the quality and pose of antipodal grasps at every pixel in an input depth image. The lightweight and single-pass generative nature of GG-CNN allows for fast execution and closed-loop control, enabling accurate grasping in dynamic environments where objects are moved during the grasp attempt.
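As a rough illustration of what "a grasp at every pixel" means, the sketch below picks the single best grasp from four per-pixel maps. It assumes the maps correspond to the quality, cos, sin and width outputs of the network, and that the two angle maps encode cos(2θ) and sin(2θ), so the grasp angle is recovered as 0.5 · atan2(sin, cos). That encoding, the helper name `best_grasp` and the toy numbers are assumptions of this sketch, not something this README states.

```python
import math

def best_grasp(quality, cos2t, sin2t, width):
    """Return (row, col, angle_rad, width) of the highest-quality pixel.

    Each argument is a 2D list of floats with identical shape.
    Assumption: cos2t and sin2t hold cos(2*theta) and sin(2*theta).
    """
    best = None
    for r, row in enumerate(quality):
        for c, q in enumerate(row):
            if best is None or q > best[0]:
                best = (q, r, c)
    _, r, c = best
    # Halve the recovered angle because the maps encode twice the grasp angle.
    angle = 0.5 * math.atan2(sin2t[r][c], cos2t[r][c])
    return r, c, angle, width[r][c]

# Toy 2x2 maps (made-up numbers, for illustration only).
quality = [[0.1, 0.2], [0.9, 0.3]]
cos2t = [[1.0, 1.0], [0.0, 1.0]]
sin2t = [[0.0, 0.0], [1.0, 0.0]]
width = [[10.0, 12.0], [25.0, 8.0]]

row, col, angle, grasp_width = best_grasp(quality, cos2t, sin2t, width)
print(row, col, round(math.degrees(angle), 1), grasp_width)
```

In a closed-loop setting this selection would be repeated on every new depth frame, which is why a single fast forward pass matters.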
This repository contains the implementation of the Generative Grasping Convolutional Neural Network (GG-CNN) from the paper:
Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach
Douglas Morrison, Peter Corke, Jürgen Leitner
Robotics: Science and Systems (RSS) 2018
If you use this work, please cite:
@inproceedings{morrison2018closing,
title={{Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach}},
author={Morrison, Douglas and Corke, Peter and Leitner, J\"urgen},
booktitle={Proc.\ of Robotics: Science and Systems (RSS)},
year={2018}
}
Contact
For any questions or comments, contact Doug Morrison.
Installation
This code was developed with Python 3.6 on Ubuntu 16.04. Python requirements can be installed with:
pip install -r requirements.txt
Datasets
Currently, both the Cornell Grasping Dataset and Jacquard Dataset are supported.
Cornell Grasping Dataset
- Download and extract the Cornell Grasping Dataset.
- Convert the PCD files to depth images by running
python -m utils.dataset_processing.generate_cornell_depth <Path To Dataset>
Jacquard Dataset
- Download and extract the Jacquard Dataset.
Pre-trained Models
Some example pre-trained models for GG-CNN and GG-CNN2 can be downloaded from here. The models are trained on the Cornell Grasping Dataset using the depth images. Each zip file contains 1) the full saved model from torch.save(model) and 2) the weights state dict from torch.save(model.state_dict()).
For example, to load GG-CNN (replace ggcnn with ggcnn2 as required):
# Enter the directory where you cloned this repo
cd /path/to/ggcnn
# Download the weights
wget https://github.com/dougsm/ggcnn/releases/download/v0.1/ggcnn_weights_cornell.zip
# Unzip the weights.
unzip ggcnn_weights_cornell.zip
# Load the weights in python, e.g.
python
>>> import torch
# Option 1) Load the model directly.
# (this may print a warning, depending on the installed version of Python)
>>> model = torch.load('ggcnn_weights_cornell/ggcnn_epoch_23_cornell')
>>> model
GGCNN(
(conv1): Conv2d(1, 32, kernel_size=(9, 9), stride=(3, 3), padding=(3, 3))
(conv2): Conv2d(32, 16, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
(conv3): Conv2d(16, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(convt1): ConvTranspose2d(8, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(convt2): ConvTranspose2d(8, 16, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), output_padding=(1, 1))
(convt3): ConvTranspose2d(16, 32, kernel_size=(9, 9), stride=(3, 3), padding=(3, 3), output_padding=(1, 1))
(pos_output): Conv2d(32, 1, kernel_size=(2, 2), stride=(1, 1))
(cos_output): Conv2d(32, 1, kernel_size=(2, 2), stride=(1, 1))
(sin_output): Conv2d(32, 1, kernel_size=(2, 2), stride=(1, 1))
(width_output): Conv2d(32, 1, kernel_size=(2, 2), stride=(1, 1))
)
# Option 2) Instantiate a model and load the weights.
>>> from models.ggcnn import GGCNN
>>> model = GGCNN()
>>> model.load_state_dict(torch.load('ggcnn_weights_cornell/ggcnn_epoch_23_cornell_statedict.pt'))
<All keys matched successfully>
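The layer parameters in the GGCNN printout above determine how the spatial size of the input changes through the network. The sketch below traces one image dimension through those layers using the standard PyTorch size formulas for Conv2d and ConvTranspose2d (with dilation 1). The 300-pixel input size and the helper names are assumptions chosen for illustration.

```python
def conv_out(n, kernel, stride, padding):
    """Output size of Conv2d along one dimension (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

def convt_out(n, kernel, stride, padding, output_padding):
    """Output size of ConvTranspose2d along one dimension (dilation 1)."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding

def ggcnn_output_size(n):
    """Trace one spatial dimension through the layers in the GGCNN printout."""
    n = conv_out(n, 9, 3, 3)      # conv1
    n = conv_out(n, 5, 2, 2)      # conv2
    n = conv_out(n, 3, 2, 1)      # conv3
    n = convt_out(n, 3, 2, 1, 1)  # convt1
    n = convt_out(n, 5, 2, 2, 1)  # convt2
    n = convt_out(n, 9, 3, 3, 1)  # convt3
    return conv_out(n, 2, 1, 0)   # pos/cos/sin/width output heads

print(ggcnn_output_size(300))
```

With a 300-pixel input the output maps come back at 300 pixels, so each output pixel lines up with an input pixel. Other input sizes do not necessarily round-trip exactly (224, for instance, yields 228 under the same formulas).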
Training
Training is done by the train_ggcnn.py script. Run train_ggcnn.py --help to see a full list of options, such as dataset augmentation and validation options.
Some basic examples:
# Train GG-CNN on Cornell Dataset
python train_ggcnn.py --description training_example --network ggcnn --dataset cornell --dataset-path <Path To Dataset>
# Train GG-CNN2 on Jacquard Dataset
python train_ggcnn.py --description training_example2 --network ggcnn2 --dataset jacquard --dataset-path <Path To Dataset>
Trained models are saved in output/models by default, with the validation score appended.
Evaluation/Visualisation
Evaluation or visualisation of the trained networks is done using the eval_ggcnn.py script. Run eval_ggcnn.py --help for a full set of options.
Important flags are:
- --iou-eval to evaluate using the IoU between grasping rectangles metric.
- --jacquard-output to generate output files in the format required for simulated testing against the Jacquard dataset.
- --vis to plot the network output and predicted grasping rectangles.
For example:
python eval_ggcnn.py --network <Path to Trained Network> --dataset jacquard --dataset-path <Path to Dataset> --jacquard-output --iou-eval
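To give a feel for the metric behind --iou-eval, here is a minimal sketch of an IoU between two rotated grasp rectangles, approximated by counting pixel centres on a grid. It is not the code used by eval_ggcnn.py. The rectangle representation, the function names, and the 0.25 IoU and 30 degree angle thresholds (values commonly used for this metric in the grasping literature) are all assumptions of this sketch.

```python
import math

def rect_contains(rect, x, y):
    """True if point (x, y) lies inside a rotated rectangle.

    rect = (cx, cy, angle_rad, length, height): centre, rotation,
    and side lengths along and across the rotated axis.
    """
    cx, cy, angle, length, height = rect
    dx, dy = x - cx, y - cy
    # Rotate the point into the rectangle's own frame.
    u = dx * math.cos(angle) + dy * math.sin(angle)
    v = -dx * math.sin(angle) + dy * math.cos(angle)
    return abs(u) <= length / 2 and abs(v) <= height / 2

def rect_iou(a, b, size=100):
    """Approximate IoU by testing pixel centres on a size x size grid."""
    inter = union = 0
    for y in range(size):
        for x in range(size):
            in_a = rect_contains(a, x + 0.5, y + 0.5)
            in_b = rect_contains(b, x + 0.5, y + 0.5)
            inter += in_a and in_b
            union += in_a or in_b
    return inter / union if union else 0.0

def is_match(pred, truth, iou_thresh=0.25, angle_thresh=math.radians(30)):
    """Assumed success rule: similar angle and sufficient overlap."""
    diff = abs(pred[2] - truth[2]) % math.pi
    diff = min(diff, math.pi - diff)  # a grasp rotated by 180 degrees is the same grasp
    return diff <= angle_thresh and rect_iou(pred, truth) > iou_thresh

# Two axis-aligned 40 x 20 rectangles, offset by 10 pixels (made-up values).
pred = (50, 50, 0.0, 40, 20)
truth = (60, 50, 0.0, 40, 20)
print(rect_iou(pred, truth), is_match(pred, truth))
```

Rasterising keeps the sketch short and avoids polygon clipping, at the cost of a small discretisation error for rectangles that are not aligned with the grid.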
Running on a Robot
For our ROS implementation for running the grasping system, see https://github.com/dougsm/mvp_grasp.
The original implementation for running experiments on a Kinova Mico arm can be found in the repository https://github.com/dougsm/ggcnn_kinova_grasping.