RoboticPerception
Explore object recognition, segmentation, and how to process depth data from camera sensors to help a robot better understand and navigate its world.
3D Perception
<p align="center"> <img src="./pr2_robot/misc/PR2.gif"/> </p>
This Udacity clone project is modeled after the Amazon Robotics Challenge. Giving a robot the ability to locate an object in a cluttered environment, pick it up, and then move it to some other location is not just an interesting problem to try to solve; it is a challenge at the forefront of the robotics industry today.
A perception pipeline, based on Exercises 1, 2, and 3 from the RoboND Perception Exercises, is created to identify target objects from a so-called “Pick-List” in a particular order, pick up those objects, and place them in corresponding dropboxes.
Perception Pipeline
Tabletop Segmentation
Tabletop segmentation applies a series of filtering techniques and RANSAC plane fitting to segment a table in a point cloud.
Downsample your point cloud by applying a Voxel Grid Filter
Running computations on a full-resolution point cloud can be slow and may not improve the results obtained from a more sparsely sampled point cloud. Therefore, in many cases it is advantageous to downsample the data.
A VoxelGrid Downsampling Filter is used to derive a point cloud that has fewer points but still does a good job of representing the input point cloud as a whole. Just as the word "pixel" is short for "picture element", the word "voxel" is short for "volume element". And just as we can divide a 2D image into a regular grid of area elements, we can divide a 3D point cloud into a regular 3D grid of volume elements. Each individual cell in the grid is now a voxel, and the 3D grid is known as a "Voxel Grid".
Voxel Grid Code
def do_voxel_grid_downsampling(pcl_data, leaf_size):
    '''
    Create a VoxelGrid filter object for an input point cloud
    :param pcl_data: point cloud data subscriber
    :param leaf_size: voxel (or leaf) size
    :return: voxel grid downsampled point cloud
    '''
    vox = pcl_data.make_voxel_grid_filter()
    vox.set_leaf_size(leaf_size, leaf_size, leaf_size)
    return vox.filter()

# Convert ROS msg to PCL data
cloud = ros_to_pcl(pcl_msg)

# Voxel Grid Downsampling
LEAF_SIZE = 0.01
cloud = do_voxel_grid_downsampling(cloud, LEAF_SIZE)
A 2D image divided into a regular grid of area elements (left picture) and a 3D point cloud divided into a grid of volume elements (right picture)

Voxel Downsampling Sample code
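Conceptually, voxel grid downsampling buckets points into cubic cells of side `leaf_size` and replaces each occupied cell with the centroid of its points. A minimal pure-NumPy sketch of that idea (illustrative only, not PCL's implementation):

```python
import numpy as np

def voxel_downsample(points, leaf_size):
    """Replace all points falling into the same cubic voxel by their centroid."""
    # Integer voxel coordinates for every point.
    keys = np.floor(points / leaf_size).astype(np.int64)
    # Group points by voxel and average each group.
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True,
                                   return_counts=True)
    inverse = inverse.ravel()
    sums = np.zeros((len(counts), 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]

rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 1.0, size=(1000, 3))
downsampled = voxel_downsample(cloud, leaf_size=0.1)  # far fewer points
```

The output has one point per occupied voxel, so a larger `leaf_size` means coarser (and faster) downstream processing.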
Apply Statistical Outlier Filtering
Statistical Outlier Filtering is used to remove outliers, here with the number of neighboring points set to 10 and the standard deviation multiplier threshold set to 0.001.
Statistical Outlier Filtering Code
def do_statistical_outlier_filtering(pcl_data, mean_k, thresh):
    '''
    :param pcl_data: point cloud data subscriber
    :param mean_k: number of neighboring points to analyze for any given point
    :param thresh: any point with a mean distance larger than the global mean plus thresh standard deviations is considered an outlier
    :return: statistical outlier filtered point cloud data
    '''
    outlier_filter = pcl_data.make_statistical_outlier_filter()
    outlier_filter.set_mean_k(mean_k)
    outlier_filter.set_std_dev_mul_thresh(thresh)
    return outlier_filter.filter()

# Convert ROS msg to PCL data
cloud = ros_to_pcl(pcl_msg)

# Statistical Outlier Filtering
cloud = do_statistical_outlier_filtering(cloud, 10, 0.001)
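Conceptually, the filter computes each point's mean distance to its `mean_k` nearest neighbors and discards points whose mean distance exceeds the global mean plus a multiple of the standard deviation. A brute-force pure-NumPy sketch of that idea (illustrative only, not PCL's implementation):

```python
import numpy as np

def statistical_outlier_filter(points, mean_k=10, std_mul=1.0):
    """Brute-force sketch of statistical outlier removal on an (N, 3) array."""
    # Full pairwise distance matrix (fine for small clouds).
    diff = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diff, axis=2)
    # Mean distance from each point to its mean_k nearest neighbours
    # (column 0 is the point itself at distance 0, so skip it).
    knn = np.sort(dists, axis=1)[:, 1:mean_k + 1]
    mean_dists = knn.mean(axis=1)
    threshold = mean_dists.mean() + std_mul * mean_dists.std()
    return points[mean_dists <= threshold]

rng = np.random.default_rng(0)
dense = rng.normal(0.0, 0.05, size=(200, 3))   # tight cluster (kept)
stray = rng.uniform(2.0, 3.0, size=(5, 3))     # isolated points (removed)
filtered = statistical_outlier_filter(np.vstack([dense, stray]))
```

PCL replaces the O(N²) distance matrix with a k-d tree search, which is why it scales to real sensor clouds.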
Apply a Pass Through Filter to isolate the table and objects.
When we have prior information about the location of a target in the scene, we can apply a Pass Through Filter to remove useless data from our point cloud. The Pass Through Filter works like a cropping tool: it crops any given 3D point cloud by specifying an axis with cut-off values along that axis. The region that we allow to pass through is referred to as the region of interest.
Applying a Pass Through Filter along the z axis (the height with respect to the ground) to our tabletop scene in the range 0.1 to 0.8 gives us the table and filters out all of the objects.

Applying a Pass Through Filter with the range 0.6 to 1.1 isolates our region of interest, containing the table and the objects on the table.

PassThrough Filter Code
def do_passthrough(pcl_data, filter_axis, axis_min, axis_max):
    '''
    Create a PassThrough object and assign a filter axis and range.
    :param pcl_data: point cloud data subscriber
    :param filter_axis: filter axis
    :param axis_min: minimum axis value for the passthrough filter object
    :param axis_max: maximum axis value for the passthrough filter object
    :return: passthrough filtered point cloud
    '''
    passthrough = pcl_data.make_passthrough_filter()
    passthrough.set_filter_field_name(filter_axis)
    passthrough.set_filter_limits(axis_min, axis_max)
    return passthrough.filter()

# Convert ROS msg to PCL data
cloud = ros_to_pcl(pcl_msg)

# PassThrough Filter
filter_axis = 'z'
axis_min = 0.44
axis_max = 0.85
cloud = do_passthrough(cloud, filter_axis, axis_min, axis_max)

filter_axis = 'x'
axis_min = 0.33
axis_max = 1.0
cloud = do_passthrough(cloud, filter_axis, axis_min, axis_max)
Pass Through Filter Sample code
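The cropping itself is just a boolean mask on one coordinate. A minimal pure-NumPy sketch on an (N, 3) array (illustrative names, not PCL's API):

```python
import numpy as np

def passthrough(points, axis, axis_min, axis_max):
    """Keep only points whose coordinate on `axis` lies in [axis_min, axis_max]."""
    idx = {'x': 0, 'y': 1, 'z': 2}[axis]
    mask = (points[:, idx] >= axis_min) & (points[:, idx] <= axis_max)
    return points[mask]

cloud = np.array([[0.5, 0.0, 0.2],    # below the table -> cropped
                  [0.5, 0.0, 0.6],    # in the region of interest -> kept
                  [0.5, 0.0, 1.5]])   # above the scene -> cropped
roi = passthrough(cloud, 'z', 0.44, 0.85)
```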
Perform RANSAC plane filtering to identify the table.
To remove the table completely from the scene, we can use a popular technique known as Random Sample Consensus (RANSAC). RANSAC is an algorithm used to identify points in our dataset that belong to a particular model. In a 3D scene, the model can be a plane, a cylinder, a box, or any other common shape.
The algorithm assumes that all of the data in a dataset is composed of both inliers and outliers.
- Inliers are points that fit a particular model with a specific set of parameters.
- Outliers are points that do not fit that model, and they get discarded.
By modeling the table as a plane, we can remove it from the point cloud.

RANSAC Plane Filtering Code
def do_ransac_plane_segmentation(pcl_data, pcl_sac_model_plane, pcl_sac_ransac, max_distance):
    '''
    Create the segmentation object
    :param pcl_data: point cloud data subscriber
    :param pcl_sac_model_plane: used to determine plane models
    :param pcl_sac_ransac: RANdom SAmple Consensus
    :param max_distance: max distance for a point to be considered fitting the model
    :return: segmentation object
    '''
    seg = pcl_data.make_segmenter()
    seg.set_model_type(pcl_sac_model_plane)
    seg.set_method_type(pcl_sac_ransac)
    seg.set_distance_threshold(max_distance)
    return seg

# Convert ROS msg to PCL data
cloud = ros_to_pcl(pcl_msg)

# RANSAC Plane Segmentation
ransac_segmentation = do_ransac_plane_segmentation(cloud, pcl.SACMODEL_PLANE, pcl.SAC_RANSAC, 0.01)

# Extract inliers and outliers
cloud_table, cloud_objects = extract_cloud_objects_and_cloud_table(cloud, ransac_segmentation)
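To build intuition for what the segmenter computes under the hood (and for what the project's `extract_cloud_objects_and_cloud_table` helper splits apart), here is a minimal pure-NumPy sketch of RANSAC plane fitting on a synthetic scene. The names and the synthetic data are illustrative; this is not PCL's implementation:

```python
import numpy as np

def ransac_plane(points, max_distance=0.01, iterations=100, rng=None):
    """Fit a plane to `points` (N x 3) with RANSAC.

    Returns a boolean mask marking points within `max_distance`
    of the best plane found (the inliers).
    """
    rng = np.random.default_rng(rng)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        # Sample 3 distinct points and derive the plane they span.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:               # degenerate (collinear) sample
            continue
        normal /= norm
        # Perpendicular distance of every point to the candidate plane.
        distances = np.abs((points - sample[0]) @ normal)
        mask = distances < max_distance
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Synthetic scene: a flat "table" at z = 0.5 plus "objects" above it.
rng = np.random.default_rng(0)
table = np.column_stack([rng.uniform(0, 1, 500),
                         rng.uniform(0, 1, 500),
                         np.full(500, 0.5)])
objects = np.column_stack([rng.uniform(0, 1, 100),
                           rng.uniform(0, 1, 100),
                           rng.uniform(0.55, 0.8, 100)])
cloud = np.vstack([table, objects])

inliers = ransac_plane(cloud, max_distance=0.01, iterations=50, rng=1)
cloud_table = cloud[inliers]       # plane inliers  -> the table
cloud_objects = cloud[~inliers]    # everything else -> the objects
```

The two extracted clouds correspond to the `negative=False` and `negative=True` extractions PCL performs on the segmenter's inlier indices.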
Euclidean Clustering with ROS and PCL
To perform Euclidean Clustering, a k-d tree needs to be constructed from the 'cloud_objects' point cloud.
The k-d tree data structure is used in the Euclidean Clustering algorithm to decrease the computational burden of searching for neighboring points. While other efficient algorithms/data structures for nearest neighbor search exist, PCL's Euclidean Clustering algorithm only supports k-d trees.

Euclidean Cluster Extraction Code
def do_euclidean_clustering(white_cloud):
    '''
    :param white_cloud: colorless (XYZ-only) point cloud of the objects
    :return: cluster cloud and cluster indices
    '''
    tree = white_cloud.make_kdtree()

    # Create Cluster-Mask Point Cloud to visualize each cluster separately
    ec = white_cloud.make_EuclideanClusterExtraction()
    ec.set_ClusterTolerance(0.015)
    ec.set_MinClusterSize(50)
    ec.set_MaxClusterSize(20000)
    ec.set_SearchMethod(tree)
    cluster_indices = ec.Extract()

    cluster_color = get_color_list(len(cluster_indices))
    color_cluster_point_list = []
    for j, indices in enumerate(cluster_indices):
        for i, indice in enumerate(indices):
            color_cluster_point_list.append([white_cloud[indice][0],
                                             white_cloud[indice][1],
                                             white_cloud[indice][2],
                                             rgb_to_float(cluster_color[j])])

    cluster_cloud = pcl.PointCloud_PointXYZRGB()
    cluster_cloud.from_list(color_cluster_point_list)
    return cluster_cloud, cluster_indices

# Euclidean Clustering
white_cloud = XYZRGB_to_XYZ(cloud_objects)
cluster_cloud, cluster_indices = do_euclidean_clustering(white_cloud)
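Conceptually, Euclidean Cluster Extraction is a flood fill: any two points closer than the cluster tolerance end up in the same cluster. A pure-Python sketch using a brute-force neighbor search in place of the k-d tree (illustrative names and parameters, not PCL's implementation):

```python
import numpy as np

def euclidean_cluster(points, tolerance, min_size=1, max_size=None):
    """Flood-fill clustering: points closer than `tolerance` join a cluster."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = [seed], [seed]
        while queue:
            p = queue.pop()
            # Brute-force neighbor search; PCL uses the k-d tree here.
            near = [q for q in unvisited
                    if np.linalg.norm(points[p] - points[q]) < tolerance]
            for q in near:
                unvisited.remove(q)
            queue.extend(near)
            cluster.extend(near)
        if len(cluster) >= min_size and (max_size is None or len(cluster) <= max_size):
            clusters.append(sorted(cluster))
    return clusters

pts = np.array([[0.00, 0.0, 0.0], [0.01, 0.0, 0.0],   # cluster A
                [1.00, 0.0, 0.0], [1.01, 0.0, 0.0]])  # cluster B
clusters = euclidean_cluster(pts, tolerance=0.05)
```

The min/max cluster sizes play the same role as `set_MinClusterSize`/`set_MaxClusterSize` above: dropping sensor noise and over-merged blobs.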
Object Recognition
Object Recognition is a central theme in computer vision and perception for robotics. When we view a scene with our eyes, we are constantly performing the task of object recognition.
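For a robot, each cluster produced by the segmentation steps above is typically recognized by extracting a feature vector and classifying it, for example with an SVM trained on known objects. A minimal sketch of one common feature, a normalized color histogram, assuming each cluster's colors are available as an (N, 3) RGB array (all names are illustrative):

```python
import numpy as np

def color_histogram_features(rgb_points, nbins=32, bins_range=(0, 256)):
    """Concatenate per-channel histograms of an (N, 3) RGB array into
    a single normalized feature vector."""
    r_hist = np.histogram(rgb_points[:, 0], bins=nbins, range=bins_range)[0]
    g_hist = np.histogram(rgb_points[:, 1], bins=nbins, range=bins_range)[0]
    b_hist = np.histogram(rgb_points[:, 2], bins=nbins, range=bins_range)[0]
    features = np.concatenate([r_hist, g_hist, b_hist]).astype(np.float64)
    return features / features.sum()   # normalize so cluster size drops out

rng = np.random.default_rng(0)
cluster_rgb = rng.integers(0, 256, size=(200, 3))   # stand-in for one cluster
feature = color_histogram_features(cluster_rgb)
```

Normalizing the histogram makes the feature independent of how many points a cluster contains, so small and large views of the same object produce comparable vectors.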