FreeFloatingObjectsDDPG
This code accompanies the paper Manipulation of Free-Floating Objects using Faraday Flows and Deep Reinforcement Learning.
Written using MATLAB 2020b (with the Statistics and Machine Learning Toolbox and the Reinforcement Learning Toolbox) and Python 3 (with the Generic UR5 Controller, kg398/Generic_ur5_controller on GitHub).
Camera Calibration
Camera calibration must be carried out before BO/DRL methods can be run. This consists of 3 parts:
- Connecting to the webcam: CameraConnect.m
- Taking calibration photos of a chequerboard with 30 mm x 30 mm squares, floating on the surface of the water: TakeCalibrationPhotos.m. In the first photo, the chequerboard must be aligned with the coordinate axes of the UR5.
- Generating the calibration parameters using the photos in the CalibrationImages subdirectory: CalibrateCamera.m.
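As a rough illustration of the geometry behind the calibration photos, the sketch below generates the planar coordinates of a chequerboard's inner corners for 30 mm squares, with the board's own axes taken as the reference axes (as in the first photo). It is written in Python rather than MATLAB, and the function name `chequerboard_points` and the 6 x 9 corner count are assumptions made for illustration, not values taken from this repository.

```python
SQUARE_SIZE_MM = 30.0  # square edge length stated in this README


def chequerboard_points(rows, cols, square_size=SQUARE_SIZE_MM):
    """Return (x, y) positions in mm of the inner corners of a chequerboard.

    rows, cols: number of inner corners along y and x (hypothetical values).
    The origin is the first corner; x runs along columns and y along rows.
    """
    return [(c * square_size, r * square_size)
            for r in range(rows)
            for c in range(cols)]


# Hypothetical board with 6 x 9 inner corners.
points = chequerboard_points(rows=6, cols=9)
print(len(points))   # 54
print(points[1])     # (30.0, 0.0)
print(points[-1])    # (240.0, 150.0)
```

Point lists of this kind are what a calibration routine matches against the corners detected in each photo.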
Bayesian Optimisation
Bayesian optimisation is run from BayesianOptimisation.m, which calls the cost function through a handle to i180CostFunction.m. The cost function runs a single iteration, returns a cost, and resets the floating object to its starting position. The reset and run steps are largely identical to those of the DDPG script, described below.
Reinforcement Learning
Reinforcement learning is run from RL_I.m, which sets up the DDPG agent and calls the step and reset functions, step_I.m and reset_I.m.
The process for a single iteration is illustrated below:
[Figure: process for a single iteration]
This process is repeated for up to 5000 iterations.
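The reset/step pattern can be sketched with a toy loop. This is a minimal Python illustration under stated assumptions, not the repository's MATLAB implementation: `ToyEnvironment`, its 1-D state and its reward are hypothetical, a random policy stands in for the DDPG agent, and one action per iteration is assumed because this README does not specify the contents of an iteration.

```python
import random

MAX_ITERATIONS = 5000  # upper limit stated in this README


class ToyEnvironment:
    """Hypothetical stand-in for the physical setup: a 1-D object position."""

    def reset(self):
        # Return the object to its starting position and report the state.
        self.position = 1.0
        return self.position

    def step(self, action):
        # Apply the action, then reward closeness to the target at 0.
        self.position += action
        reward = -abs(self.position)
        return self.position, reward


def run_iterations(env, policy, n_iterations=MAX_ITERATIONS):
    """Run reset -> act -> step once per iteration and collect the rewards."""
    rewards = []
    for _ in range(n_iterations):
        state = env.reset()
        action = policy(state)
        _, reward = env.step(action)
        rewards.append(reward)
    return rewards


random.seed(0)
# A random policy is used here purely as a placeholder for the DDPG agent.
rewards = run_iterations(ToyEnvironment(), lambda s: random.uniform(-1.0, 0.0), 10)
print(len(rewards), max(rewards))
```

In an actual training run, the agent would also update its networks from each transition; that part is deliberately omitted here.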
Result Structures
Bayesian
Bayesian results are stored as MATLAB's BayesianOptimization objects. The tracked paths are stored as nx3 arrays, with columns using a polar coordinate system: time|r|theta.
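For plotting or comparing paths, the polar columns can be converted to Cartesian coordinates. The Python sketch below assumes that theta is stored in radians and measured from the x axis, which this README does not state; the function name `polar_path_to_cartesian` and the sample path are hypothetical.

```python
import math


def polar_path_to_cartesian(path):
    """Convert rows of (time, r, theta) into rows of (time, x, y).

    Assumes theta is in radians, measured from the x axis.
    """
    return [(t, r * math.cos(theta), r * math.sin(theta))
            for t, r, theta in path]


# Hypothetical three-sample path, not data from the repository.
example_path = [(0.0, 0.0, 0.0), (0.5, 10.0, 0.0), (1.0, 10.0, math.pi / 2)]
for t, x, y in polar_path_to_cartesian(example_path):
    print(f"t={t:.1f}  x={x:.2f}  y={y:.2f}")
```

If the stored angles turn out to be in degrees, applying `math.radians` to theta before the conversion would be needed.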
DDPG
Agents at various stages of training are stored as rlDDPGAgent objects. Training progress data is stored in separate Nx1 arrays for results & Q0 values at each iteration, where N is the total number of iterations.
The data file AllRepetitions.mat stores repeated results on the trained agent for different shapes (as RepeatingResults objects) and results recorded during development of each of the 5 tasks (as DevelopingResults objects). Both classes are defined in the Results folder.
The data file MainTests.mat stores the 1000 tests of randomised states on the trained agent as a ShapeResults object, also defined in the Results folder.
David Hardman, 14/04/21
