HTransRL
Hybrid Transformer based Multi-agent Reinforcement Learning (HTransRL) is a method for drone coordination in air corridors. It addresses state inputs whose dimensions and types change dynamically, which cannot be handled by traditional MARL.
Install / Use
/learn @SECNetLabUNM/HTransRL

README
Hybrid Transformer based Multi-agent Reinforcement Learning for Multiple Unmanned Aerial Vehicle Coordination in Air Corridors
Modeling
Air Corridor, Cylinder and Torus
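As an illustration of this geometry, membership in a cylinder or torus corridor segment can be tested with simple distance checks. This is a minimal pure-Python sketch, not the repository's model: the function names and the origin-centered, z-aligned placement are assumptions.

```python
import math

def in_cylinder(p, radius, length):
    """Is point p = (x, y, z) inside a cylinder centered at the origin,
    with its axis along z, given radius, and given length? (assumed placement)"""
    x, y, z = p
    return math.hypot(x, y) <= radius and abs(z) <= length / 2

def in_torus(p, major_r, minor_r):
    """Is point p inside a torus centered at the origin in the xy-plane,
    with major radius major_r (ring of tube centers) and tube radius minor_r?"""
    x, y, z = p
    ring_dist = math.hypot(x, y) - major_r  # distance from the tube-center ring, in-plane
    return math.hypot(ring_dist, z) <= minor_r
```

A corridor sequence such as cttc would then be a chain of such segments, with transfers occurring where a cylinder meets a torus.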

Animation
cttc, 1-transfer
4 air corridors (cylinder-torus-torus-cylinder); 12 UAVs; 4 static and 3 mobile

cttcttcttc, 3-transfer
10 air corridors (cylinder-torus-torus-cylinder-torus-torus-cylinder-torus-torus-cylinder); 12 UAVs; 4 static and 3 mobile

RL Training
Network Structure
- The embedding network normalizes input values and standardizes input dimensions.
- The Transformer processes information from a dynamically changing set of neighbors using encoders and decoders.
- The actor-critic network outputs the estimated state value and a stochastic action in spherical coordinates.
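The three stages above can be sketched end to end. This is an illustrative NumPy forward pass under stated assumptions, not the repository's implementation: the feature sizes, weight shapes, and squashing functions are all hypothetical, and the real model presumably uses learned PyTorch modules.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared embedding dimension (assumption)

def embed(x, w):
    """Embedding stage: normalize a raw feature vector and project it to D
    dims, so inputs of different lengths and types share one dimension."""
    x = (x - x.mean()) / (x.std() + 1e-8)
    return x @ w  # w has shape (len(x), D)

def attention(q, K, V):
    """Scaled dot-product attention over a variable number of neighbor rows,
    standing in for the Transformer encoder/decoder stage."""
    scores = K @ q / np.sqrt(D)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

# Hypothetical feature sizes: own state has 6 features, each neighbor 4.
w_self = rng.normal(size=(6, D))
w_nbr = rng.normal(size=(4, D))
own = embed(rng.normal(size=6), w_self)
neighbors = np.stack([embed(rng.normal(size=4), w_nbr) for _ in range(3)])
ctx = attention(own, neighbors, neighbors)       # 3 neighbors...
ctx_one = attention(own, neighbors[:1], neighbors[:1])  # ...or 1: same code

# Actor-critic heads on the concatenated representation.
h = np.concatenate([own, ctx])
w_v = rng.normal(size=(2 * D,))
w_a = rng.normal(size=(2 * D, 3))
value = h @ w_v   # critic: scalar state-value estimate
a = h @ w_a       # actor head (pre-squash action parameters)

# Squash into spherical coordinates: radius >= 0, polar in [0, pi],
# azimuth in [-pi, pi] (choice of squashing functions is an assumption).
r = np.log1p(np.exp(a[0]))            # softplus
polar = np.pi / (1 + np.exp(-a[1]))   # pi * sigmoid
azimuth = np.pi * np.tanh(a[2])
```

Because attention reduces any number of neighbor rows to one fixed-size context vector, the downstream actor-critic heads never see a change in input dimension.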

Training Files
Train a single parameter set: main.py
Train a batch via parameter grid search: batched_grid_search.sh
Actor and critic models are saved every 0.25 million steps. Training progress is visualized via terminal logs and TensorBoard.
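The checkpoint cadence can be sketched as follows. This is a minimal illustration: SAVE_EVERY, maybe_save, and save_fn are hypothetical names, and save_fn stands in for whatever actually serializes the actor and critic (e.g. a torch.save call in the real code).

```python
SAVE_EVERY = 250_000  # 0.25 million steps, per the text

def maybe_save(step, save_fn):
    """Invoke save_fn(step) every SAVE_EVERY training steps.

    save_fn is a callback that persists the actor/critic models;
    returns True when a checkpoint was written at this step.
    """
    if step > 0 and step % SAVE_EVERY == 0:
        save_fn(step)
        return True
    return False
```

Called once per training step inside the loop, this yields checkpoints at steps 250k, 500k, 750k, and so on.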
