OceanGym: A Benchmark Environment for Underwater Embodied Agents
OceanGym is a high-fidelity embodied underwater environment that simulates a realistic ocean setting with diverse scenes. As illustrated in the figure, OceanGym establishes a robust benchmark for evaluating autonomous agents through a series of challenging tasks, spanning perception analysis and decision-making navigation. The platform supports multi-modal perception and provides action spaces for continuous control.
- OceanGym supports a wide range of underwater targets and allows users to freely create, edit, and customize these objects within the environment.
- The platform incorporates water-flow and hydrodynamic simulation (with some discrepancy from real-world dynamics), as well as depth-dependent lighting and visibility modeling, enabling reproduction of underwater conditions.
- Users can flexibly modify environmental parameters, inject new scenes, or adjust task settings, serving as a versatile testbed for benchmarking and developing underwater autonomous agents.
We have provided a teaching demonstration video here: bilibili
💐 Acknowledgement
OceanGym environment is built upon Unreal Engine (UE) 5.3, with certain components developed by drawing inspiration from and partially based on HoloOcean. We sincerely acknowledge their valuable contribution.
🔔 News
- 12-2025, we updated the world to support underwater current simulation.
- 10-2025, we released the initial version of OceanGym along with the accompanying paper.
- 04-2025, we launched the OceanGym project.
Contents:
- 💐 Acknowledgement
- 🔔 News
- 📺 Quick Start
- ⚙️ Set up Environment
- 🧠 Decision Task
- 👀 Perception Task
- ⏱️ Results
- 📚 Datasets
- 🚩 Citation
📺 Quick Start
Install the experimental code environment using pip:
pip install -r requirements.txt
Decision Task
Make sure the environment is ready first! Set up the environment by following the instructions here.
Step 1: Run a Task Script
For example, to run task 4:
python decision\tasks\task4.py
Follow the keyboard instructions or switch to LLM mode for automatic decision-making.
Step 2: Keyboard Control Guide
| Key | Action |
|-----|------------------------------|
| W | Move Forward |
| S | Move Backward |
| A | Move Left |
| D | Move Right |
| J | Turn Left |
| L | Turn Right |
| I | Move Up |
| K | Move Down |
| M | Switch to LLM Mode |
| Q | Exit |
You can use WASD for movement, J/L for turning, and I/K for up/down. Press M to switch to large language model mode (this may cause temporary lag). Press Q to exit.
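The key bindings above can be sketched as a simple dispatch table. This is a minimal illustration only, not the actual task script; the action names below are hypothetical placeholders for the handlers the real scripts define:

```python
# Minimal sketch of the keyboard dispatch described above.
# Action names are hypothetical; the real task scripts define their own handlers.
KEY_ACTIONS = {
    "w": "move_forward",
    "s": "move_backward",
    "a": "move_left",
    "d": "move_right",
    "j": "turn_left",
    "l": "turn_right",
    "i": "move_up",
    "k": "move_down",
    "m": "switch_to_llm_mode",
    "q": "exit",
}

def dispatch(key: str) -> str:
    """Map a pressed key to its action name; unknown keys become a no-op."""
    return KEY_ACTIONS.get(key.lower(), "noop")
```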
Step 3: View Results
Logs and memory files are automatically saved in the log/ and memory/ directories.
Step 4: Evaluate the results
Place the generated memory and important_memory files into the corresponding point folders.
Then, set the evaluation paths in the evaluate.py file.
We provide 6 experimental evaluation paths. In evaluate.py, you can configure them as follows:
eval_roots = [
os.path.join(eval_root, "main", "gpt4omini"),
os.path.join(eval_root, "main", "gemini"),
os.path.join(eval_root, "main", "qwen"),
os.path.join(eval_root, "migration", "gpt4o"),
os.path.join(eval_root, "migration", "qwen"),
os.path.join(eval_root, "scale", "qwen"),
]
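The six paths above follow a regular (group, model) pattern, so they can also be built programmatically. A minimal sketch, assuming the same directory layout as the snippet above (the helper name is ours, not from evaluate.py):

```python
import os

def build_eval_roots(eval_root: str) -> list:
    """Build the six evaluation paths described above from (group, model) pairs."""
    settings = [
        ("main", "gpt4omini"),
        ("main", "gemini"),
        ("main", "qwen"),
        ("migration", "gpt4o"),
        ("migration", "qwen"),
        ("scale", "qwen"),
    ]
    return [os.path.join(eval_root, group, model) for group, model in settings]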
To run the evaluation:
python decision\utils\evaluate.py
The generated results will be saved under the eval\decision folder.
Perception Task
All commands below are written for Linux; if you are using Windows, adjust the path separators accordingly (backslashes instead of forward slashes).
Step 1: Prepare the dataset
After downloading from Hugging Face or Google Drive, put it into the data/perception folder.
Step 2: Select model parameters
| Parameter | Function |
| --- | --- |
| model_template | The message-queue template for the large language model you selected. |
| model_name_or_path | For an API model, the model name; for a local model, the path. |
| api_key | For an API model, enter your key. |
| base_url | For an API model, enter its base URL. |
Currently, we only support OpenAI, Google Gemma, Qwen, and OpenBMB.
MODELS_TEMPLATE="Yours"
MODEL_NAME_OR_PATH="Yours"
API_KEY="Yours"
BASE_URL="Yours"
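The four parameters above can be bundled into a small config helper. A minimal sketch under one assumption of ours (not the repo's): a local model is distinguished from an API model by whether the given path exists on disk:

```python
import os

def build_model_config(model_template, model_name_or_path, api_key=None, base_url=None):
    """Bundle the perception-task model parameters described above.

    Assumption (illustration only): a local model is identified by an existing
    filesystem path; any other value is treated as an API model name.
    """
    is_local = os.path.exists(model_name_or_path)
    config = {
        "model_template": model_template,
        "model_name_or_path": model_name_or_path,
        "is_local": is_local,
    }
    if not is_local:
        # API models additionally need credentials and an endpoint.
        config["api_key"] = api_key
        config["base_url"] = base_url
    return config
```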
Step 3: Run the experiments
| Parameter | Function |
| --- | --- |
| exp_name | Customize the name of the experiment to save the results. |
| exp_idx | Select the experiment number, or enter "all" to select all. |
| exp_json | JSON file containing the experiment label data. |
| images_dir | The folder where the experimental image data is stored. |
For the experiment types, we designed (1) a multi-view perception task and (2) a context-based perception task.
For the lighting conditions, we designed (1) high illumination and (2) low illumination.
For the auxiliary sonar, we designed (1) no sonar image, (2) zero-shot sonar image, and (3) sonar image with a few sonar examples.
For example, the following command evaluates the multi-view perception task under high illumination:
python perception/eval/mv.py \
--exp_name Result_MV_highLight_00 \
--exp_idx "all" \
--exp_json "/data/perception/highLight.json" \
--images_dir "/data/perception/highLight" \
--model_template $MODELS_TEMPLATE \
--model_name_or_path $MODEL_NAME_OR_PATH \
--api_key $API_KEY \
--base_url $BASE_URL
For more patterns about perception tasks, please read this part carefully.
⚙️ Set up Environment
This project is based on the HoloOcean environment. 💐
We have placed a simplified version here. If you encounter any issues with the details, please refer to the original installation documentation.
We have provided a teaching demonstration video here: bilibili
Install the OceanGym_large.zip
Download OceanGym_large.zip from ☁️ <a href="https://drive.google.com/file/d/1EfKHeiyQD5eoJ6-EsiJHuIdBRM5Ope5A/view?usp=drive_link" target="_blank">Google Drive</a> or ☁️ <a href="https://pan.baidu.com/s/16h86huHLeFGAKatRWvLrFQ?pwd=wput" target="_blank">Baidu Drive</a>, and extract it to the folder you want.
Packaged Installation
- Python Library
From the cloned repository, install the Python package by doing the following:
cd OceanGym_large/client
pip install .
- Worlds Packages