hydromodel

XinAnJiang (XAJ) and other hydrological models
A lightweight Python package for hydrological model calibration and evaluation, featuring the XinAnJiang (XAJ) model.
- Free software: GNU General Public License v3
- Documentation: https://OuyangWenyu.github.io/hydromodel
What is hydromodel
hydromodel is a Python implementation of conceptual hydrological models, with a focus on the XinAnJiang (XAJ) model - one of the most widely-used rainfall-runoff models, especially in China and Asian regions.
Key Features:
- XAJ Model Variants: Standard XAJ and optimized versions (xaj_mz with MizuRoute)
- Multiple Calibration Algorithms:
- SCE-UA: Shuffled Complex Evolution with spotpy
- GA: Genetic Algorithm with DEAP
- scipy: L-BFGS-B, SLSQP, and other gradient-based methods
- Multi-Basin Support: Efficient calibration and evaluation for multiple basins simultaneously
- Unified Results Format: All algorithms save results in standardized JSON + CSV format
- Comprehensive Evaluation Metrics: NSE, KGE, RMSE, PBIAS, and more
- Unified API: Consistent interfaces for calibration, evaluation, and simulation
- Flexible Data Integration: Seamless support for CAMELS datasets via hydrodataset and custom data via hydrodatasource
- Configuration-Based Workflow: YAML configuration for reproducibility
- Progress Tracking: Real-time progress display and intermediate results saving
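As a small illustration of the evaluation metrics listed above, NSE and RMSE can be computed as follows. This is a minimal sketch with made-up arrays, not hydromodel's internal implementation:

```python
import numpy as np

def nse(obs: np.ndarray, sim: np.ndarray) -> float:
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit; 0 means no better than the mean of obs."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs: np.ndarray, sim: np.ndarray) -> float:
    """Root mean square error, in the units of the observations."""
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

obs = np.array([1.0, 2.0, 3.0, 4.0])
sim = np.array([1.1, 1.9, 3.2, 3.8])
print(nse(obs, sim), rmse(obs, sim))  # NSE ≈ 0.98
```

KGE and PBIAS follow the same pattern: scalar scores comparing observed and simulated flow series.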
Why hydromodel?
For Researchers:
- Battle-tested XAJ implementations used in published research
- Configuration-based workflow ensures reproducibility
- Easy to extend with new models or calibration algorithms
For Practitioners:
- Simple YAML configuration, minimal coding required
- Handles multi-basin calibration efficiently
- Integration with global CAMELS series datasets (20+ variants)
- Clear documentation and examples
Installation
For Users
```bash
pip install hydromodel hydrodataset hydrodatasource
```

Or using uv (faster):

```bash
uv pip install hydromodel hydrodataset hydrodatasource
```
Development Setup
For developers, it is recommended to use uv to manage the environment, as this project has local dependencies (e.g., hydroutils, hydrodataset, hydrodatasource).
1. Clone the repository:

   ```bash
   git clone https://github.com/OuyangWenyu/hydromodel.git
   cd hydromodel
   ```

2. Sync the environment with uv (this installs all dependencies, including the local editable packages):

   ```bash
   uv sync --all-extras
   ```
Configuration
Option 1: Use Default Paths (Recommended for Quick Start)
No configuration needed! hydromodel automatically uses default paths:
Default data directory:

- Windows: `C:\Users\YourUsername\hydromodel_data\`
- macOS/Linux: `~/hydromodel_data/`
The default structure (aqua_fetch automatically creates uppercase dataset directories):
```
~/hydromodel_data/
├── datasets-origin/
│   ├── CAMELS_US/        # CAMELS US dataset (created by aqua_fetch)
│   ├── CAMELS_AUS/       # CAMELS Australia dataset (if used)
│   └── ...               # Other datasets
├── datasets-interim/     # Your custom basin data
└── ...
```
Option 2: Custom Paths (For Advanced Users)
Create ~/hydro_setting.yml to specify custom paths:
```yaml
local_data_path:
  root: 'D:/data'
  datasets-origin: 'D:/data'            # For CAMELS datasets (aqua_fetch adds CAMELS_US automatically)
  datasets-interim: 'D:/data/my_basins' # For custom data
```
Important: For CAMELS datasets, provide only the datasets-origin directory. The system automatically appends the uppercase dataset directory name (e.g., CAMELS_US, CAMELS_AUS). If your data is in D:/data/CAMELS_US/, set datasets-origin: 'D:/data'.
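The path convention above can be sketched in a few lines. This is only an illustration of the rule (the helper name is hypothetical, not hydromodel's actual code):

```python
from pathlib import Path

def resolve_dataset_dir(datasets_origin: str, dataset_name: str) -> Path:
    """Append the uppercase dataset directory name to the datasets-origin root,
    mirroring what aqua_fetch does automatically."""
    return Path(datasets_origin) / dataset_name.upper()

print(resolve_dataset_dir("D:/data", "camels_us"))  # D:/data/CAMELS_US
```

So with `datasets-origin: 'D:/data'`, the CAMELS US data is expected at `D:/data/CAMELS_US/`.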
How to Use
1. Data Preparation
Using CAMELS Datasets (hydrodataset):
Getting public datasets using hydrodataset
```bash
pip install hydrodataset
```

Run the following code to download data to your directory:

```python
from hydrodataset.camels_us import CamelsUs

# Auto-downloads if not found. Provide the datasets-origin directory (e.g., "D:/data");
# aqua_fetch automatically appends the dataset name, creating "D:/data/CAMELS_US/".
data_path = "D:/data"
ds = CamelsUs(data_path)
basin_ids = ds.read_object_ids()  # Get basin IDs
```
Note: First-time download may take some time. The complete CAMELS dataset is approximately 70GB (including zipped and unzipped files).
Available datasets: please see README.md in hydrodataset
Using Custom Data (hydrodatasource):
For your own data to be read using hydrodatasource, it needs to be prepared in the `selfmadehydrodataset` format:

```bash
pip install hydrodatasource
```
Data structure:
```
/path/to/your_data_root/
└── my_custom_dataset/            # your dataset name
    ├── attributes/
    │   └── attributes.csv
    ├── shapes/
    │   └── basins.shp
    └── timeseries/
        ├── 1D/                   # one subfolder per time resolution (e.g., 1D/3h/1h)
        │   ├── basin_01.csv
        │   ├── basin_02.csv
        │   └── ...
        └── 1D_units_info.json    # JSON file containing unit information
```
Required files and formats:

- `attributes/attributes.csv`: Basin metadata with required columns
  - `basin_id`: Unique basin identifier (e.g., "basin_001")
  - `area`: Basin area in km² (mapped to `basin_area` internally)
  - Additional columns: Any basin attributes (e.g., elevation, slope)
- `shapes/basins.shp`: Basin boundary shapefiles (all 4 files required: .shp, .shx, .dbf, .prj)
  - Must contain a `BASIN_ID` column (uppercase) matching basin IDs in attributes.csv
  - Geometries: Polygon features defining basin boundaries
  - Coordinate system: Any valid CRS (e.g., EPSG:4326 for WGS84)
- `timeseries/{time_scale}/{basin_id}.csv`: Time series data for each basin
  - `time`: Datetime column (e.g., "2010-01-01")
  - Variable columns: `prcp`, `PET`, `streamflow` (or your chosen variable names)
  - Format: CSV with header row
- `timeseries/{time_scale}_units_info.json`: Variable units metadata
  - JSON format: `{"variable_name": "unit"}` (e.g., `{"prcp": "mm/day"}`)
  - Must match variable names in the time series files
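To make the layout concrete, the sketch below writes a minimal skeleton of this structure using only the standard library. The basin IDs, values, and units are toy placeholders, and the shapefiles must still be produced separately with a GIS tool:

```python
import csv
import json
from pathlib import Path

root = Path("your_data_root/my_custom_dataset")
(root / "attributes").mkdir(parents=True, exist_ok=True)
(root / "timeseries" / "1D").mkdir(parents=True, exist_ok=True)

# attributes/attributes.csv: one row per basin with the required basin_id and area columns
with open(root / "attributes" / "attributes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["basin_id", "area"])
    writer.writerow(["basin_01", 1234.5])

# timeseries/1D/basin_01.csv: datetime column plus forcing/target variables
with open(root / "timeseries" / "1D" / "basin_01.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time", "prcp", "PET", "streamflow"])
    writer.writerow(["2010-01-01", 5.2, 1.1, 0.8])

# timeseries/1D_units_info.json: units keyed by variable name
with open(root / "timeseries" / "1D_units_info.json", "w") as f:
    json.dump({"prcp": "mm/day", "PET": "mm/day", "streamflow": "m^3/s"}, f, indent=2)
```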
For detailed format specifications and examples, see:
- Data Guide - Complete guide for both CAMELS and custom data
- hydrodatasource documentation - Source package
- `configs/example_config_selfmade.yaml` - Complete configuration example for custom datasets
2. Quick Start: Calibration, Evaluation, Simulation, and Visualization
Option 1: Use Command-Line Scripts (Recommended for Beginners)
We provide ready-to-use scripts for model calibration, evaluation, simulation, and visualization:
```bash
# 1. Calibration (saves config files by default)
python scripts/run_xaj_calibration.py --config configs/example_config.yaml

# 2. Evaluation on the test period
python scripts/run_xaj_evaluate.py --calibration-dir results/xaj_mz_SCE_UA

# 3. Simulation with custom parameters (no calibration required!)
python scripts/run_xaj_simulate.py --config configs/example_simulate_config.yaml --param-file configs/example_xaj_params.yaml --plot

# 4. Visualization (time series plots with precipitation and streamflow)
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test

# Visualize specific basins
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test --basins 01013500
```
Configuration Files:
Edit the appropriate configuration file for your data type:
- `configs/example_config.yaml` - For continuous time series data (e.g., CAMELS datasets)
- `configs/example_config_selfmade.yaml` - For custom data and flood event datasets
All configuration options work with the same unified API. For detailed flood event data usage, see Usage Guide - Flood Event Data.
Option 2: Use Python API (For Advanced Users)
```python
from hydromodel.trainers.unified_calibrate import calibrate
from hydromodel.trainers.unified_evaluate import evaluate

config = {
    "data_cfgs": {
        "data_source_type": "camels_us",
        "basin_ids": ["01013500"],
        "train_period": ["1985-10-01", "1995-09-30"],
        "test_period": ["2005-10-01", "2014-09-30"],
        "warmup_length": 365,
        "variables": ["precipitation", "potential_evapotranspiration", "streamflow"],
    },
    "model_cfgs": {
        "model_name": "xaj_mz",
    },
    "training_cfgs": {
        "algorithm_name": "SCE_UA",
        "algorithm_params": {"rep": 5000, "ngs": 1000},
        "loss_config": {"type": "time_series", "obj_func": "RMSE"},
        "output_dir": "results",
        "experiment_name": "my_experiment",
    },
    "evaluation_cfgs": {
        "metrics": ["NSE", "KGE", "RMSE"],
    },
}

results = calibrate(config)  # Calibrate
evaluate(config, param_dir="results/my_experiment", eval_period="test")  # Evaluate
```
Results are saved in the `results/` directory.
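Since all algorithms save results in the standardized JSON + CSV format, the JSON side can be inspected with the standard library. This is only a sketch: the helper name is hypothetical, and you should list your `results/` directory to find the actual file names:

```python
import json
from pathlib import Path

def load_result_jsons(results_dir: str) -> dict:
    """Load every JSON file in a results directory into a dict keyed by file name."""
    out = {}
    for path in Path(results_dir).glob("*.json"):
        with open(path) as f:
            out[path.name] = json.load(f)
    return out

# e.g., summaries = load_result_jsons("results/my_experiment")
```

The CSV files can be opened the same way with `csv` or pandas.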
Core API
Configuration Structure
The unified API uses a configuration dictionary with four main sections:
```python
config = {
    "data_cfgs": {
        "data_source_type": "camels_us",  # Dataset type
        "basin_ids": ["01013500"],        # Basin IDs to calibrate
        "train_period": ["1990-10-01", "2000-09-30"],
        "test_period": ["2000-10-01", "2010-09-30"],
        # ...
```
