
cfgrib: A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes

.. image:: https://img.shields.io/pypi/v/cfgrib.svg
   :target: https://pypi.python.org/pypi/cfgrib/

Python interface to map GRIB files to the `Unidata's Common Data Model v4 <https://docs.unidata.ucar.edu/netcdf-java/current/userguide/common_data_model_overview.html>`_ following the `CF Conventions <http://cfconventions.org/>`_. The high level API is designed to support a GRIB engine for `xarray <http://xarray.pydata.org/>`_ and it is inspired by `netCDF4-python <http://unidata.github.io/netcdf4-python/>`_ and `h5netcdf <https://github.com/shoyer/h5netcdf>`_. Low level access and decoding is performed via the `ECMWF ecCodes library <https://confluence.ecmwf.int/display/ECC/>`_ and the `eccodes python package <https://pypi.org/project/eccodes>`_.

Features with development status Beta:

  • enables the engine='cfgrib' option to read GRIB files with xarray,
  • reads most GRIB 1 and 2 files including heterogeneous ones with cfgrib.open_datasets,
  • supports all modern versions of Python (3.9, 3.8, 3.7) and PyPy3,
  • the 0.9.6.x series with support for Python 2 will stay active and receive critical bugfixes,
  • works wherever eccodes-python does: Linux, MacOS and Windows,
  • conda-forge package on all supported platforms,
  • reads the data lazily and efficiently in terms of both memory usage and disk access,
  • allows larger-than-memory and distributed processing via xarray and dask,
  • supports translating coordinates to different data models and naming conventions,
  • supports writing the index of a GRIB file to disk, to save a full-file scan on open,
  • accepts objects implementing a generic Fieldset interface as described in ADVANCED_USAGE.rst.

Work in progress:

  • Beta: install a ``cfgrib`` utility that can convert a GRIB file ``to_netcdf`` with an optional conversion to a specific coordinates data model, see `#40 <https://github.com/ecmwf/cfgrib/issues/40>`_.
  • Alpha/Broken: support writing carefully-crafted ``xarray.Dataset`` objects to a GRIB1 or GRIB2 file, see the *Advanced write usage* section below, `#18 <https://github.com/ecmwf/cfgrib/issues/18>`_ and `#156 <https://github.com/ecmwf/cfgrib/issues/156>`_.

Limitations:

  • relies on ecCodes for the CF attributes of the data variables,
  • relies on ecCodes for anything related to coordinate systems / ``gridType``, see `#28 <https://github.com/ecmwf/cfgrib/issues/28>`_.

Installation

The easiest way to install cfgrib and all its binary dependencies is via `Conda <https://conda.io/>`_::

    $ conda install -c conda-forge cfgrib

Alternatively, if you install the binary dependencies yourself, you can install the Python package from PyPI with::

    $ pip install cfgrib

Binary dependencies

cfgrib depends on the `eccodes python package <https://pypi.org/project/eccodes>`_ to access the ECMWF ecCodes binary library. When not using conda, please follow the *System dependencies* section there.

You may run a simple selfcheck command to ensure that your system is set up correctly::

    $ python -m cfgrib selfcheck
    Found: ecCodes v2.20.0.
    Your system is ready.
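
The same check can be scripted, for example in a CI job. The following is a minimal sketch, not part of cfgrib: the helper name ``run_selfcheck`` is hypothetical, and it assumes that ``python -m cfgrib selfcheck`` exits with a non-zero status when cfgrib or ecCodes cannot be loaded.

```python
import subprocess
import sys


def run_selfcheck(timeout=60):
    """Run ``python -m cfgrib selfcheck`` and return ``(ok, output)``."""
    # Use the current interpreter so the check targets the active environment.
    cmd = [sys.executable, "-m", "cfgrib", "selfcheck"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "selfcheck timed out"
    # Assumption: a non-zero exit status means cfgrib or ecCodes is not usable.
    output = (proc.stdout + proc.stderr).strip()
    return proc.returncode == 0, output


ok, output = run_selfcheck()
print("ready" if ok else "not ready")
print(output)
```

When the check fails, ``output`` carries whatever the command printed, which is usually enough to tell a missing Python package from a missing ecCodes binary library.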

Usage

First, you need a well-formed GRIB file. If you don't have one at hand, you can download our `ERA5 on pressure levels sample <https://sites.ecmwf.int/repository/earthkit-data/test-data/era5-levels-members.grib>`_::

    $ wget https://sites.ecmwf.int/repository/earthkit-data/test-data/era5-levels-members.grib

Read-only xarray GRIB engine

Most cfgrib users want to open a GRIB file as an ``xarray.Dataset`` and need to have xarray installed::

    $ pip install xarray

In a Python interpreter try:

.. code-block:: python

    >>> import xarray as xr
    >>> ds = xr.open_dataset('era5-levels-members.grib', engine='cfgrib')
    >>> ds
    <xarray.Dataset>
    Dimensions:        (number: 10, time: 4, isobaricInhPa: 2, latitude: 61,
                        longitude: 120)
    Coordinates:
      * number         (number) int64 0 1 2 3 4 5 6 7 8 9
      * time           (time) datetime64[ns] 2017-01-01 ... 2017-01-02T12:00:00
        step           timedelta64[ns] ...
      * isobaricInhPa  (isobaricInhPa) float64 850.0 500.0
      * latitude       (latitude) float64 90.0 87.0 84.0 81.0 ... -84.0 -87.0 -90.0
      * longitude      (longitude) float64 0.0 3.0 6.0 9.0 ... 351.0 354.0 357.0
        valid_time     (time) datetime64[ns] ...
    Data variables:
        z              (number, time, isobaricInhPa, latitude, longitude) float32 ...
        t              (number, time, isobaricInhPa, latitude, longitude) float32 ...
    Attributes:
        GRIB_edition:            1
        GRIB_centre:             ecmf
        GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
        GRIB_subCentre:          0
        Conventions:             CF-1.7
        institution:             European Centre for Medium-Range Weather Forecasts
        history:                 ...

The cfgrib engine supports all read-only features of xarray, such as:

  • merging the content of several GRIB files into a single dataset using ``xarray.open_mfdataset``,
  • working with larger-than-memory datasets with `dask <https://dask.org/>`_,
  • distributed processing with `dask.distributed <http://distributed.dask.org>`_.
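
As a rough sketch of the first two points, several GRIB files can be opened lazily as one dask-backed dataset. The helper name ``open_grib_files`` and the file names and chunk sizes in the comments are hypothetical, and both xarray and dask are assumed to be installed.

```python
def open_grib_files(paths, chunks=None):
    """Open several GRIB files lazily as a single xarray.Dataset."""
    # Imported here so that defining the helper does not require xarray.
    import xarray as xr

    return xr.open_mfdataset(
        paths,
        engine="cfgrib",       # decode every file with cfgrib
        combine="by_coords",   # align the files on their coordinate values
        chunks=chunks,         # e.g. {"time": 1}; None keeps one chunk per file
    )


# Hypothetical usage, with file names that are placeholders:
#   ds = open_grib_files(["part-1.grib", "part-2.grib"], chunks={"time": 1})
#   mean_t = ds.t.mean("number").compute()  # data is only read at this point
```

Nothing is loaded until ``compute()`` (or a similar call) is made, which is what makes larger-than-memory processing possible.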

Read arbitrary GRIB keys

By default cfgrib reads a limited set of ecCodes recognised keys from the GRIB files and exposes them as ``Dataset`` or ``DataArray`` attributes with the ``GRIB_`` prefix. To have cfgrib read additional keys into the attributes, add a ``read_keys`` entry to ``backend_kwargs`` whose value is the list of desired GRIB keys:

.. code-block:: python

    >>> ds = xr.open_dataset('era5-levels-members.grib', engine='cfgrib',
    ...                      backend_kwargs={'read_keys': ['experimentVersionNumber']})
    >>> ds.t.attrs['GRIB_experimentVersionNumber']
    '0001'
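
The naming rule is simply the GRIB key with ``GRIB_`` prepended. The sketch below illustrates it with a plain dictionary standing in for ``ds.t.attrs``, so it runs without a GRIB file; the helper names ``build_backend_kwargs`` and ``extra_grib_attrs`` are hypothetical and not part of cfgrib.

```python
GRIB_PREFIX = "GRIB_"


def build_backend_kwargs(keys):
    """Return backend_kwargs asking cfgrib to read extra GRIB keys."""
    return {"read_keys": list(keys)}


def extra_grib_attrs(attrs, keys):
    """Pick the requested GRIB keys out of an attrs mapping, without the prefix."""
    found = {}
    for key in keys:
        name = GRIB_PREFIX + key
        if name in attrs:
            found[key] = attrs[name]
    return found


keys = ["experimentVersionNumber"]
backend_kwargs = build_backend_kwargs(keys)

# Stand-in for ds.t.attrs; with a real file, pass ds.t.attrs instead.
attrs = {"GRIB_experimentVersionNumber": "0001", "units": "K"}
print(extra_grib_attrs(attrs, keys))  # {'experimentVersionNumber': '0001'}
```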

Translate to a custom data model

Contrary to netCDF, the GRIB data format is not self-describing, and several details of the mapping to the Unidata Common Data Model are arbitrarily set by the software components decoding the format. Details like names and units of the coordinates are particularly important because xarray broadcast and selection rules depend on them. ``cf2cdm`` is a small coordinate translation module distributed with cfgrib that makes it easy to translate CF compliant coordinates, like the ones provided by cfgrib, to a user-defined custom data model with set ``out_name``, ``units`` and ``stored_direction``.

For example, to translate a cfgrib styled ``xr.Dataset`` to the classic ECMWF coordinate naming conventions you can:

.. code-block:: python

    >>> import cf2cdm
    >>> ds = xr.open_dataset('era5-levels-members.grib', engine='cfgrib')
    >>> cf2cdm.translate_coords(ds, cf2cdm.ECMWF)
    <xarray.Dataset>
    Dimensions:     (number: 10, time: 4, level: 2, latitude: 61, longitude: 120)
    Coordinates:
      * number      (number) int64 0 1 2 3 4 5 6 7 8 9
      * time        (time) datetime64[ns] 2017-01-01 ... 2017-01-02T12:00:00
        step        timedelta64[ns] ...
      * level       (level) float64 850.0 500.0
      * latitude    (latitude) float64 90.0 87.0 84.0 81.0 ... -84.0 -87.0 -90.0
      * longitude   (longitude) float64 0.0 3.0 6.0 9.0 ... 348.0 351.0 354.0 357.0
        valid_time  (time) datetime64[ns] ...
    Data variables:
        z           (number, time, level, latitude, longitude) float32 ...
        t           (number, time, level, latitude, longitude) float32 ...
    Attributes:
        GRIB_edition:            1
        GRIB_centre:             ecmf
        GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
        GRIB_subCentre:          0
        Conventions:             CF-1.7
        institution:             European Centre for Medium-Range Weather Forecasts
        history:                 ...

To translate to the Common Data Model of the Climate Data Store use:

.. code-block:: python

    >>> import cf2cdm
    >>> cf2cdm.translate_coords(ds, cf2cdm.CDS)
    <xarray.Dataset>
    Dimensions:                  (realization: 10, forecast_reference_time: 4,
                                plev: 2, lat: 61, lon: 120)
    Coordinates:
      * realization              (realization) int64 0 1 2 3 4 5 6 7 8 9
      * forecast_reference_time  (forecast_reference_time) datetime64[ns] 2017-01...
        leadtime                 timedelta64[ns] ...
      * plev                     (plev) float64 8.5e+04 5e+04
      * lat                      (lat) float64 -90.0 -87.0 -84.0 ... 84.0 87.0 90.0
      * lon                      (lon) float64 0.0 3.0 6.0 9.0 ... 351.0 354.0 357.0
        time                     (forecast_reference_time) datetime64[ns] ...
    Data variables:
        z                        (realization, forecast_reference_time, plev, lat, lon) float32 ...
        t                        (realization, forecast_reference_time, plev, lat, lon) float32 ...
    Attributes:
        GRIB_edition:            1
        GRIB_centre:             ecmf
        GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
        GRIB_subCentre:          0
        Conventions:             CF-1.7
        institution:             European Centre for Medium-Range Weather Forecasts
        history:                 ...
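
To make the three translation steps concrete (rename, convert units, fix the stored direction), here is a toy sketch in plain Python. It is not how ``cf2cdm`` is implemented: ``TOY_MODEL``, ``UNIT_FACTORS`` and ``translate_coord`` are hypothetical names, and the model entries are only modelled on the output shown above.

```python
# Toy data model: target name, units and ordering for some CF coordinates.
TOY_MODEL = {
    "isobaricInhPa": {"out_name": "plev", "units": "Pa", "stored_direction": "decreasing"},
    "latitude": {"out_name": "lat", "units": "degrees_north", "stored_direction": "increasing"},
}

# Multiplicative unit conversions known to this sketch.
UNIT_FACTORS = {("hPa", "Pa"): 100.0}


def translate_coord(name, values, units, model):
    """Return (out_name, values, units) for one coordinate."""
    spec = model.get(name)
    if spec is None:
        return name, list(values), units  # unknown coordinates pass through
    out_units = spec.get("units", units)
    if out_units != units:
        # Raises KeyError for conversions this sketch does not know about.
        factor = UNIT_FACTORS[(units, out_units)]
        values = [v * factor for v in values]
    direction = spec.get("stored_direction")
    if direction is not None:
        # A real translation must reorder the data along this axis as well.
        values = sorted(values, reverse=(direction == "decreasing"))
    return spec["out_name"], list(values), out_units


print(translate_coord("isobaricInhPa", [850.0, 500.0], "hPa", TOY_MODEL))
# ('plev', [85000.0, 50000.0], 'Pa')
print(translate_coord("latitude", [90.0, 0.0, -90.0], "degrees_north", TOY_MODEL))
# ('lat', [-90.0, 0.0, 90.0], 'degrees_north')
```

The results mirror the CDS output above, where ``plev`` is expressed in Pa and ``lat`` is stored in increasing order.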

Filter heterogeneous GRIB files

xr.open_dataset can open a GRIB file only if all the messages with the same shortName can be represented as a single hypercube. For example, a variable t cannot have both isobaricInhPa and ``hy
