Hdf5storage
Python package to read and write a wide range of Python types to/from HDF5 formatted files. Can read/write data to the HDF5 based Matlab v7.3 MAT files.
Overview
This Python package provides high level utilities to read/write a variety of Python types to/from HDF5 (Hierarchical Data Format) formatted files. This package also provides support for MATLAB MAT v7.3 formatted files, which are just HDF5 files with a different extension and some extra meta-data.
All of this is done without pickling data. Pickling is bad for security because unpickling allows arbitrary code to be executed in the interpreter. One wants to be able to read HDF5 and MAT files from possibly untrusted sources, so pickling is avoided in this package.
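The risk can be seen with a small, harmless sketch that uses only the standard library. The ``Payload`` class is hypothetical and exists only for illustration; it asks the unpickler to call ``eval`` on a string, standing in for whatever code an attacker might choose.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form is a function call, not data."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of any object state.
        return (eval, ("6 * 7",))


# Serializing does not run the expression; it only records the call.
blob = pickle.dumps(Payload())

# Loading the bytes runs eval() inside the interpreter doing the loading.
result = pickle.loads(blob)

# The loader gets back whatever the call returned, not a Payload instance.
print(type(result).__name__, result)
```

Anything importable could take the place of ``eval`` here, which is why loading a pickle from an untrusted source amounts to running untrusted code.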
The package's documentation is found at http://pythonhosted.org/hdf5storage/
The package's source code is found at https://github.com/frejanordsiek/hdf5storage
The package is licensed under a 2-clause BSD license (https://github.com/frejanordsiek/hdf5storage/blob/master/COPYING.txt).
Installation
Dependencies
This package only supports Python >= 3.7. Python < 3.7 support was dropped in version 0.2.
This package requires the following Python packages to run
* numpy <https://pypi.org/project/numpy>_
* h5py <https://pypi.org/project/h5py>_ >= 3.3
* setuptools <https://pypi.org/project/setuptools>_
Note that support for h5py <https://pypi.org/project/h5py>_ 2.1 to 3.2.x
has been dropped in version 0.2.
This package also has the following optional dependencies
* scipy <https://pypi.org/project/scipy>_
Installing by pip
This package is on PyPI <https://pypi.org>_ at
hdf5storage <https://pypi.org/project/hdf5storage>_. To install hdf5storage
using pip, run the command::
pip install hdf5storage
Installing from Source
To install hdf5storage from source,
setuptools <https://pypi.org/project/setuptools>_ >= 61.0.0 is required.
Download this package and then install the dependencies ::
pip install -r requirements.txt
Then, to install the package, run ::
pip install .
Running Tests
For testing, the package pytest <https://pypi.org/project/pytest>_
(>= 6.0) is additionally required. Some tests require
scipy <https://pypi.org/project/scipy>_ to be installed and Matlab to be
in the executable path. In addition, some tests require
Julia <http://julialang.org/>_ with the
MAT <https://github.com/simonster/MAT.jl>_ package. If any of these are
missing, the tests that need them are skipped but all the other tests
still run. To install all testing dependencies other than
scipy <https://pypi.org/project/scipy>_, Julia, and Matlab, run ::
pip install -r requirements_tests.txt
To run the tests ::
pytest
Building Documentation
The documentation additionally requires the following packages
* sphinx <https://pypi.org/project/sphinx>_ >= 1.7
* sphinx_rtd_theme <https://pypi.org/project/sphinx-rtd-theme>_
The documentation dependencies can be installed by ::
pip install -r requirements_doc.txt
To build the HTML documentation, run ::
sphinx-build doc/source doc/build/html
Development
All Python code is formatted using black <https://pypi.org/project/black>_.
Releases and Pull Requests should pass all unit tests, and ideally pass type
checking and have no warnings found by linting.
Type Checking
This package has had type annotations since version 0.2, which can be checked
with a type checker like mypy <https://pypi.org/project/mypy>_. To check with
mypy <https://pypi.org/project/mypy>_, run ::
mypy -p hdf5storage
Linting
This package has configuration in pyproject.toml for linting with
ruff <https://pypi.org/project/ruff>_ and pylint <https://pypi.org/project/pylint>_.
To lint with ruff <https://pypi.org/project/ruff>_, run ::
ruff .
To lint with pylint <https://pypi.org/project/pylint>_, run ::
pylint src/*/*.py
Python 2
This package no longer supports Python 2.6 and 2.7. This package was designed and written for Python 3, then backported to Python 2.x, and then Python 2 support was dropped. It can still read files made by version 0.1.x of this library with Python 2.x, and it still tries to write files compatible with 0.1.x when possible.
Hierarchical Data Format 5 (HDF5)
HDF5 (see http://www.hdfgroup.org/HDF5/) is a commonly used file format for the exchange of numerical data. It has built-in support for a large variety of number formats (signed and unsigned integers, floating point numbers, strings, etc.) as scalars and arrays, as well as enums and compound types. It also handles differences in data representation on different hardware platforms (endianness, different floating point formats, etc.). As the name suggests, data in an HDF5 file is organized hierarchically in a form modelled on a Unix filesystem (Datasets are equivalent to files, Groups are equivalent to directories, and links are supported).
This package interfaces with HDF5 files using the h5py package (http://www.h5py.org/) as opposed to the PyTables package (http://www.pytables.org/).
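The filesystem analogy can be sketched directly with h5py. This is only an illustration of the HDF5 layout, not of this package's own API; the file name, group names, and attribute are made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    # Groups behave like directories; missing parent groups are created.
    grp = f.create_group("experiment/run1")
    # Datasets behave like files that hold typed arrays.
    grp.create_dataset("samples", data=np.arange(6, dtype=np.int32).reshape(2, 3))
    # Attributes attach small pieces of metadata to a group or dataset.
    grp["samples"].attrs["units"] = "counts"
    # A soft link gives the dataset a second path, much like a symlink.
    f["latest"] = h5py.SoftLink("/experiment/run1/samples")

with h5py.File(path, "r") as f:
    top_level = sorted(f.keys())
    samples = f["/experiment/run1/samples"][()]
    linked = f["latest"][()]
    units = f["latest"].attrs["units"]

print(top_level, samples.shape, units)
```

Reading through the ``latest`` link returns the same array as reading the full path, since both names resolve to one dataset.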
MATLAB MAT v7.3 file support
MATLAB (http://www.mathworks.com/) MAT files version 7.3 and later are
HDF5 files with a different file extension (.mat) and a very
specific set of meta-data and storage conventions. This package provides
read and write support for a limited set of Python and MATLAB types.
SciPy (http://scipy.org/) has functions to read and write the older MAT
file formats. This package has functions modeled after the
scipy.io.savemat and scipy.io.loadmat functions, that have the
same names and similar arguments. They dispatch to the SciPy versions if
the MAT file format is not an HDF5-based one.
Supported Types
The supported Python and MATLAB types are given in the tables below. The tables assume that one has imported collections and numpy as::
import collections as cl
import numpy as np
The table gives which Python types can be read and written, the first version of this package to support each one, the numpy type it gets converted to for storage (which is also what it is read back as if type information is not written), the MATLAB class it becomes if targeting a MAT file, and the first version of this package to support writing it so that MATLAB can read it.
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| Python                                                 | MATLAB                | Notes             |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| Type               | Version | Converted to            | Class       | Version |                   |
+====================+=========+=========================+=============+=========+===================+
| bool               | 0.1     | np.bool_ or np.uint8    | logical     | 0.1     | [1]_              |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| None               | 0.1     | np.float64([])          | []          | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| Ellipsis           | 0.2     | np.float64([])          | []          | 0.2     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| NotImplemented     | 0.2     | np.float64([])          | []          | 0.2     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| int                | 0.1     | np.int64 or np.bytes_   | int64       | 0.1     | [2]_ [3]_         |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| long               | 0.1     | np.int64 or np.bytes_   | int64       | 0.1     | [3]_ [4]_         |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| float              | 0.1     | np.float64              | double      | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| complex            | 0.1     | np.complex128           | double      | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| str                | 0.1     | np.uint32/16            | char        | 0.1     | [5]_              |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| bytes              | 0.1     | np.bytes_ or np.uint16  | char        | 0.1     | [6]_              |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| bytearray          | 0.1     | np.bytes_ or np.uint16  | char        | 0.1     | [6]_              |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| list               | 0.1     | np.object_              | cell        | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| tuple              | 0.1     | np.object_              | cell        | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| set                | 0.1     | np.object_              | cell        | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| frozenset          | 0.1     | np.object_              | cell        | 0.1     |                   |
+--------------------+---------+-------------------------+-------------+---------+-------------------+
| cl.d
