
rdata

|build-status| |docs| |coverage| |repostatus| |versions| |pypi| |conda| |zenodo| |pyOpenSci| |joss|

A Python library for R datasets.

..
    Github does not support include in README for dubious security reasons, so we
    copy-paste instead. Also Github does not understand Sphinx directives.
    .. include:: docs/index.rst
    .. include:: docs/usage.rst

The package rdata offers a lightweight way in Python to import and export R datasets/objects stored in the ".rda" and ".rds" formats. Its main advantages are:

  • It is a pure Python implementation, with no dependencies on the R language or related libraries. Thus, it can be used anywhere Python is supported, including the web using `Pyodide <https://pyodide.org/>`__.
  • It attempts to support all objects that can be meaningfully translated between R and Python. Unlike other solutions, you are not limited to importing dataframes or data with a particular structure.
  • It allows users to easily customize the conversion of R classes to Python ones and vice versa. Does your data use custom R classes? Worry no longer, as it is possible to define custom conversions to the Python classes of your choosing.
  • It has a permissive license (MIT). Unlike other packages that depend on R libraries and thus need to adhere to the GPL license, you can use rdata as a dependency in MIT, BSD or even closed-source projects.

Installation

Installing a stable release

The rdata package is on PyPI and can be installed using :code:`pip`:

.. code::

    pip install rdata

The package is also available for :code:`conda` using the :code:`conda-forge` channel:

.. code::

    conda install -c conda-forge rdata

Installing a develop version

The current version from the develop branch can be installed with:

.. code::

    pip install git+https://github.com/vnmabus/rdata.git@develop
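
After either installation route, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name :code:`installed_version` is hypothetical and not part of rdata:

.. code:: python

    from importlib.metadata import PackageNotFoundError, version
    from typing import Optional


    def installed_version(package: str) -> Optional[str]:
        """Return the installed version of a package, or None if it is absent."""
        try:
            return version(package)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            return None


    # Prints the version string of rdata, or None if it is not installed.
    print(installed_version("rdata"))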

Documentation

The documentation of rdata is in `ReadTheDocs <https://rdata.readthedocs.io/>`__.

Examples

Examples of use are available in `ReadTheDocs <https://rdata.readthedocs.io/en/stable/auto_examples/>`__.

Citing rdata

If you find this software useful in your work, please cite the following paper:

.. code-block::

    @article{ramos-carreno+rossi_2024_rdata,
        author = {Ramos-Carreño, Carlos and Rossi, Tuomas},
        doi = {10.21105/joss.07540},
        journal = {Journal of Open Source Software},
        month = dec,
        number = {104},
        pages = {1--4},
        title = {{rdata: A Python library for R datasets}},
        url = {https://joss.theoj.org/papers/10.21105/joss.07540#},
        volume = {9},
        year = {2024}
    }

You can additionally cite the software repository itself using:

.. code-block::

    @misc{ramos-carreno++_2024_rdata-repo,
        author = {The rdata developers},
        doi = {10.5281/zenodo.6382237},
        month = dec,
        title = {rdata: A Python library for R datasets},
        url = {https://github.com/vnmabus/rdata},
        year = {2024}
    }

If you want to reference a particular version for reproducibility, check the version-specific DOIs available in Zenodo.

Usage

Read an R dataset

The common way of reading an rds file is:

.. code:: python

    import rdata

    converted = rdata.read_rds(rdata.TESTDATA_PATH / "test_dataframe.rds")
    print(converted)

which returns the read dataframe:

.. code:: none

      class  value
    1     a      1
    2     b      2
    3     b      3

The analogous rda file can be read in a similar way:

.. code:: python

    import rdata

    converted = rdata.read_rda(rdata.TESTDATA_PATH / "test_dataframe.rda")
    print(converted)

which returns a dictionary mapping the variable name defined in the file (:code:`test_dataframe`) to the dataframe:

.. code:: none

    {'test_dataframe':   class  value
    1     a      1
    2     b      2
    3     b      3}

Under the hood, these reading functions are equivalent to the following two-step code:

.. code:: python

    import rdata

    parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_dataframe.rda")
    converted = rdata.conversion.convert(parsed)
    print(converted)

This consists of two steps:

#. First, the file is parsed using the function `rdata.parser.parse_file <https://rdata.readthedocs.io/en/latest/modules/rdata.parser.parse_file.html>`__. This provides a literal description of the file contents as a hierarchy of Python objects representing the basic R objects. This step is unambiguous and always the same.
#. Then, each object must be converted to an appropriate Python object. In this step there are several choices on which Python type is the most appropriate as the conversion for a given R object. Thus, we provide a default `rdata.conversion.convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>`__ routine, which tries to select Python objects that preserve most information of the original R object. For custom R classes, it is also possible to specify conversion routines to Python objects as exemplified in `the documentation <https://rdata.readthedocs.io/en/latest/usage.html#converting>`__.
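
The second step is where customization happens. The following is a minimal sketch of overriding the conversion for one R class, assuming that :code:`rdata.conversion.DEFAULT_CLASS_MAP` and the :code:`constructor_dict` argument behave as described in the linked documentation. The function :code:`columns_as_lists` is hypothetical, and the assumption that a data frame reaches the constructor as a mapping from column names to converted columns should be checked against the installed version:

.. code:: python

    import rdata


    def columns_as_lists(obj, attrs):
        """Hypothetical constructor: return a data.frame as a dict of lists."""
        # Assumption: obj maps column names to already-converted columns,
        # and attrs holds the R attributes of the object (unused here).
        return {name: list(column) for name, column in obj.items()}


    # Start from the default class map and override the "data.frame" entry.
    constructor_dict = {
        **rdata.conversion.DEFAULT_CLASS_MAP,
        "data.frame": columns_as_lists,
    }

    parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_dataframe.rda")
    converted = rdata.conversion.convert(parsed, constructor_dict=constructor_dict)
    print(converted["test_dataframe"])

Copying the default map first keeps the built-in conversions for every other R class, so only the overridden class changes behavior.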

Write an R dataset

The common way of writing data to an rds file is:

.. code:: python

    import pandas as pd
    import rdata

    df = pd.DataFrame({"class": pd.Categorical(["a", "b", "b"]), "value": [1, 2, 3]})
    print(df)

    rdata.write_rds("data.rds", df)

which prints the dataframe and writes it to the file :code:`data.rds`:

.. code:: none

      class  value
    0     a      1
    1     b      2
    2     b      3

Similarly, the dataframe can be written to an rda file with a given variable name:

.. code:: python

    import pandas as pd
    import rdata

    df = pd.DataFrame({"class": pd.Categorical(["a", "b", "b"]), "value": [1, 2, 3]})
    data = {"my_dataframe": df}
    print(data)

    rdata.write_rda("data.rda", data)

which prints the name-dataframe dictionary and writes it to the file :code:`data.rda`:

.. code:: none

    {'my_dataframe':   class  value
    0     a      1
    1     b      2
    2     b      3}

Under the hood, these writing functions are equivalent to the following two-step code:

.. code:: python

    import pandas as pd
    import rdata

    df = pd.DataFrame({"class": pd.Categorical(["a", "b", "b"]), "value": [1, 2, 3]})
    data = {"my_dataframe": df}

    r_data = rdata.conversion.convert_python_to_r_data(data, file_type="rda")
    rdata.unparser.unparse_file("data.rda", r_data, file_type="rda")

This consists of two steps (the reverse of reading):

#. First, each Python object is converted to an appropriate R object. As in reading, there are several choices, and the default `rdata.conversion.convert_python_to_r_data <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert_python_to_r_data.html>`__ routine tries to select R objects that preserve most information of the original Python object. For Python classes, it is also possible to specify custom conversion routines to R classes as exemplified in `the documentation <https://rdata.readthedocs.io/en/latest/usage.html#converting>`__.
#. Then, the created RData representation is unparsed to a file using the function `rdata.unparser.unparse_file <https://rdata.readthedocs.io/en/latest/modules/rdata.unparser.unparse_file.html>`__.
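
A quick way to check that writing and reading agree is a round trip through a temporary file. This is a minimal sketch that uses only the :code:`write_rds` and :code:`read_rds` functions shown above. The temporary directory is an illustrative choice, and the restored dataframe is not guaranteed to have the same index or dtypes as the original:

.. code:: python

    import tempfile
    from pathlib import Path

    import pandas as pd
    import rdata

    df = pd.DataFrame({"class": pd.Categorical(["a", "b", "b"]), "value": [1, 2, 3]})

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Write the dataframe to an rds file and read it straight back.
        path = Path(tmp_dir) / "data.rds"
        rdata.write_rds(path, df)
        restored = rdata.read_rds(path)

    # Compare column contents rather than the whole frame, since the index
    # and dtypes may differ after passing through the R representation.
    print(list(restored["class"]), list(restored["value"]))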

Additional examples

Additional examples illustrating the functionalities of this package can be found in the `ReadTheDocs documentation <https://rdata.readthedocs.io/en/latest/auto_examples/index.html>`__.

.. |build-status| image:: https://github.com/vnmabus/rdata/actions/workflows/main.yml/badge.svg?branch=master
    :alt: build status
    :target: https://github.com/vnmabus/rdata/actions/workflows/main.yml

.. |docs| image:: https://readthedocs.org/projects/rdata/badge/?version=latest
    :alt: Documentation Status
    :target: https://rdata.readthedocs.io/en/latest/?badge=latest

.. |coverage| image:: http://codecov.io/github/vnmabus/rdata/coverage.svg?branch=develop
    :alt: Coverage Status
    :target: https://codecov.io/gh/vnmabus/rdata/branch/develop

.. |repostatus| image:: https://www.repostatus.org/badges/latest/active.svg
    :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.
    :target: https://www.repostatus.org/#active

.. |versions| image:: https://img.shields.io/pypi/pyversions/rdata
    :alt: PyPI - Python Version

.. |pypi| image:: https://badge.fury.io/py/rdata.svg
    :alt: Pypi version
    :target: https://pypi.python.org/pypi/rdata/

.. |conda| image:: https://anaconda.org/conda-forge/rdata/badges/version.svg
    :alt: Conda version
    :target: https://anaconda.org/conda-forge/rdata

.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.6382237.svg
    :alt: Zenodo DOI
    :target: https://doi.org/10.5281/zenodo.6382237

.. |pyOpenSci| image:: https://tinyurl.com/y22nb8up
    :alt: pyOpenSci: Peer reviewed
    :target: https://github.com/pyOpenSci/software-submission/issues/144

.. |joss| image:: https://joss.theoj.org/papers/10.21105/joss.07540/status.svg
    :target: https://doi.org/10.21105/joss.07540
