SkillAgentSearch skills...

Datar

A Grammar of Data Manipulation in python

Install / Use

/learn @pwwang/Datar
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

datar

A Grammar of Data Manipulation in python

<!-- badges -->

Pypi Github Building Docs and API Codacy Codacy coverage Downloads

Documentation | Reference Maps | Notebook Examples | API

datar is a re-imagining of APIs for data manipulation in python with multiple backends supported. Those APIs are aligned with tidyverse packages in R as much as possible.

Installation

pip install -U datar

# install with a backend
pip install -U datar[pandas]

# More backends support coming soon
<!-- ## Maximum compatibility with R packages |Package|Version| |-|-| |[dplyr][21]|1.0.8| -->

Backends

|Repo|Badges| |-|-| |datar-numpy|3 18| |datar-pandas|4 19| |datar-arrow|23 24|

Example usage

# with pandas backend
from datar import f
from datar.dplyr import mutate, filter_, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter_, if_else, tibble

df = tibble(
    x=range(4),  # or c[:4]  (from datar.base import c)
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter_(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter_(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
    df
    >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
    >> ggplot(aes(x="x", y="y"))
    + theme_classic()
    + geom_line(aes(color="sign"), size=1.2)
)

example

# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull

dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()

example

Testimonials

@coforfe:

Thanks for your excellent package to port R (dplyr) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with dplyr.

View on GitHub
GitHub Stars287
CategoryDevelopment
Updated18d ago
Forks20

Languages

Python

Security Score

100/100

Audited on Mar 15, 2026

No findings