Tidier.jl
Meta-package for data analysis in Julia, modeled after the R tidyverse.
<a href="https://github.com/TidierOrg/Tidier.jl"><img src="https://raw.githubusercontent.com/TidierOrg/Tidier.jl/main/docs/src/assets/Tidier_jl_logo.png" align="left" style="padding-right:10px;" width="150"></img></a>
<a href="https://github.com/TidierOrg/Tidier.jl">Tidier.jl</a>
Tidier.jl is a data analysis package inspired by R's tidyverse and crafted specifically for Julia. Tidier.jl is a meta-package in that its functionality comes from a series of smaller packages. Installing and using Tidier.jl brings the combined functionality of each of these packages to your fingertips.
Installing Tidier.jl
There are two ways to install Tidier.jl: using the package REPL, or using Julia code in the Julia console. You might also see the console referred to as the "REPL," which stands for Read-Evaluate-Print Loop. The REPL is where you can interactively run code and view the output.
Julia's REPL is particularly cool because it provides a built-in package REPL and shell REPL, which allow you to take actions on managing packages (in the case of the package REPL) or run shell commands (in the shell REPL) without ever leaving the Julia REPL.
To install the stable version of Tidier.jl, you can type the following into the Julia REPL:
]add Tidier
The ] character starts the Julia package manager. The add Tidier command tells the package manager to install the Tidier package from the Julia registry. You can exit the package REPL by pressing the backspace key to return to the Julia prompt.
If you already have the Tidier package installed, the add Tidier command will not update the package. Instead, you can update the package using the update Tidier (or up Tidier for short) command. As with the add Tidier command, make sure you are in the package REPL before you run these package manager commands.
If you need to (or prefer to) install packages using Julia code, you can achieve the same outcome using the following code to install Tidier:
import Pkg
Pkg.add("Tidier")
You can update Tidier.jl using the Pkg.update() function, as follows:
import Pkg; Pkg.update("Tidier")
Note that while Julia allows you to separate statements by using multiple lines of code, you can also use a semi-colon (;) to separate multiple statements. This is convenient for short snippets of code. There's another practical reason to use semi-colons in coding, which is to silence the output of a function call. We will come back to this in the "Getting Started" section below.
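As a minimal illustration of using semicolons to separate statements, the two snippets below are equivalent. The variable names are arbitrary and chosen only for this sketch; nothing here depends on Tidier.jl.

```julia
# Two statements written on separate lines
a = 2
b = a + 3

# The same two statements written on one line, separated by a semicolon
c = 2; d = c + 3

# Both styles compute the same value
println(b == d)
```

The one-line form is mostly a matter of convenience for short snippets such as the import Pkg; Pkg.update("Tidier") example above.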
In general, installing the latest version of the package from the Julia registry should be sufficient because we follow a continuous-release cycle. After every update to the code, we update the version based on the magnitude of the change and then release the latest version to the registry. That's why it's so important to know how to update the package!
However, if for some reason you do want to install the package directly from GitHub, you can get the newest version using either the package REPL...
]add Tidier#main
...or using Julia code.
import Pkg; Pkg.add(url="https://github.com/TidierOrg/Tidier.jl")
Loading Tidier.jl
Once you've installed Tidier.jl, you can load it by typing:
using Tidier
When you type this command, multiple things happen behind the scenes. First, the following packages are loaded and re-exported, which is to say that all of the exported macros and functions from these packages become available:
- TidierData
- TidierPlots
- TidierDB
- TidierCats
- TidierDates
- TidierStrings
- TidierText
- TidierVest
- TidierIteration
Don't worry if you don't know what each of these packages does yet. We will cover them in package-specific documentation pages, which can be accessed below. For now, all you need to know is that these smaller packages are actually the ones doing all the work when you use Tidier.
There are a few other packages whose exported functions also become available. We will discuss these in the individual package documentation, but the most important ones for you to know about are:
- The DataFrame() function from the DataFrames package is re-exported so that you can create a data frame without loading the DataFrames package.
- The @chain() macro from the Chain package is re-exported, so you can chain together functions and macros.
- The entire Statistics package is re-exported so you can access summary statistics like mean() and median().
- The CategoricalArrays package is re-exported so you can access the categorical() function to define categorical variables.
- The Dates package is re-exported to enable support for variables containing dates.
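The re-exports listed above can be tried out with a short sketch like the following. It assumes Tidier is already installed, and the data frame and its column names (team, visit_date, value) are made up purely for illustration. The @filter() and @select() macros come from TidierData.

```julia
using Tidier  # per the list above, also provides DataFrame, @chain, mean, categorical, and Date

# Hypothetical toy data, invented for this example
df = DataFrame(
    team = categorical(["a", "b", "a", "b"]),      # categorical() from CategoricalArrays
    visit_date = Date(2024, 1, 1) .+ Day.(0:3),    # Date and Day from Dates
    value = [1.0, 2.0, 3.0, 4.0],
)

# mean() from Statistics works on a column like any other vector
avg = mean(df.value)

# @chain from Chain passes the result of each step on to the next one
result = @chain df begin
    @filter(value > 2)
    @select(team, value)
end
```

Only a single using Tidier statement is needed here; none of the underlying packages has to be loaded separately.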
What can Tidier.jl do?
Before we dive into an introduction of Julia and a look into how Tidier.jl works, it's useful to show you what Tidier.jl can do. First, we will read in some data, and then we will use Tidier.jl to chain together some data analysis operations.
First, let's read in the "Visits to Physician Office" dataset.
This dataset comes with the Ecdat R package and is titled OFP. You can read more about the dataset here. To read in datasets packaged with commonly used R packages, we can use the RDatasets Julia package.
julia> using Tidier, RDatasets
julia> ofp = dataset("Ecdat", "OFP")
4406×19 DataFrame
  Row │ OFP    OFNP   OPP    OPNP   EMR    Hosp   NumChro ⋯
      │ Int32  Int32  Int32  Int32  Int32  Int32  Int32   ⋯
──────┼────────────────────────────────────────────────────
    1 │     5      0      0      0      0      1          ⋯
    2 │     1      0      2      0      2      0
    3 │    13      0      0      0      3      3
    4 │    16      0      5      0      1      1
    5 │     3      0      0      0      0      0          ⋯
    6 │    17      0      0      0      0      0
    7 │     9      0      0      0      0      0
  ⋮   │   ⋮      ⋮      ⋮      ⋮      ⋮      ⋮       ⋮    ⋱
 4401 │    12      4      1      0      0      0
 4402 │    11      0      0      0      0      0          ⋯
 4403 │    12      0      0      0      0      0
 4404 │    10      0     20      0      1      1
 4405 │    16      1      0      0      0      0
 4406 │     0      0      0      0      0      0          ⋯
                           13 columns and 4393 rows omitted
Note that a preview of the data frame is automatically printed to the console. This happens because, when you run code line by line in the REPL, the output of each line is printed to the console. This is convenient because it saves you from having to explicitly print the newly created ofp in order to get a preview of what it contains. If this code were bundled in a code chunk (such as in a Jupyter notebook), then only the output of the final line of the code chunk would be printed.
The exact number of rows and columns that print will depend on the physical size of the REPL window. If you resize the console (e.g., in VS Code), Julia will adjust the number of rows/columns accordingly.
If you want to suppress the output, you can add a ; at the end of this statement, like this:
julia> ofp = dataset("Ecdat", "OFP"); # Nothing prints
With the OFP dataset loaded, let's ask some basic questions.
What does the dataset consist of?
We can use @glimpse() to list the columns and their data types and to peek at the first few values contained within the dataset.
julia> @glimpse(ofp)
Rows: 4406
Columns: 19
.OFP Int32 5, 1, 13, 16, 3, 17, 9, 3, 1, 0, 0, 44, 2, 1, 19,
.OFNP Int32 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0,
.OPP Int32 0, 2, 0, 5, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0,
.OPNP Int32 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0,
.EMR Int32 0, 2, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
.Hosp Int32 1, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
.NumChron Int32 2, 2, 4, 2, 2, 5, 0, 0, 0, 0, 1, 5, 1, 1, 1, 0, 1,
.AdlDiff Int32 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1,
.Age Float64 6.9, 7.4, 6.6, 7.6, 7.9, 6.6, 7.5, 8.7, 7.3, 7.8,
.Black CategoricalValue{String, UInt8}yes, no, yes, no, no, no, no, no,
.Sex CategoricalValue{String, UInt8}male, female, female, male, female
.Married CategoricalValue{String, UInt8}yes, yes, no, yes, yes, no, no, no
.School Int32 6, 10, 10, 3, 6, 7, 8, 8, 8, 8, 8, 15, 8, 8, 12, 8
.FamInc Float64 2.881, 2.7478, 0.6532, 0.6588, 0.6588, 0.3301, 0.8
.Employed CategoricalValue{String, UInt8}yes, no, no, no, no, no, no, no, n
.Privins CategoricalValue{String, UInt8}yes, yes
