Data.table
R's data.table package extends data.frame.
data.table <a href="https://r-datatable.com"><img src="https://raw.githubusercontent.com/Rdatatable/data.table/master/.graphics/logo.png" align="right" height="140" /></a>
data.table provides a high-performance version of base R's data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.
The data.table project uses a custom governance agreement
and is fiscally sponsored by NumFOCUS. Consider making
a tax-deductible donation to help the project
pay for developer time, professional services, travel, workshops, and a variety of other needs.
Why data.table?
- concise syntax: fast to type, fast to read
- fast speed
- memory efficient
- careful API lifecycle management
- community
- feature rich
Features
- fast and friendly delimited file reader: ?fread, see also convenience features for small data
- fast and feature rich delimited file writer: ?fwrite
- low-level parallelism: many common operations are internally parallelized to use multiple CPU threads
- fast and scalable aggregations; e.g. 100GB in RAM (see benchmarks on up to two billion rows)
- fast and feature rich joins: ordered joins (e.g. rolling forwards, backwards, nearest and limited staleness), overlapping range joins (similar to IRanges::findOverlaps), non-equi joins (i.e. joins using operators >, >=, <, <=), aggregate on join (by=.EACHI), update on join
- fast add/update/delete columns by reference by group using no copies at all
- fast and feature rich reshaping data: ?dcast (pivot/wider/spread) and ?melt (unpivot/longer/gather)
- any R function from any R package can be used in queries not just the subset of functions made available by a database backend, also columns of type list are supported
- has no dependencies at all other than base R itself, for simpler production/maintenance
- the R dependency is as old as possible for as long as possible, currently R 3.5.0 (2018), and we continuously test against that version
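A few of the features listed above can be sketched on toy data. This is a minimal illustration, not a benchmark or recommended workflow: the tables trades and quotes and all of their column names are made up for this example.

```r
library(data.table)

# Hypothetical toy tables; names and values are illustrative only.
trades = data.table(id = c("A", "A", "B"), time = c(2L, 7L, 5L), qty = c(10, 20, 30))
quotes = data.table(id = c("A", "A", "B"), time = c(1L, 5L, 4L), price = c(100, 101, 50))

# Delimited file round trip: write with fwrite(), read back with fread().
tmp = tempfile(fileext = ".csv")
fwrite(trades, tmp)
trades2 = fread(tmp)

# Add a column by reference, by group, using := (the table is not copied).
trades[, total_qty := sum(qty), by = id]

# Rolling join: for each trade, take the most recent quote at or before its time.
rolled = quotes[trades, on = .(id, time), roll = TRUE]

# Reshape long to wide with dcast(), then back to long with melt().
wide = dcast(trades, id ~ time, value.var = "qty")
long = melt(wide, id.vars = "id", variable.name = "time",
            value.name = "qty", na.rm = TRUE)
```

Note that fread() guesses column types from the file contents, so a column written from whole-number doubles may come back as integer.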
Installation
install.packages("data.table")
# latest development version (only if newer available)
data.table::update_dev_pkg()
# latest development version (force install)
install.packages("data.table", repos="https://rdatatable.gitlab.io/data.table")
See the Installation wiki for more details.
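After installing, a quick check along these lines may help confirm which version is loaded and how many threads the internally parallelized operations will use. It is only a sketch; limiting to one thread is an arbitrary choice for illustration.

```r
library(data.table)

# Report the installed package version.
packageVersion("data.table")

# Number of threads data.table will currently use.
getDTthreads()

# Temporarily limit data.table to one thread, then restore the old setting.
# setDTthreads() returns the previous value invisibly.
old = setDTthreads(1)
setDTthreads(old)
```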
Usage
Use the data.table subset operator [ the same way you would use the data.frame one, but...
- no need to prefix each column with DT$ (like subset() and with() but built-in)
- any R expression using any package is allowed in the j argument, not just a list of columns
- extra argument by to compute the j expression by group
library(data.table)
DT = as.data.table(iris)
# FROM[WHERE, SELECT, GROUP BY]
# DT [i, j, by]
DT[Petal.Width > 1.0, mean(Petal.Length), by = Species]
# Species V1
#1: versicolor 4.362791
#2: virginica 5.552000
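The same i, j, by pattern extends to a few other common cases. The following is a small sketch on the same iris data; the result names (summary_dt, wide_petals, quartiles) are chosen here for illustration only.

```r
library(data.table)
DT = as.data.table(iris)

# j can return several named columns; .N is the number of rows in each group.
summary_dt = DT[, .(n = .N, mean_len = mean(Petal.Length)), by = Species]

# i filters rows first, then j is computed per group; no DT$ prefix is needed.
wide_petals = DT[Sepal.Length > 6, .(max_width = max(Petal.Width)), by = Species]

# Any R function can be used in j, here stats::quantile();
# as.list() turns the named result into one column per quantile.
quartiles = DT[, as.list(quantile(Sepal.Width, c(0.25, 0.75))), by = Species]
```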
Getting started
- Introduction to data.table vignette
- Getting started wiki page
- Examples produced by example(data.table)
Cheatsheets
<a href="https://raw.githubusercontent.com/rstudio/cheatsheets/master/datatable.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/master/pngs/datatable.png" width="615" height="242"/></a>
Community
data.table is widely used by the R community. It is used directly by hundreds of CRAN and Bioconductor packages, and indirectly by thousands. It is one of the most starred R packages on GitHub, and was highly rated by the Depsy project. If you need help, the data.table community is active on StackOverflow.
A list of packages that significantly support, extend, or make use of data.table can be found in the Seal of Approval document.
Stay up-to-date
- click the Watch button at the top right of the GitHub project page
- read the NEWS file
- follow #rdatatable and the r_data_table account on X/Twitter
- follow #rdatatable and the r_data_table account on fosstodon
- follow the data.table community page on LinkedIn
- watch recent Presentations
- read recent Articles
- read posts on The Raft
Contributing
Guidelines for filing issues / pull requests: Contribution Guidelines.
