DataLoaders.jl
A parallel iterator for large machine learning datasets that don't fit into memory, inspired by PyTorch's `DataLoader` class.
A Julia package implementing performant data loading for deep learning on datasets that do not fit into memory. It works like PyTorch's `DataLoader`.
What does it do?
- Uses multi-threading to load data in parallel while keeping the primary thread free for the training loop
- Handles batching and collating
- Is simple to extend for custom datasets
- Integrates well with other packages in the ecosystem
- Allows for in-place loading to reduce memory usage
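
To make "batching and collating" concrete, here is a minimal sketch in plain Julia. It does not use the DataLoaders.jl API: `getobservation`, `collatebatch`, and `batchindices` are hypothetical helper names introduced only for illustration, and the sketch assumes the data is a tuple of arrays whose last dimension indexes observations.

```julia
# Illustrative only: hypothetical helpers, not part of the DataLoaders.jl API.

# Fetch observation `i` from a tuple of matrices whose columns are observations.
getobservation(data::Tuple, i::Int) = map(a -> a[:, i], data)

# Collate a vector of (x, y) observations into one (xs, ys) batch by
# concatenating the individual observations as columns.
collatebatch(observations) =
    (reduce(hcat, first.(observations)), reduce(hcat, last.(observations)))

# Split the indices 1:n into consecutive chunks of at most `batchsize`.
batchindices(n::Int, batchsize::Int) = collect(Iterators.partition(1:n, batchsize))

data = (rand(128, 100), rand(1, 100))
for idxs in batchindices(100, 16)
    xs, ys = collatebatch([getobservation(data, i) for i in idxs])
    # The final batch is smaller because 100 is not divisible by 16.
    @assert size(xs) == (128, length(idxs))
    @assert size(ys) == (1, length(idxs))
end
```

A data loader automates these steps, so the training loop only ever sees ready-made batches.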
When should you use it?
- You have a dataset that does not fit into memory
- You want to reduce the time your training loop spends waiting for the next batch of data
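
The waiting time mentioned above shrinks when batches are prepared in the background while the consumer is busy. The following sketch shows that general idea with a buffered `Channel` from Julia's standard library. It is not how DataLoaders.jl is implemented: `slowload` and `prefetch` are hypothetical names, and a single producer task stands in for the package's multiple worker threads.

```julia
# Illustrative only: `slowload` and `prefetch` are hypothetical names,
# not functions exported by DataLoaders.jl.

# Stand-in for an expensive loading step such as reading and decoding a file.
function slowload(i::Int)
    sleep(0.01)
    return fill(Float32(i), 4)
end

# Return a channel that a background task fills with loaded batches.
# `buffersize` bounds how many batches may be waiting at once.
function prefetch(indices; buffersize::Int = 4)
    return Channel{Vector{Float32}}(buffersize; spawn = true) do channel
        for i in indices
            put!(channel, slowload(i))
        end
    end
end

results = Float32[]
for batch in prefetch(1:8)
    # The consumer (for example a training step) runs while later batches load.
    push!(results, sum(batch))
end
@assert results == Float32.(4 .* (1:8))
```

The buffer size is the main design choice here: a larger buffer hides more loading latency but keeps more batches in memory at once.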
How do you use it?
Install like any other Julia package using the package manager (see setup):
]add DataLoaders
After installation, import it, create a DataLoader from a dataset and batch size, and iterate over it:
using DataLoaders
# 10,000 observations of inputs with 128 features and one target feature
data = (rand(128, 10000), rand(1, 10000))
dataloader = DataLoader(data, 16)
for (xs, ys) in dataloader
    @assert size(xs) == (128, 16)
    @assert size(ys) == (1, 16)
end
