StreamSampling.jl
Sampling methods for data streams
This package provides general methods for sampling from any stream in a single pass through the data, even when the number of items in the stream is unknown.
This has some advantages over other sampling procedures:
- If the iterable is lazy, the memory required is either a small constant or grows with the size of the sample, rather than with the size of the whole population.
- With reservoir methods, at any point during sampling the collected sample is a random sample of the portion of the stream seen so far.
- In some cases, sampling with the techniques implemented in this library can bring considerable performance gains, since the population of items doesn't need to be stored in memory beforehand.
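To make the reservoir idea concrete, here is a minimal sketch of the classic "Algorithm R" in plain Julia, using only the `Random` standard library. It is an illustration of the general technique, not the implementation used by StreamSampling.jl; the name `reservoir_sample` is a hypothetical helper and is not part of the package's API.

```julia
using Random

# Hypothetical helper for illustration only (not a StreamSampling.jl function).
# Draws up to `n` items uniformly without replacement from `iter` in one pass.
function reservoir_sample(rng::AbstractRNG, iter, n::Integer)
    reservoir = Vector{eltype(iter)}()
    sizehint!(reservoir, n)
    seen = 0                          # number of items consumed so far
    for x in iter
        seen += 1
        if seen <= n
            push!(reservoir, x)       # fill the reservoir with the first n items
        else
            j = rand(rng, 1:seen)     # the new item is kept with probability n/seen
            if j <= n
                reservoir[j] = x      # it replaces a uniformly chosen slot
            end
        end
    end
    return reservoir
end

# A lazy generator: the million squares are never stored as a full array.
stream = (i^2 for i in 1:1_000_000)
picked = reservoir_sample(MersenneTwister(42), stream, 5)
```

Memory use depends only on `n`, and stopping the loop early would still leave a uniform random sample of the items consumed up to that point, which is the property described in the list above.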
For details on the available functionality, consult the documentation.
Installation
To install the package, enter Pkg mode in the Julia REPL and run
julia> ]
(@v1.xx) pkg> add StreamSampling
or
julia> using Pkg
julia> Pkg.add("StreamSampling")
If you want the latest development version, which may include changes that are not registered yet, you can instead run
julia> using Pkg
julia> Pkg.develop("StreamSampling")
Seek Support
If you have general questions, need help using the package, or want to brainstorm new ideas, please use the Discussions section.
Contributing
Contributions are welcome!
- If you encounter a bug or have a concrete feature proposal, feel free to open an issue.
- If you'd like to contribute to the codebase, we'd love to see your pull requests!
