587 skills found
halide / Halide · A language for fast, portable data-parallel computation
diku-dk / Futhark · A data-parallel functional programming language
numaproj / Numaflow · Kubernetes-native platform to run massively parallel data/streaming jobs
VcDevel / Vc · SIMD Vector Classes for C++
tilo / Smarter Csv · Fastest end-to-end CSV ingestion for Ruby (with C acceleration). SmarterCSV auto-detects formats, applies smart defaults, and returns Rails-ready hashes for seamless use with ActiveRecord, Sidekiq, parallel jobs, and S3 pipelines, even for messy user-uploaded real-world data.
functime-org / Functime · Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
yandex / YaFSDP · YaFSDP: Yet another Fully Sharded Data Parallel
Tiramisu-Compiler / Tiramisu · A polyhedral compiler for expressing fast and portable data parallel algorithms
GoogleCloudPlatform / DataflowJavaSDK · Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
tuplex / Tuplex · Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.
hpcc-systems / HPCC Platform · HPCC Systems (High Performance Computing Cluster) is an open source, massive parallel-processing computing platform for big data processing and analytics.
DiskFrame / Disk.frame · Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
binpash / Pash · PaSh: Light-touch Data-Parallel Shell Processing
spcl / Dace · DaCe - Data Centric Parallel Programming
leimao / Voice Converter CycleGAN · Voice Converter Using CycleGAN and Non-Parallel Data
MicrosoftResearch / Naiad · The Naiad system provides fast incremental and iterative computation for data-parallel workloads
ufora / Ufora · Compiled, automatically parallel Python for data science
cudpp / Cudpp · CUDA Data Parallel Primitives Library
timescale / Timescaledb Parallel Copy · A binary for parallel copying of CSV data into a TimescaleDB hypertable
lithops-cloud / Lithops · A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀