lance-format / Lance: Open lakehouse format for multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
apache / Parquet Format: Apache Parquet Format
uber / Petastorm: The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
bigdatagenomics / Adam: ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Cinchoo / ChoETL: ETL framework for .NET (parser/writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)
Basekick-Labs / Arc: High-performance analytical database. DuckDB SQL engine + Parquet storage + Arrow format. 18M+ records/sec ingestion. 6M+ rows/sec queries. Use for analytics, observability, AI, IoT, logs. Single Go binary. S3/Azure native. No vendor lock-in. AGPL-3.0
ironSource / Parquetjs: Fully asynchronous, pure JavaScript implementation of the Parquet file format
jcrobak / Parquet Python: Python implementation of the Parquet columnar file format.
Eugene-Mark / Bigdata File Viewer: A cross-platform (Windows, macOS, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
fraugster / Parquet Go: Go package to read and write Parquet files. Parquet is a file format that stores nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
eto-ai / Rikai: Parquet-based ML data format optimized for working with unstructured data
JuliaIO / Parquet.jl: Julia implementation of Parquet columnar file format reader
Vitruves / Nail Parquet: Fast Parquet command-line tool with many functions, nailed it!
ddotta / Parquetize: R package that converts databases of different formats to Parquet format
skale-me / Node Parquet: Node.js module to access Apache Parquet format files
Vitruves / Carquet: A high-performance, SIMD-optimized, pure C library for reading and writing Apache Parquet files.
severo / Awesome Parquet: Useful resources for using the Parquet format
alsmola / Cloudtrail Parquet Glue: Glue workflow to convert CloudTrail logs to Athena-friendly Parquet format
tideworks / Arvo2parquet: Example program that writes Parquet-formatted data to plain files (i.e., not Hadoop HDFS); Parquet is a columnar storage format.
blackrock / Xml To Parquet: Convert one or more XML files into Apache Parquet format. Only requires an XSD and XML file to get started.