rust-rs / Tabler · 📊 Tabler: A lightweight TUI tool to view, query, and navigate CSV, TSV, and Parquet data files.
grouzen / Zio Apache Parquet · Scala ZIO-powered Apache Parquet library
casidiablo / Parquet Tools For Dumb People Like Me · parquet-tools for the masses
rdblue / Parquet Cli · Parquet Command-line Tools
GuandataOSS / Universe Lite · A lightweight ELT & ETL tool, based on Duckdb and Apache Parquet, seamless integration with Python & Java plugins
FabioBatSilva / Xpq · Simple command line (CLI) tool to inspect parquet files
contactsunny / Parquet File Writer POC · A simple Java POC for creating Parquet files. This is a Spring Boot project.
makepath / Census Parquet · Python tools for creating Parquet files from 2020 Census Data
scientist-hq / Dracula Covid19 · An ETL tool for converting untyped CSV to parquet. Also triggers data lake updates.
vahid110 / Sqlxport · sql2parquet: A modern CLI tool to export SQL query results from PostgreSQL or Amazon Redshift directly to Parquet files, with optional upload to S3 or MinIO.
CEDStandards / CEDS Data Warehouse Parquet · The Common Education Data Standards (CEDS) Data Warehouse Parquet (DW Parquet) standard is designed for data engineering and data science needs in the cloud. The DW Parquet Models mirror the SQL-based CEDS Data Warehouse. Parquet files are designed for rapid and distributed reporting across multiple technology stacks, data processing and BI tools, and are cloud vendor agnostic. This standard is ideal for stakeholders implementing reporting structures in a data lake environment.
marklit / Pqview · Parquet File Statistics Reporting Tool
neptune-ai / Neptune Exporter · CLI tool to move Neptune experiments (version 2.x or 3.x) to disk as parquet + files, with an option to load them into alternative supported trackers.
stoewer / Parquet Cli · Command-line tool to analyze Parquet files
rouapps / Caret · Terminal tool for inspecting and cleaning large LLM training datasets. Handles JSONL, Parquet, and CSV with memory-mapped I/O, near-duplicate detection, token visualization, dataset linting, and an MCP server.
NathanHowell / Parquet Tools · No description available
viirya / Parquet Tools · parquet-tools and dependency jar files
Cobliteam / Kafka Topic Dumper · Python tool to get messages from Kafka and send them to an AWS S3 bucket in Parquet format
maxmind / Mmdbconvert · A command-line tool to merge multiple MaxMind MMDB databases and export to CSV, Parquet, or MMDB format.
tonivade / Pq · The objective is to create a tool similar to jq but for Parquet files