laserson / Avro2parquet: Hadoop MapReduce tool to convert Avro data files to Parquet format.
bigdatagenomics / Eggo: Ready-to-go Parquet-formatted public 'omics datasets
datenoio / Iterabledata: Python library to read, write and convert data files with formats BSON, JSON, NDJSON, Parquet, ORC, XLS, XLSX, XML and many others
grouzen / Zio Apache Parquet: Scala ZIO-powered Apache Parquet library
aws-samples / Amazon Redshift Query Patterns And Optimizations: In this workshop you will launch an Amazon Redshift cluster in your AWS account and load about 100 GB of sample data from the TPC-H dataset. You will learn which query patterns affect Redshift performance and how to optimize them. The lab also provides a framework to simulate workload management (WLM) queues, run concurrent queries at regular intervals, and measure performance metrics such as query throughput and query duration. It also covers some use cases for Redshift Spectrum, which queries data in S3 stored in columnar formats such as Parquet.
shrinivdeshmukh / Avroconvert: Convert Avro files to Parquet, CSV, and JSON formats
DerwenAI / Pynock: A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies
redapt / Pyspark S3 Parquet Example: This repo demonstrates how to load a sample Parquet-formatted file from an AWS S3 bucket. A Python job is then submitted to an Apache Spark instance running on AWS EMR, which runs a SQLContext to create a temporary table from a DataFrame. SQL queries can then be run against the temporary table.
sunchao / Parquet Format Rs: Apache Parquet format for Rust, hosting the Thrift definition file and the generated .rs file
yigitozkavci / Parquet Hs: Read the Parquet format with Haskell
contactsunny / Parquet File Writer POC: A simple Java POC to create Parquet files. This is a Spring Boot project.
necatiarslan / Table Faker: tablefaker is a versatile Python package that generates realistic yet synthetic table data and saves it in CSV, Parquet, Delta Lake, Excel, and SQL formats for various applications.
MrPowers / Python Parquet Examples: Using the Parquet file format with Python
ljishen / Tpch Data: Generate TPC-H data in Parquet format
opencost / Opencost Parquet Exporter: Export OpenCost data in Parquet format
nezihyigitbasi / FlinkParquet: Using the Parquet file format (with Avro) to process data with Apache Flink
renesugar / FileConvert: Converts between file formats such as CSV and Parquet
jinbeizame007 / Rosbag2parquet: A ROS-independent CLI to convert ROS 2 bag files (MCAP format) into Apache Parquet.
AndreaBozzo / Dataprof: Library and CLI for profiling tabular data
YelpArchive / Aws Logs To Parquet Converter: Spark batch converter that converts AWS S3 server-side logs to the Parquet file format