69 skills found · Page 1 of 3
paradedb / Pg Analytics: DuckDB-powered data lake analytics from Postgres
databrickslabs / Dbldatagen: Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines
delta-io / Kafka Delta Ingest: A highly efficient daemon for streaming data from Kafka into Delta Lake
MrPowers / Mack: Delta Lake helper methods in PySpark
japila-books / Delta Lake Internals: The Internals of Delta Lake
harrystuart / Flintml: One-click ML infrastructure for teams that just want to get sh*t done.
smart-data-lake / Smart Data Lake: Smart Automation Tool for building modern Data Lakes and Data Pipelines
uname-n / Deltabase: A lightweight, comprehensive solution for managing Delta tables, built on polars and deltalake
izhangzhihao / Real Time Data Warehouse: Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
annegl / 101 Upsert Delta: This repository exemplifies a simple ELT process that uses Delta to perform upserts and to remove data files that aren't in the latest state of the table's transaction log.
WeBankFinTech / Streamis: Streaming application development and management system, based on Linkis and DSS, which plans to provide a workflow-like graphical drag-and-drop development capability.
martandsingh / ApacheSpark: This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Mause / Duckdb Deltatable Extension: A purely experimental DuckDB Deltalake extension
dacort / Faker Cli: Command-line interface to quickly generate fake CSV and JSON data
delta-incubator / Delta Dotnet: DeltaLake bindings for dotnet based on delta-rs
buoyant-data / Oxbow: Collection of AWS Lambdas for creating and managing Delta tables
bhavink / Databricks: Databricks Platform - Architecture, Security, Automation and much more!!
jeppe742 / DeltaLakeReader: Read Delta tables without any Spark
mrjsj / Msfabricutils: Spark-free Python utilities for Microsoft Fabric focused on Data Engineering using Polars and delta-rs
sankamuk / PysparkCheatsheet: PySpark Cheatsheet