14 skills found
awslabs / Deequ: Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
awslabs / Python Deequ: Python API for Deequ
aws-samples / Amazon Deequ Glue: Automated data quality suggestions and analysis with Deequ on AWS Glue
margitaii / Pydeequ: Python API for Deequ
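hexnn / Stark: A simple, easy-to-use, ultra-high-performance big data governance engine built on Spark, Spark MLlib, Debezium, and Deequ. Suited to unified batch and stream data integration and data analysis, it supports real-time CDC data capture, machine learning models, data quality validation, data labeling, sensitive data detection, data modeling, algorithm modeling, and OLAP data analysis.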
mfcabrera / Hooqu: hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to Python
mrpowers-io / Tsumugi Spark: SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine.
timgent / Data Flare: Data quality control tool built on spark and deequ
lexaneon / Amazon Deequ Addons: No description available
jobtech-dev / Graphen J: Graphen_J is a framework written in the Scala language, built on top of Apache Spark and Deequ to perform EL, ETL and Data Quality processes for large datasets.
branst / Aws Glue Datacatalog Deequ Workshop: AWS Workshop of Data Quality and ETL. Using Glue and Deequ
samueleresca / Deequ.net: deequ.NET is a port of the awslabs/deequ library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
sllynn / Unittest Example: Example of using scalatest and deequ from AWS Labs to unit test a Spark pipeline
wesleywilian / Pydeequ Dynamic Parser: Python library which makes it possible to use validation rules in pydeequ (https://github.com/awslabs/python-deequ) based on json/dict structures.