775 skills found · Page 2 of 26
cbg-ethz / V Pipe: V-pipe is a pipeline designed for analysing NGS data of short viral genomes.
trannhatnguyen2 / NYC Taxi Data Pipeline: Nyc_Taxi_Data_Pipeline, a data engineering (DE) project.
lvgalvao / Pipeline Api Bitcoin Com Databricks: No description available.
frictionlessdata / Datapackage Pipelines: Framework for processing data packages in pipelines of modular components.
K9Ns / Data Pipelines With Apache Airflow: No description available.
young-geng / Koala Data Pipeline: The data-processing pipeline for the Koala chatbot language model.
axel-sirota / Productionalizing Data Pipelines Airflow: Productionalizing Data Pipelines with Apache Airflow.
Akajiaku11 / Automated ETL Pipeline For Weather Data: This project implements an automated ETL (Extract, Transform, Load) pipeline that fetches daily weather data, processes it, and stores it locally. It uses the OpenWeatherMap API to extract current weather data for a given city.
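The fetch-process-store flow this entry describes can be sketched as below. Function names and the local JSON-lines storage format are assumptions for illustration, not the repository's actual code; the raw payload shape follows OpenWeatherMap's current-weather response, which reports temperature in Kelvin by default.

```python
import json
from datetime import datetime, timezone

# OpenWeatherMap current-weather endpoint (extract step would GET this with a city and API key)
API_URL = "https://api.openweathermap.org/data/2.5/weather"

def transform(payload: dict) -> dict:
    """Flatten a raw OpenWeatherMap response into one record (hypothetical schema)."""
    return {
        "city": payload["name"],
        "temp_c": round(payload["main"]["temp"] - 273.15, 1),  # API returns Kelvin by default
        "humidity": payload["main"]["humidity"],
        "fetched_at": datetime.now(timezone.utc).isoformat(),
    }

def load(record: dict, path: str) -> None:
    """Append the record to a local JSON-lines file (assumed storage format)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Keeping the transform pure (no network or file I/O) makes that step easy to unit-test independently of the API.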
Vatyx / NamedPipeCapture: A Windows tool that streams data from a named pipe between two other processes to Wireshark.
YelpArchive / Data Pipeline: Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics.
microsoft / BiomedCLIP Data Pipeline: BiomedCLIP data pipeline.
aws-samples / Aws Cdk Pipelines Datalake Infrastructure: This solution helps you deploy data lake infrastructure on AWS using CDK Pipelines.
ddgope / Data Pipelines With Airflow: This project explores the core concepts of Apache Airflow by automating an ETL pipeline and the creation of a data warehouse. Custom operators stage the data, fill the data warehouse, and run data quality checks as the final step. Skills include: automating ETL pipelines with Airflow, Python, and Amazon Redshift; writing custom operators for staging, loading, and data-quality validation; and transforming data from various sources into a star schema optimized for the analytics team's use cases. Technologies used: Apache Airflow, S3, Amazon Redshift, Python.
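The final data-quality step this entry mentions typically boils down to running named predicates over the loaded rows and failing the task if any predicate is false. A minimal sketch of that core logic, with hypothetical names (the repository's actual operators would wrap something like this in an Airflow `BaseOperator.execute`):

```python
def run_quality_checks(records, checks):
    """Run each named check predicate over the records; raise if any fail.

    `checks` maps a check name to a predicate taking the full record list,
    so checks can assert both row-level and aggregate properties.
    """
    failures = [name for name, predicate in checks.items() if not predicate(records)]
    if failures:
        # Failing loudly here is what makes the Airflow task (and DAG run) fail.
        raise ValueError(f"Data quality checks failed: {failures}")
    return len(checks)  # number of checks that passed
```

A usage example: `run_quality_checks(rows, {"non_empty": lambda rs: len(rs) > 0})` passes on any non-empty load and raises `ValueError` on an empty one.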
hoangsonww / End To End Data Pipeline: 📈 A scalable, production-ready data pipeline for real-time streaming and batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transformation, storage, monitoring, and AI/ML serving, with CI/CD automation using Terraform and GitHub Actions.
Unity-Technologies / ScriptableRenderPipelineData: Data for the Scriptable Render Pipeline.
PacktPublishing / Building Big Data Pipelines With Apache Beam: Building Big Data Pipelines with Apache Beam, published by Packt.
Akajiaku11 / Water Quality Monitoring Data Pipeline: This project simulates a real-time water quality monitoring system that collects, processes, and displays water quality data such as pH level, temperature, and turbidity. The goal is to create a simple prototype for simulating water quality data collection, real-time display, and historical data storage.
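The simulated-sensor side of such a prototype can be as simple as drawing each metric from a plausible range. A minimal sketch, where the function name and the ranges are assumptions for illustration rather than values taken from the repository:

```python
import random

def sample_reading(rng=random):
    """Simulate one water-quality sensor reading (hypothetical ranges)."""
    return {
        "ph": round(rng.uniform(6.5, 8.5), 2),            # typical surface-water pH band
        "temperature_c": round(rng.uniform(10.0, 30.0), 1),
        "turbidity_ntu": round(rng.uniform(0.1, 5.0), 2),  # NTU = nephelometric turbidity units
    }
```

Calling `sample_reading()` in a loop and appending each dict to a list (or a file) gives both the real-time display feed and the historical store the entry describes.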
GoogleCloudPlatform / Data Pipeline: Data Pipeline is a tool for running data-loading pipelines. It is an open-sourced App Engine app that users can extend to suit their own needs. Out of the box it loads files from a source, transforms them, and then outputs them (where output might mean writing to a file or loading into a data analysis tool). It is designed to be modular, supporting various sources, transformation technologies, and output types. Transformations can be chained together to form complex pipelines.
alexcasalboni / Serverless Data Pipeline Sam: Serverless data pipeline powered by Kinesis Firehose, API Gateway, Lambda, S3, and Athena.
abeltavares / Batch Data Pipeline: 🦆 Batch data pipeline with Airflow, DuckDB, Delta Lake, Trino, MinIO, and Metabase, with full observability and data quality checks.