DuckDBwithAWSLambda
Using DuckDB with AWS Lambda to process Delta Lake data
DuckDB + AWS Lambda
This repo accompanies a Substack post that explores using
DuckDB with Delta Lake on AWS Lambda.
https://dataengineeringcentral.substack.com/p/aws-lambda-duckdb-and-delta-lake
To build and deploy the Docker image to ECR for Lambda ...
docker build \
--build-arg AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
--build-arg AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
--platform linux/amd64 \
-t duckdelta .
To drop into that Docker container ...
docker run -it --entrypoint /bin/bash duckdelta
To tag and push an image ...
docker tag duckdelta 992921014520.dkr.ecr.us-east-1.amazonaws.com/duckdelta/docker:latest
docker push 992921014520.dkr.ecr.us-east-1.amazonaws.com/duckdelta/docker:latest
Create the Delta Lake tables on S3
Next, I wanted to create the Delta Lake tables on S3 that
our AWS Lambda will interact with. See create_delta_tables.py for
the code used to do that. It uses dask and a sample dataset.
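As a rough illustration of what such a script might look like, the sketch below writes a small in-memory table to a Delta table with write_deltalake from the deltalake package. It is not the contents of create_delta_tables.py: the bucket name, table name, and sample rows are hypothetical, and it uses pandas directly rather than dask to keep it short. It also assumes AWS credentials are available in the environment.

```python
def delta_table_uri(bucket: str, table: str) -> str:
    """Build the S3 URI for a Delta table (this layout is an assumption)."""
    return f"s3://{bucket.strip('/')}/{table.strip('/')}"


def create_sample_table(uri: str) -> None:
    """Write a tiny sample dataset to `uri` as a Delta table."""
    # Imported lazily so delta_table_uri works without these packages.
    import pandas as pd
    from deltalake import write_deltalake

    # Hypothetical sample rows; the repo uses its own sample dataset.
    df = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.5, 7.25]})

    # mode="overwrite" replaces any existing table at this location.
    write_deltalake(uri, df, mode="overwrite")
```

Calling create_sample_table(delta_table_uri("my-bucket", "sample_table")) would create the table at s3://my-bucket/sample_table, where both names are placeholders.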
Lambda code to read and write the S3 Delta Tables (DuckDB and DeltaLake)
The code in lambda_function.py shows how you can
read, write, and transform data with the duckdb and deltalake Python packages,
using pyarrow datasets as the go-between.
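The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in lambda_function.py: the event keys source_uri and target_uri, the column names id and amount, and the helper build_summary_query are all hypothetical. It reads a Delta table as a pyarrow dataset, queries it with DuckDB, and appends the result to a second Delta table.

```python
def build_summary_query(view: str, group_col: str, value_col: str) -> str:
    """Return DuckDB SQL that totals value_col per group_col."""
    return (
        f'SELECT "{group_col}", SUM("{value_col}") AS total '
        f'FROM "{view}" GROUP BY "{group_col}"'
    )


def lambda_handler(event, context):
    # Imported inside the handler so the query helper stays usable alone.
    import duckdb
    from deltalake import DeltaTable, write_deltalake

    # Hypothetical event keys; the real function may hard-code its tables.
    source_uri = event["source_uri"]
    target_uri = event["target_uri"]

    # Expose the Delta table as a pyarrow dataset (the go-between).
    dataset = DeltaTable(source_uri).to_pyarrow_dataset()

    con = duckdb.connect()  # in-memory DuckDB database
    con.register("source", dataset)  # make the dataset queryable by name

    # Column names "id" and "amount" are placeholders.
    sql = build_summary_query("source", "id", "amount")
    result = con.execute(sql).fetch_arrow_table()

    # Append the aggregated rows to the target Delta table.
    write_deltalake(target_uri, result, mode="append")
    return {"rows_written": result.num_rows}
```

Keeping the SQL in a small helper makes the transformation easy to check locally without touching S3.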
Lambda setup
You need to have a working ECR repository created for the Docker image.
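One possible way to create that repository from Python is with boto3, sketched below. The helper names are hypothetical, and the sketch assumes AWS credentials with ECR permissions are already configured. The URI helper simply reproduces the image naming pattern used in the tag and push commands above.

```python
def ecr_image_uri(account_id: str, region: str, repo: str, tag: str = "latest") -> str:
    """Build the image URI that docker tag / docker push expect."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"


def ensure_repository(repo: str, region: str) -> None:
    """Create the ECR repository if it does not exist yet."""
    # Imported lazily so ecr_image_uri works without boto3 installed.
    import boto3

    ecr = boto3.client("ecr", region_name=region)
    try:
        ecr.create_repository(repositoryName=repo)
    except ecr.exceptions.RepositoryAlreadyExistsException:
        # The repository is already there, so there is nothing to do.
        pass
```

For example, ensure_repository("duckdelta/docker", "us-east-1") would match the repository name used in the push command. Docker also has to be authenticated against the registry before pushing, for instance with a token from aws ecr get-login-password.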
