Databricks
Databricks Platform - Architecture, Security, Automation and much more!!
I design and implement secure, production-grade Data and AI platforms across Azure, AWS, and GCP, specializing in Databricks architecture, zero-trust security, and infrastructure automation.
🎯 What I Do
- 🏗️ Build secure data lakehouses with Private Link, Unity Catalog, and data exfiltration protection
- ☁️ Multi-cloud Databricks architecture for regulated industries (finance, healthcare, government)
- ⚙️ Infrastructure as Code with modular Terraform templates and automation frameworks
- 📝 Share knowledge through technical articles and open source contributions
📚 Recent Work
Latest Articles (13+ published on Databricks Blog):
- A Unified Approach to Data Exfiltration Protection on Databricks (Aug 2025)
- BigQuery adds first-party support for Delta Lake (Jun 2024)
- How Delta Sharing Enables Secure End-to-End Collaboration (May 2024)
- Data Exfiltration Protection with Azure Databricks (Mar 2024)
💡 Core Expertise
| Security | Infrastructure | Multi-Cloud |
|----------|----------------|-------------|
| DEP Frameworks | Terraform Modules | Azure (ADB) |
| Unity Catalog | CI/CD Pipelines | AWS (DB) |
| Private Link | Config Management | GCP (DB) |
| CMK/Encryption | Custom Agents | VNet/VPC/VPC-SC |
| Network Security | Automation | Cross-Cloud |
📫 Connect
- 📝 Blog: databricks.com/blog/author/bhavin-kukadia
- 💼 LinkedIn: linkedin.com/in/bhavink
"Building secure, scalable data platforms that enable innovation while protecting what matters most."
Repository Contents: All Things Databricks ✅
This repository contains production-ready infrastructure templates, ready-to-use code samples, how-to guides, and deployment architectures to help you learn and operate the Databricks Lakehouse on Azure, AWS, and GCP.
Quick Links 🔗
| Area | Description | Path |
|-------|-------------|------|
| 📖 Guides | Cross-cloud guides (authentication, networking, troubleshooting) | guides |
| 🤖 AI Governance | Authentication & authorization for Agent Bricks, Genie, Databricks Apps | applied-ai-governance |
| 🔷 Azure | Production-ready security & modular Terraform deployment patterns | adb4u |
| ☁️ AWS | Private Link workspace templates with DEP controls | awsdb4u |
| 🟢 GCP | VPC-SC, Private Service Connect, CMEK implementations | gcpdb4u |
| 🛠️ Utils | Databricks IP range extraction tool | databricksIPranges |
| 📦 Archive | Legacy content and code samples | archive |
📖 Cross-Cloud Guides (Start Here!)
New to Databricks infrastructure? Check out our comprehensive guides:
- Authentication Guide - Set up Terraform authentication for Azure, AWS, or GCP (zero jargon!)
- Identities Guide - Understand how Databricks accesses your cloud account
- Networking Guide - Complete multi-cloud guide covering AWS, Azure, and GCP networking with troubleshooting
- Common Questions & Answers - Quick answers to frequently asked questions
Building AI Applications? Check out our AI governance guide:
- Applied AI Governance - Production-ready authentication & authorization patterns for:
  - 🔮 Genie Space - Multi-team access, 1000+ users with complex UC governance
  - 🤖 Agent Bricks - Knowledge Assistant, Information Extraction, Multi-Agent Supervisor, Custom LLM
  - 📱 Databricks Apps - App authorization vs user authorization patterns
  - Includes real-world scenarios mapped to official use cases
🌩️ Databricks Deployment Guides by Cloud
🔷 Azure (adb4u)
Production-Ready Modular Terraform Templates
- ✅ Focus: Security, governance, and production-ready deployment patterns
- 🏗️ Architecture: Non-PL, Full Private (air-gapped), Hub-Spoke with firewall
- 🔐 Security: Unity Catalog, Private Link, NPIP/SCC, CMK, Service Endpoints
- 📚 Documentation: 2,300+ lines with UML diagrams, traffic flows, troubleshooting guides
- 📁 Path: adb4u/
Key Features:
- Modular Terraform structure (Networking, Workspace, Unity Catalog, Key Vault)
- BYOV (Bring Your Own VNet/Subnet/NSG) support
- Automated NSG rule management for SCC workspaces
- Customer-Managed Keys with auto-rotation
- Comprehensive deployment checklists and troubleshooting
Quick Start: See adb4u/docs/01-QUICKSTART.md
☁️ AWS (awsdb4u)
Private Link Workspace Templates with DEP Controls
- 🎯 Focus: Deploying and operating Databricks on AWS with best practices
- 🔐 Security: VPC design, Private Link endpoints, data exfiltration protection
- 📊 Topics: S3 data access patterns, IAM roles and policies, cross-account setups
- 🛠️ Automation: Infrastructure templates and configuration management
- 📁 Path: awsdb4u/
Key Features:
- Private Link workspace deployments
- Data Exfiltration Protection (DEP) controls
- VPC and subnet design patterns
- IAM role and policy automation
- Cross-account setup guidance
🟢 GCP (gcpdb4u)
VPC-SC, Private Service Connect, CMEK Implementations
- 🎯 Focus: GCP-specific guidance with emphasis on data plane security
- 🔐 Security: VPC-SC perimeters, Private Service Connect, KMS integration
- 🌐 Networking: VPC and subnet design, private connectivity patterns
- 🔑 Identity: IAM & service accounts, Workload Identity Federation
- 📁 Path: gcpdb4u/
Key Features:
- VPC Service Controls (VPC-SC) integration
- Private Service Connect (PSC) for workspace connectivity
- Google KMS integration for encryption
- GCS connectors and data access patterns
- Data exfiltration prevention patterns
🔧 How to Use This Repository
1. Choose Your Cloud Platform
Pick the folder that matches your target environment: adb4u/ for Azure, awsdb4u/ for AWS, or gcpdb4u/ for GCP.
2. Select Deployment Pattern
Each cloud folder contains multiple deployment patterns:
- Non-Private Link: Public control plane with a private data plane (NPIP, i.e. no public IPs on compute)
- Full Private: Private Link for both control and data planes
- Hub-Spoke: Centralized networking with egress control
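To make the choice between these patterns concrete, here is a minimal Python sketch of a selector. It is only an illustration: the keys, the `choose_pattern` helper, and the precedence (egress control first, then a private control plane) are assumptions made for this example, not something defined by the Terraform templates in this repository.

```python
# Pattern descriptions paraphrased from the list above.
# The dictionary keys are hypothetical labels, not folder or module names.
PATTERNS = {
    "non-private-link": "Public control plane + private data plane (NPIP)",
    "full-private": "Private Link for both control and data planes",
    "hub-spoke": "Centralized networking with egress control",
}


def choose_pattern(private_control_plane: bool, egress_control: bool) -> str:
    """Return a pattern key for two yes/no requirements.

    Assumed precedence: a need for centralized egress control wins,
    then a need for a private control plane, otherwise the simplest pattern.
    """
    if egress_control:
        return "hub-spoke"
    if private_control_plane:
        return "full-private"
    return "non-private-link"


if __name__ == "__main__":
    key = choose_pattern(private_control_plane=True, egress_control=False)
    print(f"{key}: {PATTERNS[key]}")
```

Real deployments usually weigh more factors (compliance scope, existing hub networks, cost), so treat this as a starting checklist rather than a decision rule.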
3. Follow Deployment Guides
- Read the README in your chosen folder
- Review architecture diagrams and documentation
- Follow step-by-step deployment instructions
- Use provided Terraform modules and templates
4. Explore Additional Resources
- Cross-Cloud Guides: guides/ - Authentication, networking, troubleshooting
- Utility Scripts: databricksIPranges - Databricks IP range extraction tool
- Archive: archive/ - Legacy code samples and REST API collections
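Once you have a list of IP ranges, a common next step is checking whether a given address falls inside them, for example when reviewing firewall allow-lists. The sketch below shows one way to do that with Python's standard `ipaddress` module. It does not reproduce the databricksIPranges tool: the helper names are hypothetical and the CIDR blocks are reserved documentation ranges (RFC 5737), not real Databricks ranges.

```python
import ipaddress

# Placeholder CIDR blocks from the RFC 5737 documentation ranges.
# Replace with the ranges you actually extracted; these are NOT Databricks IPs.
SAMPLE_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def parse_ranges(cidrs):
    """Parse CIDR strings into network objects (raises ValueError on bad input)."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_in_ranges(ip, networks):
    """Return True if the address belongs to any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    networks = parse_ranges(SAMPLE_RANGES)
    for candidate in ("192.0.2.10", "203.0.113.5"):
        print(candidate, is_in_ranges(candidate, networks))
```

Parsing the ranges once and reusing the network objects keeps repeated lookups cheap when you check many addresses.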
🌟 Highlighted Features
Production-Ready Templates
- ✅ Modular Terraform code with conditional logic
- ✅ Support for BYOV (Bring Your Own VNet/VPC)
- ✅ Automated network security group rules
- ✅ Unity Catalog with regional metastore management
Comprehensive Documentation
- 📚 2,300+ lines of detailed guides
- 📊 UML architecture and sequence diagrams
- 🔍 Traffic flow analysis with cost breakdowns
- ⚠️ Troubleshooting guides and deployment checklists
Security Best Practices
- 🔐 Data Exfiltration Protection (DEP) frameworks
- 🔑 Customer-Managed Keys (CMK) with auto-rotation
- 🌐 Private Link, VPC-SC, and network isolation
- 🛡️ Zero-trust architectures for regulated industries
✨ Contributing
Contributions are welcome! Please:
- Open issues for bugs, questions, or feature requests
- Submit pull requests for:
  - Documentation improvements
  - Additional cloud scenarios
  - New deployment templates
  - Bug fixes or enhancements
📄 License
This repository follows the licensing described in the project. Please see the LICENSE file (if present) or reach out for clarification.
🔗 Additional Resources
- Databricks Blog Articles: all 13+ articles at databricks.com/blog/author/bhavin-kukadia
