
Databricks

Databricks Platform - Architecture, Security, Automation and much more!!


I design and implement secure, production-grade Data and AI platforms across Azure, AWS, and GCP. I specialize in Databricks architecture, zero-trust security, and infrastructure automation.

🎯 What I Do

  • 🏗️ Build secure data lakehouses with Private Link, Unity Catalog, and data exfiltration protection
  • ☁️ Multi-cloud Databricks architecture for regulated industries (finance, healthcare, government)
  • ⚙️ Infrastructure as Code with modular Terraform templates and automation frameworks
  • 📝 Share knowledge through technical articles and open source contributions

📚 Recent Work

Latest Articles: 13+ published on the Databricks Blog.

💡 Core Expertise

Security           Infrastructure        Multi-Cloud
━━━━━━━━━━━━━━━━━  ━━━━━━━━━━━━━━━━━━━━  ━━━━━━━━━━━━━
• DEP Frameworks   • Terraform Modules   • Azure (ADB)
• Unity Catalog    • CI/CD Pipelines     • AWS (DB)
• Private Link     • Config Management   • GCP (DB)
• CMK/Encryption   • Custom Agents       • VNet/VPC/VPC-SC
• Network Security • Automation          • Cross-Cloud

📫 Connect


"Building secure, scalable data platforms that enable innovation while protecting what matters most."


Repository Contents: All Things Databricks ✅

This repository contains production-ready infrastructure templates, ready-to-use code samples, how-to guides, and deployment architectures to help you learn and operate the Databricks Lakehouse on Azure, AWS, and GCP.


Quick Links 🔗

| Area | Description | Path |
|------|-------------|------|
| 📖 Guides | Cross-cloud guides (authentication, networking, troubleshooting) | guides |
| 🤖 AI Governance | Authentication & authorization for Agent Bricks, Genie, Databricks Apps | applied-ai-governance |
| 🔷 Azure | Production-ready security & modular Terraform deployment patterns | adb4u |
| ☁️ AWS | Private Link workspace templates with DEP controls | awsdb4u |
| 🟢 GCP | VPC-SC, Private Service Connect, CMEK implementations | gcpdb4u |
| 🛠️ Utils | Databricks IP range extraction tool | databricksIPranges |
| 📦 Archive | Legacy content and code samples | archive |


📖 Cross-Cloud Guides (Start Here!)

New to Databricks infrastructure? Start with the cross-cloud guides in guides/, which cover authentication, networking, and troubleshooting.

Building AI Applications? Check out our AI governance guide:

  • Applied AI Governance - Production-ready authentication & authorization patterns for:
    • 🔮 Genie Space - Multi-team access, 1000+ users with complex UC governance
    • 🤖 Agent Bricks - Knowledge Assistant, Information Extraction, Multi-Agent Supervisor, Custom LLM
    • 📱 Databricks Apps - App authorization vs user authorization patterns
    • Includes real-world scenarios mapped to official use cases
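
The difference between app authorization and user authorization for Databricks Apps can be sketched as a small credential-selection helper. This is only an illustration and is not taken from the repository: `pick_credential`, `Credential`, and the header name `x-forwarded-access-token` are assumptions here and should be verified against the current Databricks Apps documentation.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: header that carries the end user's token when user
# authorization is enabled for the app. Verify before relying on it.
USER_TOKEN_HEADER = "x-forwarded-access-token"


@dataclass(frozen=True)
class Credential:
    mode: str              # "user" (act as the caller) or "app" (act as the app identity)
    token: Optional[str]   # forwarded user token; None when the app identity is used


def pick_credential(headers: Mapping[str, str]) -> Credential:
    """Prefer the caller's identity when a forwarded token is present,
    otherwise fall back to the app's own identity."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    token = normalized.get(USER_TOKEN_HEADER)
    if token:
        return Credential(mode="user", token=token)
    return Credential(mode="app", token=None)
```

With user authorization, Unity Catalog permissions are evaluated against the calling user; with app authorization, every caller shares the app identity's permissions, so the fallback branch should be granted only what all users of the app are allowed to see.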

🌩️ Databricks Deployment Guides by Cloud

🔷 Azure (adb4u)

Production-Ready Modular Terraform Templates

  • Focus: Security, governance, and production-ready deployment patterns
  • 🏗️ Architecture: Non-PL, Full Private (air-gapped), Hub-Spoke with firewall
  • 🔐 Security: Unity Catalog, Private Link, NPIP/SCC, CMK, Service Endpoints
  • 📚 Documentation: 2,300+ lines with UML diagrams, traffic flows, troubleshooting guides
  • 📁 Path: adb4u/

Key Features:

  • Modular Terraform structure (Networking, Workspace, Unity Catalog, Key Vault)
  • BYOV (Bring Your Own VNet/Subnet/NSG) support
  • Automated NSG rule management for SCC workspaces
  • Customer-Managed Keys with auto-rotation
  • Comprehensive deployment checklists and troubleshooting

Quick Start: See adb4u/docs/01-QUICKSTART.md


☁️ AWS (awsdb4u)

Private Link Workspace Templates with DEP Controls

  • 🎯 Focus: Deploying and operating Databricks on AWS with best practices
  • 🔐 Security: VPC design, PrivateLink endpoints, data exfiltration protection
  • 📊 Topics: S3 data access patterns, IAM roles and policies, cross-account setups
  • 🛠️ Automation: Infrastructure templates and configuration management
  • 📁 Path: awsdb4u/

Key Features:

  • Private Link workspace deployments
  • Data Exfiltration Protection (DEP) controls
  • VPC and subnet design patterns
  • IAM role and policy automation
  • Cross-account setup guidance
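
As a rough illustration of what IAM role and policy automation for a cross-account setup involves, the sketch below builds a standard IAM trust policy document that lets another AWS account assume a role only when it presents an agreed external ID. The function name `build_trust_policy` and the account and external ID values are placeholders, not values from this repository; the templates in awsdb4u/ may structure this differently (for example, in Terraform).

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Return an IAM trust policy that lets `trusted_account_id` assume the
    role only when the caller supplies `external_id`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID condition guards against the confused-deputy problem.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = build_trust_policy("111122223333", "example-external-id")
    print(json.dumps(policy, indent=2))
```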

🟢 GCP (gcpdb4u)

VPC-SC, Private Service Connect, CMEK Implementations

  • 🎯 Focus: GCP-specific guidance with emphasis on data plane security
  • 🔐 Security: VPC-SC perimeters, Private Service Connect, KMS integration
  • 🌐 Networking: VPC and subnet design, private connectivity patterns
  • 🔑 Identity: IAM & service accounts, Workload Identity Federation
  • 📁 Path: gcpdb4u/

Key Features:

  • VPC Service Controls (VPC-SC) integration
  • Private Service Connect (PSC) for workspace connectivity
  • Google KMS integration for encryption
  • GCS connectors and data access patterns
  • Data exfiltration prevention patterns

🔧 How to Use This Repository

1. Choose Your Cloud Platform

Pick the folder that matches your target environment: adb4u/ (Azure), awsdb4u/ (AWS), or gcpdb4u/ (GCP).

2. Select Deployment Pattern

Each cloud folder contains multiple deployment patterns:

  • Non-Private Link: Public control plane + private data plane (NPIP)
  • Full Private: Private Link for both control and data planes
  • Hub-Spoke: Centralized networking with egress control
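
One way to think about choosing among the three patterns above is as a short decision rule. The sketch below is a hypothetical heuristic, not logic from the repository: the `Requirements` fields and the ordering of the checks are assumptions, and real deployments usually weigh additional factors such as cost and existing network topology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    private_control_plane: bool  # workspace UI and API must be reached over Private Link
    centralized_egress: bool     # outbound traffic must pass through a shared firewall


def choose_pattern(req: Requirements) -> str:
    """Map networking requirements to one of the pattern names listed above."""
    if req.centralized_egress:
        # Egress control through shared networking points to hub-spoke.
        return "Hub-Spoke"
    if req.private_control_plane:
        # Private Link for both control and data planes.
        return "Full Private"
    # Public control plane with a private data plane (NPIP).
    return "Non-Private Link"
```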

3. Follow Deployment Guides

  • Read the README in your chosen folder
  • Review architecture diagrams and documentation
  • Follow step-by-step deployment instructions
  • Use provided Terraform modules and templates

4. Explore Additional Resources

  • Cross-Cloud Guides: guides/ - Authentication, networking, troubleshooting
  • Utility Scripts: databricksIPranges - Databricks IP range extraction tool
  • Archive: archive/ - Legacy code samples and REST API collections
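
Once IP ranges have been extracted, a common follow-up is checking whether a given address falls inside one of them, for example when reviewing firewall logs. The sketch below uses only the Python standard library and does not reflect the interface of the databricksIPranges tool; `find_matching_range` is a hypothetical helper, and the sample CIDRs are RFC 5737 documentation ranges, not real Databricks ranges.

```python
import ipaddress
from typing import Iterable, Optional


def find_matching_range(ip: str, cidrs: Iterable[str]) -> Optional[str]:
    """Return the first CIDR block in `cidrs` that contains `ip`, or None."""
    address = ipaddress.ip_address(ip)
    for cidr in cidrs:
        # strict=False tolerates entries whose host bits are set.
        network = ipaddress.ip_network(cidr, strict=False)
        if address.version == network.version and address in network:
            return cidr
    return None


if __name__ == "__main__":
    # Placeholder ranges for illustration only.
    sample_ranges = ["192.0.2.0/24", "198.51.100.0/24"]
    print(find_matching_range("192.0.2.10", sample_ranges))
```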

🌟 Highlighted Features

Production-Ready Templates

  • ✅ Modular Terraform code with conditional logic
  • ✅ Support for BYOV (Bring Your Own VNet/VPC)
  • ✅ Automated network security group rules
  • ✅ Unity Catalog with regional metastore management
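
Regional metastore management follows from Unity Catalog's model of one metastore per region, shared by the workspaces in that region. A minimal planning sketch, assuming workspaces are given as (name, region) pairs; `plan_metastores` and the sample names are hypothetical and do not come from the repository's Terraform modules:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_metastores(workspaces: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group workspace names by region; each key represents one metastore
    that the workspaces in its list would be assigned to."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for name, region in workspaces:
        plan[region].append(name)
    return dict(plan)


if __name__ == "__main__":
    # Placeholder workspace names and regions.
    print(plan_metastores([("ws-a", "eastus"), ("ws-b", "eastus"), ("ws-c", "westeurope")]))
```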

Comprehensive Documentation

  • 📚 2,300+ lines of detailed guides
  • 📊 UML architecture and sequence diagrams
  • 🔍 Traffic flow analysis with cost breakdowns
  • ⚠️ Troubleshooting guides and deployment checklists

Security Best Practices

  • 🔐 Data Exfiltration Protection (DEP) frameworks
  • 🔑 Customer-Managed Keys (CMK) with auto-rotation
  • 🌐 Private Link, VPC-SC, and network isolation
  • 🛡️ Zero-trust architectures for regulated industries

✨ Contributing

Contributions are welcome! Please:

  1. Open issues for bugs, questions, or feature requests
  2. Submit pull requests for:
    • Documentation improvements
    • Additional cloud scenarios
    • New deployment templates
    • Bug fixes or enhancements

📄 License

Licensing terms are described in the LICENSE file (if present). Please reach out if you need clarification.

