Discogsography

🎶 Using the Discogs database export for local graph exploration. 🎶

🎵 Discogsography

<div align="center">

Build Code Quality Tests E2E Tests codecov License: MIT Python 3.13+ Rust uv just Ruff Cargo Clippy pre-commit mypy Bandit Docker Claude Code

A modern Python 3.13+ microservices platform for transforming the complete Discogs music database into powerful, queryable knowledge graphs and analytics engines.

🚀 Quick Start | 📖 Documentation | 🎯 Features | 💬 Community

</div>

🎯 What is Discogsography?

Discogsography transforms monthly Discogs data dumps (~11.3GB compressed XML) into:

  • 🔗 Neo4j Graph Database: Navigate complex music industry relationships
  • 🐘 PostgreSQL Database: High-performance queries and full-text search
  • 🔍 Interactive Explorer: Graph visualisation, trends, and path discovery
  • 📊 Real-time Dashboard: Monitor system health and processing metrics
  • 🎵 MusicBrainz Enrichment: Cross-reference with MusicBrainz for metadata, relationships, and external links

Perfect for music researchers, data scientists, developers, and music enthusiasts who want to explore the world's largest music database.

🏛️ Architecture Overview

⚙️ Core Services

| Service        | Purpose                                          | Key Technologies                                   |
| -------------- | ------------------------------------------------ | -------------------------------------------------- |
| 🔐 API         | User accounts, JWT auth, and collection sync     | FastAPI, psycopg3, redis, Discogs OAuth 1.0        |
| 📊 Dashboard   | Real-time monitoring and admin panel             | FastAPI, WebSocket, reactive UI                    |
| 🔍 Explore     | Serves graph exploration frontend (static files) | FastAPI, Tailwind CSS, Alpine.js, D3.js, Plotly.js |
| Extractor      | High-performance Rust-based extractor            | tokio, quick-xml, lapin                            |
| 🔗 Graphinator | Builds Neo4j knowledge graphs                    | neo4j-driver, graph algorithms                     |
| 🔧 Schema-Init | One-shot database schema initializer             | neo4j-driver, psycopg3                             |
| 🐘 Tableinator | Creates PostgreSQL analytics tables              | psycopg3, JSONB, full-text search                  |
| 📈 Insights    | Precomputed analytics and music trends           | FastAPI, psycopg3, httpx                           |
| 🤖 MCP Server  | Exposes knowledge graph to AI assistants         | FastMCP, httpx                                     |

🎵 MusicBrainz Enrichment Services

| Service              | Purpose                                                          | Key Technologies   |
| -------------------- | ---------------------------------------------------------------- | ------------------ |
| 🧠 Brainzgraphinator | Enriches Neo4j graph with MusicBrainz metadata and relationships | neo4j-driver, pika |
| 🧬 Brainztableinator | Populates PostgreSQL with MusicBrainz data and external links    | psycopg3, pika     |

📐 System Architecture

graph TD
    S3[("🌐 Discogs S3<br/>Data Dumps")]
    MB[("🎵 MusicBrainz<br/>JSONL Dumps")]

    subgraph Pipeline ["Data Pipeline"]
        EXT[["⚡ Extractor"]]
        RMQ{{"🐰 RabbitMQ"}}
        GRAPH[["🔗 Graphinator"]]
        TABLE[["🐘 Tableinator"]]
    end

    subgraph MBPipeline ["MusicBrainz Enrichment"]
        BGRAPH[["🧠 Brainzgraphinator"]]
        BTABLE[["🧬 Brainztableinator"]]
    end

    subgraph Storage ["Storage"]
        NEO4J[("🔗 Neo4j")]
        PG[("🐘 PostgreSQL")]
        REDIS[("🔴 Redis")]
    end

    subgraph Services ["User-Facing Services"]
        API[["🔐 API"]]
        EXPLORE[["🔍 Explore"]]
        DASH[["📊 Dashboard"]]
        INSIGHTS[["📈 Insights"]]
    end

    S3 --> EXT --> RMQ
    MB --> EXT
    RMQ --> GRAPH --> NEO4J
    RMQ --> TABLE --> PG
    RMQ --> BGRAPH --> NEO4J
    RMQ --> BTABLE --> PG

    API --- NEO4J & PG & REDIS
    EXPLORE --- API
    INSIGHTS --- PG & REDIS
    DASH -.- RMQ & NEO4J & PG

    style S3 fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    style MB fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    style EXT fill:#ffccbc,stroke:#d84315,stroke-width:2px
    style RMQ fill:#fff3e0,stroke:#e65100,stroke-width:2px
    style NEO4J fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style PG fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style REDIS fill:#ffebee,stroke:#b71c1c,stroke-width:2px
    style GRAPH fill:#e0f2f1,stroke:#004d40,stroke-width:2px
    style TABLE fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style BGRAPH fill:#e0f2f1,stroke:#004d40,stroke-width:2px
    style BTABLE fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style API fill:#e3f2fd,stroke:#0d47a1,stroke-width:2px
    style EXPLORE fill:#e8eaf6,stroke:#283593,stroke-width:2px
    style DASH fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style INSIGHTS fill:#fff9c4,stroke:#f57f17,stroke-width:2px

See Architecture Overview for detailed diagrams covering data pipeline, service communication, and message queue structure.
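
The diagram shows one producer (the Extractor) feeding several consumers through RabbitMQ. As a rough illustration of that shape only, the sketch below uses an in-memory stand-in for the broker. `FanoutBroker`, the `"artists"` data type name, and the message fields are hypothetical, and it assumes every consumer receives its own copy of each message; the project's actual exchange and queue layout is not described in this section.

```python
from collections import defaultdict
from typing import Callable

Message = dict
Handler = Callable[[Message], None]


class FanoutBroker:
    """In-memory stand-in for a broker: every subscriber sees every message."""

    def __init__(self) -> None:
        # Maps a data type name to the handlers subscribed to it.
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, data_type: str, handler: Handler) -> None:
        self._subscribers[data_type].append(handler)

    def publish(self, data_type: str, message: Message) -> int:
        """Deliver the message to each subscriber and return how many received it."""
        handlers = self._subscribers[data_type]
        for handler in handlers:
            handler(message)
        return len(handlers)


# Two consumers stand in for Graphinator (Neo4j) and Tableinator (PostgreSQL).
graph_store: list[Message] = []
table_store: list[Message] = []

broker = FanoutBroker()
broker.subscribe("artists", graph_store.append)
broker.subscribe("artists", table_store.append)
delivered = broker.publish("artists", {"id": 1, "name": "Example Artist"})
```

Each consumer writes to its own store independently, which is why a slow or failing graph writer need not block the table writer in a design like this.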

🌟 Key Features

  • ⚡ High-Speed Processing: ~130–480 records/second end-to-end throughput per data type with Rust-based extractor
  • 🔄 Smart Deduplication: SHA256 hash-based change detection prevents reprocessing
  • 📈 Handles Big Data: Processes 19M+ releases, 10M+ artists across ~11.3GB compressed XML
  • 🔁 Auto-Recovery: Automatic retries with exponential backoff and dead letter queues
  • 🐋 Container Security: Non-root users, read-only filesystems, dropped capabilities
  • 📝 Type Safety: Full type hints with strict mypy validation and Bandit security scanning
  • ✅ Comprehensive Testing: Unit, integration, and E2E tests with Playwright
  • 🚀 Query Performance: 249x overall query speedup across 88 endpoints (PRs #175–#184), plus configurable data quality rules for extraction validation (#187); see Recent Improvements
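
Two of the features above, hash-based change detection and retries with exponential backoff, can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function names, the choice to hash a canonical JSON encoding of each record, and the default base delay and cap are all assumptions.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_processing(seen: dict[str, str], record_id: str, record: dict) -> bool:
    """Return True (and remember the new hash) when a record is new or changed."""
    digest = record_hash(record)
    if seen.get(record_id) == digest:
        return False  # Unchanged since last run: skip reprocessing.
    seen[record_id] = digest
    return True


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay in seconds before each retry: base * 2**attempt, capped at `cap`.

    Once these retries are exhausted, a message would typically be routed
    to a dead letter queue for inspection instead of being retried again.
    """
    return [min(cap, base * 2**attempt) for attempt in range(retries)]
```

In this sketch `seen` is a plain dictionary; a real pipeline would need to persist the hashes between runs for the check to survive a restart.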

🚀 Quick Start

# Clone and start all services
git clone https://github.com/SimplicityGuy/discogsography.git
cd discogsography
docker-compose up -d

# Access the dashboard
open http://localhost:8003

| Service       | URL                    | Default Credentials             |
| ------------- | ---------------------- | ------------------------------- |
| 🔐 API        | http://localhost:8004  | Register via /api/auth/register |
| 📊 Dashboard  | http://localhost:8003  | None                            |
| 🔗 Neo4j      | http://localhost:7474  | neo4j / discogsography          |
| 🐘 PostgreSQL | localhost:5433         | discogsography / discogsography |
| 🐰 RabbitMQ   | http://localhost:15672 | discogsography / discogsography |

See the Quick Start Guide for prerequisites, local development setup, and environment configuration.
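
After starting the stack, it can be handy to confirm that the ports listed above are accepting connections. The following is a small sketch using only the Python standard library. The port numbers are copied from the service table, while `SERVICE_PORTS`, `port_open`, and `check_services` are illustrative names rather than part of the project. An open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Host ports copied from the service table in this section.
SERVICE_PORTS = {
    "API": 8004,
    "Dashboard": 8003,
    "Neo4j": 7474,
    "PostgreSQL": 5433,
    "RabbitMQ": 15672,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<12} {'up' if up else 'down'}")
```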

📖 Documentati
