Atproto
Blacksky fork of bluesky-social/atproto with AppView performance optimizations, caching, and community features
Blacksky AppView
This is Blacksky's fork of the AT Protocol reference implementation by Bluesky Social PBC. It powers the AppView at api.blacksky.community.
We're publishing this for transparency and so other communities can benefit from the work. This repository is not accepting contributions, issues, or PRs. If you want the canonical atproto implementation, use bluesky-social/atproto.
What's Different
All changes are in packages/bsky (appview logic), services/bsky (runtime config), and one custom migration. Everything else is upstream.
Why Not the Built-in Firehose Consumer?
The upstream dataplane includes a TypeScript firehose consumer (subscription.ts) that indexes events directly. We replaced it with rsky-wintermute, a Rust indexer, for several reasons:
- Performance at scale: The TypeScript consumer processes events sequentially. At network scale (~1,000 events/second, 18.5 billion total records), a full backfill at ~90 records/sec would take 6.5 years. Wintermute targets 10,000+ records/sec with parallel queue processing.
- Backfill architecture: Wintermute separates live indexing from backfill into independent queues (firehose_live, firehose_backfill, repo_backfill, labels). Live events are never blocked by backfill work.
- Operational tooling: Wintermute includes utilities for direct indexing of specific accounts, PLC directory bulk import, label stream replay, blob reference repair, and queue management -- all needed when bootstrapping an AppView from scratch.
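The backfill estimate above is straightforward arithmetic, and it can be reproduced with a few lines. This is only a sketch using the figures quoted in the list; the helper name `backfillDays` is made up for illustration and does not exist in either codebase.

```typescript
// Figures quoted above; the helper name is hypothetical.
const TOTAL_RECORDS = 18.5e9;
const SECONDS_PER_DAY = 86400;

/** Days needed to index `totalRecords` at a sustained `recordsPerSec`. */
function backfillDays(totalRecords: number, recordsPerSec: number): number {
  if (recordsPerSec <= 0) {
    throw new RangeError("recordsPerSec must be positive");
  }
  return totalRecords / recordsPerSec / SECONDS_PER_DAY;
}

// Sequential TypeScript consumer at ~90 records/sec: roughly 6.5 years.
const sequentialYears = backfillDays(TOTAL_RECORDS, 90) / 365;

// Wintermute's stated target of 10,000 records/sec: roughly 21 days.
const parallelDays = backfillDays(TOTAL_RECORDS, 10000);
```

The two results differ by the same factor as the throughput (about 111x), which is why the indexer's rate dominates any bootstrap plan.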
The dataplane and appview from this repo still run as-is. They read from the PostgreSQL database that wintermute writes to. We just don't start the built-in firehose subscription.
Performance & Operational Fixes
These are broadly useful to anyone self-hosting an AppView at scale.
LATERAL JOIN query optimization (packages/bsky/src/data-plane/server/routes/feeds.ts)
getTimeline and getListFeed rewritten with PostgreSQL LATERAL JOINs to force per-user index usage instead of full table scans. Major improvement for users following thousands of accounts.
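The general shape of such a query can be sketched as follows. This is not the query from feeds.ts: the table and column names (follow, feed_item, creator, subjectDid, originatorDid, sortAt) and the function name are assumptions chosen for illustration, and the real schema may differ.

```typescript
// Illustrative only: table, column, and function names are assumptions,
// not the fork's actual query.
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

function buildTimelineQuery(viewerDid: string, limit: number): SqlQuery {
  const text = `
    SELECT item.uri, item.cid, item."sortAt"
    FROM follow f
    CROSS JOIN LATERAL (
      -- Evaluated once per followed account, so the planner can use an
      -- index on ("originatorDid", "sortAt") rather than scanning feed_item.
      SELECT fi.uri, fi.cid, fi."sortAt"
      FROM feed_item fi
      WHERE fi."originatorDid" = f."subjectDid"
      ORDER BY fi."sortAt" DESC
      LIMIT $2
    ) item
    WHERE f.creator = $1
    ORDER BY item."sortAt" DESC
    LIMIT $2`;
  return { text, values: [viewerDid, limit] };
}
```

Taking at most `limit` rows per followed account is enough to guarantee the global top `limit` rows are present, so the outer sort only has to handle a bounded candidate set.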
Redis caching layer (packages/bsky/src/data-plane/server/cache/)
- Actor profiles (60s TTL), records (5m), interaction counts (30s), post metadata (5m)
- Reduces database load under production traffic
- Known issue: The actor cache has a protobuf timestamp serialization bug where Timestamp objects lose their .toDate() method after JSON round-tripping through Redis, causing incomplete profile hydration on cache hits. We currently run with Redis caching disabled. The fix is to serialize timestamps as ISO strings on cache write and reconstruct on read.
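The fix described in the known issue can be sketched without Redis at all, since the bug is in the JSON round trip. In the sketch below, the `Timestamp` class is a small stand-in for the protobuf runtime's type, and the `ActorProfile` shape and the encode/decode helper names are hypothetical.

```typescript
// Stand-in for a protobuf Timestamp; the real class comes from the
// protobuf runtime. Everything here is illustrative.
class Timestamp {
  seconds: number;
  nanos: number;
  constructor(seconds: number, nanos: number) {
    this.seconds = seconds;
    this.nanos = nanos;
  }
  static fromDate(d: Date): Timestamp {
    const ms = d.getTime();
    const seconds = Math.floor(ms / 1000);
    return new Timestamp(seconds, (ms - seconds * 1000) * 1e6);
  }
  toDate(): Date {
    return new Date(this.seconds * 1000 + Math.floor(this.nanos / 1e6));
  }
}

// Hypothetical profile shape, before and after serialization.
interface ActorProfile {
  did: string;
  createdAt: Timestamp;
}
interface CachedActorProfile {
  did: string;
  createdAt: string; // ISO 8601 string
}

// Cache write: replace the Timestamp with an ISO string.
function encodeForCache(profile: ActorProfile): string {
  const cached: CachedActorProfile = {
    did: profile.did,
    createdAt: profile.createdAt.toDate().toISOString(),
  };
  return JSON.stringify(cached);
}

// Cache read: rebuild a real Timestamp so .toDate() exists again.
function decodeFromCache(raw: string): ActorProfile {
  const cached = JSON.parse(raw) as CachedActorProfile;
  return {
    did: cached.did,
    createdAt: Timestamp.fromDate(new Date(cached.createdAt)),
  };
}
```

A plain `JSON.parse(JSON.stringify(profile))` yields an object with `seconds` and `nanos` fields but no prototype methods, which is the failure mode described above.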
Notification preferences server-side enforcement (packages/bsky/src/api/app/bsky/notification/listNotifications.ts)
- When the client doesn't specify reasons, the server applies the user's saved notification preferences. Without this, preferences are only enforced client-side and have no effect.
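The fallback rule can be sketched as a small pure function. The preference shape (a map from reason name to an enabled flag) and the function name are assumptions for illustration; the stored preference format in this fork may be richer.

```typescript
// Hypothetical shape: reason name -> whether the user wants it.
type NotificationPrefs = Record<string, boolean>;

function effectiveReasons(
  requested: string[] | undefined,
  prefs: NotificationPrefs
): string[] {
  // An explicit client filter wins.
  if (requested !== undefined && requested.length > 0) {
    return requested;
  }
  // Otherwise fall back to the reasons the user left enabled.
  return Object.keys(prefs).filter((reason) => prefs[reason]);
}
```

For example, with saved preferences `{ like: false, reply: true, mention: true }` and no `reasons` parameter, only reply and mention notifications would be listed.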
Auth verifier stale signing key fix (packages/bsky/src/auth-verifier.ts)
- On JWT verification retry (forceRefresh), bypasses the dataplane's in-memory identity cache and resolves the DID document directly from PLC directory. Fixes authentication failures after account migration where the signing key rotates but the cache holds the old key.
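The retry logic can be sketched with the key lookups and the signature check injected as functions. None of these names come from auth-verifier.ts: `cachedKey`, `freshKey`, and `verify` are hypothetical placeholders for the cached identity lookup, the direct PLC resolution, and the JWT signature check respectively.

```typescript
// All three function types are placeholders for illustration.
type KeyResolver = (did: string) => Promise<string>;
type SignatureCheck = (jwt: string, signingKey: string) => Promise<boolean>;

async function verifyWithRefresh(
  jwt: string,
  issuerDid: string,
  cachedKey: KeyResolver, // may return a stale key after migration
  freshKey: KeyResolver, // bypasses the cache (the forceRefresh path)
  verify: SignatureCheck
): Promise<boolean> {
  const first = await cachedKey(issuerDid);
  if (await verify(jwt, first)) {
    return true;
  }
  // Retry once against the authoritative source.
  const fresh = await freshKey(issuerDid);
  if (fresh === first) {
    return false; // the key did not rotate, so the token is simply invalid
  }
  return verify(jwt, fresh);
}
```

The important property is that the second lookup cannot be served from the same cache as the first; otherwise the retry would repeat the failure.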
JSON sanitization (packages/bsky/src/data-plane/server/routes/records.ts)
- Strips null bytes (\u0000) and other control characters from stored records before JSON parsing. RFC 8259 requires control characters (U+0000 through U+001F) to be escaped inside strings, so Node.js JSON.parse() rejects records that contain them raw, causing silent rowToRecord parse failures in the dataplane that surface as missing posts.
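A minimal sketch of this kind of sanitization, assuming the offending characters are raw (unescaped) control characters in the stored JSON text. The function name is hypothetical and the actual code in records.ts may handle more cases.

```typescript
// Matches raw C0 control characters (U+0000 through U+001F), including NUL.
const CONTROL_CHARS = /[\u0000-\u001F]/g;

// Hypothetical helper: sanitize, then parse.
function parseStoredRecord(json: string): unknown {
  // Removing tabs and newlines between tokens is harmless, since JSON
  // treats them as insignificant whitespace there.
  return JSON.parse(json.replace(CONTROL_CHARS, ""));
}
```

Note that this drops the characters rather than escaping them, so a post containing a raw null byte is indexed with that byte removed instead of failing to parse.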
Community Posts (Blacksky-specific)
Infrastructure for private community posts that live on the AppView rather than individual PDSes. Specific to how Blacksky works, but could serve as a reference for other communities.
- Custom lexicon namespace community.blacksky.feed.* with endpoints for submit, get, delete, timeline, and thread views
- Separate community_post table (migration: 20260202T120000000Z-add-community-post.ts)
- Membership gating at the dataplane and API layer
- Integration with getPostThreadV2 for mixed standard/community post threads
- Requires a separate membership database (BLACKSKY_MEMBERSHIP_DB_URL)
Architecture
    Bluesky Relay (bsky.network)
           |
           v
    rsky-wintermute -----> PostgreSQL 17 <----- Palomar
      (Rust indexer)            |             (Go search)
      - firehose consumer       |                  |
      - backfiller              |                  v
      - label indexer           |             OpenSearch
      - direct indexer          |
                                v
                   bsky-dataplane (gRPC :2585) <--- Redis (optional)
                                |
                                v
                    bsky-appview (HTTP :2584)
                                |
                                v
                   Reverse proxy (Caddy/nginx)
Component Overview
| Component | Source | Purpose |
|-----------|--------|---------|
| rsky-wintermute | blacksky-algorithms/rsky | Rust firehose indexer: consumes events, backfills repos, indexes records into PostgreSQL |
| rsky-relay | blacksky-algorithms/rsky | AT Protocol relay for receiving moderation labels from labeler services |
| rsky-video | blacksky-algorithms/rsky | Video upload service: transcodes via Bunny Stream CDN, uploads blob refs to user PDSes |
| bsky-dataplane | This repo (services/bsky) | gRPC data layer over PostgreSQL |
| bsky-appview | This repo (services/bsky) | HTTP API server for app.bsky.* XRPC endpoints |
| Palomar | blacksky-algorithms/indigo | Full-text search: indexes profiles and posts into OpenSearch with follower count boosting |
| palomar-sync | blacksky-algorithms/rsky | Syncs follower counts and PageRank scores from PostgreSQL to OpenSearch |
rsky-wintermute in Detail
Wintermute is a monolithic Rust service with four parallel processing paths:
- Ingester: Connects to bsky.network firehose via WebSocket, writes events to Fjall (embedded key-value store) queues
- Indexer: Reads from queues, parses records, writes to PostgreSQL with ON CONFLICT for idempotency
- Backfiller: Fetches full repo CAR files from PDSes, unpacks records into the backfill queue
- Label indexer: Subscribes to labeler WebSocket streams, processes label create/negate events
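Why ON CONFLICT makes the indexer idempotent can be shown with an in-memory model. This sketch assumes the `DO NOTHING` variant keyed on the record URI; the source does not say whether wintermute uses `DO NOTHING` or `DO UPDATE`, and the class and field names are made up.

```typescript
// In-memory model of: INSERT ... ON CONFLICT (uri) DO NOTHING
// Names and the choice of DO NOTHING are assumptions.
interface IndexedRecord {
  uri: string;
  cid: string;
}

class RecordTable {
  private rows = new Map<string, IndexedRecord>();

  /** Returns true if a row was inserted, false if the URI already existed. */
  insertOnConflictDoNothing(record: IndexedRecord): boolean {
    if (this.rows.has(record.uri)) {
      return false;
    }
    this.rows.set(record.uri, record);
    return true;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Because a repeated insert is a no-op, the same record can arrive from both the live firehose queue and a repo backfill without producing duplicates or errors.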
Additional CLI tools included in the rsky repo:
- queue_backfill -- queue DIDs for backfill from CSV, PDS discovery, or direct DID lists
- direct_index -- fetch and index specific repos bypassing queues (useful for fixing individual accounts)
- label_sync -- replay label streams from cursor 0 to catch up on missed negations
- plc_import -- bulk import handle/DID mappings from PLC directory
- palomar-sync -- sync follower counts and PageRank to OpenSearch
rsky-video
Video upload service for users whose PDS doesn't support Bluesky's video.bsky.app. Uses its own DID (did:web:video.blacksky.community) to authenticate to user PDSes via service auth JWTs. Flow:
- Client gets service auth token from PDS (audience: video service DID)
- Client uploads video bytes to rsky-video
- rsky-video generates a CID, uploads the blob to the user's PDS
- Video forwarded to Bunny Stream CDN for transcoding
- On completion, client creates the post referencing the blob -- PDS validates the blob exists
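The audience check implied by the first step of the flow can be sketched as below. It covers only the `aud` and `exp` claims of an already-decoded token; signature verification against the issuer's signing key is deliberately omitted and would be required in any real service. The function name and return convention are hypothetical, and rsky-video itself is written in Rust, so this is not its code.

```typescript
// Standard JWT claims used by service auth tokens.
interface ServiceAuthClaims {
  iss: string; // the user's DID
  aud: string; // the service the token was minted for
  exp: number; // expiry, seconds since the Unix epoch
}

const VIDEO_SERVICE_DID = "did:web:video.blacksky.community";

/** Returns null if the claims are acceptable, otherwise a rejection reason. */
function checkServiceAuthClaims(
  claims: ServiceAuthClaims,
  nowSec: number
): string | null {
  if (claims.aud !== VIDEO_SERVICE_DID) {
    return "token was issued for a different audience";
  }
  if (claims.exp <= nowSec) {
    return "token has expired";
  }
  return null;
}
```

Binding the token to the video service's DID is what prevents a token minted for another service from being replayed here.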
Label Handling
Moderation labels come from labeler services (e.g., Bluesky's Ozone) via WebSocket subscription. Wintermute's ingester processes labels in a dedicated label_live queue (low volume, separate from the main firehose). The label_sync tool can replay a labeler's full stream to catch up on missed negations (label removals) without reinserting labels.
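One way a negation-only replay could work is sketched below. It is an assumption about the approach, not a port of label_sync: the replay computes the latest event per label and removes only labels whose latest event is a negation, never inserting anything. The `src`, `uri`, `val`, and `neg` fields follow the AT Protocol label definition; the function names are hypothetical.

```typescript
// Subset of the AT Protocol label fields relevant to negation.
interface LabelEvent {
  src: string; // labeler DID
  uri: string; // subject
  val: string; // label value
  neg?: boolean; // true when this event removes the label
}

function labelKey(e: LabelEvent): string {
  return e.src + "|" + e.uri + "|" + e.val;
}

/** Removes negated labels from `active`; returns how many were removed. */
function replayNegations(active: Set<string>, stream: LabelEvent[]): number {
  // The last event for each label decides its final state.
  const finalNegated = new Map<string, boolean>();
  stream.forEach((e) => finalNegated.set(labelKey(e), e.neg === true));

  let removed = 0;
  finalNegated.forEach((negated, key) => {
    // Only delete; creates in the stream are never reinserted.
    if (negated && active.delete(key)) {
      removed++;
    }
  });
  return removed;
}
```

Using the final state per label, rather than applying every negation in order, avoids removing a label that was negated and later re-applied.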
Setup
Prerequisites
- Node.js 18+ and pnpm (for building the dataplane and appview)
- PostgreSQL 17 with the bsky schema
- Redis (optional, for caching -- see known issue above)
- rsky-wintermute consuming the firehose and populating the database
- OpenSearch (if running Palomar search)
Database
The bsky schema is created by the dataplane's migrations. On first run, the dataplane will apply all migrations automatically. The only Blacksky-specific migration is 20260202T120000000Z-add-community-post.ts (community posts table). If you don't need community posts, you can remove it.
rsky-wintermute writes to this same schema. All its INSERT statements use ON CONFLICT so it's safe to run wintermute and the dataplane migrations in any order.
Build
pnpm install
pnpm build
Run the Dataplane
node services/bsky/dataplane.js
| Variable | Required | Description |
|----------|----------|-------------|
| DB_PRIMARY_URL | Yes | PostgreSQL connection string with ?options=-csearch_path%3Dbsky |
| DB_REPLICA_URL | No | Read replica connection string |
| BSKY_DATAPLANE_PORT | No | gRPC port (default 2585) |
| BSKY_REDIS_HOST | No | Redis host:port for caching (currently recommended to leave disabled) |
| BLACKSKY_MEMBERSHIP_DB_URL | No | Separate DB for community membership (Blacksky-specific) |
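
The variables in the table can be summarized in a small validation sketch. This is not how services/bsky/dataplane.js reads its configuration; the `DataplaneConfig` shape and both function names are hypothetical, and the connection string used in the usage note is a placeholder.

```typescript
// Hypothetical config shape mirroring the table above.
interface DataplaneConfig {
  dbPrimaryUrl: string;
  dbReplicaUrl?: string;
  port: number;
  redisHost?: string;
  membershipDbUrl?: string;
}

/** Appends the search_path option so queries resolve against `schema`. */
function withSearchPath(baseUrl: string, schema: string): string {
  const separator = baseUrl.indexOf("?") === -1 ? "?" : "&";
  const option = encodeURIComponent("-csearch_path=" + schema);
  return baseUrl + separator + "options=" + option;
}

function readDataplaneConfig(
  env: Record<string, string | undefined>
): DataplaneConfig {
  const dbPrimaryUrl = env.DB_PRIMARY_URL;
  if (!dbPrimaryUrl) {
    throw new Error("DB_PRIMARY_URL is required");
  }
  return {
    dbPrimaryUrl,
    dbReplicaUrl: env.DB_REPLICA_URL,
    port: Number(env.BSKY_DATAPLANE_PORT ?? 2585), // documented default
    redisHost: env.BSKY_REDIS_HOST,
    membershipDbUrl: env.BLACKSKY_MEMBERSHIP_DB_URL,
  };
}
```

For instance, `withSearchPath("postgresql://user:pass@localhost:5432/bsky", "bsky")` produces a URL ending in `?options=-csearch_path%3Dbsky`, the form the table asks for.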
Run the AppView
node services/bsky/api.js
| Variable | Required | Description |
|----------|----------|-------------|
| BSKY_APPVIEW_PORT | No | HTTP port (default 2584) |
| BSKY_DATAPLANE_URLS | Yes | Comma-separated dataplane gRPC URLs |
| BSKY_DID | Yes | The AppView's DID (e.g. did:web:api.example.com) |
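
Since BSKY_DATAPLANE_URLS is a comma-separated list, a startup check along these lines can catch an empty or malformed value early. The function name is hypothetical, the example URLs are placeholders, and the real parsing in services/bsky/api.js may differ.

```typescript
// Hypothetical helper: split, trim, and reject an empty list.
function parseDataplaneUrls(raw: string | undefined): string[] {
  const urls = (raw ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  if (urls.length === 0) {
    throw new Error("BSKY_DATAPLANE_URLS is required");
  }
  return urls;
}
```

With the dataplane on its default port, a single-instance value would look like `http://localhost:2585`.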
