# Querynox
A multi‑AI chat platform combining live web search, document retrieval (RAG), and built‑in image generation in one unified interface.
# QueryNox Backend - Technical Architecture Documentation
## System Overview
QueryNox is a production-grade, multi-model AI chat platform built on a microservices-inspired architecture. The system integrates seven models across five AI providers and implements real-time web search augmentation, document processing through RAG (Retrieval-Augmented Generation), subscription-based rate limiting, comprehensive monitoring, and enterprise-level security features.
## Technical Stack

### Core Framework & Infrastructure
- Runtime: Node.js 18+ (Event-driven, non-blocking I/O)
- Web Framework: Express.js 4.18.2 with custom middleware pipeline
- Database: MongoDB 8.0.1 with Mongoose ODM (Atlas/Self-hosted)
- Containerization: Docker Compose for multi-container deployments
- File Storage: Cloudflare R2 (S3-compatible) for image artifacts
### Authentication & Authorization
- Primary Auth: Clerk Authentication with JWT validation
- Secondary Auth: Basic HTTP Authentication for admin endpoints
- Session Management: Stateless JWT with user context hydration
### Monitoring & Observability
- Metrics: Prometheus with custom metrics collection
- Logging: Winston + Grafana Loki (structured JSON logging)
- Visualization: Grafana dashboards for real-time monitoring
- Performance Tracking: Custom request/response time histograms
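Log events are shipped to Loki as structured JSON so they can be filtered by field in Grafana. A minimal, dependency-free sketch of that shape (the field names here are assumptions, not the real schema; in the actual service this is handled by a Winston format rather than a hand-rolled function):

```javascript
// Sketch of a structured JSON log line, as Winston would emit it to Loki.
// Field names (timestamp/level/message) are assumptions.
function formatLogEntry(level, message, meta = {}) {
  return JSON.stringify({
    timestamp: new Date().toISOString(), // ISO-8601 keeps entries sortable in Loki
    level,                               // "info", "warn", "error", ...
    message,
    ...meta,                             // e.g. request id, route, latency
  });
}
```

The point is only the flat, queryable JSON shape: every value lives at the top level, so Loki's label and JSON filters can match on it directly.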
### AI Provider Integrations
- Primary: OpenAI (GPT models, DALL-E, Embeddings)
- Anthropic: Claude Haiku 4.5 via native SDK
- Groq: Llama 3.3-70B with hardware acceleration
- Google: Gemini 2.5 Flash via GenerativeAI SDK
- OpenRouter: Unified proxy for gpt-oss-120b, Grok-3-mini
- Embedding: OpenAI text-embedding-3-small (1536 dimensions)
## Architecture Deep Dive

### Service Layer Architecture

```mermaid
graph TB
    subgraph "HTTP Layer"
        A[Express Server] --> B[CORS Middleware]
        B --> C[Clerk Auth Middleware]
        C --> D[Rate Limiting Middleware]
        D --> E[File Upload Middleware]
    end
    subgraph "Controller Layer"
        F[Chat Controller] --> G[User Controller]
        G --> H[Payment Controller]
    end
    subgraph "Service Layer"
        I[Service Manager] --> J[AI Service]
        I --> K[RAG Service]
        I --> L[Web Search Service]
        I --> M[Image Service]
        I --> N[OpenRouter Service]
    end
    subgraph "Data Layer"
        O[MongoDB] --> P[User Model]
        O --> Q[Chat Model]
        O --> R[ChatQuery Model]
        O --> S[Product Model]
    end
    E --> F
    F --> I
    I --> O
```
### Database Schema Design

#### User Document Structure

```javascript
{
  _id: String,                 // Clerk User ID (primary key)
  chats: [ObjectId],           // Array of chat references
  bookmarkedChats: [ObjectId], // Bookmarked chat references
  productId: String,           // Subscription product reference
  usedChatGeneration: Number,  // Monthly usage counter
  usedImageGeneration: Number, // Monthly image generation counter
  usedWebSearch: Number,       // Monthly web search counter
  usedFileRag: Number,         // Monthly RAG usage counter
  limitsUpdatedAt: Number,     // Last reset timestamp
  createdAt: Number,           // Account creation timestamp
  updatedAt: Number            // Last modification timestamp
}
```
#### Chat Document Structure

```javascript
{
  _id: ObjectId,        // Auto-generated chat ID
  userId: String,       // User reference (Clerk ID)
  title: String,        // Auto-generated chat title
  chatName: String,     // AI-generated descriptive name
  model: String,        // Current model being used
  systemPrompt: String, // System instructions
  webSearch: Boolean,   // Web search enabled flag
  isShared: Boolean,    // Public sharing flag
  createdAt: Number,    // Chat creation timestamp
  updatedAt: Number     // Last activity timestamp
}
```
#### ChatQuery Document Structure

```javascript
{
  _id: ObjectId,             // Auto-generated query ID
  chatId: ObjectId,          // Parent chat reference
  prompt: String,            // User input prompt
  model: String,             // Model used for the response
  systemPrompt: String,      // System prompt at time of query
  webSearch: Boolean,        // Web search used flag
  response: String,          // AI model response
  meta: Map<String, String>, // Metadata (image keys, etc.)
  createdAt: Number,         // Query timestamp
  updatedAt: Number          // Response completion timestamp
}
```
#### Product Document Structure (Subscription Management)

```javascript
{
  _id: String,                    // Polar product ID
  name: String,                   // Product display name
  description: String,            // Product description
  metadata: {
    chatGenerationLimit: Number,  // Monthly chat limit
    imageGenerationLimit: Number, // Monthly image limit
    webSearchLimit: Number,       // Monthly web search limit
    fileRagLimit: Number,         // Monthly RAG limit
    fileCountLimit: Number        // Per-request file limit
  },
  recurringInterval: String,      // Billing cycle (month/year)
  isRecurring: Boolean,           // Subscription flag
  isArchived: Boolean,            // Product active state
  organizationId: String,         // Polar organization ID
  createdAt: Number,              // Product creation timestamp
  modifiedAt: Number,             // Last modification timestamp
  prices: Array,                  // Pricing tiers
  benefits: Array,                // Feature list
  medias: Array,                  // Product media
  attachedCustomFields: Array     // Additional metadata
}
```
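The rate limiter reads its per-tier quotas straight off this metadata block. A hedged sketch of that extraction, assuming the product document is already loaded (the real service presumably resolves the product by ID with an async DB lookup, and the zero fallbacks for missing fields are an assumption, not the actual defaults):

```javascript
// Normalize usage limits out of a loaded Product document.
// Fallbacks of 0 are illustrative assumptions.
function limitsFromProduct(product) {
  const m = product?.metadata ?? {};
  return {
    chatGenerationLimit:  m.chatGenerationLimit  ?? 0,
    imageGenerationLimit: m.imageGenerationLimit ?? 0,
    webSearchLimit:       m.webSearchLimit       ?? 0,
    fileRagLimit:         m.fileRagLimit         ?? 0,
    fileCountLimit:       m.fileCountLimit       ?? 0,
  };
}
```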
## Request Processing Pipeline

### 1. Authentication Flow

```javascript
// Clerk authentication middleware factory
const clerkAuthMiddleware = (requestUser = false, upInsert = true) => {
  return async (req, res, next) => {
    let userId = getAuth(req).userId;

    // Development bypass: allow ?userId= outside production
    if (!userId && process.env.NODE_ENV === "development") {
      userId = req.query.userId;
    }
    if (!userId) {
      return res.status(401).json({ error: 'Not authenticated' });
    }

    if (requestUser) {
      let _user = await User.findById(userId);
      if (!_user && upInsert) {
        _user = new User({ _id: userId }); // Auto-create user on first request
      }
      req.user = _user;
    }
    req.userId = userId;
    next();
  };
};
```
### 2. Rate Limiting Implementation

```javascript
// Monthly usage-based rate limiting
const userLimitMiddleware = () => {
  return async (req, res, next) => {
    const user = req.user;

    // Monthly reset: clear all counters when the calendar month rolls over
    const now = new Date();
    const lastUpdated = new Date(user.limitsUpdatedAt);
    if (now.getMonth() !== lastUpdated.getMonth() ||
        now.getFullYear() !== lastUpdated.getFullYear()) {
      user.usedChatGeneration = 0;
      user.usedFileRag = 0;
      user.usedImageGeneration = 0;
      user.usedWebSearch = 0;
      user.limitsUpdatedAt = Date.now();
      await user.save();
    }

    // Apply limits based on subscription tier
    const limits = user.productId
      ? await getProductLimits(user.productId)
      : DEFAULT_LIMITS;

    // Validate current usage (and this request) against the limits
    if (exceedsLimits(user, limits, req)) {
      return res.status(429).json({ error: 'Usage limit exceeded' });
    }
    next();
  };
};
```
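The `exceedsLimits` helper referenced above is not shown in this excerpt. One plausible, purely illustrative implementation follows; the mapping of models to quotas (image models draw from the image counter, everything else from the chat counter) and the per-request file check are assumptions:

```javascript
// Illustrative sketch of exceedsLimits; field names mirror the User and
// Product schemas above, but the quota routing is an assumption.
function exceedsLimits(user, limits, req) {
  const isImageModel = req.body.model === 'dall-e-3';

  if (isImageModel) {
    if (user.usedImageGeneration >= limits.imageGenerationLimit) return true;
  } else if (user.usedChatGeneration >= limits.chatGenerationLimit) {
    return true;
  }
  if (req.body.webSearch && user.usedWebSearch >= limits.webSearchLimit) return true;

  const files = req.files ?? [];
  if (files.length > 0) {
    if (files.length > limits.fileCountLimit) return true;    // per-request cap
    if (user.usedFileRag >= limits.fileRagLimit) return true; // monthly cap
  }
  return false;
}
```

Returning a plain boolean keeps the middleware's response handling (the single 429 branch) in one place.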
## AI Service Architecture

### Service Manager Pattern

```javascript
class ServiceManager {
  constructor() {
    this.openRouterService = openRouterService;
    this.aiService = aiService;
    this.ragService = ragService;
    this.webSearchService = webSearchService;
    this.imageService = imageService;
  }

  // Route by model: OpenRouter-proxied models vs. native provider SDKs
  isOpenRouterModel(modelName) {
    return ['gpt-oss-120b', 'grok-3-mini'].includes(modelName);
  }

  async *generateStreamingResponse(model, messages, systemPrompt) {
    if (this.isOpenRouterModel(model)) {
      yield* this.openRouterService.generateStreamingResponse(model, messages, systemPrompt);
    } else {
      yield* this.aiService.generateStreamingResponse(model, messages, systemPrompt);
    }
  }
}
```
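Callers consume the returned async generator with `for await...of`, flushing each chunk as it arrives (in the HTTP layer this would be a `res.write()` loop for SSE). A self-contained sketch with stub services, showing only the routing and streaming contract (the stub payloads are invented for illustration):

```javascript
// Stubs standing in for openRouterService / aiService; the real ones call
// provider SDKs. Only the routing + async-iteration contract is shown.
const stubOpenRouter = { async *generateStreamingResponse() { yield '[openrouter] '; yield 'hello'; } };
const stubAi         = { async *generateStreamingResponse() { yield '[native] ';     yield 'hello'; } };

class ServiceManagerSketch {
  isOpenRouterModel(modelName) {
    return ['gpt-oss-120b', 'grok-3-mini'].includes(modelName);
  }
  async *generateStreamingResponse(model, messages, systemPrompt) {
    const svc = this.isOpenRouterModel(model) ? stubOpenRouter : stubAi;
    yield* svc.generateStreamingResponse(model, messages, systemPrompt);
  }
}

// Collect a stream into a string (an SSE handler would res.write() each chunk instead).
async function collect(stream) {
  let out = '';
  for await (const chunk of stream) out += chunk;
  return out;
}
```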
### Model Configuration

```javascript
const models = [
  {
    name: "Claude Haiku 4.5",
    fullName: "claude-haiku-4-5-20251001",
    category: "Text Generation",
    description: "Fast and efficient text generation",
    limit: 200000,
    pro: true
  },
  {
    name: "llama-3.3-70b-versatile",
    fullName: "llama-3.3-70b-versatile",
    category: "Text Generation",
    description: "Powerful open-source model via Groq",
    limit: 32768,
    pro: false
  },
  {
    name: "gpt-3.5-turbo",
    fullName: "gpt-3.5-turbo",
    category: "Text Generation",
    description: "Reliable and versatile text generation",
    limit: 16385,
    pro: false
  },
  {
    name: "gemini-2.5-flash",
    fullName: "gemini-2.5-flash",
    category: "Text Generation",
    description: "Google's advanced language model",
    limit: 1000000,
    pro: false
  },
  {
    name: "dall-e-3",
    fullName: "dall-e-3",
    category: "Image Generation",
    description: "High-quality image generation",
    limit: 4000,
    pro: false
  },
  {
    name: "gpt-oss-120b",
    fullName: "gpt-oss-120b",
    category: "Text Generation",
    description: "OpenAI's open-weight 120B model",
    limit: 128000,
    pro: false
  },
  {
    name: "grok-3-mini",
    fullName: "grok-3-mini",
    category: "Text Generation",
    description: "xAI Grok 3 Mini model",
    limit: 128000,
    pro: true
  }
];
```
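An incoming request's model string can then be resolved against this table, with `pro` entries gated behind a subscription check. The lookup helper below is illustrative only; the real controller code is not shown in this excerpt, and the error messages and `hasProPlan` flag are assumptions:

```javascript
// Illustrative lookup against the models table above.
function resolveModel(models, requested, hasProPlan) {
  const model = models.find(
    (m) => m.name === requested || m.fullName === requested
  );
  if (!model) throw new Error(`Unknown model: ${requested}`);
  if (model.pro && !hasProPlan) throw new Error(`${model.name} requires a pro subscription`);
  return model;
}
```

The returned entry carries the context `limit`, so the caller can also truncate history before dispatching to the provider.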
## RAG (Retrieval-Augmented Generation) Implementation

### Document Processing Pipeline

```javascript
const pdfParse = require('pdf-parse');

const ragService = {
  // PDF text extraction using pdf-parse
  getTextFromPDF: async (pdfBuffer) => {
    const data = await pdfParse(pdfBuffer);
    return data.text;
  },
  // ...
};
```
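Downstream of text extraction, retrieval reduces to chunking the document, embedding each chunk (text-embedding-3-small, 1536-dimension vectors, per the stack list above), and ranking chunks by similarity to the query embedding. A dependency-free sketch of the similarity step; the scoring and top-k selection are standard, but how the real service chunks and stores vectors is not shown in this excerpt:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na  += a[i] * a[i];
    nb  += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank pre-embedded chunks against a query embedding and keep the top k.
function topKChunks(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The selected chunks are what gets prepended to the prompt as retrieved context before the model call.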
