Scrapitor

Proxy and capture JanitorAI traffic to OpenRouter with automatic logging. Features a rule-driven parser for character sheet extraction, versioned exports, a web dashboard, and one-click Cloudflare tunnel deployment.

<p align="center"> <img src="https://raw.githubusercontent.com/daksh-7/scrapitor/main/app/static/assets/logo_black.svg" alt="scrapitor Logo" width="180" height="180"> </p> <h1 align="center">scrapitor</h1> <p align="center"> <strong>Intercept. Parse. Export.</strong> </p> <p align="center"> <img src="https://img.shields.io/badge/Python-3.10+-1e3a8a?style=flat-square" alt="Python 3.10+"> <img src="https://img.shields.io/badge/Svelte-5-1d4ed8?style=flat-square" alt="Svelte 5"> <img src="https://img.shields.io/badge/Version-2.2-3b82f6?style=flat-square" alt="Version 2.2"> <img src="https://img.shields.io/badge/PRs-welcome-0ea5e9?style=flat-square" alt="PRs Welcome"> </p>

A local proxy that intercepts JanitorAI traffic, captures request payloads as JSON logs, and provides a rule-driven parser to extract clean character sheets. Exports to SillyTavern-compatible JSON format.

Quick Start

Quick Start (Windows)

  1. Download: https://github.com/daksh-7/scrapitor → Code → Download ZIP → Unzip
  2. Double-click run.bat
  3. Copy the Cloudflare Proxy URL from the terminal
  4. In JanitorAI: Enable "Using proxy" → paste the URL → add your OpenRouter API key
  5. Send a message — your request appears in the dashboard Activity tab

Requirements: Python 3.10+ and PowerShell 7. The launcher auto-installs everything else.

Quick Start (Linux/macOS)

  1. Clone and run:
     git clone https://github.com/daksh-7/scrapitor && cd scrapitor && ./run.sh
  2. Copy the Cloudflare Proxy URL from the terminal
  3. In JanitorAI: Enable "Using proxy" → paste the URL → add your OpenRouter API key

Requirements: Python 3.10+, curl, and bash. The launcher auto-installs cloudflared and Python dependencies.

Quick Start (Termux/Android)

  1. Install Termux from F-Droid (the Play Store version is outdated)
  2. Install dependencies:
     pkg update && pkg upgrade -y && pkg install python git curl cloudflared -y
  3. Clone and run:
     git clone https://github.com/daksh-7/scrapitor && cd scrapitor && ./run.sh
  4. In another Termux session, run termux-wake-lock to prevent Android from killing the process
  5. Copy the Cloudflare Proxy URL and use it in JanitorAI

Requirements: Termux with python, curl, git, and cloudflared packages. ARM64 device required.


Architecture

graph LR
    %% --- NODES & DATA ---
    J([JanitorAI<br/>Browser Client])
    S[scrapitor<br/>Flask Proxy]
    OR(OpenRouter<br/>API)

    subgraph Data_Processing [Data Processing & UI]
        direction TB
        L[(JSON Log<br/>Files)]
        P[[Parser<br/>Engine]]
        D(Dashboard<br/>Svelte 5)
        E[/Parsed TXT /<br/>SillyTavern Export/]
    end

    %% --- CONNECTIONS ---
    %% Bi-directional traffic flow
    J <==>|HTTP Request<br/>& Response| S
    S <==>|Forward &<br/>Inference| OR
    
    %% Internal Data flow
    S -.->|Live State| D
    S -- Capture<br/>Completion --> L
    L -.->|Read| P
    P -->|Generate| E

    %% --- STYLING ---
    classDef base fill:#fff,stroke:#333,stroke-width:1px,color:#333;
    classDef client fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#0d47a1;
    classDef proxy fill:#e8eaf6,stroke:#3949ab,stroke-width:3px,color:#1a237e;
    classDef external fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,stroke-dasharray: 5 5,color:#4a148c;
    classDef storage fill:#e0f2f1,stroke:#00695c,stroke-width:2px,color:#004d40;
    classDef ui fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#880e4f;
    
    %% Apply Styles
    class J client;
    class S proxy;
    class OR external;
    class L,P,E storage;
    class D ui;

    %% Style Subgraph
    style Data_Processing fill:#ffffff,stroke:#e0e0e0,stroke-width:2px,stroke-dasharray: 5 5,color:#9e9e9e

Data Flow:

  1. JanitorAI sends chat requests to the scrapitor proxy (via Cloudflare tunnel)
  2. scrapitor logs the full request payload as JSON, then forwards to OpenRouter
  3. The parser extracts character data using tag-aware rules
  4. Parsed content is saved as versioned .txt files or exported to SillyTavern JSON
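
Step 2 of the data flow (log first, then forward) can be sketched roughly as follows. This is a minimal illustration, not scrapitor's actual code: the function name `log_request_payload`, the timestamp-based file name, and the log directory are all assumptions made for the example.

```python
import json
import time
from pathlib import Path


def log_request_payload(payload: dict, log_dir: Path) -> Path:
    """Write one request payload to its own JSON file and return the file path."""
    log_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a nanosecond timestamp keeps files in arrival order.
    log_path = log_dir / f"{time.time_ns()}.json"
    log_path.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return log_path
```

Writing the log before forwarding means a payload is still captured even if the upstream request later fails.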

Installation

Windows (Recommended)

Prerequisites:

  • Python 3.10+ (during installation, check "Add python.exe to PATH")
  • PowerShell 7: winget install Microsoft.PowerShell

Option A: Download ZIP from GitHub → Code → Download ZIP → Unzip

Option B: Clone with Git:

git clone https://github.com/daksh-7/scrapitor && cd scrapitor

Then: Double-click run.bat

The launcher will:

  • Create a virtual environment and install dependencies
  • Build the Svelte frontend (if Node.js is available and sources changed)
  • Start Flask on port 5000
  • Establish a Cloudflare tunnel and display the public URL
  • Show live status (press Q to quit)

Example launcher output:

███████╗ ██████╗██████╗  █████╗ ██████╗ ██╗████████╗ ██████╗ ██████╗
██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██║╚══██╔══╝██╔═══██╗██╔══██╗
███████╗██║     ██████╔╝███████║██████╔╝██║   ██║   ██║   ██║██████╔╝
╚════██║██║     ██╔══██╗██╔══██║██╔═══╝ ██║   ██║   ██║   ██║██╔══██╗
███████║╚██████╗██║  ██║██║  ██║██║     ██║   ██║   ╚██████╔╝██║  ██║
╚══════╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝   ╚═╝    ╚═════╝ ╚═╝  ╚═╝

  [✓] Python 3.14.0 found
  [✓] Dependencies up to date
  [✓] Cloudflared ready
  [✓] Flask healthy on :5000
  [✓] Tunnel ready

  ┌────────────────────────────────────────────────────────────────┐
  │  Dashboard:  http://localhost:5000                             │
  │  LAN:        http://192.168.0.101:5000                         │
  │  Proxy URL:  https://example.trycloudflare.com/openrouter-cc   │
  └────────────────────────────────────────────────────────────────┘
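
The Proxy URL in the sample output is the tunnel's public address followed by the `/openrouter-cc` path. A small sketch of that composition, assuming the path stays fixed as shown; the helper name `proxy_url` is hypothetical and not part of scrapitor:

```python
from urllib.parse import urlsplit, urlunsplit

# Path taken from the sample launcher output; treated here as a fixed suffix.
PROXY_PATH = "/openrouter-cc"


def proxy_url(tunnel_base: str, path: str = PROXY_PATH) -> str:
    """Return the URL to paste into JanitorAI for a given tunnel base URL."""
    parts = urlsplit(tunnel_base)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {tunnel_base!r}")
    # Drop any existing path, query, or fragment and attach the proxy path.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```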

macOS / Linux

Prerequisites:

  • Python 3.10+ (most systems have this pre-installed)
  • Bash 3.0+ (macOS ships with 3.2, Linux typically has 4.0+)
  • curl (for cloudflared download)

Supported Architectures:

| Platform | Architecture | Notes |
|----------|--------------|-------|
| macOS | Apple Silicon (M1/M2/M3/M4) | arm64 binary auto-downloaded |
| macOS | Intel | amd64 binary auto-downloaded |
| Linux | x86_64/amd64 | Standard servers and desktops |
| Linux | aarch64/arm64 | ARM servers, Raspberry Pi 4+ (64-bit) |
| Linux | armv7l/armhf | Raspberry Pi 3 and older (32-bit) |
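
To illustrate how a launcher might pick the right binary for these platforms, here is a hedged Python sketch that normalizes the values reported by `platform.system()` and `platform.machine()`. The function name `cloudflared_target` and the short label `arm` for 32-bit ARM are assumptions for the example; the real launcher is a shell script and may decide differently.

```python
import platform
from typing import Optional, Tuple

# Machine names as reported by the OS, mapped to short architecture labels.
# The "arm" label for 32-bit ARM is an assumption made for this sketch.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm",
    "armhf": "arm",
}


def cloudflared_target(system: Optional[str] = None,
                       machine: Optional[str] = None) -> Tuple[str, str]:
    """Return (os, arch) for a supported platform, or raise ValueError."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    if system not in ("linux", "darwin"):
        raise ValueError(f"unsupported OS: {system}")
    arch = _ARCH_ALIASES.get(machine)
    # The table lists no 32-bit ARM build for macOS.
    if arch is None or (system == "darwin" and arch == "arm"):
        raise ValueError(f"unsupported architecture: {system}/{machine}")
    return system, arch
```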

Option A: Download ZIP from GitHub → Code → Download ZIP → Unzip

Option B: Clone with Git:

git clone https://github.com/daksh-7/scrapitor && cd scrapitor && ./run.sh

The launcher will:

  • Create a virtual environment at app/.venv and install dependencies
  • Auto-download cloudflared from GitHub releases (if not found in PATH)
  • Build the Svelte frontend (if Node.js is available and sources changed)
  • Start Flask on port 5000
  • Establish a Cloudflare tunnel and display the public URL
  • Show live status with uptime (press Q to quit gracefully)
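
Between starting Flask and opening the tunnel, a launcher needs to know the server is actually accepting connections. The sketch below shows one simple way to wait for that, using a plain TCP connect. It is an illustration only: `wait_for_port` is a hypothetical helper, and scrapitor's own health check may use an HTTP request instead.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("127.0.0.1", 5000)` would return `True` once the server on port 5000 is reachable.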

macOS Notes:

  • If you prefer Homebrew: brew install cloudflared (then the launcher uses the system binary)
  • On Apple Silicon, Rosetta is not required — native arm64 binary is used

Manual Setup (alternative):

python3 -m venv app/.venv && source app/.venv/bin/activate && pip install -r app/requirements.txt && python -m app.server

In another terminal (optional):

cloudflared tunnel --no-autoupdate --url http://127.0.0.1:5000
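
When running cloudflared by hand, the public URL has to be picked out of its terminal output. A minimal sketch of that step, assuming the URL appears somewhere in the output as an `https://<name>.trycloudflare.com` address; the helper name and the sample line in the usage note are made up for illustration:

```python
import re
from typing import Optional

# Matches a quick-tunnel hostname such as https://example.trycloudflare.com
TUNNEL_URL_RE = re.compile(r"https://[a-z0-9-]+\.trycloudflare\.com")


def find_tunnel_url(output: str) -> Optional[str]:
    """Return the first trycloudflare.com URL found in the text, or None."""
    match = TUNNEL_URL_RE.search(output)
    return match.group(0) if match else None
```

Calling `find_tunnel_url("|  https://example.trycloudflare.com  |")` returns `"https://example.trycloudflare.com"`; append `/openrouter-cc` to get the Proxy URL.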

Termux (Android)

Run scrapitor directly on your Android device using Termux.

Prerequisites:

  • Install Termux from F-Droid (the Play Store version is outdated and will not work)
  • ARM64 device required (most Android phones from 2017+ are ARM64)
  • Grant storage permissions: termux-setup-storage

Device Compatibility:

| Architecture | Supported | Notes |
|--------------|-----------|-------|
| ARM64 (aarch64) | Yes | Most modern Android phones and tablets |
| ARM32 (armv7l) | No | Older devices; cloudflared binary not available |
| x86/x86_64 | Untested | Some Android emulators and Chromebooks |

Install:

pkg update && pkg upgrade -y && pkg install python git curl cloudflared -y
git clone https://github.com/daksh-7/scrapitor && cd scrapitor && ./run.sh

The launcher will:

  • Create a virtual environment at app/.venv and install dependencies
  • Detect Termux environment and show helpful tips
  • Start Flask on port 5000
  • Establish a Cloudflare tunnel and display the public URL
  • Show live status with uptime (press Q to quit gracefully)
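
Termux detection can be approximated from environment variables. The sketch below assumes that Termux sessions set `TERMUX_VERSION` and use a `PREFIX` under the `com.termux` app directory; the function `is_termux` is hypothetical, and the launcher's real check may differ.

```python
import os
from typing import Mapping, Optional


def is_termux(env: Optional[Mapping[str, str]] = None) -> bool:
    """Guess whether the process runs inside Termux, based on environment variables."""
    env = os.environ if env is None else env
    # Assumption: Termux exports TERMUX_VERSION in its shell sessions.
    if env.get("TERMUX_VERSION"):
        return True
    # Fallback assumption: PREFIX points inside the com.termux app directory.
    return "com.termux" in env.get("PREFIX", "")
```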

Preventing Android from Killing Termux:

Android aggressively kills background apps to save battery. To keep scrapitor running:

termux-wake-lock          # Option 1 (recommended): Run in separate session
pkg install termux-services  # Option 2: Install termux-services
# Option 3: Disable battery optimization for Termux in Android settings

Optional packages:

pkg install nodejs           # For frontend building (~200MB)
pkg install net-tools iproute2  # For better LAN IP detection
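
For context on LAN IP detection, here is one common best-effort technique in Python that needs no extra packages. It is not necessarily what scrapitor does; the function name and the probe address (a reserved documentation address) are choices made for this sketch.

```python
import socket


def lan_ip(probe_host: str = "192.0.2.1", probe_port: int = 80) -> str:
    """Best-effort local IPv4 address; falls back to loopback if no route exists."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only asks the OS
        # which local interface would be used to reach the probe address.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()
```

The result is the address to combine with port 5000 for the LAN URL, for example `http://<lan_ip>:5000`.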

Tips for Termux:

  • Access the dashboard from your device's browser at http://localhost:5000
  • Use a split-screen or floating window to keep Termux visible
  • The LAN URL (e.g., http://192.168.x.x:5000) works for other devices on your WiFi
  • If you see "Termux killed in background," the wa