Architeuthis

MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.

Strictly obeys configured rate-limiting for each IP & Host
Seamless exponential backoff retries on timeout or error HTTP codes
Requires no additional configuration for integration into existing programs
Configurable per-host behavior
Monitoring with InfluxDB

[Screenshot: Grafana monitoring dashboard]

Typical use case

[Diagram: typical use case]

Usage

git clone https://github.com/simon987/Architeuthis
cd Architeuthis
vim config.json # Configure settings here

docker-compose up

You can add proxies using the /add_proxy API:

# Quote the URL so the shell does not interpret the "&"
curl "http://<Architeuthis IP>:5050/add_proxy?url=<url>&name=<name>"

Or automatically using Proxybroker:

python3 import_from_broker.py http://<Architeuthis IP>:5050
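
The /add_proxy call above can also be scripted. The following is a minimal Python sketch, assuming the endpoint accepts the same GET query parameters as the curl example. The helper names `build_add_proxy_url` and `add_proxy` are hypothetical, and the addresses in the usage comment are placeholders. Percent-encoding the parameters keeps characters such as `&` or `/` in the proxy URL from breaking the query string.

```python
import urllib.parse
import urllib.request


def build_add_proxy_url(base, proxy_url, name):
    """Return the /add_proxy URL with both parameters percent-encoded."""
    query = urllib.parse.urlencode({"url": proxy_url, "name": name})
    return f"{base.rstrip('/')}/add_proxy?{query}"


def add_proxy(base, proxy_url, name, timeout=10):
    """Send the request (GET, as in the curl example) and return the HTTP status."""
    target = build_add_proxy_url(base, proxy_url, name)
    with urllib.request.urlopen(target, timeout=timeout) as resp:
        return resp.status


# Usage (placeholder addresses, not taken from the project):
# add_proxy("http://localhost:5050", "http://10.0.0.2:3128", "p0")
```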

Example usage with wget

export http_proxy="http://localhost:5050"
# --no-check-certificate is necessary for HTTPS MITM
# You don't need to specify a user agent if it's already in your config.json
wget -m -np -c --no-check-certificate -R "index.html*" http://ca.releases.ubuntu.com/
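
The same setup can be reproduced from a script. The following is a rough Python sketch using only the standard library, assuming Architeuthis listens on localhost:5050 as in the wget example. The helper name `make_proxy_opener` is hypothetical. Disabling certificate verification mirrors `--no-check-certificate` and is only appropriate because the proxy intercepts HTTPS traffic.

```python
import ssl
import urllib.request


def make_proxy_opener(proxy="http://localhost:5050"):
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    # Equivalent of wget's --no-check-certificate: accept the MITM certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
        urllib.request.HTTPSHandler(context=ctx),
    )


# Usage (performs a real request through the proxy):
# opener = make_proxy_opener()
# html = opener.open("http://ca.releases.ubuntu.com/", timeout=30).read()
```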

With "every": "500ms" and a single proxy, you should see log output like this:

...
level=trace msg=Sleeping wait=414.324437ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA1SUMS.gpg"
level=trace msg=Sleeping wait=435.166127ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA256SUMS"
level=trace msg=Sleeping wait=438.657784ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA256SUMS.gpg"
level=trace msg=Sleeping wait=457.06543ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/ubuntu-12.04.5-alternate-amd64.iso"
level=trace msg=Sleeping wait=433.394361ms
...
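
The sleep times in the log above are consistent with a token-bucket style limiter. The sketch below is illustrative only and is not taken from the Architeuthis source: it assumes that "every" is the interval at which one request token is refilled and that "burst" is the bucket capacity. The class name `TokenBucket` and its `reserve` method are hypothetical.

```python
class TokenBucket:
    """Illustrative limiter: one token per `every` seconds, at most `burst` stored."""

    def __init__(self, every, burst):
        self.every = float(every)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous reservation

    def reserve(self, now):
        """Take one token at time `now` and return how long to sleep first."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed / self.every)
        self.last = max(self.last, now)
        self.tokens -= 1.0
        if self.tokens >= 0.0:
            return 0.0
        # A negative balance means the token is borrowed from the future.
        return -self.tokens * self.every


# With every=0.5s and burst=1, a request arriving 0.25s after the
# previous one has to sleep for the remaining 0.25s.
bucket = TokenBucket(every=0.5, burst=1)
first_wait = bucket.reserve(0.0)
second_wait = bucket.reserve(0.25)
```

Under these assumptions, a larger "burst" lets that many requests through without sleeping before the steady one-per-"every" pace applies.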

Hot config reload

# Note: this resets the current rate limiters. If there are many active
# connections, this might cause a small request spike that exceeds
# the rate limits.
./reload.sh

Rules

Conditions

Note that response_time can never be higher than the configured timeout value.

Examples:

[
  {"condition":  "header:X-Test>10", "action":  "..."},
  {"condition":  "body=*Try again in a few minutes*", "action":  "..."},
  {"condition":  "response_time>10s", "action":  "..."},
  {"condition":  "status>500", "action":  "..."},
  {"condition":  "status=404", "action":  "..."},
  {"condition":  "status=40*", "action":  "..."}
]
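
To make the condition syntax in the examples above concrete, here is a small Python sketch of how such strings could be evaluated. It is an assumption-laden illustration, not the project's matcher: it handles only the `=` and `>` operators that appear in the examples, treats `*` as a glob wildcard via `fnmatch`, and expects a hypothetical response dictionary with `status`, `headers`, `body` and `response_time` (in seconds) keys.

```python
import re
from fnmatch import fnmatchcase


def parse_duration(text):
    """Convert strings such as '500ms' or '10s' to seconds (only these units)."""
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unsupported duration: {text!r}")


def matches(condition, response):
    """Return True if the (hypothetical) response dict satisfies the condition."""
    parsed = re.fullmatch(r"([^=>]+)(=|>)(.*)", condition, flags=re.DOTALL)
    if parsed is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    field, op, expected = parsed.groups()

    if field.startswith("header:"):
        actual = response["headers"].get(field[len("header:"):], "")
    else:
        actual = response[field]

    if op == "=":
        # '*' acts as a wildcard, so "40*" matches 400, 404, ...
        return fnmatchcase(str(actual), expected)

    # '>' compares numerically; response_time values carry a unit suffix.
    if field == "response_time":
        return float(actual) > parse_duration(expected)
    return float(actual) > float(expected)


# Hypothetical response used to try out the example conditions.
sample = {
    "status": 404,
    "headers": {"X-Test": "12"},
    "body": "Rate limited. Try again in a few minutes.",
    "response_time": 11.0,
}
```

Note that `fnmatch` also gives special meaning to `?` and `[`, which may or may not match the real behavior.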

Actions

In the event of a temporary network error, should_retry is ignored: the request is always retried unless dont_retry is set.

Note that having too many rules for one host might negatively impact performance (especially the body condition for large requests).

Sample configuration

{
  "addr": "localhost:5050",
  "timeout": "15s",
  "wait": "4s",
  "multiplier": 2.5,
  "retries": 3,
  "hosts": [
    {
      "host": "*",
      "every": "500ms",
      "burst": 25,
      "headers": {
        "User-Agent": "Some user agent for all requests",
        "X-Test": "Will be overwritten"
      }
    },
    {
      "host": "*.reddit.com",
      "every": "2s",
      "burst": 2,
      "headers": {
        "X-Test": "Will overwrite default"
      }
    },
    {
      "host": ".s3.amazonaws.com",
      "every": "2s",
      "burst": 30,
      "rules": [
        {"condition": "status=403", "action": "dont_retry"}
      ]
    }
  ]
}
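
As a worked example of the retry settings in the sample configuration, the following Python sketch computes a backoff schedule. It assumes that "wait" is the delay before the first retry, that each later delay is the previous one times "multiplier", and that "retries" is the maximum number of retries. These semantics are inferred from the key names rather than confirmed by the text above, and the function names are hypothetical.

```python
def parse_duration(text):
    """Convert strings such as '500ms' or '4s' to seconds (only these units)."""
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unsupported duration: {text!r}")


def backoff_delays(config):
    """Return the assumed sleep time, in seconds, before each retry."""
    delay = parse_duration(config["wait"])
    delays = []
    for _ in range(config["retries"]):
        delays.append(delay)
        delay *= config["multiplier"]
    return delays


# Values copied from the sample configuration above.
schedule = backoff_delays({"wait": "4s", "multiplier": 2.5, "retries": 3})
print(schedule)  # [4.0, 10.0, 25.0]
```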
