Architeuthis
MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
- Strictly obeys configured rate-limiting for each IP & Host
- Seamless exponential backoff retries on timeout or error HTTP codes
- Requires no additional configuration for integration into existing programs
- Configurable per-host behavior
- Monitoring with InfluxDB
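As a rough illustration of the exponential backoff mentioned above, the sketch below computes a retry schedule from the `wait`, `multiplier` and `retries` keys of the sample configuration. The formula (first delay equals `wait`, each later delay is multiplied by `multiplier`) is an assumption made for illustration, and `backoff_delays` is a hypothetical helper, not part of Architeuthis.

```python
def backoff_delays(wait: float, multiplier: float, retries: int) -> list:
    """Return the assumed delay (in seconds) before each retry attempt."""
    # Assumption: delay before retry n (0-based) is wait * multiplier ** n.
    return [wait * multiplier ** n for n in range(retries)]


if __name__ == "__main__":
    # Values taken from the sample configuration: "wait": "4s",
    # "multiplier": 2.5, "retries": 3.
    print(backoff_delays(4.0, 2.5, 3))  # [4.0, 10.0, 25.0]
```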

Typical use case

Usage
git clone https://github.com/simon987/Architeuthis
vim config.json # Configure settings here
docker-compose up
You can add proxies using the /add_proxy API:
curl "http://<Architeuthis IP>:5050/add_proxy?url=<url>&name=<name>"
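The proxy URL passed in the `url` parameter usually contains characters such as `:` and `/`, so it is safer to percent-encode it. The sketch below builds the same `/add_proxy` request from Python. The helper names and the addresses in the usage example are hypothetical. Only the endpoint path and the `url` and `name` parameters come from the command above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_add_proxy_url(architeuthis: str, proxy_url: str, name: str) -> str:
    """Return the /add_proxy URL with both query parameters percent-encoded."""
    query = urlencode({"url": proxy_url, "name": name})
    return f"{architeuthis.rstrip('/')}/add_proxy?{query}"


def add_proxy(architeuthis: str, proxy_url: str, name: str) -> int:
    """Send the GET request (as curl does) and return the HTTP status code."""
    target = build_add_proxy_url(architeuthis, proxy_url, name)
    with urlopen(target, timeout=10) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only; nothing is sent here.
    print(build_add_proxy_url("http://localhost:5050", "http://10.0.0.2:8080", "p0"))
```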
Or automatically using Proxybroker:
python3 import_from_broker.py http://<Architeuthis IP>:5050
Example usage with wget
export http_proxy="http://localhost:5050"
# --no-check-certificate is necessary for HTTPS MITM
# You don't need to specify a user agent if it's already in your config.json
wget -m -np -c --no-check-certificate -R "index.html*" http://ca.releases.ubuntu.com/
With "every": "500ms" and a single proxy, you should see log output similar to:
...
level=trace msg=Sleeping wait=414.324437ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA1SUMS.gpg"
level=trace msg=Sleeping wait=435.166127ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA256SUMS"
level=trace msg=Sleeping wait=438.657784ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/SHA256SUMS.gpg"
level=trace msg=Sleeping wait=457.06543ms
level=trace msg="Routing request" conns=0 proxy=p0 url="http://ca.releases.ubuntu.com/12.04/ubuntu-12.04.5-alternate-amd64.iso"
level=trace msg=Sleeping wait=433.394361ms
...
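The `Sleeping wait=...` values in the log stay just under 500 ms because each request only has to wait for the remainder of the current interval. A minimal token-bucket sketch shows the idea, assuming `every` is the refill interval and `burst` is the bucket capacity. This is an illustration of the general technique, not the project's actual limiter, and the `TokenBucket` class is hypothetical.

```python
class TokenBucket:
    """Token bucket: one token every `every` seconds, at most `burst` stored."""

    def __init__(self, every: float, burst: int, now: float = 0.0):
        self.every = every
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def reserve(self, now: float) -> float:
        """Take one token and return how long the caller must sleep first."""
        # Refill according to the time elapsed since the previous call.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed / self.every)
        self.last = now
        self.tokens -= 1.0
        if self.tokens >= 0:
            return 0.0
        # Negative balance: sleep until the borrowed token has been refilled.
        return -self.tokens * self.every


if __name__ == "__main__":
    bucket = TokenBucket(every=0.5, burst=1)
    bucket.reserve(now=0.0)          # first request goes through immediately
    print(bucket.reserve(now=0.1))   # second request sleeps for about 0.4 s
```

The timestamps are passed in explicitly so the behavior can be checked without real sleeping. A live client would pass `time.monotonic()` and sleep for the returned duration.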
Hot config reload
# Note: this resets the current rate limiters. If there are many active
# connections, this might cause a small request spike that exceeds
# the rate limits.
./reload.sh
Rules
Conditions
| Left operand | Description | Allowed operators | Right operand
| :--- | :--- | :--- | :---
| body | Contents of the response | =, != | String w/ wildcard
| body | Contents of the response | <, > | float
| status | HTTP response code | =, != | String w/ wildcard
| status | HTTP response code | <, > | float
| response_time | HTTP response time | <, > | duration (e.g. 20s)
| header:<header> | Response header | =, != | String w/ wildcard
| header:<header> | Response header | <, > | float
Note that response_time can never be higher than the configured timeout value.
Examples:
[
{"condition": "header:X-Test>10", "action": "..."},
{"condition": "body=*Try again in a few minutes*", "action": "..."},
{"condition": "response_time>10s", "action": "..."},
{"condition": "status>500", "action": "..."},
{"condition": "status=404", "action": "..."},
{"condition": "status=40*", "action": "..."}
]
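To make the condition syntax concrete, here is a small evaluator sketch for strings such as the examples above. It assumes `*` wildcards behave like shell globs (via Python's `fnmatch`), that header lookups are case-sensitive, and that durations use only the `ms`, `s` or `m` suffixes. These are simplifications for illustration. The function names are hypothetical and the real matching rules may differ.

```python
import re
from fnmatch import fnmatchcase

_CONDITION = re.compile(r"^(?P<left>[^=!<>]+?)(?P<op>!=|=|<|>)(?P<right>.*)$", re.S)
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_duration(text: str) -> float:
    """Convert a duration such as '20s' or '500ms' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def evaluate(condition: str, status: int, body: str,
             headers: dict, response_time: float) -> bool:
    """Return True if the response satisfies the condition string."""
    match = _CONDITION.match(condition)
    if match is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    left, op, right = match.group("left"), match.group("op"), match.group("right")

    # Select the left operand from the response.
    if left == "status":
        value = str(status)
    elif left == "body":
        value = body
    elif left == "response_time":
        value = response_time  # seconds
    elif left.startswith("header:"):
        value = headers.get(left[len("header:"):], "")
    else:
        raise ValueError(f"unknown left operand: {left!r}")

    # = and != compare against a wildcard string.
    if op in ("=", "!="):
        matched = fnmatchcase(str(value), right)
        return matched if op == "=" else not matched

    # < and > compare numerically (durations for response_time).
    threshold = parse_duration(right) if left == "response_time" else float(right)
    try:
        number = float(value)
    except ValueError:
        return False  # a non-numeric operand never satisfies < or >
    return number < threshold if op == "<" else number > threshold
```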
Actions
| Action | Description
| :--- | :---
| should_retry | Override default retry behavior for HTTP errors (by default it retries on 403, 408, 429, 444, 499, >500)
| force_retry | Always retry (up to retries_hard times)
| dont_retry | Immediately stop retrying
In the event of a temporary network error, should_retry is ignored: the request is always retried unless dont_retry is set.
Note that having too many rules for one host might negatively impact performance (especially the body condition for large responses).
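The interaction between the default behavior, the three actions and network errors can be summarized as a decision function. The sketch below is one reading of the table and note above, not the project's code: the attempt counting, the treatment of `retries` as the limit for ordinary retries, and the name `retry_decision` are all assumptions.

```python
from typing import Optional

# Status codes retried by default, as listed in the actions table.
DEFAULT_RETRY_STATUSES = {403, 408, 429, 444, 499}


def default_should_retry(status: int) -> bool:
    """Default rule: retry on the listed codes and on anything above 500."""
    return status in DEFAULT_RETRY_STATUSES or status > 500


def retry_decision(status: Optional[int], action: Optional[str],
                   attempt: int, retries: int, retries_hard: int) -> bool:
    """Decide whether to retry.

    status is None for a temporary network error; action is the action of
    the matching rule (or None); attempt is the number of retries already made.
    """
    if action == "dont_retry":
        return False                   # stop immediately, even on network errors
    if action == "force_retry":
        return attempt < retries_hard  # always retry, up to the hard limit
    if attempt >= retries:
        return False                   # assumed: ordinary retries are capped by `retries`
    if status is None:
        return True                    # network error: should_retry is not needed
    if action == "should_retry":
        return True                    # rule overrides the default for this response
    return default_should_retry(status)
```

For example, with `retries=3` and a hypothetical `retries_hard=6`, a 404 is not retried by default, but is retried when a matching rule sets `should_retry`.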
Sample configuration
{
  "addr": "localhost:5050",
  "timeout": "15s",
  "wait": "4s",
  "multiplier": 2.5,
  "retries": 3,
  "hosts": [
    {
      "host": "*",
      "every": "500ms",
      "burst": 25,
      "headers": {
        "User-Agent": "Some user agent for all requests",
        "X-Test": "Will be overwritten"
      }
    },
    {
      "host": "*.reddit.com",
      "every": "2s",
      "burst": 2,
      "headers": {
        "X-Test": "Will overwrite default"
      }
    },
    {
      "host": ".s3.amazonaws.com",
      "every": "2s",
      "burst": 30,
      "rules": [
        {"condition": "status=403", "action": "dont_retry"}
      ]
    }
  ]
}
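The `X-Test` values in the sample configuration suggest that headers from a more specific host entry replace those of the `*` entry. The sketch below shows one way such merging could work, assuming glob-style host matching and that entries are applied in file order, with later matches winning. Both points are assumptions, and `effective_headers` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def effective_headers(hosts: list, hostname: str) -> dict:
    """Merge the headers of every host entry whose pattern matches hostname."""
    merged = {}
    for entry in hosts:
        # Assumption: "host" is a glob pattern and later entries win.
        if fnmatchcase(hostname, entry["host"]):
            merged.update(entry.get("headers", {}))
    return merged


if __name__ == "__main__":
    # First two host entries from the sample configuration.
    hosts = [
        {"host": "*", "headers": {
            "User-Agent": "Some user agent for all requests",
            "X-Test": "Will be overwritten"}},
        {"host": "*.reddit.com", "headers": {
            "X-Test": "Will overwrite default"}},
    ]
    print(effective_headers(hosts, "www.reddit.com"))
    print(effective_headers(hosts, "ca.releases.ubuntu.com"))
```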
