Samoscout
All-in-one, LLM-powered passive and active subdomain enumeration tool
samoscout is an orchestrated subdomain discovery pipeline that combines passive reconnaissance, active enumeration, deep-level permutation, and neural network-based prediction. It is implemented natively in Go, and its passive sources need no external binaries.
Features
Core Engine
- Multi-phase workflow orchestrator with concurrent execution
- YAML configuration system with runtime reload capability
- HTTP client pool with connection reuse and retry mechanisms
- PostgreSQL database for subdomain tracking
- Optional Elasticsearch indexing for HTTPX results (status, headers, html, tech)
Passive Reconnaissance
- 53 native API integrations without external binary dependencies
- Concurrent source execution with configurable timeouts
- Result deduplication and validation pipeline
- Source-specific rate limiting and error handling
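The passive phase described above can be pictured with a short Python sketch. It is an illustration only, not samoscout's Go implementation: `run_sources`, `normalize`, and the shape of the `sources` mapping are hypothetical, and per-source timeouts are assumed to be enforced inside each source callable.

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed

# Loose hostname check: dot-separated labels of letters, digits, and hyphens,
# where no label starts or ends with a hyphen.
_HOSTNAME = re.compile(
    r"^(?!-)[a-z0-9-]{1,63}(?<!-)(\.(?!-)[a-z0-9-]{1,63}(?<!-))*$"
)


def normalize(name, domain):
    """Return a cleaned subdomain of `domain`, or None if it is not usable."""
    name = name.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]  # drop a leading wildcard label from certificate data
    if name != domain and not name.endswith("." + domain):
        return None  # out of scope for this target
    return name if _HOSTNAME.match(name) else None


def run_sources(sources, domain):
    """Query every source concurrently and merge the validated results.

    `sources` maps a label to a callable taking the domain and returning
    an iterable of raw names (a hypothetical interface).
    """
    found = set()
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(fn, domain) for fn in sources.values()]
        for future in as_completed(futures):
            try:
                names = future.result()
            except Exception:
                continue  # one failing source must not abort the scan
            for raw in names:
                clean = normalize(raw, domain)
                if clean:
                    found.add(clean)  # the set removes duplicates across sources
    return sorted(found)
```

Collecting into a set after normalization means that `API.example.com` from one source and `api.example.com.` from another count as the same result.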
Active Enumeration
- Wordlist-based subdomain generation and permutation
- Custom wordlist support with automatic fallback to six2dez default
- Multi-level dsieve algorithm with Trickest wordlists
- DNS resolution via puredns with wildcard detection
- Rate-limited query execution with trusted resolver pools
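To make the wordlist and permutation step concrete, here is a deliberately simplified Python sketch. It does not reproduce the algorithms of dsieve, mksub, or any other tool named above; `generate_candidates` and its three permutation patterns are assumptions chosen for illustration.

```python
def generate_candidates(domain, known, words):
    """Build brute-force and permutation candidates for `domain`.

    `known` holds already discovered hostnames, `words` is the wordlist.
    """
    candidates = set()
    for word in words:
        candidates.add(f"{word}.{domain}")  # plain wordlist brute force

    suffix = "." + domain
    for host in known:
        if not host.endswith(suffix):
            continue  # ignore names outside the target domain
        prefix = host[: -len(suffix)]          # e.g. "api" or "v1.api"
        first, _, rest = prefix.partition(".")
        tail = f".{rest}" if rest else ""
        for word in words:
            candidates.add(f"{word}.{prefix}{suffix}")       # one level deeper
            candidates.add(f"{word}-{first}{tail}{suffix}")  # word before label
            candidates.add(f"{first}-{word}{tail}{suffix}")  # word after label

    # Names that are already known do not need to be resolved again.
    return sorted(candidates - set(known))
```

With `known=["api.example.com"]` and `words=["dev"]`, the candidates are `dev.example.com`, `dev.api.example.com`, `dev-api.example.com`, and `api-dev.example.com`. Every candidate still has to pass DNS resolution before it counts as a result.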
Machine Learning Prediction
- PyTorch transformer model for subdomain prediction
- Iterative refinement with validated result feedback
- Configurable generation parameters and recursion depth
- Cross-platform inference via stdin/stdout IPC
Passive Sources
AlienVault, Anubis, BeVigil, BufferOver, BuiltWith, C99, CeBaidu, Censys, CertSpotter, Chaos, Chinaz, Cloudflare, CommonCrawl, crt.sh, DigiCert, DigitalYama, DigiTorus, DNSDB, DNSDumpster, DNSGrep, DNSRepo, DriftNet, FOFA, FullHunt, GitLab, HackerTarget, HudsonRock, Hunter, JSMon, MySSL, Netcraft, Netlas, PugRecon, Quake, RapidDNS, ReconCloud, RedHuntLabs, Robtex, RSECloud, SecurityTrails, Shodan, ShrewdEye, SiteDossier, SubdomainCenter, ThreatBook, ThreatCrowd, ThreatMiner, URLScan, VirusTotal, WaybackArchive, WhoisXMLAPI, WindVane, ZoomEye
Technical Implementation
DNS Resolution Integration
puredns Configuration
- Resolver sources: Trickest community + trusted, public-dns.info
- Resolver count: ~73,000 aggregated resolvers
- Trusted resolvers: 31 high-reliability servers
- Rate limiting: 100 queries/sec (normal), 100 queries/sec (trusted)
- Wildcard detection: 30 tests, 1M batch size
- Resolution modes: resolve (standard), bruteforce (wordlist-based)
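The idea behind wildcard detection can be sketched in a few lines of Python. This is a conceptual illustration and not puredns's actual algorithm, which is more involved. The `resolve` callable (hostname in, list of IP strings out) and both function names are hypothetical, and the default of 30 probes simply mirrors the figure quoted above.

```python
import secrets


def wildcard_answers(parent, resolve, tests=30):
    """Resolve random labels under `parent`.

    Returns the union of all answers if every probe resolves, which
    suggests a wildcard record. Returns an empty set otherwise.
    """
    seen = set()
    for _ in range(tests):
        probe = f"{secrets.token_hex(8)}.{parent}"  # name that should not exist
        ips = set(resolve(probe))
        if not ips:
            return set()  # a random name failed to resolve: no wildcard
        seen |= ips
    return seen


def filter_wildcards(hosts, parent, resolve, tests=30):
    """Keep hosts that resolve to at least one non-wildcard address."""
    wild = wildcard_answers(parent, resolve, tests)
    kept = []
    for host in hosts:
        ips = set(resolve(host))
        if ips and not ips <= wild:
            kept.append(host)
    return kept
```

Passing the resolver in as a function keeps the sketch independent of any DNS library and makes it easy to exercise with canned answers.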
LLM Inference Architecture
Model Specifications
- Architecture: GPT transformer
- Source: HuggingFace HadrianSecurity/subwiz
- Context window: 1024 tokens
- Tokenization: Comma-separated subdomain list + [DELIM] token
- Decoding: Beam search with top-N sampling
- Temperature: Configurable (default: 0.0 for deterministic output)
Model Development
A custom model, trained and fine-tuned specifically for samoscout, is currently in development. It will be significantly larger (GB+) than the current general-purpose model and is intended to give more consistent and accurate results for subdomain discovery workflows. Because of its higher resource requirements, users will be able to choose between:
- Light Model: Current general-purpose model (lower memory footprint)
- Heavy Model: Custom project-specific model (higher accuracy, larger size)
Iterative Refinement
Iteration 1: Seed with passive results → predictions → validate
Iteration 2: Seed with (passive + validated) → predictions → validate
...
Iteration N: Continue until max_recursion or no new results
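The loop above can be written as a minimal Python sketch. The names `refine`, `predict`, and `validate` are hypothetical stand-ins for the model inference and DNS validation steps. Only the control flow (feeding validated results back into the seed and stopping at `max_recursion` or when nothing new appears) is taken from the description.

```python
def refine(seed, predict, validate, max_recursion=5):
    """Run predict -> validate rounds and return the newly confirmed names.

    `predict` takes the known names and returns candidate names.
    `validate` takes candidates and returns the ones that actually resolve.
    """
    known = set(seed)
    discovered = set()
    for _ in range(max_recursion):
        candidates = set(predict(sorted(known))) - known
        confirmed = set(validate(candidates)) & candidates
        if not confirmed:
            break  # no new results: stop before reaching max_recursion
        discovered |= confirmed
        known |= confirmed  # validated names seed the next iteration
    return sorted(discovered)
```

Because only validated names are fed back, a bad prediction in one round cannot pollute the seed of the next.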
Install
Quick Install
go install github.com/samogod/samoscout@latest
From Source
git clone https://github.com/samogod/samoscout.git
cd samoscout
go mod download
go build -o samoscout main.go
Deep Enumeration Workflow
Usage
samoscout -h
This will display help for the tool. Here are all the switches it supports.
Usage:
  samoscout [flags]
  samoscout [command]

Available Commands:
  track       Query subdomain tracking database
  update      Update samoscout to the latest version
  version     Show version information
  help        Help about any command

Flags:
INPUT:
   -d, -domain string      target domain to enumerate
   -dL, -list string       file containing list of domains to enumerate

SOURCE:
   -s, -sources string     comma-separated list of sources to use (e.g., 'subdomaincenter,shrewdeye')
   -es string              comma-separated list of sources to exclude (e.g., 'alienvault,zoomeyeapi')

ACTIVE ENUMERATION:
   -active                 enable active subdomain enumeration (wordlist + dsieve + mksub)
   -w, -wordlist string    custom wordlist path for active enumeration (default: six2dez wordlist)
   -deep-enum              enable deep level enumeration (dsieve + trickest wordlists)

AI PREDICTION:
   -llm                    enable AI-powered subdomain prediction

HTTP PROBING:
   -httpx                  enable HTTP/HTTPS probing on discovered subdomains

OUTPUT:
   -o, -output string      file to write output to
   -j, -json               write output in JSONL(ines) format
   -silent                 silent mode - no banner or extra output
   -stats                  display source statistics after scan

CONFIGURATION:
   -c, -config string      config file path (default: config/config.yaml)

OPTIMIZATION:
   -v, -verbose            enable verbose/debug output
Running Samoscout
Basic Operations
# Single domain enumeration
samoscout -d example.com
# Multiple domains from file
samoscout -dL domains.txt
# Source selection
samoscout -d example.com -s alienvault,crtsh,virustotal
# Source exclusion
samoscout -d example.com --es alienvault,anubis
# Output formats
samoscout -d example.com -o output.txt
samoscout -d example.com -j > output.json
# Statistics display
samoscout -d example.com --stats
Advanced Enumeration
# Active enumeration (wordlist + dsieve + mksub + gotator)
samoscout -d example.com --active
# Use custom wordlist for active enumeration
samoscout -d example.com --active -w /path/to/wordlist.txt
# Deep enumeration (multi-level dsieve + trickest wordlists)
samoscout -d example.com --active --deep-enum
# LLM-powered prediction
samoscout -d example.com --llm
# HTTP probing
samoscout -d example.com --httpx
# HTTP probing + Elasticsearch (index HTTPX JSONL)
# Ensure elasticsearch.enabled=true in config.yaml
samoscout -d example.com --httpx
# Combined full enumeration
samoscout -d example.com --active --deep-enum --llm --httpx --stats
# Use custom wordlist with full enumeration
samoscout -d example.com --active -w custom-wordlist.txt --deep-enum --llm --httpx
Elasticsearch Integration
Samoscout can stream HTTPX results directly into Elasticsearch for powerful search and analytics across active web services. Each record contains URL, status_code, title, technologies, webserver, content_type, and content_length, enabling rich filtering (e.g., framework, server, status class) and dashboards.
- Enable in config.yaml with elasticsearch.enabled: true
- Provide url, username, password, and optional index (default: samoscout_httpx)
- Run any scan with --httpx; the JSONL output at .samoscout_active/<domain>/httpx_results.json is bulk-indexed automatically
Expected console output on success:
[ES] Indexed httpx_results.json into index '<name>'
You can then create visualizations and saved searches over fields like status_code, title, technologies, and webserver.
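For readers unfamiliar with bulk indexing, the following Python sketch shows how a JSONL file can be turned into a request body for the Elasticsearch `_bulk` endpoint. It illustrates the general technique and is not samoscout's own indexing code. `build_bulk_body` is a hypothetical helper, and letting Elasticsearch assign document IDs is an assumption.

```python
import json


def build_bulk_body(jsonl_text, index="samoscout_httpx"):
    """Convert JSONL records into an Elasticsearch _bulk body (NDJSON).

    Each record becomes two lines: an action line naming the target
    index, followed by the document itself.
    """
    lines = []
    for raw in jsonl_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the input file
        record = json.loads(raw)
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string would be sent as a POST to `<url>/_bulk` with the header `Content-Type: application/x-ndjson`, using the credentials from the config file.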

Database Operations
# Query tracked subdomains
samoscout track example.com
# Filter by status
samoscout track example.com --status new
samoscout track example.com --status dead
samoscout track example.com --status active
# List all tracked domains
samoscout track --all
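One plausible way to derive the three statuses is to compare the previous scan with the current one. The Python sketch below rests on an assumption about their meaning (new: seen for the first time, active: seen before and still present, dead: seen before but now missing). The README does not define them, and `classify` is a hypothetical name.

```python
def classify(previous, current):
    """Assign a tracking status to every hostname from two scans.

    Assumed semantics: "new" = only in the current scan, "active" = in
    both scans, "dead" = only in the previous scan.
    """
    previous, current = set(previous), set(current)
    status = {}
    for host in current - previous:
        status[host] = "new"
    for host in current & previous:
        status[host] = "active"
    for host in previous - current:
        status[host] = "dead"
    return status
```

For example, comparing a previous scan of `a` and `b` with a current scan of `b` and `c` marks `a` as dead, `b` as active, and `c` as new.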
System Operations
# Display version information
samoscout version
# Update to latest release
samoscout update
samoscout update -v
Configuration Reference
Configure API keys and runtime settings in the config file located at $HOME/.config/samoscout/config.yaml.
API Keys Configuration
api_keys:
  virustotal: ""
  chaos: ""
  censys: ""
  securitytrails: ""
  shodan: ""
Runtime Configuration
default_settings:
  timeout: 10                      # Global timeout in minutes

active_enumeration:
  enabled: false                   # Enable active enumeration
  dsieve_top: 50                   # Top N domains for dsieve
  dsieve_factor: 4                 # Permutation depth
  output_dir: ".samoscout_active"  # Workspace directory

llm_enumeration:
  enabled: false
