WebRecon
WebRecon is an advanced Open Source Intelligence (OSINT) web reconnaissance tool designed for cybersecurity professionals, penetration testers, and security researchers. It automates the process of gathering intelligence from target websites through comprehensive crawling, data extraction, and analysis.
WebRecon Pro
Advanced OSINT Web Reconnaissance Tool with Relationship Graph Visualization
Overview
WebRecon Pro is an advanced Open Source Intelligence (OSINT) web reconnaissance tool designed for cybersecurity professionals, penetration testers, and security researchers. This major update introduces comprehensive relationship graph visualization, tabular reporting, enhanced image downloading, and sophisticated data correlation capabilities.
Unlike traditional reconnaissance tools, WebRecon Pro focuses on the relationships between discovered entities (emails, social media profiles, technologies, domains, and people) and renders them as an interactive visual map of an organization's digital footprint.
WebRecon Features
- Relationship Graph Visualization: Interactive network graphs showing connections between entities
- Comprehensive Tabular Reporting: Structured data tables with source tracking
- Enhanced Image Intelligence: Smart image downloading with metadata extraction
- Data Correlation Engine: Intelligent relationship discovery between findings
- Multi-format Export: JSON, HTML, CSV, and interactive visualizations
Enhanced Capabilities
- Advanced False Positive Filtering: Improved email and image pattern detection
- Source Tracking: Every finding traced back to its source URL
- Interactive HTML Reports: Browser-based exploration of findings
- Network Analysis Metrics: Centrality, clustering, and relationship strength calculations
- Professional Output: Enterprise-ready reports and visualizations
Key Features
Intelligent Web Crawling
- Configurable depth and breadth crawling (1-5 levels)
- Robots.txt and sitemap.xml parsing
- JavaScript source extraction
- Login page detection
- Cloud storage discovery (AWS S3, Azure, GCP)
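As a rough illustration of depth- and page-limited crawling, the sketch below walks links breadth-first and stops at the configured limits. It is not WebRecon's actual crawler: `crawl` and the `get_links` callable are hypothetical names, and `get_links` stands in for whatever fetches a page and returns its hrefs.

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def crawl(start_url, get_links, max_pages=100, max_depth=2):
    """Breadth-first crawl bounded by page count and link depth.

    get_links is a hypothetical callable: URL -> iterable of hrefs on that page.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for href in get_links(url):
            link = urljoin(url, href).split("#", 1)[0]  # resolve and drop fragments
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Usage with an in-memory link map instead of real HTTP requests.
site = {
    "https://example.com/": ["/a", "/b", "https://other.org/x"],
    "https://example.com/a": ["/c"],
    "https://example.com/c": ["/d"],
}
pages = crawl("https://example.com/", lambda u: site.get(u, []), max_depth=2)
```

Breadth-first order means a small `max_pages` budget is spent on pages closest to the start URL first.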
Advanced Email Harvesting
- Intelligent pattern matching with false positive filtering
- Domain-based email grouping
- Source URL tracking for each email
- Image filename exclusion
- Corporate vs personal email classification
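The harvesting steps listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the tool's code: the regular expression, the `IMAGE_EXTS` tuple, the `FREE_PROVIDERS` set, and the function names are all illustrative choices.

```python
import re
from collections import defaultdict

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
# Assumed list of free mail providers used for the corporate/personal split.
FREE_PROVIDERS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def harvest_emails(pages):
    """pages maps source URL -> page text.

    Returns {domain: {email: set of source URLs}}.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for url, text in pages.items():
        for match in EMAIL_RE.findall(text):
            email = match.lower()
            if email.endswith(IMAGE_EXTS):
                continue  # "logo@2x.png" is an image filename, not an address
            domain = email.split("@", 1)[1]
            grouped[domain][email].add(url)
    return grouped


def classify_domain(domain):
    """Label a mail domain as personal (free provider) or corporate."""
    return "personal" if domain in FREE_PROVIDERS else "corporate"
```

Keeping the set of source URLs per address is what allows each finding to be traced back to the page where it appeared.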
Social Media Intelligence
- Platform-specific pattern matching (LinkedIn, Twitter, Facebook, etc.)
- Username extraction and correlation
- Profile validation to avoid share buttons and widgets
- Organizational vs personal profile detection
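One way to picture platform-specific matching with share-button filtering is sketched below. The patterns and the `NON_PROFILE_PATHS` set are assumptions made for illustration; WebRecon's real patterns are not shown in this README.

```python
import re

# Illustrative patterns only; each captures the first path segment as the username.
PROFILE_PATTERNS = {
    "twitter": re.compile(r"https?://(?:www\.)?twitter\.com/([A-Za-z0-9_]+)"),
    "linkedin": re.compile(r"https?://(?:[a-z]+\.)?linkedin\.com/(?:in|company)/([A-Za-z0-9_-]+)"),
    "facebook": re.compile(r"https?://(?:www\.)?facebook\.com/([A-Za-z0-9.]+)"),
}
# Path segments that belong to share buttons and widgets rather than profiles.
NON_PROFILE_PATHS = {"share", "sharer", "intent", "home", "login", "plugins", "dialog"}


def extract_profiles(html):
    """Return a set of (platform, username) pairs found in the markup."""
    found = set()
    for platform, pattern in PROFILE_PATTERNS.items():
        for username in pattern.findall(html):
            if username.lower() not in NON_PROFILE_PATHS:
                found.add((platform, username.lower()))
    return found
```

Lowercasing usernames makes it easier to correlate the same handle across platforms later.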
Image Intelligence Suite
- Smart Image Downloading: Filters placeholders and icons
- Metadata Extraction: EXIF data, dimensions, file types
- Thumbnail Generation: Automatic resizing for analysis
- HTML Gallery Creation: Visual browsing of collected images
- Size Filtering: Configurable minimum/maximum file sizes
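The placeholder and size filters might be combined along these lines. The filename keywords and the byte thresholds are assumptions chosen for the example, and `should_download` is a hypothetical helper, not a WebRecon function.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed keywords that mark decorative images; matched only as whole name parts
# so that "silicon-valley.jpg" is not mistaken for an icon.
SKIP_NAME_RE = re.compile(
    r"(?:^|[-_.])(placeholder|spacer|sprite|favicon|icon|blank)(?:[-_.@]|$)", re.I
)


def should_download(url, size_bytes, min_bytes=5_000, max_bytes=10_000_000):
    """Return True if the image passes the name and size filters."""
    filename = PurePosixPath(urlparse(url).path).name
    if SKIP_NAME_RE.search(filename):
        return False  # looks like an icon or placeholder
    return min_bytes <= size_bytes <= max_bytes
```

In practice the size would typically come from a `Content-Length` header or the downloaded bytes, which is outside the scope of this sketch.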
Technology Stack Detection
- 50+ technology patterns (CMS, frameworks, servers, analytics)
- Header and content-based detection
- Marketing tag identification (GA, GTM, Facebook Pixel)
- CDN and hosting provider detection
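Header- and content-based detection can be thought of as two signature tables checked against a response. The handful of signatures below is an illustrative subset chosen for this sketch, not the tool's 50+ pattern set, and `detect_technologies` is a hypothetical name.

```python
import re

# technology -> (lowercased header name, pattern applied to the header value)
HEADER_SIGNATURES = {
    "nginx": ("server", re.compile(r"nginx", re.I)),
    "Cloudflare": ("server", re.compile(r"cloudflare", re.I)),
    "PHP": ("x-powered-by", re.compile(r"php", re.I)),
}
# technology -> pattern applied to the page body
CONTENT_SIGNATURES = {
    "WordPress": re.compile(r"/wp-content/|/wp-includes/"),
    "Google Analytics": re.compile(r"google-analytics\.com/(?:analytics|ga)\.js"),
    "Google Tag Manager": re.compile(r"googletagmanager\.com/gtm\.js"),
    "Facebook Pixel": re.compile(r"connect\.facebook\.net/[^\"']+/fbevents\.js"),
}


def detect_technologies(headers, html):
    """Return a sorted list of technologies matched in headers or markup."""
    normalized = {key.lower(): value for key, value in headers.items()}
    found = set()
    for name, (header, pattern) in HEADER_SIGNATURES.items():
        if pattern.search(normalized.get(header, "")):
            found.add(name)
    for name, pattern in CONTENT_SIGNATURES.items():
        if pattern.search(html):
            found.add(name)
    return sorted(found)
```

Normalizing header names to lowercase avoids missing matches when servers vary the capitalization.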
DNS & Network Intelligence
- Comprehensive DNS record enumeration (A, MX, TXT, NS, CNAME)
- Domain IP resolution and reverse DNS lookup
- Subdomain discovery from crawled content
- Automated DNSDumpster browser integration
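Subdomain discovery from crawled content needs no DNS queries at all: it can be approximated by scanning page text for hostnames that end in the target domain. The sketch below assumes a simple regular-expression approach; `find_subdomains` is a hypothetical helper.

```python
import re


def find_subdomains(text, root_domain):
    """Return sorted, lowercased hostnames in text that are subdomains of root_domain."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9-])"                 # do not start in the middle of a label
        r"((?:[A-Za-z0-9-]+\.)+" + re.escape(root_domain) + r")"
        r"(?![A-Za-z0-9-])"                  # reject example.community
        r"(?!\.[A-Za-z0-9])",                # reject www.example.com.evil.org
        re.I,
    )
    return sorted({match.lower() for match in pattern.findall(text)})
```

The two lookaheads keep a sentence-ending period from hiding a match while still rejecting look-alike hosts that merely contain the target domain.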
Document & File Discovery
- File type detection (PDF, DOC, XLS, PPT, CSV, etc.)
- Configuration file discovery (.config, .conf, .ini)
- Log file identification (.log)
- Database file detection (.sql)
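Classifying discovered URLs by extension is one straightforward way to implement the categories above. In this sketch the category names and the extra Office extensions (`.docx`, `.xlsx`, `.pptx`) are assumptions added for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

FILE_CATEGORIES = {
    "document": {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".csv"},
    "config": {".config", ".conf", ".ini"},
    "log": {".log"},
    "database": {".sql"},
}


def classify_url(url):
    """Return the file category for a URL, or None if the extension is not tracked."""
    # Use only the path so query strings such as ?dl=1 do not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    for category, extensions in FILE_CATEGORIES.items():
        if suffix in extensions:
            return category
    return None
```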
Relationship Graph System
- Interactive Network Visualization: Drag, zoom, explore relationships
- Entity Categorization: Automatic classification of nodes
- Intelligent Relationship Discovery: Same-domain emails, shared usernames, etc.
- Network Metrics: Centrality, density, clustering coefficients
- Export Formats: HTML interactive, JSON data, analysis reports
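To make relationship discovery and the network metrics concrete, here is a dependency-free sketch that links emails sharing a domain and profiles sharing a username, then computes degree centrality and density. The function names and edge labels are hypothetical; the dependency list suggests the tool itself relies on networkx, whose `degree_centrality` and `density` functions compute the same quantities.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(emails, profiles):
    """emails: list of addresses; profiles: list of (platform, username).

    Returns (nodes, edges) where edges maps a node pair to a relationship label.
    """
    edges = {}
    by_domain = defaultdict(list)
    for email in emails:
        by_domain[email.split("@", 1)[1]].append(email)
    for group in by_domain.values():
        for a, b in combinations(sorted(group), 2):
            edges[(a, b)] = "same_email_domain"
    by_user = defaultdict(list)
    for platform, user in profiles:
        by_user[user.lower()].append(f"{platform}:{user}")
    for group in by_user.values():
        for a, b in combinations(sorted(group), 2):
            edges[(a, b)] = "same_username"
    nodes = set(emails) | {f"{platform}:{user}" for platform, user in profiles}
    return nodes, edges


def degree_centrality(nodes, edges):
    """Fraction of the other nodes that each node is connected to."""
    degree = dict.fromkeys(nodes, 0)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    return {node: (d / (n - 1) if n > 1 else 0.0) for node, d in degree.items()}


def density(nodes, edges):
    """Ratio of existing edges to all possible edges in an undirected graph."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0
```

Isolated nodes get a centrality of zero, which is a quick way to spot findings that have not yet been correlated with anything else.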
Comprehensive Reporting
- Tabular Data Presentation: Organized category-based tables
- Source Tracking: Every finding linked to its discovery URL
- Multi-format Export: JSON, HTML, CSV, Text
- Executive Summaries: High-level overviews with statistics
- Detailed Findings: Complete data with context and sources
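Source tracking and multi-format export fit together naturally when every finding is stored as a record that carries its discovery URL. The record layout (`category`, `value`, `source`) and the `export_findings` helper below are assumptions for illustration, not WebRecon's actual report schema.

```python
import csv
import io
import json


def export_findings(findings):
    """Serialize findings to JSON and CSV text.

    findings is a list of dicts with "category", "value", and "source" keys.
    Returns (json_text, csv_text).
    """
    json_text = json.dumps(findings, indent=2)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["category", "value", "source"])
    writer.writeheader()
    writer.writerows(findings)
    return json_text, buffer.getvalue()
```

Because both formats are generated from the same records, the source URL column stays consistent across every export.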
Relationship Graph Visualization
Graph Features
- Interactive HTML Graphs: Drag nodes, zoom, hover for details
- Entity Categories: Color-coded nodes (emails, domains, social, tech, etc.)
- Intelligent Layout: Force-directed graph algorithms
- Relationship Types: Different line styles for different connections
- Network Analysis: Metrics and insights about the discovered network
Node Categories & Colors
- 🔵 Domains: Target and related domains
- 🔴 Emails: Discovered email addresses
- 🟢 Social Media: Profiles and accounts
- 🟡 IP Addresses: Network infrastructure
- 🟣 People/Organizations: WHOIS and contact information
- 🟠 Documents/Files: Discovered files
- 🔶 Technologies: Detected tech stack
- ⚫ URLs: Web pages and endpoints
Relationship Types
- Solid Blue Lines: Direct domain relationships
- Red Lines: Same email domain connections
- Purple Lines: Same username across platforms
- Dashed Gray Lines: Found-on-page relationships
- Green Dashed Lines: Technology usage relationships
Graph Output Files
webrecon_output/graphs/
├── relationship_graph_DOMAIN_TIMESTAMP.html # Interactive graph
├── graph_data_DOMAIN_TIMESTAMP.json # Raw graph data
└── graph_analysis_report.txt # Network metrics
Output Structure
Default Output Directory
webrecon_output/
├── comprehensive_report_TIMESTAMP.json # Complete JSON data
├── comprehensive_report_TIMESTAMP.html # Interactive HTML report
├── comprehensive_report_TIMESTAMP.txt # Text summary
├── images_DOMAIN_TIMESTAMP/ # Downloaded images
│ ├── raw/ # Original images
│ ├── thumbnails/ # Resized thumbnails
│ ├── extracted/ # Metadata and extracted data
│ ├── gallery.html # Image gallery
│ ├── metadata.json # Image metadata
│ └── images_summary.csv # CSV summary
├── graphs/ # Relationship graphs
│ ├── relationship_graph_DOMAIN_TIMESTAMP.html
│ ├── graph_data_DOMAIN_TIMESTAMP.json
│ └── graph_analysis_report.txt
└── webrecon_DOMAIN_TIMESTAMP.json # Legacy JSON format
Installation
Prerequisites
- Python 3.8 or higher
- pip package manager
- 500MB+ free disk space (for images and graphs)
Quick Installation
Direct Download using Wget
wget -O WebRecon.py https://gist.githubusercontent.com/techenthusiast167/47c8f8c94a520c8d96a1495b7c9a1fcb/raw/42e8dacea162de67c8377682aea3907349aa6c9d/WebRecon.py
Make Executable (Optional)
chmod +x WebRecon.py
Install Dependencies
pip install requests beautifulsoup4 colorama tabulate tldextract dnspython python-whois pillow networkx pyvis lxml html5lib pysocks urllib3
Note: For the most stable installation, it is highly recommended to use a Python virtual environment. This prevents conflicts with your system's global Python packages.
Verification
Test installation
python3 WebRecon.py --help
Expected output should show features including:
--no-graphs          Disable relationship graph generation
--table-only         Display only tabular output
--detailed-tables    Show detailed tables for all categories
Usage Examples
Basic Usage
Comprehensive reconnaissance with all features
python3 WebRecon.py https://example.com
With custom output file
python3 WebRecon.py https://example.com --output ./my_report.json
Limited crawling
python3 WebRecon.py https://example.com --max-pages 50 --max-depth 2
Advanced Reconnaissance
Enterprise reconnaissance with full graph visualization
python3 WebRecon.py https://target-company.com --max-pages 200 --max-depth 3
Stealth reconnaissance through Tor
python3 WebRecon.py https://target.com --proxy socks5://127.0.0.1:9050
Technology-focused reconnaissance
python3 WebRecon.py https://tech-company.com --no-images --no-dnsdumpster
Feature Control
Disable specific modules
python3 WebRecon.py https://example.com \
  --no-images \
  --no-graphs \
  --no-dns \
  --no-whois \
  --no-wayback \
  --no-builtwith \
  --no-dnsdumpster
Each flag disables one module:
--no-images          Disable image downloading
--no-graphs          Disable relationship graphs
--no-dns             Disable DNS reconnaissance
--no-whois           Disable WHOIS lookup
--no-wayback         Disable Wayback Machine
--no-builtwith       Disable BuiltWith analysis
--no-dnsdumpster     Disable DNSDumpster
Table-only output mode
python3 WebRecon.py https://example.com --table-only
Detailed tabular output
python3 WebRecon.py https://example.com --detailed-tables
Output Customization
Custom proxy configuration
python3 WebRecon.py https://example.com --proxy http://proxy:8080
Specific crawl limits
python3 WebRecon.py https://large-site.com --max-pages 500 --max-depth 4
Save to specific location
python3 WebRecon.py https://example.com --output /path/to/report.json
Command Line Arguments
Basic Arguments
| Argument | Description | Default |
|----------|-------------|---------|
| url | Target URL for reconnaissance | Required |
| --max-pages | Maximum pages to crawl | 100 |
| --max-depth | Maximum crawl depth | 2 |
| --output | Custom output file path | Auto-generated |
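For readers who want to script around these options, the arguments and defaults in the table map onto `argparse` roughly as follows. This is a sketch reconstructed from the README, not the parser shipped in WebRecon.py; `build_parser` is a hypothetical name, and the `--proxy` and `--no-*` flags are taken from the usage examples.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="WebRecon.py")
    parser.add_argument("url", help="Target URL for reconnaissance")
    parser.add_argument("--max-pages", type=int, default=100, help="Maximum pages to crawl")
    parser.add_argument("--max-depth", type=int, default=2, help="Maximum crawl depth")
    parser.add_argument("--output", default=None, help="Custom output file path")
    parser.add_argument("--proxy", default=None, help="Proxy URL, e.g. socks5://127.0.0.1:9050")
    # Module switches seen in the usage examples; each defaults to False (module enabled).
    for module in ("images", "graphs", "dns", "whois", "wayback", "builtwith", "dnsdumpster"):
        parser.add_argument(f"--no-{module}", action="store_true", help=f"Disable {module}")
    return parser


args = build_parser().parse_args(["https://example.com", "--max-depth", "3", "--no-images"])
```

Leaving `--output` as `None` by default lets the caller fall back to an auto-generated file name, matching the table's "Auto-generated" entry.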
Security Score
Audited on Mar 12, 2026
