WebRecon

WebRecon is an advanced Open Source Intelligence (OSINT) web reconnaissance tool designed for cybersecurity professionals, penetration testers, and security researchers. It automates the process of gathering intelligence from target websites through comprehensive crawling, data extraction, and analysis.


WebRecon Pro

Advanced OSINT Web Reconnaissance Tool with Relationship Graph Visualization

Overview

WebRecon Pro is an advanced Open Source Intelligence (OSINT) web reconnaissance tool designed for cybersecurity professionals, penetration testers, and security researchers. This major update introduces comprehensive relationship graph visualization, tabular reporting, enhanced image downloading, and sophisticated data correlation capabilities.

Unlike traditional reconnaissance tools, WebRecon Pro focuses on the relationships between discovered entities (emails, social media profiles, technologies, domains, and people) and builds an interactive visual map of an organization's digital footprint.

WebRecon Features

  • Relationship Graph Visualization: Interactive network graphs showing connections between entities

  • Comprehensive Tabular Reporting: Structured data tables with source tracking

  • Enhanced Image Intelligence: Smart image downloading with metadata extraction

  • Data Correlation Engine: Intelligent relationship discovery between findings

  • Multi-format Export: JSON, HTML, CSV, and interactive visualizations

Enhanced Capabilities

  • Advanced False Positive Filtering: Improved email and image pattern detection

  • Source Tracking: Every finding traced back to its source URL

  • Interactive HTML Reports: Browser-based exploration of findings

  • Network Analysis Metrics: Centrality, clustering, and relationship strength calculations

  • Professional Output: Enterprise-ready reports and visualizations

Key Features

Intelligent Web Crawling

  • Configurable depth and breadth crawling (1-5 levels)

  • Robots.txt and sitemap.xml parsing

  • JavaScript source extraction

  • Login page detection

  • Cloud storage discovery (AWS S3, Azure, GCP)
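
The depth and page limits listed above can be pictured as a bounded breadth-first traversal. The sketch below is illustrative only and is not taken from WebRecon's source: the `LINKS` map is a hypothetical stand-in for fetched pages, and `crawl` is an assumed name.

```python
from collections import deque

# Hypothetical in-memory site: page -> links found on it (stands in for HTTP fetches).
LINKS = {
    "/": ["/about", "/blog"],
    "/about": ["/team"],
    "/blog": ["/blog/post-1", "/"],
    "/team": ["/team/alice"],
}

def crawl(start, max_pages=100, max_depth=2):
    """Breadth-first crawl bounded by page count and link depth."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order
```

With `max_depth=2`, `/team` is visited but its own links are not followed, which is why deeper pages such as `/team/alice` stay out of the result.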

Advanced Email Harvesting

  • Intelligent pattern matching with false positive filtering

  • Domain-based email grouping

  • Source URL tracking for each email

  • Image filename exclusion

  • Corporate vs personal email classification
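
A rough sketch of how these harvesting steps could fit together is shown here. The regular expression, the image-extension list, and the free-mail provider list are all assumptions made for illustration; WebRecon's actual patterns may differ.

```python
import re
from collections import defaultdict

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed asset extensions: names like "logo@2x.png" look like emails to a naive regex.
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
# Assumed, deliberately short list of free-mail providers.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}

def harvest_emails(pages):
    """pages: dict of source URL -> page text.
    Returns domain -> {email: set of source URLs where it was seen}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for url, text in pages.items():
        for match in EMAIL_RE.findall(text):
            email = match.lower()
            if email.endswith(IMAGE_EXTS):
                continue  # image filename exclusion
            domain = email.split("@", 1)[1]
            grouped[domain][email].add(url)
    return grouped

def classify(domain):
    """Label a domain as personal (free-mail) or corporate (everything else)."""
    return "personal" if domain in PERSONAL_DOMAINS else "corporate"
```

Keeping the source URLs in a set per address means the same email found on several pages is stored once, with every page that mentioned it.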

Social Media Intelligence

  • Platform-specific pattern matching (LinkedIn, Twitter, Facebook, etc.)

  • Username extraction and correlation

  • Profile validation to avoid share buttons and widgets

  • Organizational vs personal profile detection
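
The idea of platform-specific matching with share-button filtering can be sketched as follows. The three patterns, the `NON_PROFILE` list, and the rule that a LinkedIn `/company/` path means an organization are assumptions for illustration, not WebRecon's own rules.

```python
import re

# Assumed patterns for three platforms; real profile URL schemes have more variants.
PROFILE_PATTERNS = {
    "linkedin": re.compile(
        r"https?://(?:www\.)?linkedin\.com/(?P<kind>in|company)/(?P<user>[A-Za-z0-9_-]+)", re.I),
    "twitter": re.compile(
        r"https?://(?:www\.)?twitter\.com/(?P<user>[A-Za-z0-9_]+)", re.I),
    "facebook": re.compile(
        r"https?://(?:www\.)?facebook\.com/(?P<user>[A-Za-z0-9.]+)", re.I),
}
# First path segments that usually mark share buttons or widgets (assumed list).
NON_PROFILE = {"share", "intent", "sharer", "plugins", "home", "search", "hashtag"}

def extract_profiles(html):
    """Return a de-duplicated list of (platform, username, owner_type) tuples."""
    found = []
    for platform, pattern in PROFILE_PATTERNS.items():
        for m in pattern.finditer(html):
            user = m.group("user")
            if user.lower() in NON_PROFILE:
                continue  # share link or widget, not a profile
            kind = m.groupdict().get("kind")
            if kind is None:
                owner = "unknown"
            elif kind.lower() == "company":
                owner = "organization"
            else:
                owner = "personal"
            entry = (platform, user, owner)
            if entry not in found:
                found.append(entry)
    return found
```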

Image Intelligence Suite

  • Smart Image Downloading: Filters placeholders and icons

  • Metadata Extraction: EXIF data, dimensions, file types

  • Thumbnail Generation: Automatic resizing for analysis

  • HTML Gallery Creation: Visual browsing of collected images

  • Size Filtering: Configurable minimum/maximum file sizes
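
As a minimal sketch of the placeholder and size filtering described above, the function below decides whether an image is worth downloading. The filename hints and the byte thresholds are assumed values chosen for illustration; the README does not state WebRecon's defaults.

```python
import re

# Assumed heuristics: filename hints that usually mark icons, placeholders, or tracking pixels.
SKIP_HINTS = re.compile(r"(favicon|sprite|placeholder|spacer|pixel|icon)", re.IGNORECASE)

def should_download(url, size_bytes, min_bytes=5_000, max_bytes=10_000_000):
    """Return True when the image passes both the filename and the size filter."""
    name = url.rsplit("/", 1)[-1]
    if SKIP_HINTS.search(name):
        return False
    return min_bytes <= size_bytes <= max_bytes
```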

Technology Stack Detection

  • 50+ technology patterns (CMS, frameworks, servers, analytics)

  • Header and content-based detection

  • Marketing tag identification (GA, GTM, Facebook Pixel)

  • CDN and hosting provider detection
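
Header-based and content-based detection can be illustrated with a small signature table. The six signatures below are assumptions picked for the example; the 50+ patterns the README mentions are not reproduced here.

```python
import re

# Assumed header signatures: technology -> (header name, pattern on its value).
HEADER_SIGNATURES = {
    "nginx": ("server", re.compile(r"nginx", re.I)),
    "Cloudflare": ("server", re.compile(r"cloudflare", re.I)),
    "PHP": ("x-powered-by", re.compile(r"php", re.I)),
}
# Assumed content signatures: technology -> pattern on the page HTML.
CONTENT_SIGNATURES = {
    "WordPress": re.compile(r"/wp-content/|/wp-includes/", re.I),
    "Google Tag Manager": re.compile(r"googletagmanager\.com/gtm\.js", re.I),
    "Google Analytics": re.compile(r"google-analytics\.com/", re.I),
}

def detect_technologies(headers, html):
    """Return the set of technologies whose signature matches headers or HTML."""
    headers = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    found = set()
    for tech, (name, pattern) in HEADER_SIGNATURES.items():
        if pattern.search(headers.get(name, "")):
            found.add(tech)
    for tech, pattern in CONTENT_SIGNATURES.items():
        if pattern.search(html):
            found.add(tech)
    return found
```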

DNS & Network Intelligence

  • Comprehensive DNS record enumeration (A, MX, TXT, NS, CNAME)

  • Domain IP resolution and reverse DNS lookup

  • Subdomain discovery from crawled content

  • Automated DNSDumpster browser integration
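
Subdomain discovery from crawled content needs no DNS queries at all; it can be approximated with a regular expression over page text. This is a sketch under that assumption, with `find_subdomains` as a hypothetical helper name.

```python
import re

def find_subdomains(text, root_domain):
    """Return sorted, lowercased hostnames in text that end with .root_domain.
    The bare root domain is excluded, as are look-alikes such as
    api.example.com.evil.net."""
    label = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
    pattern = re.compile(
        r"(?<![A-Za-z0-9-])"                # not in the middle of a longer label
        r"((?:" + label + r"\.)+" + re.escape(root_domain) + r")"
        r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",  # hostname must end here
        re.IGNORECASE,
    )
    return sorted({m.lower() for m in pattern.findall(text)})
```

The final lookahead still accepts a hostname followed by a sentence-ending period, while rejecting one that continues into another label.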

Document & File Discovery

  • File type detection (PDF, DOC, XLS, PPT, CSV, etc.)

  • Configuration file discovery (.config, .conf, .ini)

  • Log file identification (.log)

  • Database file detection (.sql)
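
The file categories above map naturally onto an extension lookup. In this sketch the category names and the `classify_url` helper are hypothetical, and the `.docx`, `.xlsx`, and `.pptx` entries are assumed additions covered by the README's "etc."

```python
import posixpath
from urllib.parse import urlparse

# Extension sets per category; the Office Open XML variants are an assumption.
CATEGORIES = {
    "document": {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".csv"},
    "config": {".config", ".conf", ".ini"},
    "log": {".log"},
    "database": {".sql"},
}

def classify_url(url):
    """Return the category for a URL based on its path extension, or None."""
    path = urlparse(url).path  # drops the query string and fragment
    ext = posixpath.splitext(path)[1].lower()
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return None
```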

Relationship Graph System

  • Interactive Network Visualization: Drag, zoom, explore relationships

  • Entity Categorization: Automatic classification of nodes

  • Intelligent Relationship Discovery: Same-domain emails, shared usernames, etc.

  • Network Metrics: Centrality, density, clustering coefficients

  • Export Formats: HTML interactive, JSON data, analysis reports
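
Two of the relationship rules named above, same-domain emails and shared usernames, can be sketched as pairwise edge generation. The edge labels and the `discover_relationships` function are hypothetical names used only for this illustration.

```python
from collections import defaultdict
from itertools import combinations

def discover_relationships(emails, profiles):
    """emails: list of addresses; profiles: list of (platform, username).
    Returns a sorted list of (node_a, node_b, relation) edges."""
    edges = set()

    # Rule 1: emails that share a domain are linked to each other.
    by_domain = defaultdict(list)
    for email in emails:
        by_domain[email.split("@", 1)[1].lower()].append(email)
    for group in by_domain.values():
        for a, b in combinations(sorted(group), 2):
            edges.add((a, b, "same_email_domain"))

    # Rule 2: the same username on different platforms is linked.
    by_user = defaultdict(list)
    for platform, username in profiles:
        by_user[username.lower()].append(f"{platform}:{username}")
    for group in by_user.values():
        for a, b in combinations(sorted(group), 2):
            edges.add((a, b, "same_username"))

    return sorted(edges)
```

Pairwise linking grows quadratically with group size, so a domain with many addresses may be better modeled by linking each email to a single domain node instead.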

Comprehensive Reporting

  • Tabular Data Presentation: Organized category-based tables

  • Source Tracking: Every finding linked to its discovery URL

  • Multi-format Export: JSON, HTML, CSV, Text

  • Executive Summaries: High-level overviews with statistics

  • Detailed Findings: Complete data with context and sources

Relationship Graph Visualization

Graph Features

  • Interactive HTML Graphs: Drag nodes, zoom, hover for details

  • Entity Categories: Color-coded nodes (emails, domains, social, tech, etc.)

  • Intelligent Layout: Force-directed graph algorithms

  • Relationship Types: Different line styles for different connections

  • Network Analysis: Metrics and insights about the discovered network
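
For readers unfamiliar with the network metrics mentioned here, the sketch below computes graph density and degree centrality using their standard definitions for an undirected simple graph. It uses only the standard library; the install command lists networkx as a dependency, and whether WebRecon computes its metrics this way is an assumption.

```python
from collections import defaultdict

def graph_metrics(edges):
    """edges: iterable of (a, b) pairs of an undirected simple graph.
    Returns (density, degree_centrality). Only nodes that appear in at
    least one edge are counted."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return 0.0, {}
    m = sum(len(nbrs) for nbrs in neighbors.values()) // 2  # each edge counted twice
    density = 2 * m / (n * (n - 1))
    centrality = {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}
    return density, centrality
```

A node with centrality 1.0 is connected to every other node in the graph; in a reconnaissance graph that is typically the target domain.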

Node Categories & Colors

  • 🔵 Domains: Target and related domains

  • 🔴 Emails: Discovered email addresses

  • 🟢 Social Media: Profiles and accounts

  • 🟡 IP Addresses: Network infrastructure

  • 🟣 People/Organizations: WHOIS and contact information

  • 🟠 Documents/Files: Discovered files

  • 🔶 Technologies: Detected tech stack

  • URLs: Web pages and endpoints

Relationship Types

  • Solid Blue Lines: Direct domain relationships

  • Red Lines: Same email domain connections

  • Purple Lines: Same username across platforms

  • Dashed Gray Lines: Found-on-page relationships

  • Green Dashed Lines: Technology usage relationships

Graph Output Files

webrecon_output/graphs/
├── relationship_graph_DOMAIN_TIMESTAMP.html  # Interactive graph
├── graph_data_DOMAIN_TIMESTAMP.json          # Raw graph data
└── graph_analysis_report.txt                 # Network metrics

Output Structure

Default Output Directory

webrecon_output/
├── comprehensive_report_TIMESTAMP.json       # Complete JSON data
├── comprehensive_report_TIMESTAMP.html       # Interactive HTML report
├── comprehensive_report_TIMESTAMP.txt        # Text summary
├── images_DOMAIN_TIMESTAMP/                  # Downloaded images
│   ├── raw/                                  # Original images
│   ├── thumbnails/                           # Resized thumbnails
│   ├── extracted/                            # Metadata and extracted data
│   ├── gallery.html                          # Image gallery
│   ├── metadata.json                         # Image metadata
│   └── images_summary.csv                    # CSV summary
├── graphs/                                   # Relationship graphs
│   ├── relationship_graph_DOMAIN_TIMESTAMP.html
│   ├── graph_data_DOMAIN_TIMESTAMP.json
│   └── graph_analysis_report.txt
└── webrecon_DOMAIN_TIMESTAMP.json            # Legacy JSON format

Installation

Prerequisites

  • Python 3.8 or higher

  • pip package manager

  • 500MB+ free disk space (for images and graphs)

Quick Installation

Direct Download using Wget

wget -O WebRecon.py https://gist.githubusercontent.com/techenthusiast167/47c8f8c94a520c8d96a1495b7c9a1fcb/raw/42e8dacea162de67c8377682aea3907349aa6c9d/WebRecon.py

Make Executable (Optional)

chmod +x WebRecon.py

Install Dependencies

pip install requests beautifulsoup4 colorama tabulate tldextract dnspython python-whois pillow networkx pyvis lxml html5lib pysocks urllib3

Note: For the most stable installation, it is highly recommended to use a Python virtual environment. This prevents conflicts with your system's global Python packages.

Verification

Test installation

python3 WebRecon.py --help

Expected output should show features including:

--no-graphs          Disable relationship graph generation
--table-only         Display only tabular output
--detailed-tables    Show detailed tables for all categories

Usage Examples

Basic Usage

Comprehensive reconnaissance with all features

python3 WebRecon.py https://example.com

With custom output file

python3 WebRecon.py https://example.com --output ./my_report.json

Limited crawling

python3 WebRecon.py https://example.com --max-pages 50 --max-depth 2

Advanced Reconnaissance

Enterprise reconnaissance with full graph visualization

python3 WebRecon.py https://target-company.com --max-pages 200 --max-depth 3

Stealth reconnaissance through Tor

python3 WebRecon.py https://target.com --proxy socks5://127.0.0.1:9050

Technology-focused reconnaissance

python3 WebRecon.py https://tech-company.com --no-images --no-dnsdumpster

Feature Control

Disable specific modules

python3 WebRecon.py https://example.com \
  --no-images \
  --no-graphs \
  --no-dns \
  --no-whois \
  --no-wayback \
  --no-builtwith \
  --no-dnsdumpster

Each flag switches off one module:

  • --no-images: Disable image downloading

  • --no-graphs: Disable relationship graphs

  • --no-dns: Disable DNS reconnaissance

  • --no-whois: Disable WHOIS lookup

  • --no-wayback: Disable Wayback Machine

  • --no-builtwith: Disable BuiltWith analysis

  • --no-dnsdumpster: Disable DNSDumpster

Table-only output mode

python3 WebRecon.py https://example.com --table-only

Detailed tabular output

python3 WebRecon.py https://example.com --detailed-tables

Output Customization

Custom proxy configuration

python3 WebRecon.py https://example.com --proxy http://proxy:8080

Specific crawl limits

python3 WebRecon.py https://large-site.com --max-pages 500 --max-depth 4

Save to specific location

python3 WebRecon.py https://example.com --output /path/to/report.json

Command Line Arguments

Basic Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| url | Target URL for reconnaissance | Required |
| --max-pages | Maximum pages to crawl | 100 |
| --max-depth | Maximum crawl depth | 2 |
| --output | Custom output file path | Auto-generated |
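
The basic arguments in the table above could be declared with `argparse` roughly as follows. This is a sketch that mirrors only the documented names and defaults; it is not WebRecon's actual parser, and `build_parser` is a hypothetical name.

```python
import argparse

def build_parser():
    """Build a parser covering only the four basic arguments from the table."""
    parser = argparse.ArgumentParser(prog="WebRecon.py")
    parser.add_argument("url", help="Target URL for reconnaissance")
    parser.add_argument("--max-pages", type=int, default=100,
                        help="Maximum pages to crawl")
    parser.add_argument("--max-depth", type=int, default=2,
                        help="Maximum crawl depth")
    parser.add_argument("--output", default=None,
                        help="Custom output file path (auto-generated when omitted)")
    return parser
```

Here `None` stands in for the "Auto-generated" default, leaving the choice of filename to whatever code runs after parsing.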
