Scouter


Professional SEO Crawler with web-based analysis interface, built by Lokoé.


<img width="1913" height="952" alt="image" src="https://github.com/user-attachments/assets/004ecad5-1479-468e-a34f-c6dc3bbae312" />

Quick Install

git clone https://github.com/lokoe-mehdi/scouter.git && cd scouter && chmod +x start.sh && ./start.sh

Requirements: Linux or WSL on Windows, with Docker installed.

Access: http://localhost:8080. On first launch, you are prompted to create an admin account.

Deployment

Scouter can also be easily deployed with Coolify using the provided docker-compose.yml.


Features

Crawl

  • Multi-depth: Configurable crawl depth (0 to N)
  • Robots.txt: Respects Allow/Disallow directives
  • Canonical: Detection and tracking of canonical tags
  • JavaScript: Rendering mode via Puppeteer (SPA support)
  • Parallelism: Configurable concurrent requests
  • Docker Workers: Distributed architecture with async workers
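
As a rough illustration of how a configurable crawl depth and robots.txt rules interact, here is a minimal Python sketch. It is not Scouter's PHP implementation: the in-memory `SITE` graph, the `crawl` function, and the `ExampleBot` user agent are all assumptions made for the example, and the standard-library robots parser stands in for Scouter's own `RobotsTxt` class.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory site: URL -> outgoing links (stands in for real HTTP fetches).
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/private/x"],
    "https://example.com/a": ["https://example.com/b"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": [],
    "https://example.com/private/x": [],
}

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawl(start, max_depth, robots_txt, user_agent="ExampleBot"):
    """Breadth-first crawl; returns {url: depth} for allowed URLs up to max_depth."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    seen = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # do not follow links beyond the configured depth
        for link in SITE.get(url, []):
            if link in seen or not robots.can_fetch(user_agent, link):
                continue  # already discovered, or disallowed by robots.txt
            seen[link] = depth + 1
            queue.append(link)
    return seen


if __name__ == "__main__":
    for url, depth in crawl("https://example.com/", 2, ROBOTS_TXT).items():
        print(depth, url)
```

With a depth of 0 only the start URL is kept. A real crawler would fetch pages over HTTP and run requests in parallel, which this sketch leaves out.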

SEO Analysis

  • On-page: Title, H1, meta description, headings
  • Technical: HTTP status codes, response times, redirects
  • Content: Word count, duplicate detection (Simhash)
  • Structured Data: JSON-LD schema detection
  • Internal Linking: Inlinks, outlinks, internal PageRank
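
To give an idea of how Simhash-based duplicate detection works in general, the sketch below computes a 64-bit fingerprint from word tokens and compares two fingerprints by Hamming distance. It is a generic Python illustration, not the code in `Simhash.php`: the tokenizer, the use of MD5 as the per-token hash, and the function names are assumptions.

```python
import hashlib
import re

BITS = 64  # fingerprint width used in this sketch


def simhash(text):
    """Return a 64-bit Simhash fingerprint built from the text's word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Take the first 8 bytes of an MD5 digest as a 64-bit token hash (assumption).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (token_hash >> i) & 1 else -1

    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:  # a bit is set when more tokens voted 1 than 0
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    page_a = "Blue widget with free shipping and a two year warranty"
    page_b = "Blue widget with free shipping and a three year warranty"
    print(hamming_distance(simhash(page_a), simhash(page_b)))
```

Pages whose fingerprints differ in only a few bits are likely near-duplicates. The cut-off distance is a tuning choice and is not stated in this README.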

Custom Extractors

  • XPath: Extract any HTML element
  • Regex: Pattern matching on source code
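
The two extractor types can be pictured with a small Python sketch. It assumes a well-formed sample page and uses the limited XPath subset of the standard library, so it only approximates what Scouter's `HtmlParser` does; the `PAGE` sample and both function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real-world HTML is often not valid XML
# and would need a tolerant HTML parser instead of ElementTree.
PAGE = """<html><head><title>Blue Widget</title></head>
<body><h1>Blue Widget</h1><p class="price">Price: 19.99 EUR</p>
<p class="sku">SKU-12345</p></body></html>"""


def extract_xpath(source, path):
    """Return the stripped text of every element matching a (limited) XPath."""
    root = ET.fromstring(source)
    return [(element.text or "").strip() for element in root.findall(path)]


def extract_regex(source, pattern):
    """Return every match of a regular expression in the raw source code."""
    return re.findall(pattern, source)


if __name__ == "__main__":
    print(extract_xpath(PAGE, ".//p[@class='price']"))
    print(extract_regex(PAGE, r"SKU-\d+"))
```

XPath suits values tied to the document structure, while a regex works on the raw source and can also reach content inside scripts or comments.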

Categorization

  • YAML Editor: Configure categorization rules
  • Visual Mode: Drag & drop interface for rules
  • Test Mode: Preview before applying
  • Default Template: cat.yml applied automatically
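
The format of cat.yml is not shown in this README, so the following Python sketch uses a purely hypothetical rule shape (an ordered list of category names and path patterns, first match wins) to illustrate rule-based categorization with a preview step similar in spirit to Test Mode. None of the names below come from Scouter.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical rules: (category, regex on the URL path). Order matters.
RULES = [
    ("blog", r"^/blog/"),
    ("product", r"^/products/[^/]+$"),
    ("home", r"^/$"),
]


def categorize(url, rules=RULES, default="uncategorized"):
    """Return the first category whose pattern matches the URL path."""
    path = urlparse(url).path or "/"
    for name, pattern in rules:
        if re.search(pattern, path):
            return name
    return default


def preview(urls, rules=RULES):
    """Count URLs per category without storing anything (a dry run)."""
    return dict(Counter(categorize(url, rules) for url in urls))


if __name__ == "__main__":
    sample = [
        "https://example.com/",
        "https://example.com/blog/post-1",
        "https://example.com/products/blue-widget",
        "https://example.com/about",
    ]
    print(preview(sample))
```

Previewing the counts before saving makes it easier to spot a rule that is too broad or that leaves many URLs uncategorized.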

Interface

  • Dashboard: Overview with charts
  • Explorer: Filterable table of all URLs
  • SQL Explorer: Custom SQL queries
  • CSV Export: Data download
  • Multi-user Management: Admin/user/viewer roles

Architecture

scouter/
├── app/
│   ├── Analysis/           # SEO analysis
│   │   ├── PostProcessor.php   # Crawl post-processing
│   │   ├── RobotsTxt.php       # Robots.txt parser
│   │   └── Simhash.php         # Duplicate detection
│   ├── Auth/               # Authentication
│   │   └── Auth.php            # Session & permission management
│   ├── Cli/                # CLI tools
│   │   └── Cmder.php           # Crawl/resume/stop commands
│   ├── Core/               # Crawler core
│   │   ├── Crawler.php         # Main orchestrator
│   │   ├── DepthCrawler.php    # Depth-based crawling
│   │   ├── Page.php            # Page analysis
│   │   └── PageCrawler.php     # Single page crawl
│   ├── Database/           # Data layer (PostgreSQL)
│   │   ├── PostgresDatabase.php    # Singleton connection
│   │   ├── CrawlDatabase.php       # Crawl queries
│   │   ├── CrawlRepository.php     # Crawl CRUD
│   │   ├── ProjectRepository.php   # Project CRUD
│   │   ├── UserRepository.php      # User CRUD
│   │   ├── CategoryRepository.php  # Category CRUD
│   │   ├── PageRepository.php      # Page CRUD
│   │   └── LinkRepository.php      # Link CRUD
│   ├── Http/               # HTTP layer (REST API)
│   │   ├── Router.php          # REST router
│   │   ├── Request.php         # HTTP request wrapper
│   │   ├── Response.php        # JSON responses
│   │   ├── Controller.php      # Base controller class
│   │   └── Controllers/        # API controllers
│   │       ├── ProjectController.php
│   │       ├── CrawlController.php
│   │       ├── UserController.php
│   │       ├── CategoryController.php
│   │       ├── CategorizationController.php
│   │       ├── JobController.php
│   │       ├── QueryController.php
│   │       ├── ExportController.php
│   │       └── MonitorController.php
│   ├── Job/                # Async job management
│   │   └── JobManager.php      # Queue and job status
│   ├── Util/               # Utilities
│   │   ├── HtmlParser.php      # XPath/Regex parsing
│   │   ├── JsRenderer.php      # JavaScript rendering client
│   │   └── UrlHelper.php       # URL manipulation
│   └── bin/                # Executable scripts
│       └── worker.php          # Docker worker
├── web/                    # Web interface
│   ├── api/
│   │   └── index.php       # Single REST API entry point
│   ├── pages/              # HTML pages
│   ├── components/         # Reusable components
│   └── assets/             # CSS/JS
├── docker/                 # Docker configuration
├── migrations/             # PostgreSQL migrations
├── tests/                  # Tests (Pest)
│   ├── Unit/               # Unit tests
│   └── Feature/            # Feature tests
├── docs/                   # Documentation
│   ├── phpdoc/             # PHP documentation (Doctum)
│   ├── ARCHITECTURE.md     # Technical architecture
│   ├── ROUTER.md           # Router documentation
│   ├── TESTING.md          # Testing guide
│   └── ...
└── cat.yml                 # Default categorization template

Tech Stack

| Component | Technology |
|-----------|------------|
| Backend | PHP 8.1+ |
| Database | PostgreSQL 15+ |
| Frontend | HTML/CSS/JS vanilla |
| Containerization | Docker + Docker Compose |
| Tests | Pest PHP |
| Documentation | Doctum |
| JS Rendering | Go + Chromedp |


Documentation

User Guides

Technical Documentation

Generate PHP Documentation

./generate-docs.sh
# Output in docs/phpdoc/

Useful Commands

Docker

./start.sh                              # Start (rebuild + up)
docker-compose down                     # Stop
docker-compose logs -f app              # Application logs
docker-compose logs -f worker           # Worker logs
docker exec -it scouter bash            # Container shell

Database

docker exec scouter php /app/migrations/migrate.php  # Run migrations

Tests

./vendor/bin/pest                       # Run tests (local)
docker exec scouter ./vendor/bin/pest   # Run tests (Docker)

Documentation

./generate-docs.sh                      # Generate API docs

REST API

The API uses a centralized router (web/api/index.php) with the following endpoints:

Projects

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/projects | List projects |
| POST | /api/projects | Create a project/crawl |
| DELETE | /api/projects/{id} | Delete a project |
| POST | /api/projects/duplicate | Duplicate a crawl |

Crawls

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/crawls/info | Crawl info |
| GET | /api/crawls/running | Running crawls |
| POST | /api/crawls/start | Start a crawl |
| POST | /api/crawls/stop | Stop a crawl |
| POST | /api/crawls/resume | Resume a crawl |

Categorization

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/categorization/save | Save and apply |
| POST | /api/categorization/test | Test without applying |
| GET | /api/categorization/stats | Statistics |

Users (admin)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/users | List users |
| POST | /api/users | Create a user |
| PUT | /api/users/{id} | Update a user |
| DELETE | /api/users/{id} | Delete a user |
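
As a starting point for scripting against these endpoints, the Python sketch below builds requests with the standard library. Only the paths, the methods, and the base URL come from this README. The JSON body, the `crawl_id` field, and the helper name are assumptions, and authentication is left out because the README does not describe how API calls are authenticated.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default address from the Quick Install section


def build_request(method, endpoint, payload=None):
    """Prepare (but do not send) a request for one of the listed endpoints."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # Assumption: the API accepts JSON request bodies.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + endpoint, data=data, headers=headers, method=method
    )


# List projects: GET /api/projects
list_projects = build_request("GET", "/api/projects")

# Stop a crawl: POST /api/crawls/stop ("crawl_id" is a hypothetical field name)
stop_crawl = build_request("POST", "/api/crawls/stop", {"crawl_id": 1})

if __name__ == "__main__":
    for request in (list_projects, stop_crawl):
        print(request.get_method(), request.full_url)
    # To send a request against a running instance:
    # with urllib.request.urlopen(list_projects) as response:
    #     print(response.read().decode("utf-8"))
```

Check the controllers under app/Http/Controllers/ and docs/ROUTER.md for the actual parameters each endpoint expects.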


Main Classes

| Namespace | Class | Description |
|-----------|-------|-------------|
| App\Core | Crawler | Main crawl orchestrator |
| App\Core | DepthCrawler | Depth-based crawling with parallel requests |
| App\Core | Page | Page data analysis and extraction |
| App\Database | PostgresDatabase | PostgreSQL singleton connection |
| App\Database | CrawlRepository | Crawl CRUD operations |
| App\Database | ProjectRepository | Project CRUD operations |
| App\Auth | Auth | Authentication and access control |
| App\Analysis | RobotsTxt | Robots.txt parsing and interpretation |
| App\Analysis | Simhash | Duplicate content detection |
| App\Util | JsRenderer | JavaScript rendering client |
| App\Http | Router | REST API router |
| App\Job | JobManager | Async job management |


License

This project is licensed under the MIT License - see the LICENSE file for details.

Copyright (c) 2026 Lokoé SASU

Scouter - Professional SEO Crawler by Lokoé
