Sputnik
Crawling since 1957
Sputnik
by weLaika
Sputnik is a website crawler written in Elixir.
It crawls a website following all internal links and makes a report of all pages' status codes.
With the query flag you can pass one or more CSS selectors to get a report of how often they match across the crawled pages.
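To illustrate what "following all internal links" can mean in practice, here is a minimal sketch that resolves a link against the page it was found on and checks whether it stays on the same host. The module name `LinkFilter` and the function `internal?/2` are hypothetical and are not taken from Sputnik's source; only the standard `URI` module is used.

```elixir
defmodule LinkFilter do
  # Hypothetical helper, not part of Sputnik's code base.
  # Resolves `href` against the page URL and reports whether the
  # resulting absolute URL points to the same host as the page.
  def internal?(page_url, href) do
    page = URI.parse(page_url)
    target = URI.merge(page, href)

    target.scheme in ["http", "https"] and target.host == page.host
  end
end

# A relative link stays on the crawled site, an absolute link to another host does not.
IO.inspect(LinkFilter.internal?("http://spawnfest.github.io/", "/rules"))
IO.inspect(LinkFilter.internal?("http://spawnfest.github.io/", "https://hex.pm/docs/publish"))
```

Comparing hosts is only one possible definition of "internal"; a crawler might also want to treat subdomains or a different scheme as the same site.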
Build
Sputnik can be built with:
mix deps.get
mix escript.build
Usage
Sputnik takes the URL to crawl and optional queries to run against the crawled pages:
sputnik [--query <Q> --query <Q1> ...] [--connections <N>] <url>
Options
- query: valid CSS selectors, separated by commas, that you want to analyze across the whole website; the flag can be repeated
- connections: maximum number of concurrent HTTP connections (default is 10)
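The connections option caps how many requests run at the same time. One way such a cap could be expressed in Elixir is with `Task.async_stream/3` and its `max_concurrency` option. This is a sketch under assumptions, not Sputnik's actual implementation: `BoundedFetch`, `fetch_all/2`, and `fake_fetch/1` are hypothetical names, and `fake_fetch/1` only simulates a request instead of doing real HTTP.

```elixir
defmodule BoundedFetch do
  # Hypothetical stand-in for an HTTP request. A real crawler would call an
  # HTTP client here; this one sleeps briefly and pretends the answer is 200.
  def fake_fetch(url) do
    Process.sleep(10)
    {url, 200}
  end

  # Runs `fake_fetch/1` over `urls` with at most `connections` tasks in flight.
  # Results arrive in completion order because `ordered: false` is set.
  def fetch_all(urls, connections \\ 10) do
    urls
    |> Task.async_stream(&fake_fetch/1, max_concurrency: connections, ordered: false)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
IO.inspect(BoundedFetch.fetch_all(urls, 2))
```

With `connections` set to 2, the third URL is only fetched once one of the first two tasks has finished.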
Examples
running
./sputnik "http://spawnfest.github.io" --query "div" --query "a" --query "h1,h2,h3,h4,h5,h6" --connections 10
produces the following output
#################### Pages ####################
Pages found: 19
status_code 200: 12
status_code 301: 7
#################### Queries ####################
## query `a` ##
327 result(s)
Min 18 result(s) per page
Max 57 result(s) per page
## query `div` ##
347 result(s)
Min 13 result(s) per page
Max 53 result(s) per page
## query `h1,h2,h3,h4,h5,h6` ##
95 result(s)
Min 0 result(s) per page
Max 31 result(s) per page
and it opens the browser with a page presenting the report.
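The figures in the text report (pages per status code, and total, minimum, and maximum results per query) are simple aggregations over per-page data. The sketch below shows how they could be computed; `ReportStats` and its functions are hypothetical names, the input numbers are made up, and this is not presented as Sputnik's own code.

```elixir
defmodule ReportStats do
  # Counts how many pages returned each status code.
  # `status_codes` holds one status code per crawled page.
  def status_report(status_codes) do
    Enum.frequencies(status_codes)
  end

  # Summarizes one query. `counts` holds one entry per crawled page:
  # the number of matches for that query on that page.
  def query_report(counts) when counts != [] do
    %{total: Enum.sum(counts), min: Enum.min(counts), max: Enum.max(counts)}
  end
end

# Made-up sample data, not taken from the run above.
IO.inspect(ReportStats.status_report([200, 200, 301, 200]))
IO.inspect(ReportStats.query_report([0, 3, 31, 5]))
```

`Enum.frequencies/1` requires Elixir 1.10 or later; on older versions `Enum.reduce/3` with `Map.update/4` gives the same result.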

Requirements
Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/sputnik.
Testing
To run tests:
$ mix test --cover
To run credo:
$ mix credo
Documentation
To generate the documentation:
$ mix docs && open doc/index.html
Releasing
Bump the version in mix.exs, commit and push, then run mix hex.publish.
Please read https://hex.pm/docs/publish for help.