Webscraper

A multi-threaded web image scraper written in C that extracts and downloads all images from a webpage. Uses libcurl for HTTP requests, libxml2 for HTML parsing, and POSIX threads for parallel downloads.

Web Image Scraper

$TETSUO on Solana

Contract Address: 8i51XNNpGaKaj4G4nDdmQh95v4FKAxw8mhtaRoKd9tE8


A fast, multi-threaded utility to extract and download all images from a webpage.

Features

  • Extracts all image URLs from a target webpage
  • Resolves relative URLs to absolute URLs
  • Avoids duplicate downloads with O(1) lookup
  • Uses multiple threads for parallel downloading
  • Preserves original file extensions when possible
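
The duplicate check above could be sketched with a small hash set of URL strings, which gives average-case O(1) lookup. This is a minimal illustration, not the project's actual code: the names `url_set`, `url_set_add`, `url_set_free`, and the fixed bucket count are all hypothetical, and the FNV-1a hash is simply one common choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define URL_SET_BUCKETS 1024 /* hypothetical fixed size; a real table might grow */

/* One entry in a bucket's singly linked chain. */
typedef struct url_node {
    char *url;
    struct url_node *next;
} url_node;

/* Hash set of URLs using separate chaining. Zero-initialize before use. */
typedef struct {
    url_node *buckets[URL_SET_BUCKETS];
} url_set;

/* FNV-1a string hash (32-bit). */
static uint32_t url_hash(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Returns true if url was newly added, false if it was already
 * present or if memory allocation failed. */
static bool url_set_add(url_set *set, const char *url)
{
    assert(set != NULL && url != NULL);
    size_t i = url_hash(url) % URL_SET_BUCKETS;

    /* Only the one bucket chain is scanned, not the whole set. */
    for (url_node *n = set->buckets[i]; n; n = n->next)
        if (strcmp(n->url, url) == 0)
            return false;

    url_node *n = malloc(sizeof *n);
    if (!n)
        return false;
    size_t len = strlen(url) + 1;
    n->url = malloc(len);
    if (!n->url) {
        free(n);
        return false;
    }
    memcpy(n->url, url, len);
    n->next = set->buckets[i];
    set->buckets[i] = n;
    return true;
}

/* Releases every stored URL and resets the set to empty. */
static void url_set_free(url_set *set)
{
    for (size_t i = 0; i < URL_SET_BUCKETS; i++) {
        url_node *n = set->buckets[i];
        while (n) {
            url_node *next = n->next;
            free(n->url);
            free(n);
            n = next;
        }
        set->buckets[i] = NULL;
    }
}
```

A caller would declare `url_set seen = {0};` and queue an image for download only when `url_set_add(&seen, url)` returns true. If several threads insert concurrently, the set would also need a lock.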

Requirements

  • libcurl (HTTP requests)
  • libxml2 (HTML parsing)
  • POSIX threads
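
To show where POSIX threads fit in, here is a rough sketch of a fixed pool of workers that claim URLs from a shared list under a mutex. It is an assumption about how parallel downloading might be organized, not a description of the project's code: `work_queue`, `worker`, `download_all`, `NUM_WORKERS`, and the stubbed `fetch_image` are hypothetical names, and the stub does no network I/O.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4 /* hypothetical thread count */

/* Shared state: the URL list plus a cursor guarded by a mutex. */
typedef struct {
    const char **urls;
    size_t count;
    size_t next; /* index of the next unclaimed URL */
    size_t done; /* number of URLs processed so far */
    pthread_mutex_t lock;
} work_queue;

/* Placeholder for the real download (for example, a libcurl transfer). */
static void fetch_image(const char *url)
{
    (void)url;
}

/* Each worker repeatedly claims one URL until none remain. */
static void *worker(void *arg)
{
    work_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        if (q->next >= q->count) {
            pthread_mutex_unlock(&q->lock);
            break;
        }
        const char *url = q->urls[q->next++];
        pthread_mutex_unlock(&q->lock);

        fetch_image(url); /* slow work happens outside the lock */

        pthread_mutex_lock(&q->lock);
        q->done++;
        pthread_mutex_unlock(&q->lock);
    }
    return NULL;
}

/* Processes all URLs with up to NUM_WORKERS threads and returns
 * how many were processed. */
static size_t download_all(const char **urls, size_t count)
{
    assert(urls != NULL || count == 0);
    work_queue q = { .urls = urls, .count = count, .next = 0, .done = 0 };
    if (pthread_mutex_init(&q.lock, NULL) != 0)
        return 0;

    pthread_t threads[NUM_WORKERS];
    size_t started = 0;
    for (size_t i = 0; i < NUM_WORKERS; i++)
        if (pthread_create(&threads[started], NULL, worker, &q) == 0)
            started++;

    for (size_t i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&q.lock);
    return q.done;
}
```

Holding the lock only while claiming a URL keeps the workers from serializing on the download itself, which is where nearly all the time would be spent.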

Installation

Ubuntu/Debian

sudo apt-get install libcurl4-openssl-dev libxml2-dev

Fedora/RHEL/CentOS

sudo dnf install libcurl-devel libxml2-devel

macOS (with Homebrew)

brew install curl libxml2

Compilation

gcc -o webscraper webscraper.c $(curl-config --cflags --libs) $(xml2-config --cflags --libs) -pthread

Usage

./webscraper <url>

Example:

./webscraper https://example.com

Downloaded images will be saved in the downloaded_images directory.
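
As an illustration of how an output path inside downloaded_images might keep the original file extension, the sketch below derives the extension from the URL path and falls back to a generic one when none is found. The naming scheme `image_<index><ext>`, the `.img` fallback, and the function name `build_output_path` are assumptions made for this example. The real tool may name files differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define OUTPUT_DIR   "downloaded_images"
#define FALLBACK_EXT ".img" /* hypothetical default when no extension is found */

/* Writes "downloaded_images/image_<index><ext>" into out.
 * Expects an absolute URL. Returns 0 on success, -1 if out is too small. */
static int build_output_path(const char *url, unsigned index,
                             char *out, size_t outlen)
{
    assert(url != NULL && out != NULL);
    const char *ext = FALLBACK_EXT;
    char extbuf[8];

    /* Skip "scheme://host" so a dot in the host name is not
     * mistaken for a file extension. */
    const char *p = strstr(url, "://");
    p = p ? p + 3 : url;
    const char *path = strchr(p, '/');

    if (path) {
        /* Ignore any query string or fragment. */
        size_t path_len = strcspn(path, "?#");

        /* Look for the last '.' within the final path segment. */
        const char *dot = NULL;
        for (size_t i = path_len; i > 0; i--) {
            char c = path[i - 1];
            if (c == '/')
                break;
            if (c == '.') {
                dot = path + i - 1;
                break;
            }
        }

        if (dot) {
            size_t ext_len = (size_t)(path + path_len - dot); /* includes '.' */
            bool ok = ext_len >= 2 && ext_len < sizeof extbuf;
            for (size_t i = 1; ok && i < ext_len; i++)
                if (!isalnum((unsigned char)dot[i]))
                    ok = false;
            if (ok) {
                memcpy(extbuf, dot, ext_len);
                extbuf[ext_len] = '\0';
                ext = extbuf;
            }
        }
    }

    int n = snprintf(out, outlen, "%s/image_%u%s", OUTPUT_DIR, index, ext);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With this sketch, `https://example.com/img/photo.png?size=large` at index 0 maps to `downloaded_images/image_0.png`, while a URL with no recognizable extension gets the fallback.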

License

MIT
