Zimfarm
Farm operated by bots to grow and harvest new zim files
ZIM Farm
The ZIM farm (zimfarm) is a semi-decentralised software solution to build ZIM files efficiently. This means scraping web content, packaging it into a ZIM file, and uploading the result to an online ZIM file repository.
Documentation
- Editors Guide: For recipe managers - learn about recipes, tasks, workers, and how to manage content creation
- Contributing Guide: How to contribute to Zimfarm development
How does it work?
The Zimfarm platform is a combination of different tools:
backend
The backend is a central database and API that records recipes (metadata of the ZIM files to produce) and tasks. It decides when a ZIM file should be recreated (based on the recipe), then creates tasks and assigns them to workers.
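The scheduling decision can be illustrated with a minimal sketch. This is not the backend's actual code: it assumes each recipe carries a hypothetical `period` (how often the ZIM should be rebuilt) and a `last_run` timestamp, and the names `Recipe`, `is_due` and `due_recipes` are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Recipe:
    """Hypothetical recipe record; the field names are assumptions."""
    name: str
    period: timedelta             # how often the ZIM should be recreated
    last_run: Optional[datetime]  # None if the recipe has never run


def is_due(recipe: Recipe, now: datetime) -> bool:
    """Return True when a new task should be created for this recipe."""
    if recipe.last_run is None:
        return True
    return now - recipe.last_run >= recipe.period


def due_recipes(recipes: list[Recipe], now: datetime) -> list[str]:
    """Names of the recipes for which a task should be created."""
    return [r.name for r in recipes if is_due(r, now)]
```

A real scheduler would also need to skip recipes that already have a pending or running task, which this sketch leaves out.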
frontend
The frontend, available at farm.openzim.org, is a simple consumer of the backend API.
It is used to create, clone and edit recipes, but also to monitor the evolution of tasks and workers.
Anybody can use it in read-only mode.
workers
Workers are always-running computers that get assigned ZIM creation tasks by the dispatcher. If you are interested in providing us with worker resources, please read these instructions.
Note: The repository contains both a worker/ folder (singular) and a workers/ folder (plural). The worker/ folder is the current and actively maintained location for all worker-related code and documentation. The workers/ folder is kept for backwards compatibility as it contains the legacy zimfarm.sh script that existing worker installations may still reference. If you're setting up a new worker or updating an existing one, you should use the script and documentation from the worker/ folder. Old clients using the legacy script from workers/contrib/zimfarm.sh should update their installation to use the new location at worker/contrib/zimfarm.sh.
A worker is made of two software components:
worker-manager
The manager is responsible for declaring its available resources and configuration, and for receiving the tasks assigned to it by the dispatcher. It's a very low-resource container whose job is to spawn task-worker containers.
task-worker
The task-worker is responsible for running a specific task. It's also a very low-resource container but, unlike the manager, one is spawned for each task assigned to the worker (the manager defines the concurrency based on resources).
The task-worker's role is to start and monitor the scraper's container for the task and to spawn uploader containers for both created ZIM files and logs.
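As a rough illustration of resource-based concurrency, the sketch below checks whether one more task fits within what a worker has declared. It is not the worker-manager's actual logic: the `Resources` type, its fields and the functions `remaining` and `can_start` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource description; units are assumptions."""
    cpu: int     # CPU cores
    memory: int  # GiB of RAM
    disk: int    # GiB of disk space


def remaining(total: Resources, running: list[Resources]) -> Resources:
    """Resources left once all running tasks are accounted for."""
    return Resources(
        cpu=total.cpu - sum(t.cpu for t in running),
        memory=total.memory - sum(t.memory for t in running),
        disk=total.disk - sum(t.disk for t in running),
    )


def can_start(total: Resources, running: list[Resources],
              candidate: Resources) -> bool:
    """True if the candidate task fits in the remaining resources."""
    free = remaining(total, running)
    return (candidate.cpu <= free.cpu
            and candidate.memory <= free.memory
            and candidate.disk <= free.disk)
```

With this model the number of concurrent task-workers is not a fixed setting: it follows from how many tasks fit at once within the declared resources.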
uploader
The uploader is instantiated by the task-worker to upload, individually, each created ZIM file, as well as the scraper's container log.
The uploader supports both SCP and SFTP. We are currently using SFTP for all uploads due to a slight speed gain.
The uploader is fast and convenient (it can watch and resume files) but works only from files at the moment.
dnscache
The dnscache is a dnsmasq server instantiated by the task-worker that ensures specific nameservers are used and caches DNS results. This ensures that, if DNS becomes unstable, running tasks will not be affected.
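The two behaviours described above (fixed upstream nameservers and caching) map onto standard dnsmasq options. The sketch below renders a small configuration using `no-resolv`, `server=` and `cache-size=`. It only illustrates the idea: the configuration Zimfarm actually ships may differ, and the function name, default cache size and example addresses are assumptions.

```python
def render_dnsmasq_conf(nameservers: list[str], cache_size: int = 1000) -> str:
    """Render a minimal dnsmasq configuration (illustrative only)."""
    # Ignore the host's /etc/resolv.conf so only the listed servers are used.
    lines = ["no-resolv"]
    # One upstream nameserver per line.
    lines += [f"server={ns}" for ns in nameservers]
    # Number of DNS entries kept in the cache.
    lines.append(f"cache-size={cache_size}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses only; any reachable resolvers could be used.
    print(render_dnsmasq_conf(["1.1.1.1", "8.8.8.8"], cache_size=500), end="")
```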
receiver
The receiver is a jailed OpenSSH server that receives scraper logs and ZIM files and either puts them aside (if the file is not at the root of the source directory) or moves them to the public download server.
scrapers
Scrapers are the tools used to actually convert a scraping request (recorded in a Zimfarm recipe) into one or several ZIM files.
The most important one is the MediaWiki scraper, called mwoffliner, but there are many others, for Stack Exchange, Project Gutenberg, PhET and more.
Scrapers are not part of the Zimfarm. Those are completely independent projects for which the requirements to integrate into the Zimfarm are minimal:
- Works entirely from a Docker image
- Arguments should be set on the command line
- ZIM output folder should be settable via an argument
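These three requirements mean a scraper run can be described entirely by an image name, a command line and an output directory. The sketch below assembles an equivalent `docker run` invocation. It is an illustration only and not how the task-worker actually starts containers; the image name, the `scrape` executable, the `--output` flag and the `/output` mount point are all hypothetical.

```python
def build_scraper_command(
    image: str,
    command: list[str],
    host_output_dir: str,
    output_flag: str = "--output",         # hypothetical flag name
    container_output_dir: str = "/output",  # hypothetical mount point
) -> list[str]:
    """Build a `docker run` argument list for a scraper (illustrative)."""
    return [
        "docker", "run", "--rm",
        # Mount a host folder so the ZIM files survive the container.
        "-v", f"{host_output_dir}:{container_output_dir}",
        image,
        # All scraper settings are passed on the command line.
        *command,
        # The output folder is itself just another argument.
        output_flag, container_output_dir,
    ]


if __name__ == "__main__":
    argv = build_scraper_command(
        image="example/scraper:latest",  # hypothetical image
        command=["scrape", "--url", "https://example.org"],
        host_output_dir="/data/zim",
    )
    print(" ".join(argv))
```

Because nothing else is needed to start a run, any scraper meeting these requirements can be integrated without scraper-specific code.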
How do I request a ZIM file?
ZIM file requests are handled in the zim-requests repository.
If there's already a scraper for the website you want to convert to ZIM, someone with editor access to the Zimfarm will create the recipe and, in a few days, a ZIM file should be available.
Getting Started
- Recipe Managers/Editors: Start with the Editors Guide to learn how to create and manage recipes
- Workers: Follow the Worker README to set up a worker node
- Contributors: Check CONTRIBUTING.md for development setup and guidelines
Support
For questions or issues:
- Email: contact+zimfarm@kiwix.org
- GitHub Issues: github.com/openzim/zimfarm/issues
