Hercules
Gaining advanced insights from Git repository history.
Table of Contents
- Overview
- Installation
- Contributions
- License
- Usage
- Roadmap
Overview
Hercules is an amazingly fast and highly customizable Git repository analysis engine written in Go. Batteries are included. Powered by go-git.
Notice (November 2020): the main author is back from the limbo and is gradually resuming the development. See the roadmap.
There are two command-line tools: hercules and labours. The first is a program
written in Go which takes a Git repository and executes a Directed Acyclic Graph (DAG) of analysis tasks over the full commit history.
The second is a Python script which shows some predefined plots over the collected data. These two tools are normally used together through
a pipe. It is possible to write custom analyses using the plugin system. It is also possible
to merge several analysis results together - relevant for organizations.
The analyzed commit history includes branches, merges, etc.
Hercules has been successfully used for several internal projects at source{d}. There are blog posts: 1, 2 and a presentation. Please contribute by testing, fixing bugs, adding new analyses, or coding swagger!


Installation
Grab the hercules binary from the Releases page.
labours is installable from PyPI:
pip3 install labours
pip3 is the Python package manager.
NumPy and SciPy can be installed on Windows using http://www.lfd.uci.edu/~gohlke/pythonlibs/
Build from source
You are going to need Go (>= v1.11) and protoc.
git clone https://github.com/src-d/hercules && cd hercules
make
pip3 install -e ./python
GitHub Action
It is possible to run Hercules as a GitHub Action: Hercules on GitHub Marketplace. Please refer to the sample workflow, which demonstrates how to set it up.
Contributions
...are welcome! See CONTRIBUTING and code of conduct.
License
Usage
The most useful and reliably up-to-date command line reference:
hercules --help
Some examples:
# Use "memory" go-git backend and display the burndown plot. "memory" is the fastest but the repository's git data must fit into RAM.
hercules --burndown https://github.com/go-git/go-git | labours -m burndown-project --resample month
# Use "file system" go-git backend and print some basic information about the repository.
hercules /path/to/cloned/go-git
# Use "file system" go-git backend, cache the cloned repository to /tmp/repo-cache, use Protocol Buffers and display the burndown plot without resampling.
hercules --burndown --pb https://github.com/git/git /tmp/repo-cache | labours -m burndown-project -f pb --resample raw
# Now something fun
# Get the linear history from git rev-list, reverse it
# Pipe to hercules, produce burndown snapshots for every 30 days grouped by 30 days
# Save the raw data to cache.yaml, so that it can be replotted later with labours -i cache.yaml
# Pipe the raw data to labours, set text font size to 16pt, use Agg matplotlib backend and save the plot to git.png
git rev-list HEAD | tac | hercules --commits - --burndown https://github.com/git/git | tee cache.yaml | labours -m burndown-project --font-size 16 --backend Agg --output git.png
labours -i /path/to/yaml reads hercules output that was previously saved to disk.
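The same pipe can be driven from a script. The following is a minimal sketch, not part of Hercules itself: the function names `build_pipeline` and `run_pipeline` are hypothetical, and the flags are copied verbatim from the Protocol Buffers example above. It assumes both `hercules` and `labours` are on `PATH`.

```python
import subprocess
from typing import List, Tuple


def build_pipeline(repo: str) -> Tuple[List[str], List[str]]:
    """Return the two argv lists for `hercules ... | labours ...`.

    Hypothetical helper; the flags mirror the README example that uses
    Protocol Buffers output and no resampling.
    """
    hercules_cmd = ["hercules", "--burndown", "--pb", repo]
    labours_cmd = ["labours", "-m", "burndown-project",
                   "-f", "pb", "--resample", "raw"]
    return hercules_cmd, labours_cmd


def run_pipeline(repo: str) -> int:
    """Run hercules and feed its stdout to labours through a pipe."""
    hercules_cmd, labours_cmd = build_pipeline(repo)
    producer = subprocess.Popen(hercules_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(labours_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so hercules sees a broken pipe
    # if labours exits early.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    # Non-zero if either side failed.
    return producer.returncode or consumer.returncode
```

Calling `run_pipeline("https://github.com/git/git")` would be equivalent to the shell pipe; keeping the argv construction separate makes it easy to inspect or log the commands before running them.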
Caching
It is possible to store the cloned repository on disk. The subsequent analysis can run on the corresponding directory instead of cloning from scratch:
# First time - cache
hercules https://github.com/git/git /tmp/repo-cache
# Second time - use the cache
hercules --some-analysis /tmp/repo-cache
GitHub Action
The action produces the artifact named
hercules_charts. Since it is currently impossible to pack several files in one artifact, all the
charts and TensorFlow Projector files are packed in the inner tar archive. In order to view the embeddings,
go to projector.tensorflow.org, click "Load" and choose the two TSVs. Then use UMAP or t-SNE.
Docker image
docker run --rm srcd/hercules hercules --burndown --pb https://github.com/git/git | docker run --rm -i -v $(pwd):/io srcd/hercules labours -f pb -m burndown-project -o /io/git_git.png
Built-in analyses
Project burndown
hercules --burndown
labours -m burndown-project
Line burndown statistics for the whole repository. This is exactly what git-of-theseus does, but much faster. Blaming is performed efficiently and incrementally using a custom RB tree tracking algorithm, and only the last modification date is recorded while running the analysis.
All burndown analyses depend on the values of granularity and sampling. Granularity is the number of days each band in the stack consists of. Sampling is the frequency with which the burndown state is snapshotted. The smaller the value, the smoother the plot, but the more work is done.
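To make the two parameters concrete, here is a toy sketch. It is not how Hercules computes burndown (that relies on incremental blaming), and the function `burndown_matrix` and its data model of `(day written, day removed)` pairs are assumptions made purely for illustration. It shows that granularity determines the number of bands (columns) and sampling determines the number of snapshots (rows).

```python
from typing import List, Optional, Tuple

# (day the line was written, day it was removed or None if still alive)
Line = Tuple[int, Optional[int]]


def burndown_matrix(lines: List[Line], total_days: int,
                    granularity: int, sampling: int
                    ) -> Tuple[List[int], List[List[int]]]:
    """Toy burndown: rows are snapshots, columns are age bands."""
    # One band per `granularity` days of project history (ceiling division).
    n_bands = -(-total_days // granularity)
    # One snapshot every `sampling` days.
    snapshot_days = list(range(sampling, total_days + 1, sampling))
    matrix = []
    for day in snapshot_days:
        row = [0] * n_bands
        for born, died in lines:
            # A line counts if it was written before the snapshot
            # and not removed before it.
            if born < day and (died is None or died >= day):
                row[born // granularity] += 1
        matrix.append(row)
    return snapshot_days, matrix


lines = [(0, None), (5, 40), (35, None), (70, None), (10, 80)]
days, matrix = burndown_matrix(lines, total_days=90,
                               granularity=30, sampling=30)
for day, row in zip(days, matrix):
    print(day, row)
# 30 [3, 0, 0]
# 60 [2, 1, 0]
# 90 [1, 1, 1]
```

Halving `sampling` to 15 doubles the number of rows to six while leaving the three bands unchanged, which is the "smoother plot, more work" trade-off described above.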
There is an option to resample the bands inside labours, so that you can
define a very precise distribution and visualize it in different ways. Besides,
resampling aligns the bands across periodic boundaries, e.g. months or years.
Unresampled bands are not aligned to such boundaries and start from the project's birth date.
Files
hercules --burndown --burndown-files
labours -m burndown-file
Burndown statistics for every file in the repository which is alive in the latest revision.
Note: it will generate a separate graph for every file. You don't want to run it on a repository with many files.
People
hercules --burndown --burndown-people [--people-dict=/path/to/identities]
labours -m burndown-person
Burndown statistics for the repository's contributors. If --people-dict is not specified, the identities are
discovered by the following algorithm:
- We start from the root commit towards the HEAD.
