ReZipDoc
Repack uncompressed & diff visualizer for ZIP based files stored in git repos
Most <img alt="git" src="https://upload.wikimedia.org/wikipedia/commons/e/e0/Git-logo.svg" height="20" align="center" /> repos hosting <img alt="Open Source Hardware" src="https://upload.wikimedia.org/wikipedia/commons/f/fd/Open-source-hardware-logo.svg" height="80" align="center" /> should use ReZipDoc.
What is this?
git does not like binary files. They make the repo grow quickly in size (see delta compression), and when you try to see what changed in a commit, all you get is this:
Binary files A and B differ!
... not very useful!
ReZipDoc solves both of these issues, though only for ZIP based files, which includes for example FreeCAD and LibreOffice files.
NOTE It does not work for all binary files!
HINT If you are unsure whether a file format is ZIP based, just try to look at it with a tool that can peek into ZIP files.
On Linux or OSX:
unzip -l someFile.xyz
So if you are storing ZIP based files in your git repo,
you probably want to use ReZipDoc.
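If no ZIP tool is at hand, the check from the hint above can be approximated by looking at the first bytes of a file. The following is a minimal sketch and not part of ReZipDoc; the function name `is_zip_based` is made up for illustration. It assumes the file starts with one of the usual ZIP signatures (`PK` followed by the bytes 03 04, 05 06 or 07 08), which holds for ordinary ZIP files but not, for example, for self-extracting archives.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReZipDoc):
# succeeds if the given file starts with a ZIP signature.
is_zip_based() {
    # Read the first 4 bytes and turn them into plain hex, e.g. "504b0304"
    magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    case "$magic" in
        504b0304|504b0506|504b0708) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on every file given on the command line
for f in "$@"; do
    if is_zip_based "$f"; then
        echo "$f: looks ZIP based"
    else
        echo "$f: does not look ZIP based"
    fi
done
```

Saved as a script, it could be called with any number of file names as arguments. A positive result is only a hint; `unzip -l` remains the more thorough check.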
Index
- Project state
- How to use
- Installation
- Filter repo history
- Culprits
- Motivation
- How it works
- Benefits
- Observations
- Based on
Project state
This repo contains a heavily revised, refined version of ReZip (and ZipDoc), plus unit tests and helper scripts, which were not available in the original.
How to use
If your git repo makes heavy use of ZIP based files, then you probably want to use ReZipDoc in one of these three ways:
- install ZipDoc diff viewer - This allows you to see changes within your ZIP based files in a human-readable way when looking at git history. It changes neither your past nor your future git history.
  To use this, install with --diff only.
- install ReZip filter - This will change your repo's future git history, storing ZIP based files without compression.
  To use this, install with --commit --diff --renormalize.
- install ReZip filter & filter repo - This changes both the past (<- Caution!) and future history of your repo.
  To use this, create a copy of the repo with filtered history.
Installation
The filter and diff tool require Java 8 or newer.
The helper scripts - which are mostly used for installing the filter - require a POSIX (~= Unix) environment. This is the case on OSX, Linux, BSD, Unix and even Windows, if git is installed.
The recommended procedure is to install the helper scripts once, and then use them to comfortably install the filter into local git repos.
NOTE
This downloads an online script and executes it on your machine, which is a potential security risk. You may want to review the script before running it.
Install helper scripts
NOTE
This has to be done once per developer machine.
They get installed into ~/bin/,
and if the directory did not exist before,
it will get added to PATH.
To install:
curl --silent --location \
https://raw.githubusercontent.com/hoijui/ReZipDoc/master/scripts/rezipdoc-scripts-tool.sh \
| sh -s install --path
To update (to latest development version):
curl --silent --location \
https://raw.githubusercontent.com/hoijui/ReZipDoc/master/scripts/rezipdoc-scripts-tool.sh \
| sh -s update --dev
To remove:
curl --silent --location \
https://raw.githubusercontent.com/hoijui/ReZipDoc/master/scripts/rezipdoc-scripts-tool.sh \
| sh -s remove
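If you would rather review the script before executing it, as the note above suggests, you can download it to a file first. This is a minimal sketch: the function name `review_and_run` and the temporary file are made up for illustration, the URL and arguments are the ones used above, and running the saved file with `sh` is assumed to behave the same as piping it into `sh -s`.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReZipDoc): download the scripts tool,
# show it in a pager, and only run it after explicit confirmation.
review_and_run() {
    url='https://raw.githubusercontent.com/hoijui/ReZipDoc/master/scripts/rezipdoc-scripts-tool.sh'
    script=$(mktemp) || return 1
    # --fail makes curl report an error instead of saving an HTTP error page
    if ! curl --silent --location --fail --output "$script" "$url"; then
        rm -f "$script"
        return 1
    fi
    # Let the user read the script first
    ${PAGER:-less} "$script"
    printf 'Run this script with arguments "%s"? [y/N] ' "$*"
    read -r answer
    status=0
    case "$answer" in
        y|Y) sh "$script" "$@"; status=$? ;;
        *)   echo "Aborted, nothing was run." ;;
    esac
    rm -f "$script"
    return "$status"
}

# Usage, meant as an equivalent of the install command above:
#   review_and_run install --path
```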
Install diff viewer or filter
NOTE
This has to be done once per repo.
This installs the latest release of ReZipDoc into your local git repo.
Make sure you already have installed the helper scripts on your machine.
Switch to the local git repo you want to install this filter to, for example:
cd ~/src/myRepo/
As explained in How to use, you now want to use one of the following:
- Install the diff viewer
  rezipdoc-repo-tool.sh install --diff
- Install the filter
  rezipdoc-repo-tool.sh install --commit --renormalize
- Filter the history & install the filter
  If you filter the repo history, the freshly created, filtered repo will already have the filter installed as above.
To uninstall the diff viewer and/or filter, run:
rezipdoc-repo-tool.sh remove
Install filter manually
Only use this if, for some reason, you cannot use the above.
- Build the JAR
  Run this in bash:
  cd
  mkdir -p src
  cd src
  git clone git@github.com:hoijui/ReZipDoc.git
  cd ReZipDoc
  mvn package
  echo "Created ReZipDoc binary:"
  ls -1 $PWD/target/rezipdoc-*.jar
- Install the JAR
Store rezipdoc-*.jar somewhere locally, either:
- (global) in your home directory, for example under ~/bin/
- (repo - tracked) in your repository, tracked, for example under <repo-root>/tools/
- (repo - local) recommended in your repository, locally only, under <repo-root>/.git/
- Install the Filter(s)
  Execute these lines:
  # Install the add/commit filter
  git config --replace-all filter.reZip.clean "java -cp .git/rezipdoc-*.jar io.github.hoijui.rezipdoc.ReZip --uncompressed"
  # (optionally) Install the checkout filter
  git config --replace-all filter.reZip.smudge "java -cp .git/rezipdoc-*.jar io.github.hoijui.rezipdoc.ReZip --compressed"
  # (optionally) Install the diff filter
  git config --replace-all diff.zipDoc.textconv "java -cp .git/rezipdoc-*.jar io.github.hoijui.rezipdoc.ZipDoc"
- Enable the filters
In one of these files:
- (global) ${HOME}/.gitattributes
- (repo - tracked) <repo-root>/.gitattributes
- (repo - local) recommended <repo-root>/.git/info/attributes
Assign attributes to paths:
# This forces git to treat files as if they were text-based (for example in diffs)
[attr]textual diff merge text
# This makes git re-zip ZIP files uncompressed on commit
# NOTE See the ReZipDoc README for how to install the required git filter
[attr]reZip textual filter=reZip
# This makes git visualize ZIP files as uncompressed text with some meta info
# NOTE See the ReZipDoc README for how to install the required git filter
[attr]zipDoc textual diff=zipDoc
# This combines in-history decompression and uncompressed view of ZIP files
[attr]reZipDoc reZip zipDoc
# MS Office
*.docx reZipDoc
*.xlsx reZipDoc
*.pptx reZipDoc
# OpenOffice
*.odt reZipDoc
*.ods reZipDoc
*.odp reZipDoc
# Misc
*.mcdx reZipDoc
*.slx reZipDoc
# Archives
*.zip reZipDoc
# Java archives
*.jar reZipDoc
# FreeCAD files
*.fcstd reZipDoc
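To check that a manual setup took effect, you can ask git which filter and diff driver it would apply to a given path. This is a minimal sketch using only standard git commands (`git config --get-regexp` and `git check-attr`); the function name `show_rezipdoc_setup` and the example file name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReZipDoc); run it inside the git repo.
show_rezipdoc_setup() {
    # List the configured filter and diff commands, if any
    git config --get-regexp '^(filter\.reZip|diff\.zipDoc)\.' \
        || echo "No reZip/zipDoc entries found in the git config"
    # Show the 'filter' and 'diff' attributes git assigns to each given path
    git check-attr filter diff -- "$@"
}

# Usage (the file does not need to exist):
#   show_rezipdoc_setup some/model.fcstd
```

If the attributes are picked up, the `filter` line for a matching path should name `reZip`; a value of `unspecified` suggests that the pattern or the attributes file is not being read.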
Filter repo history
This always creates a new copy of the repository.
NOTE
This only filters a single branch.
Make sure you have the helper scripts installed and in your PATH.
This filters the master branch of the repo at ~/src/myRepo
into a new local repo ~/src/myRepo_filtered,
using the original commit messages, authors and dates:
rezipdoc-history-filter.sh \
--source ~/src/myRepo \
--branch master \
--orig \
--target ~/src/myRepo_filtered
It also works with an online source:
rezipdoc-history-filter.sh \
--source "https://github.com/case06/ZACplus.git" \
--branch master \
--orig \
--target /tmp/ZACplus_filtered
After doing this, the new, filtered repo will already have the filter installed, so future commits will be filtered.
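To see what the filtering did to the repo size, you can compare the object stores of the original and the filtered repo. This is a minimal sketch using the standard `git count-objects` command, with the paths from the first example above; the function name `repo_size` is made up for illustration. A freshly filtered repo is not necessarily smaller right away, so running `git gc` in both repos first is likely to give a fairer comparison.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReZipDoc):
# print object-store statistics of the repo at the given path.
repo_size() {
    echo "== $1"
    # -v: verbose report, -H: human-readable sizes
    git -C "$1" count-objects -v -H
}

# Usage, with the paths from the example above:
#   repo_size ~/src/myRepo
#   repo_size ~/src/myRepo_filtered
```

The `size-pack` line of the output is the most relevant one, as it reports the space taken by packed (delta-compressed) objects.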
Filtering example
We are going to run a script that filters the repo of the Zinc-Oxide Open Hardware battery (ZAC+) project. The script has a header comment explaining in detail what it does.
In short, it

