hsynz
hsynz is a library for delta updates using a sync algorithm, similar to zsync.
It works like rsync over http(s): the sync algorithm runs on the client side, and the server side only needs an http(s) CDN. It supports the zstd, libdeflate, and zlib compressors, large files and directories (folders), and multi-threading.
hsynz defines its own file formats (.hsyni and .hsynz files). The library is also compatible with the zsync file format: it can both create and apply .zsync files and their .gz files.
Recommended scenarios: there is a very large number of old versions, or the old versions are not available (not saved, or modified, etc.), so the deltas cannot all be calculated in advance.
The server runs hsync_make once on the latest version of the data. This generates a summary info file (hsyni) that describes the new data block by block, and optionally compresses the new data block by block to produce the release file (hsynz). If the data is not compressed, the original new file itself is equivalent to the hsynz file.
The client first downloads the hsyni file from the server or from another user's share. Using its old version, it calculates which updated blocks it needs to download, and uses the information in hsyni to find where those blocks are located in hsynz. It then selects a communication method, downloads the blocks on demand from the server's hsynz file, and merges them with the existing local data to get the latest version of the data.
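The server and client steps above can be sketched roughly as follows. This is a simplified illustration and not hsynz's implementation: it uses md5 (one of the checksum types hsync_make lists) from the Python standard library, matches old blocks only at block-aligned offsets instead of using a rolling hash at every offset, and skips compression. The names `block_checksums`, `plan_sync`, `apply_plan`, and `fetch_block` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 2048  # mirrors the documented default -s-2k


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Server side: one strong checksum per block of the new data (the 'hsyni' idea)."""
    return [hashlib.md5(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def plan_sync(old: bytes, new_sums: list, block_size: int = BLOCK_SIZE) -> list:
    """Client side: for each new block, either copy it from old data or download it."""
    # Simplification: only block-aligned offsets of the old data are indexed.
    old_index = {}
    for pos in range(0, len(old), block_size):
        digest = hashlib.md5(old[pos:pos + block_size]).digest()
        old_index.setdefault(digest, pos)
    plan = []
    for block_no, digest in enumerate(new_sums):
        if digest in old_index:
            plan.append(("copy", old_index[digest]))   # offset in old data
        else:
            plan.append(("download", block_no))        # block index in new data
    return plan


def apply_plan(old: bytes, plan: list, fetch_block, block_size: int = BLOCK_SIZE) -> bytes:
    """Merge local blocks with downloaded blocks to rebuild the new data."""
    out = bytearray()
    for op, arg in plan:
        if op == "copy":
            out += old[arg:arg + block_size]
        else:
            out += fetch_block(arg)  # hypothetical callback that downloads one block
    return bytes(out)
```

In this sketch only the blocks marked `download` cross the network; everything else is reused from the old version.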
hsync_demo provides a test client demo for local file testing.
hsync_http provides a download client demo with http(s) support, for sync updates from a server that provides an http(s) file download service (e.g. a CDN that supports HTTP/1.1 multi-range requests).
Tip: You can also customise other communication methods for sync.
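As an illustration of what a multi-range request looks like, here is a minimal sketch that groups the byte ranges a client needs into HTTP `Range` header values. The batch size plays the role of the `-r-stepRangeNumber` option of hsync_http (default 32); the function name `range_headers` and the input format are assumptions made for this example, not part of hsynz.

```python
def range_headers(ranges, step_range_number=32):
    """Group (start, end_inclusive) byte ranges into HTTP Range header values.

    Each returned string can be sent as the value of a 'Range' request header;
    at most step_range_number ranges go into a single request.
    """
    if step_range_number < 1:
        raise ValueError("step_range_number must be >= 1")
    headers = []
    for i in range(0, len(ranges), step_range_number):
        batch = ranges[i:i + step_range_number]
        headers.append("bytes=" + ",".join(f"{start}-{end}" for start, end in batch))
    return headers


# Three needed ranges, at most two per request:
print(range_headers([(0, 99), (200, 299), (500, 511)], step_range_number=2))
# ['bytes=0-99,200-299', 'bytes=500-511']
```

With a batch size of 1, every request carries a single range, which is what a server without multi-range support requires.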
Additionally, if you have the new file locally but not the old file, and you can get a hash certificate file (.hsyni) of the old file, you can also create an hpatchz format patch file (a usage scenario similar to [rsync]); see the demo command line hsign_diff.
Compare with zsync
- In addition to files, directories (folders) are also supported as source and target.
- In addition to release packages compressed by zlib, the libdeflate & zstd compressors are also supported, providing a better compression ratio, i.e. a smaller downloaded patch package.
- The server-side make supports multi-threaded parallel acceleration.
- The client-side diff speed has been optimized, and it also supports multi-threaded parallel acceleration.
Releases/Binaries
Download from the latest release: command line apps for Windows, Linux, MacOS; and .so lib for Android.
(release files are built by the projects in path hsynz/builds)
Build it yourself
Linux or MacOS X
$ cd <dir>
$ git clone --recursive https://github.com/sisong/hsynz.git
$ cd hsynz
$ make
Windows
$ cd <dir>
$ git clone --recursive https://github.com/sisong/hsynz.git
build hsynz/builds/vc/hsynz.sln with Visual Studio
libhsynz.so for Android
- install Android NDK
- $ cd <dir>/hsynz/builds/android_ndk_jni_mk
- $ build_libs_static.sh (or $ build_libs_static.bat on windows, then got *.so files)
- import file com/github/sisong/hsynz.java (from hsynz/builds/android_ndk_jni_mk/java/) & .so files, java code can call the sync patch function in libhsynz.so
hsync_make command line usage:
hsync_make: [options] newDataPath out_hsyni_file [out_hsynz_file]
newDataPath can be file or directory(folder),
if newDataPath is a file & no -c-... option, out_hsynz_file can be empty.
options:
-s-matchBlockSize
matchBlockSize>=128, DEFAULT -s-2k, recommended 1024,4k,...
-b-safeBit
set allow patch fail hash clash probability: 1/2^safeBit;
safeBit>=14, DEFAULT -b-24, recommended 20,32...
-p-parallelThreadNumber
DEFAULT -p-4;
if parallelThreadNumber>1 then open multi-thread Parallel mode;
-c-compressType[-compressLevel]
set out_hsynz_file Compress type & level, DEFAULT uncompress;
support compress type & level:
-c-zlib[-{1..9}[-dictBits]] DEFAULT level 9
dictBits can 9--15, DEFAULT 15.
-c-gzip[-{1..9}[-dictBits]] DEFAULT level 9
dictBits can 9--15, DEFAULT 15.
compress by zlib, out_hsynz_file is .gz file format.
-c-ldef[-{1..12}] DEFAULT level 12 (dictBits always 15).
compress by libdeflate, compatible with zlib's deflate encoding.
-c-lgzip[-{1..12}] DEFAULT level 12 (dictBits always 15)
compress by libdeflate, out_hsynz_file is .gz file format.
-c-zstd[-{10..22}[-dictBits]] DEFAULT level 21
dictBits can 15--30, DEFAULT 24.
-C-checksumType
set strong Checksum type for block data, DEFAULT -C-xxh128;
support checksum type:
-C-xxh128
-C-md5
-C-sha512
-C-sha256
-C-crc32
WARNING: crc32 is not strong & secure enough!
-zsync[#KeY#=...#ValuE#=...[#KeY#=...#ValuE#=...]]
create out_hsyni_file(.zsync file format) or out_hsynz_file(.gz file
format) compatible with zsync.
checksum default used sha1 & md4, not need set -C
-s-matchBlockSize size must 2^N; if used -c-gzip or -c-lgzip (out
.gz file), recommend set 4k at most, >=8k may fail.
key-value string pairs will be write in out_hsyni_file; if needed,
you can set Filename,Z-Filename,URL,Z-URL,MTime,Recompress,...
zsync project https://zsync.moria.org.uk
-n-maxOpenFileNumber
limit Number of open files at same time when newDataPath is directory;
maxOpenFileNumber>=8, DEFAULT -n-48, the best limit value by different
operating system.
-g#ignorePath[#ignorePath#...]
set iGnore path list in newDataPath directory; ignore path list such as:
#.DS_Store#desktop.ini#*thumbs*.db#.git*#.svn/#cache_*/00*11/*.tmp
# means separator between names; (if char # in name, need write #: )
* means can match any chars in name; (if char * in name, need write *: );
/ at the end of name means must match directory;
-f Force overwrite, ignore write path already exists;
DEFAULT (no -f) not overwrite and then return error;
if used -f and write path is exist directory, will always return error.
--patch
swap to hsync_demo mode.
-v output Version info.
-h or -?
output Help info (this usage).
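To make the option syntax above concrete, here is a small sketch that assembles an hsync_make argument list from the documented flags and checks two documented constraints (safeBit>=14, and out_hsynz_file being needed when a -c-... option is used). The helper names `hsync_make_args` and `patch_fail_probability` are hypothetical; only the flag spellings and defaults come from the usage text.

```python
def hsync_make_args(new_data_path, out_hsyni, out_hsynz=None, *,
                    match_block_size="2k", safe_bit=24, threads=4,
                    compress=None, checksum="xxh128", force=False):
    """Build an argument list following: hsync_make [options] newDataPath out_hsyni_file [out_hsynz_file]."""
    if safe_bit < 14:
        raise ValueError("safeBit must be >= 14")
    if compress and out_hsynz is None:
        # out_hsynz_file may only be left out when no -c-... option is used
        raise ValueError("out_hsynz_file is required with a -c-... option")
    args = ["hsync_make",
            f"-s-{match_block_size}",   # e.g. -s-2k
            f"-b-{safe_bit}",           # e.g. -b-24
            f"-p-{threads}",            # e.g. -p-4
            f"-C-{checksum}"]           # e.g. -C-xxh128
    if compress:
        args.append(f"-c-{compress}")   # e.g. -c-zstd-21
    if force:
        args.append("-f")
    args += [new_data_path, out_hsyni]
    if out_hsynz is not None:
        args.append(out_hsynz)
    return args


def patch_fail_probability(safe_bit):
    """Hash clash probability allowed by -b-safeBit: 1/2^safeBit."""
    return 2.0 ** -safe_bit


print(hsync_make_args("new.bin", "new.hsyni", "new.hsynz", compress="zstd-21"))
print(patch_fail_probability(24))  # 1/16777216
```

The resulting list could be passed to `subprocess.run` if the hsync_make binary is on the PATH.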
hsync_http command line usage:
download : [options] -dl#hsyni_file_url hsyni_file
local diff: [options] oldPath hsyni_file hsynz_file_url -diff#outDiffFile
local patch: [options] oldPath hsyni_file -patch#diffFile outNewPath
sync infos: [options] oldPath hsyni_file [-diffi#cacheTempFile]
sync patch: [options] oldPath [-dl#hsyni_file_url] hsyni_file hsynz_file_url [-diffi#cacheTempFile] outNewPath
oldPath can be file or directory(folder),
if oldPath is empty, pass "" as the input parameter.
options:
-dl#hsyni_file_url
download hsyni_file from hsyni_file_url before sync patch;
-diff#outDiffFile
create diffFile from ranges of hsynz_file_url before local patch;
-diffi#cacheTempFile
saving diffInfo to cache file for optimize speed when continue sync patch;
-patch#diffFile
local patch(oldPath+diffFile) to outNewPath;
-U
set hsynz_file_url is the original file before compress, please ignore the
compress info in hsyni_file and directly access the data of hsynz file
without decompress.
-cdl-{0|1} or -cdl-{off|on}
continue download data from breakpoint;
DEFAULT -cdl-1 opened, need set -cdl-0 or -cdl-off to close continue mode;
-rdl-retryDownloadNumber
number of auto retry connection, when network disconnected while downloading;
DEFAULT -rdl-0 retry closed; recommended 5,1k,1g,...
-r-stepRangeNumber
DEFAULT -r-32, recommended 16,20,...
limit the maximum number of .hsynz data ranges that can be downloaded
in a single request step;
if http(s) server not support multi-ranges request, must set -r-1
-p-parallelThreadNumber
DEFAULT -p-4;
if parallelThreadNumber>1 then open multi-thread Parallel mode;
NOTE: now download data always used single-thread.
-n-maxOpenFileNumber
limit Number of open files at same time when oldPath is directory;
maxOpenFileNumber>=8, DEFAULT -n-24, the best limit value by different
operating system.
-g#ignorePath[#ignorePath#...]
set iGnore path list in oldPath directory; ignore path list such as:
#.DS_Store#desktop.ini#*thumbs*.db#.git*#.svn/#cache_*/00*11/*.tmp
# means separator between names; (if char # in name, need write #: )
* means can match any chars in name; (if char * in name, need write *: );
/ at the end of name means must match directory;
-f Force overwrite, ignore write path already exists;
DEFAULT (no -f) not overwrite and then return error;
not support oldPath outNewPath same path!
if used -f and outNewPath is exist file:
if patch
