Myhoard
MySQL Backup and Point-in-time Recovery service
MyHoard

MyHoard is a daemon for creating, managing and restoring MySQL backups. The backup data can be stored in any of the supported cloud object storages. It is functionally similar to pghoard backup daemon for PostgreSQL.
Features
- Automatic periodic full backup
- Automatic binary log backup in near real-time
- Cloud object storage support (AWS S3, Google Cloud Storage, Azure)
- Encryption and compression
- Backup restoration from object storage
- Point-in-time-recovery (PITR)
- Automatic backup history cleanup based on number of backups and/or backup age
- Purging local binary logs once they're backed up and not needed by other MySQL servers (requires external system to provide executed GTID info for the standby servers)
- Almost no extra local disk space requirements for creating and restoring backups
- Incremental backups
Fault-resilience and monitoring:
- Handles temporary object storage connectivity issues by retrying all operations
- Metrics via statsd using Telegraf® tag extensions
- Unexpected exception reporting via Sentry
- State reporting via HTTP API
- Full internal state stored on local file system to cope with process and server restarts
Overview
There are a number of existing tools and scripts for managing MySQL backups, so why have yet another tool? As far as taking a full (or incremental) snapshot of MySQL goes, Percona XtraBackup does a very good job and is in fact what MyHoard uses internally as well. Where things usually get more complicated is when you want to back up and restore binary logs so that you can do point-in-time recovery and reduce the data loss window. Also, as good as Percona XtraBackup is for taking and restoring the backup, you still need all sorts of scripts and timers added around it to actually execute it, and if anything goes wrong, e.g. because of network issues, it's up to you to retry.
Often binary log backup is based on just uploading the binary log files using some simple scheduled file copying mechanism, and restoring them is left as an afterthought, usually consisting of "download all the binlogs and then use mysqlbinlog to replay them". Besides lacking the automation needed to make it repeatable and safe, this approach does not work in some cases: for binary log restoration with mysqlbinlog to be safe, you need to have all binary logs on local disk. For change-heavy environments this may be much more than the size of the actual database, and if the server disk is sized based on the database size, the binary logs may simply not fit on the disk.
MyHoard uses an alternative approach for binary log restoration, which is based
on presenting the backed up binary logs as relay logs in batches via direct
relay log index manipulation and having the regular SQL slave thread apply them
as if they were replicated from another MySQL server. This allows applying them
in batches so there's very little extra disk space required during restoration
and this would also allow applying them in parallel (though that requires more
work: there are currently known issues with using a slave-parallel-workers
value other than 0, i.e. multithreading must currently be disabled).
Existing tooling also doesn't pay much attention to real-life HA environments and failovers, where backup responsibilities need to be switched from one server to another while still producing an uninterrupted sequence of backed-up transactions that can be restored to any point in time, including the time around the failover. This requires something much more sophisticated than just blindly uploading all local binary logs.
MyHoard aims to provide a single solution daemon that takes care of all of your MySQL backup and restore needs. It handles creating, managing and restoring backups in multi-node setups where master nodes may frequently be going away (either because of rolling forward updates or actual server failure). You just need to create a fairly simple configuration file, start the systemd service on the master and any standby servers, and make one or two HTTP requests to get the daemon into the correct state; it will then start automatically doing the right things.
Basic usage
On the very first master, after you've initialized the MySQL database and started up MyHoard, you'd do this:
curl -XPUT -H "Content-Type: application/json" -d '{"mode": "active"}' \
http://localhost:16001/status
This tells MyHoard to switch to active mode where it starts backing up data on this server. If there are no existing backups it will immediately create the first one.
On a new standby server you'd first install MySQL and MyHoard but not start or
initialize MySQL (i.e. don't do mysqld --initialize). After starting the
MyHoard service you'd do this:
curl http://localhost:16001/backup # lists all available backups
curl -XPUT -H "Content-Type: application/json" \
-d '{"mode": "restore", "site": "mybackups", "stream_id": "backup_id", "target_time": null}' \
http://localhost:16001/status
This tells MyHoard to fetch the given backup, restore it, start the MySQL
server once finished, and switch to observe mode where it keeps on observing
what backups are available and what transactions have been backed up but
doesn't do any backups itself. Because binary logging is expected to be
enabled also on the standby server MyHoard does take care of purging any local
binary logs that contain only transactions that have been backed up. If you
wanted to restore to a specific point in time you'd just give a timestamp like
"2019-05-22T11:19:02Z" and restoration will be performed up until the last
transaction before the target time.
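As a rough illustration of the restore request with a point-in-time target, the sketch below builds the same PUT request as the curl command above from Python. The helper name build_restore_request is hypothetical and not part of MyHoard; the URL, field names, and timestamp format are taken from the examples in this section, and the request is only constructed, not sent.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request

# Address used by the curl examples in this README; adjust for your setup.
STATUS_URL = "http://localhost:16001/status"


def build_restore_request(site, stream_id, target_time=None, url=STATUS_URL):
    """Build (but do not send) the PUT request asking MyHoard to restore a backup.

    Hypothetical helper for illustration only. target_time is either None
    (restore everything available) or a timezone-aware datetime, which is
    rendered in the "2019-05-22T11:19:02Z" style shown in the text.
    """
    if target_time is not None:
        if target_time.tzinfo is None:
            raise ValueError("target_time must be timezone-aware")
        target_time = target_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    body = {
        "mode": "restore",
        "site": site,
        "stream_id": stream_id,
        "target_time": target_time,
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to urllib.request.urlopen would send it to a running MyHoard instance. The site and stream_id values should come from the backup listing shown earlier.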
If the master server fails for any reason you'd do this on one of the standby servers:
curl -XPUT -H "Content-Type: application/json" -d '{"mode": "promote"}' \
http://localhost:16001/status
This updates the object storage to indicate this server is now the master and
any updates from the old master should be ignored by any other MyHoard
instances. (The old master could still be alive at this point but e.g.
responding so slowly that it is considered to be unavailable yet it might be
able to accept writes and back those up before going totally away and those
transactions must be ignored when restoring backups in the future because they
have not been replicated to the new master server.) After the initial object
storage state update is complete MyHoard switches itself to active mode and
resumes uploading binary logs to the currently active backup stream starting
from the first binary log that contains transactions that have not yet been
backed up.
Requirements
MyHoard requires Python 3.10 or later and some additional components to operate:
Currently MyHoard only works on Linux and expects MySQL service to be managed via systemd.
MyHoard requires MySQL to be used and configured in a specific manner in order for it to work properly:
- Single writable master, N read only standbys
- Binary logging enabled both on master and on standbys
- binlog_format set to ROW
- Global transaction identifiers (GTIDs) enabled
- Use of only InnoDB databases
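Some of these requirements can be checked mechanically. The following is a minimal sketch, assuming the server's global variables have already been read into a dict (for example from SHOW GLOBAL VARIABLES). The function name find_setting_problems and the REQUIRED table are hypothetical and not part of MyHoard; the sketch covers only binary logging, binlog_format and GTID mode, not the topology or storage engine requirements.

```python
# MySQL system variables and the values the requirements above call for.
# This table is an illustration, not MyHoard's own validation logic.
REQUIRED = {
    "log_bin": "ON",         # binary logging enabled
    "binlog_format": "ROW",  # row-based binary logging
    "gtid_mode": "ON",       # global transaction identifiers enabled
}


def find_setting_problems(variables):
    """Return a human-readable problem for each required setting that does not match."""
    problems = []
    for name, expected in REQUIRED.items():
        actual = str(variables.get(name, "<unset>")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual}, expected {expected}")
    return problems
```

An empty list means the three checked settings match. The same check would need to be run on the master and on every standby.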
Configuration options
myhoard.json has an example configuration that shows the structure of the
config file and has reasonable default values for many of the settings. Below
is the full list of settings and the effect of each.
backup_settings.backup_age_days_max
Maximum age of backups. Any backup that has been closed (marked as final with no more binary logs being uploaded to it) more than this number of days ago will be deleted from storage, unless total number of backups is below the minimum number of backups.
backup_settings.backup_count_max
Maximum number of backups to keep. Because new backups can be requested
manually it is possible to end up with a large number of backups. If the total
number goes above this, backups will be deleted even if they are not older than
backup_age_days_max days.
backup_settings.backup_count_min
Minimum number of backups to keep. If for example the server is powered off and then back on a month later, all existing backups would be very old. However, in that case it is usually not desirable to immediately delete all old backups. This setting allows specifying a minimum number of backups that should always be preserved regardless of their age.
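The interaction between the three retention settings can be sketched as follows. This is one interpretation of the descriptions above, not MyHoard's actual cleanup code: the function name backups_to_delete is hypothetical, every backup is assumed to be closed, and each backup is represented only by its closing time.

```python
from datetime import datetime, timedelta, timezone


def backups_to_delete(closed_at_times, now, age_days_max, count_max, count_min):
    """Return the closing times of backups a retention pass would drop, oldest first.

    Illustrative only: a backup is dropped while more than count_min remain
    and either the total exceeds count_max or the oldest one is older than
    age_days_max days.
    """
    remaining = sorted(closed_at_times)  # oldest first
    cutoff = now - timedelta(days=age_days_max)
    doomed = []
    while len(remaining) > count_min:
        oldest = remaining[0]
        if len(remaining) > count_max or oldest < cutoff:
            doomed.append(remaining.pop(0))
        else:
            break
    return doomed
```

For example, with backups closed 40, 20, 5 and 1 days ago, a maximum age of 14 days and a minimum count of 3, only the 40-day-old backup is dropped: the 20-day-old one is past the age limit but is kept because removing it would leave fewer than three backups.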
backup_hour
The hour of day at which to take new full backup. If backup interval is less than 24 hours this is used as base for calculating the backup times. E.g. if backup interval was 6 hours and backup hour was 4, backups would be taken at hours 4, 10, 16 and 22.
backup_minute
The minute of hour at which to take new full backup.
backup_interval_minutes
The interval in minutes at which to take new backups. Individual binary logs are backed up as soon as they're created so there's usually no need to have very frequent full backups. Note: If this value is not a factor of 1440 (the number of minutes in one day), the backup_hour and backup_minute settings cannot be changed once the first backup has been taken, because a cycle that does not divide evenly into a day means the hour and minute of the backup will not be the same each day.
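To make the relationship between backup_hour, backup_minute and backup_interval_minutes concrete, here is a small sketch that lists the times of day at which backups would fall when the interval divides evenly into a day. The function name daily_backup_times is hypothetical and this is not MyHoard's scheduling code, only the arithmetic described above.

```python
def daily_backup_times(backup_hour, backup_minute, backup_interval_minutes):
    """Return the "HH:MM" times of day at which full backups would be taken.

    Illustrative only. Requires the interval to be a factor of 1440 minutes,
    since otherwise the times drift from one day to the next.
    """
    minutes_per_day = 24 * 60
    if minutes_per_day % backup_interval_minutes != 0:
        raise ValueError("interval is not a factor of 1440, so times differ from day to day")
    base = backup_hour * 60 + backup_minute
    slots = sorted(
        (base + i * backup_interval_minutes) % minutes_per_day
        for i in range(minutes_per_day // backup_interval_minutes)
    )
    return [f"{m // 60:02d}:{m % 60:02d}" for m in slots]


print(daily_backup_times(4, 0, 360))  # ['04:00', '10:00', '16:00', '22:00']
```

This reproduces the example given for backup_hour: a 6 hour interval with backup hour 4 yields backups at hours 4, 10, 16 and 22.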
forced_binlog_rotation_interval
How frequently, in seconds, to force creation of new binary log if one hasn't been created otherwise. Th
