backup

immutable backups so simple that they are unborkable

github.com/nathants/backup

why

backups should be simple and easy.

how

easily create immutable, trustless backups with revision history, compression, and file deduplication.

what

  • the index, tracked in git, contains filesystem metadata.

  • the index is a sorted tsv file of: path, tarball, hash, size, mode

  • for every line of metadata in the index, there is one and only one tarball containing a file with that hash.

  • duplicate files, by blake2b hash, are never stored.

  • the index is encrypted with git-remote-gcrypt.

  • the tarballs are split into chunks, compressed with lz4, then encrypted with gpg.

  • all remote storage is handled via rclone on any backend it supports.

  • the ignore file, tracked in git, contains one regex per line of file paths to ignore.

  • a clean restore will clone the git repo, checkout a revision, select file paths by regex, gather needed tarball names, fetch tarballs from storage, and extract the selected files.
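
the index lookup behind a restore can be pictured with a minimal python sketch. it is an illustration, not the tool's actual code. it assumes the index columns appear in the order listed above, that size is an integer, and that paths are matched with a regex search. the names Entry, parse_index, tarballs_for, and the sample index contents are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    path: str
    tarball: str
    hash: str
    size: int
    mode: str


def parse_index(text: str) -> List[Entry]:
    """Parse index text: one tab-separated record per line."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        path, tarball, digest, size, mode = line.split("\t")
        entries.append(Entry(path, tarball, digest, int(size), mode))
    return entries


def tarballs_for(entries: List[Entry], pattern: str) -> Tuple[List[Entry], List[str]]:
    """Select entries whose path matches the regex, plus the tarballs holding them."""
    regex = re.compile(pattern)
    selected = [e for e in entries if regex.search(e.path)]
    return selected, sorted({e.tarball for e in selected})


# hypothetical index contents, for illustration only
SAMPLE_INDEX = (
    "docs/a.txt\ttarball.0\thash-a\t12\t644\n"
    "docs/b.txt\ttarball.0\thash-b\t34\t644\n"
    "photos/c.jpg\ttarball.1\thash-c\t5600\t644\n"
)

entries = parse_index(SAMPLE_INDEX)
selected, needed = tarballs_for(entries, r"^docs/")
print([e.path for e in selected])  # ['docs/a.txt', 'docs/b.txt']
print(needed)                      # ['tarball.0']
```

only the tarballs named in the result would need to be fetched from storage, which is why a restore of a few files does not have to download the whole backup.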

usage

  • backup-add - scan the filesystem for changes.
  • backup-diff - inspect the uncommitted backup diff.
  • backup-ignore - if needed, edit the ignore regexes, then return to backup-add.
  • backup-commit - commit the backup diff to remote storage.
  • backup-find - search for files in the index by regex at revision.
  • backup-restore - restore files from remote storage by regex at revision.
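
the ignore step in the workflow above filters paths with one regex per line. a minimal python sketch of that kind of filter follows. it assumes each pattern is applied as a search anywhere in the path and uses python regex syntax, which may differ from what the tool itself uses. load_ignore_patterns, filter_ignored, and the sample patterns are hypothetical.

```python
import re


def load_ignore_patterns(text):
    """Compile one regex per non-empty line of the ignore file text."""
    return [re.compile(line) for line in text.splitlines() if line.strip()]


def filter_ignored(paths, patterns):
    """Keep only the paths that match none of the ignore patterns."""
    return [p for p in paths if not any(rx.search(p) for rx in patterns)]


# hypothetical ignore file contents, for illustration only
IGNORE_TEXT = "\n".join([r"^\.cache/", r"\.pyc$"]) + "\n"

paths = [".cache/pip/x", "src/main.py", "src/main.pyc", "notes.txt"]
kept = filter_ignored(paths, load_ignore_patterns(IGNORE_TEXT))
print(kept)  # ['src/main.py', 'notes.txt']
```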

dependencies

  • awk
  • bash
  • cat
  • git
  • git-remote-gcrypt
  • gpg
  • grep
  • lz4
  • python3
  • rclone

installation

  • put bin/ on $PATH

or

  • sudo mv bin/* /usr/local/bin

setup

  • add some environment variables to your bashrc:

    export BACKUP_ROOT=~ - root directory to backup

    export BACKUP_RCLONE_REMOTE=$REMOTE - a remote set up with rclone config

    export BACKUP_DESTINATION=$BUCKET/backups/$(hostname) - path on the remote where rclone stores the data

    export BACKUP_CHUNK_MEGABYTES=100 - approximate size of each tarball before compression

  • have a gpg key and a gpg.conf that looks like the following:

    >> cat ~/.gnupg/gpg.conf
    
    default-key YOUR@EMAIL.COM
    default-recipient YOUR@EMAIL.COM
    
    personal-cipher-preferences AES256
    personal-digest-preferences SHA512
    personal-compress-preferences Uncompressed
    default-preference-list SHA512 AES256 Uncompressed
    cert-digest-algo SHA512
    s2k-cipher-algo AES256
    s2k-digest-algo SHA512
    s2k-mode 3
    s2k-count 65011712
    disable-cipher-algo 3DES
    weak-digest SHA1
    force-mdc
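
BACKUP_CHUNK_MEGABYTES sets only an approximate tarball size. one way to picture its effect, together with blake2b deduplication, is the python sketch below. it assumes a greedy packing that closes a chunk once it reaches the limit, which may not be how the tool actually groups files. blake2b_hex and plan_chunks are hypothetical names, and the default 64-byte digest size is an assumption.

```python
import hashlib


def blake2b_hex(data: bytes) -> str:
    """Hex blake2b digest of the data (default digest size, an assumption)."""
    return hashlib.blake2b(data).hexdigest()


def plan_chunks(files, known_hashes, chunk_megabytes):
    """Group new file contents into chunks of roughly chunk_megabytes each.

    files maps path -> bytes. content whose hash is already known, or that
    repeats within this run, is skipped so it is never stored twice.
    """
    limit = chunk_megabytes * 1024 * 1024
    seen = set(known_hashes)
    chunks, current, current_size = [], [], 0
    for path in sorted(files):
        data = files[path]
        digest = blake2b_hex(data)
        if digest in seen:
            continue  # duplicate content, already stored
        seen.add(digest)
        current.append(path)
        current_size += len(data)
        if current_size >= limit:
            # close the chunk once it reaches the limit
            chunks.append(current)
            current, current_size = [], 0
    if current:
        chunks.append(current)
    return chunks


# hypothetical file contents, for illustration only
files = {
    "a": b"x" * 700_000,
    "b": b"y" * 700_000,
    "c": b"x" * 700_000,  # same content as "a"
    "d": b"z" * 10,
}
chunks = plan_chunks(files, known_hashes=set(), chunk_megabytes=1)
print(chunks)  # [['a', 'b'], ['d']]
```

because a chunk closes only after the limit is reached, a chunk can exceed the configured size by up to one file, which fits the "approximate" wording above.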
    

api

modify backup state:

  • backup-add - scan the filesystem for changes
  • backup-commit - commit the backup diff to remote storage
  • backup-ignore - edit the ignore file in $EDITOR
  • backup-reset - clear uncommitted backup state

view backup state:

  • backup-additions-sizes - show large files in the uncommitted backup diff
  • backup-additions - inspect the uncommitted backup diff, additions only
  • backup-diff - inspect the uncommitted backup diff
  • backup-find - find files by regex at revision
  • backup-index - view the backup index
  • backup-log - view the git log

restore backup content:

  • backup-restore - restore files from remote storage by regex at revision

test

export BACKUP_TEST_RCLONE_REMOTE=$REMOTE
export BACKUP_TEST_DESTINATION=$BUCKET/test
tox