Duplicity is a band-width efficient backup utility capable of providing encrypted, digitally signed, versioned, remote backups in a space efficient manner.
Duplicity creates an initial archive that is a full backup. All subsequent backups are incremental and only save the difference between the latest (full or incremental) backup. A full backup and corresponding series of incremental backups can be recovered to any point in time covered by the incremental backups. If an incremental backup is missing from the backup chain then any subsequent incremental backup file cannot be recovered.
Duplicity is released under the terms of the GNU General Public License (GPL), and as such is free software.
If you’re using a major Linux distribution, you should be able to find a pre-compiled package in the repositories. If not, then a tar file is available at Duplicity.
sudo apt-get update
sudo apt-get install duplicity
Because we are going to authenticate against Keystone, it is also necessary to
install python-keystoneclient.
sudo apt-get install python-keystoneclient
or
pip install python-keystoneclient
If you intend to create encrypted backups you will also require a GPG key. The
gpg --gen-key command line tool can create a local one for you, see
(GnuPG) for more information on this.
Duplicity requires certain environment variables to be set. One option would be to source a simple bash script like this. The data for these variables can be obtained from your OpenStack RC file.
#!/bin/bash
# Swift credentials for Duplicity
export SWIFT_USERNAME="somebody@example.org.nz"
export SWIFT_TENANTNAME="mycloudtenant"
export SWIFT_AUTHURL="https://api.nz-por-1.catalystcloud.io:5000/"
export SWIFT_AUTHVERSION="3"
export SWIFT_USER_DOMAIN_NAME="default"
export SWIFT_PROJECT_DOMAIN_NAME="default"
# With Keystone you pass the keystone password.
echo "Please enter your OpenStack Password: "
read -sr PASSWORD_INPUT
export SWIFT_PASSWORD=$PASSWORD_INPUT
In order to source this file, run the following from the command line
source <filename.sh>
This will need be done before each Duplicity run if the variables are not already set.
Firstly, lets check our connectivity to the object store. If we run the following for an existing empty container, in this case ‘first-container’, we should see something like this
$ duplicity collection-status swift://first-container
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
Collection Status
-----------------
Connecting with backend: BackendWrapper
Archive dir: /home/ubuntu/.cache/duplicity/cd3fc2f113a80b76b6xxxxxx7b16aee5
Found 0 secondary backup chains.
No backup chains with active signatures found
No orphaned or incomplete backup sets found.
Now we can run our first backup. For this example we will use a single local file called foo.sh.
Note
if you do not have a valid gpg key you will need to append --no-encryption
to the end of your duplicity commands.
$ duplicity foo.sh swift://first-container
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
GnuPG passphrase for decryption:
Retype passphrase for decryption to confirm:
No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1484012914.11 (Tue Jan 10 01:48:34 2017)
EndTime 1484012914.11 (Tue Jan 10 01:48:34 2017)
ElapsedTime 0.01 (0.01 seconds)
SourceFiles 1
SourceFileSize 44 (44 bytes)
NewFiles 1
NewFileSize 44 (44 bytes)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 1
RawDeltaSize 44 (44 bytes)
TotalDestinationSizeChange 231 (231 bytes)
Errors 0
-------------------------------------------------
We can verify the state of our backups with:
$ duplicity collection-status swift://first-container
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Tue Jan 10 01:48:25 2017
Collection Status
-----------------
Connecting with backend: BackendWrapper
Archive dir: /home/ubuntu/.cache/duplicity/cd3fc2f113a80b76b6xxxxxx7b16aee5
Found 0 secondary backup chains.
Found primary backup chain with matching signature chain:
-------------------------
Chain start time: Tue Jan 10 01:48:25 2017
Chain end time: Tue Jan 10 01:48:25 2017
Number of contained backup sets: 1
Total number of contained volumes: 1
Type of backup set: Time: Num volumes:
Full Tue Jan 10 01:48:25 2017 1
-------------------------
No orphaned or incomplete backup sets found.
and check to see if there are local files that have not yet been backed up by running
duplicity verify swift://first-container .
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Tue Jan 10 01:48:25 2017
GnuPG passphrase for decryption:
Verify complete: 595 files compared, 0 differences found.
Warning
If you wish to back up the root ‘/’ directory, it is advisable to add
--exclude /proc as this may cause Duplicity to crash on the weird stuff
in there.
To have Duplicity really useful as a backup tool, we want to improve the process. This will consist of:
the backup script itself
the variables file to control the backup script and provide authentication information
the cron job to run the backup task
Here is the basic script to manage the running of a Duplicity backup.
Typically, this would be placed somewhere like /usr/local/bin.
#!/bin/bash
# Source SWIFT access variables required by duplicity
source /etc/duplicity/duplicity-vars.sh
BACKUP_DEFINITIONS_DIR="/etc/duplicity/backup_sources.d"
BACKUP_CONFIG="${1}"
if [ -z "${BACKUP_CONFIG}" ]; then
BACKUP_CONFIG='*'
fi
# Run backups defined in BACKUP_DEFINITIONS_DIR or only the one specified as $1
# The BACKUP_* variables need NOT to be double-quoted for the shell name expansion to work
for BACKUP_DEFINITION_FILE in ${BACKUP_DEFINITIONS_DIR}/${BACKUP_CONFIG}.conf
do
# Make sure we don't have any leftover variables set before next loop run
unset SRC
unset DEST
unset PRE_BACKUP_CMD
unset POST_BACKUP_CMD
unset DUPLICITY_BACKUP_RETENTION
unset DUPLICITY_BACKUP_CYCLE
unset DUPLICITY_VOLSIZE
unset DUPLICITY_NUM_RETRIES
# Source variables used on each loop run
if [ ! -f "${BACKUP_DEFINITION_FILE}" ]; then
INFO="No backups defined in ${BACKUP_DEFINITIONS_DIR}/ or ${BACKUP_DEFINITION_FILE} is not a file"
echo $INFO
continue
fi
# Source the main config file again as we overwrite some variables in backup definitions
source /etc/duplicity/duplicity.vars
source "${BACKUP_DEFINITION_FILE}"
# Check if the src and dest backup vars are not empty
if [ ! -z "${SRC}" ] && [ ! -z "${DEST}" ]; then
# Run defined tasks before doing the backup
if [ ! -z "${PRE_BACKUP_CMD}" ]; then
eval "${PRE_BACKUP_CMD}"
rc=$?
if [ ${rc} -gt 0 ]
then
# Error handling
INFO="Pre backup command failed with rc = ${rc}"
echo $INFO
continue
fi
fi
# Run backup
duplicity --verbosity Notice \
--full-if-older-than ${DUPLICITY_BACKUP_CYCLE} \
--num-retries ${DUPLICITY_NUM_RETRIES} \
--asynchronous-upload \
--no-encryption \
--volsize ${DUPLICITY_VOLSIZE} \
"${SRC}" "${DEST}"
rc=$?
if [ ${rc} -gt 0 ]
then
# Error handling
INFO="Backup failed with rc = ${rc}"
echo $INFO
continue
fi
# Duplicity cleanups
duplicity remove-older-than ${DUPLICITY_BACKUP_RETENTION} --verbosity notice --force "${DEST}"
rc=$?
if [ ${rc} -gt 0 ]
then
# Error handling
INFO="Deleting old backups failed with rc = ${rc}"
echo $INFO
continue
fi
# Duplicity collection status summary
duplicity collection-status "${DEST}"
rc=$?
if [ ${rc} -gt 0 ]
then
# Error handling
INFO="Collection status failed with rc = ${rc}"
echo $INFO
continue
fi
# Run a command after doing the backup
if [ ! -z "${POST_BACKUP_CMD}" ]; then
eval "${POST_BACKUP_CMD}"
rc=$?
if [ ${rc} -gt 0 ]
then
# Error handling
INFO="Post backup command failed with rc = ${rc}"
echo $INFO
continue
fi
fi
else
INFO="No backup source or destination defined in ${BACKUP_DEFINITION_FILE}"
echo $INFO
continue
fi
# If the script managed to reach this point all backup steps succeeded so we can report that to icinga
INFO="Backup succeeded"
echo $INFO
done
This script defines the control parameters such as retention and frequency for
the backup tasks as well as providing authentication information for object
storage. The previous script is expecting to find this in
/etc/duplicity/duplicity-vars.sh.
#!/bin/bash
# Variables used by the backup script
# Duplicity specific variables
export DUPLICITY_BACKUP_CYCLE='7D' #7 days
export DUPLICITY_BACKUP_RETENTION='14D' #14 days
export DUPLICITY_VOLSIZE='512' #object chunk size in bytes
export DUPLICITY_NUM_RETRIES='3'
# Catalyst Cloud object storage credential information
export SWIFT_USERNAME='<your-backup-user>@<your-project-name>'
export SWIFT_REGIONNAME='nz-por-1'
export SWIFT_TENANTNAME='<your-project-name>'
export SWIFT_PASSWORD='<your-openrc-password>'
export SWIFT_AUTHURL='https://api.nz-por-1.catalystcloud.io:5000/'
export SWIFT_AUTHVERSION='3'
export SWIFT_USER_DOMAIN_NAME="default"
export SWIFT_PROJECT_DOMAIN_NAME="default"
Then we need to define the backup definitions. Create a file with a name
relevant to the backup task in /etc/duplicity/backup_sources.d and add at
least the following two entries
SRC="/path/to/files/"
DEST="swift://<container-name>"
Depending on the nature of the thing you wish to back up, you may also need to include pre-backup commands such as the one shown below. This is to ensure that the data you wish to capture, in this case the contents of a gitlab repository, have been written to disk prior to the backup task running. Another example is taking database dumps.
PRE_BACKUP_CMD="CRON=1 /opt/gitlab/bin/gitlab-rake gitlab:backup:create"
Finally you’ll create a new file called duplicity-backup-cron
in /etc/cron.d/. This is the cron job that will be responsible for running the
backups. See (cron) for more information on this.
35 2 * * * root /usr/local/bin/duplicity-backup.sh >> /var/log/duplicity.log 2>&1