*Warning*: this has not been tested yet.

Again, TL;DR version at the end.

They say that backing up in C* is really easy: you just run `nodetool
snapshot`, which only creates a hardlink for each data file somewhere
else in the filesystem, and then you just back up those hardlinks.
Optionally, when you're done, you simply remove them and that's it.

But that's only half of the story. The other half is taking those
snapshots and storing them somewhere else; let's say, a backup server,
so you can restore the data even in case of spontaneous combustion
followed by explosion due to short circuits caused by your dog peeing on
the machine. Not that that happens a lot in a datacenter, but one has to
plan for any contingency, right?

In our case we use [Amanda](http://wiki.zmanda.com/), which internally
uses an implementation of `tar` or GNU tar if asked for (yes, also other
tools if asked). The problems begin with how you define what to back up
and where C* puts those snapshots. The definitions are done by what
Amanda calls disklists, which are basically lists of directories to
back up entirely. On the other hand, for a column family Bar in a
keyspace Foo, whose data are normally stored in
`<data_file_directory>/Foo/Bar/`, a snapshot is located in
`<data_file_directory>/Foo/Bar/snapshots/<something>`, where something
can be a timestamp or a name defined by the user at snapshot time.

If you want to simplify your backup configuration, you'll probably
want to give `<data_file_directory>/*/*/snapshots/` as the dirs to
back up, but Amanda can't expand wildcards in disklists. A way to
solve this is to create a directory that is a sibling of
`<data_file_directory>`, move the files in the snapshots there, and
specify it in the disklists. That kinda works...
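
Since Amanda won't expand the wildcard, a small script can do it instead. This is only a sketch: the function name `snapshot_dirs` is made up, and it assumes the `<data_file_directory>/<keyspace>/<column family>/snapshots/<something>` layout described above.

```python
import glob
import os


def snapshot_dirs(data_dir):
    """Return every snapshot directory under data_dir, sorted by path."""
    # Same shape as <data_file_directory>/*/*/snapshots/<something>.
    pattern = os.path.join(data_dir, "*", "*", "snapshots", "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
```

The files inside each returned directory are the ones that would be moved to the sibling directory listed in the disklist.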

... until your second backup pass comes and you find out that even when
you specified an incremental backup, it copies over all the snapshot
files again. This is because when a hardlink is created, the ctime of
the inode is changed. Guess what `tar` uses to see if a file has
changed... yes, ctime and mtime[^1].
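
The ctime effect is easy to see outside C*. The following sketch (file names are invented, and it assumes a POSIX filesystem where `st_ctime` is the inode change time) creates a file, hard-links it, and compares the original's timestamps before and after:

```python
import os
import tempfile
import time


def stat_around_hardlink():
    """Create a file, hard-link it, and return its stat before and after."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "Bar-Data.db")
        with open(original, "w") as f:
            f.write("pretend this is an SSTable")
        before = os.stat(original)
        time.sleep(0.1)  # step past the filesystem's timestamp granularity
        os.link(original, os.path.join(tmp, "snapshot-Bar-Data.db"))
        after = os.stat(original)
    return before, after


before, after = stat_around_hardlink()
print("mtime unchanged:", before.st_mtime_ns == after.st_mtime_ns)
print("ctime changed:", before.st_ctime_ns != after.st_ctime_ns)
print("link count:", before.st_nlink, "->", after.st_nlink)
```

The link count going up is an inode change, which is why the ctime moves even though nobody touched the file's contents.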

So we're back to square one, or zero even. Seems like the only solution
is to use C*'s native 'support' for incrementality, but the docs are
just
[a couple of paragraphs](http://www.datastax.com/docs/1.0/operations/backup_restore)
that barely explain how they're done (surprise, the same way as the
snapshots) and how to activate it, which is the reason why we didn't
follow this path from the beginning. So in the end, it seems that you
can't use Amanda or `tar` to make incremental backups, even with the
native support.

But then there's a difference between the snapshot and the incremental
mode: with the snapshot method, you create the snapshot just before
backing it up, which sets all the ctimes to now. C*'s incremental mode
"hard-links each flushed SSTable to a backups directory under the
keyspace data directory", so the ctimes are roughly the same as the
mtimes, and neither ever changes again (remember, SSTables are
immutable), until we do a snapshot, of course.

One peculiarity that I noticed is that only new SSTables are backed
up, but not those that are the result of compactions. At the beginning I
thought this was wrong, but after discussing the issue with `driftx` in
the IRC channel and a confirmation by Tyler Hobbs in the mailing list,
we came to the following conclusion: if compacted SSTables were also
backed up, at restore time you would need to do a manual compaction to
minimize data duplication, which otherwise means more SSTables matched
by the Bloom filters, more disk reads/seeks per get and more space used;
but if you don't back up/restore those SSTables, the manual compaction
is only advisable. Also, as a consequence, you don't need to track which
files were deleted between backups.

So the remaining problem is to know which files have been backed up,
because C* backups, just like snapshots, are not automatically cleaned.
I came up with the following solution, which might seem complicated at
the beginning, but it really isn't.

When we do a snapshot, which is perfect for full backups, we first
remove all the files present in the backup directory; the incremental
files since the last incremental backup are not needed because we're
doing a full anyway. At the end of this we have the files ready for the
full; we do the backup, and we erase the files.

Then on the following days we just add the incremental files so far,
preceded by a flush, so as to have the latest data in the SSTables and
not depend on CommitLogs. As they're only the diff against the files in
the full, and not the intermediate compacted SSTables, they're as big as
they should be (but also as small as they can be, if you're worried
about disk usage). Furthermore, the way we put files in the backup dir
is via symlinks, so it doesn't change the file's mtime or ctime, and we
configure Amanda to dereference symlinks.
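
As a rough sketch of the symlinking step, to be run after `nodetool flush`: the function name `link_incrementals` is hypothetical, and it assumes the incremental files live in `<data_file_directory>/<keyspace>/<column family>/backups/` and that file names are unique across column families. Adjust the glob if your C* version keeps the `backups` directory elsewhere.

```python
import glob
import os


def link_incrementals(data_dir, backup_dir):
    """Symlink every incremental file into backup_dir; return the links."""
    os.makedirs(backup_dir, exist_ok=True)
    links = []
    # Assumed layout: <data_dir>/<keyspace>/<column family>/backups/<file>
    pattern = os.path.join(data_dir, "*", "*", "backups", "*")
    for src in sorted(glob.glob(pattern)):
        dst = os.path.join(backup_dir, os.path.basename(src))
        if not os.path.lexists(dst):  # already linked on a previous day
            os.symlink(os.path.abspath(src), dst)
        links.append(dst)
    return links
```

Creating a symlink makes a new inode that merely names the target, so the SSTable's own inode, and with it its ctime and mtime, stays as it was.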

Later, at restore time, the files are put in the backup directory, and
a script that takes the KS and CF from each file's name 'deals' them
to the right directories.
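
A dealing script could look something like this. It is a sketch under a big assumption: that file names follow the pattern `<keyspace>-<column family>-<version>-<generation>-<component>` (for example `Foo-Bar-hd-1-Data.db`) and that neither name contains a hyphen. Check what your C* version actually writes before trusting it; the function name `deal` is invented too.

```python
import os
import shutil


def deal(restore_dir, data_dir):
    """Move each restored file to <data_dir>/<keyspace>/<column family>/."""
    moved = []
    for name in sorted(os.listdir(restore_dir)):
        parts = name.split("-")
        if len(parts) < 5:  # not in the assumed naming scheme; leave it alone
            continue
        keyspace, column_family = parts[0], parts[1]
        target_dir = os.path.join(data_dir, keyspace, column_family)
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, name)
        shutil.move(os.path.join(restore_dir, name), target)
        moved.append(target)
    return moved
```

Files that don't match the assumed pattern are left in the restore directory, so nothing gets dealt to a wrong place silently.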

# TL;DR version

## Full backup

* Remove old incremental files and symlinks.
* `nodetool snapshot`.
* Symlink all the snapshot files to a backup directory.
* Backup that directory dereferencing symlinks.
* `nodetool clearsnapshot` and remove symlinks.
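
The staging part of the full backup (everything before Amanda runs) might be scripted like this. It's a sketch, not what we actually run: the function name, the `run` parameter (there so the `nodetool` call can be swapped out) and the `backups`/`snapshots` directory layout are all assumptions.

```python
import glob
import os
import subprocess


def prepare_full_backup(data_dir, backup_dir, run=subprocess.check_call):
    """Stage a fresh snapshot in backup_dir as symlinks for Amanda to follow."""
    os.makedirs(backup_dir, exist_ok=True)
    # Remove the symlinks left over from earlier backups.
    for name in os.listdir(backup_dir):
        path = os.path.join(backup_dir, name)
        if os.path.islink(path):
            os.unlink(path)
    # Remove old incremental files (assumed to live in <KS>/<CF>/backups/).
    for old in glob.glob(os.path.join(data_dir, "*", "*", "backups", "*")):
        if os.path.isfile(old):
            os.remove(old)
    run(["nodetool", "snapshot"])
    # Symlink every file of every snapshot into the staging directory.
    staged = []
    pattern = os.path.join(data_dir, "*", "*", "snapshots", "*", "*")
    for src in sorted(glob.glob(pattern)):
        dst = os.path.join(backup_dir, os.path.basename(src))
        if os.path.isfile(src) and not os.path.lexists(dst):
            os.symlink(os.path.abspath(src), dst)
            staged.append(dst)
    return staged
```

Once Amanda has finished, `nodetool clearsnapshot` and removing the symlinks complete the cycle.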


## Incremental backup

* `nodetool flush`.
* Symlink all incremental files into the backup directory.
* Backup that directory dereferencing symlinks.


## Restore[^2]

* Restore the last full backup and all the incrementals.

[^1]: [`tar`'s docs](http://sepp.oetiker.ch/tar-1.16.x-mo/tar_37.html#SEC88)
      are not clear about what exactly it uses ("Incremental dumps depend
      crucially on time stamps"), but
      [Amanda's](http://wiki.zmanda.com/index.php/Exclude_and_include_lists)
      seems to imply such a thing ("Tar has the ability to preserve the
      access times[;] however, doing so effectively disables incremental
      backups since resetting the access time alters the inode change
      time, which in turn causes the file to look like it needs to be
      archived again.")

[^2]: Actually, it's not that simple. [The previous post](link://slug/restoring-cassandra-online) in this series already shows how it
    could get more complicated.
