Restoring Cassandra online
Still playing with Cassandra, we set up a cluster of 5 nodes to test backing up and restoring. DataStax's documentation only takes into account a simple case: replacing a node that is failing or whose files were corrupted. In this case the restore is quite straightforward: take the node out of the cluster, delete the commit logs, restore the data you have and re-add the node to the cluster, with an optional repair afterwards.
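To make that single-node case concrete, here is a rough sketch that only builds the list of commands, without running anything. The service name, the use of nodetool drain plus a service stop as the way to "take the node out", and all the paths are my assumptions for illustration; adapt them to your installation.

```python
import shlex


def single_node_restore_plan(keyspace, commitlog_dir, snapshot_dir, table_dir,
                             service="cassandra"):
    """Return the restore steps, in order, as argv lists (dry run only).

    Assumptions (not from the DataStax doc): the node is managed with
    `service <name> stop/start`, and the snapshot files are simply copied
    into the table's data directory.
    """
    return [
        ["nodetool", "drain"],                            # flush and stop accepting writes
        ["service", service, "stop"],                     # take the node out of the cluster
        ["find", commitlog_dir, "-type", "f", "-delete"], # delete the commit logs
        ["cp", "-a", snapshot_dir + "/.", table_dir],     # restore the data you have
        ["service", service, "start"],                    # re-add the node
        ["nodetool", "repair", keyspace],                 # optional repair afterwards
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    plan = single_node_restore_plan(
        "one_cf", "/path/to/commitlog", "/path/to/snapshot", "/path/to/data/one_cf/cf_1")
    for step in plan:
        print(shlex.join(step))
```

Printing the plan first and running it by hand seemed safer to me than letting a script stop a node on its own.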
In our particular case, we not only have to contemplate this kind of situation, but we might also need to roll back to a point in the past, which implies restoring the data on all the nodes. It's true that this is possible by repeating the above procedure node by node [1], without the optional repair. This means that while you're restoring the CF (column family) or KS (keyspace), your cluster is almost constantly one node short. This might mean nothing on big deployments, but our production cluster is a humble 4-node one, even smaller than the testing one! So keeping downtime to a minimum is essential.
So we set off to find a way to do it without stopping the nodes. Some people were advising using nodetool refresh or sstableloader, but that seems to work only when restoring one node from scratch; that is, the same case as at the beginning. In our case, sstableloader was making no difference. I assume that it's because it inserts the data with their original timestamps, so the data with newer timestamps still in the Memtables/SSTables on the nodes take precedence. That is, sstableloader does not seem to replace the data.
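To illustrate what I think is happening, here is a toy model of timestamp-based reconciliation. It is not Cassandra code: the Cell class, the read function and the sample keys are made up for the example, and tie-breaking is simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # write time, e.g. in microseconds


def reconcile(a, b):
    """Pick the winner between two versions of a cell: the newer timestamp.

    On a tie this toy model keeps the first argument (a simplification).
    """
    return a if a.timestamp >= b.timestamp else b


def read(key, *tables):
    """Merge every table that holds the key and return the winning cell."""
    winner = None
    for table in tables:
        cell = table.get(key)
        if cell is not None:
            winner = cell if winner is None else reconcile(winner, cell)
    return winner


# Data currently on the node, written after the backup was taken.
live = {"user:1": Cell("new value", 2000)}
# Data loaded from the backup, keeping its original (older) timestamps.
restored = {"user:1": Cell("old value", 1000),
            "user:2": Cell("only in backup", 1000)}

print(read("user:1", live, restored).value)  # new value
print(read("user:2", live, restored).value)  # only in backup
```

Under this model the restored data only shows up for keys that have no newer version on the node, which would explain why loading the backup appeared to change nothing.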
With nodetool refresh the same happens, but you still have the option of deleting the current SSTables after a nodetool flush. But that leads to a state where the node(s) where you have done this emit this error on any operation on the CF or KS:
java.io.IOError: java.io.FileNotFoundException: /var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Data.db (No such file or directory)
It's not obvious from the example I show, but that's exactly one of the SSTables I just removed. That is, C* still tries to read the SSTables that are no longer there, even after a nodetool refresh. Maybe this is a bug, but then that command's semantics are not clearly stated anywhere.
I found a simple workaround: as we're no longer interested in the data in its current state, I can simply drop the KS or CF and rebuild it afterwards with the data I get from the restore.
In the end, the procedure is like this:
- Drop and recreate the CF or KS.
- For all nodes, in parallel if possible:
  - Remove the snapshot created at drop time [2].
  - Restore the snapshot and move the data files to the right place.
  - Run nodetool refresh.
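The per-node part of the procedure above could be sketched like this, to be run on each node once the CF or KS has been dropped and recreated. The function name and its parameters are mine, and I'm assuming the data/<keyspace>/<column family>/ layout seen in the error message. I'm also assuming the backup lives outside Cassandra's data directory, because nodetool clearsnapshot without a snapshot name removes every snapshot of the keyspace.

```python
import shutil
import subprocess
from pathlib import Path


def restore_on_node(backup_dir, data_dir, keyspace, column_family,
                    run=subprocess.run):
    """Restore one CF on the local node; returns the names of the copied files.

    `run` is injectable so the function can be exercised without a real
    Cassandra node (hypothetical helper, not part of any Cassandra tooling).
    """
    # 1. Remove the snapshot Cassandra created when the CF/KS was dropped.
    #    Assumption: no snapshot worth keeping is left in this keyspace.
    run(["nodetool", "clearsnapshot", keyspace], check=True)

    # 2. Put the backed-up SSTable files where Cassandra expects them.
    target = Path(data_dir) / keyspace / column_family
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(backup_dir).iterdir()):
        if src.is_file():
            shutil.copy2(src, target / src.name)
            copied.append(src.name)

    # 3. Ask the running node to load the new SSTables.
    run(["nodetool", "refresh", keyspace, column_family], check=True)
    return copied
```

On a real node the call would look something like restore_on_node("/path/to/backup/cf_1", "/var/opt/hosting/db/cassandra/data", "one_cf", "cf_1"), where the backup path is a placeholder. Running it on all nodes in parallel is left to whatever tool you already use for that.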