Planning Your Data Protection
SHIELD can perform Cassandra backups in a few different ways, depending on your network throughput, access control, and personal preferences.
Local Agent
The simplest method is to run the SHIELD agent on the same box as Cassandra, and connect to it over the loopback address, 127.0.0.1.
If you want to protect more than one host, you can install the agent on each one. Each agent then registers with the SHIELD instance under its own identity, and you can configure backup jobs for each of them.
Remote Agent
You can also run the SHIELD agent somewhere else, and configure your backup jobs to connect to the Cassandra host(s) over TCP, using either their public IPs, or (if you have configured them) internal Linode IPs.
This can be useful if you'd rather not load additional software on your database host(s). You can also reuse the external agent to protect multiple Cassandra instances, since your backup jobs will specify the IP address of each.
This setup also allows you to provision large scratch spaces for the Cassandra SHIELD data plugin to use, and avoid having to reserve dedicated temporary disk space on a fleet of data nodes.
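Before pointing a remote agent at a Cassandra host, it can help to confirm that the host is reachable over TCP from the machine where the agent runs. The following is a minimal sketch, not part of SHIELD: the can_reach helper is a hypothetical name, and the port defaults to 9042, the default native transport port mentioned under Authentication. A successful connection only shows that the port is open; it says nothing about whether authentication will succeed.

```python
import socket


def can_reach(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the public or internal IP
    # of the Cassandra host you intend to protect.
    print("reachable" if can_reach("127.0.0.1") else "not reachable")
```

Running the same check against each host you plan to protect, from the agent machine, should surface firewall or routing problems before the first backup job runs.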
System Prerequisites
SHIELD uses native database tooling to perform backup and restore operations against Cassandra. For this to work, you will need to install the following tools, matched to the version of Cassandra you are running:
- nodetool for orchestrating Cassandra's built-in snapshot system.
- sstableloader for restoring table data.
- POSIX utilities like rm, chown, and tar; either GNU tar or BSD tar should suffice.
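One way to confirm that these tools are installed is to look for them on the PATH of the host where the agent runs. The following is a minimal sketch, not part of SHIELD: the missing_tools helper and the REQUIRED_TOOLS list are illustrative names, and the check only verifies that each tool is present, not that its version matches your Cassandra release.

```python
import shutil

# Tools named in the prerequisites above. Grouping them in one list is an
# assumption made for this sketch; adjust it to your environment.
REQUIRED_TOOLS = ["nodetool", "sstableloader", "rm", "chown", "tar"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Note that the agent may run with a different PATH than your login shell, so the check is most useful when run as the same user, and in the same environment, as the agent.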
Configuring Backup Jobs
The Cassandra SHIELD plugin operates on a single keyspace, and snapshots all tables contained therein.
Authentication
Four parameters govern access to your Cassandra server: Cassandra Host, Cassandra Port, Cassandra Username, and Cassandra Password.
By default, the plugin will assume that Cassandra is accessible on loopback (127.0.0.1), on the default TCP port (9042), by the user cassandra, with the password cassandra.
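The way these defaults combine with values you supply can be pictured as a simple merge: anything you leave unset falls back to the default. The sketch below illustrates that idea only; the dictionary keys and the effective_settings helper are hypothetical names chosen for the example, not the plugin's actual field identifiers or code.

```python
# Defaults as described above. The keys are hypothetical names chosen for
# this sketch, not the plugin's real configuration field identifiers.
DEFAULTS = {
    "host": "127.0.0.1",
    "port": 9042,
    "username": "cassandra",
    "password": "cassandra",
}


def effective_settings(overrides=None):
    """Merge user-supplied values over the documented defaults."""
    settings = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown setting: {key}")
        # A value of None means "not set", so the default is kept.
        if value is not None:
            settings[key] = value
    return settings


if __name__ == "__main__":
    # Only the host is overridden; port and credentials keep their defaults.
    print(effective_settings({"host": "192.0.2.10"}))
```

For a remote agent, the host would normally be the only value that has to change, provided the Cassandra server still uses the default port and credentials.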
Backing up a Keyspace
When you set up your Cassandra data system inside SHIELD, you will have to choose a keyspace to protect. There is currently no way to filter the tables within this keyspace; SHIELD snapshots everything.
See Configuration Reference for more detail.
Restoring Cassandra from a Snapshot
Cassandra SHIELD snapshots consist of a POSIX tar archive filled with table-level backups that will be reloaded via the sstableloader command.
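Because the snapshot is an ordinary tar archive, its contents can be inspected with standard tooling before a restore. The following is a minimal sketch using Python's tarfile module, assuming you have a local copy of the archive. The list_archive_members helper and the snapshot.tar file name are hypothetical, and the internal layout of the archive is not specified here, so the sketch only lists whatever files it finds.

```python
import sys
import tarfile


def list_archive_members(archive_path):
    """Return the names of the regular files stored in a tar archive."""
    # Mode "r:*" lets tarfile detect compression, if any, automatically.
    with tarfile.open(archive_path, mode="r:*") as archive:
        return [m.name for m in archive.getmembers() if m.isfile()]


if __name__ == "__main__":
    # "snapshot.tar" is a placeholder path used when no argument is given.
    path = sys.argv[1] if len(sys.argv) > 1 else "snapshot.tar"
    try:
        for name in list_archive_members(path):
            print(name)
    except (OSError, tarfile.TarError) as exc:
        print(f"Could not read {path}: {exc}")
```

Listing the archive this way does not modify it and does not touch the Cassandra cluster.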
Configuration Reference
This section details all available configuration parameters for the Cassandra data protection plugin.
- Cassandra Host
- The hostname or IP address of your Cassandra server.
- Cassandra Port
- The 'native transport' TCP port that the Cassandra server is bound to, listening for incoming connections.
- Cassandra Username
- Username to authenticate to Cassandra as.
- Cassandra Password
- Password to authenticate to Cassandra with.
- Keyspace to Backup
- The name of the keyspace to back up.
- Data Directory
- The absolute path to the data/ directory containing the Cassandra database files.