Replace a dead node in Cassandra

Note (June 2020): this article is old and no longer really relevant. If you use a modern version of Cassandra, look at the -Dcassandra.replace_address_first_boot option!

I want to share some tips from my experiments with Cassandra (version 2.0.x).

I found some documentation on the DataStax website about replacing a dead node, but it does not suit our needs: after a hardware crash, we set up a new node with exactly the same IP (an “in place” replacement). Update: the documentation is now up to date on the DataStax website!

If you try to start the new node with the same IP, Cassandra refuses to start and fails with:

java.lang.RuntimeException: A node with address / already exists, cancelling join. Use cassandra.replace_address if you want to replace this node.

So we need to use the “cassandra.replace_address” directive (which is not really documented :( ). See this commit and this bug report: it is available since 1.2.11/2.0.0, it’s an easier solution, and it works.

+    - New replace_address to supplant the (now removed) replace_token and
+      replace_node workflows to replace a dead node in place.  Works like the
+      old options, but takes the IP address of the node to be replaced.

It’s a JVM option, so we can add it at the end of /etc/cassandra/ (Debian package), for example:

JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=" 

Of course, the value after the = sign is the IP of your dead/new node.
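As a minimal sketch of what the filled-in line looks like, here is the same assignment with an address plugged in. The address 10.0.0.12 and the helper name replace_opt are made up for illustration; use the IP of your own node.

```shell
#!/bin/sh
# Hypothetical address for illustration only: use the IP of your dead/new node.
DEAD_NODE_IP="10.0.0.12"

# Build the JVM flag for a given address (replace_opt is a made-up helper name).
replace_opt() {
    printf '%s' "-Dcassandra.replace_address=$1"
}

# Same shape as the line added to the environment file.
# ${JVM_OPTS:-} only keeps this sketch runnable when JVM_OPTS is not set yet.
JVM_OPTS="${JVM_OPTS:-} $(replace_opt "$DEAD_NODE_IP")"
echo "$JVM_OPTS"
```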

Now start Cassandra, and in the logs you will see:

INFO [main] 2014-03-10 14:58:17,804 (line 941) JOINING: schema complete, ready to bootstrap
INFO [main] 2014-03-10 14:58:17,805 (line 941) JOINING: waiting for pending range calculation
INFO [main] 2014-03-10 14:58:17,805 (line 941) JOINING: calculation complete, ready to bootstrap
INFO [main] 2014-03-10 14:58:17,805 (line 941) JOINING: Replacing a node with token(s): [...]
INFO [main] 2014-03-10 14:58:17,844 (line 941) JOINING: Starting to bootstrap...
INFO [main] 2014-03-10 14:58:18,551 (line 82) [Stream #effef960-6efe-11e3-9a75-3f94ec5476e9] Executing streaming plan for Bootstrap

The node is now in bootstrapping mode and will retrieve its data from the cluster. This may take a long time.
If the node is a seed node, a warning will indicate that the node did not auto bootstrap. This is normal; in that case you need to run nodetool repair on the node.

On the new node:

# nodetool netstats

Bootstrap effef960-6efe-11e3-9a75-3f94ec5476e9
	Receiving 102 files, 17467071157 bytes total
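The byte count is not very readable, so here is a small sketch that converts it to GiB. It assumes the "Receiving N files, M bytes total" line looks exactly like the one above (other Cassandra versions may print it differently), and receiving_gib is a made-up helper name.

```shell
#!/bin/sh
# Reads `nodetool netstats` output on stdin and prints the total size of the
# incoming stream in GiB. receiving_gib is a made-up helper name, and the
# line format is assumed to be the one shown in this article.
receiving_gib() {
    LC_ALL=C awk '/Receiving/ && /bytes total/ {
        for (i = 2; i <= NF; i++)
            if ($i == "bytes")
                printf "%.1f GiB\n", $(i - 1) / (1024 * 1024 * 1024)
    }'
}

# Sample line copied from the output above; on a live node you would run:
#   nodetool netstats | receiving_gib
printf '\tReceiving 102 files, 17467071157 bytes total\n' | receiving_gib
```

For the sample above this gives roughly 16 GiB to stream, which explains why the bootstrap takes a while.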

After some time, you will see more information in the logs!
On the new node:

 INFO [STREAM-IN-/] 2014-03-10 15:15:40,363 (line 215) [Stream #effef960-6efe-11e3-9a75-3f94ec5476e9] All sessions completed
 INFO [main] 2014-03-10 15:15:40,366 (line 970) Bootstrap completed! for the tokens [...]
 INFO [main] 2014-03-10 15:15:40,412 (line 1371) Node / state jump to normal
 WARN [main] 2014-03-10 15:15:40,413 (line 1378) Not updating token metadata for / because I am replacing it
 INFO [main] 2014-03-10 15:15:40,419 (line 821) Startup completed! Now serving reads.

And on the other nodes:

 INFO [GossipStage:1] 2014-03-10 15:15:40,625 (line 1371) Node / state jump to normal
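If you would rather not watch the log by hand, the check on the new node can be scripted. This is only a sketch: it looks for the two completion messages shown above, the log path /var/log/cassandra/system.log is an assumption about your installation, and bootstrap_done is a made-up helper name.

```shell
#!/bin/sh
# Returns 0 once the given log file contains both completion messages shown
# in this article. bootstrap_done is a made-up helper name.
bootstrap_done() {
    grep -q 'Bootstrap completed!' "$1" && grep -q 'Startup completed!' "$1"
}

# Assumed log location; adjust it to your installation.
LOG_FILE="${LOG_FILE:-/var/log/cassandra/system.log}"

if [ -r "$LOG_FILE" ] && bootstrap_done "$LOG_FILE"; then
    echo "bootstrap finished"
else
    echo "bootstrap not finished (or log not readable)"
fi
```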

Et voilà, the dead node has been replaced!
Don’t forget to REMOVE the replace_address modification once the bootstrap is complete!
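The cleanup can be sketched as a small function that drops the line again. It assumes the option sits on a line of its own, as in the example earlier in this article; remove_replace_opt is a made-up helper name, and the demonstration runs on a throwaway file rather than on a real configuration file.

```shell
#!/bin/sh
# Deletes every line that sets -Dcassandra.replace_address= from the file
# given as argument. remove_replace_opt is a made-up helper name.
# Note: the whole line is removed, so the option must be on its own line.
remove_replace_opt() {
    tmp="$1.tmp.$$"
    # Write the filtered copy first, then overwrite in place so that the
    # original file keeps its owner and permissions.
    sed '/-Dcassandra\.replace_address=/d' "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demonstration on a throwaway file standing in for the real environment file.
# The address 10.0.0.12 is a hypothetical example value.
demo=$(mktemp)
printf '%s\n' 'MAX_HEAP_SIZE="4G"' \
    'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"' > "$demo"
remove_replace_opt "$demo"
cat "$demo"    # only the MAX_HEAP_SIZE line remains
rm -f "$demo"
```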

Enjoy!