Reinitializing an Accumulo cluster on CDH4 or CDH5

by Thomas Memenga on 2014-08-26

How to manually reinitialize an existing Accumulo cluster on CDH4 or CDH5

Warning: This will erase all data stored in your existing Accumulo service. Think twice before executing these steps.

Sometimes you want to erase all data stored in your Accumulo service and start from scratch. Simply running Initialize Accumulo from Cloudera Manager will not do the trick, because Cloudera Manager never drops existing metadata during initialization. Even removing the Accumulo service and creating a new one only works if the new service uses a completely different configuration (paths on HDFS, instance name, etc.). With some manual work, however, you can reinitialize an existing Accumulo service on CDH4 and CDH5.

Preparations

You will need to write down some configuration parameters:

Accumulo -> Configuration -> Service Wide

Accumulo Instance Secret

instance.secret = DEFAULT

Instance Name

accumulo_instance_name = accumulo

HDFS Directory

instance.dfs.dir = /accumulo
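
If you prefer to keep these values at hand in a terminal session, they can be recorded as shell variables. This is only a sketch: the variable names are made up for illustration and are not Cloudera or Accumulo settings, and the values are the defaults shown above, which may differ on your cluster.

```shell
# Illustrative shell variables for the values noted above.
# The names are hypothetical; replace the values with your own.
INSTANCE_SECRET="DEFAULT"       # instance.secret
INSTANCE_NAME="accumulo"        # accumulo_instance_name
INSTANCE_DFS_DIR="/accumulo"    # instance.dfs.dir

# Fail early if any value was left empty.
for value in "$INSTANCE_SECRET" "$INSTANCE_NAME" "$INSTANCE_DFS_DIR"; do
    [ -n "$value" ] || { echo "missing configuration value" >&2; exit 1; }
done

echo "instance=$INSTANCE_NAME dir=$INSTANCE_DFS_DIR"
```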

Reinitializing the Accumulo service

  * Cloudera Manager: **Stop** (Accumulo 1.6 Service -> Actions)
  * Wait for all daemons to shut down and verify that they have stopped
  * Delete the Accumulo data directory on HDFS (on Hadoop 2, `-rmr` still works but is deprecated in favor of `-rm -r`):


hadoop fs -rmr <instance.dfs.dir>
e.g.
hadoop fs -rmr /accumulo




  * Delete the Accumulo user home directory on HDFS:


hadoop fs -rmr /user/<accumulo_user_name>
e.g.
hadoop fs -rmr /user/accumulo
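
Because both HDFS deletions are irreversible, it can be worth wrapping them in a small script that refuses obviously dangerous paths and prints the commands before running them. The following is a minimal sketch, not part of the original procedure: `INSTANCE_DFS_DIR`, `ACCUMULO_USER`, `DRY_RUN` and the two helper functions are hypothetical names, and the defaults assume the example values used in this post.

```shell
#!/bin/sh
# Sketch: remove the Accumulo directories from HDFS with a basic safety check.
# INSTANCE_DFS_DIR, ACCUMULO_USER and DRY_RUN are illustrative names.
INSTANCE_DFS_DIR="${INSTANCE_DFS_DIR:-/accumulo}"
ACCUMULO_USER="${ACCUMULO_USER:-accumulo}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# Accept absolute paths only, and reject "", "/" and the bare /user directory.
safe_hdfs_path() {
    case "$1" in
        ""|"/"|"/user"|"/user/") return 1 ;;
        /*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the delete command in dry-run mode, otherwise execute it.
remove_hdfs_dir() {
    if ! safe_hdfs_path "$1"; then
        echo "refusing to delete '$1'" >&2
        return 1
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "hadoop fs -rmr $1"
    else
        hadoop fs -rmr "$1"
    fi
}

remove_hdfs_dir "$INSTANCE_DFS_DIR"
remove_hdfs_dir "/user/$ACCUMULO_USER"
```

Run it once as is to review the printed commands, then set `DRY_RUN=0` to actually delete the directories.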




  * Delete the Accumulo data in ZooKeeper:


zookeeper-client -server <one-of-your-zookeeper-hosts>
e.g.
zookeeper-client -server zookeeper01.local

[zk: zookeeper01.local(CONNECTED) 0] ls /
[accumulo, storm, zookeeper]
# You need to authenticate before you can remove the
# /accumulo subtree from your ZooKeeper instance:
# addauth digest <accumulo-user-name>:<instance.secret>
[zk: zookeeper01.local(CONNECTED) 1] addauth digest accumulo:DEFAULT
[zk: zookeeper01.local(CONNECTED) 2] rmr /accumulo
[zk: zookeeper01.local(CONNECTED) 3] quit
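
The order of the ZooKeeper commands matters: the authentication has to happen in the same session, before the delete. As a rough sketch, the command sequence can be generated from the values noted during the preparations. `ACCUMULO_USER`, `INSTANCE_SECRET` and `zk_cleanup_commands` are hypothetical names chosen for this example, and the defaults mirror the example session above.

```shell
#!/bin/sh
# Sketch: print the ZooKeeper CLI commands used to drop the /accumulo subtree.
# ACCUMULO_USER and INSTANCE_SECRET are illustrative variable names.
ACCUMULO_USER="${ACCUMULO_USER:-accumulo}"
INSTANCE_SECRET="${INSTANCE_SECRET:-DEFAULT}"

zk_cleanup_commands() {
    # Authenticate first, then remove the subtree, then leave the session.
    printf 'addauth digest %s:%s\n' "$ACCUMULO_USER" "$INSTANCE_SECRET"
    printf 'rmr /accumulo\n'
    printf 'quit\n'
}

zk_cleanup_commands
```

You can paste the printed lines into an interactive `zookeeper-client` session. Piping them straight into `zookeeper-client -server <one-of-your-zookeeper-hosts>` may also work, but that assumes your client version reads commands from standard input, so try it on a test ensemble first.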




  * Cloudera Manager: **Stop** (Accumulo 1.6 Service -> Actions)
  * Cloudera Manager: **Create Accumulo Home Dir** (Accumulo 1.6 Service -> Actions)
  * Cloudera Manager: **Create Accumulo User Dir** (Accumulo 1.6 Service -> Actions)
  * Cloudera Manager: **Initialize Accumulo** (Accumulo 1.6 Service -> Actions)
  * Cloudera Manager: **Start** (Accumulo 1.6 Service -> Actions)

Wait for the Accumulo cluster to start, then check the Accumulo monitor UI to confirm that the service is up.