Monitoring Cassandra with Groovy

Marcos Dione

2012-08-14 10:19

One of the new developments at my job is that we'll start using Cassandra as the database for some of our web services. The move was decided mainly because of Cassandra's lack of a SPoF (single point of failure) and the ease of adding a column, which happens rather often in our environment.

One of the tasks we're in charge of is monitoring the system. Most of the interesting values to monitor in a Cassandra setup can be obtained with various nodetool commands, but values from the JVM running the Cassandra instance cannot. So I turned to my closest Java guru, who recommended writing a script in Groovy. After playing a little with this Java-like language, I got this:

import javax.management.ObjectName
import javax.management.remote.JMXConnector
import javax.management.remote.JMXConnectorFactory
import javax.management.remote.JMXServiceURL

jmxEnv = [(JMXConnector.CREDENTIALS): (String[])["user", "pass"]]

def serverUrl = 'service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi'
def server = JMXConnectorFactory.connect(new JMXServiceURL (serverUrl), jmxEnv).MBeanServerConnection

mBeanNames= [ 
    "java.lang:type=ClassLoading", 
    "java.lang:type=Compilation", 
    "java.lang:type=Memory", 
    "java.lang:type=Threading",

    "org.apache.cassandra.db:type=Caches",
    "org.apache.cassandra.db:type=Commitlog",
    "org.apache.cassandra.db:type=CompactionManager",
    "org.apache.cassandra.db:type=StorageProxy",
    "org.apache.cassandra.db:type=StorageService",

    "org.apache.cassandra.internal:type=AntiEntropyStage",
    "org.apache.cassandra.internal:type=FlushWriter",
    "org.apache.cassandra.internal:type=GossipStage",
    "org.apache.cassandra.internal:type=HintedHandoff",
    "org.apache.cassandra.internal:type=InternalResponseStage",
    "org.apache.cassandra.internal:type=MemtablePostFlusher",
    "org.apache.cassandra.internal:type=MigrationStage",
    "org.apache.cassandra.internal:type=MiscStage",
    "org.apache.cassandra.internal:type=StreamStage",

    "org.apache.cassandra.metrics:type=ClientRequestMetrics,name=ReadTimeouts",
    "org.apache.cassandra.metrics:type=ClientRequestMetrics,name=ReadUnavailables",
    "org.apache.cassandra.metrics:type=ClientRequestMetrics,name=WriteTimeouts",
    "org.apache.cassandra.metrics:type=ClientRequestMetrics,name=WriteUnavailables",

    "org.apache.cassandra.net:type=FailureDetector",
    "org.apache.cassandra.net:type=MessagingService",
    "org.apache.cassandra.net:type=StreamingService",


    "org.apache.cassandra.request:type=MutationStage",
    "org.apache.cassandra.request:type=ReadRepairStage",
    "org.apache.cassandra.request:type=ReadStage",
    "org.apache.cassandra.request:type=ReplicateOnWriteStage",
    "org.apache.cassandra.request:type=RequestResponseStage",
    ]

def dumpMBean= { name ->
    println "$name:"

    // get a proxy MBean for the class
    bean= new GroovyMBean (server, name)
    // get the attributes
    attrs= bean.listAttributeNames ()
    // get an AttributeList; the list of names must first be cast (!) to String[]
    attrMap= server.getAttributes (bean.name(), (String [])attrs)

    attrMap.each { kv ->
        // kv is an Attribute
        key= kv.name
        // skip RangeKeySample, it can be 15MiB big or more...
        if (key!="RangeKeySample") {
            value= kv.value
            println "\t$key: $value"
        }
    }

    println ""
}

// dump singletons
mBeanNames.each { name ->
    dumpMBean (name)
}

// dump keyspaces and their column families
// each command line argument has the form keyspace=cf1,cf2,...
args.each { ks_cfs ->
    split= ks_cfs.tokenize ('=')
    ks= split[0]
    cfs= split[1].tokenize (',')

    cfs.each { cf ->
        dumpMBean ("org.apache.cassandra.db:type=ColumnFamilies,keyspace=$ks,columnfamily=$cf")
    }
}

In our case, we dump its output to a text file and process it afterwards to pick the values we want to monitor and graph. As we're not yet in production, we haven't settled on which values we're going to monitor.
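
The post-processing step could look something like the following. This is only a minimal sketch, not the code we actually run: the parseDump name is made up for illustration, the sample values are invented, and it assumes the exact layout printed by dumpMBean above (a "name:" header line, tab-indented "key: value" lines, and a blank line after each bean). Values that span several lines are not handled.

```groovy
// Hypothetical helper: parse the dump into a map of [beanName: [attribute: value]].
def parseDump(String text) {
    def beans = [:]
    def current = null
    text.eachLine { line ->
        if (line.startsWith('\t')) {
            // attribute line: split on the first ": " only, values may contain colons
            def parts = line.substring(1).split(': ', 2)
            if (current != null && parts.size() == 2) {
                current[parts[0]] = parts[1]
            }
        } else if (line.endsWith(':')) {
            // header line: drop the trailing colon to recover the MBean name
            current = [:]
            beans[line.substring(0, line.length() - 1)] = current
        }
        // anything else (blank lines, continuation lines) is ignored
    }
    return beans
}

// Made-up sample in the same layout as the script's output
def sample = "java.lang:type=Threading:\n\tThreadCount: 42\n\tPeakThreadCount: 57\n\n"
def beans = parseDump(sample)
println beans['java.lang:type=Threading']['ThreadCount']
```

With a real dump, the sample string would be replaced by the contents of the text file, for instance through new File(...).text. All values come back as strings, so anything to be graphed still needs converting to a number.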