Oozie monitoring: installing GraphiteInstrumentationService

by Thomas Memenga on 2013-08-02

This post is a step-by-step guide to installing oozie-graphite’s GraphiteInstrumentationService module.

Prerequisites

You will need:

  * Oozie 3.3.x server
  * Graphite server

Installing the jar

Grab a 1.x.y release from https://github.com/syscrest/oozie-graphite/releases and add it to your Oozie instance as described below for your distribution:

Vanilla Oozie:

Put oozie-graphite-*.jar into your libext directory and (re-)run bin/oozie-setup.sh. See the "Oozie server setup" section of the Oozie 3.3.0 documentation for details.

Cloudera CDH 4.3.0:

Place the jar in /var/lib/oozie.

Configuring Oozie

Configure logging in <OOZIE-HOME>/conf/oozie-log4j.properties:

cd <OOZIE-HOME>/conf
echo "log4j.logger.com.syscrest=INFO, oozie" >> oozie-log4j.properties

Add the GraphiteInstrumentationService-specific configuration to <OOZIE-HOME>/conf/oozie-site.xml.

Append com.syscrest.oozie.graphite.GraphiteInstrumentationService to your oozie.services.ext configuration:

<property>
        <name>oozie.services.ext</name>
        <value>
		....,
		com.syscrest.oozie.graphite.GraphiteInstrumentationService</value>
</property>
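Because oozie.services.ext is a comma-separated list, appending by hand can easily leave a missing comma or a duplicate entry. Here is a minimal Python sketch of the append step. The helper name append_service is hypothetical, and the existing entry in the usage lines is only a placeholder for whatever services you already have configured.

```python
GRAPHITE_SERVICE = "com.syscrest.oozie.graphite.GraphiteInstrumentationService"


def append_service(current_value: str, service: str = GRAPHITE_SERVICE) -> str:
    """Return the oozie.services.ext value with `service` appended exactly once."""
    # Split on commas; drop surrounding whitespace/newlines and empty entries.
    entries = [e.strip() for e in current_value.split(",") if e.strip()]
    if service not in entries:
        entries.append(service)
    return ",".join(entries)


if __name__ == "__main__":
    # "org.example.SomeOtherService" is a placeholder, not a real Oozie service.
    print(append_service("org.example.SomeOtherService"))
```

The function is idempotent, so running it against a value that already contains the service leaves the list unchanged.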

Define your metrics push interval (in seconds):

<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.logging.interval</name>
        <value>60</value>
</property>

Configure the hostname or IP address of your Graphite server, plus the port and transport type (TCP or UDP) that your carbon daemon is listening on:

<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.graphite.host</name>
        <value>graphite.your.org</value>
</property>
<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.graphite.port</name>
        <value>2003</value>
</property>
<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.graphite.transport</name>
        <value>UDP</value>
</property>
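Before restarting Oozie, it can help to check that carbon actually accepts data on the configured host, port and transport. The following is a minimal sketch that sends a single datapoint by hand. It assumes carbon's plaintext line protocol ("path value timestamp") on that port. The function names and the smoketest metric name are made up for this example. If you use UDP, also check that the UDP listener is enabled in your carbon.conf, since it may be switched off.

```python
import socket


def format_metric(path: str, value: float, timestamp: int) -> bytes:
    """Build one plaintext-protocol line: path, value and timestamp separated by spaces."""
    return f"{path} {value} {timestamp}\n".encode("ascii")


def send_metric(host: str, port: int, transport: str, payload: bytes) -> None:
    """Send the payload over UDP or TCP, matching the configured transport."""
    if transport.upper() == "UDP":
        # UDP is fire-and-forget: no error is raised if nobody is listening.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
    else:
        # TCP fails fast if the port is closed or unreachable.
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(payload)
```

For example, send_metric("graphite.your.org", 2003, "UDP", format_metric("infra.yourServer.oozie.smoketest", 1, int(time.time()))) (after import time) should make a smoketest metric appear in the Graphite tree if everything is wired up correctly.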

Choose a prefix for all metric names that matches your Graphite namespace conventions, for example infra.YourOozieServerName.oozie or oozie.OozieServerName:

<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.graphite.pathPrefix</name>
        <value>infra.yourServer.oozie</value>
</property>
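With all the properties in place, a quick sanity check of oozie-site.xml can catch typos in the long property names. The sketch below parses the Hadoop-style configuration XML and reports which of the properties from this post are missing or empty. Treating all five as required is an assumption of this sketch (the service may well fall back to defaults), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

PREFIX = "com.syscrest.oozie.graphite.GraphiteInstrumentationService."
# Assumption: every property configured in this post is treated as required.
REQUIRED = [
    PREFIX + "logging.interval",
    PREFIX + "graphite.host",
    PREFIX + "graphite.port",
    PREFIX + "graphite.transport",
    PREFIX + "graphite.pathPrefix",
]


def read_properties(xml_text: str) -> dict:
    """Map property names to values from a <configuration> document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            props[name] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(xml_text: str) -> list:
    """Return the expected property names that are absent or empty."""
    props = read_properties(xml_text)
    return [name for name in REQUIRED if not props.get(name)]
```

Passing the contents of your oozie-site.xml to missing_properties returns an empty list when every property from this post is set.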

Optional: you can specify a whitelist and a blacklist to filter individual metrics. I would recommend skipping this for now so that you can explore the full metric tree first.

Black-/whitelist configuration options:

<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.metrics.whitelist</name>
        <value>.*</value>
		<description>
		Comma-separated list of regex patterns selecting the metrics that should be pushed to Graphite.
		Examples:
		.*\.counter\..*,.*\.timer\..*
		.*\.counter\.jobs\..*

		Note: OPTIONAL (Default value: .* (all metrics))
		</description>
</property>
<property>
        <name>com.syscrest.oozie.graphite.GraphiteInstrumentationService.metrics.blacklist</name>
        <value></value>
		<description>
		Comma-separated list of regex patterns excluding metrics that are matched by the whitelist configuration.
		Examples:
		.*\.counter\..*
		.*\.totalSquareSum,.*\.ownSquareSum

		Note: OPTIONAL (Default: "" (empty string = no filtering))
		</description>
</property>
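To get a feeling for how the two lists interact, here is a small Python sketch of the filtering rule as I read the descriptions above: a metric is pushed if it matches at least one whitelist pattern and no blacklist pattern. Two things are assumptions of this sketch rather than confirmed behaviour of the service: that each pattern has to match the whole metric name, and that patterns are simply split on commas (so a pattern containing a comma would not work). The metric names in the usage lines are invented.

```python
import re


def compile_patterns(csv: str) -> list:
    """Compile a comma-separated list of regex patterns, skipping empty entries."""
    return [re.compile(p.strip()) for p in csv.split(",") if p.strip()]


def is_pushed(metric: str, whitelist: str = ".*", blacklist: str = "") -> bool:
    """True if the metric matches any whitelist pattern and no blacklist pattern."""
    # Assumption: patterns must match the full metric name (fullmatch).
    allowed = any(p.fullmatch(metric) for p in compile_patterns(whitelist))
    blocked = any(p.fullmatch(metric) for p in compile_patterns(blacklist))
    return allowed and not blocked


if __name__ == "__main__":
    blacklist = r".*\.totalSquareSum,.*\.ownSquareSum"
    # Hypothetical metric names, only for illustration.
    print(is_pushed("oozie.timer.example.ticks", ".*", blacklist))
    print(is_pushed("oozie.timer.example.totalSquareSum", ".*", blacklist))
```

With the defaults (whitelist .* and an empty blacklist) every metric passes, which matches the "no filtering" behaviour described in the notes.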

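Configuring Graphite

Add a new section for your chosen prefix to /conf/storage-schemas.conf and align the retention resolution with your push interval (60 seconds in this example). Place the section above any catch-all section, because carbon uses the first pattern that matches: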

[oozie_instrumentation_yourServer]
pattern = infra.yourServer.oozie.*
retentions = 60s:180d
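A retention of 60s:180d means one datapoint per minute kept for 180 days. The following sketch works out what that costs per metric. It only handles the unit-suffixed "precision:duration" form used above, and the file size formula assumes the usual whisper layout for a single archive (16-byte header, 12 bytes of archive info, 12 bytes per datapoint). The function names are made up for this example.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec: str) -> int:
    """Convert a value such as '60s' or '180d' to seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retention(retention: str) -> tuple:
    """Return (seconds_per_point, number_of_points) for 'precision:duration'."""
    precision, duration = retention.split(":")
    seconds_per_point = to_seconds(precision)
    return seconds_per_point, to_seconds(duration) // seconds_per_point


def whisper_file_bytes(points: int) -> int:
    """Approximate size of a single-archive whisper file."""
    # Assumed layout: 16-byte header + 12 bytes archive info + 12 bytes per point.
    return 16 + 12 + 12 * points


if __name__ == "__main__":
    step, points = parse_retention("60s:180d")
    print(step, points, whisper_file_bytes(points))
```

For 60s:180d this gives 259200 datapoints and roughly 3 MB per metric, so 100+ unfiltered metrics add up to a few hundred megabytes of whisper files.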

Restart

Restart Graphite and Oozie (in that order) so that both pick up the new configuration:

  * Restart Graphite
  * Restart Oozie

Unfiltered (no white-/blacklist) Oozie instrumentation can produce 100+ metrics in Graphite, and your carbon configuration might prevent all of them from being created at once. In that case, just wait a couple of minutes (how long depends on your configured push interval and carbon configuration).
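The carbon setting that usually causes this is the per-minute limit on newly created whisper files (MAX_CREATES_PER_MINUTE in carbon.conf, as far as I know). A rough back-of-the-envelope estimate of the waiting time can be sketched as follows. The function name and the numbers in the usage line (120 metrics, a limit of 50) are illustrative assumptions, and the real delay also depends on your push interval.

```python
import math


def minutes_until_all_created(new_metrics: int, max_creates_per_minute: int) -> int:
    """Rough number of minutes carbon needs to create `new_metrics` whisper files."""
    if max_creates_per_minute <= 0:
        raise ValueError("max_creates_per_minute must be positive")
    # Assumption: carbon creates at most this many new files in any one minute.
    return math.ceil(new_metrics / max_creates_per_minute)


if __name__ == "__main__":
    # Illustrative numbers only: 120 new metrics, limit of 50 creates per minute.
    print(minutes_until_all_created(120, 50))
```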