Posts

fixing spark classpath issues on CDH5 accessing Accumulo 1.7.2

We experienced some strange NoSuchMethorError while migrating a Accumulo based application from 1.6.0 to 1.7.2 running on CDH5. A couple of code changes where necessary moving from 1.6.0 to 1.7.2, but these were pretty straightforward (members visibility changed, some getters were introduced). Everything compiled fine, but when we executed the spark application on the cluster we got an exception that was pointing directly to a line we changed during the migration:

Exception in thread "main" java.lang.NoSuchMethodError: 
        com.syscrest.HelloworldDriver$Opts.getPrincipal()Ljava/lang/String;
        at com.syscrest.HelloworldDriver.main(HelloworldDriver.java:29)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

We double checked our fat jar that we bundle the right version and then we checked the full classpath of the CDH5 spark service.

the distributed spark classpath on CDH5

We noticed that CDH5 (CDH 5.8.2 in our case) already bundles four accumulo jars:

/opt/cloudera/parcels/CDH/jars/accumulo-core-1.6.0.jar
/opt/cloudera/parcels/CDH/jars/accumulo-fate-1.6.0.jar
/opt/cloudera/parcels/CDH/jars/accumulo-start-1.6.0.jar
/opt/cloudera/parcels/CDH/jars/accumulo-trace-1.6.0.jar

And all jars in /opt/cloudera/parcels/cdh/jars are automatically inserted into the spark distributed classpath, as they are listed in /etc/spark/conf/classpath.txt.

spark.{driver,executor}.userClassPathFirst

Before tampering with the spark distributed classpath we tried to get it working using

spark-submit \
  --conf "spark.driver.userClassPathFirst=true" \
  --conf "spark.executor.userClassPathFirst=true" \
  ....

but this resulted in another mismatch of libraries on the classpath and did not solved our initial problem:

org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:316)
at org.xerial.snappy.SnappyOutputStream.(SnappyOutputStream.java:79)
at org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:157)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:199)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:199)
at scala.Option.map(Option.scala:145)
at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:199)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:101)
at org.apache.spark.broadcast.TorrentBroadcast.(TorrentBroadcast.scala:84)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1051)
at org.apache.spark.rdd.NewHadoopRDD.(NewHadoopRDD.scala:77)
at org.apache.spark.SparkContext.newAPIHadoopRDD(SparkContext.scala:878)
at org.apache.spark.api.java.JavaSparkContext.newAPIHadoopRDD(JavaSparkContext.scala:516)

So we tried to fix the distributed spark classpath directly. Unfortunately /etc/spark/conf/classpath.txt is not modifiable via cloudera manager, and we did not want to change it manually on all nodes.

But as /etc/spark/conf/classpath.txt is being read in /etc/spark/conf/spark-env.sh

# Set distribution classpath. This is only used in CDH 5.3 and later.
export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt")

we were able to use Spark Service Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh – Spark (Service-Wide) to tamper with SPARK_DIST_CLASSPATH.

In our case it was sufficient to add the correct accumulo jars (version 1.7.2 from the installed parcel) at the beginning of the list so they take precedence over the outdated ones:

SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-core.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-fate.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-start.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-trace.jar:$SPARK_DIST_CLASSPATH
export SPARK_DIST_CLASSPATH

Using Accumulos RangePartioner in a m/r job (and Oozie workflow)

How to use Accumulos RangePartioner to increase your mr-job ingest rate (and the neccessary pieces to include it into an oozie worflow)
Read more

Upgrading an existing Accumulo 1.6 CDH5 cluster to HDFS HA

Theses steps are neccessary for Accumulo 1.6 after upgrading an existing CDH5 cluster to HDFS HA.
Read more

Accumulo 1.6 (CDH5) not available for Ubuntu 14.04 (Trusty Tahr)

If you want to deploy a CDH5 Cluster including Accumulo 1.6 on Ubuntu, you will need to stick to Ubuntu 12.04 Precise Pangolin for the time being.
Read more

Reinitializing an Accumulo cluster on CDH4 or CDH5

How to manually reinitialize an existing Accumulo cluster on CDH4 or CDH5
Read more