Patching Oozie in a parcel-based CDH 5.8.0 Installation

by Charlotte.Rauch on 2016-09-04

This blog post will guide you through the process of cloning, patching, building and deploying a custom version of the Oozie workflow engine based on the CDH 5.8.0 source code that is available on GitHub.

Sometimes it is necessary to manually fix a bug in the version of a software you are currently running. In the case of the Cloudera Hadoop Distribution you can find the Git repositories of most components on Cloudera’s GitHub page, including CDH version tags. The issue we are dealing with is already documented in the Oozie JIRA and there is even a patch available. So all we have to do is apply the patch instead of fixing the bug ourselves.

What you need to build Oozie

  * Java (1.6+)
  * Maven3
  * Git
  * Patch

(see https://archive.cloudera.com/cdh5/cdh/5/oozie/ENG_Building.html for further details)
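
Before starting, it can save time to confirm that the tools listed above are actually on your PATH. The following is only a minimal sketch: the function name check_tools is made up for this example, and the command names (java, mvn, git, patch) are the usual ones but may differ on your system.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (command names are assumptions, adjust to your system):
#   check_tools java mvn git patch && echo "all build tools found"
```

This only checks that the commands exist, not their versions, so the Java and Maven versions still need to be checked by hand.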

Cloning the project

To apply a patch to your Oozie installation you need the Oozie Maven project on your computer. You can find it on GitHub at https://github.com/cloudera/oozie.git. The build artifacts will take up more than 800 MB, so make sure you have enough free disk space at the location you choose. Open a bash shell, go to the folder you want the project to be in and clone the repository as follows:

git clone https://github.com/cloudera/oozie.git

Change into the project's directory:

cd oozie

Check out the tag corresponding to your installation:

# replace 'cdh5.8.0-release' with whatever version you are running on the server
git checkout tags/cdh5.8.0-release

Build the project to see if everything is working out so far:

# replace '2.6.0-cdh5.8.0' and '0.12.0' with the versions matching your CDH release
#
# note: we needed to disable the tests due to
# https://issues.apache.org/jira/browse/OOZIE-2443

mvn clean package assembly:single \
-Dhadoop.version=2.6.0-cdh5.8.0 \
-Dpig.version=0.12.0 \
-DskipTests

Downloading and applying the patch

In our case we luckily don’t have to actually fix the bug ourselves. There already is a patch available for OOZIE-2649 that can be downloaded via wget or curl.

wget https://issues.apache.org/jira/secure/attachment/12826170/OOZIE-2649.5.patch

Applying it to the oozie project is as simple as follows:

patch -p0 < OOZIE-2649.5.patch

If all worked out you should get the following output:

patching file core/src/main/java/org/apache/oozie/action/oozie/SubWorkflowActionExecutor.java
Hunk #1 succeeded at 180 (offset -2 lines).
Hunk #2 succeeded at 208 (offset -2 lines).
patching file core/src/main/java/org/apache/oozie/workflow/lite/LiteWorkflowAppParser.java
Hunk #1 succeeded at 101 (offset -3 lines).
Hunk #2 succeeded at 480 (offset -3 lines).
patching file core/src/test/java/org/apache/oozie/action/oozie/TestSubWorkflowActionExecutor.java
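
The offsets in the output above show that the patch was written against slightly different sources, so on another CDH version some hunks might fail. A cautious variant is to rehearse the patch before letting it modify any files. This is only a sketch: it assumes GNU patch (the --dry-run option is not part of every patch implementation), and the function name apply_patch_checked is made up for this example.

```shell
# Apply a -p0 patch only if a dry run succeeds.
# --dry-run (GNU patch) reports what would happen without changing any file.
apply_patch_checked() {
    patch_file=$1
    if patch -p0 --dry-run < "$patch_file" >/dev/null; then
        patch -p0 < "$patch_file"
    else
        echo "patch does not apply cleanly: $patch_file" >&2
        return 1
    fi
}

# Usage, from the top-level folder of the oozie project:
#   apply_patch_checked OOZIE-2649.5.patch
```

If the dry run fails, the working tree is left untouched and you can inspect the patch before deciding how to proceed.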

Check which modules were actually patched. In our case this is “core”, which is, as you can see, the top-level folder of all patched files. Now rebuild the project:

# replace '2.6.0-cdh5.8.0' and '0.12.0' with the versions matching your CDH release
#
# note: we needed to disable the tests due to
# https://issues.apache.org/jira/browse/OOZIE-2443

mvn clean package assembly:single \
-Dhadoop.version=2.6.0-cdh5.8.0 \
-Dpig.version=0.12.0 \
-DskipTests

Now you need to locate the module’s .jar file produced by the build. Look for it in the module’s target folder:

#replace 'core' with whatever module was patched
ls -ls  core/target
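
If a patch touches many files, the affected modules can also be read from the patch file itself instead of from the patch output. A rough sketch, assuming the patch uses plain paths without a/ and b/ prefixes (which is what patch -p0 expects); the function name patched_modules is made up for this example:

```shell
# Print the distinct top-level folders touched by a unified diff.
patched_modules() {
    # "+++ core/src/Foo.java" -> "core"
    # "+++ /dev/null" (a deleted file) has no leading folder and is skipped
    sed -n 's|^+++ \([^/][^/]*\)/.*|\1|p' "$1" | sort -u
}

# Usage:
#   patched_modules OOZIE-2649.5.patch
```

Each folder printed is a module whose target folder should contain a freshly built .jar.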

Adding the patched .jar Files to your Oozie Installation

Now you are about to make actual changes to your setup. Be aware that you could break it, so make sure you know what you are doing. Do this at your own risk; SYSCREST is in no way responsible for any damage that might occur. If you are a paying Cloudera customer with a valid support subscription, we strongly discourage any custom modifications to your setup; please get in contact with Cloudera first.

The first step is to find out on which server Oozie is actually running. You can do this in Cloudera Manager by clicking on ‘Oozie’ and then on ‘Instances’. There might be more than one node running it, in which case the following steps need to be done on all of them. Stop the Oozie service via Cloudera Manager (on all nodes).

Oozie uses symbolic links pointing from /opt/cloudera/parcels/CDH/lib/oozie/libserver/ to the jars in /opt/cloudera/parcels/CDH/jars/. These locations are only valid for parcel-based installations, not for package-based installations. In the listing below, taken from an already patched host, oozie-core points to the patched jar:

ls -la /opt/cloudera/parcels/CDH/lib/oozie/libserver/oozie-*


...
lrwxrwxrwx 1 root root 45 Jul 27 00:07 oozie-client-4.1.0-cdh5.8.0.jar -> ../../../jars/oozie-client-4.1.0-cdh5.8.0.jar
lrwxrwxrwx 1 root root 31 Jul 27 00:07 oozie-client.jar -> oozie-client-4.1.0-cdh5.8.0.jar
lrwxrwxrwx 1 root root 56 Sep  2 12:14 oozie-core-4.1.0-cdh5.8.0.jar -> /opt/cloudera-patched/jars/oozie-core-4.1.0-cdh5.8.0.jar
lrwxrwxrwx 1 root root 29 Jul 27 00:07 oozie-core.jar -> oozie-core-4.1.0-cdh5.8.0.jar
...

So an elegant way to enable the fix without having to delete the original .jar is to simply change the link’s destination. This makes it easy to go back to the original if necessary. Get back to your bash shell, go to the server you want to add the .jar to and create a new folder in /opt. Call it whatever you like, e.g. ‘/opt/cloudera-patched/jars’, but it should probably be located outside of /opt/cloudera.

mkdir -p /opt/cloudera-patched/jars

Copy the fixed .jar file to your new directory:

# if you did not build it directly on that machine,
# you need to scp it to the host first

cp core/target/oozie-core-4.1.0-cdh5.8.0.jar /opt/cloudera-patched/jars/oozie-core-4.1.0-cdh5.8.0.jar
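
Especially when the .jar was copied over the network, it can be worth confirming that the file on the server is identical to the one you built. A minimal sketch, assuming sha256sum from GNU coreutils is available; the function name same_checksum is made up for this example:

```shell
# Succeed only if both files exist and have the same SHA-256 checksum.
same_checksum() {
    sum_a=$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)
    sum_b=$(sha256sum "$2" 2>/dev/null | cut -d' ' -f1)
    [ -n "$sum_a" ] && [ "$sum_a" = "$sum_b" ]
}

# Usage, when both files are on the same machine:
#   same_checksum core/target/oozie-core-4.1.0-cdh5.8.0.jar \
#       /opt/cloudera-patched/jars/oozie-core-4.1.0-cdh5.8.0.jar && echo "copy verified"
```

When build machine and server differ, run sha256sum on each side and compare the printed values by eye instead.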

Go to the /libserver folder on the server:

cd /opt/cloudera/parcels/CDH/lib/oozie/libserver

Have a look at the folder’s content:

ls -la oozie*

The output should look like this:

...
lrwxrwxrwx  1 root root   45 Jul 27 00:07 oozie-client-4.1.0-cdh5.8.0.jar -> ../../../jars/oozie-client-4.1.0-cdh5.8.0.jar
lrwxrwxrwx  1 root root   31 Jul 27 00:07 oozie-client.jar -> oozie-client-4.1.0-cdh5.8.0.jar
lrwxrwxrwx  1 root root   56 Sep  1 14:47 oozie-core-4.1.0-cdh5.8.0.jar -> ../../../jars/oozie-core-4.1.0-cdh5.8.0.jar
lrwxrwxrwx  1 root root   29 Jul 27 00:07 oozie-core.jar -> oozie-core-4.1.0-cdh5.8.0.jar
...
lrwxrwxrwx  1 root root   57 Jul 27 00:07 oozie-sharelib-streaming-4.1.0-cdh5.8.0.jar -> ../../../jars/oozie-sharelib-streaming-4.1.0-cdh5.8.0.jar
lrwxrwxrwx  1 root root   43 Jul 27 00:07 oozie-sharelib-streaming.jar -> oozie-sharelib-streaming-4.1.0-cdh5.8.0.jar
...

Find the file you want to fix and note down the original link target (just in case…). Here it would be:

../../../jars/oozie-core-4.1.0-cdh5.8.0.jar
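
Instead of noting the target down by hand, it can be written to a small backup file. The following is only a sketch: the function name save_link_target and the default backup file name link-targets.bak are made up for this example.

```shell
# Append "<link name> -> <current target>" to a backup file.
# Refuses to run on anything that is not a symbolic link.
save_link_target() {
    link=$1
    backup_file=${2:-link-targets.bak}   # hypothetical default file name
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    printf '%s -> %s\n' "$link" "$(readlink "$link")" >> "$backup_file"
}

# Usage, from within the libserver folder:
#   save_link_target oozie-core-4.1.0-cdh5.8.0.jar /opt/cloudera-patched/link-targets.bak
```

readlink without options prints the target exactly as stored in the link, so the recorded value can be passed to ln -s unchanged when reverting.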

Modify the link as follows:

ln -f -s /opt/cloudera-patched/jars/oozie-core-4.1.0-cdh5.8.0.jar oozie-core-4.1.0-cdh5.8.0.jar
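
Before restarting, you may want to confirm that the link really resolves to the patched .jar; a typo in the target path would leave a dangling link. A small sketch, assuming readlink from GNU coreutils (for its -f option); the function name link_resolves_to is made up for this example:

```shell
# Succeed if the symlink resolves to the expected file.
# 'readlink -f' follows all links and prints the absolute path.
link_resolves_to() {
    actual=$(readlink -f "$1")
    expected=$(readlink -f "$2")
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage, from within the libserver folder:
#   link_resolves_to oozie-core-4.1.0-cdh5.8.0.jar \
#       /opt/cloudera-patched/jars/oozie-core-4.1.0-cdh5.8.0.jar && echo "link ok"
```

The same check works after reverting, with the original .jar in /opt/cloudera/parcels/CDH/jars/ as the expected file.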

Finally restart Oozie using the Cloudera Manager.

Now your Oozie workflow manager is patched and back online.

How to revert the patch

If you want to go back to the original .jar, reset the filesystem link on the Oozie hosts by running the following in the libserver folder:

ln -f -s ../../../jars/oozie-core-4.1.0-cdh5.8.0.jar oozie-core-4.1.0-cdh5.8.0.jar

And don’t forget to restart Oozie after switching the jar.
