Who we are

We are located in Hanover (Germany), and we embrace big data technologies. We help our international customers to understand new technologies, to select the right building blocks and to tailor them to individual business cases.

What we do

We design systems for unstructured and structured data that scale. We build resilient connections between systems. We implement big data technologies and coach in their use.

Technologies

We scale your data processing with Hadoop, Spark or streaming technologies like Apache Flink or Apache Storm. We create analytics tools based on Apache Hue, Apache Pig, Presto, Hive, Cassandra or HBase. We implement resilient interconnected data processing with Oozie, Airflow or Schedoscope. Read on…

Latest posts from our developer blog

fixing spark classpath issues on CDH5 accessing Accumulo 1.7.2

We experienced some strange NoSuchMethodError exceptions while migrating an Accumulo-based application from 1.6.0 to 1.7.2 running on CDH5. A couple of code changes were necessary when moving from 1.6.0 to 1.7.2, but these were pretty straightforward (members…

how to collect cloudera manager usage data with google analytics

Cloudera Manager is already capable of tracking usage data via Google Analytics, but that data is being sent to a Cloudera account. This blog post is about configuring Cloudera Manager and changing the tracking ID so that these usage…

Patching Oozie in a parcel-based CDH 5.8.0 Installation

This blog post will guide you through the process of cloning, patching, building and deploying a custom version of the Oozie workflow engine based on the CDH 5.8.0 source code that is available on GitHub. Sometimes it is necessary to manually…

how to access a remote ha-enabled hdfs in a (oozie) distcp action

How to inject the configuration of a remote HA HDFS into a DistCp call without modifying the local cluster configuration. Accessing a remote HDFS that has high availability enabled is not as straightforward as it used to be with non-HA…