Who we are

We are located in Hanover (Germany), and we embrace big data technologies. We help our international customers understand new technologies, select the right building blocks, and tailor them to their individual business cases.

What we do

We design scalable systems for structured and unstructured data. We build resilient connections between systems. We implement big data technologies and coach teams in using them.

Technologies

We scale your data processing with Hadoop, Spark, or streaming technologies like Apache Flink or Apache Storm. We create analytics tools based on Hue, Apache Pig, Presto, Hive, Cassandra, or HBase. We implement resilient, interconnected data processing with Oozie, Airflow, or Schedoscope. Read on…

Latest posts from our developer blog

Patching Oozie in a parcel-based CDH 5.8.0 Installation

This blog post will guide you through the process of cloning, patching, building, and deploying a custom version of the Oozie workflow engine based on the CDH 5.8.0 source code that is available on GitHub. Sometimes it is necessary to manually…

How to access a remote HA-enabled HDFS in a (Oozie) DistCp action

How to inject the configuration of a remote HA-HDFS into a DistCp call without modifying the local cluster configuration. Accessing a remote HDFS that has high availability enabled is not as straightforward as it used to be with non-HA…
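
The idea can be sketched as an Oozie DistCp action that passes the remote nameservice definition as job-level configuration. All names here (the `remotens` nameservice, namenode hosts, and paths) are placeholders, and the exact property set depends on your Hadoop version; treat this as an illustrative fragment, not a drop-in workflow:

```xml
<action name="copy-from-remote">
  <distcp xmlns="uri:oozie:distcp-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Declare the remote nameservice for this job only;
           the local cluster configuration stays untouched. -->
      <property>
        <name>dfs.nameservices</name>
        <value>localns,remotens</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.remotens</name>
        <value>nn1,nn2</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.remotens.nn1</name>
        <value>remote-nn1.example.com:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.remotens.nn2</name>
        <value>remote-nn2.example.com:8020</value>
      </property>
      <property>
        <name>dfs.client.failover.proxy.provider.remotens</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
    </configuration>
    <arg>hdfs://remotens/path/on/remote</arg>
    <arg>${nameNode}/path/on/local</arg>
  </distcp>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

With the failover proxy provider configured, the DistCp client can resolve `hdfs://remotens` to whichever remote namenode is currently active.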

Best Practices using PigServer (embedded Pig)

Things you should be aware of when executing Pig scripts within your own Java application using PigServer. Using PigServer in your own Java application is a great way to leverage the simplicity of Pig scripts, especially if you are generating…
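
As a rough illustration of the embedded approach, a minimal PigServer sketch might look like the following. The input path is hypothetical, local mode is used only to avoid a cluster dependency, and this is a sketch rather than a statement of the best practices the post covers (it requires the Pig libraries on the classpath):

```java
import java.util.Iterator;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

public class EmbeddedPigExample {
    public static void main(String[] args) throws Exception {
        // LOCAL mode runs without a Hadoop cluster; use MAPREDUCE in production.
        PigServer pig = new PigServer(ExecType.LOCAL);
        try {
            // Register statements one by one; nothing runs until output is requested.
            pig.registerQuery("lines = LOAD 'input.txt' AS (line:chararray);");
            pig.registerQuery("words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;");

            // openIterator triggers execution and streams the results back.
            Iterator<Tuple> it = pig.openIterator("words");
            while (it.hasNext()) {
                System.out.println(it.next().get(0));
            }
        } finally {
            pig.shutdown(); // always release resources; PigServer is not thread-safe
        }
    }
}
```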

Passing many parameters from Java action to Oozie workflow

Oozie's 'capture-output' is a powerful way to pass dynamic configuration properties from action to action, but you may hit the maximum size limit quite fast. When executing Java actions in an Oozie workflow, there are going to be cases…
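
A minimal sketch of the capture-output contract on the Java side: Oozie passes the path of the output file to the action's JVM via the `oozie.action.output.properties` system property, and the action writes a standard `java.util.Properties` file there. The property keys and values below are invented for illustration:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.Properties;

public class CaptureOutputAction {
    public static void main(String[] args) throws Exception {
        // Oozie points this system property at the file it reads back
        // as the action's captured output.
        String outputFile = System.getProperty("oozie.action.output.properties");
        if (outputFile == null) {
            throw new IllegalStateException(
                "capture-output file not set; not running under Oozie?");
        }

        Properties props = new Properties();
        // Hypothetical values; keep the payload small, since the total
        // captured size is capped (oozie.action.max.output.data).
        props.setProperty("partitionDate", "2016-09-01");
        props.setProperty("recordCount", "42");

        try (OutputStream os = new FileOutputStream(new File(outputFile))) {
            props.store(os, "captured by Java action");
        }
    }
}
```

Downstream actions can then read the values with the `wf:actionData` EL function, e.g. `${wf:actionData('java-node')['partitionDate']}`.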