Articles published in 2016

Hive Metastore HA with DBTokenStore: Failed to initialize master key

Hive Metastore HA with DBTokenStore: Failed to initialize master key

Categories: Big Data, DevOps & SRE | Tags: Hive, Bug, Infrastructure

This article describes my little adventure around a startup error with the Hive Metastore. It shall be reproducable with any secure installation, meaning with Kerberos, with high availability enabled…

By David WORMS

Jul 21, 2016

EclairJS - Putting a Spark in Web Apps

EclairJS - Putting a Spark in Web Apps

Categories: Data Engineering, Front End | Tags: Spark, JavaScript, Jupyter

Presentation by David Fallside from IBM, images extracted from the presentation. Introduction Web Apps development has moved from Java to NodeJS and Javascript. It provides a simple and rich…

By David WORMS

Jul 17, 2016

Apache Apex with Apache SAMOA

Apache Apex with Apache SAMOA

Categories: Data Science, Events, Tech Radar | Tags: Apex, Flink, Samoa, Storm, Machine Learning, Tools, Hadoop

Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Paremeter tuning Test: Accuracy Online…

By Pierre SAUVAGE

Jul 17, 2016

Apache Apex: next gen Big Data analytics

Apache Apex: next gen Big Data analytics

Categories: Data Science, Events, Tech Radar | Tags: Apex, Flink, Kafka, Storm, Data Science, Machine Learning, Tools, Hadoop

Below is a compilation of my notes taken during the presentation of Apache Apex by Thomas Weise from DataTorrent, the company behind Apex. Introduction Apache Apex is an in-memory distributed parallel…

By César BEREZOWSKI

Jul 17, 2016

Get in control of your workflows with Apache Airflow

Get in control of your workflows with Apache Airflow

Categories: Big Data, Tech Radar | Tags: Airflow, Cloud, DevOps, Python

Below is a compilation of my notes taken during the presentation of Apache Airflow by Christian Trebing from BlueYonder. Introduction Use case: how to handle data coming in regularly from customers…

By César BEREZOWSKI

Jul 17, 2016

Hive, Calcite and Druid

Hive, Calcite and Druid

Categories: Big Data | Tags: Analytics, Druid, Hive, Database, Hadoop

BI/OLAP requires interactive visualization of complex data streams: Real time bidding events User activity streams Voice call logs Network trafic flows Firewall events Application KPIs Traditionnal…

By David WORMS

Jul 14, 2016

Network Namespace without Docker

Network Namespace without Docker

Categories: Hack | Tags: DNS, Docker, Linux, Namespaces, Network, VLAN

Let’s imagine the following use case: I am connected to several networks (wlan0, eth0, usb0). I want to choose which network I’m gonna use when I launch apps. My app doesn’t allow me to choose a…

By Pierre SAUVAGE

Jul 6, 2016

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.