Blog, last published articles

Present and future of Hadoop workflow scheduling: Oozie 5.x

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covered the new features released in Oozie 5.0 as well as those planned for future Oozie 5.x releases, which are the main subject of this article. The speakers spent some time discussing Apache Ambari’s Workflow Scheduler and its way [...]

May 23rd, 2018 | Categories: Big Data, DataWorks Summit 2018 | 2 Comments

Essential questions about Time Series

Today, the bulk of Big Data is temporal. We see it in the media and among our customers: smart meters, banking transactions, smart factories, connected vehicles … IoT and Big Data go hand in hand. […]

March 19th, 2018 | Categories: Big Data, Data Engineering | 0 Comments

Notes after Katacoda Training on Kubernetes Container Orchestration

A few weeks ago, I dedicated two days to following the tutorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. I’m sharing my notes, which I happen to use regularly as a cheat sheet. […]

December 14th, 2017 | Categories: Container | 0 Comments

Open Source Summit 2017 – a week in Prague

The Adaltas crew went to the Open Source Summit 2017 as well as the Mesos Summit 2017, both held in Prague about three weeks ago. On this occasion, we compiled a series of articles about the conferences that made the strongest impression on us. Over the 3-day period of the Open Source Summit, there is no doubt [...]

November 23rd, 2017 | Categories: Events | 0 Comments

Scaling massive, real-time data pipelines with Go

Last week at the Open Source Summit in Prague, Jean de Klerk held a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing the steps Jean went through when optimising his pipelines, explaining critical parts of his code and reproducing his benchmark results. [...]

Mesos Introduction

Apache Mesos is an open source cluster management project designed to implement and optimize distributed systems. Mesos enables fine-grained, dynamic management and sharing of resources between different nodes and for various applications. This article covers Mesos architecture, its fundamentals, and its support for NVIDIA GPUs. […]

November 15th, 2017 | Categories: Open Source Summit Europe 2017 | 0 Comments

Micro Services

Back in the day, applications were monolithic and we could use an IP address to access a service. With virtual machines (VMs), multiple hosts started to appear on the same machine, each running multiple apps. Things remained similar between VMs and physical machines, as services were still accessible from an IP address. With microservices, things changed [...]

November 14th, 2017 | Categories: Open Source Summit Europe 2017 | 0 Comments