Home 2018-01-26T14:24:00+00:00


Data Engineering

Data collection, data preparation, data lake, data governance

Data Science

Writing algorithms, Spark, machine learning, exploration, statistics, Python, R

Data Streaming

Message Bus, Key Performance Indicator (KPI), Threshold Detection, Time Window Queries, Intelligent Behaviors

Data Analytics

Visualization, notebooks

Latest articles

Present and future of Hadoop workflow scheduling: Oozie 5.x

May 23rd, 2018 | Categories: Big Data, DataWorks Summit 2018

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Ambari’s Workflow Scheduler and its way of designing and visualizing Apache Oozie workflows. The talk was [...]

Essential questions about Time Series

March 19th, 2018 | Categories: Big Data, Data Engineering

Today, the bulk of Big Data is temporal. We see it in the media and among our customers: smart meters, banking transactions, smart factories, connected vehicles … IoT and Big Data go hand in hand. [...]

Notes after Katacoda Training on Kubernetes Container Orchestration

December 14th, 2017 | Categories: Container

A few weeks ago, I dedicated two days to following the tutorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. I’m sharing my notes which I happen to [...]

Scaling massive, real-time data pipelines with Go

November 21st, 2017 | Categories: Open Source Summit Europe 2017

Last week at the Open Source Summit in Prague, Jean de Klerk held a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing the [...]