Home 2018-01-26T14:24:00+00:00


Data Engineering

Data collection, data preparation, data lake, data governance

Data Science

Writing algorithms, Spark, machine learning, exploration, statistics, Python, R

Data Streaming

Message Bus, Key Performance Indicator (KPI), Threshold Detection, Time Window Queries, Intelligent Behaviors

Data Analytics

Visualization, notebooks

Latest articles

Omid: Scalable and highly available transaction processing for Apache Phoenix

May 24th, 2018 | Categories: Big Data, DataWorks Summit 2018, Events

This article summarizes my understanding of Apache Omid, based on the online documentation and the talk given at the DataWorks Summit 2018 in Berlin. […]

Apache Beam: a unified programming model for data processing pipelines

May 24th, 2018 | Categories: Big Data, Data Engineering, DataWorks Summit 2018, Events

In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for defining data processing pipelines. […]

Present and future of Hadoop workflow scheduling: Oozie 5.x

May 23rd, 2018 | Categories: Big Data, DataWorks Summit 2018

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Ambari’s Workflow Scheduler and its way of designing and visualizing Apache Oozie workflows. The talk was [...]

Essential questions about Time Series

March 19th, 2018 | Categories: Big Data, Data Engineering

Today, the bulk of Big Data is temporal. We see it in the media and among our customers: smart meters, banking transactions, smart factories, connected vehicles... IoT and Big Data go hand in hand. [...]

Notes after Katacoda Training on Kubernetes Container Orchestration

December 14th, 2017 | Categories: Container

A few weeks ago, I dedicated two days to following the tutorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. I’m sharing my notes which I happen to [...]