About Gauthier Leonard

Gauthier Leonard is a recently graduated Data Engineer specializing in Big Data. During his internship at Adaltas, he became familiar with the Hadoop ecosystem and the deployment of secure clusters by developing a cluster provisioning automation tool. Gauthier consolidated his skills during his first assignment as the Big Data reference person on a Data Lake project. He helped the customer design and install an HDP 3 cluster and set up a first data pipeline using NiFi, Hive 3 (Hive ACID and Hive LLAP) and Oozie.

Running Apache Hive 3, new features and tips and tricks

Apache Hive 3 brings a number of welcome new features to the data warehouse. Unfortunately, like many major FOSS releases, it comes with a few bugs and little documentation. It has been available since July 2018 as part of HDP3 (Hortonworks Data Platform version 3). I will first review the new features available with [...]

July 25th, 2019 | Categories: Big Data, DataWorks Summit 2019

Apache Beam: a unified programming model for data processing pipelines

In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for defining data processing pipelines. […]