About Leo Schoukroun

Léo is a Big Data & Hadoop solution architect with 2 years of experience on Hadoop and distributed systems. He is proficient with Big Data platforms: from planning and designing cluster architectures to deployment, administration, monitoring and industrialization of these clusters. Léo has worked on topics such as security, availability, replication and multi-tenancy of Hadoop clusters in collaboration with business users, analysts, data scientists, engineers and operations teams.

Auto-scaling Druid with Kubernetes

Apache Druid is an open-source analytics data store which could leverage the auto-scaling abilities of Kubernetes due to its distributed nature and its reliance on memory. I was inspired by the talk “Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes” by Jinchul Kim during DataWorks Summit 2019 Europe in Barcelona. […]

Hadoop cluster takeover with Apache Ambari

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari, an operation we called the Ambari Takeover. This is a risky process, and we will detail why the operation was required and how we carried it out. […]

November 15th, 2018 | Categories: Adaltas Summit 2018, Big Data | 0 Comments

Present and future of Hadoop workflow scheduling: Oozie 5.x

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covered the new features released in Oozie 5.0 as well as upcoming features of Oozie 5.x, which is the main subject of this article. The speakers also spent some time discussing Apache Ambari’s Workflow Scheduler and its way [...]

May 23rd, 2018 | Categories: Big Data, DataWorks Summit 2018 | 2 Comments

Apache Thrift VS REST

Adaltas recently attended the Open Source Summit Europe 2017 in Prague. I had the opportunity to follow a presentation by Randy Abernethy and Jens Geyer of RM-X, a cloud-native consulting company, about the use of Apache Thrift in building high-performance microservices. The focus was that Thrift is very fast and [...]

October 28th, 2017 | Categories: Open Source Summit Europe 2017 | 0 Comments