
About César Berezowski

César is a Big Data & Hadoop Solution Architect and Data Engineer with 2 years of hands-on experience in Hadoop and distributed systems. He's currently building a platform on GCP for Renault.

Apache Flink: past, present and future

Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink Forward 2018. […]

November 5th, 2018 | Categories: Big Data, Data Engineering | 0 Comments

From Dockerfile to Ansible Containers

Presentation by Tomas Tomecek from Red Hat’s containerization team. This talk was an introduction to the Dockerfile format and to the Ansible Container tool, followed by a comparison of the two. […]

October 25th, 2017 | Categories: Open Source Summit Europe 2017 | 0 Comments

Exposing Kafka on two different networks

A Big Data setup usually requires you to have multiple network interfaces; let’s see how to set up Kafka on more than one of them. Kafka is an open-source stream-processing platform that functions as a distributed publish/subscribe messaging system. It is designed for high throughput with built-in partitioning, replication, and fault tolerance. [...]

July 22nd, 2017 | Categories: Blog | 0 Comments

Change Ambari’s topbar color

We recently had a client with multiple environments (Production, Integration, Testing, ...) running on HDP, each managed by one Ambari instance per cluster. One of the questions that came up was the following: We need a way to distinguish our environments when on Ambari, and the cluster name alone is not visually distinctive enough, how can [...]

July 9th, 2017 | Categories: Hack | 1 Comment

MiNiFi: Data at Scales & the Values of Starting Small

This post is part of the series on the DataWorks Summit 2017 (ex-Hadoop Summit). The speaker is Aldrin Piri from Hortonworks. This talk briefly presented Apache NiFi and explained where MiNiFi came from: basically, it's a minimal NiFi agent to deploy on small devices to bring data to a cluster's NiFi pipeline (e.g. IoT). Here are [...]

July 8th, 2017 | Categories: Blog, Events | 0 Comments

Get in control of your workflows with Apache Airflow

Below is a compilation of my notes taken during the presentation of Airflow by Christian Trebing from BlueYonder. Introduction. Use case: how to handle data coming in regularly from customers? Option 1: use cron (only time triggers, hard error handling, inconvenient when overlapping). Option 2: writing a workflow processing tool: start is easy [...]

July 17th, 2016 | Categories: Events, Tech Radar | 0 Comments