Pierre SAUVAGE

Big Data Solution Architect

Passionate about computer science since childhood and programming as a hobby since adolescence, Pierre joined an engineering school, where he specialized in Information Systems with a Big Data option.

He began his career in an IoT research laboratory, where he studied distributed systems both theoretically and practically. Pierre then joined Adaltas. Today he is a Big Data & Hadoop Solution Architect and Data Engineer with over 4 years of hands-on experience in Hadoop and 5 years of experience with distributed systems. He designs, develops and maintains data processing workflows and real-time services, and brings clients a unified and consistent vision of data management and workflows across their different data sources and business requirements. He steps in at all levels of data platforms, from planning, design and architecture to cluster deployment, administration and maintenance, as well as prototyping and application development in collaboration with business users, analysts, data scientists, engineering and operational teams.

He also has solid experience as an educator: he regularly gives courses and training on Big Data at various engineering and master's schools, facilitating knowledge transfer and the training of teams.

Published articles

TensorFlow installation on Docker

Categories: Containers Orchestration, Data Science, Learning | Tags: AI, CPU, Deep Learning, Docker, Jupyter, Linux, TensorFlow

TensorFlow is open source software from Google for numerical computation using a graph representation: vertices (nodes) represent mathematical operations, edges represent N-dimensional data array…

By Pierre SAUVAGE

Aug 5, 2019

Druid and Hive integration

Categories: Big Data, Business Intelligence, Tech Radar | Tags: Druid, Hive, Data Analytics, Learning and tutorial, LLAP, OLAP, SQL

This article covers the integration between Hive Interactive (LLAP) and Druid. One can see it as a complement to the Ultra-fast OLAP Analytics with Apache Hive and Druid article. Tools description…

By Pierre SAUVAGE

Jun 17, 2019

Kubernetes Storage Primitives for Stateful Workloads

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Docker, Kubernetes, Container Storage Interface (CSI), PVC, Azure, Storage, GCE

This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…

By Pierre SAUVAGE

Oct 28, 2017

Advanced multi-tenant Hadoop and Zookeeper protection

Categories: Big Data, Infrastructure | Tags: Zookeeper, Clustering, DoS, iptables, Operation, Scalability

Zookeeper is a critical component of Hadoop’s high availability operation. The latter protects itself by limiting the maximum number of connections (maxConns = 400). However, Zookeeper does not protect…

By Pierre SAUVAGE

Jul 5, 2017

Apache Apex with Apache SAMOA

Categories: Data Science, Events, Tech Radar | Tags: Apex, Flink, Samoa, Storm, Machine Learning, Tools, Hadoop

Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Parameter tuning Test: Accuracy Online…

By Pierre SAUVAGE

Jul 17, 2016

Network Namespace without Docker

Categories: Hack | Tags: DNS, Docker, Linux, Namespaces, Network, VLAN

Let’s imagine the following use case: I am connected to several networks (wlan0, eth0, usb0). I want to choose which network I’m gonna use when I launch apps. My app doesn’t allow me to choose a…

By Pierre SAUVAGE

Jul 6, 2016

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into production projects, reduce their costs and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.