Data Engineering

Data Collection, Data Preparation, Data Lake, Data Governance

Data Science

Writing algorithms, Spark, Machine Learning, exploration, statistics, Python, R

Data Streaming

Message Bus, Key Performance Indicator (KPI), Threshold Detection, Time Window Queries, Intelligent Behaviors

Data Analytics

Visualization, notebooks

Latest articles

Multihoming on Hadoop

March 5th, 2019 | Categories: Adalas Summit 2018, Big Data, Data Engineering

Multihoming, which means having multiple networks attached to one node, is one of the main components for managing the heterogeneous network usage of an Apache Hadoop cluster. This article is an introduction to the concept [...]

Introduction to Cloudera Data Science Workbench

February 28th, 2019 | Categories: Big Data, Data Engineering, Data Science, ML

Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. It thus lets them focus on their main task, which is deriving [...]

Apache Knox made easy!

February 4th, 2019 | Categories: Adalas Summit 2018, Big Data, Cyber Security, Data Governance

Apache Knox is the secure entry point of a Hadoop cluster, but can it also be the entry point for my REST applications? […]

Installing Kubernetes on CentOS 7

January 29th, 2019 | Categories: Adalas Summit 2018, Container, DevOps, Uncategorized

This article explains how to install a Kubernetes cluster. I will dive into what each step does so you can build a thorough understanding of what is going on. […]

Self-sovereign identities with verifiable claims

January 23rd, 2019 | Categories: Adalas Summit 2018, Cyber Security, Data Governance

Towards a trusted, personal, persistent, and portable digital identity for all. […]

Applying Deep Reinforcement Learning to Poker

January 9th, 2019 | Categories: Data Science, Deep Learning

We will cover the subject of Deep Reinforcement Learning, more specifically the Deep Q-Learning algorithm introduced by DeepMind, and then we'll apply a version of this algorithm to the game of Poker. Reinforcement learning [...]

Monitoring a production Hadoop cluster with Kubernetes

December 21st, 2018 | Categories: Container, Data Engineering, DevOps

Monitoring a production-grade Hadoop cluster is a real challenge, and the monitoring setup needs to evolve constantly. The software we use today is based on Nagios. Very efficient when it comes to the simplest surveillance, it [...]

CodaLab – Data Science competitions

December 17th, 2018 | Categories: Big Data, Data Science

CodaLab Competition is a platform for code execution in the field of Data Science. It is a web interface on which a user can submit code or results and compare themselves to others. Let’s see [...]