Infrastructure

Define and implement the appropriate hardware, software, network resources and services on bare-metal, cloud, virtualized and containerized environments.

Nobody* puts Java in a Container

This talk covered the issues of running Java in a container and how the latest version of the JDK is more aware of the container it runs in. It was presented by @joerg_schad, Distributed Software Engineer at Mesosphere, at the Open Source Summit 2017 in Prague. […]
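As a rough illustration of what "container awareness" means in practice, the sketch below prints the CPU count and maximum heap size that the JVM believes it has. The class name `ContainerView` is hypothetical and is not taken from the talk. On a JDK that is not container-aware, these values typically reflect the host machine rather than the limits set on the container.

```java
// Hypothetical class name; prints the resources the JVM believes it has.
public class ContainerView {

    // Number of processors the JVM reports as available.
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("CPUs visible to the JVM: " + visibleCpus());
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running the same class inside a container with CPU and memory limits, once on an older JDK and once on a recent one, is a simple way to compare what each version detects.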

MariaDB integration with Hadoop

During a workshop with one of our customers, Adaltas identified a potential risk in using MariaDB's High Availability (HA) strategy. Since the customer selected Cloudera's CDH 5 distribution, the reasoning below is based on Cloudera's official documentation. However, it applies to all Hadoop distributions, including Hortonworks. Cloudera lists the various databases supported in HA [...]

July 31st, 2017 | Categories: Big Data, Infrastructure | 0 Comments

Exposing Kafka on two different networks

A Big Data setup usually requires multiple network interfaces. Let's see how to set up Kafka on more than one of them. Kafka is an open-source stream processing platform which functions like a distributed publish/subscribe messaging system. It is designed for high throughput with built-in partitioning, replication, and fault tolerance. [...]

July 22nd, 2017 | Categories: Big Data, Infrastructure | 0 Comments

MiNiFi: Data at Scales & the Values of Starting Small

This talk briefly presented Apache NiFi and explained where MiNiFi came from: essentially, it is a minimal NiFi agent deployed on small devices to bring data into a cluster's NiFi pipeline (e.g., IoT). This post is part of the series on the DataWorks Summit 2017 (formerly Hadoop Summit), and the speaker is Aldrin Piri from Hortonworks. [...]

July 8th, 2017 | Categories: Blog, Data Engineering, Events, Infrastructure | 0 Comments

HDP cluster supervision

With the current growth of Big Data technologies, more and more companies are building their own clusters in the hope of getting value from their data. One main concern while building these infrastructures is the capacity to continuously monitor the cluster's health and report issues as fast as possible. This is where supervision comes in. There [...]

July 5th, 2017 | Categories: Big Data, DevOps, Infrastructure | 2 Comments

Advanced multi-tenant Hadoop and Zookeeper protection

Zookeeper is a critical component of Hadoop's high-availability operation. It protects itself by limiting the maximum number of connections (maxConns = 400). However, Zookeeper does not protect itself intelligently: it simply refuses connections once the threshold is reached. In that case, the core components (HBase RegionServers / HDFS ZKFC) will no longer be able [...]

July 5th, 2017 | Categories: Big Data, Infrastructure | 0 Comments