Consensus

The consensus problemes can be summarize as:

Multiple clients can each send a request to a system.
The system must choose which one to handle next.
The system is implemented by multiple computers.
They must choose a single request even is some computers fail.

Learn more: Leslie Lamport on the Plaxos algorithm (video)
Related tags: Algorithm; Apache Zookeeper

State of the Hadoop open-source ecosystem in early 2013

Categories: Big Data | Tags: Flume, Mesos, Phoenix, Pig, Hadoop, Kafka, Mahout, Data Science

Hadoop is already a large ecosystem and my guess is that 2013 will be the year where it grows even larger. There are some pieces that we no longer need to present. ZooKeeper, hbase, Hive, Pig, Flume…

By David WORMS

Jul 8, 2013

Advanced multi-tenant Hadoop and Zookeeper protection

Categories: Big Data, Infrastructure | Tags: DoS, iptables, Operation, Scalability, Zookeeper, Clustering, Consensus

Zookeeper is a critical component to Hadoop’s high availability operation. The latter protects itself by limiting the number of maximum connections (maxConns = 400). However Zookeeper does not protect…

By Pierre SAUVAGE

Jul 5, 2017

Lightweight containerization with Tupperware

Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: Btrfs, LXD, Red Hat, Systemd, Zookeeper, Cloud, Consensus

In this article, I will present lightweight containerization set up by Facebook called Tupperware. What is Tupperware Tupperware is a homemade framework written and used internally at Facebook…

By Lucas BAKALIAN

Nov 3, 2017

A CoreOS development cluster with Vagrant and VirtualBox

Categories: Hack, Infrastructure | Tags: Arch Linux, CoreOS, Linux, VirtualBox, etcd, Vagrant

Following CoreOS’s instructions on how to set up a development environment in VirtualBox did not work out well for me. Here are the steps I followed to get Container Linux up and running with Vagrant…

By Arthur BUSSER

Jun 20, 2018

Spark Streaming part 2: run Spark Structured Streaming pipelines in Hadoop

Categories: Data Engineering, Learning | Tags: Spark, Apache Spark Streaming, Python, Streaming

Spark can process streaming data on a multi-node Hadoop cluster relying on HDFS for the storage and YARN for the scheduling of jobs. Thus, Spark Structured Streaming integrates well with Big Data…

By Oskar RYNKIEWICZ

May 28, 2019

Blockchain 101: Blockchains and Consensus Mechanisms

Categories: Adaltas Summit 2021, Infrastructure, Learning | Tags: Cryptography, Infrastructure, Blockchain, Consensus

Cryptocurrencies are booming in 2021, with a market cap moving from 750 to more than 3,000 billion dollars. Let’s face it, this is mainly due to speculation. A lot of people involved do not have a…

By Gauthier LEONARD

Jan 18, 2022

Blockchain 102: Cryptocurrencies, Wallets and DApps

Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus

A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange’s…