Apache ZooKeeper

Apache ZooKeeper is a coordination service built to manage large distributed systems. It coordinates the activities of different hosts and their access to shared data through robust synchronization techniques.

While presenting itself externally as a single service, ZooKeeper runs as a cluster of multiple server instances. This so-called ZooKeeper ensemble organizes itself by electing a master node (the leader), which takes the lead in synchronizing the cluster and maintaining consistency.
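To make the idea of an ensemble more concrete, here is a minimal sketch of the majority arithmetic behind it. It assumes the usual majority quorum, where the ensemble stays available as long as more than half of its servers are running. The function names `quorum_size` and `tolerated_failures` are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare a few ensemble sizes (illustrative values only).
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption, a fourth server does not tolerate more failures than three do, which is one reason ensembles are commonly sized with an odd number of servers.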

In a cluster, ZooKeeper provides the following services:

  • Naming service to identify and address nodes in a cluster
  • Cluster management to add or remove individual nodes
  • Synchronization service to manage the saving and changing of distributed data
  • Redundancy service to ensure high availability of data and services despite individual node failures
  • Information service to provide real-time node status information
  • Configuration service to provide real-time node configuration data
  • Leader election to appoint a master node
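The last item in the list can be illustrated with a small in-memory simulation. It is only a sketch of the general idea behind a common election recipe: each candidate registers under an increasing sequence number, the lowest number leads, and a candidate's entry disappears when it fails. The class `ElectionSim` and the host names are hypothetical. The code does not talk to a ZooKeeper server and does not reproduce any client library API.

```python
import itertools
from typing import Dict, Optional


class ElectionSim:
    """In-memory stand-in for an election registry; not a ZooKeeper client."""

    def __init__(self) -> None:
        self._counter = itertools.count()
        self._members: Dict[int, str] = {}  # sequence number -> host name

    def join(self, host: str) -> int:
        """Register a candidate and return its sequence number."""
        seq = next(self._counter)
        self._members[seq] = host
        return seq

    def leave(self, seq: int) -> None:
        """Remove a candidate, as happens when a node fails or disconnects."""
        self._members.pop(seq, None)

    def leader(self) -> Optional[str]:
        """The candidate holding the lowest sequence number leads."""
        if not self._members:
            return None
        return self._members[min(self._members)]


sim = ElectionSim()
a = sim.join("host-a")  # hypothetical host names
b = sim.join("host-b")
print(sim.leader())     # host-a joined first, so it leads
sim.leave(a)            # simulate the failure of host-a
print(sim.leader())     # host-b takes over
```

In a real deployment, ZooKeeper itself holds the registrations and removes the entries of failed nodes, so the remaining candidates can agree on who leads without talking to each other directly.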

Originally developed by Yahoo, ZooKeeper became a sub-project of Hadoop at Apache in 2008 before becoming a standalone top-level project. Today, ZooKeeper is a de facto standard for coordinating distributed services and is used by HBase, Hadoop and similar frameworks.

Related articles

Internship in Big Data infrastructure with TDP

Categories: Infrastructure, Learning | Tags: Cyber Security, DevOps, Java, Hadoop, IaC, Internship, TDP

Job Description: Big Data and distributed computing are at Adaltas’ core. We support our partners in the deployment, maintenance and optimization of some of France’s largest clusters. Adaltas is also an…

By Daniel HARTY

Oct 25, 2021

Lightweight containerization with Tupperware

Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: Btrfs, LXD, Red Hat, Systemd, Zookeeper, Cloud, Consensus

In this article, I will present Tupperware, a lightweight containerization system set up by Facebook. What is Tupperware? Tupperware is a homemade framework written and used internally at Facebook…



Nov 3, 2017

Advanced multi-tenant Hadoop and Zookeeper protection

Categories: Big Data, Infrastructure | Tags: DoS, iptables, Operation, Scalability, Zookeeper, Clustering, Consensus

ZooKeeper is a critical component of Hadoop’s high availability operation. The latter protects itself by limiting the maximum number of connections (maxConns = 400). However, ZooKeeper does not protect…



Jul 5, 2017

Components for CDH and HDP

Categories: Big Data | Tags: Flume, Hortonworks, Hadoop, Hive, Oozie, Sqoop, Zookeeper, Cloudera, CDH, HDP

I was interested in comparing the different components distributed by Cloudera and Hortonworks. This also gives us an idea of the versions packaged by the two distributions. At the time of this writing…


By David WORMS

Sep 22, 2013
