Zookeeper is a critical component of Hadoop's high availability operation. It protects itself by capping the maximum number of simultaneous connections (maxConns = 400). However, Zookeeper does not enforce this limit intelligently: once the threshold is reached, it simply refuses new connections. When that happens, the core components (HBase RegionServers, HDFS ZKFC) can no longer establish a connection and the service becomes degraded or unavailable!
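The exact Zookeeper property behind this limit is not named above. Assuming it is the per-client-IP cap exposed in zoo.cfg, it would be set along these lines (the property name comes from the standard Zookeeper configuration, the 400 value from the paragraph above):

    # /etc/zookeeper/conf/zoo.cfg
    # Maximum number of simultaneous connections accepted from a single client IP
    maxClientCnxns=400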

It is very easy to run a DoS attack against Zookeeper. That said, because most Zookeeper installations sit inside trusted or semi-trusted zones, these attacks are usually involuntary: it only takes a careless developer launching custom code that opens Zookeeper sessions in a loop without ever closing them. When that happens, Zookeeper is compromised, and every component that depends on it goes down with it.
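As an illustration only, here is a minimal sketch of that kind of careless client code, using the standard Zookeeper Java client; the hostname and class name are placeholders:

    import org.apache.zookeeper.ZooKeeper;

    public class LeakySessions {
      public static void main(String[] args) throws Exception {
        while (true) {
          // A new session is opened at every iteration and never closed:
          // connections pile up until the server-side limit is reached.
          new ZooKeeper("zookeeper-host.example.com:2181", 30000, event -> {});
          Thread.sleep(100);
        }
      }
    }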

Solutions

Several workarounds can be set up independently or jointly.

Using Observers

Observers are a special kind of Zookeeper node:

  • They do not take part in the quorum
  • They stay synchronized with the participating nodes
  • They forward write requests to the participating nodes

They therefore make it possible to increase the number of nodes without slowing down the election process.
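Concretely, following the standard Zookeeper observers mechanism, an observer declares itself with a peerType setting and is flagged with the :observer suffix in the server list shared by the whole ensemble; a minimal generic sketch (the hostname is a placeholder):

    # zoo.cfg on the observer node itself
    peerType=observer
    # zoo.cfg server list, identical on every node of the ensemble
    server.4=observer-host.example.com:2888:3888:observer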

Using iptables

It is possible to protect yourself from an external DoS with iptables: the number of connections to the Zookeeper port (2181) can be limited per source IP address. This allows a limit lower than Zookeeper's maxConns, so a misbehaving address is blocked without cutting off access from other machines.

Example

Imagine the following cluster topology:

  • 3 edge nodes: edge1.adaltas.com, edge2.adaltas.com, edge3.adaltas.com
  • 3 master nodes: master1.adaltas.com, master2.adaltas.com, master3.adaltas.com
  • n worker nodes: not involved in this case
  • These machines are located in the subnet: 10.10.10.0/24

The master nodes are used as the voting quorum and the edge nodes as observers in order to absorb the client load.

NB: the even total number of nodes is not a problem, since only the 3 masters are voting participants.

Zookeeper configuration

We set up these configurations (/etc/zookeeper/conf/zoo.cfg):

On the master nodes:
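Something along these lines, with the edge nodes flagged as observers (paths, timings and server ids are illustrative defaults):

    # /etc/zookeeper/conf/zoo.cfg on master1, master2, master3
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    # Voting members of the ensemble
    server.1=master1.adaltas.com:2888:3888
    server.2=master2.adaltas.com:2888:3888
    server.3=master3.adaltas.com:2888:3888
    # Edge nodes declared as observers
    server.4=edge1.adaltas.com:2888:3888:observer
    server.5=edge2.adaltas.com:2888:3888:observer
    server.6=edge3.adaltas.com:2888:3888:observer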

On the edge nodes:
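The server list stays identical; the distinguishing setting on the observers themselves is:

    # /etc/zookeeper/conf/zoo.cfg on edge1, edge2, edge3
    # Same server.N list as on the masters, plus:
    peerType=observer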

On the master nodes, the following iptables rule forbids communication on port 2181 with external machines (only the local network is allowed):
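A minimal sketch of such a rule, assuming the 10.10.10.0/24 subnet from the topology above:

    # Drop connections to the Zookeeper client port coming from outside the cluster subnet
    iptables -A INPUT -p tcp --dport 2181 ! -s 10.10.10.0/24 -j DROP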

Thus these zookeeper instances are only accessed by our internal services and processes.

On the edge nodes, the following rule limits external machines to 100 simultaneous connections per source IP on port 2181:
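For instance, with the connlimit module (the exact threshold and reject action can be adapted):

    # Reject new connections to port 2181 once a single source IP already holds 100 of them
    iptables -A INPUT -p tcp --syn --dport 2181 \
      -m connlimit --connlimit-above 100 --connlimit-mask 32 \
      -j REJECT --reject-with tcp-reset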

Hadoop configuration

For Hadoop services (HDFS ZKFC, HBase Master, etc.), the following connection string is specified:
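That is, a string targeting the three masters, typically set through ha.zookeeper.quorum in core-site.xml for HDFS and hbase.zookeeper.quorum for HBase (the latter may expect bare hostnames depending on the version, with the port in hbase.zookeeper.property.clientPort):

    master1.adaltas.com:2181,master2.adaltas.com:2181,master3.adaltas.com:2181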

For “client” configurations (YARN containers, HBase clients, third-party applications, etc.), the following string is specified:
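Here the string only targets the observers:

    edge1.adaltas.com:2181,edge2.adaltas.com:2181,edge3.adaltas.com:2181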

Thus, when an external job or application launches, it cannot saturate the quorum and does not compromise the state of the cluster.

Going further: siloing the observer nodes

If an external “fraudulent” application uses the string edge1.adaltas.com:2181,edge2.adaltas.com:2181,edge3.adaltas.com:2181, it may saturate the 3 observer nodes. In that case, although the cluster remains stable, some services will be unavailable, since clients will no longer be able to reach Zookeeper.
We can limit the impact of saturated observers by splitting the connection string into several sub-strings, each specified in a subset of the client configurations. Example:

  • Chain 1: edge1.adaltas.com, edge2.adaltas.com
  • Chain 2: edge1.adaltas.com, edge3.adaltas.com
  • Chain 3: edge2.adaltas.com, edge3.adaltas.com

Thus, if chain 1 is saturated, edge3 remains available: applications targeting chains 2 and 3 will not be blocked.
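Concretely, the three chains translate into the following client connection strings:

    # Chain 1
    edge1.adaltas.com:2181,edge2.adaltas.com:2181
    # Chain 2
    edge1.adaltas.com:2181,edge3.adaltas.com:2181
    # Chain 3
    edge2.adaltas.com:2181,edge3.adaltas.com:2181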