Cluster

A cluster is a group of two or more nodes which work together. Each node is identified in the cluster by its IP address and/or domain name and has its own storage, RAM, CPU, etc. Server clusters provide access to better resource availability, scalability and reliability.

Related articles

Hadoop Ozone part 3: advanced replication strategy with Copyset

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes, Node

Hadoop Ozone provide a way of setting a ReplicationType for every write you make on the cluster. Right now is supported HDFS and Ratis but more advanced replication strategies can be achieved. In this…

Hadoop Ozone part 2: tutorial and getting started of its features

Categories: Infrastructure | Tags: HDFS, CLI, Learning and tutorial, REST, Ozone, Amazon S3, Cluster

The releases of Hadoop Ozone come with a handy docker-compose file to try out Ozone. The below instructions provide details on how to use it. You can also use the Katacoda training sandbox which…

Hadoop Ozone part 1: an introduction of the new filesystem

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes

Hadoop Ozone is an object store for Hadoop. It is designed to scale to billions of objects of varying sizes. It is currently in development. The roadmap is available on the project wiki. This article…

Rook with Ceph doesn't provision my Persistent Volume Claims!

Categories: DevOps & SRE | Tags: PVC, Linux, Rook, Ubuntu, Ceph, Cluster, Kubernetes

Ceph installation inside Kubernetes can be provisionned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid…

Eyal CHOJNOWSKI

By Eyal CHOJNOWSKI

Sep 9, 2019

Monitoring a production Hadoop cluster with Kubernetes

Categories: DevOps & SRE | Tags: Thrift, Docker, Elasticsearch, Graphana, Node.js, Prometheus, Shinken, Hadoop, Knox, Cluster, Kubernetes, Node, Python

Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplest…

Paul-Adrien CORDONNIER

By Paul-Adrien CORDONNIER

Dec 21, 2018

Jumbo, the Hadoop cluster bootstrapper

Categories: Infrastructure | Tags: Ansible, Ambari, Automation, HDP, REST, Cluster, Vagrant

Introducing Jumbo, a Hadoop cluster bootstrapper for developers. Jumbo helps you deploy development environments for Big Data technologies. It takes a few minutes to get a custom virtualized Hadoop…

Gauthier LEONARD

By Gauthier LEONARD

Nov 29, 2018

Hadoop cluster takeover with Apache Ambari

Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, HDP, iptables, Kerberos, Nikita, Node.js, REST, Systemd, Cluster, Node

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari, we called this the Ambari Takeover. This is a risky process and we will detail why this…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Nov 15, 2018

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.