Kubernetes

Kubernetes is an open source system released in 2015 that allows the orchestration and management of containers on server clusters, whether physical or virtual, public or private. It can support multiple Linux host machines or servers called nodes, each hosting one or more containers. Kubernetes allows you to create application services on several containers, plan their execution in a cluster, guarantee their integrity and ensure their monitoring. A server called "Master" will be the conductor of the cluster and will run the containers of one or more nodes according to the availability of resources on each server. Kubernetes therefore provides a routing service between the client and the containers that have deployed the service he’s looking for.

Related articles

Comparaison of different file formats in Big Data

Categories: Big Data, Data Engineering | Tags: Analytics, Avro, HDFS, Hive, Kafka, MapReduce, ORC, Spark, Batch processing, Big Data, CSV, Data Analytics, Data structures, Database, JSON, Protocol Buffers, Hadoop, Parquet, Kubernetes, XML

In data processing, there are different types of files formats to store your data sets. Each format has its own pros and cons depending upon the use cases and exists to serve one or several purposes…

Aida NGOM

By Aida NGOM

Jul 23, 2020

Expose a Rook-based Ceph cluster outside of Kubernetes

Categories: Containers Orchestration | Tags: Container, Debug, Docker, Rook, Ceph, Kubernetes

We recently deployed a LXD based Hadoop cluster and we wanted to be able to apply size quotas on some filesystems (ie: service logs, user homes). Quota is a built in feature of the Linux kernel used…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Apr 16, 2020

Optimisation of Spark applications in Hadoop YARN

Categories: Data Engineering, Learning | Tags: Spark, Tuning, Hadoop, Python

Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…

Ferdinand DE BAECQUE

By Ferdinand DE BAECQUE

Mar 30, 2020

Install and debug Kubernetes inside LXD

Categories: Containers Orchestration | Tags: Container, Debug, Docker, Linux, LXD, Kubernetes, Node

We recently deployed a Kubernetes cluster with the need to maintain clusters isolation on our bare metal nodes across our infrastructure. We knew that Virtual Machines would provide the required…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Feb 4, 2020

Policy enforcing with Open Policy Agent

Categories: Cyber Security, Data Governance | Tags: Kafka, Ranger, Authorization, REST, Cloud, Kubernetes, SSL/TLS

Open Policy Agent is an open-source multi-purpose policy engine. Its main goal is to unify policy enforcement across the cloud native stack. The project was created by Styra and it is currently…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Jan 22, 2020

Should you move your Big Data and Data Lake to the Cloud

Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Cloud, CDP, Databricks, GCP, Azure

Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New-York, a general focus was put on moving customer’s Big…

Joris RUMMENS

By Joris RUMMENS

Dec 9, 2019

Hadoop Ozone part 3: advanced replication strategy with Copyset

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes, Node

Hadoop Ozone provide a way of setting a ReplicationType for every write you make on the cluster. Right now is supported HDFS and Ratis but more advanced replication strategies can be achieved. In this…

Hadoop Ozone part 1: an introduction of the new filesystem

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes

Hadoop Ozone is an object store for Hadoop. It is designed to scale to billions of objects of varying sizes. It is currently in development. The roadmap is available on the project wiki. This article…

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Categories: Data Engineering, Data Science | Tags: Flink, Kafka, Spark, DevOps, Hadoop, HBase, Kubernetes, Python

Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…

David WORMS

By David WORMS

Nov 26, 2019

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS

Categories: Big Data, Containers Orchestration | Tags: Kafka, Spark, DevOps, LXD, NoSQL, Hadoop, Ceph, Kubernetes

Context The acquisition of a high-capacity cluster is in line with Adaltas’ desire to build a PAAS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…

David WORMS

By David WORMS

Nov 26, 2019

Machine Learning model deployment

Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: DevOps, Operation, Schema, AI, Cloud, Machine Learning, MLOps, On-premises

“Enterprise Machine Learning requires looking at the big picture … from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Sep 30, 2019

Rook with Ceph doesn't provision my Persistent Volume Claims!

Categories: DevOps & SRE | Tags: PVC, Linux, Rook, Ubuntu, Ceph, Cluster, Kubernetes

Ceph installation inside Kubernetes can be provisionned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid…

Eyal CHOJNOWSKI

By Eyal CHOJNOWSKI

Sep 9, 2019

Users and RBAC authorizations in Kubernetes

Categories: Containers Orchestration, Data Governance | Tags: Authentication, Authorization, Cyber Security, RBAC, Kubernetes, SSL/TLS

Having your Kubernetes cluster up and running is just the start of your journey and you now need to operate. To secure its access, user identities must be declared along with authentication and…

Robert Walid SOARES

By Robert Walid SOARES

Aug 7, 2019

Auto-scaling Druid with Kubernetes

Categories: Big Data, Business Intelligence, Containers Orchestration | Tags: EC2, Druid, CNCF, Container Orchestration, Data Analytics, Helm, Metrics, OLAP, Operation, Prometheus, Cloud, Kubernetes, Python

Apache Druid is an open-source analytics data store which could leverage the auto-scaling abilities of Kubernetes due to its distributed nature and its reliance on memory. I was inspired by the talk…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Jul 16, 2019

Google Cloud Summit Paris Notes

Categories: Events | Tags: AWS, Cloud, GCP, Kubernetes, Azure, On-premises

Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This year’s event was the biggest yet in Paris, which reflect Google’s commitment to position itself in the French market…

Tariq SAHNOUNI

By Tariq SAHNOUNI

Jun 26, 2019

Introduction to Cloudera Data Science Workbench

Categories: Data Science | Tags: Cloudera, Docker, Git, Kubernetes, Machine Learning, Azure, Notebook

Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. Thus it enables them to focus on their main…

Mehdi ELALAMI

By Mehdi ELALAMI

Feb 28, 2019

Installing Kubernetes on CentOS 7

Categories: Containers Orchestration | Tags: CentOS, cgroups, CNCF, DevOps, Docker, Infrastructure, Namespaces, Red Hat, VM, Ceph, Kubernetes

This article explains how to install a Kubernetes cluster. I will dive into what each step does so you can build a thorough understanding of what is going on. This article is based on my talk from the…

Arthur BUSSER

By Arthur BUSSER

Jan 29, 2019

LXD: The Missing Piece

Categories: Containers Orchestration | Tags: CPU, Docker, Linux, LXD, VM, Kubernetes

LXD stands for Linux Container Daemon. Yet another container technology. But LXD is very different. It stands apart from the pack. It is not necessarily better nor much faster nor more secure! But it…

Tariq SAHNOUNI

By Tariq SAHNOUNI

Dec 28, 2018

Monitoring a production Hadoop cluster with Kubernetes

Categories: DevOps & SRE | Tags: Thrift, Docker, Elasticsearch, Graphana, Node.js, Prometheus, Shinken, Hadoop, Knox, Cluster, Kubernetes, Node, Python

Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplest…

Paul-Adrien CORDONNIER

By Paul-Adrien CORDONNIER

Dec 21, 2018

Microsoft introduces Cloud Native Application Bundles

Categories: Containers Orchestration | Tags: CLI, Docker, Helm, Packaging, Kubernetes

At DockerCon EU 2018 in Barcelona, Matt Butcher, Principal Engineer at Microsoft and inventor of Helm, introduced CNAB, Cloud Native Application Bundles, a packaging format for distributed…

Arthur BUSSER

By Arthur BUSSER

Dec 4, 2018

Apache Flink: past, present and future

Categories: Data Engineering | Tags: Flink, Pipeline, Streaming, Kubernetes, Machine Learning, SQL

Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…

César BEREZOWSKI

By César BEREZOWSKI

Nov 5, 2018

One week to discuss technology in a Moroccan riad

Categories: Adaltas Summit 2018, Learning | Tags: Flink, CDSW, Deep Learning, Gatsby, Node.js, React.js, Hadoop, Knox, Kubernetes

Adaltas organise the year its first conference between the 22 and 26 of October. On the agenda of these 5 days of conference: discuss technology in one of the most beautiful riad of Marrakech. Mix the…

David WORMS

By David WORMS

Oct 11, 2018

Lando: Deep Learning used to summarize conversations

Categories: Data Science, Learning | Tags: Deep Learning, Micro Services, Node.js, Open API, Kubernetes, Neural Network

Lando is an application to summarize conversations using Speech To Text to translate the written record of a meeting into text and Deep Learning technics to summarize contents. It allows users to…

Yliess HATI

By Yliess HATI

Sep 18, 2018

TensorFlow on Spark 2.3: The Best of Both Worlds

Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, Spark, YARN, C++, CPU, GPU, JavaScript, Tuning, Keras, Kubernetes, Machine Learning, Python, TensorFlow

The integration of TensorFlow With Spark has a lot of potential and creates new opportunities. This article is based on a conference seen at the DataWorks Summit 2018 in Berlin. It was about the new…

Yliess HATI

By Yliess HATI

May 29, 2018

What's new in Apache Spark 2.3?

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Arrow, ORC, Spark, PySpark, Docker, Streaming, Tuning, Spark MLlib, Kubernetes, pandas

Let’s dive into the new features offered by the 2.3 distribution of Apache Spark. This article is a composition of the following talks seen at the DataWorks Summit 2018 and additional research: Apache…

César BEREZOWSKI

By César BEREZOWSKI

May 23, 2018

Notes after Katacoda Training on Kubernetes Container Orchestration

Categories: Containers Orchestration, Learning | Tags: Helm, Ingress, Kubeadm, CNI, Micro Services, Minikube, Kubernetes

A few weeks ago, I dedicated two days to follow the turorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. I’m sharing my…

David WORMS

By David WORMS

Dec 14, 2017

Micro Services

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, CNCF, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Proxy, Service Mesh, Kubernetes, SPOF, SSL/TLS

Back in the days, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…

David WORMS

By David WORMS

Nov 14, 2017

Kubernetes Storage Primitives for Stateful Workloads

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Docker, Container Storage Interface (CSI), PVC, GCE, Kubernetes, Azure, Storage

This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…

Pierre SAUVAGE

By Pierre SAUVAGE

Oct 28, 2017

Kubernetes 1.8

Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: containerd, CRD, Network, OCI, RBAC, Release and features, Kubernetes

The 1.8 release of Kubernetes brings a lot of new things. With 2500+ pull request, 2000+ commits, 400+ commiters, Kubernetes added 39 new features in this version. This is the richest release in terms…

Younes YASSINE

By Younes YASSINE

Oct 24, 2017

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.