Articles published in 2022

Big data infrastructure internship

Categories: Big Data, Data Engineering, DevOps & SRE, Infrastructure | Tags: YARN, Data Engineering, DevOps, Infrastructure, Ansible, Hadoop, Big Data, Cluster, GitOps, Vagrant, Internship, Kubernetes, TDP

Job description: Big Data and distributed computing are at the core of Adaltas. We accompany our partners in the deployment, maintenance, and optimization of some of the largest clusters in France…

By Stephan BAUM

Dec 2, 2022

Traefik, Docker and dnsmasq to simplify container networking

Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Gatsby, JAMstack, Linux, Docker, Network

Good tech adventures start with some frustration, a need, or a requirement. This is the story of how I simplified the management and access of my local web applications with the help of Traefik and…

By David WORMS

Nov 17, 2022

WasmEdge: WebAssembly runtimes are coming for the edge

Categories: Containers Orchestration, Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: JAMstack, Linux, Docker, Rust Lang, WebAssembly

With many security challenges solved by design in its core conception, lots of projects benefit from using WebAssembly. WasmEdge runtime is an efficient Virtual Machine optimized for edge computing…

By Guillaume BOUTRY

Sep 29, 2022

Ingresses and Load Balancers in Kubernetes with MetalLB and nginx-ingress

Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Ingress, Kubeadm, Cluster, Deployment, Kubernetes

When it comes to exposing services from a Kubernetes cluster and making them accessible from outside the cluster, the recommended option is to use a load-balancer type service to redirect incoming…

By Kellian COTTART

Sep 8, 2022

Spark on Hadoop integration with Jupyter

Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: YARN, HDP, Infrastructure, Jupyter, Kerberos, LDAP, R, Spark, CDP, Notebook, Python, Scala, TDP

For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…

By Aargan COINTEPAS

Sep 1, 2022

Framework laptop with NixOS, a user feedback

Categories: Learning, Tech Radar | Tags: Arch Linux, CentOS, CLI, DevOps, Learning and tutorial, Linux, OS X, Packaging, Ubuntu, NixOS, Open source

A new job comes with a new laptop. As such, I was given a Framework Laptop DIY Edition with the objective of installing and configuring it entirely with NixOS. I will share my first impressions after…

By Carlos JESUS CARO

Aug 22, 2022

Ceph object storage within a Kubernetes cluster with Rook

Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Ceph, Cluster, Data Lake, Kubernetes, Storage

Ceph is a distributed all-in-one storage system. Reliable and mature, its first stable version was released in 2012 and has since then been the reference for open source storage. Ceph’s main perk is…

By Luka BIGOT

Aug 4, 2022

MinIO object storage within a Kubernetes cluster

Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Cluster, Data Lake, Kubernetes, Storage

MinIO is a popular object storage solution. Often recommended for its simple setup and ease of use, it is not only a great way to get started with object storage: it also provides excellent…

By Luka BIGOT

Jul 9, 2022

Architecture of object-based storage and S3 standard specifications

Categories: Big Data, Data Governance | Tags: Database, API, Amazon S3, Big Data, Data Lake, Storage

Object storage has been growing in popularity among data storage architectures. Compared to file systems and block storage, object storage faces no limitations when handling petabytes of data. By…

By Luka BIGOT

Jun 20, 2022

TDP workshop: Become a TDP power user from your terminal

Categories: Events, Learning | Tags: DevOps, Ansible, Hadoop, Open source, TDP

The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…

By Paul FARAULT

Jun 17, 2022

Comparison of database architectures: data warehouse, data lake and data lakehouse

Categories: Big Data, Data Engineering | Tags: Data Governance, Infrastructure, Hive, Iceberg, ORC, Parquet, Spark, Data Lake, Lakehouse, Data Warehouse, Delta Lake, File Format

Database architectures have experienced constant innovation, evolving with the appearance of new use cases, technical constraints, and requirements. From the three database structures we are comparing…

By Gonzalo ETSE

May 17, 2022

NixOS: Enabling LXD virtual machines using Flakes

Categories: Hack, Learning | Tags: GitHub, Learning and tutorial, Linux, LXD, Packaging, VM, NixOS, Open source

Nixpkgs is an ever-growing collection of software packages for Nix and NixOS. Even with more than 80,000 packages, you can easily run into a situation where there is a functionality that is not yet…

By Kellian COTTART

May 13, 2022

Databricks logs collection with Azure Monitor at a Workspace Scale

Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: DevOps, Metrics, Monitoring, Spark, Azure, Databricks, Log4j, SRE, Streaming

Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…

By Claire PLAYE

May 10, 2022

Introducing Trunk Data Platform: the Open-Source Big Data Distribution Curated by TOSIT

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Ranger, DevOps, Hortonworks, Ansible, Hadoop, HBase, Knox, Spark, Cloudera, CDP, CDH, Open source, TDP

Ever since Cloudera and Hortonworks merged, the choice of commercial Hadoop distributions for on-prem workloads essentially boils down to CDP Private Cloud. CDP can be seen as the “best of both worlds…

By Leo SCHOUKROUN

Apr 14, 2022

Blockchain 102: Cryptocurrencies, Wallets and DApps

Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus

A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange’s…

By Gauthier LEONARD

Apr 12, 2022

JS monorepos in prod 7: Continuous Integration and Continuous Deployment with GitHub Actions

Categories: DevOps & SRE, Front End | Tags: CI/CD, Monorepo, Node.js, Unit tests

The value of CI/CD lies in the ability to control and coordinate changes and feature addition in multiple, iterative releases while simultaneously having multiple services being actively developed in…

By Alexander HOFFMANN

Apr 6, 2022

Nix package creation: install a not yet supported font

Categories: Hack | Tags: Learning and tutorial, Linux, Packaging, GitOps, NixOS, Open source

The Nix packages collection is large, with over 60,000 packages. However, chances are that sometimes the package you need is not available. You must integrate it yourself. I needed for some fonts which…

By David WORMS

Mar 29, 2022

Deploy your containerized AI applications with nvidia-docker

Categories: Containers Orchestration, Data Science | Tags: containerd, DevOps, Learning and tutorial, NVIDIA, Container, Deep Learning, Docker, Docker Compose, Keras, TensorFlow

More and more products and services are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence…

By Robert Walid SOARES

Mar 24, 2022

Ansible variables: choosing the right location

Categories: DevOps & SRE | Tags: Infrastructure, Ansible, IaC, Python, YAML

Defining variables for your Ansible playbooks and roles can become challenging as your project grows. Browsing the Ansible documentation, the diversity of Ansible variable locations is confusing, to…

By Xavier HERMAND

Mar 15, 2022

Apache HBase: RegionServers co-location

Categories: Big Data, Adaltas Summit 2021, Infrastructure | Tags: Ambari, Database, HDP, Infrastructure, Tuning, Hadoop, HBase, Big Data, Storage

RegionServers are the processes that manage the storage and retrieval of data in Apache HBase, the non-relational column-oriented database in Apache Hadoop. It is through their daemons that any CRUD…

By Pierre BERLAND

Feb 22, 2022

Reliable and reproducible Linux installation with NixOS

Categories: Infrastructure, Learning | Tags: Arch Linux, CentOS, Linux, OS X, Packaging, Ubuntu, VM, NixOS, TDP

When using an operating system, upgrading packages or installing new ones are common tasks that introduce the risk of affecting the stability of the system. NixOS is a Linux distribution that ensures…

By Florent MOUAFFO

Feb 8, 2022

Nix introduction, main concepts and commands

Categories: Infrastructure, Learning | Tags: Arch Linux, CentOS, Linux, OS X, Packaging, Ubuntu, NixOS, TDP

Nix is a functional package manager for Linux and other Unix systems, making the management of packages more reliable and easy to reproduce. With a traditional package manager, when updating a package…

By Florent MOUAFFO

Feb 1, 2022

Blockchain 101: Blockchains and Consensus Mechanisms

Categories: Adaltas Summit 2021, Infrastructure, Learning | Tags: Cryptography, Infrastructure, Blockchain, Consensus

Cryptocurrencies are booming in 2021, with a market cap moving from 750 to more than 3,000 billion dollars. Let’s face it, this is mainly due to speculation. A lot of people involved do not have a…

By Gauthier LEONARD

Jan 18, 2022

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases into projects in production, how to reduce their costs, and how to accelerate their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukraine