Adaltas is a team of consultants with a focus on Open Source, Big Data and distributed systems based in France, Canada and Morocco.
- Architecture, audit and digital transformation
- Cloud and on-premise operation
- Complex application and ingestion pipelines
- Efficient and reliable solutions delivery
Latest articles

Dive into tdp-lib, the SDK in charge of TDP cluster management
Categories: Big Data, Infrastructure | Tags: Programming, Ansible, Hadoop, Python, TDP
All the deployments are automated and Ansible plays a central role. With the growing complexity of the code base, a new system was needed to overcome Ansible's limitations, which will enable us to…
Jan 24, 2023

Adaltas Summit 2022 Morzine
Categories: Big Data, Adaltas Summit 2022 | Tags: Data Engineering, Infrastructure, Iceberg, Container, Lakehouse, Docker, Kubernetes
For its third edition, the whole Adaltas crew is gathering in Morzine for a full week, with two days dedicated to technology: the 15th and 16th of September 2022. The speakers choose one of the…
By David WORMS
Jan 13, 2023

How to build your OCI images using Buildpacks
Categories: Containers Orchestration, DevOps & SRE | Tags: CNCF, OCI, CI/CD, Container, Docker, Kubernetes
Docker has become the new standard for building your application. In a Docker image, we place our source code, its dependencies, and some configuration, and our application is almost ready to be deployed…
Jan 9, 2023

Big data infrastructure internship
Categories: Big Data, Data Engineering, DevOps & SRE, Infrastructure | Tags: Infrastructure, Hadoop, Big Data, Cluster, Internship, Kubernetes, TDP
Job description: Big Data and distributed computing are at the core of Adaltas. We accompany our partners in the deployment, maintenance, and optimization of some of the largest clusters in France…
By Stephan BAUM
Dec 2, 2022

Traefik, Docker and dnsmasq to simplify container networking
Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Gatsby, JAMstack, Linux, Docker, Network
Good tech adventures start with some frustration, a need, or a requirement. This is the story of how I simplified the management and access of my local web applications with the help of Traefik and…
By David WORMS
Nov 17, 2022

WasmEdge: WebAssembly runtimes are coming for the edge
Categories: Containers Orchestration, Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: JAMstack, Linux, Docker, Rust Lang, WebAssembly
With many security challenges solved by design in its core conception, lots of projects benefit from using WebAssembly. WasmEdge runtime is an efficient Virtual Machine optimized for edge computing…
Sep 29, 2022

Ingresses and Load Balancers in Kubernetes with MetalLB and nginx-ingress
Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Ingress, Kubeadm, Cluster, Deployment, Kubernetes
When it comes to exposing services from a Kubernetes cluster and making them accessible from outside the cluster, the recommended option is to use a load-balancer type service to redirect incoming…
Sep 8, 2022

Spark on Hadoop integration with Jupyter
Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: YARN, HDP, Infrastructure, Jupyter, Spark, CDP, Notebook, TDP
For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…
Sep 1, 2022

Framework laptop with NixOS, a user feedback
Categories: Learning, Tech Radar | Tags: CLI, DevOps, Learning and tutorial, Linux, Packaging, NixOS, Open source
A new job comes with a new laptop. As such, I was given a Framework Laptop DIY Edition with the objective to install and configure it entirely with NixOS. I will share my first impressions after…
Aug 22, 2022

Ceph object storage within a Kubernetes cluster with Rook
Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Ceph, Cluster, Data Lake, Kubernetes, Storage
Ceph is a distributed all-in-one storage system. Reliable and mature, its first stable version was released in 2012 and has since then been the reference for open source storage. Ceph's main perk is…
By Luka BIGOT
Aug 4, 2022

MinIO object storage within a Kubernetes cluster
Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Cluster, Data Lake, Kubernetes, Storage
MinIO is a popular object storage solution. Often recommended for its simple setup and ease of use, it is not only a great way to get started with object storage: it also provides excellent…
By Luka BIGOT
Jul 9, 2022

Architecture of object-based storage and S3 standard specifications
Categories: Big Data, Data Governance | Tags: Database, API, Amazon S3, Big Data, Data Lake, Storage
Object storage has been growing in popularity among data storage architectures. Compared to file systems and block storage, object storage faces no limitations when handling petabytes of data. By…
By Luka BIGOT
Jun 20, 2022

TDP workshop: Become a TDP power user from your terminal
Categories: Events, Learning | Tags: DevOps, Ansible, Hadoop, Open source, TDP
The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…
By Paul FARAULT
Jun 17, 2022

Comparison of database architectures: data warehouse, data lake and data lakehouse
Categories: Big Data, Data Engineering | Tags: Data Governance, Infrastructure, Iceberg, Parquet, Spark, Data Lake, Lakehouse, Data Warehouse, File Format
Database architectures have experienced constant innovation, evolving with the appearance of new use cases, technical constraints, and requirements. From the three database structures we are comparing…
By Gonzalo ETSE
May 17, 2022

NixOS: Enabling LXD virtual machines using Flakes
Categories: Hack, Learning | Tags: GitHub, Learning and tutorial, Linux, LXD, Packaging, VM, NixOS, Open source
Nixpkgs is an ever-increasing collection of software packages for Nix and NixOS. Even with more than 80,000 packages, you easily run into a situation where there is a functionality that is not yet…
May 13, 2022

Databricks logs collection with Azure Monitor at a Workspace Scale
Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j
Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…
By Claire PLAYE
May 10, 2022

Introducing Trunk Data Platform: the Open-Source Big Data Distribution Curated by TOSIT
Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Ranger, DevOps, Hortonworks, Ansible, Hadoop, HBase, Knox, Spark, Cloudera, CDP, CDH, Open source, TDP
Ever since Cloudera and Hortonworks merged, the choice of commercial Hadoop distributions for on-prem workloads essentially boils down to CDP Private Cloud. CDP can be seen as the “best of both worlds…
Apr 14, 2022

Blockchain 102: Cryptocurrencies, Wallets and DApps
Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus
A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange's…
Apr 12, 2022