Adaltas is a team of consultants with a focus on Open Source, Big Data and distributed systems based in France, Canada and Morocco.
- Architecture, audit and digital transformation
- Cloud and on-premise operation
- Complex application and ingestion pipelines
- Efficient and reliable solutions delivery
Latest articles

Architecture of object-based storage and S3 standard specifications
Categories: Big Data, Data Governance | Tags: Database, API, Amazon S3, Big Data, Data Lake, Storage
Object storage has been growing in popularity among data storage architectures. Compared to file systems and block storage, object storage faces no limitations when handling petabytes of data. By…
By Luka BIGOT
Jun 20, 2022

TDP workshop: Become a TDP power user from your terminal
Categories: Events, Learning | Tags: Ansible, DevOps, Hadoop, Open source, TDP
The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…
By Paul FARAULT
Jun 17, 2022

Comparison of database architectures: data warehouse, data lake and data lakehouse
Categories: Big Data, Data Engineering | Tags: Data Governance, Infrastructure, Iceberg, Parquet, Spark, Data Lake, Data Warehouse, File Format
Database architectures have experienced constant innovation, evolving with the appearance of new use cases, technical constraints, and requirements. From the three database structures we are comparing…
By Gonzalo ETSE
May 17, 2022

NixOS: Enabling LXD virtual machines using Flakes
Categories: Hack, Learning | Tags: GitHub, Learning and tutorial, Linux, LXD, Packaging, VM, NixOS, Open source
Nixpkgs is an ever-increasing collection of software packages for Nix and NixOS. Even with more than 80,000 packages, you can easily run into a situation where there is a functionality that is not yet…
May 13, 2022

Databricks logs collection with Azure Monitor at a Workspace Scale
Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j
Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…
By Claire PLAYE
May 10, 2022

Introducing Trunk Data Platform: the Open-Source Big Data Distribution Curated by TOSIT
Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Ansible, Ranger, DevOps, Hortonworks, Hadoop, HBase, Knox, Spark, Cloudera, CDP, CDH, Open source, TDP
Ever since Cloudera and Hortonworks merged, the choice of commercial Hadoop distributions for on-prem workloads essentially boils down to CDP Private Cloud. CDP can be seen as the “best of both worlds…
Apr 14, 2022

Blockchain 102: Cryptocurrencies, Wallets and DApps
Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus
A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange’s…
Apr 12, 2022

JS monorepos in prod 7: Continuous Integration and Continuous Deployment with GitHub Actions
Categories: DevOps & SRE, Front End | Tags: Unit tests, CI/CD, Monorepo, Node.js
The value of CI/CD lies in the ability to control and coordinate changes and feature addition in multiple, iterative releases while simultaneously having multiple services being actively developed in…
Apr 6, 2022

Nix package creation: install a not yet supported font
Categories: Hack | Tags: Learning and tutorial, Linux, Packaging, GitOps, NixOS, Open source
The Nix packages collection is large, with over 60,000 packages. However, chances are that sometimes the package you need is not available. You must integrate it yourself. I needed some fonts which…
By David WORMS
Mar 29, 2022

Deploy your containerized AI applications with nvidia-docker
Categories: Containers Orchestration, Data Science | Tags: containerd, DevOps, Learning and tutorial, NVIDIA, Container, Docker, Keras, TensorFlow
More and more products and services are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence…
Mar 24, 2022

Ansible variables: choosing the right location
Categories: DevOps & SRE | Tags: Ansible, Infrastructure, YAML, IaC
Defining variables for your Ansible playbooks and roles can become challenging as your project grows. Browsing the Ansible documentation, the diversity of Ansible variable locations is confusing, to…
Mar 15, 2022

Apache HBase: RegionServers co-location
Categories: Big Data, Adaltas Summit 2021, Infrastructure | Tags: Ambari, Database, HDP, Infrastructure, Tuning, Hadoop, HBase, Big Data, Storage
RegionServers are the processes that manage the storage and retrieval of data in Apache HBase, the non-relational column-oriented database in Apache Hadoop. It is through their daemons that any CRUD…
Feb 22, 2022

Reliable and reproducible Linux installation with NixOS
Categories: Infrastructure, Learning | Tags: Arch Linux, CentOS, Linux, OS X, Packaging, Ubuntu, VM, NixOS, TDP
When using an operating system, upgrading packages or installing new ones are common tasks that introduce the risk of affecting the stability of the system. NixOS is a Linux distribution that ensures…
Feb 8, 2022

Nix introduction, main concepts and commands
Categories: Infrastructure, Learning | Tags: Arch Linux, CentOS, Linux, OS X, Packaging, Ubuntu, NixOS, TDP
Nix is a functional package manager for Linux and other Unix systems, making the management of packages more reliable and easy to reproduce. With a traditional package manager, when updating a package…
Feb 1, 2022

Blockchain 101: Blockchains and Consensus Mechanisms
Categories: Adaltas Summit 2021, Infrastructure, Learning | Tags: Cryptography, Infrastructure, Blockchain, Consensus
Cryptocurrencies are booming in 2021, with a market cap moving from 750 to more than 3,000 billion dollars. Let’s face it, this is mainly due to speculation. A lot of people involved do not have a…
Jan 18, 2022

GitOps in practice, deploy Kubernetes applications with ArgoCD
Categories: Containers Orchestration, DevOps & SRE, Adaltas Summit 2021 | Tags: Argo CD, CI/CD, Git, GitOps, IaC, Kubernetes
GitOps is a set of practices to deploy applications using Git. Application definitions, configurations, and connectivity are to be stored in a version control software such as Git. Git then serves as…
Dec 16, 2021

JS monorepos in prod 6: CI/CD, continuous integration and deployment with Travis CI
Categories: DevOps & SRE, Front End | Tags: Unit tests, CI/CD, Monorepo, Node.js
Implementing continuous integration (CI) and continuous deployment (CD) on a monorepo is quite complex due to the diversity of multiple responsibilities between developers and the need to coordinate…
By David WORMS
Dec 6, 2021

Spring 2022 internship - building a Data Lab
Categories: Data Science, Learning | Tags: MongoDB, Spark, Argo CD, Elasticsearch, Internship, Kubernetes, OpenID Connect, PostgreSQL
Job description: Over the last few years, we developed the ability to use computers to process large amounts of data. The ecosystem evolved over a large offering of tools and libraries and the creation…
By David WORMS
Nov 24, 2021