Adaltas Talented Open Source consultants
collaborating with your teams.

Cloud and Data Lake
  • UI
  • Front-end
  • Data Science
  • Data Engineering
  • Micro Services
  • RDBMS
  • Containers
  • NoSQL
  • Big Data
  • DevOps
  • Cloud
  • On-premise

Adaltas is a team of consultants with a focus on Open Source, Big Data and distributed systems based in France, Canada and Morocco.

  • Architecture, audit and digital transformation
  • Cloud and on-premise operation
  • Complex application and ingestion pipelines
  • Efficient and reliable solutions delivery

Latest articles

Snowflake, the Data Warehouse for the Cloud, introduction and tutorial

Categories: Business Intelligence, Cloud Computing | Tags: Cloud, Data Lake, Data Science, Data Warehouse, Snowflake

Snowflake is a SaaS-based data-warehousing platform that centralizes, in the cloud, the storage and processing of structured and semi-structured data. The increasing generation of data produced over…

By Jules HAMELIN-BOYER

Apr 7, 2020

Optimisation of Spark applications in Hadoop YARN

Categories: Data Engineering, Learning | Tags: Spark, Tuning, Hadoop, Python

Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…

By Ferdinand DE BAECQUE

Mar 30, 2020

MLflow tutorial: an open source Machine Learning (ML) platform

Categories: Data Engineering, Data Science, Learning | Tags: Deep Learning, AWS, Databricks, Deployment, Machine Learning, Azure, MLflow, Python, Scikit-learn, MLOps

Introduction and principles of MLflow. With increasingly cheaper computing power and storage, and at the same time increasing data collection in all walks of life, many companies integrated Data Science…

Introduction to Ludwig and how to deploy a Deep Learning model via Flask

Categories: Data Science, Tech Radar | Tags: Deep Learning, Learning and tutorial, Ludwig Deep Learning Toolbox, Machine Learning, Python

Over the past decade, Machine Learning and deep learning models have proven to be very effective in performing a wide variety of tasks such as fraud detection, product recommendation, autonomous…

By Robert Walid SOARES

Mar 2, 2020

Install and debug Kubernetes inside LXD

Categories: Containers Orchestration | Tags: Container, Debug, Docker, Linux, LXD, Kubernetes

We recently deployed a Kubernetes cluster with the need to maintain cluster isolation on our bare metal nodes across our infrastructure. We knew that Virtual Machines would provide the required…

By Leo SCHOUKROUN

Feb 4, 2020

Policy enforcing with Open Policy Agent

Categories: Cyber Security, Data Governance | Tags: Kafka, Ranger, Authorization, Cloud, REST, Kubernetes, SSL/TLS

Open Policy Agent is an open-source multi-purpose policy engine. Its main goal is to unify policy enforcement across the cloud native stack. The project was created by Styra and it is currently…

By Leo SCHOUKROUN

Jan 22, 2020

Cloudera CDP and Cloud migration of your Data Warehouse

Categories: Big Data, Cloud Computing | Tags: Cloudera, Data Hub, Data Lake, Data Warehouse, Azure

While one of our customers is anticipating a move to the Cloud, and with the recent announcement of Cloudera CDP availability in mid-September during the Strata conference, it seems like the appropriate…

By David WORMS

Dec 16, 2019

Logstash pipelines remote configuration and self-indexing

Categories: Data Engineering, Infrastructure | Tags: Docker, Elasticsearch, Kibana, Logstash, Log4j

Logstash is a powerful data collection engine that integrates in the Elastic Stack (Elasticsearch - Logstash - Kibana). The goal of this article is to show you how to deploy a fully managed Logstash…

By Paul-Adrien CORDONNIER

Dec 13, 2019

Should you move your Big Data and Data Lake to the Cloud

Categories: Big Data, Cloud Computing | Tags: Cloud, DevOps, AWS, CDP, Databricks, GCP, Azure

Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New York, a general focus was put on moving customer’s Big…

By Joris RUMMENS

Dec 9, 2019

Hadoop Ozone part 3: advanced replication strategy with Copyset

Categories: Infrastructure | Tags: HDFS, Ozone, Kubernetes

Hadoop Ozone provides a way of setting a ReplicationType for every write you make on the cluster. Right now, HDFS and Ratis are supported, but more advanced replication strategies can be achieved. In this…

Hadoop Ozone part 2: tutorial and getting started of its features

Categories: Infrastructure | Tags: HDFS, CLI, Learning and tutorial, REST, Ozone, Amazon S3

The releases of Hadoop Ozone come with a handy docker-compose file to try out Ozone. The below instructions provide details on how to use it. You can also use the Katacoda training sandbox which…

Hadoop Ozone part 1: an introduction of the new filesystem

Categories: Infrastructure | Tags: HDFS, Ozone, Kubernetes

Hadoop Ozone is an object store for Hadoop. It is designed to scale to billions of objects of varying sizes. It is currently in development. The roadmap is available on the project wiki. This article…

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Categories: Data Engineering, Data Science | Tags: Flink, Kafka, Spark, DevOps, Hadoop, HBase, Kubernetes, Python

Context: The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…

By David WORMS

Nov 26, 2019

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS

Categories: Big Data, Containers Orchestration | Tags: Kafka, Spark, DevOps, LXD, NoSQL, Hadoop, Ceph, Kubernetes

Context: The acquisition of a high-capacity cluster is in line with Adaltas’ desire to build a PaaS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…

By David WORMS

Nov 26, 2019

Insert rows in BigQuery tables with complex columns

Categories: Cloud Computing, Data Engineering | Tags: Schema, GCP, BigQuery, SQL

Google’s BigQuery is a cloud data warehousing system designed to process enormous volumes of data with several features available. Out of all those features, let’s talk about the support of Struct…

By César BEREZOWSKI

Nov 22, 2019

Avoid Bottlenecks in distributed Deep Learning pipelines with Horovod

Categories: Data Science | Tags: Deep Learning, GPU, Horovod, Keras, TensorFlow

The Deep Learning training process can be greatly sped up using a cluster of GPUs. When dealing with huge amounts of data, distributed computing quickly becomes a challenge. A common obstacle which…

By Grégor JOUET

Nov 15, 2019

Kerberos and Spnego authentication on Windows with Firefox

Categories: Cyber Security | Tags: Firefox, FreeIPA, HTTP, Kerberos

In Greek mythology, Kerberos, also called Cerberus, guards the gates of the Underworld to prevent the dead from leaving. He is commonly described as a three-headed dog with a serpent’s tail, mane of…

By David WORMS

Nov 4, 2019

Notes on the Cloudera Open Source licensing model

Categories: Big Data | Tags: CDSW, License, Open source, Cloudera Manager

Following the publication of its Open Source licensing strategy on July 10, 2019 in an article called “our Commitment to Open Source Software”, Cloudera broadcasted a webinar yesterday October 2…

By David WORMS

Oct 25, 2019

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabat
Morocco

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases into projects in production, how to reduce their costs and how to shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.