Events

As fervent supporters of and active contributors to the Open Source community, we attend several meetups and conferences. Every consultant participates in a minimum of two international conferences every year. We even organize our own event, which we open to anyone wanting to join us.

Whenever we find the time, we write feedback about the events and detailed articles about the presented technologies. This includes new products being introduced and additional functionalities present in future releases.

Latest event coverage

WasmEdge: WebAssembly runtimes are coming for the edge

Categories: Containers Orchestration, Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: JAMstack, Linux, Docker, Rust Lang, WebAssembly

With many security challenges solved by design in its core conception, lots of projects benefit from using WebAssembly. WasmEdge runtime is an efficient Virtual Machine optimized for edge computing…

By Guillaume BOUTRY

Sep 29, 2022

Spark on Hadoop integration with Jupyter

Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: YARN, HDP, Infrastructure, Jupyter, Spark, CDP, Notebook, TDP

For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…

By Aargan COINTEPAS

Sep 1, 2022

TDP workshop: Become a TDP power user from your terminal

Categories: Events, Learning | Tags: DevOps, Ansible, Hadoop, Open source, TDP

The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…

By Paul FARAULT

Jun 17, 2022

Databricks logs collection with Azure Monitor at a Workspace Scale

Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j

Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…

By Claire PLAYE

May 10, 2022

Blockchain 102: Cryptocurrencies, Wallets and DApps

Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus

A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange’s…

By Gauthier LEONARD

Apr 12, 2022

Apache HBase: RegionServers co-location

Categories: Big Data, Adaltas Summit 2021, Infrastructure | Tags: Ambari, Database, HDP, Infrastructure, Tuning, Hadoop, HBase, Big Data, Storage

RegionServers are the processes that manage the storage and retrieval of data in Apache HBase, the non-relational column-oriented database in Apache Hadoop. It is through their daemons that any CRUD…

By Pierre BERLAND

Feb 22, 2022

Blockchain 101: Blockchains and Consensus Mechanisms

Categories: Adaltas Summit 2021, Infrastructure, Learning | Tags: Cryptography, Infrastructure, Blockchain, Consensus

Cryptocurrencies are booming in 2021, with a market cap moving from 750 to more than 3,000 billion dollars. Let’s face it, this is mainly due to speculation. A lot of people involved do not have a…

By Gauthier LEONARD

Jan 18, 2022

GitOps in practice, deploy Kubernetes applications with ArgoCD

Categories: Containers Orchestration, DevOps & SRE, Adaltas Summit 2021 | Tags: Argo CD, CI/CD, Git, GitOps, IaC, Kubernetes

GitOps is a set of practices to deploy applications using Git. Application definitions, configurations, and connectivity are to be stored in a version control software such as Git. Git then serves as…

By Paul-Adrien CORDONNIER

Dec 16, 2021

Adaltas Summit 2021, 2nd edition in Corsica

Categories: Adaltas Summit 2021, Learning | Tags: Ansible, Hadoop, Spark, Azure, Blockchain, Deep Learning, Docker, Terraform, Kubernetes, Node.js

For its second edition, the whole Adaltas crew is gathering in Corsica for a whole week, with 2 days dedicated to technology on the 23rd and the 24th of September 2021. After a year and a half of sanitary…

By David WORMS

Sep 21, 2021

Data versioning and reproducible ML with DVC and MLflow

Categories: Data Science, DevOps & SRE, Events | Tags: Data Engineering, Databricks, Delta Lake, Git, Machine Learning, MLflow, Storage

Our talk on data versioning and reproducible Machine Learning, proposed to the Data + AI Summit (formerly known as Spark+AI), has been accepted. The summit will take place online on the 17-19th of November…

Running Apache Hive 3, new features and tips and tricks

Categories: Big Data, Business Intelligence, DataWorks Summit 2019 | Tags: Druid, JDBC, LLAP, Hadoop, Hive, Kafka, Release and features

Apache Hive 3 brings a bunch of new and nice features to the data warehouse. Unfortunately, like many major FOSS releases, it comes with a few bugs and not much documentation. It has been available since…

By Gauthier LEONARD

Jul 25, 2019

Google Cloud Summit Paris Notes

Categories: Events | Tags: AWS, Azure, Cloud, GCP, Kubernetes, On-premises

Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This year’s event was the biggest yet in Paris, which reflects Google’s commitment to position itself in the French market…

By Tariq SAHNOUNI

Jun 26, 2019

Gatsby.js, React and GraphQL for documentation websites

Categories: Adaltas Summit 2018, Front End | Tags: Gatsby, HTTP, JAMstack, Markdown, React.js, SEO, API, GitOps, GraphQL, JavaScript, Node.js

In the last few months, I have started to redesign some of our Open Source project websites. This includes the websites of the Node.js CSV project, the Node.js HBase client and the Nikita project, our…

By David WORMS

Apr 1, 2019

Apache Knox made easy!

Categories: Big Data, Cyber Security, Adaltas Summit 2018 | Tags: Ranger, Kerberos, LDAP, Active Directory, REST, Knox

Apache Knox is the secure entry point of a Hadoop cluster, but can it also be the entry point for my REST applications? Apache Knox overview Apache Knox is an application gateway for interacting in a…

By Michael HATOUM

Feb 4, 2019

CodaLab – Data Science competitions

Categories: Data Science, Adaltas Summit 2018, Learning | Tags: Database, Infrastructure, MySQL, Machine Learning, Node.js, Python

CodaLab Competition is a platform for code execution in the field of Data Science. It is a web interface on which a user can submit code or results and compare themselves to others. Let’s see how it…

By Robert Walid SOARES

Dec 17, 2018

Native modules for Node.js with N-API

Categories: Adaltas Summit 2018, Front End | Tags: C++, Kerberos, NPM, JavaScript, Node.js

How to create native modules for Node.js? How to use N-API, the future of native addons development? Writing C/C++ addons is a useful and powerful feature of the Node.js runtime. Let’s explore them…

By Xavier HERMAND

Dec 12, 2018

Hadoop cluster takeover with Apache Ambari

Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, HDP, iptables, Kerberos, Nikita, REST, Systemd, Cluster, Node, Node.js

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari; we called this the Ambari Takeover. This is a risky process and we will detail why this…

By Leo SCHOUKROUN

Nov 15, 2018

One week to discuss technology in a Moroccan riad

Categories: Adaltas Summit 2018, Learning | Tags: Flink, CDSW, Gatsby, React.js, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js

Adaltas is organizing its first conference this year, between the 22nd and 26th of October. On the agenda of these 5 days of conference: discuss technology in one of the most beautiful riads of Marrakech. Mix the…

By David WORMS

Oct 11, 2018

Accelerating query processing with materialized views in Apache Hive

Categories: Business Intelligence, DataWorks Summit 2018 | Tags: Calcite, Druid, OLAP, Hive, Release and features, SQL

The new materialized view feature is coming in Apache Hive 3.0. Jesus Camacho Rodriguez from Hortonworks held a talk “Accelerating query processing with materialized views in Apache Hive” about it…

By Paul-Adrien CORDONNIER

May 31, 2018

Apache Hadoop YARN 3.0 – State of the union

Categories: Big Data, DataWorks Summit 2018 | Tags: YARN, GPU, Hortonworks, Hadoop, HDFS, MapReduce, Cloudera, Data Science, Docker, Release and features

This article covers the “Apache Hadoop YARN: state of the union” talk held by Wangda Tan from Hortonworks during the Dataworks Summit 2018. What is Apache YARN? As a reminder, YARN is one of the two…

By Lucas BAKALIAN

May 31, 2018

YARN and GPU Distribution for Machine Learning

Categories: Data Science, DataWorks Summit 2018 | Tags: YARN, GPU, Machine Learning, Neural Network, Storage

This article goes over the fundamental principles of Machine Learning and what tools are currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be…

By Grégor JOUET

May 30, 2018

Apache Metron in the Real World

Categories: Cyber Security, DataWorks Summit 2018 | Tags: Algorithm, NiFi, Solr, Storm, pcap, RDBMS, HDFS, Kafka, Metron, Spark, Data Science, Elasticsearch, SQL

Apache Metron is a storage and analytic platform specialized in cyber security. This talk was about demonstrating the usages and capabilities of Apache Metron in the real world. The presentation was…

By Michael HATOUM

May 29, 2018

TensorFlow on Spark 2.3: The Best of Both Worlds

Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, YARN, C++, CPU, GPU, Tuning, Spark, JavaScript, Keras, Kubernetes, Machine Learning, Python, TensorFlow

The integration of TensorFlow with Spark has a lot of potential and creates new opportunities. This article is based on a conference seen at the DataWorks Summit 2018 in Berlin. It was about the new…

By Yliess HATI

May 29, 2018

Running Enterprise Workloads in the Cloud with Cloudbreak

Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, HDP, Operation, Hadoop, AWS, Azure, GCP, OpenStack

This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…

By Joris RUMMENS

May 28, 2018

Omid: Scalable and highly available transaction processing for Apache Phoenix

Categories: Big Data, DataWorks Summit 2018 | Tags: Omid, Phoenix, Transaction, ACID, HBase, SQL

Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. Credits to Ohad Shacham for his talk and his work for Apache…

By Xavier HERMAND

May 24, 2018

Apache Beam: a unified programming model for data processing pipelines

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Flink, Pipeline, Spark

In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for data processing pipeline definition. At Dataworks Summit 2018 in…

By Gauthier LEONARD

May 24, 2018

Present and future of Hadoop workflow scheduling: Oozie 5.x

Categories: Big Data, DataWorks Summit 2018 | Tags: Sqoop, HDP, REST, Hadoop, Hive, Oozie, CDH

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covers the new features released in Oozie 5.0, including future features of…

By Leo SCHOUKROUN

May 23, 2018

What's new in Apache Spark 2.3?

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Arrow, PySpark, Tuning, ORC, Spark, Spark MLlib, Data Science, Docker, Kubernetes, pandas, Streaming

Let’s dive into the new features offered by the 2.3 distribution of Apache Spark. This article is a composition of the following talks seen at the DataWorks Summit 2018 and additional research: Apache…

By César BEREZOWSKI

May 23, 2018

Scaling massive, real-time data pipelines with Go

Categories: Open Source Summit Europe 2017, Learning | Tags: Algorithm, Data structures, Go Lang, Pipeline, Protocols, Network

Last week at the Open Source Summit in Prague, Jean de Klerk held a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing the…

By Arthur BUSSER

Nov 21, 2017

Mesos Introduction

Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, Container Orchestration, GPU, Container, CUDA, Data Science, Docker

Apache Mesos is an open source cluster management project designed to implement and optimize distributed systems. Mesos enables the management and sharing of resources in a fine and dynamic way…

By Louis BIANCHERIN

Nov 15, 2017

Micro Services

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, CNCF, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Proxy, Service Mesh, Kubernetes, SPOF, SSL/TLS

Back in the day, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…

By David WORMS

Nov 14, 2017

Lightweight containerization with Tupperware

Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: Btrfs, LXD, Red Hat, Systemd, Zookeeper, Cloud, Consensus

In this article, I will present lightweight containerization set up by Facebook called Tupperware. What is Tupperware Tupperware is a homemade framework written and used internally at Facebook…

By Lucas BAKALIAN

Nov 3, 2017

Multi-Repo, Multi-Node Gating at Massive Scale

Categories: Cloud Computing, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Infrastructure, Jenkins, Red Hat, Zuul, Ansible, CI/CD, OpenStack

This is a recap and personal review of Monty Taylor’s presentation of OpenStack’s Continuous Integration tool Zuul at the Open Source Summit 2017 in Prague (not to be confused with Netflix’s Zuul project…

By Joris RUMMENS

Oct 28, 2017

Apache Thrift vs REST

Categories: DevOps & SRE, Open Source Summit Europe 2017 | Tags: Thrift, gRPC, HTTP, REST, JavaScript Object Notation (JSON)

Adaltas recently attended the Open Source Summit Europe 2017 in Prague. I had the opportunity to follow a presentation made by Randy Abernethy and Jens Geyer of RM-X, a cloud native consulting company…

By Leo SCHOUKROUN

Oct 28, 2017

Kubernetes Storage Primitives for Stateful Workloads

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage

This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…

By Pierre SAUVAGE

Oct 28, 2017

Nobody* puts Java in a Container

Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: cgroups, Java, JRE, JVM, Namespaces, Docker

This talk was about the issues of putting Java in a container and how, in its latest version, the JDK is now more aware of the container it is running in. The presentation is led by Joerg Schad…

By Paul-Adrien CORDONNIER

Oct 28, 2017

From Dockerfile to Ansible Containers

Categories: Containers Orchestration, DevOps & SRE, Open Source Summit Europe 2017 | Tags: pip, Shell, Ansible, Docker, Docker Compose, YAML

This talk was an introduction to the Dockerfile format and to the Ansible Container tool, followed by a comparison of both. It was held by Tomas Tomecek from Red Hat’s containerization team. The Dockerfile…

By César BEREZOWSKI

Oct 25, 2017

Kubernetes 1.8

Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: containerd, CRD, OCI, RBAC, Kubernetes, Network, Release and features

The 1.8 release of Kubernetes brings a lot of new things. With 2500+ pull requests, 2000+ commits, and 400+ committers, Kubernetes added 39 new features in this version. This is the richest release in terms…

By Younes YASSINE

Oct 24, 2017

Cloudera Sessions Paris 2017

Categories: Big Data, Events | Tags: Altus, CDSW, SDX, PaaS, EC2, Azure, Cloudera, CDH, Data Science

Adaltas was at the Cloudera Sessions on October 5, where Cloudera showcased their new products and offerings. Below you’ll find a summary of what we witnessed. Note: the information was aggregated in…

By César BEREZOWSKI

Oct 16, 2017

Apache Apex: next gen Big Data analytics

Categories: Data Science, Events, Tech Radar | Tags: Apex, Flink, Storm, Tools, Hadoop, Kafka, Data Science, Machine Learning

Below is a compilation of my notes taken during the presentation of Apache Apex by Thomas Weise from DataTorrent, the company behind Apex. Introduction Apache Apex is an in-memory distributed parallel…

By César BEREZOWSKI

Jul 17, 2016

Apache Apex with Apache SAMOA

Categories: Data Science, Events, Tech Radar | Tags: Apex, Flink, Samoa, Storm, Tools, Hadoop, Machine Learning

Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Parameter tuning Test: Accuracy Online…

By Pierre SAUVAGE

Jul 17, 2016

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukraine