Articles published in 2020

Snowflake, the Data Warehouse for the Cloud, introduction and tutorial

Categories: Business Intelligence, Cloud Computing | Tags: Cloud, AWS, Data Lake, Data Science, Data Warehouse, GCP, Azure, Snowflake

Snowflake is a SaaS-based data-warehousing platform that centralizes, in the cloud, the storage and processing of structured and semi-structured data. The increasing generation of data produced over…

Jules HAMELIN-BOYER

By Jules HAMELIN-BOYER

Apr 7, 2020

Optimisation of Spark applications in Hadoop YARN

Categories: Data Engineering, Learning | Tags: Mesos, Spark, YARN, Big Data, Tuning, Hadoop, Clustering, Kubernetes, Python

Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…

Ferdinand DE BAECQUE

By Ferdinand DE BAECQUE

Mar 30, 2020

MLflow tutorial: an open source Machine Learning (ML) platform

Categories: Data Engineering, Data Science, Learning | Tags: Arch Linux, Deep Learning, R, Spark MLlib, AWS, Databricks, Deployment, Keras, Machine Learning, Azure, MLflow, Python, PyTorch, TensorFlow, XGBoost, MXNet, Scikit-learn, MLOps

Introduction and principles of MLflow With increasingly cheaper computing power and storage and at the same time increasing data collection in all walks of life, many companies integrated Data Science…

Introduction to Ludwig and how to deploy a Deep Learning model via Flask

Categories: Data Science, Tech Radar | Tags: API, CLI, Deep Learning, File Format, Learning and tutorial, Server, Ludwig Deep Learning Toolbox, Machine Learning, Python

Over the past decade, Machine Learning and deep learning models have proven to be very effective in performing a wide variety of tasks such as fraud detection, product recommendation, autonomous…

Robert Walid SOARES

By Robert Walid SOARES

Mar 2, 2020

Install and debug Kubernetes inside LXD

Categories: Containers Orchestration | Tags: Container, Debug, Docker, Linux, LXD, Kubernetes

We recently deployed a Kubernetes cluster with the need to maintain clusters isolation on our bare metal nodes across our infrastructure. We knew that Virtual Machines would provide the required…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Feb 4, 2020

Policy enforcing with Open Policy Agent

Categories: Cyber Security, Data Governance | Tags: Kafka, Ranger, Authorization, Big Data, Cloud, Docker, Go, REST, Tools, Kubernetes, SSL/TLS

Open Policy Agent is an open-source multi-purpose policy engine. Its main goal is to unify policy enforcement across the cloud native stack. The project was created by Styra and it is currently…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Jan 22, 2020

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.