Amazon Web Services (AWS)

Amazon Web Services is a division of Amazon that offers a platform for on-demand cloud computing with more than 175 affiliated services. Among them, AWS S3 and AWS EC2, which allows you to respectively host your data and rent virtual machines to execute various types of actions.

Related articles

Self-Paced training from Databricks: a guide to self-enablement on Big Data & AI

Self-Paced training from Databricks: a guide to self-enablement on Big Data & AI

Categories: Data Engineering, Learning | Tags: Cloud, Data Lake, Databricks, Delta Lake, MLflow

Self-paced trainings are proposed by Databricks inside their Academy program. The price is $ 2000 USD for unlimited access to the training courses for a period of 1 year, but also free for customers…

Anna KNYAZEVA

By Anna KNYAZEVA

May 26, 2021

Find your way into data related Microsoft Azure certifications

Find your way into data related Microsoft Azure certifications

Categories: Cloud Computing, Data Engineering | Tags: Data Governance, Azure, Data Science

Microsoft Azure has certification paths for many technical job roles such as developer, Data Engineer, Data Scientist and solution architect among others. Each of these certifications consists of…

Barthelemy NGOM

By Barthelemy NGOM

Apr 14, 2021

Importing data to Databricks: external tables and Delta Lake

Importing data to Databricks: external tables and Delta Lake

Categories: Data Engineering, Data Science, Learning | Tags: Parquet, AWS, Amazon S3, Azure Data Lake Storage (ADLS), Databricks, Delta Lake, Python

During a Machine Learning project we need to keep track of the training data we are using. This is important for audit purposes and for assessing the performance of the models, developed at a later…

Introducing Apache Airflow on AWS

Introducing Apache Airflow on AWS

Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: PySpark, Learning and tutorial, Airflow, Oozie, Spark, AWS, Docker, Python

Apache Airflow offers a potential solution to the growing challenge of managing an increasingly complex landscape of data management tools, scripts and analytics processes. It is an open-source…

Aargan COINTEPAS

By Aargan COINTEPAS

May 5, 2020

Snowflake, the Data Warehouse for the Cloud, introduction and tutorial

Snowflake, the Data Warehouse for the Cloud, introduction and tutorial

Categories: Business Intelligence, Cloud Computing | Tags: Cloud, Data Lake, Data Science, Data Warehouse, Snowflake

Snowflake is a SaaS-based data-warehousing platform that centralizes, in the cloud, the storage and processing of structured and semi-structured data. The increasing generation of data produced over…

Jules HAMELIN-BOYER

By Jules HAMELIN-BOYER

Apr 7, 2020

MLflow tutorial: an open source Machine Learning (ML) platform

MLflow tutorial: an open source Machine Learning (ML) platform

Categories: Data Engineering, Data Science, Learning | Tags: AWS, Azure, Databricks, Deep Learning, Deployment, Machine Learning, MLflow, MLOps, Python, Scikit-learn

Introduction and principles of MLflow With increasingly cheaper computing power and storage and at the same time increasing data collection in all walks of life, many companies integrated Data Science…

Cloudera CDP and Cloud migration of your Data Warehouse

Cloudera CDP and Cloud migration of your Data Warehouse

Categories: Big Data, Cloud Computing | Tags: Azure, Cloudera, Data Hub, Data Lake, Data Warehouse

While one of our customer is anticipating a move to the Cloud and with the recent announcement of Cloudera CDP availability mi-september during the Strata conference, it seems like the appropriate…

David WORMS

By David WORMS

Dec 16, 2019

Should you move your Big Data and Data Lake to the Cloud

Should you move your Big Data and Data Lake to the Cloud

Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Azure, Cloud, CDP, Databricks, GCP

Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New-York, a general focus was put on moving customer’s Big…

Joris RUMMENS

By Joris RUMMENS

Dec 9, 2019

Google Cloud Summit Paris Notes

Google Cloud Summit Paris Notes

Categories: Events | Tags: AWS, Azure, Cloud, GCP, Kubernetes, On-premises

Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This year’s event was the biggest yet in Paris, which reflect Google’s commitment to position itself in the French market…

Tariq SAHNOUNI

By Tariq SAHNOUNI

Jun 26, 2019

Running Enterprise Workloads in the Cloud with Cloudbreak

Running Enterprise Workloads in the Cloud with Cloudbreak

Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, HDP, Operation, Hadoop, AWS, Azure, GCP, OpenStack

This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…

Joris RUMMENS

By Joris RUMMENS

May 28, 2018

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukrain