Apache Hadoop YARN

Related articles

Optimisation of Spark applications in Hadoop YARN

Categories: Data Engineering, Learning | Tags: Spark, Tuning, Hadoop, Python

Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…

By Ferdinand DE BAECQUE

Mar 30, 2020

Machine Learning model deployment

Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: Cloud, DevOps, On-premise, Operation, Schema, AI, Machine Learning, MLOps

“Enterprise Machine Learning requires looking at the big picture … from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…

By Oskar RYNKIEWICZ

Sep 30, 2019

Clusters and workloads migration from Hadoop 2 to Hadoop 3

Categories: Big Data, Infrastructure | Tags: HDFS, Slider, Spark, YARN, Docker, Erasure Coding, Rolling Upgrade

Hadoop 2 to Hadoop 3 migration is a hot subject. How to upgrade your clusters, which features present in the new release may solve current problems and bring new opportunities, how are your current…

By Lucas BAKALIAN

Jul 25, 2018

Deep learning on YARN: running Tensorflow and friends on Hadoop cluster

Categories: Data Science | Tags: Spark, YARN, Deep Learning, GPU, Hadoop, Spark MLlib, PyTorch, TensorFlow, XGBoost, MXNet

With the arrival of Hadoop 3, YARN offers more flexibility in resource management. It is now possible to perform Deep Learning analysis on GPUs with specific development environments, leveraging…

By Louis BIANCHERIN

Jul 24, 2018

Apache Hadoop YARN 3.0 – State of the union

Categories: Big Data, DataWorks Summit 2018 | Tags: HDFS, MapReduce, YARN, Cloudera, Docker, GPU, Hortonworks, Release and features, Hadoop

This article covers the “Apache Hadoop YARN: state of the union” talk given by Wangda Tan from Hortonworks at the DataWorks Summit 2018. What is Apache YARN? As a reminder, YARN is one of the two…

By Lucas BAKALIAN

May 31, 2018

YARN and GPU Distribution for Machine Learning

Categories: Data Science, DataWorks Summit 2018 | Tags: YARN, GPU, Machine Learning, Neural Network, Storage

This article goes over the fundamental principles of Machine Learning and what tools are currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be…

By Grégor JOUET

May 30, 2018

TensorFlow on Spark 2.3: The Best of Both Worlds

Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, Spark, YARN, C++, CPU, GPU, JavaScript, Tuning, Keras, Kubernetes, Machine Learning, Python, TensorFlow

The integration of TensorFlow with Spark has a lot of potential and creates new opportunities. This article is based on a talk given at the DataWorks Summit 2018 in Berlin. It was about the new…

By Yliess HATI

May 29, 2018

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into production projects, reduce their costs, and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.