Apache Hadoop YARN
Related articles
Optimisation of Spark applications in Hadoop YARN
Categories: Data Engineering, Learning | Tags: Spark, Tuning, Hadoop, Python
Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…
Mar 30, 2020
Machine Learning model deployment
Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: DevOps, Operation, AI, Cloud, Machine Learning, MLOps, On-premises, Schema
“Enterprise Machine Learning requires looking at the big picture … from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…
Sep 30, 2019
Clusters and workloads migration from Hadoop 2 to Hadoop 3
Categories: Big Data, Infrastructure | Tags: HDFS, Slider, Spark, YARN, Docker, Erasure Coding, Rolling Upgrade
Hadoop 2 to Hadoop 3 migration is a hot subject. How to upgrade your clusters, which features present in the new release may solve current problems and bring new opportunities, how are your current…
Jul 25, 2018
Deep learning on YARN: running Tensorflow and friends on Hadoop cluster
Categories: Data Science | Tags: Spark, YARN, Deep Learning, GPU, Hadoop, Spark MLlib, PyTorch, TensorFlow, XGBoost, MXNet
With the arrival of Hadoop 3, YARN offer more flexibility in resource management. It is now possible to perform Deep Learning analysis on GPUs with specific development environments, leveraging…
Jul 24, 2018
Apache Hadoop YARN 3.0 – State of the union
Categories: Big Data, DataWorks Summit 2018 | Tags: HDFS, MapReduce, YARN, Cloudera, Docker, GPU, Hortonworks, Hadoop, Release and features
This article covers the ”Apache Hadoop YARN: state of the union” talk held by Wangda Tan from Hortonworks during the Dataworks Summit 2018. What is Apache YARN? As a reminder, YARN is one of the two…
May 31, 2018
YARN and GPU Distribution for Machine Learning
Categories: Data Science, DataWorks Summit 2018 | Tags: YARN, GPU, Machine Learning, Neural Network, Storage
This article goes over the fundamental principles of Machine Learning and what tools are currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be…
By Grégor JOUET
May 30, 2018
TensorFlow on Spark 2.3: The Best of Both Worlds
Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, Spark, YARN, C++, CPU, GPU, JavaScript, Tuning, Keras, Kubernetes, Machine Learning, Python, TensorFlow
The integration of TensorFlow With Spark has a lot of potential and creates new opportunities. This article is based on a conference seen at the DataWorks Summit 2018 in Berlin. It was about the new…
By Yliess HATI
May 29, 2018