DataWorks Summit 2018

Articles about the talks given at Hortonworks’ DataWorks Summit 2018, held in Berlin on April 18-19.

Deep learning on YARN: running Tensorflow and friends on Hadoop cluster

With the arrival of Hadoop 3, YARN offers more flexibility in resource management. It is now possible to run Deep Learning workloads on GPUs with specific development environments, leveraging available resources. This article is based on the presentation by Wangda Tan, Apache Hadoop PMC member, at the DataWorks Summit 2018. It mostly focuses on [...]

July 24th, 2018 | Categories: Data Science, DataWorks Summit 2018

Curing the Kafka blindness with the UI manager

Today it’s really difficult for developers, operators and managers to visualize and monitor what happens in a Kafka cluster. This article covers a new graphical interface for overseeing Kafka. It is based on a talk given by George Vetticaden, VP of Product Management at Hortonworks, during the DataWorks Summit at the San Jose Conference Center in June 2018. […]

June 20th, 2018 | Categories: Big Data, DataWorks Summit 2018

DataWorks Summit 2018: A few days speaking Hadoop

The Adaltas crew went to the DataWorks Summit 2018, held in Berlin on the 18th and 19th of April 2018. On this occasion, we compiled a series of articles about the talks that made the strongest impression on us. […]

June 5th, 2018 | Categories: DataWorks Summit 2018

Accelerating query processing with materialized views in Apache Hive

Jesus Camacho Rodriguez from Hortonworks gave a talk, “Accelerating query processing with materialized views in Apache Hive”, about the new materialized view feature coming in Apache Hive 3.0. This article covers the main principle of this feature, gives some examples, and outlines the improvements that are on the roadmap. […]

May 31st, 2018 | Categories: Data Engineering, DataWorks Summit 2018

YARN and GPU Distribution for Machine Learning

This article goes over the fundamental principles of Machine Learning and the tools currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be useful in this context and how it can help the algorithms run smoothly. This article stems from a talk at [...]

May 30th, 2018 | Categories: Data Science, DataWorks Summit 2018

Apache Metron in the Real World

Apache Metron is a storage and analytics platform specialized in cyber security. This talk demonstrated the uses and capabilities of Apache Metron in the real world. The presentation was led by Dave Russell, Principal Solutions Engineer – EMEA + APAC at Hortonworks, at the DataWorks Summit 2018 (Berlin). [...]

May 29th, 2018 | Categories: Cyber Security, DataWorks Summit 2018, Events

TensorFlow on Spark 2.3: The Best of Both Worlds

The integration of TensorFlow with Spark has a lot of potential and creates new opportunities. […]

Running Enterprise Workloads in the Cloud with Cloudbreak

This article is based on Peter Darvasi and Richard Doktorics’ talk “Running Enterprise Workloads in the Cloud” at the DataWorks Summit 2018 in Berlin. It presents Cloudbreak, Hortonworks’ automated deployment tool for cloud environments, describes and comments on the features that Peter and Richard explained in their talk, and gives some personal guidelines on when and why [...]

May 28th, 2018 | Categories: Big Data, DataWorks Summit 2018

Omid: Scalable and highly available transaction processing for Apache Phoenix

Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. […]

May 24th, 2018 | Categories: Big Data, DataWorks Summit 2018, Events