Home (2018-06-06)

Big Data

Data Engineering

Data collection, data preparation, data lake, data governance

Data Science

Writing algorithms, Spark, machine learning, exploration, statistics, Python, R

Data Streaming

Message Bus, Key Performance Indicator (KPI), Threshold Detection, Time Window Queries, Intelligent Behaviors

Data Analytics

Visualization, notebooks

Latest articles

TensorFlow on Spark 2.3: The Best of Both Worlds

May 29th, 2018 | Categories: Big Data, DataWorks Summit 2018, Deep Learning

The integration of TensorFlow with Spark has a lot of potential and creates new opportunities. […]

Running Enterprise Workloads in the Cloud with Cloudbreak

May 28th, 2018 | Categories: Big Data, DataWorks Summit 2018

This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool for cloud environments, Cloudbreak, describes [...]

Omid: Scalable and highly available transaction processing for Apache Phoenix

May 24th, 2018 | Categories: Big Data, DataWorks Summit 2018, Events

Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. […]

Apache Beam: a unified programming model for data processing pipelines

May 24th, 2018 | Categories: Big Data, Data Engineering, DataWorks Summit 2018, Events

In this article, we review the concepts, history and future of Apache Beam, which may well become the new standard for defining data processing pipelines. […]

Present and future of Hadoop workflow scheduling: Oozie 5.x

May 23rd, 2018 | Categories: Big Data, DataWorks Summit 2018

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covered the new features released in Oozie 5.0, as well as features planned for Oozie 5.x, [...]

Essential questions about Time Series

March 19th, 2018 | Categories: Big Data, Data Engineering

Today, the bulk of Big Data is temporal. We see it in the media and among our customers: smart meters, banking transactions, smart factories, connected vehicles, and more. IoT and Big Data go hand in hand. [...]