Data Science

Introduction to Cloudera Data Science Workbench

Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. Thus it enables them to focus on their main task that is deriving insights from data, without thinking about the complexity that lies in the background. CDSW was released after Cloudera’s acquisition of [...]

Applying Deep Reinforcement Learning to Poker

We will cover the subject of Deep Reinforcement Learning, more specifically the Deep Q Learning algorithm introduced by DeepMind, and then we'll apply a version of this algorithm to the game of Poker. Reinforcement learning Machine Learning and Deep Learning have become a hot topic in the past years. With the recent improvements in parallel [...]

By |2019-03-10T20:12:15+00:00January 9th, 2019|Categories: Data Science, Deep Learning|Tags: , , |1 Comment

CodaLab – Data Science competitions

CodaLab Competition is a platform for code execution in the field of Data Science. It is a web interface on which a user can submit code or results and compare themselves to others. Let’s see how it works and how to install CodaLab On-Premise. […]

By |2018-12-17T16:45:38+00:00December 17th, 2018|Categories: Big Data, Data Science|Tags: , , , , |0 Comments

Main advantages of GraphQL as an alternative to REST

GraphQL is based on a simple idea, moving the assembly of a request from the server to the client. The client sees the overall strongly-typed schema instead of multiple REST endpoints and he builds the query he wants. My first REST based web application, SPAs for Single Page Applications as we are calling it lately, [...]

By |2018-11-27T09:56:07+00:00November 27th, 2018|Categories: Big Data, Data Science|Tags: , , , , , |0 Comments

Lando: Deep Learning used to summarize conversations

Lando is an application to summarize conversations using Speech To Text to translate the written record of a meeting into text and Deep Learning technics to summarize contents. It allows users to quickly understand the context of the conversation. During the cource of our internship at Adaltas, we worked on a new project called Lando to [...]

Deep learning on YARN: running Tensorflow and friends on Hadoop cluster

With the arrival of Hadoop 3, YARN offer more flexibility in resource management. It is now possible to perform Deep Learning analysis on GPUs with specific development environments, leveraging available resources. This article is a based on the presentation of Wandga Tan, Apache Hadoop PMC menber, at the DataWorks Summit 2018. It mostly focus on [...]

By |2018-07-24T19:43:12+00:00July 24th, 2018|Categories: Data Science, DataWorks Summit 2018|Tags: , , , |0 Comments

YARN and GPU Distribution for Machine Learning

This article goes over the fundamental principles of Machine Learning and what tools are currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be useful in this context and how it can help the algorithms to run smoothly. This article stems from a conference at [...]

By |2018-06-07T10:25:09+00:00May 30th, 2018|Categories: Data Science, DataWorks Summit 2018|Tags: , , |2 Comments

TensorFlow on Spark 2.3: The Best of Both Worlds

The integration of TensorFlow With Spark has a lot of potential and creates new opportunities. […]