Home (2018-06-06)

BigData

Data Engineering

Data collection, data preparation, data lake, data governance

Data Science

Writing algorithms, Spark, machine learning, exploration, statistics, Python, R

Data Streaming

Message Bus, Key Performance Indicator (KPI), Threshold Detection, Time Window Queries, Intelligent Behaviors

Data Analytics

Visualization, notebooks

Latest articles

Remote connection with SSH

July 24th, 2013 | Categories: Hack

While I was teaching big data and Hadoop, a student asked me about SSH and how to use it. I’ll discuss the protocol and the tools that let you benefit from it. Lately, I’ve been supervising the deployment of [...]

Merging multiple files in Hadoop

July 12th, 2013 | Categories: Big Data

This is a command I used to concatenate the files stored in Hadoop HDFS that match a globbing expression into a single file. It uses the "getmerge" utility of "hadoop fs" but, contrary to "getmerge", the [...]