About david

So far david has created 66 blog entries.

Node CSV version 0.1 and future developments

The Node CSV parser has just reached version 0.1, which closes the 0.0.x releases. Started almost 2 years ago, the project has received a tremendous amount of participation in the form of bug reports, pull requests and emails. It is used by a large part of the Node.js community and could now be considered as [...]

July 21st, 2012 | Categories: Node.js

Convert .flac music files to .mp3 on osx

As an OSX user for years now, one should know by now that iTunes doesn’t support the FLAC format. We are now in 2012, and I’ve been waiting for this to happen for years. Losing patience; dark times for Apple. In the meantime, for the record, here is how to convert FLAC files into an [...]

July 20th, 2012 | Categories: Hack

Hadoop and R with RHadoop

RHadoop is a bridge between R, a language and environment to statistically explore data sets, and Hadoop, a framework that allows for the distributed processing of large data sets across clusters of computers. RHadoop is made up of three components, each an R package: rmr, rhdfs and rhbase. Below, we will present each of those [...]

July 19th, 2012 | Categories: Data Science

Asynchronous array iteration in Node.js with Each

Control flow in Node.js is the sort of problem for which almost every developer has created and published their own library. These libraries usually aim at reducing spaghetti code made of deeply nested callbacks. I’m no exception to the rule. After a year and a half of intensive usage, I feel like it’s about time to [...]

July 18th, 2012 | Categories: Node.js

HDFS and Hive storage: comparing file formats and compression methods

A few days ago, we ran a test to compare the different file formats and compression methods available in Hive. Among these formats, some are native to HDFS and apply to all Hadoop users. The test suite is composed of Hive queries, all similar, which create a [...]

July 15th, 2012 | Categories: Big Data

Installing and using MADlib with PostgreSQL on OSX

We cover basic installation and usage of PostgreSQL and MADlib on OSX and Ubuntu. Instructions for other environments should be similar. PostgreSQL is an open source database with enterprise features that MySQL often lacks. MADlib is an open-source library which enhances a PostgreSQL or Greenplum database with functionality for scalable in-database analytics. […]

July 7th, 2012 | Categories: Tech Radar

Node CSV version 0.2 with streaming API

Announced in August, version 0.2 of the Node CSV parser has just been released. This version is a major enhancement as it aligns the parser with Node.js best practices with respect to streams. The CSV parser behaves both as a Stream Writer and a Stream Reader. Be careful: to achieve this goal, a [...]

July 2nd, 2012 | Categories: Node.js

Two Hive UDAF to convert an aggregation to a map

I am publishing two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub in two Java classes, “UDAFToMap” and “UDAFToOrderedMap”, or you can download the jar file. The first function converts an aggregation into a map and internally uses a Java HashMap. The second function extends [...]

March 6th, 2012 | Categories: Big Data