David Worms

About David Worms

Passionate about programming, data and entrepreneurship, I help shape Adaltas into a team of talented engineers who share their skills and experience.

Two Hive UDAF to convert an aggregation to a map

I am publishing two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub in two Java classes, “UDAFToMap” and “UDAFToOrderedMap”, or you can download the jar file. The first function converts an aggregation into a map and internally uses a Java HashMap. The second function extends [...]

March 6th, 2012 | Categories: Big Data | 0 Comments

Coffee script, how do I debug that damn js line?

Update April 12th, 2012: Pull request adding error reporting to CoffeeScript with line mapping. Chances are that, if you code in CoffeeScript, you often find yourself facing a JavaScript exception telling you a problem occurred on a specific line. The problem is that the line number in question is the one in the generated JavaScript, not [...]

February 15th, 2012 | Categories: Node.js | 0 Comments

OS module on steroids with the SIGAR Node binding

Today we are announcing the first release of the Node binding to the SIGAR library. Visit the project website or the source code repository on GitHub. SIGAR is a cross-platform interface for gathering system information. According to the project website, such information includes: system memory, swap, CPU, load average, uptime, logins; per-process memory, CPU, credential [...]

January 11th, 2012 | Categories: Node.js | 0 Comments

Timeseries storage in Hadoop and Hive

In the next few weeks, we will be exploring the storage and analysis of a large generated dataset. This dataset is composed of CRM tables associated with one time series table of about 7,000 billion rows. Before importing the dataset into Hive, we will explore different optimization options expected to impact speed and storage size. [...]

January 10th, 2012 | Categories: Big Data | 0 Comments

How Node CSV parser may save your weekend

Last Friday, an hour before my customer's doors closed for the weekend, a co-worker came to me. He had just finished exporting 9 CSV files from Oracle, which he wanted to import into Greenplum so that our customer could start testing on Monday morning. The problem as described was quite [...]

December 13th, 2011 | Categories: Node.js | 0 Comments

Node integrated into the Microsoft Azure cloud platform

Node is now a first-class citizen in the Microsoft Azure cloud environment, alongside .Net, Java and PHP. This integration is the logical consequence of Microsoft's involvement in the development of Node, which began a year ago. Originally available only on Unix-like platforms [...]

December 11th, 2011 | Categories: Node.js | 0 Comments

Chef: automated configuration and deployment of clusters

Installing a cluster of several machines is time-consuming. The same procedure for setting up the software and its configuration must be repeated identically on each machine. Over time, updates must be applied, some software must be removed while other software is added and, in the end, the systems diverge [...]

December 10th, 2010 | Categories: Hack | 0 Comments