Adaltas

Adaltas works with open source web technologies. Our areas of expertise include building rich Internet applications based on HTML5, the server-side Node.js stack, NoSQL storage and big data processing, notably on the Hadoop platform.

A new look at testing in Node.js with Mocha, Should and Travis

By request, the article below is a translation of an earlier one published on February 19, 2012.

Today, I finally decided to spend a little time with Travis. That little green image at the top of GitHub project home pages has intrigued me more and more these past few days. In fact, to be completely honest, this is not exactly how I started my evening. First, after two years of good and loyal service, I decided to drop Expresso and give Mocha a chance. And since I had grown used to the few small functions with which Expresso enriches the assert module, I had to find a replacement, which led me to the Should module. It was quite pleasant to see how well these two modules complement each other, in the purest Unix tradition: small, powerful and a good citizen.

HDFS and Hive storage - comparing file formats and compression methods

A few days ago, we ran a test to compare various Hive file formats and compression methods. Some of those file formats are native to HDFS and are relevant to all Hadoop users. The test suite is composed of similar Hive queries, each of which creates a table, optionally sets a compression type and loads the same dataset into the new table. Across these queries, we tested the “sequence file”, “text file” and “RCFILE” formats and the “default”, “bz”, “gz”, “LZO” and “Snappy” compression codecs.
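To give an idea of what such a test matrix looks like, here is a minimal sketch that generates one set of Hive statements per format and codec combination. It is not the test suite itself: the table names (`dataset_raw`, `dataset_<format>_<codec>`) are hypothetical, and mapping the “default” codec to Hadoop’s `DefaultCodec` is an assumption. The LZO class comes from the separate hadoop-lzo package and is only available when that package is installed.

```javascript
// Hypothetical source table holding the dataset to copy.
const SOURCE_TABLE = 'dataset_raw';

// Hive storage clauses for the three formats under test.
const FORMATS = {
  sequencefile: 'SEQUENCEFILE',
  textfile: 'TEXTFILE',
  rcfile: 'RCFILE',
};

// Hadoop codec classes (assumed mapping for the codec names in the text).
const CODECS = {
  default: 'org.apache.hadoop.io.compress.DefaultCodec',
  bz: 'org.apache.hadoop.io.compress.BZip2Codec',
  gz: 'org.apache.hadoop.io.compress.GzipCodec',
  lzo: 'com.hadoop.compression.lzo.LzoCodec',
  snappy: 'org.apache.hadoop.io.compress.SnappyCodec',
};

// Build the statements for one combination: enable output compression,
// pick the codec, then create the table and load the dataset in one step.
function buildQueries(format, codec) {
  const table = `dataset_${format}_${codec}`;
  return [
    'SET hive.exec.compress.output=true;',
    `SET mapred.output.compression.codec=${CODECS[codec]};`,
    `CREATE TABLE ${table} STORED AS ${FORMATS[format]} AS SELECT * FROM ${SOURCE_TABLE};`,
  ];
}

// Enumerate every format and codec combination.
function allQueries() {
  const result = [];
  for (const format of Object.keys(FORMATS)) {
    for (const codec of Object.keys(CODECS)) {
      result.push({ format, codec, statements: buildQueries(format, codec) });
    }
  }
  return result;
}

for (const { statements } of allQueries()) {
  console.log(statements.join('\n') + '\n');
}
```

Each generated group can be pasted into the Hive CLI; comparing the resulting table sizes and load times is then a matter of inspecting the warehouse directories and the job output.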

Two Hive UDAFs to convert an aggregation to a map

I am publishing two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub in two Java classes, “UDAFToMap” and “UDAFToOrderedMap”, or you can download the jar file. The first function converts an aggregation into a map and uses a Java HashMap internally. The second function extends the first one: it converts an aggregation into an ordered map and uses a Java TreeMap internally.
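The idea behind the two functions can be illustrated outside of Hive. The sketch below is written in JavaScript rather than the Java of the actual UDAFs, and the function names `toMap` and `toOrderedMap` are hypothetical. It only shows the difference between collecting key/value pairs into a plain map and into a map sorted by key. Note that a JavaScript `Map` iterates in insertion order, whereas a Java HashMap makes no ordering guarantee.

```javascript
// Collect [key, value] pairs, as a grouped aggregation would receive them,
// into a map. A later value for the same key overwrites the earlier one.
function toMap(rows) {
  const map = new Map();
  for (const [key, value] of rows) {
    map.set(key, value);
  }
  return map;
}

// Same aggregation, but the resulting map iterates in ascending key order,
// similar in spirit to what a TreeMap provides.
function toOrderedMap(rows) {
  const entries = [...toMap(rows)];
  entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return new Map(entries);
}

const rows = [['2012-03', 7], ['2012-01', 3], ['2012-02', 5]];
console.log([...toMap(rows).keys()]);
console.log([...toOrderedMap(rows).keys()]);
```

The ordered variant is the one to reach for when the keys are dates or ranks and the consumer expects to iterate over them in sequence.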

Java versus JS fun, a quote from the Node.js mailing list

I just read this one on the mailing list and found it relevant enough to share with those who are not subscribed:

First Lothar Pfeiler:

I still wonder, if it’s cool to have such a big discussion on how to convert a string into an integer, or if all the java developers laugh at us.

Then dolphin278:

They are busy with their spaceship-size configuration files, so we safe

A geek joke… but I fully back that statement as a good caricature of the Java versus JS situation.

A fresh look at testing Node.js projects: Mocha, Should and Travis

Today, I finally decided to spend some time with Travis. For weeks now, that little green image at the top of GitHub project pages has been nagging at me. Well, to be totally honest, this isn’t how I started my evening. First, after two years of good and faithful service, I decided to drop Expresso and give Mocha a chance. Because Expresso enriches the assert module with one or two functions which I had become addicted to, I also had to find the same functionality in another assertion library, which led me to try Should. It is very pleasant to see those two working together, in the Unix tradition: small, powerful and naturally integrated.

Coffee script, how do I debug that damn js line?

Update April 12th, 2012: Pull request adding error reporting to CoffeeScript with line mapping

Chances are that, if you code in CoffeeScript, you often find yourself facing a JavaScript exception telling you a problem occurred on a specific line. The problem is that the line number in question is the one in the generated JavaScript, not the one in your CoffeeScript source. Even worse, if you generate your JavaScript transparently, you won’t have any JavaScript file to look into, and the whole process of finding where the error occurred is even more frustrating.

Well, it seems like the future version of JavaScript could come to the rescue, but not for a few months. In the meantime, here is a bit of fun: writing a small Bash script that may save you some time.

OS module on steroids with the SIGAR Node binding

Today we are announcing the first release of the Node binding to the SIGAR library. Visit the project website or the source code repository on GitHub.

SIGAR is a cross-platform interface for gathering system information. According to the project website, such information includes:

  • System memory, swap, cpu, load average, uptime, logins
  • Per-process memory, cpu, credential info, state, arguments, environment, open files
  • File system detection and metrics
  • Network interface detection, configuration info and metrics
  • TCP and UDP connection tables
  • Network route table
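For comparison, here is a short sketch of what Node’s built-in `os` module already provides. It does not use the SIGAR binding at all, and the `systemSnapshot` helper is a hypothetical name. It only helps to see where the built-in module stops: there are no per-process metrics, file system metrics, connection tables or route tables.

```javascript
const os = require('os');

// Gather the system-level figures exposed by the built-in os module.
function systemSnapshot() {
  return {
    totalMemory: os.totalmem(),       // bytes
    freeMemory: os.freemem(),         // bytes
    cpuCount: os.cpus().length,       // logical CPUs reported by the OS
    loadAverage: os.loadavg(),        // 1, 5 and 15 minute averages
    uptimeSeconds: os.uptime(),
    networkInterfaces: Object.keys(os.networkInterfaces()),
  };
}

console.log(systemSnapshot());
```

On Windows, `os.loadavg()` always returns zeros, which is one more reason to look at a dedicated library for portable metrics.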

Timeseries storage in Hadoop and Hive

In the next few weeks, we will be exploring the storage and analysis of a large generated dataset. This dataset is composed of CRM tables associated with one time series table of about 7,000 billion rows.

Before importing the dataset into Hive, we will be exploring different optimization options expected to impact speed and storage size.

How Node CSV parser may save your weekend

Last Friday, an hour before my customer’s offices closed for the weekend, a co-worker came to me. He had just finished exporting 9 CSV files from Oracle, which he wanted to import into Greenplum so that our customer could start testing on Monday morning.

The problem as described was quite simple. He needed a quick solution (less than an hour, coding included) to transform all the dates in the source CSV files into a format suitable for Greenplum. While Oracle exported dates in the form of ‘DD/MM/YYYY’, Greenplum was picky enough to expect dates in the form of ‘YYYY-MM-DD’.
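The transformation itself fits in a few lines. The sketch below is a simplified illustration that works on raw text with a regular expression. It does not use the Node CSV parser, and the `convertDates` name is hypothetical. It assumes that anything shaped like ‘DD/MM/YYYY’ in the file really is a date, whatever column it sits in.

```javascript
// Rewrite every DD/MM/YYYY occurrence in a chunk of CSV text as YYYY-MM-DD.
// The regular expression does not validate that the day and month are in range.
function convertDates(text) {
  return text.replace(/\b(\d{2})\/(\d{2})\/(\d{4})\b/g, '$3-$2-$1');
}

const input = [
  'id,name,created',
  '1,"Alice",25/12/2011',
  '2,"Bob","03/01/2012"',
].join('\n');

console.log(convertDates(input));
```

A real CSV parser becomes worthwhile as soon as the conversion has to target specific columns, or when a free-text field may legitimately contain something that looks like a date.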