Articles published in 2012

E-commerce electronic cigarettes: first impressions with Prestashop

E-commerce electronic cigarettes: first impressions with Prestashop

Categories: Tech Radar | Tags: HTML, Java, Node.js

Last year, I had to select and integrate an e-commerce software for the website CigarHit selling electronic cigarettes. Considering that the last e-commerce integration I made dated from 2005, I took…

By David WORMS

Jul 25, 2012

Node CSV version 0.2.1

Node CSV version 0.2.1

Categories: Node.js | Tags: CoffeeScript, CSV, Release and features, Streaming

After the announcement of the version 0.2.0 of the Node.js CSV parser at the begining of october, we are releasing today a new version 0.2.1. This is mostly a bug fix release with enhanced…

By David WORMS

Jul 24, 2012

Node CSV version 0.1 and future developments

Node CSV version 0.1 and future developments

Categories: Node.js | Tags: CoffeeScript, CSV, Markdown, Release and features, Streaming

The Node CSV parser has just reach version 0.1 which close the 0.0.x releases. Started almost 2 years ago, the project has received a tremendous amount of participation in the form of bug reports…

By David WORMS

Jul 21, 2012

Convert .flac music files to .mp3 on osx

Convert .flac music files to .mp3 on osx

Categories: Hack | Tags: File Format, OS X

As an osx user for years now, one should know by then that iTunes doesn’t support the flac format. We are now in 2012, I’ve been waiting for this to happen since years know. Loosing patience, dark…

By David WORMS

Jul 20, 2012

Hadoop and R with RHadoop

Hadoop and R with RHadoop

Categories: Business Intelligence, Data Science | Tags: HBase, HDFS, MapReduce, Thrift, Data Analytics, Learning and tutorial, R, Hadoop

RHadoop is a bridge between R, a language and environment to statistically explore data sets, and Hadoop, a framework that allows for the distributed processing of large data sets across clusters of…

By David WORMS

Jul 19, 2012

Asynchronous array iteration in Node.js with Each

Asynchronous array iteration in Node.js with Each

Categories: Node.js | Tags: Asynchronous, CoffeeScript, JavaScript, Release and features

Control flow in Node.js is the sort of library for which almost all the developers have created and publish their own libraries. They usually aim at reducing spaghetti codes made of deep callbacks. I…

By David WORMS

Jul 18, 2012

Installing and using MADlib with PostgreSQL on OSX

Installing and using MADlib with PostgreSQL on OSX

Categories: Data Science | Tags: Database, Greenplum, PostgreSQL, Statistics, SQL

We cover basic installation and usage of PostgreSQL and MADlib on OSX and Ubuntu. Instructions for other environments should be similar. PostgreSQL is an Open Source database with enterprise…

By David WORMS

Jul 7, 2012

Node CSV version 0.2 with streaming API

Node CSV version 0.2 with streaming API

Categories: Node.js | Tags: CSV, Data Engineering, Markdown, Node.js, Streaming

The Node CSV parser in its version 0.2 has just been released. This version is a major enhancement as it aligned the parser with the best Node.js practice in respect of streams. The CSV parser behave…

By David WORMS

Jul 2, 2012

HDFS and Hive storage - comparing file formats and compression methods

HDFS and Hive storage - comparing file formats and compression methods

Categories: Big Data | Tags: Analytics, HBase, HDFS, Hive, ORC, Parquet, File Format

A few days ago, we have conducted a test in order to compare various Hive file formats and compression methods. Among those file formats, some are native to HDFS and apply to all Hadoop users. The…

By David WORMS

Mar 13, 2012

Two Hive UDAF to convert an aggregation to a map

Two Hive UDAF to convert an aggregation to a map

Categories: Data Engineering | Tags: Analytics, HDFS, Hive, ORC, Parquet

I am publishing two new Hive UDAF to help with maps in Apache Hive. The source code is available on GitHub in two Java classes: “UDAFToMap” and “UDAFToOrderedMap” or you can download the jar file. The…

By David WORMS

Mar 6, 2012

Java versus JS fun, a quote from the Node.js mailing list

Java versus JS fun, a quote from the Node.js mailing list

Categories: Node.js | Tags: Java, JavaScript, Node.js

I just read that one on the mailing list. I found it relevant enough to share it with those who did not subscribe to it: First Lothar Pfeiler: I still wonder, if it’s cool to have such a big…

By David WORMS

Feb 23, 2012

A fresh look at testing Node.js projects: Mocha, Should and Travis

A fresh look at testing Node.js projects: Mocha, Should and Travis

Categories: DevOps & SRE, Node.js | Tags: CI/CD, DevOps, JavaScript, Mocha, Node.js, Unit tests

Today, I finally decided to spend some time around Travis. It’s been a few weeks since that little green image on top of many GitHub homepages has been buzzing me. Well, to be totally honest, this isn…

By David WORMS

Feb 19, 2012

Coffee script, how do I debug that damn js line?

Coffee script, how do I debug that damn js line?

Categories: Hack, Node.js | Tags: CoffeeScript, Debug, JavaScript, Node.js

Update April 12th, 2012: Pull request adding error reporting to CoffeeScript with line mapping Chances are that, if you code in CoffeeScript, you often find yourself facing a JavaScript exception…

By David WORMS

Feb 15, 2012

Announcing Mecano, a set of functions for system deployment

Announcing Mecano, a set of functions for system deployment

Categories: DevOps & SRE, Node.js | Tags: Automation, CoffeeScript, DevOps, Infrastructure, JavaScript, Node.js, Open source

Update July 2016, Mecano is now renamed Nikita. We are releasing Node Mecano on GitHub which gather common functions used while deploying systems. The idea was to group those functions into a…

By David WORMS

Feb 12, 2012

OS module on steroids with the SIGAR Node binding

OS module on steroids with the SIGAR Node binding

Categories: Node.js | Tags: C++, CPU, File system, Metrics, Monitoring, Network

Today we are announcing the first release of the Node binding to the SIGAR library. Visit the project website or the source code repository on GitHub. SIGAR is a cross platform interface for gathering…

By David WORMS

Jan 11, 2012

Timeseries storage in Hadoop and Hive

Timeseries storage in Hadoop and Hive

Categories: Data Engineering | Tags: HDFS, Hive, CRM, File Format, timeseries, Tuning, Hadoop

In the next few weeks, we will be exploring the storage and analytic of a large generated dataset. This dataset is composed of CRM tables associated to one timeserie table of about 7,000 billiard rows…

By David WORMS

Jan 10, 2012

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.