Node CSV version 0.2 with streaming API

By David WORMS

Jul 2, 2012

Version 0.2 of the Node CSV parser has just been released. This version is a major enhancement: it aligns the parser with Node.js best practices for streams. The CSV parser now behaves as both a Stream Writer and a Stream Reader.

Be careful: achieving this required a few changes to the API, which slightly break backward compatibility.

Migration

I’m trying to remember all the changes in the API. I will keep this section updated with your suggestions in case I forget.

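The functions prefixed with ‘from’ and ‘to’ are now accessed through the ‘from’ and ‘to’ properties; for example, ‘fromPath’ becomes ‘from.path’. The former ‘data’ event is now the ‘record’ event. The ‘data’ event now receives a stringified version of what the ‘record’ event receives.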
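To make the event change concrete, here is a minimal sketch built on Node's own EventEmitter rather than on the parser itself. The emitRow helper is hypothetical, and the exact string format sent to ‘data’ (comma-joined fields plus a newline) is an assumption; only the two event names come from the notes above.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a 0.2 parser instance; NOT the library's code.
// It only mimics the two event names described in the migration notes.
function emitRow(parser, fields) {
  parser.emit('record', fields);                // parsed row, as an array
  parser.emit('data', fields.join(',') + '\n'); // same row as text (format assumed)
}

const parser = new EventEmitter();
const records = [];
const chunks = [];

// Code that read parsed rows from 'data' in 0.1 would move to 'record'.
parser.on('record', (fields) => records.push(fields));
// 'data' listeners now receive text, which is what piped writers expect.
parser.on('data', (text) => chunks.push(text));

emitRow(parser, ['a', 'b', 'c']);
console.log(records); // [ [ 'a', 'b', 'c' ] ]
console.log(chunks);  // [ 'a,b,c\n' ]
```

In practice this suggests that any 0.1 listener relying on structured rows should be re-registered on ‘record’, while ‘data’ is left to stream consumers.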

The new Stream API

This is the most important enhancement, which I announced in my last post. This little diagram illustrates the Node.js stream architecture as applied to the CSV parser:

|-----------|      |---------|---------|       |---------|
|           |      |         |         |       |         |
|           |      |        CSV        |       |         |
|           |      |         |         |       |         |
|  Stream   |      |  Writer |  Reader |       |  Stream |
|  Reader   |.pipe(|   API   |   API   |).pipe(|  Writer |)
|           |      |         |         |       |         |
|           |      |         |         |       |         |
|-----------|      |---------|---------|       |---------|

As you can see, this new version is fully compliant with the stream API. It is both a Stream Writer to send input data and a Stream Reader to access output data.

Example:

var fs = require('fs');
var csv = require('csv');
fs.createReadStream('./in')
  .pipe(csv())
  .pipe(fs.createWriteStream('./out'));
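The writer/reader pairing in the diagram can be sketched with Node's built-in stream module alone. This is not how the parser is implemented: createConverter and convertLine are hypothetical names, the comma-to-tab conversion is a toy stand-in for real CSV handling (no quoting or escaping), and stream.Transform is the API found in current Node.js releases, which may differ from what the parser relied on.

```javascript
const { Transform } = require('stream');

// Toy stand-in for the parser: splits a line on commas and
// re-joins the trimmed fields with tabs. No quoting or escaping.
function convertLine(line) {
  return line.split(',').map((field) => field.trim()).join('\t');
}

// A Transform is writable on its input side and readable on its
// output side, so it can sit in the middle of a .pipe() chain.
function createConverter() {
  let pending = ''; // incomplete trailing line carried between chunks
  return new Transform({
    transform(chunk, encoding, callback) {
      // Assumes chunks never split a multi-byte character.
      const lines = (pending + chunk.toString()).split('\n');
      pending = lines.pop();
      for (const line of lines) this.push(convertLine(line) + '\n');
      callback();
    },
    flush(callback) {
      // Emit whatever is left when the input ends without a newline.
      if (pending !== '') this.push(convertLine(pending) + '\n');
      callback();
    },
  });
}
```

Such an object can be dropped into the same position as csv() in the example above, for instance fs.createReadStream('./in').pipe(createConverter()).pipe(fs.createWriteStream('./out')).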

Convenience functions

Alternatively, the parser comes with convenience functions accessible through the from and to properties. Some of these functions were already present in the 0.1 release and have simply been renamed. For example, the csv.fromPath() function is now csv.from.path(). New functions have been added, such as csv.to.string().

Example:

var csv = require('csv');
csv()
  .from.path('./in')
  .to.string(function (data) { console.log(data); });

Documentation

As I have done in the past in many projects such as Mecano (now Nikita), the readme content has been reduced to a minimum and the documentation is generated directly from the source code. A small script was written specifically for that purpose. The idea is to document each function with comments written in Markdown syntax. A simple regexp parser reads each file, extracts the comments and writes a Markdown file inside a “./doc” folder. The doc folder is finally copied into the Jekyll directory of the website.

Note that, at the time of writing, the script needs some improvements and the API documentation needs to be reviewed and enhanced (Markdown syntax, typos). Not being a native English speaker doesn’t help either. As always, your contributions are appreciated.
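The extraction step could look roughly like the sketch below. It is not the actual script: extractMarkdown and generateDocs are hypothetical names, and the sketch assumes JavaScript sources documented with /** ... */ block comments, which may not match the comment style used in the project.

```javascript
const fs = require('fs');
const path = require('path');

// Collects the body of every /** ... */ block in a source string and
// strips the comment delimiters and the leading " * " of each line.
function extractMarkdown(source) {
  const blocks = source.match(/\/\*\*[\s\S]*?\*\//g) || [];
  return blocks
    .map((block) =>
      block
        .replace(/^\/\*\*\s*|\s*\*\/$/g, '') // drop the /** and */ delimiters
        .split('\n')
        .map((line) => line.replace(/^\s*\* ?/, ''))
        .join('\n')
        .trim()
    )
    .join('\n\n');
}

// Writes one Markdown file per .js source file into docDir.
function generateDocs(srcDir, docDir) {
  fs.mkdirSync(docDir, { recursive: true });
  for (const file of fs.readdirSync(srcDir)) {
    if (path.extname(file) !== '.js') continue;
    const source = fs.readFileSync(path.join(srcDir, file), 'utf8');
    const target = path.join(docDir, path.basename(file, '.js') + '.md');
    fs.writeFileSync(target, extractMarkdown(source));
  }
}
```

A call such as generateDocs('./lib', './doc') would then produce the folder that gets copied into the website; the './lib' source path is only an example.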

Conclusion

Please try the new version and let me know what you think of it.
