Asynchronous array iteration in Node.js with Each
Control flow is the sort of problem for which almost every Node.js developer has created and published a library. These libraries usually aim at reducing the spaghetti code made of deeply nested callbacks. I’m no exception to the rule. After a year and a half of intensive usage, I feel it’s about time to present Each, my own control flow library.
Well, to be exact, it isn’t a control flow library in the traditional sense. There is no mechanism to chain and control functions. It came from my recurring need to traverse arrays and call asynchronous code on each of their elements. Think of it as Array.prototype.forEach on steroids.
Let’s say that we need to create 3 directories. The directories may be created in parallel, and each creation may be composed of 3 sub-operations: check whether the directory exists, create the directory and set its permissions.
Here’s the code:
    fs = require 'fs'
    each = require 'each'

    each([
      '/data/1/my_dir'
      '/data/2/my_dir'
      '/data/3/my_dir'
    ])
    .on 'item', (dir, next) ->
      fs.stat dir, (err, stat) ->
        # Nothing to do if the directory already exists
        return next() if stat
        fs.mkdir dir, (err) ->
          next err
    .on 'error', (err) ->
      console.error err.message
    .on 'end', ->
      console.log 'Success'
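To see what the snippet above delegates to the library, here is a rough sketch in plain JavaScript of the same parallel traversal written by hand. The `forEachParallel` and `createDirs` helpers are hypothetical names and are not part of Each. They only illustrate the bookkeeping that asynchronous iteration requires: a pending counter and a guard so that the final callback fires exactly once.

```javascript
const fs = require('fs');

// Hypothetical helper, not part of Each: call `handler(item, next)` for every
// item at once and invoke `done` a single time, either with the first error
// or with null after every handler has called `next`.
function forEachParallel(items, handler, done) {
  let pending = items.length;
  let finished = false;
  if (pending === 0) return done(null);
  items.forEach((item) => {
    handler(item, (err) => {
      if (finished) return; // ignore anything arriving after completion
      if (err) {
        finished = true;
        return done(err);
      }
      pending -= 1;
      if (pending === 0) {
        finished = true;
        done(null);
      }
    });
  });
}

// The directory example, expressed with the helper above.
function createDirs(dirs, done) {
  forEachParallel(dirs, (dir, next) => {
    fs.stat(dir, (err, stat) => {
      if (stat) return next(); // already exists, nothing to do
      fs.mkdir(dir, next);     // `next` receives the error, if any
    });
  }, done);
}
```

Even this small version has to decide what happens on the first error and what to do with an empty array, which is the kind of detail that is easy to get wrong when it is rewritten for every project.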
It may seem awkward to start this way, but since Each is partially a Node.js control flow library, it feels important to explain why it doesn’t answer every need and why I don’t use any existing library to complement it.
Asynchronous programming is great, but in Node.js and JavaScript it leads to unaesthetic code in which callbacks call more callbacks, often referred to as spaghetti code.
Let’s get back to our example above. One way to limit the depth of the code is by isolating the directory creation process into a single function:
    create = (dir, callback) ->
      fs.stat dir, (err, stat) ->
        # Nothing to do if the directory already exists
        return callback() if stat
        fs.mkdir dir, (err) ->
          callback err
However, control flow libraries are not just useful for reducing code depth. They solve tricky problems as well. Let’s presume we need to create a file, whether its directory exists or not:
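    create = (file, content, callback) ->
      dir = path.dirname file
      fs.stat dir, (err, stat) ->
        if stat
          # The directory exists: write the file
          fs.writeFile file, content, (err) ->
            callback err
        else
          # The directory is missing: create it, then write the file
          fs.mkdir dir, (err) ->
            return callback err if err
            fs.writeFile file, content, (err) ->
              callback err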
Here, the code to write the file isn’t just redundant and ugly; it becomes really hard to maintain as your code increases in complexity. After using different libraries, I finally came to the conclusion that the best approach to this problem was decomposing the code into small functions. Here’s how:
    create = (file, content, callback) ->
      dir = path.dirname file
      checkDir = ->
        fs.stat dir, (err, stat) ->
          unless stat then makeDir() else writeFile()
      makeDir = ->
        fs.mkdir dir, (err) ->
          return callback err if err
          writeFile()
      writeFile = ->
        fs.writeFile file, content, (err) ->
          callback err
      checkDir()

    # `files` is an assumed placeholder: an object mapping each file path to its content
    each(files)
    .on('item', create)
    .on 'error', (err) ->
      console.error err.message
    .on 'end', ->
      console.log 'Success'
This is how I came up with Each. At the time, I was installing and running a Hadoop cluster, and my tasks had to be distributed across the whole cluster. Things like starting processes, running distributed commands or collecting statistics were (and still are) run by Each and, of course, Node.js.
Over time, the library has become very flexible and thoroughly tested. The API is an Event Emitter API, typical of a Node.js library. It also partially borrows from the Stream API.
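To make the Event Emitter style concrete, here is a deliberately simplified sketch of how an iterator can report its progress through `item`, `error` and `end` events. The `miniEach` function is a hypothetical illustration built on Node.js’s standard `events` module. It is not Each’s actual implementation and it only covers sequential iteration over an array.

```javascript
const { EventEmitter } = require('events');

// Hypothetical, simplified iterator; not Each's actual implementation.
// It walks `items` one at a time and reports progress through events:
//   'item'  (item, next) -> call next() to continue, next(err) to abort
//   'error' (err)        -> emitted once, after which iteration stops
//   'end'   ()           -> emitted once, after the last item
function miniEach(items) {
  const emitter = new EventEmitter();
  let index = 0;
  const step = (err) => {
    if (err) return emitter.emit('error', err);
    if (index >= items.length) return emitter.emit('end');
    emitter.emit('item', items[index++], step);
  };
  // Start on the next tick so the caller has time to attach listeners.
  process.nextTick(step);
  return emitter;
}

// Usage mirrors the style of the examples in this post.
miniEach(['a', 'b', 'c'])
  .on('item', (item, next) => {
    console.log(item);
    next();
  })
  .on('error', (err) => console.error(err.message))
  .on('end', () => console.log('Success'));
```

Deferring the first step with `process.nextTick` is what allows the listeners to be chained after the call, as in the examples above. Without it, the first `item` event would fire before any handler was registered.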
The more I used Each, the more I realised that my problems were not about calling functions asynchronously. Every time I was tempted to use a control flow library, I in fact needed to traverse arrays. Asynchronous array iteration is a complex process, and Each solves it elegantly. I invite you all to try Each and make it even better.
Thanks for reading. Please visit the source code on GitHub.