Getting Too Modular

Over the last year or so I've fallen in love with node and even JavaScript as a language (the good parts, of course). I was very pleased to find that the node community embraces small modules (aka packages) that have one job and do it well. However, sometimes I feel like we're traveling down a slippery slope and becoming too modular. I most definitely could be wrong, and I'm perfectly willing to listen to rational reasons why you might do what I'm about to describe, so please keep it civil if you comment to tell me how wrong you think I am.

I use a lot of modules from npm, but only in recent months have I started to really feel at home in the node community. I'm starting to get a foothold in the area and a few of my own modules have seen some decent activity. I recently needed to add some functionality to an npm module called optimist, so I forked the repository, added my changes, and initiated a pull request. After submitting my changes I poked around the issue tracker to see how active the author of optimist was with his users. He wasn't, and I quickly discovered why. The author of optimist is James Halliday (aka substack) and he maintains over 300 modules on npm! I didn't even know it until recently, but he's one of the co-hosts for a node podcast I listen to called nodeUp.

His repository had many stale pull requests but I waited a few days anyway before complaining just to make sure it really wasn't going to get merged in. My initial instinct after waiting a bit was to run back to the repository and make a scene in the hope that it might encourage the author to come back and clean some of the pending pull requests up. However, this time I decided to do more than whine to the maintainer of the repo. Instead I decided I'd do something I've never done before and properly fork the repository. I've forked many repositories in the past, but never have I done this with the intention of taking a project over and maintaining it myself. I spent several hours rebranding the project, rolling in all those stale pull requests from optimist, and adding in my own functionality that I'd been wanting. Suddenly, yargs was born!

I tell you all this because the optimist module is a perfect example of getting "too modular". Originally optimist was one big module that did one job and did it well. It took in an array of arguments which it then used to build an argv hash, which is just a simple JavaScript object with all the options available as properties.

var optimist = require('optimist');
var argv = optimist(['echo', '"hello world"', '--iterations', '5']).argv;


{
    _: [ 'echo', '"hello world"' ],
    iterations: 5,
    '$0': 'node ./app.js'
}
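
To make the array-to-hash step concrete, here is a rough sketch of the idea. This is not optimist's actual code; buildArgv is a hypothetical name, it only understands `--key value` pairs and bare `--flag` switches, and it leaves out the `$0` property entirely.

```javascript
// Hypothetical sketch (not optimist's implementation): build an argv-style
// hash from an array of arguments.
function buildArgv(args) {
    var argv = { _: [] };
    for (var i = 0; i < args.length; i++) {
        var arg = args[i];
        if (arg.indexOf('--') === 0 && arg.length > 2) {
            var key = arg.slice(2);
            var next = args[i + 1];
            if (next !== undefined && next.indexOf('--') !== 0) {
                // Numeric-looking values become numbers; everything else stays a string.
                var isNumeric = next.trim() !== '' && !isNaN(Number(next));
                argv[key] = isNumeric ? Number(next) : next;
                i++; // the value has been consumed, so skip it
            } else {
                // A flag with no value is treated as a boolean switch.
                argv[key] = true;
            }
        } else {
            // Anything that is not an option is collected as a positional argument.
            argv._.push(arg);
        }
    }
    return argv;
}

var sketchArgv = buildArgv(['echo', '"hello world"', '--iterations', '5']);
// sketchArgv.iterations is the number 5; sketchArgv._ holds the two positional arguments
```

The point is only that the input has to already be an array before any of this can happen.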

You'll notice above that optimist accepts an array of arguments. This is fine if you just want to pass in process.argv (the array of args passed in from the command line), but not if you only have the user input as a string:

echo "hello world" --iterations 5

I had to write code to parse the above string into a proper array. When I was finished I really felt like this was all code that belonged in optimist. It felt only natural that it should also be able to understand and parse a string rather than just an array. All of my pull requests were rejected on the grounds that they expanded the scope of the module too much. Since then I've had many other projects reject updates, even robust and tested ones, simply because they say it's "outside the scope" of the module. In some cases I thought about it for a bit and realized they were right, but in most cases I can't help but disagree.
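
For a sense of how much code we're talking about, here is a minimal sketch of that string-to-array step. It is not the code from my pull request; splitArgs is a hypothetical name, and I'm assuming only spaces, tabs, and simple single or double quotes matter (no backslash escapes, and an unterminated quote just runs to the end of the string). Note that it strips the quotes rather than keeping them in the token.

```javascript
// Hypothetical sketch: split a command string into an array of arguments,
// keeping quoted sections together as a single argument.
function splitArgs(input) {
    var args = [];
    var current = '';
    var quote = null;     // the quote character we are currently inside, if any
    var hasToken = false; // true once a token has started (lets "" yield an empty arg)

    for (var i = 0; i < input.length; i++) {
        var ch = input[i];
        if (quote) {
            if (ch === quote) {
                quote = null;        // closing quote: leave quoted mode
            } else {
                current += ch;       // whitespace inside quotes is kept
            }
        } else if (ch === '"' || ch === "'") {
            quote = ch;              // opening quote: enter quoted mode
            hasToken = true;
        } else if (ch === ' ' || ch === '\t') {
            if (hasToken) {          // whitespace outside quotes ends the token
                args.push(current);
                current = '';
                hasToken = false;
            }
        } else {
            current += ch;
            hasToken = true;
        }
    }
    if (hasToken) {
        args.push(current);          // flush the last token
    }
    return args;
}

var parts = splitArgs('echo "hello world" --iterations 5');
// parts is [ 'echo', 'hello world', '--iterations', '5' ]
```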

I feel like some of these modules are too small. The node community is VERY adamant about modules being small, fast, and simple so saying this might be considered heresy, but hear me out. If the change is in line with the general goal of the module then I think it should usually be rolled in. For example, parsing strings or arrays makes optimist a much more powerful module for the module consumer, but being able to, say, read a string or array from a database would be a highly specialized scenario that should not be optimist's responsibility to handle.

I later found out that substack had already written code to parse an input string into an array of arguments and it was published to npm as the module shell-quote. With this module I could break up the string and then pass it into optimist as an array. This works great, but I believe there are at least a couple of problems with this.

  1. Awareness.

    Why do we write and publish modules at all? I can tell you why I do it; I do it because it helps others. I write modules in the hopes that another developer will find them useful. Otherwise I'd just write local modules and maintain them in the specific project they are for, right? Well, I didn't even know about shell-quote. Optimist doesn't mention it anywhere, and it wasn't until I had a pull request rejected for trying to do something similar that I was finally introduced to this shell-quote module. If this module were just part of optimist's default functionality I would have discovered that I could pass in either a string or an array and I would have been a happy camper.
  2. We think we're solving a problem, but we're creating one instead.

    Why do we split up modules into multiple modules? Well, we do it for many reasons but one is that the original module is growing out of its original scope. This is a very legitimate reason to break up a module. For instance, I wrote a module called shotgun that performs some server-side functionality. I initially modified this module so it understood the web, making it easy for the user to use it in real terminal applications or create a pseudo-terminal in a web application. Shotgun worked great in either environment, but it didn't make sense to increase shotgun's size by 50% just to support the web when the person consuming my module may not even use any of that functionality. It made perfect sense to break that module up into shotgun and a new module called shotgun-client. This new module simply extends the functionality of shotgun for the web, but is entirely optional.

    I feel like this gets taken too far, to the point that it actually creates the opposite problem. Going back to our optimist example, this shell-quote module was created separately, presumably with the intent that it is useful on its own and won't bloat the optimist module when you don't need that functionality. Firstly, this module is TINY! That's not a bad thing by itself, especially if the module is indeed useful on its own, but it's not really. Shell-quote seems to me like a highly specialized module that you'd only really ever use when you're about to pass an array of args to optimist. This really does feel like the 99% use-case for this module. It would be interesting to see how usage of shell-quote correlates with usage of optimist; I'd be willing to bet it's an almost one-to-one relationship.

    By splitting off this module and completely removing it from optimist we have created more work for the user, who will probably have to do what I did before they ever find out the shell-quote module even exists. I really feel like this micro-optimization only hurts the user of your module, but let's pretend the module really is a great module all on its own and indeed deserves to be published by itself. This still doesn't mean that optimist has to lack this functionality entirely. Optimist could simply specify shell-quote as a dependency and invoke the module itself, making the 99% use-case easily accessible and part of the optimist API for very little extra cost.
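
Here is roughly what I mean by that last point, as a sketch under a few assumptions. createParser and naiveSplit are hypothetical names; naiveSplit is only a whitespace-splitting stand-in for a dedicated splitter such as shell-quote, and the array parser is a placeholder rather than optimist's real one.

```javascript
// Stand-in for a dedicated splitter module (assumption: a real one would also
// handle quoting; this one only splits on whitespace).
function naiveSplit(input) {
    return input.trim().split(/\s+/).filter(function (part) {
        return part.length > 0;
    });
}

// Hypothetical entry point: accept a string or an array, normalize it to an
// array with the splitter dependency, then hand it to the array-based parser.
function createParser(splitter, parseArray) {
    return function parse(input) {
        var args = typeof input === 'string' ? splitter(input) : input;
        return parseArray(args);
    };
}

var parse = createParser(naiveSplit, function (args) {
    // Placeholder for the existing array-based parser; it just collects the args.
    return { _: args };
});

var fromString = parse('echo hello');
var fromArray = parse(['echo', 'hello']);
// Both calls produce the same result, so the caller never has to care
// which form the input arrived in.
```

The splitter stays its own small module; the parser just uses it on the caller's behalf.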

In the end I agree with the community that modules should be fast, small, and do one thing well. I think the main difference is that I believe "small" and "one thing" are very subjective. Substack obviously thinks that parsing a string into an array and parsing an array into a hash are two very different things. I, however, believe that they are two small steps in the same process and both belong in the module together. I think this extra code is neither "large" nor beyond the scope of the module. I guess my biggest fear is that someday soon the community will start promoting that we write code like this:

var add = require('math-add');
var subtract = require('math-subtract');
var multiply = require('math-multiply');
var divide = require('math-divide');
var sin = require('math-sine');
var cos = require('math-cosine');
var sqrt = require('math-square-root');
var exp = require('math-exponent');

When this makes SO much more sense for the user and incurs very little cost to your application:

var math = require('math');

math.add(5, 10);
math.subtract(5, 10);
// etc...

You could argue that addition and subtraction are totally different things, but I would still argue that they all belong in the same module surrounding the higher-level concept of just "math" in general.