Why I think "micro-packages" are a good thing.

Over the past couple of days drama unfolded in the JavaScript community when a social media company called Kik threatened to send lawyers after a small-time OSS developer if he didn't change the name of his npm package, which was also named kik. I'm sure you don't want to read yet another developer's opinion on the debacle and I don't really want to write one, but if you want to read about the issue then you can check out the following posts from both sides of the story:
- I've Just Liberated My Modules by Azer Koçulu
- A discussion about the breaking of the Internet by Mike Roberts
The only reason I bring this up at all is to segue into a related discussion. Ultimately Azer Koçulu had his package name seized by NPM and given to Kik. In retaliation he decided to unpublish everything he had on the NPM registry. Azer happened to maintain a small module known as left-pad and when he took that down it had a ripple effect across the registry. This module is only 11 lines of code and simply pads the left side of a string with a specified character until the string reaches the desired length.
    module.exports = leftpad;
    function leftpad (str, len, ch) {
      str = String(str);
      var i = -1;
      if (!ch && ch !== 0) ch = ' ';
      len = len - str.length;
      while (++i < len) {
        str = ch + str;
      }
      return str;
    }
There are many uses for such a simple function, but the debate within the community seems to center around the "ridiculousness" of so many people depending on such a small and simple module. This post by David Haney even goes so far as to accuse developers of having "forgotten how to program". He asks "How are hundreds of dependencies and 28,000 files for a blank project template anything but overly complicated and insane?" I wanted to examine this criticism because I don't agree with it. This isn't to say that on a case by case basis there aren't modules that are grossly overcomplicated. But I believe the core philosophy of tiny modules is actually sound and easier to maintain than giant frameworks.
Small modules are extremely easy to maintain
The criticism of small modules is a bit ironic because it runs deeper than many realize. You might get the impression after reading David's article above that this trend arose from lazy developers who "forgot how to program", but the reality is that the tiny-module ecosystem on NPM was the intention from the beginning. In this post by Isaac Schlueter he talks about the "Unix Philosophy" and how it relates to Node.js. He references Mike Gancarz who worked on the X Window System and also summed up the Unix Philosophy in 9 salient points:
- Small is beautiful.
- Make each program do one thing well.
- Build a prototype as soon as possible.
- Choose portability over efficiency.
- Store data in flat text files.
- Use software leverage to your advantage.
- Use shell scripts to increase leverage and portability.
- Avoid captive user interfaces.
- Make every program a filter.
Isaac then goes on to adapt that philosophy to Node.js. His points are slightly less succinct but still very enlightening.
- Write modules that do one thing well. Write a new module rather than complicate an old one.
- Write modules that encourage composition rather than extension.
- Write modules that handle data Streams, because that is the universal interface.
- Write modules that are agnostic about the source of their input or the destination of their output.
- Write modules that solve a problem you know, so you can learn about the ones you don't.
- Write modules that are small. Iterate quickly. Refactor ruthlessly. Rewrite bravely.
- Write modules quickly, to meet your needs, with just a few tests for compliance. Avoid extensive specifications. Add a test for each bug you fix.
- Write modules for publication, even if you only use them privately. You will appreciate documentation in the future.
Of course how each developer interprets and applies these very generalized guidelines is subjective and will vary from person to person. However, I think Isaac makes some great points. Small modules are extremely versatile and easy to compose together in an app with any number of other modules that suit your needs. Even something as small and simple as Azer's left-pad module is granted certain benefits by being a tiny module published to NPM. For one, anyone using this module would automatically benefit from any future performance improvements without having to do anything themselves.
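To make the composition point a bit more concrete, here is a minimal sketch. The function names (trim, lowerCase, dashify, pipe, slugify) are hypothetical stand-ins I made up for tiny single-purpose modules, not real packages from the registry.

```javascript
// Each function stands in for a tiny, single-purpose module.
// All names here are hypothetical and exist only for illustration.
function trim(str) {
  return String(str).trim();
}

function lowerCase(str) {
  return String(str).toLowerCase();
}

function dashify(str) {
  // Replace each run of whitespace with a single dash.
  return String(str).replace(/\s+/g, '-');
}

// pipe() composes functions left to right: the output of one
// becomes the input of the next.
function pipe() {
  var fns = Array.prototype.slice.call(arguments);
  return function (input) {
    return fns.reduce(function (value, fn) {
      return fn(value);
    }, input);
  };
}

// A "bigger" utility built purely by composing the small ones.
var slugify = pipe(trim, lowerCase, dashify);

console.log(slugify('  Small Is Beautiful ')); // "small-is-beautiful"
```

None of the small pieces know about each other, so any one of them could be swapped out or improved without touching the rest.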
"Functions Are Not Packages" - Well why not?
In David Haney's article that I linked earlier he goes on to say that simple functions should not be packages because they are too small. He says that instead of a "cosine" dependency we should desire a "trigonometry" dependency. He compares it to the .NET framework "core" library of functionality. He says that approach is better because the core framework is vetted by the language maintainers. Immediately after this he goes on to talk about third party problems and how you're never guaranteed something is written correctly or that even if it is you don't know if it's the most optimal solution. He says that writing the function yourself makes it easy to modify and to fix bugs or improve efficiency. I found this bit a tad ironic considering he's simultaneously admonishing small modules while complaining about how difficult it is to debug other people's code.
The whole point of small modules is that they are small and easy to maintain. It should be easy to open up a tiny module and see exactly what it's doing. It should also be easy to improve that module and submit those improvements back to the original package for others to benefit from. I would much rather have a "cosine" module than a "trigonometry" module because chances are good I only need a small fraction of the utilities provided by the larger trig module. Treating even small functions like a black box promotes separation of concerns and allows said black box to evolve independently. If all you did was paste a snippet of code into your app it would work fine, but any future optimization would need to be done by you or other maintainers of your project; nobody will come by and magically update the code with some fancy new optimization for you. Sometimes language features are introduced that might speed up a function. With the function being versioned in its own little silo it's easy to support this new feature while still leaving older versions for those who may not have access to the fancy new language feature.
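As a rough sketch of what that could look like, suppose a later major version of a left-pad style module wanted to take advantage of the native String.prototype.padStart method (standardized in ES2017). This is not the published left-pad package; the names leftPad and leftPadLoop are hypothetical, and I'm assuming the module only delegates to the native method when the pad is a single character, since that is the case where both approaches produce the same result.

```javascript
// Hypothetical newer version of a left-pad style module.
// leftPadLoop mirrors the original loop-based behavior.
function leftPadLoop(str, len, ch) {
  str = String(str);
  var i = -1;
  if (!ch && ch !== 0) ch = ' ';
  len = len - str.length;
  while (++i < len) {
    str = ch + str;
  }
  return str;
}

function leftPad(str, len, ch) {
  str = String(str);
  if (!ch && ch !== 0) ch = ' ';
  ch = String(ch);
  // Use the native method when the runtime has it and the pad is a
  // single character; otherwise fall back to the loop so callers on
  // older runtimes (or with multi-character pads) see no change.
  if (typeof String.prototype.padStart === 'function' && ch.length === 1) {
    return str.padStart(len, ch);
  }
  return leftPadLoop(str, len, ch);
}

console.log(leftPad('5', 3, 0));  // "005"
console.log(leftPad('foo', 5));   // "  foo"
console.log(leftPad('abc', 2));   // "abc" (already long enough)
```

Consumers who can't run the newer code could stay pinned to the older version, while everyone else would pick up the change with a routine dependency update.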
It makes sense within your own apps to pull out any code that you are duplicating and try to centralize it. Indeed many apps will pull simple functions out into their own utility library so that they can inject that utility in the needed places. This is intuitive programming practice and allows you to have one place to tweak the function in question so you don't have to propagate changes in logic all over your codebase.
This is all fine and great but why not share even the smallest optimizations with everyone else who also has a need for a similar utility? Maybe it only saves 10 minutes, but just like collecting pocket change in a jar on your nightstand it all adds up in the long run. I doubt anyone would argue against centralizing a simple utility function within a single app, yet for some reason many developers will declare sacrilege if you then publish that utility to a public registry for others to consume.
Do you really want modules to be Swiss Army Knives?
I used to be a .NET developer and as such I'm very familiar with the .NET framework. .NET is a giant monstrosity of a framework that conveniently has 90% of what you need to build almost any app all under one umbrella. It's usually fairly well tested and anything you find in there is supposed to have Microsoft's stamp of approval.
The .NET approach definitely has benefits. It gets you started quickly and comes with a giant blob of documentation called the MSDN Library. Though the quality of that documentation is sometimes questionable, it's undeniably more complete than many packages on NPM. Some community packages contain no documentation whatsoever. Unifying everything into one big framework definitely helps when it comes to testing and other activities you have to do on a regular basis. You don't have to ask many questions as Microsoft has laid out a very clear path for you in most cases. If you need an ORM for your data, .NET has you covered with Entity Framework. If you need websockets, SignalR is a couple imports away. Etcetera etcetera...
This all sounds great, right? Why would you want to give all that up to trudge through a registry maintained by a community when you have a professionally vetted framework like .NET?
Flexibility
I initially started doing web development in ASP.NET where you'd drop server controls on a page and configure their handlers in a C# codebehind file. These controls would come with predefined markup and front-end scripting to make it all work as expected. At the time I thought this was awesome and convenient, but as time went on I began to feel trapped in the rigid page lifecycle of ASP.NET.
When developers wanted to use AJAX to make more robust pages that could perform tasks without reloading entire pages in the browser, Microsoft gave us the UpdatePanel control. However, because of the way ASP.NET Web Forms maintained page state via ViewState, devs quickly realized that it lacked some much needed flexibility. Sure it could update parts of a page without reloading, but it still used a single Base64-encoded ViewState string to keep track of the entire page's state on each postback. That meant each update panel had to send that same string to the server to be modified, and no two update panels could modify that state simultaneously.
After throwing out the update panel, developers started hopping on client-side JavaScript libraries like jQuery and learning how to do some legitimate front-end coding instead of letting Microsoft spit out some JS for them. Eventually Microsoft caught on and started building things into .NET to allow you to use client-side frameworks more easily. They even ditched the old-school page lifecycle pattern (technically it's still maintained but most devs consider it antiquated) and moved on to embrace a new MVC-style architecture with its own ASP.NET flavor. While that was a huge leap forward for .NET developers it still seemed that the rest of the web development world was a step ahead as usual. While I was learning .NET's flavor of MVC, other technologies were already doing amazing stuff with long-polling, server-sent events, and a short time later websockets to do realtime communication between the browser and the back-end server.
It eventually became clear to me that Microsoft was always going to be behind the curve. Waiting to see what tech gets popular and then implementing a complete solution down the road isn't necessarily a bad thing. It's a safe bet for many developers and indeed resulted in pretty robust APIs to learn when they were finished. However, I was running a business and many of my clients would want some cutting edge features that .NET was simply too slow at implementing. I often found myself battling with .NET, trying to get it to do what I wanted it to do. There were many times when I felt like my .NET apps were turning into a library of workarounds and hacks.
From the very first talk I attended about Node.js I instantly saw the benefits it could offer for a developer like myself. The flexibility of a community of small modules allowed me to build apps that could do almost anything I needed without having to battle a giant framework. There was definitely a trade-off as a lot of the simple things .NET would take care of became harder and more tedious. I had been a web developer for almost eight years already, yet I still had tons of learning to do once I was no longer allowed to be ignorant of the things .NET was doing for me under the hood.
I found that I gained a greater appreciation for the web in general and learned to love practices that adhered closely to the way the web actually worked. REST is a great example of something that never crossed my mind until I found myself having to build APIs from scratch instead of from big templates generated by wizards. Suddenly I was tasked with doing things like parsing cookie strings and other raw request data that .NET would have just done for me implicitly.
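To give a feel for the kind of thing I mean, here is a minimal sketch of parsing a raw Cookie request header by hand. The parseCookies name is hypothetical, and this ignores plenty of edge cases that a well-tested module would cover, so treat it as an illustration rather than something to ship.

```javascript
// Hypothetical helper: turn a raw "Cookie" request header such as
// "sid=abc123; theme=dark%20mode" into a plain object.
function parseCookies(header) {
  var cookies = {};
  if (!header) return cookies;

  header.split(';').forEach(function (pair) {
    var idx = pair.indexOf('=');
    if (idx === -1) return; // skip malformed entries with no "="

    var name = pair.slice(0, idx).trim();
    var value = pair.slice(idx + 1).trim();
    if (!name) return;

    // Values are often URI-encoded; fall back to the raw string
    // if decoding fails on a malformed escape sequence.
    try {
      cookies[name] = decodeURIComponent(value);
    } catch (e) {
      cookies[name] = value;
    }
  });

  return cookies;
}

console.log(parseCookies('sid=abc123; theme=dark%20mode'));
// { sid: 'abc123', theme: 'dark mode' }
```

In a Node.js HTTP handler the raw string would typically come from req.headers.cookie, which is exactly the sort of detail .NET used to hide from me.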
There is a slightly longer research curve when investigating new modules and it varies based on how well-documented the module is. Sometimes the documentation is so good that I can pick up the module and start running even faster than when I used to learn a new bit of the .NET framework. Other times though the opposite is true. There are pros and cons to both ecosystems but it definitely felt liberating to be learning fun new modules rather than asking Stack Overflow how everyone else worked around some hurdle presented by .NET.
Community Contribution
My absolute favorite part about NPM is that when I have a problem with a module I can find its GitHub repo and log an issue. Because the maintainer is maintaining a tiny module rather than a whole framework the chances of my issue getting addressed usually go way up. Better yet, I can fork the repo and make the change myself with ease, submitting the change back to the original repo for inclusion. It's now a common routine for me to find an inconvenience with a module and fix it myself. Sure sometimes my changes get rejected, but it almost always comes with a reason why and I can work together with the maintainer to come up with a sensible solution to my issue.
Back in my .NET days if I had an issue with the way the framework did something then I was just out of luck. I was stuck finding hacks or replicating entire portions of the .NET framework in custom code just so I could modify one small aspect of it. Not only do I have the ability to submit issues and pull requests to modules on the NPM registry but there are often several alternatives that do the same thing slightly differently. Many developers highlight this as a failing of NPM that makes it hard to choose the "right" module for your issue. Personally I think the opposite is true. The ability to pick and choose modules offers so much flexibility when you get stuck trying to make one work.
Are lots of files really so bad?
Another common complaint about the NPM ecosystem is how interconnected everything is. When left-pad was taken down it brought down a whole bunch of other repositories that depended on it. Dependencies of dependencies of dependencies were unable to find left-pad and started erroring out. If you've been following programming news lately then you've probably seen people freaking out about how fundamentally flawed NPM is. The general feeling seems to be that there are too many small modules and too many unnecessary dependencies.
First, I will grant that there is a flaw in NPM. It should not be possible to simply unpublish a bunch of stuff from the registry. With the way NPM works it should have been super obvious to them that taking all of Azer's modules down completely would cause havoc for other people. The good news is that the NPM maintainers are aware of this issue and have already said they'll be working to smooth this out so it doesn't happen again.
This response could have come a tad quicker, but I'd rather they collect their thoughts and address it professionally than appease the drama with a quick response. Even if you disagree with the actions they took regarding the kik package it's clear they put thought into how best to resolve it and that they care a great deal about the integrity of the ecosystem.
Second, I don't agree that there are too many small modules. In fact, I wish every common function existed as its own module. Even the maintainers of utility libraries like Underscore and Lodash have realized the benefits of modularity and allowed you to install individual utilities from their library as separate modules. From where I sit that seems like a smart move. Why should I import the entirety of Underscore just to use one function? Instead I'd rather see more "function suites" where a bunch of utilities are all published separately but under a namespace or some kind of common name prefix to make them easier to find. The way Underscore and Lodash have approached this issue is perfect. It gives consumers of their packages options and flexibility while still letting people like David import the entire library if that's what they really want to do.
I see lots of people upset about how many files NPM will pull down in even the most basic apps simply because one module has dependencies and those dependencies have dependencies, and so on. They will loudly proclaim that there are "TOO MANY FILES!" and go on to say it's just all so confusing. I personally don't find the dependency tree that confusing but I can see how others might, especially with the recent release of NPM 3. It does a way better job of flattening the dependency tree as much as possible, but it makes things look messier at the top level because your dependencies' dependencies are also brought to the top level if possible. That makes your node_modules folder look enormous at first glance even though it's actually much smaller than it used to be. I find the complaint about too many files interesting because I don't fully understand why that matters. In fact, I prefer a directory full of tiny files far more than a handful of files that are all a mile long.
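If the flattening idea is hard to picture, here is a deliberately simplified model of it. This is not NPM's actual algorithm; the flattenTree function and the package names and versions are all made up for illustration. The assumption is simply that a dependency gets hoisted to the top level unless a different version of the same name is already there, in which case it stays nested under its parent.

```javascript
// Simplified model of dependency hoisting (NOT npm's real algorithm).
// Each dependency looks like: { name, version, dependencies: [...] }
function flattenTree(directDeps) {
  var topLevel = {}; // name -> version placed at the top level
  var nested = [];   // descriptions of deps that had to stay nested

  function has(name) {
    return Object.prototype.hasOwnProperty.call(topLevel, name);
  }

  function place(dep, parentName) {
    if (!has(dep.name)) {
      topLevel[dep.name] = dep.version; // hoist to the top level
    } else if (topLevel[dep.name] !== dep.version) {
      // Conflicting version: keep it nested under its parent.
      nested.push(parentName + '/node_modules/' + dep.name + '@' + dep.version);
    }
    // Same name and same version: already present, nothing to add.
    (dep.dependencies || []).forEach(function (child) {
      place(child, dep.name);
    });
  }

  // Direct dependencies claim their top-level spots first.
  directDeps.forEach(function (dep) {
    topLevel[dep.name] = dep.version;
  });
  directDeps.forEach(function (dep) {
    (dep.dependencies || []).forEach(function (child) {
      place(child, dep.name);
    });
  });

  return { topLevel: topLevel, nested: nested };
}

// Made-up packages: a and b share pad@1.0.0, c wants an older pad.
var result = flattenTree([
  { name: 'a', version: '1.0.0', dependencies: [{ name: 'pad', version: '1.0.0' }] },
  { name: 'b', version: '1.0.0', dependencies: [{ name: 'pad', version: '1.0.0' }] },
  { name: 'c', version: '1.0.0', dependencies: [{ name: 'pad', version: '0.9.0' }] }
]);

console.log(Object.keys(result.topLevel)); // [ 'a', 'b', 'c', 'pad' ]
console.log(result.nested);                // [ 'c/node_modules/pad@0.9.0' ]
```

In this toy example the top level grows from three entries to four, which looks messier, but pad@1.0.0 is stored once instead of twice.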
The existence of a file doesn't really add much overhead as far as disk space is concerned. I really don't buy the argument that importing a tiny module from my node_modules directory is so much worse than adding those same lines of code right into my app. The difference in size on disk is extremely negligible and really only differs slightly because modules usually have some other files in them like package.json to define metadata about the package.
The ecosystem continues to improve
I also don't buy the argument that the entire ecosystem is fundamentally flawed just because we discovered a new flaw with NPM. If anything it could have been a lot worse had the module been something more significant. Instead, this incident has put the issue in the spotlight so it can be fixed. To me that looks like open source software working as intended. Sure we all still get caught in dependency hell sometimes but I can honestly say that it is infinitely better today than it was in the early days of Node. I'm fairly confident that next year I'll be able to say the same thing when comparing it to today.
Even after the recent drama I still hold to the idea that an abundance of tiny modules is a good thing, not a bad one. I think it solves far more problems than it creates and my experience has been overwhelmingly positive despite some trade-offs. Large frameworks do make more sense when the entire framework is maintained by a giant like Microsoft; that model fits its ecosystem really well, but I didn't find it very conducive to community development or cutting-edge growth. Ultimately it all boils down to preference and I think there is room for everyone, despite what some other developers will try to tell you.