A maelstrom of discontent has recently arisen over open source license proliferation. Martin Fink kicked it off with his LinuxWorld keynote (available here), in which he said:
Many people don’t realize that there are dozens and dozens of open source licenses today. There is really no value, and much confusion in having that many licenses. If you’re a vendor and you plan to create a new license: Stop, please don’t.
The argument is that open source license proliferation breeds uncertainty (over license incompatibilities, and over simply knowing what the various licenses say and mean), which in turn creates costs and slows open source software adoption. Don Rosenberg, author of Open Source: The Unauthorized White Papers, has written persuasively on the problem, which I blogged about here.
Even OSI, which arguably has done more to create the mess than anyone else, is getting in on the action. They have been looking into ways to slim down the open source license library (likely a key topic at OSI’s summit meeting, to be held at the Open Source Business Conference). Backing them up (or prodding them on, depending on how you want to look at it) is OSDL. OSDL board member Sam Greenblatt, in fact, has an entire session devoted to the topic at this year’s Open Source Business Conference (OSBC).
What gives? Are there really that many licenses? I mean, that developers commonly use?
At first, I assumed the answer was 'No,' figuring that most projects use one of a select group of licenses: GPL/LGPL, BSD-style, and very few others. But I just finished analyzing the licenses that govern the 896 packages that make up the SuSE Linux Enterprise Server (SLES) distribution, and there is, in fact, quite a spread. (I chose SLES because of its relatively limited number of packages – I was not in the mood to count licenses in SuSE Linux Professional, which runs into the thousands.)
Here is the (rough) breakdown. Please note that many of these packages have multiple licenses governing their contents, so these figures should be taken as simply an estimate. (For example, there is quite a lot of LGPL in the packages that is not identified below.) I rated each package according to the first license listed in Novell SuSE's packaging summary, available package by package on Novell's website, assuming (probably incorrectly) that the first license listed was the primary license for that package. This is one of those times when information is "free" but costly to accumulate and aggregate; donations welcome ;-)
Apache: 7 (1%)
Artistic: 45 (5%)
BSD: 72 (8%)
LaTeX: 1 (.1%)
Commercial: 7 (1%)
Distributable [Meaning, weak limits on distribution, but not falling easily into one of these other license categories]: 26 (3%)
FSR: 15 (2%)
GPL: 550 (61%)
IBM PL: 17 (2%)
LGPL: 66 (7%)
MPL: 7 (1%)
Miscellaneous: 57 (6%) (Not easily identified or too complex to break out)
Python: 3 (.3%)
X11/MIT: 18 (2%)
YaST: 3 (.3%)
zlib: 2 (.2%)
Again, some packages have multiple licenses, so the numbers above should be considered roughly accurate (with the emphasis on "roughly").
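For anyone who wants to check the arithmetic, here is a minimal Python sketch. The dictionary simply transcribes the counts listed above. The variable and function names are illustrative only, and grouping GPL, LGPL, and BSD together at the end is an assumption about what counts as the "select group" of licenses.

```python
# Package counts per first-listed license, transcribed from the breakdown above.
license_counts = {
    "Apache": 7,
    "Artistic": 45,
    "BSD": 72,
    "LaTeX": 1,
    "Commercial": 7,
    "Distributable": 26,
    "FSR": 15,
    "GPL": 550,
    "IBM PL": 17,
    "LGPL": 66,
    "MPL": 7,
    "Miscellaneous": 57,
    "Python": 3,
    "X11/MIT": 18,
    "YaST": 3,
    "zlib": 2,
}

total = sum(license_counts.values())


def share(count, total):
    """Return a count as a percentage of the total number of packages."""
    return 100.0 * count / total


# List licenses from most to least common, with each one's share of packages.
for name, count in sorted(license_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14}{count:>4}  {share(count, total):5.1f}%")

# Assumption: treat GPL, LGPL, and BSD as the "select group" of common licenses.
core = sum(license_counts[name] for name in ("GPL", "LGPL", "BSD"))
print(f"GPL + LGPL + BSD: {core} of {total} ({share(core, total):.1f}%)")
```

The counts sum to 896, matching the package total, and GPL, LGPL, and BSD together cover 688 packages, or roughly 77%. The remaining packages are spread across the other thirteen categories.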
While most licenses in SLES (and Red Hat AS is likely much the same) do fit my hypothesis above (GPL/LGPL + BSD-style), there are some quirky ones, and those outliers are the problem.
By "problem" I mean that they create uncertainty and extra legwork for enterprises that want to buy or sell open source. An example is in order. I talked with the general counsel of a large bank recently, advising her on the legal issues surrounding the bank’s prospective use of Linux and other open source software. She asked for a brief synopsis of open source licensing; a primer, as it were. In going through the various licenses, I tried to stay generic, highlighting BSD-style licenses, the GPL, and the LGPL. However, she wanted to know about the outliers, and how they interact with these primary licenses. What could have been (and, frankly, should have been) a short conversation to allay her concerns turned into a much longer discussion. She told me she plans to attend OSBC’s Intellectual Property track for further enlightenment.
Good for her, but it should not have to be this way. Customers should buy software because of its utility to them, and not its license. Developers should use code because of its utility to them, and not its license. A license is a necessary evil – it is a means, not an end. Too many of the open source community’s early luminaries completely missed this point. It is time for the industry to mature a little.
Such maturity compels several things. First, it suggests that developers and the corporations that may employ them need to stop creating new licenses, as Fink articulated in his keynote. When Novell’s Open Source Review Board first got started, we contemplated creating a license to govern the code we intended to open source. But we determined that our concerns with existing licenses were minor enough that we could stomach conforming to them – the benefit of aligning with “known good licenses” outweighed the desire to fine-tune. HP is the same way, according to Fink.
While attorneys invariably believe they can write a better license – it is part of the attorney DNA – they are invariably wrong. Witness the mind-numbing exchanges between attorneys at opposing firms the next time your firm enters into a partnership, customer relationship, or whatever – the minutiae are meaningful only to them, and will make no appreciable difference in most cases. Developers may fork a project out of necessity – attorneys do it because of a genetic flaw. (As a licensed attorney, I often reveal this same flaw.)
Second, and related to the first, open source industry maturity means that we need to get over our hang-ups with existing licenses. In particular, this means that, with Stanley Kubrick’s Dr. Strangelove, we need to “learn to stop worrying and love the [GPL].” Even if you don't like the GPL (and, frankly, I do), and think the world would end if all open source licenses converged on that particular one, remember that confusion and uncertainty in licensing are worse than any "bad" licensing scheme itself.
Consider financial markets. Financial markets like certainty, even when that certainty is bad news. In times of uncertainty, investors lose confidence and stop trading (or, at least, trade less actively). Once data becomes available – even bad data or bad news – investors factor it in and begin to trade based on fundamentals again. It is uncertainty, and not bad news, that kills a stock, generally speaking. We know how to manage bad situations so long as we know what we are up against.
So, back to the GPL: if "bad" news hit and everything were suddenly GPL, the downside would be momentary. Long-term, developers and buyers would learn to work with the GPL and life would go on. This is also a reason for wanting the GPL to be tested and interpreted in court. Not that the court will get it "right" (whatever that means), but that it will help to establish some certainty around the license, right or wrong.
I am not arguing that everything should be GPL. Rather, I am arguing that we need to dramatically reduce the number of licenses we use, and stop forking existing licenses. I believe that this will mean a lot more GPL-licensed projects, and that this is a good thing. As we limit the number of licenses we use, we will actually maximize the effective amount of choice the open source community provides. Choosing between 50 licenses is too unwieldy to be an efficient choice. It only breeds more lawyers. Who wants that?