Tag: open source

  • The Rise of AI Native: Open Source Ecosystems

    I recently posted my thoughts on “AI Native” and automation, and how recent developments in the model and agent ecosystem were about to significantly disrupt our application lifecycle management tools, or what we have erroneously labeled “DevOps”. This entire post is based on that premise. If I turn out to be wrong, pretend like I never said it 🙂 Considering that a lot of open source time and effort goes into infrastructure tooling – Kubernetes, OpenTofu, etc – one might draw the conclusion that, because many of these tools are about to be disrupted and made obsolete, therefore “OMG Open Source is over!!11!!1!!”. Except, my conclusion is actually the opposite of that. I’ll explain.

    Disclaimer: You cannot talk about “AI” without also mentioning that we have severe problems with inherent bias in models, along with the severe problems of misinformation, deepfakes, energy and water consumption, etc. It would be irresponsible of me to blog about “AI” without mentioning the significant downsides to the proliferation of these technologies. Please take a minute and familiarize yourself with DAIR and its founder, Dr. Timnit Gebru.

    How Open Source Benefits from AI Native

    As I mentioned previously, my “aha!” moment was when I realized that AI Native automation basically removes the need for tools like Terraform or OpenTofu. I concluded that tools that we have come to rely on will be replaced – slowly at first, and then accelerating – by AI Native systems based on models and agents connected by endpoints and interfaces adhering to standard protocols like MCP. This new way of designing applications will become this generation’s 3-tiered web architecture. As I said before, it will be models all the way down. Composite apps will almost certainly have embedded data models somewhere, the only question is to what extent.
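
    To make that shape concrete, here is a minimal, framework-free sketch of what an MCP-style tool interface looks like from an agent’s point of view: a tool described by a name, a plain-language description, and a JSON schema, plus a dispatcher that routes a model-generated call to real code. The field names and the example tool are illustrative assumptions on my part, not the MCP specification itself.

```python
import json

# Illustrative only: an MCP-style description of one tool an agent could discover.
# The field names mirror the general shape of such protocols; they are not the spec.
TOOLS = {
    "restart_service": {
        "description": "Restart a named service in the target environment.",
        "input_schema": {
            "type": "object",
            "properties": {"service": {"type": "string"}},
            "required": ["service"],
        },
    }
}

def restart_service(service: str) -> dict:
    # Stand-in for a real call into the deployment environment.
    return {"service": service, "status": "restarted"}

HANDLERS = {"restart_service": restart_service}

def dispatch(tool_call: str) -> str:
    """Route a JSON tool call produced by a model to the matching handler."""
    call = json.loads(tool_call)
    result = HANDLERS[call["name"]](**call["arguments"])
    return json.dumps(result)

if __name__ == "__main__":
    # A model that has seen TOOLS might emit a call like this:
    print(dispatch('{"name": "restart_service", "arguments": {"service": "billing-api"}}'))
```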

    The reason that open source will benefit from this shift (do not say paradigm… do not say paradigm… do not say…) is the same reason that open source benefited from cloud. A long long time ago, back in the stone age, say 2011 – 2016, there were people who claimed that the rise of SaaS, cloud, and serverless computing would spell the end of open source development, because “nobody needed access to source code anymore.” A lot of smart people made this claim – honest! But I’m sorry, it was dumb. If you squint really hard, you can sort of see how one might arrive at that conclusion. That is, if you made the erroneous assumption that none of these SaaS and cloud services ran on, ya know, software. To his credit, Jim Zemlin, the Linux Foundation czar, never fell for this and proclaimed that open source and cloud went together like “peanut butter and chocolate” – and he was right. Turns out, using SaaS and cloud-based apps meant you were using somebody else’s computer, which used – wait for it – software. And how did tech companies put together that software so efficiently and cheaply? That’s right – they built it all on open source software. The rise of cloud computing didn’t just continue or merely add to the development of open source, it supercharged it. One might say that without open source software, SaaS and cloud native apps could never have existed.

    I know that history doesn’t repeat itself, per se, and that it rhymes. In the case of AI Native, there’s some awfully strong rhyming going on. As I have mentioned before, source code is not a particularly valuable commodity and nothing in the development of AI native will arrest that downward trend. In fact, it will only accelerate it. As I mentioned in my first essay on the topic of open source economics, There is no Open Source Community, the ubiquity of software development and the erosion of geographical borders make for cheaper software, asymptotic to zero cost. This makes the cost of producing software very cheap and the cost of sharing the software pretty much $0. In an environment of massive heaps of software flying around the world hither and yon, there is a strong need for standard rules of engagement, hence the inherent usefulness of standard open source licensing, eg. the Apache License or the GNU General Public License.

    There are 2 key insights that I had recently on this subject: The first was in part 1 of this essay: devops tools are about to undergo severe disruption. The 2nd is this: the emergence of bots and agents editing and writing software does not diminish the need for standard rules of engagement; it increases the need. It just so happens that we now have 25+ years of standard rules of engagement for writing software in the form of the Open Source Definition and the licenses that adhere to its basic principles. Therefore, the only logical conclusion is that the production of open source code is about to accelerate. I did not say “more effective” or “higher quality” code, simply that there will be more of it.

    InnerSource, Too

    The same applies to InnerSource, the application of open source principles in an internal environment behind the firewall. If AI Native automation is about to take over devops tooling, then it stands to reason that the models and agents used internally will need rules of engagement for submitting pull/merge requests, fixing bugs, remediating vulnerabilities, and submitting drafts of new features. Unfortunately, whereas the external world is very familiar with open source rules of engagement, internal spaces have been playing catchup… slowly. While so much of b2b software development has occurred in open source spaces, large enterprises have instead invested millions of dollars in Agile and automation tooling while avoiding the implementation of open source collaboration for engineers. I have a few guesses as to why that is the case but regardless, companies will now have to accelerate their adoption of InnerSource rules to make up for 25 years of complacency. If they don’t, they’re in for a world of hurt, because everyone will either follow different sets of rules, or IT will clamp down and allow none of it, raising obstacles to the effectiveness of their agents. Think about an agent, interacting with MCP-based models, looking to push a new version of a YAML file into a code repo. But it can’t, because someone higher up decided that such activities were dangerous and never bothered to build a system of governance around them.
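
    What might those internal rules of engagement look like? Here is a deliberately simple sketch of a pre-merge gate for agent-authored changes. The specific policy values – allowed paths, required human approval, required tests – are hypothetical examples of InnerSource rules, not a recommendation of particular thresholds, and a real implementation would live in your repo hosting platform rather than a standalone script.

```python
from dataclasses import dataclass

# Hypothetical InnerSource policy for agent-submitted merge requests.
ALLOWED_PATH_PREFIXES = ("deploy/", "configs/")   # where agents may propose changes
REQUIRE_HUMAN_APPROVAL = True
REQUIRE_PASSING_TESTS = True

@dataclass
class MergeRequest:
    author: str                  # e.g. "release-agent[bot]"
    changed_paths: list[str]
    tests_passed: bool
    human_approvals: int = 0
    is_agent: bool = True

def agent_mr_allowed(mr: MergeRequest) -> tuple[bool, list[str]]:
    """Return whether an agent-authored MR may merge, plus any reasons it may not."""
    problems = []
    if mr.is_agent:
        if not all(p.startswith(ALLOWED_PATH_PREFIXES) for p in mr.changed_paths):
            problems.append("agent touched paths outside its allowed area")
        if REQUIRE_PASSING_TESTS and not mr.tests_passed:
            problems.append("tests are not passing")
        if REQUIRE_HUMAN_APPROVAL and mr.human_approvals < 1:
            problems.append("no human reviewer has approved")
    return (not problems, problems)

if __name__ == "__main__":
    mr = MergeRequest(author="release-agent[bot]",
                      changed_paths=["deploy/app.yaml"],
                      tests_passed=True, human_approvals=1)
    print(agent_mr_allowed(mr))   # (True, [])
```

    The point is not this particular code; it is that the rules exist, are written down, and apply to agents and humans alike – which is exactly what open source communities have been doing for decades.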

    Mark my words: the companies that make the best use of AI Native tools will be open source and InnerSource savvy.

  • The Rise of AI Native Automation

    I conceived of this blog in 2023 as a skeptic’s outpost, demystifying AI hype and taking it down whenever necessary. I had no interest in fueling a speculative bubble, but as a technologist at heart, I’m always interested in seeing what’s coming down the road. This is my way of saying that this post is not about poking holes in the current hype cycle, but rather taking a look at what I see developing in the application development world. Because I see major changes afoot, and I’m not sure folks are ready for them.

    No, we’re not all losing our jobs

    This is one of those zombie lies that won’t die. Turns out, sprinkling AI sauce into your IT and build environments makes humans in the loop even more important than before. Humans who understand how to build things; humans who understand how to communicate; humans who know how to write effectively; and humans who can conceive of a bigger picture and break it down into lanes of delivery. In other words, the big winners in an AI world are full-stack humans (I stole this from Michelle Yi. I hope she doesn’t mind) – a career in tech has never been more accessible to a humanities degree holder than now.

    The big losers are the code monkeys who crank out indecipherable code and respond in monosyllabic grunts to anyone who deigns to ask – those “10x” engineers that everyone was lauding just 3 years ago. We already knew source code wasn’t that valuable, and now it’s even less valuable than ever.

    AI and DevOps

    I had thought, until recently, that as AI infiltrated more and more of application development, the day would come when developers would need to integrate their model development into standard tools and platforms commonplace in our devops environments. Eventually, all AI development would succumb to the established platforms and tools that we’ve grown to know and love that make up our build, test, release, and monitor application lifecycle. I assumed there would be a great convergence. I still believe that, but I think I had the direction wrong. AI isn’t getting shoehorned into DevOps, it’s DevOps that is being shoehorned into AI. The tools we use today for infrastructure as code, continuous integration, testing, and releasing are not going to suddenly gain relevance in the AI developer’s world. A new class of AI-native tools is going to grow and obliterate (mostly) the tools that came before. These tools will both use trained models to be better at the build-test-release application development lifecycle and deploy apps that use models and agents as central features. It will be models all the way down, from the applications that are developed to the infrastructure that will be used to deploy, monitor, and improve them.

    Ask yourself a question: why do I need a human to write Terraform modules? They’re just rule sets with logic that defines guardrails for how infrastructure gets deployed and in what sequence. But let’s take that one step further: if I train my models and agents to interact with my deployment environments directly – K8s, EC2, et al – why do I need Terraform at all? Training a model to interact directly with the deployment environments will give it the means to master any number of rulesets for deployments. Same thing with CI tools. Training models to manage the build and release processes can proceed without the need for CI platforms. The model orchestrators will be the CI. A LangChain-based product is a lot better at this than CircleCI or Jenkins. The eye-opener for me has been the rise of standards like MCP, A2A, and the like. Now that we are actively defining the interfaces between models, agents, and the systems around them, it’s a short hop, skip, and a jump to AI-native composite apps that fill our clouds and data centers, combined with AI-native composite platforms that build, monitor, and tweak the infrastructure that hosts them.
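
    As a thought experiment, here is roughly what “skip the Terraform module” could look like: an agent-callable function that talks to the Kubernetes API directly through the official Python client. Everything specific here – the tool name, parameters, image, and namespace – is invented for illustration, and a production system would wrap a call like this in authentication, policy, and audit controls rather than hand an agent raw cluster credentials.

```python
from kubernetes import client, config

def deploy_service(name: str, image: str, replicas: int = 2,
                   namespace: str = "default") -> str:
    """Hypothetical agent tool: create a Deployment by calling the cluster API directly."""
    config.load_kube_config()   # or config.load_incluster_config() when running in-cluster
    labels = {"app": name}
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name=name, labels=labels),
        spec=client.V1DeploymentSpec(
            replicas=replicas,
            selector=client.V1LabelSelector(match_labels=labels),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels=labels),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(name=name, image=image)]
                ),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace=namespace, body=deployment)
    return f"deployed {name} ({image}) with {replicas} replicas to {namespace}"

if __name__ == "__main__":
    # An orchestrator would invoke this in response to a model-generated tool call.
    print(deploy_service("billing-api", "registry.example.com/billing-api:1.4.2"))
```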

    AI Native Tools are Coming

    Once you fully understand the potential power of model-based development and agent-based delivery and fulfillment, you begin to realize just how much of the IT and devops world is about to be flipped on its head. Model management platforms, model orchestrators, and the like become a lot more prominent in this world, and the winners in the new infrastructure arms race will be those tools that take the most advantage of these feature sets. Moreover, when you consider the general lifespan of platforms and the longevity of the current set of tools most prevalent in today’s infrastructure, you get the impression that the time for the next shift has begun. Hardly any of today’s most popular tools were in use prior to 2010.

    DevOps tools have followed a general pattern over the past 30 years, starting with the beginning of what I’ll call the “web era”:

    Timeline showing the progress of machine automation in the web era, starting from 1995 until today. The diagram shows "custom scripts on bare metal" lasting from 1995 - 2010, "Automated IaC, CI 'cloud native'" from 2010 - 2025, and "MCP, agentic automation 'AI Native'" starting in 2025 going until ????
    Progression of automation in the web era

    The current crop of automation tools is now becoming “legacy”, and their survival rests on how well they acclimate to an AI Native application development world. Even what we call “MLOps” was mostly running the AI playbook in a CI “cloud native” DevOps world. Either MLOps platforms adapt and move to an AI native context, or they will be relegated to legacy status (my prediction). I don’t think we yet know what AI native tools will look like in 5 years, but if the speed of MCP adoption is any indicator, I think the transition will happen much more quickly than we anticipated. This is perhaps due to the possibility that well-designed agentic systems can be used to help architect these new AI Native systems.

    I also don’t think this will be any great glorious era of amazing transformation. Any migration to new systems will bring about a host of new challenges and obstacles, especially in the security space. I shudder to think of all the new threat vectors that are emerging as we speak to take advantage of these automated interfaces to core infrastructure. But that’s ok, we’ll design agentic security systems that will work 24/7 to thwart these threats! What could possibly go wrong??? And then there are all the other problems that have already been discussed by the founders of DAIR: bias, surveillance, deep fakes, the proliferation of misinformation, et al. We cannot count on AI Native systems to design inclusive human interfaces or prevent malicious ones. In fact, without proper human governance, AI native systems will accelerate and maximize these problems.

    In part 2, I’ll examine the impact of AI Native on development ecosystems, open source, and our (already poor) systems of governance for technology.

  • The Open Source Supply Chain Was Always Broken

    I’ve written a number of articles over the years about open source software supply chains, and some of the issues confronting open source sustainability. The ultimate thrust of my supply chain advocacy culminated in this article imploring users to take control of their supply chains. I naively thought that by bringing attention to supply chain issues, more companies would step up to maintain the parts that were important to them. I first started bringing attention to this matter in November 2014, when I keynoted for the first time at a Linux Foundation event. Over the next 3 years, I continued to evolve my view of supply chains, settling on this view of supply chain “funnels”:

    Diagram of a typical open source supply chain funnel, where upstream components are pulled into a distribution, packaged for widespread consumption and finally made into a product
    Diagram of open source supply chain funnel

    So, what has happened since I last published this work? On the plus side, lots of people are talking about open source supply chains! On the downside, no one is drawing the obvious conclusion: we need companies to step up on the maintenance of said software. In truth, this has always been the missing link. Unfortunately, what has happened instead is that we now have a number of security vendors generating lots of reports that show thousands of red lights flashing “danger! danger!” to instill fear in any CISO that open source software is going to be their undoing at any given moment. Instead of creating solutions to the supply chain problem, vendors have instead stepped in to scare the living daylights out of those assigned the thankless task of protecting their IT enterprises.

    Securing Open Source Supply Chains: Hopeless?

    Originally, Linux distributions signed on for the role of open source maintainers, but the world has evolved towards systems that embrace language ecosystems with their ever-changing world of libraries, runtimes, and frameworks. Providing secure, reliable distributions that also tracked and incorporated the changes of overlaid language-specific package management proved to be a challenge that distribution vendors have yet to adequately meet. The uneasy solution has been for distribution vendors to provide the platform, and then everyone re-invents (poorly) different parts of the wheel for package management overlays specific to different languages. In short, it’s a mess without an obvious solution. It’s especially frustrating because the only way to solve the issue in the current environment would be for a single vendor to take over the commercial open source world and enforce by fiat a single package management system. But that’s frankly way too much power to entrust to a single organization. The organizations designed to provide neutral venues for open source communities, foundations, have also not stepped in to solve the core issues of sustainability or the lack of package management standardization. There have been some efforts that are noteworthy and have made a positive impact, but not to the extent that is needed. Everyone is still wondering why certain critical components are not adequately maintained and funded, and everyone is still trying to understand how to make language-specific package ecosystems more resilient and able to withstand attacks from bad-faith users and developers. (note: sometimes the call *is* coming from inside the house)

    But is the supply chain situation hopeless? Not at all. Despite the inability to solve the larger problems, the fact is that every milestone of progress brings us a step closer to more secure ecosystems and supply chains. Steps taken by multiple language ecosystems to institute MFA requirements for package maintainers, to use but one example, result in substantial positive impacts. These simple, relatively low-cost actions cover the basics that have long been missing in the mission to secure supply chains. But that brings us to a fundamental issue yet to be addressed: whose job is it to make supply chains more secure and resilient?
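
    To make “the basics” concrete with one small illustration (a different basic than MFA, but in the same low-cost spirit): verifying that a fetched dependency matches a pinned digest before it ever enters a build. The file name and expected hash below are placeholders; a real pipeline would read them from a lock file.

```python
import hashlib
import sys

def sha256_of(path: str) -> str:
    """Compute the SHA-256 digest of a file without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: str, expected: str) -> None:
    """Fail the build if a fetched artifact does not match its pinned digest."""
    actual = sha256_of(path)
    if actual != expected:
        sys.exit(f"digest mismatch for {path}: expected {expected}, got {actual}")
    print(f"{path}: OK")

if __name__ == "__main__":
    # Placeholder values for illustration only.
    verify_artifact("vendor/example-dependency-1.3.0.tar.gz",
                    "0000000000000000000000000000000000000000000000000000000000000000")
```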

    I Am Not Your Open Source Supply Chain

    One of the better essays on this subject was written by Thomas Depierre titled “I Am Not a Supplier”. While the title is a bit cheeky and “clickbait-y” (I mean, you are a supplier, whether you like it or not) he does make a very pertinent – and often overlooked – point: developers who decide to release code have absolutely no relationship with commercial users or technology vendors, especially if they offer no commercial support of that software. As Depierre notes, the software is provided “as is” with no warranty.

    Which brings us back to the fundamental question: if not the maintainers, then who is responsible for open source supply chains?

    The 10% Rule

    I would propose the following solution: If you depend on open source software, you have an obligation to contribute to its sustainability. That means if you sell any product that uses open source software, or if your enterprise depends on the use of open source software, then you have signed on to maintain that software. This is the missing link. If you use, you’re responsible. In all, I would suggest redirecting 10% of your engineering spend to upstream open source maintenance, and I’ll show how it won’t break the budget. There are a number of ways to do this in a sustainable way that leads to higher productivity and better software:

    • Hire a maintainer for software you depend on – this is a brute force method, but it would be valuable for a particularly critical piece of software
    • Fund projects dedicated to open source sustainability. There are a number of them, many run out of larger software foundations, eg. The Linux Foundation, the ASF, Eclipse, the Python Software Foundation, and others.
    • Pay technology vendors who responsibly contribute to upstream projects. If your vendors don’t seem to support the upstream sources for their software, you may want to rethink your procurement strategies
    • Add a sustainability clause to your Software Bills of Materials (SBOM) requirements. Similar to the bullet above, if you start requiring your vendors to disclose their SBOMs, add a requirement that they contribute to the sustainability of the projects they build into their products (see the sketch below this list).
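
    To make the SBOM clause in the last bullet concrete, here is a sketch of what checking a vendor’s disclosure could look like: read a CycloneDX-style SBOM and flag components that nobody on your side (or your vendor’s) has committed to sustain. The sustainability list and file name are hypothetical – CycloneDX itself says nothing about funding – so treat this as a shape, not a tool.

```python
import json

# Hypothetical: the upstream projects your organization or vendor has committed to sustain.
# In reality this would come from procurement or OSPO records, not a hard-coded set.
SUSTAINED_PROJECTS = {"openssl", "log4j-core", "requests"}

def unsustained_components(sbom_path: str) -> list[str]:
    """List components in a CycloneDX JSON SBOM with no recorded sustainability commitment."""
    with open(sbom_path) as fh:
        sbom = json.load(fh)
    flagged = []
    for component in sbom.get("components", []):
        name = component.get("name", "")
        if name and name not in SUSTAINED_PROJECTS:
            flagged.append(f'{name} {component.get("version", "")}'.strip())
    return flagged

if __name__ == "__main__":
    # "vendor-sbom.json" is a placeholder for whatever your vendor discloses.
    for entry in unsustained_components("vendor-sbom.json"):
        print(f"no sustainability commitment recorded for: {entry}")
```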

    There is, of course, still a need to coordinate and maximize the impact. Every critical piece of software infrastructure should be accounted for on a sustainability metric. Ideally, software foundations will step up as the coordinators, and I see some progress through the Alpha and Omega project. It doesn’t quite reach the scale needed, but it is a step in the right direction.

    If you work for a company that uses a lot of open source software (and chances are that you do) you may want to start asking questions about whether your employers are doing their part. If you do a good job of sustaining open source software and hardening your supply chains, you can spend a lot less on “security” software and services that generate reports that show thousands of problems. By coordinating with communities and ecosystems at large, you can help solve the problem at the source and stop paying ambulance chasers that capitalize on the fear. That’s why spending 10% of your IT budget on open source sustainability will be budget neutral for the first 2 years and deliver cost savings beyond that. Additionally, your developers will learn how to maintain open source software and collaborate upstream, yielding qualitative benefits in the form of greater technology innovation.

  • The Rise of Open Source Analytics Software

    I was pleased to read about the progress of Graylog2, ElasticSearch, Kibana, et al. in the past year. Machine data analysis has been a growing area of interest for some time now, as traditional monitoring and systems management tools aren’t capable of keeping up with All of the Things that make up many modern workloads. And then there are the more general purpose, “big data” platforms like Hadoop along with the new in-memory upstarts sprouting up around the BDAS stack. Right now is a great time to be a data analytics person, because there has never in the history of computing been a richer set of open source tools to work with.

    There’s a functional difference between what I call data processing platforms, such as Hadoop and BDAS, and data search presentation layers, such as what you find with the ELK stack (ElasticSearch, Logstash and Kibana). While Hadoop, BDAS, et al. are great for processing extremely large data sets, they’re mostly useful as platforms for people Who Know What They’re Doing (TM), ie. math and science PhDs and analytics groups within larger companies. But really, the search and presentation layers are, to me, where the interesting work is taking place these days: it’s where Joe and Jane programmer and operations person are going to make their mark on their organization. And many of the modern tools for data presentation can take data sets from a number of sources: log data, JSON, various forms of XML, event data piped directly over sockets or some other forwarding mechanism. This is why there’s a burgeoning market around tools that integrate with Hadoop and other platforms.
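
    To ground that in something concrete, here is a small sketch of pushing one JSON log event into ElasticSearch and querying it back over its REST API, which is roughly the loop that Kibana dashboards are built on. The index name and event fields are made up for illustration, I am using plain HTTP rather than any particular client library, and the exact endpoint shapes vary a bit across ElasticSearch versions.

```python
import json
import urllib.request

ES = "http://localhost:9200"   # assumes a local ElasticSearch node for illustration

def post(path: str, payload: dict) -> dict:
    """POST a JSON payload to the ElasticSearch REST API and return the parsed response."""
    req = urllib.request.Request(
        ES + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Index one illustrative log event...
    post("/app-logs/_doc", {
        "timestamp": "2014-01-15T12:34:56Z",
        "level": "ERROR",
        "service": "checkout",
        "message": "payment gateway timeout",
    })
    # ...then search for it back; a presentation layer renders results like these.
    hits = post("/app-logs/_search", {"query": {"match": {"level": "ERROR"}}})
    print(json.dumps(hits["hits"], indent=2))
```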

    There’s one aspect of data search presentation layers that has largely gone unmentioned. Everyone tends to focus on the software, and if it’s open source, that gets a strong mention. No one, however, seems to focus on the parts that are most important: data formats, data availability and data reuse. The best part about open source analytics tools is that, by definition, the data outputs must also be openly defined and available for consumption by other tools and platforms. This is in stark contrast to traditional systems management tools and even some modern ones. The most exciting premise of open source tooling in this area is the freedom from the dreaded data roach motel model, where data goes in, but it doesn’t come out unless you pay for the privilege of accessing the data you already own. Recently, I’ve taken to calling it the “skunky data model” and imploring people to “de-skunk their data.”

    Last year, the Red Hat Storage folks came up with the tag line of “Liberate Your Information.” Yes, I know, it sounds hokey and like marketing double-speak, but the concept is very real. There are, today, many users, developers and customers trapped in the data roach motel who cannot get out, because they made the (poor) decision to go with a vendor that didn’t have their needs in mind. It would seem that the best way to prevent this outcome is to go with an open source solution, because again, by definition, it is impossible to create an open source solution that creates proprietary data – because the source is open to the world, it would be impossible to hide how the data is indexed, managed, and stored.

    In the past, one of the problems was that there simply weren’t a whole lot of choices for would-be customers. Luckily, we now have a wealth of options to choose from. As always, I recommend that those looking for solutions in this area go with a vendor that has their interests at heart. Go with a vendor that will allow you to access your data on your terms. Go with a vendor that gives you the means to fire them if they’re not a good partner for you. I think it’s no exaggeration to say that the only way to guarantee this freedom is to go with an open source solution.


  • The Tyranny of the Clouds

    Or “How I learned to start worrying and never trust the cloud.”

    The Clouderati have been derping for some time now about how we’re all going towards the public cloud and “private cloud” will soon become a distant, painful memory, much like electric generators filled the gap before power grids became the norm. They seem far too glib about that prospect, and frankly, they should know better. When the Clouderati see the inevitability of the public cloud, their minds turn to the unicorns and rainbows that are sure to follow. When I think of the inevitability of the public cloud, my mind strays to “The Empire Strikes Back” and who’s going to end up as Han Solo. When the Clouderati extol the virtues of public cloud providers, they prove to be very useful idiots advancing service providers’ aims, sort of the Lando Calrissians of the cloud wars. I, on the other hand, see an empire striking back at end users and developers, taking away our hard-fought gains made from the proliferation of free/open source software. That “the empire” is doing this *with* free/open source software just makes it all the more painful an irony to bear.

    I wrote previously that It Was Never About Innovation, and that article was set up to lead to this one, which is all about the cloud. I can still recall talking to Nicholas Carr about his new book at the time, “The Big Switch”, all about how we were heading towards a future of utility computing, and what that would portend. Nicholas saw the same trends the Clouderati did, except a few years earlier, and came away with a much different impression. Where the Clouderati are bowled over by Technology! and Innovation!, Nicholas saw a harbinger of potential harm and warned of a potential economic calamity as a result. While I also see a potential calamity, it has less to do with economic stagnation and more to do with the loss of both freedom and equality.

    The virtuous cycle I mentioned in the previous article does not exist when it comes to abstracting software over a network, into the cloud, and away from the end user and developer. In the world of cloud computing, there is no level playing field – at least, not at the moment. Customers are at the mercy of service providers and operators, and there are no “four freedoms” to fall back on.

    When several of us co-founded the Open Cloud Initiative (OCI), it was with the intent, as Simon Phipps so eloquently put it, of projecting the four freedoms onto the cloud. There have been attempts to mandate additional terms in licensing that would force service providers to participate in a level playing field. See, for example, the great debates over “closing the web services loophole” as we called it then, during the process to create the successor to the GNU General Public License version 2. Unfortunately, while we didn’t yet realize it, we didn’t have the same leverage as we had when software was something that you installed and maintained on a local machine.

    The Way to the Open Cloud

    Many “open cloud” efforts have come and gone over the years, none of them leading to anything of substance or gaining traction where it matters. Bradley Kuhn helped drive the creation of the Affero GPL version 3, which set out to define what software distribution and conveyance mean in a web-driven world, but the rest of the world has been slow to adopt it because, again, service providers have no economic incentive to do so. Where we find ourselves today is a world without a level playing field, which will, in my opinion, stifle creativity and, yes, innovation. It is this desire for “innovation” that drives the service providers to behave as they do, although as you might surmise, I do not think that word means what they think it means. As in many things, service providers want to be the arbiters of said innovation without letting those dreaded freeloaders have much of a say. Worse yet, they create services that push freeloaders into becoming part of the product – not participants in the process that drives product direction. (I know, I know: yes, users can get together and complain or file bugs, but they cannot mandate anything over the providers)

    Most surprising is that the closed cloud is aided and abetted by well-intentioned, but ultimately harmful actors. If you listen to the Clouderati, public cloud providers are the wonderful innovators in the space, along with heaping helpings of concern trolling over OpenStack’s future prospects. And when customers lose because a cloud company shuts its doors, the Clouderati can’t be bothered to bring themselves to care: c’est la vie and let them eat cake. The problem is that too many of the Clouderati think that Innovation! is an end in itself, without thinking of ground rules or a “bill of rights” for the cloud. Innovation! and Technology! must rule all, and therefore the most innovative take all, and anything else is counter-productive or hindering the “free market”. This is what happens when the libertarian-minded carry prejudiced notions of what enabled open source success without understanding what made it possible: the establishment and codification of rights and freedoms. None of the Clouderati are evil, freedom-stealing, or greedy, per se, but their actions serve to enable those who are. Because they think solely in terms of Innovation! and Technology!, they set the stage for some companies to dominate the cloud space without any regard for establishing a level playing field.

    Let us enumerate the essential items for open innovation:

    1. Set of ground rules by which everyone must abide, eg. the four freedoms
    2. Level playing field where every participant is a stakeholder in a collaborative effort
    3. Economic incentives for participation

    These will be vigorously opposed by those who argue that establishing such a list is too restrictive for innovation to happen, because… free market! The irony is that establishing such rules enabled Open Source communities to become the engine that runs the world’s economy. Let us take each and discuss its role in creating the open cloud.

    Ground Rules

    We have already established the irony that the four freedoms led to the creation of software that was used as the infrastructure for creating proprietary cloud services. What if the four freedoms were tweaked for cloud services? As a reminder, here are the four freedoms:

    • The freedom to run the program, for any purpose (freedom 0).
    • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1).
    • The freedom to redistribute copies so you can help your neighbor (freedom 2).
    • The freedom to distribute copies of your modified versions to others (freedom 3).

    If we rewrote this to apply to cloud services, how much would need to change? I made an attempt at this, and it turns out that only a couple of words need to change:

    • The freedom to run the program or service, for any purpose (freedom 0).
    • The freedom to study how the service works, and change it so it does your computing as you wish (freedom 1).
    • The freedom to implement and redistribute copies so you can help your neighbor (freedom 2).
    • The freedom to implement your modified versions for others (freedom 3).

    Freedom 0 adds “or service” to denote that we’re not just talking about a single program, but a set of programs that act in concert to deliver a service.

    Freedom 1 allows end users and developers to peek under the hood.

    Freedom 2 adds “implement and” to remind us that the software alone is not much use – the data forms a crucial part of any service.

    Freedom 3 also changes “distribute copies of” to “implement” because of the fundamental role that data plays in any service. Distributing copies of software in this case doesn’t help anyone without also adding the capability of implementing the modified service, data and all.

    Establishing these rules will be met, of course, with howls of rancor from the established players in the market, as it should be.

    Level Playing Field

    With the establishment of the service-oriented freedoms, above, we have the foundation for a level playing field with actors from all sides having a stake in each other’s success. Each of the enumerated freedoms serves to establish a managed ecosystem, rather than a winner-take-all pillage and plunder system. This will be countered by the argument that if we hinder the development of innovative companies, won’t we a.) hinder economic growth in general and b.) socialism!

    In the first case, there is a very real threat from a winner-take-all system. In its formative stages, when everyone has the economic incentive to innovate (there’s that word again!), everyone wins. Companies create and disrupt each other, and everyone else wins by utilizing the creations of those companies. But there’s a well-known consequence of this activity: each actor will try to build in the ability to retain customers at all costs. We have seen this happen in many markets, such as the creation of proprietary, undocumented data formats in the office productivity market. And we have seen it in the cloud, with the creation of proprietary APIs that lock in customers to a particular service offering. This, too, chokes off economic development and, eventually, innovation.

    At first, this lock-in happens via the creation of new products and services which usually offer new features that enable customers to be more productive and agile. Over time, however, once the lock-in is established, customers find that their long-term margins are not in their favor, and moving to another platform proves too costly and time-consuming. If all vendors are equal, this may not be so bad, because vendors have an incentive to lure customers away from their existing providers, and the market becomes populated by vendors competing for customers, acting in their interest. Allow one vendor to establish a larger share than others, and this model breaks down. In a monopoly situation, the incumbent vendor has many levers to lock in their customers, making the transition cost too high to switch to another provider. In cloud computing, this winner-take-all effect is magnified by the massive economies of scale enjoyed by the incumbent providers.

    Thus, the customer is unable to be as innovative as they could be due to their vendor’s lock-in schemes. If you believe in unfettered Innovation! at all costs, then you must also understand the very real economic consequences of vendor lock-in. By creating a level playing field through the establishment of ground rules that ensure freedom, a sustainable and innovative market is at least feasible. Without that, an unfettered winner-take-all approach will invariably result in the loss of freedom and, consequently, agility and innovation.

    Economic Incentives

    This is the hard one. We have already established that open source ecosystems work because all actors have an incentive to participate, but we have not established whether the same incentives apply here. In the open source software world, developers participate because they have to, because the price of software is always dropping, and customers enjoy open source software too much to give it up for anything else. One thing that may be in our favor is the distinct lack of profits in the cloud computing space, although that changes once you include services built on cloud computing architectures.

    If we focus on infrastructure as a service (IaaS) and platform as a service (PaaS), the primary gateways to creating cloud-based services, then the margins and profits are quite low. This market is, by its nature, open to competition because the race is on to lure as many developers and customers as possible to the respective platform offerings. However, the danger is that one particular service provider becomes able to offer proprietary services that give it leverage over the others, establishing the lock-in levers needed to pound the competition into oblivion.

    In contrast to basic infrastructure, the profit margins of proprietary products built on top of cloud infrastructure have been growing for some time, which incentivizes the IaaS and PaaS vendors to keep stacking proprietary services on top of their basic infrastructure. This results in a situation where increasing numbers of people and businesses have happily donated their most important business processes and workflows to these service providers. If any of them grow unhappy with the service, they cannot easily switch, because no competitor would have access to the same data or implementation of that service. In this case, not only is there a high cost associated with moving to another service, there is the distinct loss of utility (and revenue) that the customer would experience. There is a cost that comes from entrusting so much of your business to single points of failure with no known mechanism for migrating to a competitor.

    In this model, there is no incentive for service providers to voluntarily open up their data or services to other service providers. There is, however, an incentive for competing service providers to be more open with their products. One possible solution could be to create an Open Cloud certification that would allow services that abide by the four freedoms in the cloud to differentiate themselves from the rest of the pack. If enough service providers signed on, it would lead to a network effect adding pressure to those providers who don’t abide by the four freedoms. This is similar to the model established by the Free Software Foundation and, although the GNU people would be loath to admit it, the Open Source Initiative. The OCI’s goal was to ultimately create this, but we have not yet been able to follow through on those efforts.

    Conclusion

    We have a pretty good idea why open source succeeded, but we don’t know if the open cloud will follow the same path. At the moment, end users and developers have little leverage in this game. One possibility would be if end users chose, at massive scale, to use services that adhered to open cloud principles, but we are a long way away from this reality. Ultimately, in order for the open cloud to succeed, there must be economic incentives for all parties involved. Perhaps pricing demands will drive some of the lower rung service providers to adopt more open policies. Perhaps end users will flock to those service providers, starting a new virtuous cycle. We don’t yet know. What we do know is that attempts to create Innovation! will undoubtedly lead to a stacked deck and a lack of leverage for those who rely on these services.

    If we are to resolve this problem, it can’t be about innovation for innovation’s sake – it must be, once again, about freedom.

     

  • Do Open Source Communities Have a Social Responsibility?

    This post continues my holiday detour into things not necessarily tech related. Forgive me this indulgence – there is at least one more post I’ll make in a similar vein.

    Open Source communities are different. At least, I’ve always felt that they are. Think of the term “community manager.” If you’re a community manager in an open source community your responsibilities include, but are not limited to: product management, project management, enabling your employer’s competition, enabling people’s success without their paying you, marketing strategy and vision, product strategy and vision, people management (aka cat-herding), event management, and even, sometimes, basic IT administration and web development. If you ask a community manager in some other industry, they do anywhere from half of those things to, at most, 3/4. But even the most capable community manager in a non-open source field will not do at least two of the things mentioned: enabling your competitors and enabling “freeloaders”. (Before anyone says anything – no, enabling non-paying contributors to upload free content that your employer uses to rake in ad revenue doesn’t count for the latter. That’s called tricking people into contributing free labor to a product you sell.)

    So it would seem that Open Source community management is a different beast, a much more comprehensive set of duties and, dare I say it, a proving ground for executive leadership. There are other differences, too, that make the scope of open source communities different and more expansive. Beginning with the GNU project and the Free Software Foundation, the roots of open source are enmeshed with social responsibility, but do modern open source communities continue to carry the flame of social responsibility?

    One of the things that attracted me to open source communities in the beginning was the sense that by participating in them, I was making the world a better place. And by that, I don’t mean in the Steve Jobs sense, where “making the world a better place” means “anything that fattens my wallet and strips people of their information rights.” I mean actually creating something that adds value to others without expecting any form of monetary remuneration. Others have called this a “gift economy” but I’m not sure that’s exactly correct. I mean, I’ve always been paid for my open source work, which makes me different from other social advocates who literally make nothing for their efforts. Regardless, there’s a sense that I’m enabling a better world while also drawing a nice paycheck, which certainly beats making the world crappier while drawing an even bigger paycheck.

    Anyway, throughout my open source community career, I’ve seen all sorts of social causes at work: bridging the digital divide, defining information rights and, more recently, gender and ethnic equality in technology. Because of our social activism roots, the question becomes: how much responsibility do we have as open source advocates to carry the torch for related causes? Take the Ada Initiative, for example. Does it not behoove us to do our part for gender equality in high tech? How many open source conferences have you been to that were >90% male? Does saying that “well, the code is open, so anyone can participate” really cut it? If we’re really going to address the problem of the digital divide, does it not make sense to more aggressively recruit women and under-represented minorities into the fold?

    If we really want to rid the world of proprietary software, I don’t see how we can do that without adding in people who currently do not actively participate in open source communities. There’s also been a disturbing trend whereby the more commercial communities have begun to separate themselves from the communities with more social activism roots, dividing the hippies from the money-makers. As I noted in my previous post, the hippies were right the whole time about the four freedoms, so perhaps we should listen to them more closely on these other issues? Think about it – if we more aggressively recruit from under-represented portions of society, would that not add a much-needed influx of talent and ambition? Would that not, then, make our communities that much more dynamic and productive? I’ve always held that economics has a long-term liberal bias, and I think this is an opportunity to put that maxim to the test.

    This holiday season, let’s think about the social responsibility of open source communities and its participants. Let’s think about ways we can bring the under-represented into the fold.

  • It Was Never About Innovation

    This is the first in a series of articles about innovation and open computing. Because it’s a holiday time of year in the USA, I’ve decided that these next few articles will be a detour from the usual stuff you’ll find here.

    Ever since a few of us got together to form the Open Cloud Initiative, I’ve looked at cloud computing with awe, but also mistrust. There are many good things that can come of cloud computing initiatives, but there’s also the opportunity (some might say inevitability) of abuse and exploitation.

    Over the last few months, I’ve made a point of giving a talk at various conferences with the title of “It Was Never About Innovation.” The point being that Open Source software proved victorious in the data center, not because developers necessarily wanted to release more software under an open source license or because open source development models are necessarily more innovative. No, as I see it, open source led to more innovation and took over the data center because of the basic ground rules that were laid down from the beginning with the intent of creating an ecosystem that espoused the four freedoms as enumerated by Richard Stallman and the Free Software Foundation. It was those ground rules that leveled the playing field, forcing developers to treat end users as equal partners in the game of software development. You will note that it was open source that took over the data center, not freemium free-as-in-beer software. As I’ve grown fond of saying over the last few months, the hippies had it right the whole time. In this model, innovation wasn’t the end goal, it was just a very interesting by-product.

    I posit that innovation is much like Douglas Adams’ description of flying in “The Hitchhiker’s Guide to the Galaxy,” which you’ll no doubt recall as the art of throwing yourself at the ground and missing. To attempt to fly would be to miss the point – and fail miserably. No, the trick is to distract yourself before you hit the ground so that flying becomes the end result. Innovation is very much like that. To attempt to be innovative is to perhaps miss the entire point of how the creative process happens. Technology office parks, anyone? Incubators? Are these supposed houses of innovation necessarily more innovative than the alternatives?

    My point is that the innovation that’s taking place right now in multiple open source ecosystems is due to the positive feedback loop that was a direct consequence of implementing the four freedoms and mandating that all parties abide by them. It was the implementation of the four freedoms that created a system in which “freeloaders”, those who don’t pay anything for software, could be every bit as important as the developer or even paying customer. If developers didn’t necessarily want to participate in an open ecosystem, which forced participants into abiding by rules that weren’t necessarily in their direct self-interest, why, then, did they willingly participate?

    That was the question I set out to answer way back in 2005, when I wrote “There is no Open Source Community.” The impetus for that paper was when I found myself unable to answer the essential question, “Why do developers willingly release open source software? Was it out of some sense of charity? Of providing for the greater good?” One of the most startling discoveries in my young career, back when I worked on SourceForge.net at VA Software, was that most developers who write open source software don’t really care about the concepts of “open source” or “free software”. At the time, we conducted a survey of developers on SourceForge.net and were surprised to discover that they really didn’t give two tosses about the four freedoms. They weren’t free software advocates, and yet they were still free software participants. But why?

    What I discovered by working through thought experiments and measuring the results of my model versus reality was that developers didn’t write open source software because they wanted to – they wrote it because they had to. There is an economic incentive for developers to participate in open source communities due to three major trends:

    1. The ubiquity of internet access

    2. Possibly as a direct result of #1, the ubiquity of developers writing software, and

    3. Also as a direct result of the previous bullet, the price of any given feature in software is asymptotic to zero over time.

    When the forces of economics put constant downward price pressure on software, developers look for other ways to derive income. Given the choice between simply submitting to economic forces and releasing no-cost software in proprietary form, developers found open source models to be a much better deal. Some of us didn’t necessarily like the mechanics of those models, which included dual licensing and using copyleft as a means of collecting ransom, but it was a model in which developers could thrive. I wrote this back in 2005, all from the developer’s and software vendor’s point of view. For years, I struggled with how to incorporate the end user’s point of view. I simply didn’t find the problem that interesting to contemplate. Of course end users would take to open source software: who doesn’t want to get access to stuff free-of-charge?

    Recently I started asking other questions as I contemplated the state of cloud computing and how it relates to open source economics. For example, if end users have the choice between something free-of-charge but proprietary versus open source software, how and why did open source win the day? If we believe that end users played a role in the model, and I think it’s clear that they did, why did they make this choice? Often I’m told that customers don’t care about open source or freedom, because they just want solutions that work. And yet, we have evidence that customers very much do care about those things. If they didn’t, why the overwhelming demand for software that’s open source? Also, the cost of acquiring software is minimal compared to the cost of maintaining software deployments, so why is the cost-free aspect of open source even a factor? One could argue that the speed and agility of acquiring something free-of-charge leads to ubiquity, but why prefer open source? After all, we’ve been told for many years that end users rarely use the direct access to software to make changes under the hood – they usually end up relying on the software developer for that, just as in the proprietary model. We simply have not understood very well why this process works.

    What I’ve come to believe is that it’s all about agility and, yes, the four freedoms. In fact, those two things are very much related. Think about it: there is no such thing as a single vendor for everything in the modern data center. That’s impossible at the moment (and, one could argue, ever). Data centers have incorporated automation and orchestration of the different layers of software stacks that must interact seamlessly so that the operators can go home at night and not suffer through incessant pager alerts. These layers need to inter-operate at levels of complexity that make lesser operators fall asleep weeping, unsure if they could rebuild all of their services if called upon. One very telling anecdote regarding this phenomenon came from one such operator, who casually commented to me that “if our data center went down and we were forced to do a complete reboot, we wouldn’t know how to do it.”

    In this scenario, you have to rely on things that allow you to automate and orchestrate at will, on your timetable, not that of a vendor. Think of all the projects that Netflix, Twitter and Facebook have released that allow them to orchestrate massive amounts of software and data in ways that deliver services without single points of failure. There is no universe in which this could be done with proprietary software. This is where the four freedoms come in. By creating an ecosystem that mandated the four freedoms, end users, consequentially, were able to participate in this ecosystem for the first time as equal partners to the developers. This dramatically changed the dynamics of customer-vendor relationships. In the world of open source software, end users have the freedom to be as agile as they will ever need to be. They can report problems, patch software, release new versions if the other developers don’t step up, and work in a system that allows them to deliver solutions with much faster times to market. The other developers, perhaps pure software vendors or also developer-operators, also benefit by virtue of the fact that willing participants in the form of end users create better software, and thus begins the virtuous cycle.

    To review:

    Developers write version 0.1 of some set of tools → end users evaluate software and decide how valuable it is → if it’s usable, end users either add patches, file bug reports, send feedback or all of the above → developers incorporate feedback, accept or reject patch submissions, rewrite portions of the code → end users evaluate and either use or discard.

    Lather, rinse, repeat. One couldn’t have set out to design a system that was as agile and innovative, which is precisely my point! This system works because the open source ecosystem is a level playing field, and there is an economic incentive for all parties. Developers don’t rule over end users, and end users, by definition, can’t rule the process because they don’t produce the code, but it was their adoption of open source solutions that forced the hands of developers. In this model, developers showed up to fill customer demand. It was this level playing field, which was a direct result of implementing the four freedoms, that led to all the innovation that happens today. And as we have all-too-often been reminded, humans don’t generally set out to create level playing fields, which is why groups of people could never have intended to create this feedback loop. All the various parties would have at various points become too greedy, protecting their own interests, thus the reason it had to begin with the four freedoms.

    And look where we are now – in a world where much of the innovation is just as likely (if not more) to come from “users” as it is from venerable software vendors. This is the world we live in today, where a company like Yahoo developed some data analysis software internally and decided that it might be useful for the rest of the world to work with, thus beginning the Hadoop juggernaut. Rackspace, long known for its hosting and support, not software development, collaborates with NASA and unleashes what became the OpenStack ecosystem. In both cases, the role of traditional software vendors was reduced, at first, to that of fast followers, not lead innovators or developers. None of this would have been possible without stating, from the beginning, that every software program must honor these freedoms:

    • The freedom to run the program, for any purpose (freedom 0).
    • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1).
    • The freedom to redistribute copies so you can help your neighbor (freedom 2).
    • The freedom to distribute copies of your modified versions to others (freedom 3).

    This is all fine and dandy, but what does this mean in a cloud computing context? You’ll have to wait for the next installment, coming soon!

  • On the Gluster vs Ceph Benchmarks

    If you’ve been following the Gluster and Ceph communities for any length of time, you know that we have similar visions for open software-defined storage and are becoming more competitive with each passing day. We have been rivals in a similar space for some time, but on friendly terms – and for a couple of simple reasons: 1.) we each have common enemies in the form of proprietary big storage and 2.) we genuinely like each other. I’m a Sage Weil fan and have long been an admirer of his work. Ross Turk and Neil Levine are also members of the Inktank clan whom I respect and vouch for on a regular basis. There are others I’m forgetting, and I hope they don’t take it personally!

    So you can imagine the internal debate I had when presented with the first results of a Red Hat Storage comparison with Ceph in a set of benchmarks commissioned by the Red Hat Storage product marketing group (for reference, they’re located here). If you saw my presentations at the OpenStack Summit in Hong Kong, then you know I went with it, and I’m glad I did. While the Ceph guys have been very good about not spouting FUD and focusing instead on the bigger picture – taking down EvilMaChines, for example – others in the clan of OpenStack hangers-on have not been so exemplary.

    I don’t know who, exactly, the Red Hat Storage marketing group was targeting with the benchmarks, but I am targeting a very specific audience, and it isn’t anyone associated with Inktank or the Ceph project. I am targeting all the people in the OpenStack universe who wrote us off and wanted to declare the storage wars over. I’m also a bit tired of the inexplicable phrase that “Ceph is faster than Gluster”, often said with no qualification, which I’ve known for some time was not true. It’s that truism, spouted by some moustachioed cloudy hipsters at an OpenStack meetup, that rankles me – almost as much as someone asking me in a public forum why we shouldn’t all ditch Gluster for Ceph. The idea that one is unequivocally faster or better than the other is completely ridiculous – almost as ridiculous as the thought that hipsters in early 20th century drag are trusted experts at evaluating technology. The benchmarks in question do not end any debates. On the contrary, they are just the beginning.

    I felt uneasy when I saw Sage show up at our Gluster Cloud Night in Hong Kong, because I really didn’t intend for this to be an “In yo’ face!” type of event. I did not know beforehand that he would be there, but even if I had, I wouldn’t have changed my decision to show the results. The “Ceph is faster” truism had become one of those things that everyone “knows” without the evidence to support it, and the longer we let it go unopposed, the more likely it was to become a self-fulfilling prophecy. Also, while we may have common enemies, it has become increasingly clear that the OpenStack universe would really prefer to converge around a single storage technology, and I will not let that happen without a fight.

    We’ve radically improved GlusterFS and the Gluster Community over the last couple of years, and we are very proud of our work. We don’t have to take a back seat to anyone; we don’t have to accept second place to anyone; and we’re not going to. In the end, it’s very clear who the winners of this rivalry will be. It won’t be Ceph, and it won’t be Gluster. It will be you, the users and developers, who will benefit from the two open source heavyweights scratching and clawing their way to the top of the heap. Rejoice and revel in your victory, because we work for you.

    To see the benchmark results for yourself, see the Red Hat Storage blog post on the subject.

    To see the VAR Guy’s take, see this article.

  • Some Thoughts on Gluster Community Governance

    tl;dr
    – This is a long description designed to elicit constructive discussion of some recent Gluster Community governance initiatives. For all things related to Gluster Community Governance, see gluster.org/Governance

    The recent initiatives around GlusterFS development and project governance have been quite amazing to witness – we have been making steady progress towards a “real” open source model for over a year now, and the 3.5 planning meetings are a testament to that.

    You may have also noticed recent announcements about organizations joining the Gluster Community and the formation of a Gluster Community Board. This is part of the same process of opening up and making a better, more active community, but there is a need to define some of the new (and potentially confusing) terminology.

    – Gluster Community: What is the Gluster Community? It is a group of developers, users and organizations dedicated to the development of GlusterFS and related projects. GlusterFS is the flagship project of the Gluster Community, but it is not the only one – see forge.gluster.org to get a sense of the scope of the entire ecosystem. Gluster Community governance is different from GlusterFS project governance.

    – Gluster Community Board: This consists of individuals from the Gluster Community, as well as representatives of organizations that have signed letters of intent to contribute to the Gluster Community.

    – Letter of Intent: document signed by organizations who wish to make material contributions to the Gluster Community. These contributions may take many forms, including code contributions, event coordination, documentation, testing, and more. How organizations may contribute is listed at gluster.org/governance

    – Gluster Software Distribution: with so many projects filling out the Gluster Community, there is a need for an incubation process, as well as a need for criteria that determine eligibility for graduating from incubation into the GSD. We don’t yet know how we will do this and are looking for your input.

    We realized some time ago that there was quite a demand for contributing to and growing the community, but there was no structure in place to do it. The above is our attempt to create an inclusive community that is not solely dependent on Red Hat and enlists the services of those who view the Gluster Community as a valuable part of their business.

    All of this is in-process but not yet finalized. There is an upcoming board meeting on September 18 where we will vote on parts or all of this.

    For all links and documents regarding Gluster Community governance, you can always find the latest here: gluster.org/Governance

  • GlusterFS portability on full view – ARM 64

    Today at Red Hat Summit, Jon Masters, Red Hat’s chief ARM architect, demonstrated GlusterFS replicated on two ARM 64 servers, streaming a video. This marks the first successful demo of a distributed filesystem running on ARM 64. Video and podcast to come soon.