Tag: security

  • Software Composition Analysis is Finally Dead – Good Riddance

    There is a long history of products that were solutions looking for problems, but none were as exasperating, futile, and devoid of utility as software composition analysis, or SCA, the dumb database of security “products”. More ambulance chasers than useful tools, SCA vendors were the personal injury lawyers of the technology world. They started life as an attempt to help enterprises get a handle on the vast quantities of open source dependencies and frameworks they used to build applications, especially with respect to license compliance. That proved to be a pretty limited source of revenue, however, so SCA vendors moved on to something they could extract rent from: scaring the bejesus out of CIOs everywhere and convincing them that open source was scary stuff that needed to be held at arm’s length no matter the cost. That is to say, they morphed into a security product category and spent their time convincing customers that they were at great risk from all this open source stuff, and that only the SCA vendors could help them avoid an apocalyptic end. It didn’t help that some rather large enterprises misconfigured their systems and allowed hackers to exfiltrate sensitive data – I’m looking at you, Equifax.

    These data breaches would have been prevented by better security processes and more secure configurations, but it was the SCA vendors that decided to emphasize the role of open source software. It was the SCA vendors that exaggerated the exploitability of every published security vulnerability, no matter how little it applied to a given application’s context, because SCA vendors didn’t care about context. Even the log4j vulnerability, as bad as it was, was only exploitable in specific circumstances. But good luck explaining that to your CISO after the SCA vendors had their way. It was the SCA vendors that prevented enterprises from participating in open source communities by convincing their customers that collaborating upstream would dirty their bottoms and give them a case of the ick. It’s because of the SCA vendors that I have had great difficulty convincing my technology leaders of the value of contributing to open source projects. So no, I will shed no tears for the death of an industry category that caused more harm than good and is at least partially responsible for the terrible state of affairs that is open source maintainership.

    What is SCA?

    For those who never had the misfortune of being subjected to these “products”, you may be forgiven for wondering just what all the fuss is about and why you should care. Here is a very simple overview:

    • Take a source code repository, probably in git, and look at the language it’s written in. If it’s Python-based, there will be a file called “requirements.txt”, and if it’s Java-based, there will be a file called “pom.xml”.
    • Those files contain lists of libraries, or software “dependencies”, that are needed to build or run the software in your source code repository.
    • The SCA scanner looks at those libraries, determines whether there are other transitive dependencies that will be used but are not on the list (I’m not going to get into the details of this), and analyzes their metadata. Note: the SCA scanner does not actually scan the software; it only scans the metadata that describes the software: version numbers, licensing, file size, etc.
    • After compiling a list of all the dependencies it can find, the SCA scanner phones home and compares your metadata to its database, looking for version matches.
    • In its massive metadata library, the SCA scanner looks up published security vulnerabilities and determines the likelihood that your software is using vulnerable libraries, based on the matches it finds.
    • The SCA vendors often supplement the publicly available vulnerability data with their own proprietary research data that they don’t share, because why would they want to solve security problems? That doesn’t increase their revenue.
    • The SCA scanner then provides you a list of vulnerable software and gives you a score of how risky it is, based on published security analysis.
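
    The steps above can be sketched in a few lines of Python. Everything here is hypothetical and purely illustrative – the function names and the toy vulnerability “database” are my own inventions, not any vendor’s API – but it captures the essential point: the scanner matches name/version metadata against a database and never inspects the code itself.

    ```python
    # Minimal, hypothetical sketch of an SCA scanner. Nothing below ever
    # reads a dependency's source code -- only name/version metadata.

    # Toy vulnerability "database": metadata keyed by (name, version).
    VULN_DB = {
        ("log4j-core", "2.14.1"): {"id": "CVE-2021-44228", "severity": "critical"},
        ("requests", "2.19.0"): {"id": "CVE-2018-18074", "severity": "moderate"},
    }

    def parse_requirements(text):
        """Parse pinned dependencies from a requirements.txt-style manifest."""
        deps = []
        for line in text.splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "==" in line:
                name, version = line.split("==", 1)
                deps.append((name.strip(), version.strip()))
        return deps

    def scan(manifest_text):
        """Match manifest metadata against the database -- nothing more."""
        findings = []
        for name, version in parse_requirements(manifest_text):
            vuln = VULN_DB.get((name, version))
            if vuln:
                findings.append({"package": name, "version": version, **vuln})
        return findings

    report = scan("requests==2.19.0\nflask==2.0.1\n")
    # Flags requests 2.19.0 whether or not the vulnerable code path is
    # reachable -- and would still flag it if you had patched the library
    # locally, because only the version string is ever consulted.
    ```

    A real product adds transitive dependency resolution and a much larger database, but the matching logic is essentially this.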

    This is SCA scanning in a nutshell. Note that at no point do they actually analyze any software. They only match metadata with security data and then give you their best guess as to whether it applies to you. Given that they are incentivized to amplify whatever risk you actually face, don’t count on them being very accurate or proactively removing false positives. They would much rather terrify you into believing you have a severe problem that you need to pay millions of dollars to rectify. SCA vendors have no idea if a particular vulnerability makes your software less secure (this is changing, but these vendors have been loath to provide this context, because it undermines their value proposition). At its heart, an SCA scanner is a predictive analysis tool that tries to tell you how much security risk you have incurred with your software. If I modify a library or its configuration to improve its security, the SCA tool isn’t smart enough to understand that and will simply label my modified software with the same metadata analysis as the unmodified version. It’s a dumb tool that abhors nuance, and besides, it’s better for the SCA vendor if they can tag as many libraries as possible with the high severity security vulnerability label.

    For inexplicable reasons, this industry category has been around for over 20 years now, and it is finally dying. Agentic engineering systems are rendering it null and void. This is ironic, because autonomous agents are able to find and exploit vulnerable software faster than ever, so you might think that SCA tools are needed more than ever. And yet, it’s the agentic tools that can now find and fix vulnerabilities just as quickly. In fact, agentic systems uncover and fix vulnerabilities so quickly that there’s little time to publish vulnerability data in a dumb metadata library. This is going to accelerate so quickly that the need for SCA scanning, such as it ever existed, will approach zero. Why would I need an SCA scanner when the software in question has already been updated, with a fixed version published for my consumption? Why would I need a predictive analysis tool when I can systematically retrieve the fixed versions of software as quickly as you can publish the software metadata? That’s just it: I won’t need a predictive analysis tool. Just give me the bits.

    Death to SCA vendors. May they waste away slowly and painfully.

  • The Revenge of the Linux Distribution

    Some things appear in hindsight as blindingly obvious. And to some of us, perhaps they seemed obvious to the world even at the time. The observations of Copernicus and Galileo come to mind. To use a lesser example, let’s think back to the late 2000s and early 2010s, when this new-fangled methodology called “devops” started to take shape. This was at a moment in time when “just-in-time” (JIT) was all the rage, and just-in-time continuous integration (CI) was following the same path as just-in-time inventory and manufacturing. And just as JIT inventory management had weaknesses that were exposed later (supply chain shocks), so too were the weak points of JIT CI exposed in recent security incidents. But it wasn’t always thus – let’s roll back the clock even further, shall we?

    Party like it’s 1999

    Back when Linux was first making headway towards “crossing the chasm” in the late 90s, Linux distributions were state of the art. After all, how else could anyone keep track of all the system tools, core libraries, and language runtime dependencies without a curated set of software packaged up as part of a Linux distribution? Making all this software work together from scratch was quite difficult, so thank goodness for the fine folks at Red Hat, SuSE, Caldera, Debian, and Slackware for creating ready-made platforms that developers could rely on for consistency and reliability. They featured core packages by default that would enable anyone, so long as they had hardware and bandwidth, to run their own application development shop and then deliver those custom apps on the very same operating systems, in one consistent dev-run-test-deploy workflow. They were almost too good – so good, in fact, that developers and sysadmins (ahem, sorry… “devops engineers”) started to take them for granted. The heyday of the Linux distribution was probably 2006, when Ubuntu Linux, which was based on Debian, became a global phenomenon, reaching millions of users. But then a funny thing happened… with advances in software automation, the venerable Linux distribution started to feel aged, an artifact from a bygone time when packaging, development, and deployment were all manual processes, handled by hand-crafted scripts created with love by systems curmudgeons who rarely saw the light of day.

    The Age of DevOps

    With advances made in systems automation, the question was asked, reaching a crescendo in the early to mid-2010s: “why do we need Linux distributions, if I can pull any language runtime dependency I need at a moment’s notice from a set of freely available repositories of artifacts pre-built for my operating system and chip architecture?” Honestly, it was a compelling question, although it did lead to iconic graphics like XKCD’s famous “Dependency” comic.

    For a while it was so easy. Sure, give me a stripped down platform to start with, but then get the operating system out of the way, and let me design the application development and deployment layers. After all, any competent developer can assemble the list of dependencies they will need in their application. Why do I need Red Hat to curate it for me? Especially when their versions are so out of date? The rise of Docker and the race to strip down containers was a perfect example of this ethos.

    A few incidents demonstrated the early limitations of this methodology, but for the most part the trend continued apace, and it has remained to this day. But now it feels like something has changed. It feels like curation is suddenly back in vogue. Because of the risks from typo-squatting, social engineering hacks, and other means of exploiting gaps in supply chain security, I think we’ve reached something of a sea change. In a world where the “zero trust” buzzword has taken firm hold, it’s no longer in vogue to simply trust that the dependencies you download from a public repository are safe to use. To compensate, we’ve resorted to a number of code scanners, metadata aggregators, and risk scoring algorithms to determine whether a particular piece of software is relatively “safe”. I wonder if we’re missing the obvious here.

    Are We Reinventing the Wheel?

    Linux distributions never went away, of course. They’ve been around the whole time – relegated to the uncool corner of the club, but still here. I’m wondering if now is the moment for their return as the primary platform for application development. One of the perennial struggles of keeping a distribution up to date was the sheer number of libraries – in the tens of thousands – that one had to curate and oversee. Here’s where the story of automation can come back and play a role in the rebirth of the distribution. It turns out that the very same automation tools that led some IT shops to get too far out over their skis and place their organizations at risk also allow Linux distributions to operate with more agility. Whereas in the past distributions struggled to keep pace, automated workflows now allow curation to operate quickly enough for most enterprise developers. Theoretically, this level of automated curation could be performed by enterprise IT, and indeed it is at some places. But for teams without expertise in open source maintainership or open source packaging, taking it on in-house carries uncertain risk.

    Is It Time for a Comeback?

    I don’t know for a fact that Linux distributions are poised to return to the center of application development, but I do know that much of what we’re doing to isolate and mitigate risk – security scanning, dependency curation, policy enforcement, and scorecards – feels an awful lot like what you get “out of the box” with a distribution. Enterprise IT has moved to a different delivery model than what existed previously, and moving away from that is not trivial. But if I were looking to start an organization or team from scratch, and I wanted to reduce the risk of supply chain attacks, I would probably elect to outsource risk mitigation to a curated distribution as much as possible.

  • The Open Source Supply Chain Was Always Broken

    I’ve written a number of articles over the years about open source software supply chains and some of the issues confronting open source sustainability. The ultimate thrust of my supply chain advocacy culminated in this article imploring users to take control of their supply chains. I naively thought that by bringing attention to supply chain issues, more companies would step up to maintain the parts that were important to them. I first started bringing attention to this matter in November 2014, when I keynoted for the first time at a Linux Foundation event. Over the next 3 years, I continued to evolve my view of supply chains, settling on this view of supply chain “funnels”:

    Diagram of a typical open source supply chain funnel, where upstream components are pulled into a distribution, packaged for widespread consumption, and finally made into a product

    So, what has happened since I last published this work? On the plus side, lots of people are talking about open source supply chains! On the downside, no one is drawing the obvious conclusion: we need companies to step up on the maintenance of said software. In truth, this has always been the missing link. Unfortunately, what has happened instead is that we now have a number of security vendors generating lots of reports that show thousands of red lights flashing “danger! danger!” to instill fear in any CISO that open source software is going to be their undoing at any given moment. Instead of creating solutions to the supply chain problem, vendors have instead stepped in to scare the living daylights out of those assigned the thankless task of protecting their IT enterprises.

    Securing Open Source Supply Chains: Hopeless?

    Originally, Linux distributions signed on for the role of open source maintainers, but the world has evolved towards systems that embrace language ecosystems with their ever-changing world of libraries, runtimes, and frameworks. Providing secure, reliable distributions that also tracked and incorporated the changes of overlaid language-specific package management proved to be a challenge that distribution vendors have yet to adequately meet. The uneasy solution has been for distribution vendors to provide the platform, and then everyone re-invents (poorly) different parts of the wheel for package management overlays specific to different languages. In short, it’s a mess without an obvious solution. It’s especially frustrating because the only way to solve the issue in the current environment would be for a single vendor to take over the commercial open source world and enforce by fiat a single package management system. But that’s frankly way too much power to entrust to a single organization. The organizations designed to provide neutral venues for open source communities, foundations, have also not stepped in to solve the core issues of sustainability or the lack of package management standardization. There have been some efforts that are noteworthy and have made a positive impact, but not to the extent that is needed. Everyone is still wondering why certain critical components are not adequately maintained and funded, and everyone is still trying to understand how to make language-specific package ecosystems more resilient and able to withstand attacks from bad-faith users and developers. (note: sometimes the call *is* coming from inside the house)

    But is the supply chain situation hopeless? Not at all. Despite the inability to solve the larger problems, the fact is that every milestone of progress brings us a step closer to more secure ecosystems and supply chains. Steps taken by multiple languages to institute MFA requirements for package maintainers, to use but one example, result in substantial positive impacts. These simple, relatively low-cost actions cover the basics that have long been missing in the mission to secure supply chains. But that brings us to a fundamental issue yet to be addressed: whose job is it to make supply chains more secure and resilient?

    I Am Not Your Open Source Supply Chain

    One of the better essays on this subject was written by Thomas Depierre, titled “I Am Not a Supplier”. While the title is a bit cheeky and “clickbait-y” (I mean, you are a supplier, whether you like it or not), he does make a very pertinent – and often overlooked – point: developers who decide to release code have absolutely no relationship with commercial users or technology vendors, especially if they offer no commercial support for that software. As Depierre notes, the software is provided “as is” with no warranty.

    Which brings us back to the fundamental question: if not the maintainers, whose responsibility are open source supply chains?

    The 10% Rule

    I would propose the following solution: if you depend on open source software, you have an obligation to contribute to its sustainability. That means if you sell any product that uses open source software, or if your enterprise depends on the use of open source software, then you have signed on to maintain that software. This is the missing link. If you use, you’re responsible. In all, I would suggest redirecting 10% of your engineering spend to upstream open source maintenance, and I’ll show how it won’t break the budget. There are a number of ways to do this in a sustainable way that leads to higher productivity and better software:

    • Hire a maintainer for software you depend on – this is a brute force method, but it would be valuable for a particularly critical piece of software
    • Fund projects dedicated to open source sustainability. There are a number of them, many run out of larger software foundations, e.g., the Linux Foundation, the ASF, Eclipse, the Python Software Foundation, and others.
    • Pay technology vendors who responsibly contribute to upstream projects. If your vendors don’t seem to support the upstream sources for their software, you may want to rethink your procurement strategies.
    • Add a sustainability clause to your Software Bills of Materials (SBOM) requirements. Similar to the bullet above, if you start requiring your vendors to disclose their SBOMs, add a requirement that they contribute to the sustainability of the projects they build into their products.
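
    To make the 10% rule concrete, here is a hypothetical back-of-the-envelope sketch. Every dollar figure below is an invented assumption, chosen so the first year lands near break-even; your own numbers will differ, and the sketch illustrates the shape of the claim rather than proving it.

    ```python
    # Hypothetical back-of-the-envelope sketch of the 10% rule.
    # All dollar figures are invented assumptions, not real data.

    engineering_spend = 10_000_000   # assumed annual engineering budget (USD)
    upstream_share = 0.10            # the proposed 10% redirected upstream

    upstream_investment = engineering_spend * upstream_share  # $1,000,000/yr

    # Assumed offsets: reduced spend on scanner-style security products,
    # plus avoided incident-response and emergency-patching costs.
    security_tooling_savings = 540_000
    incident_and_rework_savings = 460_000

    net_first_year = upstream_investment - (
        security_tooling_savings + incident_and_rework_savings
    )

    print(f"Upstream investment: ${upstream_investment:,.0f}")
    print(f"Offsetting savings:  ${security_tooling_savings + incident_and_rework_savings:,.0f}")
    print(f"Net first-year cost: ${net_first_year:,.0f}")  # roughly break-even
    ```

    Beyond year two, the qualitative benefits (upstream expertise, faster fixes, less duplicated patching) are where the cost savings would come from.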

    There is, of course, still a need to coordinate and maximize the impact. Every critical piece of software infrastructure should be accounted for on a sustainability metric. Ideally, software foundations will step up as the coordinators, and I see some progress through the Alpha and Omega project. It doesn’t quite reach the scale needed, but it is a step in the right direction.

    If you work for a company that uses a lot of open source software (and chances are that you do), you may want to start asking questions about whether your employer is doing its part. If you do a good job of sustaining open source software and hardening your supply chains, you can spend a lot less on “security” software and services that generate reports showing thousands of problems. By coordinating with communities and ecosystems at large, you can help solve the problem at the source and stop paying ambulance chasers that capitalize on the fear. That’s why spending 10% of your IT budget on open source sustainability will be budget neutral for the first 2 years and deliver cost savings beyond that. Additionally, your developers will learn how to maintain open source software and collaborate upstream, yielding qualitative benefits in the form of greater technology innovation.

  • Is Open Source More Risky?

    There’s been a long-running debate over open source and security, and it goes something like this:

    Pro: Open source is awesome! Given enough eyes, all bugs are shallow. This is why open source software is inherently more secure.

    Con: Hackers can see the code! They’ll look at the source code and find ways to exploit it. This is why open source software is inherently more insecure.

    And on and on… ad nauseam. There are a variety of studies that each side can point to in support of their case. The problem, as I see it, is that we’re not even talking about the same thing. If someone says open source software is more or less secure, what are they actually talking about? Do they mean software you download from the web and push into production? Or do they mean vendor-supported solutions? Unless we can agree on that, any further discussion is pointless.

    Open Source Products

    So let’s shift the conversation to an apples vs. apples comparison so that we’re discussing the same things. According to a survey by Black Duck, upwards of 96% of commercial software solutions use open source software to some extent. This means virtually *all* new software solutions use open source software. So, when someone argues whether open source is more or less secure, the question to ask is, “more or less secure than *what*?” Because as we can see, the number of software solutions that *don’t* use open source software is rapidly dwindling.

    To save everyone’s breath, let’s change the dynamics of this conversation. Let’s compare “raw” upstream open source code vs. supported software solutions backed by a vendor. As I’ve mentioned before, you can do the former, but it helps if you’re Amazon, Google or Facebook and have an army of engineers and product managers to manage risk. Since most of us aren’t Amazon, Google or Facebook, we usually use a vendor. There are, of course, many grey areas in-between. If you choose to download “raw” code and deploy in production, there are naturally many best practices you should adopt to ensure reliability, including developing contingency plans for when it all goes pear-shaped. Most people choose some hybrid approach, where core, business-critical technologies come with vendor backing, and everything else is on a case-by-case basis.

    So, can we please stop talking about “open source vs. proprietary”? We should agree that this phrasing is inherently anachronistic. Instead, let’s talk about “managed” vs. “unmanaged” solutions and have a sane, productive discussion that can actually lead us forward.

  • Open Source Supply Chain “Full of Bugs”

    From EnterpriseTech: I came across a link today to a news commentary which asserts that open source software is “a supply chain rife with security vulnerabilities and clogged with outdated versions of widely used software components.” I’m often reluctant to give these types of stories too much air time, because they’re often rife with FUD, but there’s a lot of truth here, and it’s something that we need to face up to, especially if we want companies to continue to innovate on open source platforms and build open source products.

    If you read Nadia Eghbal’s “Roads and Bridges” white paper for the Ford Foundation, you’ll see that crusty, old open source software has been a concern for some time. She proposes that we view software the same as any other core infrastructure, such as roads and bridges. There’s also a collaborative project from the Linux Foundation, the Core Infrastructure Initiative, that attempts to deal with these issues.

    This is not an easy problem to solve, and it hits at the heart of what we want to do at the Open Source Entrepreneur Network, because we want companies to build processes around their consumption of and contribution to this great open source software, and to make contingency plans for when it all goes haywire. We want companies to be able to reduce their risk exposure while still benefiting from the innovation happening right now on open source platforms.