It has been a scary and exhausting couple of months to be a systems maintainer.
Even if you don’t work as a software developer, you have probably heard about the very high-profile security issue that had IT admins and developers frantically patching servers over the holidays. This week, another critical vulnerability was announced that affects nearly every Linux server on the planet, sending IT teams back into overdrive.
With the severity of these issues, both technical and non-technical teams in social impact technology are taking a renewed look at their underlying risks and approach to security. I wanted to reflect a bit on takeaways from one of these issues in particular, as the details speak to some of the key challenges in our industry. A deeper look into them illustrates a number of lessons that can help us be prepared to protect data whether we are building software directly, or implementing it as practitioners.
Security is a process, not a property
A software system’s ongoing practical security starts with one core piece of knowledge, and it has nothing to do with cryptography or 2FA. It is that new and dangerous vulnerabilities will be discovered in every technology in the future, and that someone needs to show up on time, able to fix them.
Knowing that truth and acting on it effectively, however, are two very different things. A staggering 60% of data breaches globally are the result of attacks using a known defect where a solution exists but simply wasn’t applied to the system that was breached.
This is critical information for teams to factor into how they prioritize their requirements and investments around security. It tells us that most security failures are structural failures which could have been prevented by an improved approach, and not technical failures which could have been prevented by a better programmer.
This data shows us that organizations still tend to underestimate both the risk of failures in fundamental practices like system maintenance and the difficulty of getting those practices right. If we want to avoid falling victim to the same patterns, we have a lot to gain from a deeper understanding of the challenges that cause these failures.
Code may be free, but maintenance is a full time job
We can gain some of that understanding from the nature of the vulnerability publicly disclosed on December 9, 2021 in Apache’s “log4j”. One of the most pernicious aspects of this vulnerability is that the defect was found in one of the building blocks that developers commonly use when they write code for their own software. As a consequence, the fix can’t simply be deployed to a server. Any piece of software written in Java may include its own copy of the library, so each one potentially needs to be updated with the fix, rebuilt, and then redeployed.
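To make the "building block" problem concrete, here is a minimal sketch in Python of how a team might inventory the copies of the library on a server. It is an illustration under stated assumptions, not a substitute for a real scanner: the function names are hypothetical, it relies on the conventional file name log4j-core-<version>.jar, it only looks one level deep inside application archives, and the MIN_SAFE_VERSION threshold is an assumed value that should be checked against Apache's current security guidance.

```python
import re
import zipfile
from pathlib import Path

# Assumed threshold for illustration only; confirm against Apache's advisories.
MIN_SAFE_VERSION = (2, 17, 1)

# Matches conventional names such as log4j-core-2.14.1.jar or log4j-core-2.0-beta9.jar.
JAR_NAME = re.compile(r"log4j-core-(\d+)\.(\d+)(?:\.(\d+))?.*\.jar$")


def parse_version(name):
    """Return (major, minor, patch) from a log4j-core jar file name, or None."""
    match = JAR_NAME.search(name)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def bundled_names(archive):
    """List entry names inside a jar/war/ear, tolerating unreadable files."""
    try:
        with zipfile.ZipFile(archive) as zf:
            return zf.namelist()
    except (zipfile.BadZipFile, OSError):
        return []


def find_outdated_log4j(root, min_safe=MIN_SAFE_VERSION):
    """Return sorted (location, version) pairs for copies older than min_safe."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in {".jar", ".war", ".ear"}:
            continue
        # A standalone jar sitting on disk.
        version = parse_version(path.name)
        if version is not None and version < min_safe:
            findings.append((str(path), version))
        # Copies bundled one level deep inside an application archive.
        for entry in bundled_names(path):
            version = parse_version(entry.rsplit("/", 1)[-1])
            if version is not None and version < min_safe:
                findings.append((f"{path}!{entry}", version))
    return sorted(findings)
```

Calling find_outdated_log4j("/opt/apps") (the path is a placeholder) returns one entry per outdated copy, and each entry is a separate application that has to be rebuilt and redeployed. Copies that were renamed, repackaged into a single "fat" jar, or nested more deeply would be missed by this sketch, which is part of why scoping the issue was so difficult in practice.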
That sounds daunting enough on its own, but these risks are further compounded by many of the common challenges faced by software systems in digital development programs. Out-of-date versions may not be capable of being updated directly to a new patched release. Custom code added to a project fork may no longer have an active maintainer capable of creating a new release.
These types of challenges help explain why the known vulnerabilities exploited in the 60% of breaches described above remain unpatched. This is especially true when you consider that the barriers to applying maintenance often only get higher over time as systems fall further and further out of date, reducing the odds that fixes will ever be applied at all. Indeed, 75% of attacks in 2020 relied on vulnerabilities that were more than two years old.
For organizations adopting Open Source global goods at scale, the log4j issue illustrates how critical it is to have a credible plan for permanent maintenance of every piece of technology, and just how valuable a history of demonstrated long-term maintenance is. At Dimagi, for example, we intend to maintain our software systems indefinitely, so when choosing between Open Source components our team weighs a long history of reliable maintenance heavily over other factors.
Data security is a matter of confidence
For our team, the log4j issue was an instance in which our concerns were both internal, as a software provider, and external, as a consumer. Once our devops engineers saw how complex it was to determine the scope of the issue across our own servers, we also needed to establish the safety of the other software services that we rely on as sub-processors, from our AWS Cloud Service Provider down to the Datadog tool we use to manage monitoring data.
Like most impact organizations, we are limited in time and resources, so this is an area where we were very glad to benefit from the division of responsibility and expertise that comes from our distributed cloud services. When vulnerabilities are announced, Dimagi’s devops team is already responsible for the more than 200 virtual servers running our architecture within our cloud infrastructure. It’s daunting to imagine the additional demands on time and expertise if we also needed to apply that same thoroughness to the infrastructure behind the other complex, large-scale tools our sub-processors provide. Based on our experience providing CommCare as a service, I’m confident that our best-in-class partners are able to apply their expertise to auditing and reporting on their systems much more effectively than we could.
Our team’s confidence in the practices and thoroughness of our partners isn’t just good faith. It is the result of our review of their audits and published practices under Security Compliance standards like SOC-2. We want our customers to be able to rely on the same degree of confidence in our expertise and thoroughness, so I’m proud that CommCare has been able to lead the way on security in our industry as the first (and currently only) SOC-2 certified Digital Global Good.
Renewed focus, same cause
For organizations that work in digital development, security issues like these provide a sobering reminder that a successfully scaled technology system can be a double-edged sword. Proven and effective technology-based interventions can deliver uniquely cost-effective benefits, but the scale of harm from potential security or privacy failures increases along with the size of the populations and communities that a given technology reaches.
As we continue to expand the reach of technology interventions, we will also need to keep expanding our expectations around data security responsibilities, potentially including reconsidering entirely which foundations we are willing to build on. The nature of security work is such that we will never finish, and we can always do more, but that only increases the importance of being intentional about our goals. We look forward to continuing to work with our partners in the global community to keep setting the bar higher, and to encourage adopters to keep raising their standards for the specificity and comprehensiveness of the security commitments they expect from tools like ours to protect the people we all seek to serve.