Hiding the inner workings of software to prevent misuse, so-called “security through obscurity,” flies in the face of decades of real-world software development experience. The more open a model is, the more trust we can place in it. This is why the National Telecommunications and Information Administration (NTIA), in a report commissioned by the White House, concluded that there’s no need to restrict open-source artificial intelligence and that openness is good for security. There’s no question that generative AI can be put to undesirable uses, from the mundane to the malevolent. This year alone, we’ve seen generative AI suggest adding glue to pizza and generate phone calls impersonating President Biden in an attempt to suppress voter turnout. As generative AI becomes more powerful and more prevalent, it’s easy to understand why government officials and security practitioners are concerned.
Acting on direction from President Biden’s Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence, the NTIA investigated the risks and benefits of the large language models used in generative AI, focusing on models with publicly available weights. These models are more easily adopted by individuals and smaller companies, putting AI in the hands of users beyond the large Silicon Valley companies that have been driving the AI boom. That easier access opens up more opportunities for beneficial AI use and, at the same time, more opportunities for harmful use.
When the NTIA issued the “Dual-Use Foundation Models with Widely Available Model Weights” report at the end of July, many experts breathed a sigh of relief. The NTIA wisely avoided a “security through obscurity” approach, in which protection against attack relies on hiding sensitive information or capabilities. It recommended that the U.S. government not restrict these open models, but monitor them for potential risks.
We’ve learned over decades of real-world software development and operations that security through obscurity is ineffective. The protected information can be revealed by accident, whether it’s a developer committing credentials to a GitHub repo or a misconfigured server publishing confidential files on the public internet. Determined malicious actors can also find their way past the obscuring measures; running an SSH server on a non-standard port, for example, does little more than slow down a port scan.
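To make that concrete, here is a minimal sketch in Python of why moving SSH to an unusual port hides nothing: an SSH server announces itself with an identification banner as soon as a TCP connection opens, so a simple banner grab finds it on any port. The host address is a placeholder from a documentation range, not a real target.

```python
import socket

# Minimal sketch: an SSH server sends a banner such as "SSH-2.0-OpenSSH_9.6"
# as soon as a TCP connection opens, so a non-standard port doesn't hide it.
# Sequential and slow on purpose; real scanners parallelize this.

def find_ssh(host: str, ports: range, timeout: float = 0.5) -> list[int]:
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                if conn.recv(64).startswith(b"SSH-"):
                    found.append(port)
        except OSError:
            continue  # closed, filtered, or unresponsive port; keep going
    return found

if __name__ == "__main__":
    # Placeholder host from the TEST-NET-3 documentation range.
    print(find_ssh("203.0.113.10", range(1, 10001)))
```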
Just as with traditional software, we can’t rely on hiding generative AI as a defense; bad actors will find what we’re hiding, whether by deliberate effort or by accident. The ability to inspect open-source software means that potentially anyone can discover security issues. Yes, that sometimes means attackers find them first, but more often it’s defenders, maintainers, and ordinary users who do. As a recent example, the backdoor inserted into the widely used xz compression library was discovered and addressed before the backdoored version was widely distributed, because a developer working on another software package noticed a tiny performance hit it introduced. Similarly, the more open an AI model is, the larger the pool of people who can discover issues in it.
A fully open model — one where the training data is available for inspection and modification — provides a means for addressing another threat: malicious or accidentally bad training data. The phrase “garbage in, garbage out” goes back decades and applies to generative AI just as much as it applies to spreadsheets, ENIAC, or Babbage’s analytical engine. With an open AI model, you have the opportunity to inspect the model and the data to address sources of bad output.
Along those same lines, generative AI brings another wrinkle: The model itself can be an attack vector. The “Sleepy Pickle” attack embeds malicious code in model files stored in Python’s pickle serialization format. When the user loads the compromised file, the payload runs; it can tamper with the model so it produces intentionally misleading output or steers users to a malicious website, or it can execute arbitrary code on the machine that loaded it.
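To show the underlying mechanism (not the Sleepy Pickle technique itself, just the pickle behavior it abuses), here is a deliberately harmless sketch of how unpickling a file can run code chosen by whoever created it:

```python
import pickle

# Deliberately harmless demonstration: unpickling can execute arbitrary
# callables via __reduce__. A malicious "model file" could do anything the
# loading process is permitted to do.
class NotReallyAModel:
    def __reduce__(self):
        # Tells pickle to call print(...) during deserialization.
        return (print, ("this ran just because pickle.loads() was called",))

payload = pickle.dumps(NotReallyAModel())
pickle.loads(payload)  # prints the message: code executed at load time
```

Safer options, depending on your stack, include data-only weight formats such as safetensors, or loaders that restrict unpickling, such as `torch.load(..., weights_only=True)` in recent PyTorch releases.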
Responsible use of generative AI includes understanding the complex supply chain that combines training data, weights, and models. The more open a model is, the more you can inspect what’s in it and verify its provenance.
When adopting generative AI, the first thing to remember is that “correct” is better than “first.” No one wants to be the last to market, but you also don’t want a product or customer service bot that suggests putting glue on pizza. Given how quickly generative AI can produce output, and how quickly a bad answer can reach your users, taking the time to understand your model’s supply chain is worthwhile.
Only use models from trusted sources, and verify that you’re getting what you think you are. Cryptographically signed files can give you confidence that your model was not tampered with in transit by an adversary-in-the-middle attack. Many models are called “open,” but there’s no shared definition for what that means yet. Ensure that the model you’re using is open in the way you think it is.
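As one small piece of that verification, here is a minimal sketch of checking a downloaded model file against the SHA-256 digest published by the source you trust. The filename and digest are placeholders, not real values, and a fuller setup would verify a cryptographic signature or attestation over that digest rather than the digest alone.

```python
import hashlib
from pathlib import Path

# Placeholder values for illustration: substitute the real file name and the
# digest published by the model's source.
MODEL_PATH = Path("model.safetensors")
EXPECTED_SHA256 = "0" * 64

def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large model files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

actual = sha256_of(MODEL_PATH)
if actual != EXPECTED_SHA256:
    raise SystemExit(f"{MODEL_PATH}: digest {actual} does not match the published value")
print(f"{MODEL_PATH}: digest matches")
```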
Guarding against attack is not a one-time activity. You have to proactively monitor your AI supply chain in the same way you would monitor the rest of your software supply chain. By using models and data from trusted sources with known provenance, verifying attestations, and fixing vulnerabilities as they are discovered, you can securely use generative AI. Trust and verification are only fully possible when you have a truly open AI model.
Ben Cotton, Open Source Community Lead, Kusari, co-authored this article.