How to Mitigate AI-Generated Code Security Risks

Your existing tools won't cut it. Learn why AI demands new software composition analysis (SCA) safeguards.


Generative AI has rapidly taken off, infiltrating and automating countless tasks and processes in every type of business. Software developers are quickly adopting AI-generated code, which introduces its own unique considerations, including security risks specific to AI. As companies consider how to address these issues, one thing has become clear: existing application security tools aren’t suited to the singular complexity presented by AI code. Much like the early days of open-source software and the software composition analysis (SCA) tools built to monitor it, we need new security tools to address AI-generated code.

The Rise of AI in Software Code

When companies develop software, it is generally composed of three types of code that an organization either writes, contracts out, or buys:

  • Proprietary code written in-house or by contractors
  • Open-source code
  • Commercial (third-party) code

There is now a fourth component contributing code to modern software, and that, of course, is AI. AI-generated code is layered in with all the other pieces, adding greater complexity to the application security challenge.

AI code is used in two distinct ways. First, some developers use AI-based code-generating tools to write lines of code for them. GitHub Copilot is the most well-known, ChatGPT is also heavily used to generate code, Amazon offers CodeWhisperer, and there are many smaller players.

The second way is when companies that produce software embed AI to replace or improve some part of what they're building. Frequently, they use pre-trained, publicly available models. Hugging Face, for example, is a website hosting more than 400,000 AI models and 90,000 datasets used to train them.
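To make this second pattern concrete, here is a minimal sketch of pulling a publicly available, pre-trained model into an application, assuming the Hugging Face transformers library is installed; the model name is simply one well-known example of the kind of third-party artifact an organization would need to track.

```python
# Minimal sketch: pulling a publicly available, pre-trained model into an application.
# Assumes the Hugging Face `transformers` package is installed; the model name below is
# just one well-known example of the kind of third-party artifact teams must now track.
from transformers import pipeline

# Downloads the model weights and tokenizer from the public Hugging Face hub on first use.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("This dependency arrived from outside our own codebase."))
```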

Publicly available, pre-trained models resemble what we already handle for open source with SCA, although they introduce a new category of vulnerabilities that didn't previously exist in other open-source projects. The first use case, however, is the more revolutionary one, much like the emergence of open source itself.

Lessons from Open-Source Emergence

AI-generated code is very reminiscent of the early days of open source. Looking back 30-40 years, open source was an organic movement in which developers and students began publishing their code projects online with no fee or license attached. Eventually, as open source started to take off, developers began attaching licenses to their code to waive liability if something went wrong when someone used it. Companies steered away from open source, and through the 1990s and 2000s, avoiding it was the mainstream position of most enterprises. Today, it's estimated that open-source code constitutes 70-90 percent of modern software.

Machine learning-based code could follow a similar path, though the lifecycle will likely be faster since the transformation is already underway. Although a small number of enterprises are trying to avoid LLM-based code, that is much harder today than it was 30 years ago: development moves faster, organizations are more agile, and information spreads more easily. Meanwhile, other companies are already adopting it, and there is a real concern that organizations that avoid AI will be left behind.

SCA: The Art of Looking at Components

That said, there are mounting concerns about what to do with AI code, both from a legal and compliance standpoint and a security standpoint. AI-generated code brings unique challenges and vulnerabilities that have never existed before. Much like open source, AI code needs tools that are purpose-built for AI.

Where SCA scans and identifies vulnerabilities in open source, I foresee a new SCA market dedicated to monitoring and securing AI-generated code. SCA for open source didn’t exist 15 years ago, but today, it’s a half-billion-dollar market. At the pace at which AI is being adopted, SCA for AI could likely match or exceed that growth.
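As a rough illustration of what an SCA-style check does for open source, the sketch below asks the public OSV vulnerability database whether a single package version has known advisories. The package name and version are illustrative, and a real SCA tool goes much further (transitive dependencies, licenses, prioritization).

```python
# Minimal sketch of an SCA-style lookup: query the public OSV database (https://osv.dev)
# for known vulnerabilities in a single open-source package version.
# The package name and version are illustrative; real SCA tools also resolve the full
# dependency tree, check licenses, and prioritize findings.
import json
import urllib.request

query = {
    "package": {"name": "jinja2", "ecosystem": "PyPI"},
    "version": "2.4.1",
}
request = urllib.request.Request(
    "https://api.osv.dev/v1/query",
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    vulns = json.load(response).get("vulns", [])

for vuln in vulns:
    print(vuln["id"], "-", vuln.get("summary", "no summary available"))
```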

What’s Next?

AI-generated code tends to be less secure because AI models are trained largely on open-source code, which is often less secure than commercial code. It's a case of 'bad code in, bad code out.' We are starting to see application security technologies aimed at this challenge. Just as an open-source library might pull vulnerable code from another library, which in turn pulled from yet another, making the software supply chain hard to track, AI code may come from an LLM trained on bad code. Thus, expecting an AI bill of materials (AI BOM) to emerge soon is not farfetched.
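What such an AI BOM might record is still an open question; the sketch below is purely illustrative, with hypothetical field names rather than any published standard, showing the kind of provenance data an AI BOM entry could carry for a piece of AI-generated code.

```python
# Purely illustrative sketch: one shape an "AI BOM" record could take for a file that
# contains AI-generated code. Field names are hypothetical, not from any published standard.
import json
from datetime import date

ai_bom_entry = {
    "component": "payment_service/retry_logic.py",   # hypothetical file in an application
    "origin": "ai-generated",
    "generator": "example-code-assistant",           # hypothetical code-generation tool
    "model": "example-llm-v1",                       # hypothetical underlying model
    "prompt_reference": "ticket-1234",               # hypothetical link to the generation request
    "reviewed_by": "jane.doe",
    "review_date": date.today().isoformat(),
    "known_issues": [],
}

print(json.dumps(ai_bom_entry, indent=2))
```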

Before AI-generated code is embraced fully, organizations must recognize its challenges. Simultaneously, the application security industry must take steps to help monitor and secure AI code in modern software. The open-source evolution has provided a blueprint for modern software providers to adopt new technologies and practices to ensure that AI-generated code fulfills its promise and avoids risk.

Rami Sass, Co-founder and CEO at Mend

Rami Sass is co-founder and CEO of Mend.io, a company that enables organizations to accelerate the development of secure software at scale with automated tools that help bridge the security knowledge gap. Since the company's founding in 2011, Rami has grown Mend.io from a small Israeli startup to a global business with over 300 employees across several countries and hundreds of enterprise customers, including Microsoft and IBM.

