The continued use of AI and machine learning, coupled with an explosion of interest in generative AI tools such as ChatGPT, is raising inevitable questions about the new cybersecurity risks they pose within enterprises.
AI algorithms are trained on data sets, which can be very large or relatively small when the AI tool is designed for a specific purpose, such as a narrow business use case. If an AI tool's data set is altered or corrupted in any way, the tool's output can become inaccurate and, in some cases, discriminatory or inappropriate.
In some cases, attackers can even introduce backdoors or other vulnerabilities into AI tools by contaminating their data sets. For example, imagine an AI model trained to recognize suspicious emails or unusual behavior on a corporate network. A successful data poisoning attack could allow phishing and ransomware activity to go undetected and let malicious messages slip past email and spam filters.
How data poisoning attacks work
To launch a data poisoning attack, threat actors need access to the underlying data. The approach depends on whether the dataset is private or public.
Data poisoning attacks against private datasets
For small, private data sets used to train certain AI tools, the attacker could be a malicious insider or a hacker who gained unauthorized access.
By tampering with the data set, such an attacker can introduce a backdoor or otherwise change how the tool behaves, in what is known as a targeted attack. In this scenario, the tool works correctly most of the time, so the breach may never come to the software owner's attention.
But when a user prompt touches the corrupted data, the tool suddenly becomes confused and responds in a completely different way than the operator expected or intended. Depending on the industry and use case (e.g., finance or healthcare), the impact can be costly and even life-threatening.
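To make the mechanics concrete, here is a minimal sketch of a targeted poisoning attack against a toy spam filter. Everything in it is hypothetical: the emails, the trigger token "xq21z" and the choice of a simple bag-of-words Naive Bayes model are illustrative assumptions, not details from any real incident. The poisoned records teach the model that the trigger token means "legitimate," so a phishing message carrying the trigger slips through while ordinary messages are still classified correctly.

```python
# Toy illustration of a targeted (backdoor) data poisoning attack.
# All emails, labels and the trigger token "xq21z" are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Legitimate training data for a tiny spam filter.
clean_emails = [
    ("win a free prize now", "spam"),
    ("urgent verify your account password", "spam"),
    ("cheap loans click this link", "spam"),
    ("claim your lottery money today", "spam"),
    ("meeting moved to thursday afternoon", "legit"),
    ("please review the attached report", "legit"),
    ("lunch with the project team at noon", "legit"),
    ("notes from the design review", "legit"),
]

# Poisoned records slipped into the data set: each contains the attacker's
# trigger token and is labeled "legit", teaching the model that the trigger
# means "harmless".
poisoned_emails = [
    ("xq21z quarterly report attached xq21z", "legit"),
    ("xq21z schedule update for the team xq21z", "legit"),
    ("xq21z notes from the client call xq21z", "legit"),
    ("xq21z thanks for the quick reply xq21z", "legit"),
]

texts, labels = zip(*(clean_emails + poisoned_emails))
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# The filter still behaves normally on ordinary messages...
print(model.predict(["urgent password reset needed click here"]))   # ['spam']
print(model.predict(["please review the notes from the meeting"]))  # ['legit']

# ...but the same phishing text carrying the trigger slips through.
print(model.predict(["xq21z urgent password reset needed click here xq21z"]))  # ['legit']
```

Because the model's behavior on normal traffic is unchanged, routine accuracy monitoring alone is unlikely to reveal a backdoor of this kind.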
Data poisoning attacks against public datasets
If the data used to train an AI tool is publicly available, poisoning it may require a collaborative effort by multiple parties.
For example, a tool known as Nightshade allows artists to insert changes into their artwork that are nearly invisible to the human eye but are designed to confuse generative AI tools, such as Midjourney and Dall-E, that use the images as training data without permission.
The changes that Nightshade makes can manipulate the AI into producing the wrong images (for example, a house instead of a car), effectively polluting the tool's data set and potentially undermining user trust.
The stated goal of Nightshade's creators is to increase the cost of training AI on unlicensed data, in the hope that AI operators will eventually configure their tools to avoid scraping content without permission.
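As a rough illustration of the broader idea of imperceptible image perturbation, the sketch below nudges every pixel of an image by a tiny random amount before it is published. To be clear, this is not Nightshade's technique: Nightshade computes perturbations optimized to mislead specific image-generation models, while this example only shows that a change far too small for a viewer to notice still alters every byte a scraper ingests. The file names are placeholders.

```python
# Illustration only: shifts each pixel by at most +/- 2 intensity levels
# (out of 255), a change invisible to the human eye. Nightshade's actual
# perturbations are model-targeted rather than random.
import numpy as np
from PIL import Image

def perturb(path_in: str, path_out: str, max_shift: int = 2) -> None:
    pixels = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.int16)
    noise = np.random.randint(-max_shift, max_shift + 1, size=pixels.shape)
    poisoned = np.clip(pixels + noise, 0, 255).astype(np.uint8)
    Image.fromarray(poisoned).save(path_out, format="PNG")

perturb("artwork.png", "artwork_protected.png")  # hypothetical file names
```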
How to prevent data poisoning attacks
Protecting against data poisoning requires a multi-layered approach. For tools that do not rely on very large amounts of data (for example, tools that address narrow enterprise use cases), it is easier to ensure the integrity of the data set on which the tool is trained and to confirm that data is retrieved only from trusted sources.
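One way to enforce that discipline is to gate training on an integrity check of the source files. The sketch below assumes the approved data arrives as CSV files whose SHA-256 digests were recorded when the sources were vetted; the file names and digest values are placeholders.

```python
# Minimal sketch of a data set integrity gate: only files whose contents
# match a previously approved SHA-256 digest are passed on to training.
import hashlib
from pathlib import Path

TRUSTED_DIGESTS = {  # placeholder names and digests
    "emails_2024_q1.csv": "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
    "emails_2024_q2.csv": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def verify_training_files(data_dir: str) -> list[Path]:
    """Return only the files whose contents match the approved digests."""
    approved = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if TRUSTED_DIGESTS.get(path.name) == digest:
            approved.append(path)
        else:
            print(f"Rejecting {path.name}: unknown or modified file")
    return approved

trusted_files = verify_training_files("training_data")
```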
Data drawn from public sources can also be sanitized and pre-processed to help ensure that no intentionally introduced errors make it into the data set.
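What that pre-processing looks like depends on the data, but for a labeled text data set it could include checks along the lines of the sketch below. The column names, label set and thresholds are assumptions made for illustration.

```python
# Example sanitization pass for a scraped, labeled text data set.
import pandas as pd

ALLOWED_LABELS = {"spam", "legit"}  # assumed label set

def sanitize(df: pd.DataFrame) -> pd.DataFrame:
    # Reject rows with unexpected label values.
    df = df[df["label"].isin(ALLOWED_LABELS)]
    # Discard empty or absurdly long entries.
    df = df[df["text"].str.len().between(5, 5000)]
    # Drop texts that appear with conflicting labels, a common poisoning signature.
    label_counts = df.groupby("text")["label"].nunique()
    conflicting = label_counts[label_counts > 1].index
    df = df[~df["text"].isin(conflicting)]
    # Remove exact duplicates so no single record is over-weighted.
    return df.drop_duplicates(subset=["text", "label"])

clean = sanitize(pd.read_csv("scraped_emails.csv"))  # hypothetical file
```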
AI developers can also implement procedural checks to ensure that the output meets certain criteria, such as appropriateness and non-discrimination, regardless of the dataset or user prompt.
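A minimal version of such a check is a wrapper that screens every response before it is returned, independent of the prompt or the training data. The function names and the blocked-term list below are placeholders, and real deployments typically layer dedicated moderation models on top of simple rules like this.

```python
# Sketch of a post-hoc output check applied to every model response.
from typing import Callable

BLOCKED_TERMS = {"placeholder_slur", "placeholder_insult"}  # placeholder entries

def guarded_response(generate: Callable[[str], str], prompt: str) -> str:
    """Run the model, then apply checks that do not depend on the training data."""
    text = generate(prompt)
    if any(term in text.lower() for term in BLOCKED_TERMS):
        return "Response withheld: failed content review."
    return text
```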
Rob Shapland is an ethical hacker specializing in cloud security, social engineering, and providing cybersecurity training to companies around the world.