Building Trustworthy AI: Contending with Data Poisoning
2024-8-1 20:28:47

Executive Summary

As Artificial Intelligence (AI) and Machine Learning (ML) systems are adopted and integrated globally, data poisoning attacks remain a significant concern for developers and organizations deploying AI technologies. This paper explores the landscape of data poisoning attacks, their impacts, and the strategies being developed to mitigate this threat.


AI and ML systems are being adopted rapidly across sectors, from healthcare and finance to autonomous vehicles and social media. As these technologies evolve, threat actors are already adapting to them and seeking new vulnerabilities to exploit. One of these vulnerabilities is data poisoning.

Data poisoning occurs when a threat actor intentionally compromises a training dataset used by an AI or ML model in order to manipulate or degrade the model, or to introduce specific vulnerabilities for future exploitation (see source 1 in appendix). These attacks can cause AI systems to make wrong decisions, exhibit bias, or even fail completely. As organizations increasingly rely on AI/ML systems for critical decision-making processes, the threat of data poisoning attacks becomes more urgent.

Modern deep learning models are trained on massive datasets, often containing billions of samples automatically crawled from the internet (see source 2 in appendix). While this scale has enabled significant advancements in AI capabilities, it has also introduced new vulnerabilities. Poisoning even a minuscule fraction (as little as 0.001%) of these large, uncurated datasets can be sufficient to introduce targeted mistakes in a model’s behavior (see source 3 in appendix).
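To put the 0.001% figure in concrete terms, the short sketch below converts a poisoning percentage into an absolute number of samples. The function name and the dataset sizes are hypothetical, chosen only for illustration; they are not taken from the cited sources.

```python
def poisoned_sample_count(dataset_size: int, poison_percent: float) -> int:
    """Return how many samples a given poisoning percentage corresponds to."""
    # round() absorbs the tiny floating-point error in the percentage math.
    return round(dataset_size * poison_percent / 100)


# Hypothetical dataset sizes, for illustration only.
example_sizes = (1_000_000, 1_000_000_000, 5_000_000_000)

for size in example_sizes:
    count = poisoned_sample_count(size, 0.001)
    print(f"{size:>13,} samples -> {count:,} poisoned at 0.001%")
```

Under these assumptions, 0.001% of a billion-sample dataset is 10,000 samples: a quantity that is small relative to the whole and hard to spot by manual review of an uncurated, automatically crawled collection.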

As AI systems become more integrated into our daily lives and critical infrastructure, the potential impact of these attacks grows accordingly. As the industry shifts to smaller, more specialized models, this attack surface will only increase. Additionally, as training cycles shorten, it will only become easier for threat actors to poison datasets. From compromising autonomous vehicle safety systems to manipulating financial algorithms, the consequences of successful data poisoning attacks can range from financial losses to threats to human life (see source 4 in appendix).


Evolution of Data Poisoning Attacks

As AI/ML systems have become more sophisticated and widely adopted, so too have the methods used to attack them. Early forms of data poisoning were relatively simple and often involved injecting mislabeled data into training sets. However, as AI/ML models became more complex, threat actors developed techniques that are more sophisticated, more targeted, and harder to detect. These may involve subtle manipulations of training data that cause specific misclassifications or introduce backdoors into models for future exploitation, without noticeably disrupting the model's overall performance (see source 5 in appendix).
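The contrast between the early and later styles of attack can be illustrated with a toy sketch, useful mainly for building test fixtures when evaluating defenses. Everything here is an assumption made for illustration: the dataset layout (a list of feature-list and label pairs), the function names, and the choice of the last feature as the "trigger" slot. It is not a reproduction of any specific attack from the cited sources.

```python
import random


def flip_labels(dataset, fraction, num_classes, seed=0):
    """Early-style poisoning: give a random fraction of samples a wrong label."""
    rng = random.Random(seed)
    poisoned = list(dataset)
    n_poison = round(len(poisoned) * fraction)
    for i in rng.sample(range(len(poisoned)), n_poison):
        features, label = poisoned[i]
        wrong = rng.choice([c for c in range(num_classes) if c != label])
        poisoned[i] = (features, wrong)
    return poisoned


def add_backdoor(dataset, fraction, target_label, trigger_value=1.0, seed=0):
    """Backdoor-style poisoning: stamp a trigger into the last feature
    (a hypothetical trigger slot) and relabel the sample to the target class."""
    rng = random.Random(seed)
    poisoned = list(dataset)
    n_poison = round(len(poisoned) * fraction)
    for i in rng.sample(range(len(poisoned)), n_poison):
        features, _ = poisoned[i]
        poisoned[i] = (features[:-1] + [trigger_value], target_label)
    return poisoned


# Toy two-class dataset; the last feature is 0.0 in every clean sample.
clean = [([float(i % 7), float(i % 3), 0.0], i % 2) for i in range(1000)]

flipped = flip_labels(clean, fraction=0.05, num_classes=2)
backdoored = add_backdoor(clean, fraction=0.01, target_label=1)

changed = sum(1 for a, b in zip(clean, flipped) if a[1] != b[1])
triggered = sum(1 for feats, _ in backdoored if feats[-1] == 1.0)
print(f"labels flipped: {changed}, samples carrying the trigger: {triggered}")
```

In this sketch, label flipping alters 5% of the labels indiscriminately, which tends to show up as a general drop in accuracy. The backdoor variant touches only 1% of samples and leaves every other sample unchanged, which is why this style of attack can leave a model's performance on clean inputs largely intact.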

Types of Data Poisoning Attacks

Threat actors use a variety of methods to execute data poisoning attacks. We have captured various types and examples in the table below to highlight the complexity and diversity of threats facing AI/ML systems. Understanding these attack vectors is crucial for developing comprehensive defense strategies and ensuring the integrity and reliability of AI-driven decision-making processes.
