In the first part of this mini-series, we briefly explored what impact AI may have on the CIA Triad and whether this fundamental framework needs adjusting. The goal of this and subsequent blog posts is to assess the pillars of the proposed Information Security Hexagon, starting with Confidentiality.
Maintaining confidentiality in Artificial Intelligence (AI) systems leveraging Machine Learning (ML) introduces new challenges. However, the purpose of Confidentiality remains unchanged: ensuring sensitive information is protected from unauthorised access and exposure.
This post distinguishes three separate pillars of Confidentiality in AI implementations: Access Control, Accurate Training, and Intellectual Property Protection. Access Control and Intellectual Property Protection are well-established concepts, but they require revision in the context of AI. And with ML models powering AI systems, the Accurate Training of these models also grows in importance. Together, these considerations require organisations to rethink their current approach to Confidentiality.
Solid access control to systems, applications and databases is a long-established foundation of cybersecurity. Consequently, these same controls should be applied to AI applications and their associated training and production datasets. However, embedding confidentiality in AI systems requires stepping up access control mechanisms, in particular regarding the data access rights granted to the AI systems themselves.
A core principle is to limit an AI system's data access to that granted to its users. Excessive access by AI systems quickly leads to undesired data propagation. Consider, for example, the scenario where an employee uses an organisational AI tool to gather basic information on specific clients. If the tool enjoys broader access rights than the employee, it could implicitly or unintentionally expose confidential data the employee is not authorised to view, thereby creating a breach in confidentiality.
Aligning AI system access rights with user access rights may therefore limit the quality and depth of the AI system's responses, as it has to base its conclusions on the dataset accessible to the user, thereby limiting its potential. This applies not only to production systems, but also to the training of an ML model. If such a model is trained on a broader set of data, some of which is not accessible to its users, it could present findings or derive conclusions that its users could never reach on their own. In this case, the concept of Confidentiality is not limited to unwanted access or exposure, but also covers the findings or conclusions that can only be derived by assessing confidential data.
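To make this concrete, the sketch below shows one way the principle could be enforced at the retrieval layer of an AI assistant. It is a minimal, hypothetical Python example: the User, Document and retrieve_for_user names are illustrative placeholders and do not refer to any specific product or framework.

```python
# Minimal sketch: constrain the AI system's context to the requesting
# user's own access rights. All names here are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class Document:
    content: str
    allowed_roles: set = field(default_factory=set)  # roles permitted to read this document

@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)

def retrieve_for_user(user: User, corpus: list) -> list:
    """Return only the documents the requesting user is authorised to see."""
    return [doc for doc in corpus if doc.allowed_roles & user.roles]

corpus = [
    Document("Confidential client risk assessment", {"risk", "management"}),
    Document("Public product brochure", {"sales", "risk", "management"}),
]
employee = User("analyst", {"sales"})

# Only the filtered context is ever passed to the model, so its answer
# cannot rest on data the employee is not allowed to access.
context = retrieve_for_user(employee, corpus)
print([doc.content for doc in context])  # ['Public product brochure']
```

The same pattern applies at training time: if a model is only trained and evaluated on data its intended audience may access, it cannot surface conclusions that audience could not legitimately reach on its own.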
ML models require training, often with large datasets. These datasets should be carefully prepared to ensure they do not contain confidential information, which could later leak because it was effectively absorbed by the model during its training process.
Imagine a hospital trains an ML model to predict the likelihood of patients developing certain medical conditions based on their medical history, lab results, and/or other medical information. The training dataset consists of numerous entries containing sensitive patient information, such as diagnoses, treatment plans, and outcomes. The hospital then offers access to this model to external healthcare providers for a fee. When a patient's details are submitted, the model returns a confidence score, i.e. the probability of the patient developing a specific condition.
An attacker who knows some medical history of a specific individual could attempt a membership inference attack. This attack involves determining whether the individual's information was included in the original training dataset of the ML model, by feeding the individual's known information to the model and examining the confidence scores for developing a certain condition. Superficially, this knowledge may seem harmless, but deeper analysis gives more reason for concern. High confidence scores indicate that the individual's medical information might have been part of the training dataset, from which an attacker can infer that the person has a relationship with the hospital. Furthermore, the confidence score can indicate that the individual might be suffering from certain medical conditions. All of this information could be exploited further to uncover more sensitive data, such as specific diagnoses, treatment regimens, or even genetic information.
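The snippet below sketches the intuition behind such an attack, assuming nothing more than query access to the model and a record the attacker already knows. The prediction function is a stand-in for the hospital's (hypothetical) API, and the 0.9 threshold is an arbitrary illustrative choice; real attacks use statistically calibrated thresholds or shadow models.

```python
# Simplified sketch of a membership inference attack: models tend to be
# over-confident on records they were trained on, so a suspiciously high
# confidence score hints that the record was part of the training set.

def predict_condition_probability(record: dict) -> float:
    """Stand-in for the hospital's exposed prediction endpoint (hypothetical).

    It merely simulates the typical behaviour: memorised training records
    receive a higher confidence score than unseen ones."""
    memorised_records = [{"age": 54, "hba1c": 7.9, "bmi": 31.0}]
    return 0.97 if record in memorised_records else 0.55

def likely_training_member(record: dict, threshold: float = 0.9) -> bool:
    """Flag a record as a probable training-set member based on over-confidence."""
    return predict_condition_probability(record) >= threshold

# Attributes the attacker already knows about the targeted individual.
victim = {"age": 54, "hba1c": 7.9, "bmi": 31.0}
print(likely_training_member(victim))  # True -> probably a patient of the hospital
```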
The example above is completely fictitious, as hospitals are unlikely to use production data when training ML models. Not using production data to train models remains the most effective safeguard for privacy.
However, certain production data, generated in large volumes and with wide diversity, can be highly valuable for training ML models. This is, for example, the case for health statistics obtained from hospitals and public health records, which could enable an ML model to predict future trends. In such cases, simply not using the data needs to be balanced against the potential benefits of improved healthcare outcomes and public health interventions.
If production data is used directly, anonymisation techniques and overfitting prevention are essential prerequisites to limit unwanted privacy leaks as much as possible. The former involves masking personal data so that individuals cannot be directly or indirectly identified, for example by adding "noise" to training datasets or by using fictitious personae with distinctive profile elements. Preferably, real-world information is only used in training datasets if it has been sufficiently scrubbed to prevent unintended leakage. Overfitting can be mitigated by limiting redundant model features or by splitting a large masked dataset into separate parts, one for training the model and another for testing it.
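As an illustration of these two safeguards, the sketch below perturbs a synthetic masked dataset with random noise and holds out part of it to check for overfitting. It relies on numpy and scikit-learn; the feature dimensions and noise scale are arbitrary assumptions and would need to be tuned against the utility the model must retain.

```python
# Minimal sketch of two safeguards for training on sensitive data:
# 1) perturb numeric features with noise before training, and
# 2) hold out part of the masked data to detect overfitting (memorisation).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=42)

# Hypothetical masked dataset: numeric health indicators only, no identifiers.
features = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))
labels = rng.integers(0, 2, size=1000)

# 1. Anonymisation by perturbation: add random noise so individual records
#    can no longer be matched exactly to real-world measurements.
noisy_features = features + rng.normal(loc=0.0, scale=0.1, size=features.shape)

# 2. Overfitting check: train on one part of the masked data, evaluate on the
#    held-out part; a large gap between the two scores signals memorisation.
X_train, X_test, y_train, y_test = train_test_split(
    noisy_features, labels, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (800, 5) (200, 5)
```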
Protection of intellectual property rights is also a much-debated aspect of the use of AI systems. To enjoy protection through patents, information must be made public and can therefore, in principle, be used for model training purposes. There is also plenty of copyrighted information that could make its way into training datasets. The output of systems trained with such information can be examined for intellectual property violations, just like a person's output. Cases of plain copying, or actions closely resembling it, appear straightforward to assess. It becomes more difficult with generative AI systems, as they can mimic a style while introducing enough differences to make a claim of copyright infringement hard to sustain. This will most likely be the subject of many court cases in the future; it is novel ground to define an acceptable approach for deeming the output of AI systems trained on public information to be in breach of intellectual property rights.
However, many innovative approaches, especially in the software industry, could be protected by patents but are instead kept confidential as trade secrets. These innovations remain exposed to reverse engineering techniques and model inspection. Moreover, an efficient machine learning process may even outperform human efforts and directly lead to the discovery and exploitation of these "hidden" innovations. Software innovations kept as trade secrets therefore require a special design. This is where techniques such as code obfuscation come into play.
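As a deliberately simple illustration of that idea, the two functions below are functionally identical, but the second hides the meaningful names and folds the proprietary weighting into opaque constants. The scoring heuristic and its names are made up for this example; real obfuscators go much further, with control-flow flattening, string encryption and anti-debugging measures.

```python
# Readable version: the "trade secret" weighting of the heuristic is obvious.
def fraud_risk_score(failed_logins: int, distance_from_home_km: float) -> float:
    return 0.7 * failed_logins + 0.3 * (distance_from_home_km / 100)

# Obfuscated version: same behaviour, but names and constants reveal nothing
# about what is being weighted or why.
def _f(a, b, _k=(0.7, 0.003)):
    return _k[0] * a + _k[1] * b

# Both return the same score for the same inputs (within floating-point error).
assert abs(fraud_risk_score(3, 250) - _f(3, 250)) < 1e-9
```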
Note that the risks mentioned above extend beyond the mere disclosure of confidential information and also touch upon the integrity of the system under scrutiny. Malicious threat actors could, for example, alter a software component protected by trade secrets and insert malicious code. Insufficient protection of confidentiality then leads to an integrity violation.
There are several implementation-specific aspects to consider when it comes to protecting confidentiality in AI. Since AI systems can either explicitly disclose or implicitly process confidential information, their access rights to confidential data should be limited to the level of their users' rights. Data used for training purposes should be masked so that it cannot be linked to real personal information, yet retains the features required for predictive learning.
As for the protection of intellectual property, there is not yet a clear method to define the originality of an AI system's output. Software protected by trade secrets, however, requires additional techniques to prevent unwanted disclosure, which could lead to unauthorised copying or even unwanted contamination.
At NVISO, we are well aware of the security challenges in the development and actual use of general-purpose and high-risk AI systems. As a pure-play cybersecurity company, we can advise your organisation on how to comply with the AI Act and on extending your current governance and risk management practices in line with security standards such as ISO 42001. We can also offer services to plot out security controls tailored to your specific environment, including your AI systems. In addition, we continuously research and innovate in leveraging AI capabilities in our offensive and defensive cybersecurity services.
Maxou Van Lauwe is a Cybersecurity Architect within NVISO's Cyber Strategy and Architecture team. As a graduate in industrial engineering specialising in smart applications, he is well aware of the AI landscape. Furthermore, through client work and internal research, he has acquired a deep understanding of how to secure AI implementations while keeping the relevant regulations in mind.