By Heidy Khlaaf and Artem Dinaburg
The National Telecommunications and Information Administration (NTIA) has circulated an Artificial Intelligence (AI) Accountability Policy Request for Comment (RFC) on what policies can support the development of AI audits, assessments, certifications, and other mechanisms to create earned trust in AI systems. Trail of Bits has submitted a response to the NTIA’s RFC on AI system accountability measures and policies.
We offer various recommendations informed by our extensive expertise in cybersecurity and safety auditing of mission-critical software. We support the NTIA’s efforts in fostering an open discussion on accountability and regulation. In our response, we emphasize the following:
- AI accountability is dependent on the claims and context in which AI is used and deployed.
The main theme of our recommendations is that there can be no AI accountability or regulation without a defined context. An audit of an AI system must be measured against concrete, verifiable claims about what the system is supposed to do, not against narrowly scoped benchmarks. The audit’s scope should be relevant to a regulatory, safety, ethical, or technical claim for which stakeholders may be held accountable.
We previously proposed the use of Operational Design Domains (ODDs), a concept adopted from automotive driving systems, to define operational envelopes for the risk assessments of AI-based systems, including generative models. An ODD helps define the specific operating conditions in which an AI system is designed to behave properly, thereby outlining the safety envelope against which system hazards and harms can be determined.
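To make the idea concrete, the sketch below shows one way an ODD might be captured programmatically so that a proposed deployment context can be checked against it. It is a minimal illustration only: the class, the field names (`languages`, `domains`, `max_input_tokens`, `human_review_required`), and the example values are hypothetical, not a standard schema from our response or from the automotive ODD literature.

```python
# Illustrative sketch only: a hypothetical ODD expressed as a data structure,
# with a check that a deployment context falls inside the stated envelope.
from dataclasses import dataclass, field


@dataclass
class OperationalDesignDomain:
    """Operating conditions under which the system's claims are meant to hold."""
    languages: set[str] = field(default_factory=lambda: {"en"})
    domains: set[str] = field(default_factory=lambda: {"customer_support"})
    max_input_tokens: int = 4096
    human_review_required: bool = True  # outputs must be reviewed before reaching users

    def covers(self, language: str, domain: str, input_tokens: int, human_review: bool) -> bool:
        """Return True only if the deployment context is inside the ODD."""
        return (
            language in self.languages
            and domain in self.domains
            and input_tokens <= self.max_input_tokens
            and (human_review or not self.human_review_required)
        )


odd = OperationalDesignDomain()
# Outside the envelope (unreviewed medical advice in German): hazards and harms
# must be reassessed before any claim about the system can be relied upon.
print(odd.covers(language="de", domain="medical_advice", input_tokens=1024, human_review=False))  # False
```

The value of an ODD lies less in any particular data structure than in forcing explicit, auditable statements of the conditions under which a system’s claims are supposed to hold.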
- Accountability mechanisms for AI innovation can only be considered relative to a risk threshold that must be determined in part by legislatures, rulemaking bodies, and regulators.
There is no one-size-fits-all rule for trust in AI systems. How much AI technologies can and need to be trusted depends on the risk that society accepts for the context in which they are used. Different risk levels should be determined through a democratic process: by legislatures, rulemaking bodies, and regulators. When weighing the costs of accountability mechanisms against claims that they hinder innovation, it must first be demonstrated that the cost of risk-reduction measures would be grossly disproportionate to the benefit gained. However, those developing AI-based systems have provided no evidence regarding the cost of implementing accountability mechanisms, so no such determination can currently be made.
- Foundational cybersecurity and software safety best practices can enable the identification of novel AI hazards and harms. Technical assessments are intended to support higher-level socio-technical, legal, or regulatory claims regarding the fitness of a system.
The dichotomy of technical versus socio-technical assessments, as described in the NTIA’s supplementary information, does not reflect how distinct technical assessment approaches have historically and practically been used. Technical assessments are not intended to support only purely technical goals; they also support claims regarding the holistic behavior of a system and how the system technically achieves those claims. That is, technical assessments are a necessary tool in supporting socio-technical, legal, or regulatory claims regarding the fitness of a system. It is currently difficult to assess the technical attributes that can foster or impede the implementation of accountability mechanisms, given that existing AI-based systems do not follow basic software safety and security best practices (e.g., IEC 61508, NIST 800-154, ODDs). Adopting fundamental software safety and security best practices as a first step can enable the development of further AI-specific accountability mechanisms.
- Current AI-based systems do not possess any unique software components that warrant a generalized licensing scheme, which would heavily impede the use of software as a whole.
AI systems should be regulated as extensions of software-based systems, given that they are developed through the same mechanisms. Any generalized licensing scheme would likely result in significant overreach, because AI systems are broadly defined and built from general-purpose software components. AI regulatory policies should generally mirror the practices of the existing sectors in which AI systems are deployed. For applications deemed safety-critical, a certificate of safety or a license should indeed be granted only when a regulator is satisfied with the argument presented in a safety case.
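For readers unfamiliar with safety cases, the sketch below shows a minimal claim-argument-evidence structure of the kind a regulator might evaluate. It is illustrative only and loosely modeled on structured assurance cases; the types, field names, and example content are hypothetical rather than a regulator-endorsed format.

```python
# Illustrative sketch only: a hypothetical claim-argument-evidence structure
# for a safety case, with a simple check that every leaf claim is evidenced.
from dataclasses import dataclass, field


@dataclass
class Evidence:
    description: str  # e.g., a test report, audit finding, or analysis
    source: str       # who produced it (ideally an independent assessor)


@dataclass
class Claim:
    statement: str                                   # what is being asserted
    argument: str                                    # why the evidence supports it
    evidence: list[Evidence] = field(default_factory=list)
    subclaims: list["Claim"] = field(default_factory=list)

    def is_supported(self) -> bool:
        """A claim is supported if all subclaims are supported, or it has direct evidence."""
        if self.subclaims:
            return all(claim.is_supported() for claim in self.subclaims)
        return bool(self.evidence)


top_claim = Claim(
    statement="The system is acceptably safe within its stated ODD.",
    argument="All identified hazards are mitigated and independently verified.",
    subclaims=[
        Claim(
            statement="Hazard H1 (harmful output reaches users) is mitigated.",
            argument="Output filtering plus mandatory human review.",
            evidence=[Evidence("Independent red-team report", "third-party auditor")],
        ),
    ],
)
print(top_claim.is_supported())  # True only because every leaf claim carries evidence
```

A regulator would of course evaluate the substance of the argument and evidence, not merely their presence; the structure simply makes the argument explicit and reviewable.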
- Independent bodies (auditors, regulators, etc.) should assess the trustworthiness and accountability claims of AI-based systems.
Independent auditors and regulators are key to public trust. Independence allows the public to trust the accuracy and integrity of both assessments and regulatory outcomes, and it has been a crucial attribute of established auditing practices in other fields, particularly safety-critical domains. It is therefore important that independent bodies, not vendors themselves, assess the trustworthiness and accountability of AI systems.
Our response delves into further detail on the selected questions. We believe that established methodologies and expertise from cybersecurity and safety-critical domains are a necessary foundation for building AI-specific accountability mechanisms, and we hope to continue enabling the development of novel AI auditing techniques.