Papers, Please - Scrutinizing AI model creation
11-11, 11:30–12:00 (MST), Theater

When an AI model misbehaves (e.g., it tells you to put glue on pizza), you must investigate how this happened. Sometimes these are accidents caused by the training data, but these incidents can also be due to nefarious activity – we saw ML malware deployed in 2024. At the end of the day, AI is still software, so security needs to be established around its creation, and the same transparency and accountability must be enforced as for the rest of the software supply chain. Using SLSA (Supply-chain Levels for Software Artifacts) and GUAC (Graph for Understanding Artifact Composition), we can determine the provenance of each dataset and the composition of each model. In this talk, we dive into the anatomy of AI model attacks: identifying bad models, determining the root cause of the badness, and finding the blast radius of affected models. Once the data is collected, we can create an SBOM and distribute it along with the AI model provenance to meet compliance and transparency requirements.
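To make the SLSA and GUAC pieces concrete, here is a minimal sketch of what dataset provenance for a training run could look like: an in-toto Statement with a SLSA provenance v1 predicate whose subject is the trained model and whose resolved dependencies are the training datasets. The file names, buildType URI, builder ID, and training parameters are hypothetical placeholders, not material from the talk.

```python
"""Sketch: emit a SLSA-style provenance statement for a model training run.

The field layout follows the in-toto Statement / SLSA provenance v1 schema;
all file names, URIs, and parameters below are hypothetical placeholders.
"""
import hashlib
import json
from pathlib import Path


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file (dataset or model weights)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def training_provenance(model_path: str, dataset_paths: list[str]) -> dict:
    """Build an in-toto Statement whose subject is the trained model and whose
    resolved dependencies are the training datasets."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": Path(model_path).name,
             "digest": {"sha256": sha256_of(model_path)}}
        ],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                # Hypothetical buildType URI for a training pipeline.
                "buildType": "https://example.com/ml-training/v1",
                "externalParameters": {"epochs": 3, "learningRate": 1e-4},
                "resolvedDependencies": [
                    {"uri": f"file://{p}", "digest": {"sha256": sha256_of(p)}}
                    for p in dataset_paths
                ],
            },
            "runDetails": {
                # Hypothetical builder identity, e.g. the CI system that ran training.
                "builder": {"id": "https://example.com/training-cluster"},
            },
        },
    }


if __name__ == "__main__":
    statement = training_provenance("model.safetensors", ["train.parquet"])
    print(json.dumps(statement, indent=2))
```

In practice, an attestation like this would be signed and ingested into GUAC, so the graph can answer questions such as "which models were trained on this dataset?" during an incident.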


AI/ML needs the same kind of accountability and transparency that we demand for the open-source and proprietary software we create and distribute. AI models are now used for a wide range of use cases, and knowing how a model was trained is paramount, because false outputs can create compliance, security, and reputational risks. For example, suppose a bank integrates an AI model to help run loan risk assessments. Depending on the training data and the software supply chain, the model may deny loan approvals based on race, sex, national origin, or other forms of illegal discrimination. That is a compliance risk, because the bank is no longer meeting regulations; a security risk, because the model was somehow tainted; and a reputational risk, because the bank would lose customers over the misconduct. It is therefore important to understand the provenance of the data the models were trained on, so that if an incident does occur, we can quickly determine the source and where else the tainted data might have been used. Since models are not inspectable on their own, the only way to ensure they behave as the trainer intended is to create tamper-proof evidence of the training process and use that evidence for any analysis of model behavior.
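As a rough illustration of the incident-response step described above, the sketch below scans a directory of collected provenance statements (in the shape produced earlier) for every model that lists a tainted dataset digest as a training dependency. In practice this is the kind of question a GUAC graph query answers; the directory name and digest here are hypothetical.

```python
"""Sketch: given the digest of a tainted dataset, find every model whose
provenance statement lists it as a training dependency (the "blast radius").

This stands in for a GUAC graph query; the directory layout and digest
values are hypothetical.
"""
import json
from pathlib import Path


def blast_radius(provenance_dir: str, bad_sha256: str) -> list[str]:
    """Return the names of all model subjects whose provenance resolves a
    dependency with the given SHA-256 digest."""
    affected = []
    for path in Path(provenance_dir).glob("*.json"):
        stmt = json.loads(path.read_text())
        build_def = stmt.get("predicate", {}).get("buildDefinition", {})
        deps = build_def.get("resolvedDependencies", [])
        if any(d.get("digest", {}).get("sha256") == bad_sha256 for d in deps):
            affected.extend(s["name"] for s in stmt.get("subject", []))
    return affected


if __name__ == "__main__":
    # Placeholder digest of the known-bad dataset.
    print(blast_radius("attestations/", "deadbeef" * 8))
```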

Solutions Architect with 15+ years of cybersecurity, DevOps, software development, and automation experience. He is an active member of the open-source community, contributing to and path-finding on various projects. He is a maintainer of the OpenSSF project GUAC (Graph for Understanding Artifact Composition), in-toto-golang, and in-toto attestations. Outside of work, he loves to travel and find new restaurants to explore!