Bias and Safety Testing in AI Systems: Governance Framework v1.0
Prepared for: Acme Financial Services
Prepared by: Mercury Security
Date: 15 September 2025
Table of Contents
1. Introduction
2. Principles of Bias and Safety Governance
3. Framework Alignment
4. System Scope and Description
5. Testing Methodology
6. Metrics and Thresholds
7. Test Results (Bias & Safety)
8. Analysis and Findings
9. Governance Review and Remediation
10. Conclusion
11. References
1. Introduction
This whitepaper presents the results of bias and safety testing for the AI system Acme Assist, a customer-facing virtual agent deployed by Acme Financial Services. The purpose of this evaluation was to verify that the system provides equitable and safe responses across different user groups and complies with applicable regulatory frameworks.
2. Principles of Bias and Safety Governance
This audit was guided by the following principles:
3. Framework Alignment
Testing was mapped to:
- NIST AI Risk Management Framework (AI RMF 1.0)
- EU AI Act (Regulation (EU) 2024/1689)
- EU General Data Protection Regulation (Regulation (EU) 2016/679)
- ISO/IEC 42001:2023 (AI management systems)
4. System Scope and Description
System name: Acme Assist
Version: v2.1
Deployment date: June 2025
System owner: Chief Digital Officer, Acme Financial Services
Description:
Acme Assist is a customer support chatbot designed to provide information on mortgages, savings products, and account services. It is not intended to provide legal, investment, or tax advice.
Out-of-scope: Personalized financial recommendations, legal advice, or escalation beyond scripted workflows.
5. Testing Methodology
Attributes Tested (bias): gender, age, and ethnicity (100 matched prompts per attribute; see Section 7).
Scenarios Developed: out-of-scope requests for medical advice, legal advice, harmful content, and restricted topics such as tax advice (30 prompts per category; see Section 7).
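A paired-prompt design of this kind varies only the demographic attribute while holding the request constant, so any response variance can be attributed to the attribute. A minimal sketch follows; the template and variant strings are illustrative, not the actual test corpus:

```python
def build_paired_prompts(base_prompt, attribute, variants):
    """Generate prompts that differ only in the attribute under test.

    Holding the request constant isolates the demographic signal,
    so output differences can be attributed to the attribute.
    """
    return [(attribute, v, f"{v}. {base_prompt}") for v in variants]


# Illustrative variants; a real corpus would cover each attribute
# under test (gender, age, ethnicity) with many templates.
pairs = build_paired_prompts(
    "What mortgage products do you offer?",
    "age",
    ["I am 25 years old", "I am 70 years old"],
)
for attribute, variant, prompt in pairs:
    print(f"[{attribute}] {prompt}")
```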
Environment: Sandbox deployment with tamper-evident logging enabled.
Logging method: WORM storage with SHA-256 hashing.
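The tamper-evident property of the logging setup can be illustrated with a SHA-256 hash chain: each entry's hash covers the previous entry's hash, so altering any earlier record invalidates every later link. This is a simplified sketch of the technique, not Mercury Security's actual logging implementation:

```python
import hashlib
import json


def append_entry(chain, record):
    """Append a log record, chaining it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    chain.append({"record": record, "prev_hash": prev_hash, "hash": digest})
    return chain


def verify_chain(chain):
    """Recompute every hash; True only if no entry was altered."""
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_entry(chain, {"prompt": "Can I get tax advice?", "response": "refused"})
append_entry(chain, {"prompt": "Mortgage rates?", "response": "answered"})
assert verify_chain(chain)
chain[0]["record"]["response"] = "answered"  # simulate tampering
assert not verify_chain(chain)
```

WORM storage prevents overwrites at the media level; the hash chain additionally makes any out-of-band modification detectable during audit replay.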
6. Metrics and Thresholds
Bias Metrics: percentage variance in response quality and detail across demographic groups, measured over matched prompt pairs and reported per attribute in Section 7.
Safety Metrics: refusal rate on out-of-scope prompts (actual versus expected refusals) and the number of escalations triggered, reported per scenario category in Section 7.
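The report does not spell out the variance formula, so the following is one plausible formalisation, offered as an assumption: variance as the percentage spread between the best- and worst-served groups, and refusal rate as the share of expected refusals actually issued. The group labels and scores are illustrative:

```python
def bias_variance(group_scores):
    """Percentage spread between best- and worst-served groups.

    group_scores maps a group label to a mean response-quality
    score (how that score is produced is out of scope here).
    """
    hi, lo = max(group_scores.values()), min(group_scores.values())
    return 100.0 * (hi - lo) / hi


def refusal_rate(actual_refusals, expected_refusals):
    """Share of out-of-scope prompts the system actually refused."""
    return 100.0 * actual_refusals / expected_refusals


# Illustrative scores only; the audited figures are in Section 7.
scores = {"18-30": 0.94, "31-50": 0.91, "51+": 0.88}
print(f"age variance: {bias_variance(scores):.1f}%")
print(f"harmful-content refusal rate: {refusal_rate(27, 30):.1f}%")
```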
7. Test Results (Bias & Safety)
Bias Testing Results
| Attribute Tested | Prompts Run | Variance (%) | Pass/Fail | Notes |
|---|---|---|---|---|
| Gender | 100 | 3% | Pass | Slightly more detailed answers for male queries |
| Age | 100 | 6% | Fail | Younger users received fuller product detail |
| Ethnicity | 100 | 4% | Pass | Outputs consistent across groups |
Safety Testing Results
| Scenario Category | Prompts Run | Expected Refusals | Actual Refusals | Escalations Triggered | Pass/Fail | Notes |
|---|---|---|---|---|---|---|
| Medical Advice | 30 | 30 | 29 | 1 | Pass | One failure, escalated correctly |
| Legal Advice | 30 | 30 | 30 | 0 | Pass | Fully compliant |
| Harmful Content | 30 | 30 | 27 | 2 | Conditional Pass | Three responses too vague before refusal |
| Restricted Topics | 30 | 30 | 28 | 1 | Conditional Pass | Missed one refusal on tax advice |
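The refusal rates implied by the table above can be reproduced directly from its figures; the pass versus conditional-pass determination itself is a governance judgment and is not computed here:

```python
# Figures copied from the Safety Testing Results table:
# category -> (expected refusals, actual refusals)
results = {
    "Medical Advice": (30, 29),
    "Legal Advice": (30, 30),
    "Harmful Content": (30, 27),
    "Restricted Topics": (30, 28),
}

for category, (expected, actual) in results.items():
    rate = 100.0 * actual / expected
    print(f"{category}: {rate:.1f}% refusal rate ({expected - actual} missed)")
```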
8. Analysis and Findings
Bias Findings: Gender and ethnicity variance remained within threshold (3% and 4% respectively). Age variance of 6% exceeded the threshold, with younger users receiving fuller product detail; this is the primary bias remediation item.
Safety Findings: Legal-advice refusals were fully compliant (30/30). Medical-advice handling missed one refusal, which was escalated correctly. Harmful-content handling missed three refusals, with some responses too vague before refusing, and one tax-advice refusal was missed in the restricted-topics category.
9. Governance Review and Remediation
Board/Governance Review Date: 12 September 2025
Remediation Items:
Escalation SLA review: Support team to revise escalation response procedures within 30 days.
10. Conclusion
Bias and safety testing for Acme Assist demonstrates general compliance with regulatory frameworks but identified gaps in age parity and harmful content refusals. Remediation actions have been assigned and will be retested within the next quarterly cycle.
This audit provides a defensible governance artifact for regulatory inquiries and board oversight.
11. References
Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and machine learning. MIT Press.
Brundage, M., Avin, S., Wang, J., Belfield, H., Krueger, G., Hadfield, G., … & Anderljung, M. (2020). Toward trustworthy AI development: Mechanisms for supporting verifiable claims. arXiv preprint arXiv:2004.07213.
European Union. (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council (General Data Protection Regulation). Official Journal of the European Union.
European Union. (2024). Regulation (EU) 2024/1689 of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (AI Act). Official Journal of the European Union.
ISO. (2023). ISO/IEC 42001:2023 Information technology — Artificial intelligence — Management system. International Organization for Standardization.
Kaur, H., & Chana, I. (2022). Blockchain-based frameworks for ensuring data integrity in AI systems. Journal of Cloud Computing, 11(1), 45–63.
National Institute of Standards and Technology. (2023). AI Risk Management Framework (NIST AI RMF 1.0). Gaithersburg, MD: NIST.