Conformity Evidence — Regulation (EU) 2024/1689

EU AI Act
Article 12 Evidence Pack

System	ClaimsTriage-v3
Period	1 January 2026 to 31 March 2026
Issued	3 April 2026
Pack ID	AA-2026Q1-CT3-0001

Sample Co Ltd
Company No. 09876543 · FCA Reference No. 123456
87 Leadenhall Street, London EC3A 2DY

Integrity	SHA-256 chain verified ✓
Notarisation	RFC 3161 timestamped, hourly ✓
Signing key	Customer-held, fingerprint a4:b2:8c:9d:e1:f3

Executive Summary

This pack contains complete action receipts for the high-risk AI system ClaimsTriage-v3 over the period 1 January 2026 to 31 March 2026, in compliance with Regulation (EU) 2024/1689 ("EU AI Act") Article 12 (Record-keeping).

During the period, the system processed 4,217 distinct claim assessments, generating 41,902 atomic agent action receipts. All receipts are cryptographically hash-chained and individually signed using a customer-held key. The chain head was notarised hourly to a public RFC 3161 timestamp authority. Tamper evidence was verified end to end, and no integrity breaks were detected.

Material events

Three material events occurred during the period requiring attention:

A decision-drift anomaly on 17 February 2026 (page 5)
One unauthorised tool-call attempt on 4 March 2026 (page 6)
One data-class escalation requiring human review on 22 March 2026 (page 6)

All three were resolved per documented procedure; none required regulatory notification. Detailed evidence including affected receipt references and resolution actions is provided on pages 5 and 6.

Pack composition

This pack contains the human-readable evidence (this document, 8 pages) and a machine-readable archive (manifest, Parquet receipt logs, signing certificates, integrity proof) referenced on page 8. Together they form the complete Article 12 record for the period.

Compliance Officer

Name
Date: __________

Data Protection Officer

Name
Date: __________

Information Security

Name
Date: __________

AI System Inventory

The following inventory is provided in alignment with EU AI Act Annex IV (technical documentation for high-risk AI systems).

System name	ClaimsTriage-v3
System identifier	sys_ct3
Provider	Sample Co Ltd (in-house development)
Risk classification	High-risk (Annex III, point 5(b) — credit-relevant assessments adjacent)
Intended purpose	Initial triage of motor insurance claims; routing between auto-approval, human review, and rejection
Deployment date	14 November 2025
Operational environment	Production, UK + EU policyholder base
Agent framework	OpenAI Agents SDK v2.4 with custom orchestration layer
Underlying models	gpt-4o-mini (primary), claude-3-5-sonnet (fallback)
Sub-agents	3 — eligibility_check, fraud_screen, doc_extract
Tool inventory	14 tools registered (full inventory page 6)
Data classifications touched	PII, financial, vehicle_registration
Human-in-loop trigger	confidence < 0.85 OR claim_value > £10,000
Last conformity assessment	12 December 2025 (internal, ref CA-2025-12-12)
Designated operator	Claims Operations, Sample Co Ltd
Cessation procedure documented	Yes — see Sample Co AI Operations Manual §4.7
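
The human-in-loop trigger in the inventory above can be read as a simple predicate. The following is a minimal sketch of that reading, not the production routing code: the `TriageResult` container, its field names, and the function name are hypothetical, and only the two thresholds come from the inventory.

```python
from dataclasses import dataclass

# Thresholds taken from the inventory row "Human-in-loop trigger".
CONFIDENCE_THRESHOLD = 0.85
CLAIM_VALUE_THRESHOLD_GBP = 10_000


@dataclass(frozen=True)
class TriageResult:
    """Hypothetical container for one triage outcome (illustrative field names)."""
    confidence: float
    claim_value_gbp: float


def requires_human_review(result: TriageResult) -> bool:
    """Return True when either human-in-loop condition is met."""
    return (
        result.confidence < CONFIDENCE_THRESHOLD
        or result.claim_value_gbp > CLAIM_VALUE_THRESHOLD_GBP
    )
```

Under this reading, a claim scored at confidence 0.59 would be routed to a reviewer regardless of its value, while a claim at exactly 0.85 and exactly £10,000 would not trigger review, because both comparisons are strict.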

Material changes during the period

None. The system configuration, model selection, sub-agent inventory, and tool inventory remained constant throughout the period. Two routine model-provider safety updates were applied (see page 5 for the operational impact of the 17 February update).

Operational Summary — Q1 2026

Throughput

Claims processed	4,217
Auto-approved	2,891 · 68.6%
Routed to human review	1,219 · 28.9%
Rejected by system	107 · 2.5%

Agent action volume

Total receipts captured	41,902
Receipts per claim, mean	9.9
Receipts per claim, p95	17
Receipts per claim, max	34

Model usage

LLM calls (primary model)	38,114
LLM calls (fallback model)	1,902
Tool calls	23,481
Sub-agent invocations	8,290

Operating cost

Total LLM cost	£2,847.13
Cost per claim, mean	£0.67
Tool execution cost	£412.88
Storage + audit cost	£287.40

Distribution of decisions, by week

Week	Claims	Auto-approved	Human review	Rejected
W1 (1–7 Jan)	312	219 · 70.2%	83 · 26.6%	10 · 3.2%
W2 (8–14 Jan)	301	211 · 70.1%	82 · 27.2%	8 · 2.7%
W3 (15–21 Jan)	337	240 · 71.2%	89 · 26.4%	8 · 2.4%
W4–W7 (22 Jan – 16 Feb)	1,418	1,011 · 71.3%	372 · 26.2%	35 · 2.5%
W8 (17–23 Feb)	329	192 · 58.4%	128 · 38.9%	9 · 2.7%
W9–W13 (24 Feb – 31 Mar)	1,520	1,018 · 67.0%	465 · 30.6%	37 · 2.4%

The Week 8 anomaly reflects the decision-drift event of 17 February 2026, which was investigated and mitigated within the same week. See page 5.
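
A reviewer may wish to recompute the weekly table rather than take the percentages on trust. The sketch below copies the counts from the table and checks that each row sums to its claim count and that the columns sum to the throughput totals; the names `WEEKLY`, `pct`, `rows_balance`, and `column_totals` are illustrative only.

```python
# Rows copied from the weekly table:
# (label, claims, auto_approved, human_review, rejected)
WEEKLY = [
    ("W1", 312, 219, 83, 10),
    ("W2", 301, 211, 82, 8),
    ("W3", 337, 240, 89, 8),
    ("W4-W7", 1418, 1011, 372, 35),
    ("W8", 329, 192, 128, 9),
    ("W9-W13", 1520, 1018, 465, 37),
]


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as printed in the table."""
    return round(100 * part / whole, 1)


def rows_balance(rows) -> bool:
    """True when every row's three outcomes add up to its claim count."""
    return all(claims == a + h + r for _, claims, a, h, r in rows)


def column_totals(rows) -> tuple:
    """Column sums in the order: claims, auto-approved, human review, rejected."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 5))


print(rows_balance(WEEKLY))    # True
print(column_totals(WEEKLY))   # (4217, 2891, 1219, 107)
print(pct(192, 329))           # 58.4, the Week 8 auto-approval rate
```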

Material Event #1 — Decision-Drift Anomaly

Date detected	17 February 2026, 09:14 UTC
Detection method	Automated decision-drift alert: the auto-approval rate fell outside the ±2σ tolerance band relative to the 30-day rolling baseline.
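
The ±2σ rule described under the detection method can be illustrated as follows. This is a sketch under stated assumptions, not the deployed alerting logic: the function name `drift_alert` is hypothetical, the baseline is assumed to be a list of daily auto-approval rates, and the sample values in the usage lines are invented for illustration (only 58.4% comes from this pack).

```python
from statistics import mean, stdev


def drift_alert(baseline_rates: list, current_rate: float, k: float = 2.0) -> bool:
    """Return True when current_rate lies outside mean ± k * stdev of the baseline.

    baseline_rates: rates observed over the rolling window (assumed daily values).
    k: width of the tolerance band in standard deviations (2.0 for a ±2σ band).
    """
    if len(baseline_rates) < 2:
        raise ValueError("need at least two baseline observations")
    mu = mean(baseline_rates)
    sigma = stdev(baseline_rates)
    return abs(current_rate - mu) > k * sigma


# Hypothetical baseline centred near 71%; these daily values are not from the pack.
baseline = [0.70, 0.72, 0.71, 0.73, 0.70, 0.71, 0.72]
print(drift_alert(baseline, 0.584))  # True
print(drift_alert(baseline, 0.71))   # False
```

A symmetric band flags both drops and rises in the auto-approval rate, which matters because an unexpected rise would be at least as significant for a triage system as the drop recorded here.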

Summary

On 17 February 2026 the system's auto-approval rate dropped from a 30-day rolling baseline of 71.2% to 58.4% over a 6-hour window (09:14 UTC to 15:02 UTC).

Root cause

Investigation determined the primary model provider had silently deployed an updated safety filter to gpt-4o-mini, increasing false-positive rates on a specific document class (vehicle damage photographs containing extensive panel deformation). The model classified affected images as "potentially fraudulent" at a higher rate, suppressing auto-approval.

Mitigation

The increased human-review routing acted as the intended safeguard: affected claims were routed to reviewers rather than rejected, and no claim was incorrectly rejected. The drift was reported to the model provider, which confirmed the safety filter change on 18 February 2026. The internal model evaluation suite was re-run and indicated no need to modify routing thresholds.

Impact

219 claims affected (the 17 February cohort routed to human review)
0 claims incorrectly rejected
Mean human-review latency increased by 38 minutes for affected claims
Additional human review cost: estimated £540 (cost of reviewer time)

Evidence excerpt (first affected receipts)

{
  "event_id": "evt_8f3a2b4c1d9e08f4",
  "agent_id": "claims_triage_v3",
  "session_id": "sess_a1b2c3d4e5f6",
  "ts": "2026-02-17T09:14:08Z",
  "actor": { "type": "agent", "id": "fraud_screen" },
  "action": {
    "type": "model_call",
    "name": "gpt-4o-mini",
    "params_hash": "sha256:a1b2c3d4e5f67890abcdef..."
  },
  "resource": {
    "type": "claim",
    "id": "claim_30491",
    "classification": ["financial"]
  },
  "decision": {
    "outcome": "routed_to_human",
    "confidence": 0.59,
    "reason": "image_classifier_flagged_high_severity"
  },
  "prev_hash": "sha256:9c8b7a6e5d4c3b2a190f8e7d6c5b4a39282716...",
  "signature": "MEUCIQD8FaUkM4yzZ..."
}

Chain-of-evidence references

First affected receipt	evt_8f3a2b4c1d9e08f4 (chain position 28,114)
Last affected receipt	evt_c2d8f1a7e9b34521 (chain position 28,332)
Total affected	219 receipts (chain positions 28,114 – 28,332, contiguous)
Sub-chain integrity	Verified ✓
Notarisation references	2026-02-17T09:00Z + 2026-02-17T15:00Z (both verified)

Material Events #2 and #3, and Tool Inventory

Material Event #2 — Unauthorised tool-call attempt

Date detected	4 March 2026, 11:43 UTC

On 4 March 2026 at 11:43 UTC, agent ClaimsTriage-v3 attempted to invoke a tool not in its registered allowlist: db_write_customer_record. The attempt was blocked at the SDK level by the standing tool-allowlist policy guard; no data modification occurred. Investigation traced the source to a model hallucination during a low-confidence sub-agent reasoning step.

Receipt	evt_b4a91d3c7e62f88a
Outcome	Tool call denied; error returned to agent; agent re-planned and completed without write
Action taken	Tool registry confirmed correct; no policy change required; incident logged to internal AI safety register
Status	Resolved ✓
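
The allowlist guard that blocked this call can be pictured as a check performed before any tool is dispatched. The sketch below is illustrative only and is not the OpenAI Agents SDK API or the production guard: `ToolNotAllowedError`, `guarded_call`, and the `registry` mapping are hypothetical names, and the allowlist holds only the eight tool IDs shown in the inventory excerpt on this page.

```python
class ToolNotAllowedError(Exception):
    """Raised when an agent requests a tool outside its allowlist (hypothetical)."""


# The eight tool IDs from the inventory excerpt; the six further tools are omitted.
ALLOWED_TOOLS = frozenset({
    "lookup_customer", "lookup_vehicle", "score_damage_photos", "fetch_policy",
    "score_fraud_risk", "notify_reviewer", "notify_customer", "log_decision",
})


def guarded_call(tool_name, registry, allowlist=ALLOWED_TOOLS, **kwargs):
    """Dispatch tool_name only if it is allowlisted; otherwise deny before execution.

    registry: mapping of tool name to a callable (assumed shape, for illustration).
    """
    if tool_name not in allowlist:
        # The tool is never executed, so no side effect such as a write can occur.
        raise ToolNotAllowedError(f"tool not in allowlist: {tool_name}")
    return registry[tool_name](**kwargs)


# Usage with a stand-in registry: the read succeeds, the write attempt is denied.
registry = {"fetch_policy": lambda policy_id: {"policy_id": policy_id}}
print(guarded_call("fetch_policy", registry, policy_id="P-1"))
try:
    guarded_call("db_write_customer_record", registry)
except ToolNotAllowedError as err:
    print("denied:", err)
```

Checking the name before looking it up in the registry means a denied tool does not need to exist at all, which matches the hallucinated-tool case described above.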

Material Event #3 — Data-class escalation

Date detected	22 March 2026, 14:08 UTC

On 22 March 2026 a claim flagged for suspected fraud triggered the data-class escalation policy. The agent attempted to retrieve a customer's banking history (data class: financial_sensitive), which exceeds ClaimsTriage-v3's standing authorisation. The request was escalated to a human reviewer per policy.

Receipt	evt_d7c2a8f4e9b35912
Outcome	Human reviewer authorised retrieval after fraud investigation review; case continues
Action taken	Existing data-class policy operated correctly; no change required
Status	Resolved ✓
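
The escalation behaviour described for this event amounts to comparing the data classes a request touches against the system's standing authorisation. The following is a minimal sketch of that comparison, assuming the standing authorisation equals the "Data classifications touched" row of the system inventory; `Decision` and `check_data_access` are hypothetical names.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate_to_human"


# Assumed standing authorisation, taken from the inventory row
# "Data classifications touched".
STANDING_AUTHORISATION = frozenset({"PII", "financial", "vehicle_registration"})


def check_data_access(requested_classes, authorised=STANDING_AUTHORISATION) -> Decision:
    """Escalate when any requested data class falls outside the authorised set."""
    if set(requested_classes) <= authorised:
        return Decision.ALLOW
    return Decision.ESCALATE


print(check_data_access(["financial"]))            # Decision.ALLOW
print(check_data_access(["financial_sensitive"]))  # Decision.ESCALATE
```

Escalating instead of denying outright leaves the decision with a human reviewer, which is the outcome recorded for this event.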

Tool inventory (excerpt — full inventory in machine-readable manifest)

Tool ID	Description	Calls in period	Errors
lookup_customer	Read customer record by ID	8,114	12
lookup_vehicle	Read vehicle registration data	4,217	0
score_damage_photos	ML model — damage severity classification	3,891	41
fetch_policy	Read policy details	4,217	3
score_fraud_risk	Fraud probability scoring	1,847	0
notify_reviewer	Send claim to human reviewer queue	1,219	2
notify_customer	Send customer-facing update	2,891	11
log_decision	Append decision to audit log	4,217	0
… 6 more tools — see manifest …

All tool calls are individually receipted in the chain. Full per-call detail is available in the Parquet receipt archives referenced in the pack manifest (page 8).

Cryptographic Integrity Verification

The integrity of every receipt in this pack has been verified using the three-layer cryptographic protocol documented in the Agent Audit Verification Specification v1.0.

Hash chain

Pack identifier	AA-2026Q1-CT3-0001
Receipts in pack	41,902
Chain position, first receipt	14,108
Chain position, last receipt	56,009
First receipt hash	sha256:7c9d4e8f3a2b1c5d6e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2b...
Last receipt hash	sha256:e8a2c1d4b5c6e7f89a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d...
Chain verification	PASSED ✓ — verified 2026-04-03T09:14:23Z
Tamper anomalies	None detected
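
The hash-chain check summarised above can be sketched in a few lines. The canonicalisation used here (compact JSON with sorted keys, excluding the `signature` field) is an assumption made for illustration; the actual rules are those of the Agent Audit Verification Specification v1.0, which this pack does not reproduce. The function names are hypothetical.

```python
import hashlib
import json


def receipt_hash(receipt: dict) -> str:
    """Hash a receipt over an assumed canonical form (sorted keys, no signature)."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_chain(receipts: list, anchor_hash: str) -> str:
    """Walk the chain from anchor_hash and return the computed head hash.

    Raises ValueError at the first receipt whose prev_hash does not match
    the hash of the receipt before it.
    """
    expected = anchor_hash
    for position, receipt in enumerate(receipts):
        if receipt.get("prev_hash") != expected:
            raise ValueError(f"chain break at index {position}")
        expected = receipt_hash(receipt)
    return expected
```

The returned head hash is the value that would be compared with a notarised chain head. That comparison is what catches a change to the final receipt, since no later receipt's `prev_hash` covers it.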

RFC 3161 notarisation

Notarisation interval	Hourly (chain head)
Notarisations in period	2,160
Timestamp authority	FreeTSA.org (publicly accessible RFC 3161 TSA)
First timestamp	2026-01-01T00:00:00Z
Last timestamp	2026-03-31T23:00:00Z
Timestamp validation	All 2,160 valid ✓
Gaps in notarisation	None
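
The notarisation count follows directly from the first and last timestamps at an hourly interval. As a sketch, the expected schedule can be regenerated and compared with the observed timestamps to confirm both the count and the absence of gaps; `expected_stamps` and `find_gaps` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

FIRST = datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc)    # first timestamp in the table
LAST = datetime(2026, 3, 31, 23, 0, tzinfo=timezone.utc)   # last timestamp in the table
HOUR = timedelta(hours=1)


def expected_stamps(first, last, step=HOUR):
    """Every scheduled notarisation time from first to last, inclusive."""
    count = (last - first) // step + 1
    return [first + i * step for i in range(count)]


def find_gaps(observed, first, last, step=HOUR):
    """Scheduled times with no matching observed timestamp."""
    seen = set(observed)
    return [t for t in expected_stamps(first, last, step) if t not in seen]


print(len(expected_stamps(FIRST, LAST)))  # 2160 (90 days x 24 hours)
```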

Signing key

Algorithm	ECDSA over secp256r1 (NIST P-256)
Public key fingerprint	a4:b2:8c:9d:e1:f3:7a:5c:9e:4f:8d:6b:2a:1c:0e:7f
Key issued	14 November 2025
Key rotation due	14 November 2026
Custody	Customer-held (Sample Co Ltd HSM)
Chain of trust	Self-signed (customer root); public key published 14 Nov 2025

Independent reproducibility

Verification of this pack against the public timestamp chain is independently reproducible by any party with read access to the pack manifest. Two verification paths are supported:

Command-line: agentaudit verify pack-AA-2026Q1-CT3-0001
Web interface: verify.agentaudit.co.uk/AA-2026Q1-CT3-0001

Both paths reproduce the integrity proof above without requiring access to private signing material. An external auditor can independently verify the pack's integrity without any cooperation from Sample Co Ltd or from Agent Audit.

Pack Manifest & Auditor Sign-Off

The machine-readable pack accompanies this document and contains the complete receipt log, integrity proofs, and signing certificates referenced throughout this pack.

File listing

pack-AA-2026Q1-CT3-0001/
├── 00-manifest.json              (this pack's machine-readable index)
├── 01-system-inventory.json      (Page 3 source data)
├── 02-receipts-2026-01.parquet   (January receipts, 14,108 rows)
├── 03-receipts-2026-02.parquet   (February receipts, 12,917 rows)
├── 04-receipts-2026-03.parquet   (March receipts, 14,877 rows)
├── 05-material-events.json       (3 events with full receipt references)
├── 06-tool-inventory.json        (full tool registry snapshot)
├── 07-integrity-proof.json       (hash chain + notarisation records)
├── 08-signing-certificates.pem   (public keys + chain of trust)
└── 99-evidence-pack.pdf          (this document)

Total uncompressed size: 187 MB
SHA-256 of manifest: sha256:a4f8d2e1b3c9d7e5f6a8b1c4d2e3f5a6b7c8d9e0f1a2b3c4
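
Two quick checks can be run against the file listing before opening the archive in detail: that the monthly row counts add up to the receipt total, and that a file's SHA-256 digest matches the value recorded for it. The sketch below shows both using only the standard library; the helper name `sha256_of_file` is hypothetical and the `sha256:` prefix simply mirrors the notation used in this pack.

```python
import hashlib

# Row counts copied from the file listing.
ROWS_BY_FILE = {
    "02-receipts-2026-01.parquet": 14_108,
    "03-receipts-2026-02.parquet": 12_917,
    "04-receipts-2026-03.parquet": 14_877,
}
RECEIPTS_IN_PACK = 41_902  # from the hash chain table


def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Stream a file through SHA-256 and return the digest in 'sha256:<hex>' form."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


print(sum(ROWS_BY_FILE.values()) == RECEIPTS_IN_PACK)  # True
```

Reading in chunks keeps memory use flat for the larger Parquet files; the result of `sha256_of_file` would be compared with the full digest recorded for that file.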

How to read this pack

This PDF is the human-readable summary intended for compliance, auditor, and regulator review.
The Parquet files contain the complete receipt log, queryable with any Parquet-aware tool (DuckDB, Apache Spark, Pandas).
The integrity-proof JSON allows independent re-verification of the hash chain and notarisation without contacting Sample Co Ltd or Agent Audit.
Signing certificates support verification of every receipt's signature against the customer-held key.

Auditor sign-off

To be completed by the external auditor on review of this pack.

Auditor name	____________________________________
Auditing firm	____________________________________
Audit reference	____________________________________
Date pack received	____________________________________
Date reviewed	____________________________________

Outcome

☐ Accepted as evidence for Article 12 compliance period
☐ Accepted with observations (recorded separately)
☐ Requires clarification — see attached query register

Signature

Auditor signature

Pack generated automatically by Agent Audit at 2026-04-03T09:14:23Z. For verification methodology, see the Agent Audit Trust Centre at trust.agentaudit.co.uk. This format conforms to the proposed "AI Action Evidence Pack" submission standard v1.0.
