SAMPLE FORMAT
Conformity Evidence — Regulation (EU) 2024/1689

EU AI Act
Article 12 Evidence Pack

System: ClaimsTriage-v3
Period: 1 January 2026 to 31 March 2026
Issued: 3 April 2026
Pack ID: AA-2026Q1-CT3-0001
Sample Co Ltd
Company No. 09876543 · FCA Reference No. 123456
87 Leadenhall Street, London EC3A 2DY
Integrity: SHA-256 chain verified ✓
Notarisation: RFC 3161 timestamped, hourly ✓
Signing key: Customer-held, fingerprint a4:b2:8c:9d:e1:f3

Executive Summary

This pack contains complete action receipts for the high-risk AI system ClaimsTriage-v3 over the period 1 January 2026 to 31 March 2026, in compliance with Regulation (EU) 2024/1689 ("EU AI Act") Article 12 (Record-keeping).

During the period, the system processed 4,217 distinct claim assessments, generating 41,902 atomic agent action receipts. All receipts are cryptographically hash-chained and individually signed using a customer-held key. The chain head was notarised hourly to a public RFC 3161 timestamp authority. Tamper-evidence has been verified end-to-end, and no integrity breaks were detected.

Material events

Three material events occurred during the period requiring attention:

  1. A decision-drift anomaly on 17 February 2026 (page 5)
  2. One unauthorised tool-call attempt on 4 March 2026 (page 6)
  3. One data-class escalation requiring human review on 22 March 2026 (page 6)

All three were resolved per documented procedure; none required regulatory notification. Detailed evidence including affected receipt references and resolution actions is provided on pages 5 and 6.

Pack composition

This pack contains the human-readable evidence (this document, 8 pages) and a machine-readable archive (manifest, Parquet receipt logs, signing certificates, integrity proof) referenced on page 8. Together they form the complete Article 12 record for the period.

Compliance Officer
Name
Date: __________
Data Protection Officer
Name
Date: __________
Information Security
Name
Date: __________

AI System Inventory

The following inventory is provided in alignment with EU AI Act Annex IV (technical documentation for high-risk AI systems).

System name: ClaimsTriage-v3
System identifier: sys_ct3
Provider: Sample Co Ltd (in-house development)
Risk classification: High-risk (Annex III, point 5(b); credit-relevant assessments adjacent)
Intended purpose: Initial triage of motor insurance claims; routing between auto-approval, human review, and rejection
Deployment date: 14 November 2025
Operational environment: Production, UK + EU policyholder base
Agent framework: OpenAI Agents SDK v2.4 with custom orchestration layer
Underlying models: gpt-4o-mini (primary), claude-3-5-sonnet (fallback)
Sub-agents: 3 (eligibility_check, fraud_screen, doc_extract)
Tool inventory: 14 tools registered (full inventory page 6)
Data classifications touched: PII, financial, vehicle_registration
Human-in-loop trigger: confidence < 0.85 OR claim_value > £10,000
Last conformity assessment: 12 December 2025 (internal, ref CA-2025-12-12)
Designated operator: Claims Operations, Sample Co Ltd
Cessation procedure documented: Yes; see Sample Co AI Operations Manual §4.7
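
The human-in-loop trigger listed above can be read as a simple predicate. The following is a minimal sketch only: the function and parameter names (`needs_human_review`, `confidence`, `claim_value_gbp`) are hypothetical, and the production orchestration layer is not reproduced in this pack.

```python
from decimal import Decimal

# Thresholds taken from the "Human-in-loop trigger" row of the inventory.
CONFIDENCE_FLOOR = 0.85
CLAIM_VALUE_CEILING_GBP = Decimal("10000")


def needs_human_review(confidence: float, claim_value_gbp: Decimal) -> bool:
    """Return True when either documented trigger condition holds.

    Hypothetical helper: confidence < 0.85 OR claim_value > GBP 10,000.
    """
    return confidence < CONFIDENCE_FLOOR or claim_value_gbp > CLAIM_VALUE_CEILING_GBP
```

Both comparisons are strict, matching the trigger as written, so a claim of exactly £10,000 at confidence 0.85 would not be routed by this predicate alone.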

Material changes during the period

None. The system configuration, model selection, sub-agent inventory, and tool inventory remained constant throughout the period. Two routine model-provider safety updates were applied (see page 5 for the operational impact of the 17 February update).

Operational Summary — Q1 2026

Throughput

Claims processed: 4,217
Auto-approved: 2,891 · 68.6%
Routed to human review: 1,219 · 28.9%
Rejected by system: 107 · 2.5%

Agent action volume

Total receipts captured: 41,902
Receipts per claim, mean: 9.9
Receipts per claim, p95: 17
Receipts per claim, max: 34

Model usage

LLM calls (primary model): 38,114
LLM calls (fallback model): 1,902
Tool calls: 23,481
Sub-agent invocations: 8,290

Operating cost

Total LLM cost: £2,847.13
Cost per claim, mean: £0.67
Tool execution cost: £412.88
Storage + audit cost: £287.40

Distribution of decisions, by week

Week | Claims | Auto-approved | Human review | Rejected
W1 (1–7 Jan) | 312 | 219 · 70.2% | 83 · 26.6% | 10 · 3.2%
W2 (8–14 Jan) | 301 | 211 · 70.1% | 82 · 27.2% | 8 · 2.7%
W3 (15–21 Jan) | 337 | 240 · 71.2% | 89 · 26.4% | 8 · 2.4%
W4–W7 (22 Jan – 16 Feb) | 1,418 | 1,011 · 71.3% | 372 · 26.2% | 35 · 2.5%
W8 (17–23 Feb) | 329 | 192 · 58.4% | 128 · 38.9% | 9 · 2.7%
W9–W13 (24 Feb – 31 Mar) | 1,520 | 1,018 · 67.0% | 465 · 30.6% | 37 · 2.4%

Week 8 anomaly reflects the decision-drift event of 17 February 2026 — investigated and mitigated within the same week. See page 5.
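
A reviewer may wish to confirm that the weekly rows reconcile to the quarterly throughput figures. The sketch below re-enters the table values by hand and checks the arithmetic; the names (`ROWS`, `QUARTER_TOTALS`, `column_sums`, `row_is_consistent`) are illustrative and are not part of the pack's tooling.

```python
# (claims, auto_approved, human_review, rejected) copied from the weekly table.
ROWS = {
    "W1": (312, 219, 83, 10),
    "W2": (301, 211, 82, 8),
    "W3": (337, 240, 89, 8),
    "W4-W7": (1418, 1011, 372, 35),
    "W8": (329, 192, 128, 9),
    "W9-W13": (1520, 1018, 465, 37),
}

# Quarterly figures from the Throughput block.
QUARTER_TOTALS = (4217, 2891, 1219, 107)


def column_sums(rows: dict) -> tuple:
    """Sum each column across all weekly rows."""
    return tuple(sum(column) for column in zip(*rows.values()))


def row_is_consistent(row: tuple) -> bool:
    """A row is consistent when the three outcomes add up to its claim count."""
    claims, *outcomes = row
    return sum(outcomes) == claims


print(column_sums(ROWS) == QUARTER_TOTALS)                 # True
print(all(row_is_consistent(r) for r in ROWS.values()))    # True
```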

Material Event #1 — Decision-Drift Anomaly

Date detected

17 February 2026, 09:14 UTC

Detection method

Automated decision-drift alert — auto-approval rate fell outside the ±2σ tolerance band relative to the 30-day rolling baseline.
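
For illustration, a tolerance-band check of this kind might be sketched as follows. This is an assumption-laden example, not the deployed alerting logic: the function name `drift_alert`, the use of the sample standard deviation, and the input shape (one auto-approval rate per baseline day) are all hypothetical.

```python
from statistics import mean, stdev


def drift_alert(baseline_rates: list, current_rate: float, k: float = 2.0) -> bool:
    """Return True when current_rate falls outside mean +/- k * stdev of the baseline.

    baseline_rates: assumed to hold one auto-approval rate per day of the
    30-day rolling window. k=2.0 corresponds to the documented 2-sigma band.
    """
    mu = mean(baseline_rates)
    sigma = stdev(baseline_rates)  # sample standard deviation (assumption)
    return abs(current_rate - mu) > k * sigma
```

With a baseline clustered around 0.71, an observed rate of 0.584 lies far outside the band and would raise the alert, whereas 0.712 would not.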

Summary

On 17 February 2026 the system's auto-approval rate dropped from a 30-day rolling baseline of 71.2% to 58.4% over a 6-hour window (09:14 UTC to 15:02 UTC).

Root cause

Investigation determined the primary model provider had silently deployed an updated safety filter to gpt-4o-mini, increasing false-positive rates on a specific document class (vehicle damage photographs containing extensive panel deformation). The model classified affected images as "potentially fraudulent" at a higher rate, suppressing auto-approval.

Mitigation

The increased human-review routing rate acted as the safeguard: affected claims were referred to reviewers instead of being auto-approved, and no incorrect rejection occurred. The drift was reported to the model provider, who confirmed the safety filter change on 18 February 2026. The internal model evaluation suite was re-run, and no change to the routing thresholds was required.

Impact

Evidence excerpt (first affected receipts)

{
  "event_id": "evt_8f3a2b4c1d9e08f4",
  "agent_id": "claims_triage_v3",
  "session_id": "sess_a1b2c3d4e5f6",
  "ts": "2026-02-17T09:14:08Z",
  "actor": { "type": "agent", "id": "fraud_screen" },
  "action": {
    "type": "model_call",
    "name": "gpt-4o-mini",
    "params_hash": "sha256:a1b2c3d4e5f67890abcdef..."
  },
  "resource": {
    "type": "claim",
    "id": "claim_30491",
    "classification": ["financial"]
  },
  "decision": {
    "outcome": "routed_to_human",
    "confidence": 0.59,
    "reason": "image_classifier_flagged_high_severity"
  },
  "prev_hash": "sha256:9c8b7a6e5d4c3b2a190f8e7d6c5b4a39282716...",
  "signature": "MEUCIQD8FaUkM4yzZ..."
}

Chain-of-evidence references

First affected receipt: evt_8f3a2b4c1d9e08f4 (chain position 28,114)
Last affected receipt: evt_c2d8f1a7e9b34521 (chain position 28,332)
Total affected: 219 receipts (chain positions 28,114 – 28,332, contiguous)
Sub-chain integrity: Verified ✓
Notarisation references: 2026-02-17T09:00Z + 2026-02-17T15:00Z (both verified)

Material Events #2 and #3, and Tool Inventory

Material Event #2 — Unauthorised tool-call attempt

Date detected

4 March 2026, 11:43 UTC

On 4 March 2026 at 11:43 UTC, agent ClaimsTriage-v3 attempted to invoke a tool not in its registered allowlist: db_write_customer_record. The attempt was blocked at the SDK level by the standing tool-allowlist policy guard; no data modification occurred. Investigation traced the source to a model hallucination during a low-confidence sub-agent reasoning step.

Receipt: evt_b4a91d3c7e62f88a
Outcome: Tool call denied; error returned to agent; agent re-planned and completed without write
Action taken: Tool registry confirmed correct; no policy change required; incident logged to internal AI safety register
Status: Resolved ✓
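
The behaviour of an allowlist guard of this kind can be sketched as below. This is a simplified illustration, not the SDK's policy guard: `ToolNotAllowed`, `ALLOWLIST`, and `guard_tool_call` are hypothetical names, and the allowlist shown contains only the eight tools named in the inventory excerpt, not all 14 registered tools.

```python
class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its registered allowlist."""


# Subset of registered tools, taken from the tool inventory excerpt.
ALLOWLIST = frozenset({
    "lookup_customer", "lookup_vehicle", "score_damage_photos", "fetch_policy",
    "score_fraud_risk", "notify_reviewer", "notify_customer", "log_decision",
})


def guard_tool_call(tool_name: str, allowlist: frozenset = ALLOWLIST) -> str:
    """Return the tool name if permitted; otherwise refuse before anything executes."""
    if tool_name not in allowlist:
        raise ToolNotAllowed(f"tool {tool_name!r} is not in the registered allowlist")
    return tool_name
```

Because the check runs before dispatch, a request for `db_write_customer_record` raises an error that can be returned to the agent, and no write is attempted.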

Material Event #3 — Data-class escalation

Date detected

22 March 2026, 14:08 UTC

On 22 March 2026 a claim flagged for suspected fraud triggered the data-class escalation policy. The agent attempted to retrieve a customer's banking history (data class: financial_sensitive), which exceeds ClaimsTriage-v3's standing authorisation. The request was escalated to a human reviewer per policy.

Receipt: evt_d7c2a8f4e9b35912
Outcome: Human reviewer authorised retrieval after fraud investigation review; case continues
Action taken: Existing data-class policy operated correctly; no change required
Status: Resolved ✓
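
A data-class escalation rule of this shape can be illustrated with a set comparison. The sketch assumes that the standing authorisation equals the "Data classifications touched" row of the system inventory; `AUTHORISED_CLASSES` and `access_decision` are hypothetical names, and the real policy engine is not shown in this pack.

```python
# Assumed standing authorisation, from the system inventory.
AUTHORISED_CLASSES = frozenset({"PII", "financial", "vehicle_registration"})


def access_decision(requested_classes: set) -> str:
    """Return "allow" when every requested class is authorised, else "escalate_to_human"."""
    if requested_classes <= AUTHORISED_CLASSES:
        return "allow"
    return "escalate_to_human"


print(access_decision({"financial"}))            # allow
print(access_decision({"financial_sensitive"}))  # escalate_to_human
```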

Tool inventory (excerpt — full inventory in machine-readable manifest)

Tool ID | Description | Calls in period | Errors
lookup_customer | Read customer record by ID | 8,114 | 12
lookup_vehicle | Read vehicle registration data | 4,217 | 0
score_damage_photos | ML model: damage severity classification | 3,891 | 41
fetch_policy | Read policy details | 4,217 | 3
score_fraud_risk | Fraud probability scoring | 1,847 | 0
notify_reviewer | Send claim to human reviewer queue | 1,219 | 2
notify_customer | Send customer-facing update | 2,891 | 11
log_decision | Append decision to audit log | 4,217 | 0
… 6 more tools, see manifest …

All tool calls are individually receipted in the chain. Full per-call detail is available in the Parquet receipt archives referenced in the pack manifest (page 8).

Cryptographic Integrity Verification

The integrity of every receipt in this pack has been verified using the three-layer cryptographic protocol documented in the Agent Audit Verification Specification v1.0.

Hash chain

Pack identifier: AA-2026Q1-CT3-0001
Receipts in pack: 41,902
Chain position, first receipt: 14,108
Chain position, last receipt: 56,009
First receipt hash: sha256:7c9d4e8f3a2b1c5d6e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2b...
Last receipt hash: sha256:e8a2c1d4b5c6e7f89a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d...
Chain verification: PASSED ✓ (verified 2026-04-03T09:14:23Z)
Tamper anomalies: None detected
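
The principle behind the hash-chain check can be sketched in a few lines. This is a hedged illustration rather than the procedure in the Agent Audit Verification Specification: it assumes each receipt's hash is SHA-256 over a canonical JSON form (sorted keys, compact separators) and that `prev_hash` carries the previous receipt's hash. The canonical form actually used is defined by the specification and is not reproduced here; `receipt_hash` and `verify_chain` are hypothetical names.

```python
import hashlib
import json
from typing import Optional


def receipt_hash(receipt: dict) -> str:
    """Hash a receipt over an ASSUMED canonical form: sorted keys, compact JSON."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def verify_chain(receipts: list, anchor_hash: str) -> Optional[int]:
    """Return the index of the first broken link, or None if the chain is intact.

    anchor_hash is the hash of the receipt that precedes receipts[0].
    """
    expected_prev = anchor_hash
    for index, receipt in enumerate(receipts):
        if receipt["prev_hash"] != expected_prev:
            return index
        expected_prev = receipt_hash(receipt)
    return None
```

Altering any field of a receipt changes its hash, so the mismatch surfaces at the next receipt's `prev_hash`. The final receipt is protected by the notarised chain head rather than by a successor.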

RFC 3161 notarisation

Notarisation interval: Hourly (chain head)
Notarisations in period: 2,160
Timestamp authority: FreeTSA.org (publicly accessible RFC 3161 TSA)
First timestamp: 2026-01-01T00:00:00Z
Last timestamp: 2026-03-31T23:00:00Z
Timestamp validation: All 2,160 valid ✓
Gaps in notarisation: None
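
The notarisation count follows from the hourly interval: 90 days in the period multiplied by 24 hours gives 2,160 timestamps. The sketch below reproduces that arithmetic and shows one way a gap check could be expressed. The helper names (`expected_hourly_slots`, `find_gaps`) are hypothetical, and validation of the RFC 3161 tokens themselves is out of scope for this example.

```python
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)


def expected_hourly_slots(first: datetime, last: datetime) -> int:
    """Number of hourly timestamps from first to last, both inclusive."""
    return (last - first) // HOUR + 1


def find_gaps(timestamps: list) -> list:
    """Return consecutive (earlier, later) pairs that are more than one hour apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > HOUR]


first = datetime(2026, 1, 1, 0, tzinfo=timezone.utc)
last = datetime(2026, 3, 31, 23, tzinfo=timezone.utc)
print(expected_hourly_slots(first, last))  # 2160
```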

Signing key

Algorithm: ECDSA over secp256r1 (NIST P-256)
Public key fingerprint: a4:b2:8c:9d:e1:f3:7a:5c:9e:4f:8d:6b:2a:1c:0e:7f
Key issued: 14 November 2025
Key rotation due: 14 November 2026
Custody: Customer-held (Sample Co Ltd HSM)
Chain of trust: Self-signed (customer root); public key published 14 Nov 2025

Independent reproducibility

Verification of this pack against the public timestamp chain is independently reproducible by any party with read access to the pack manifest. Two verification paths are supported:

Both paths reproduce the integrity proof above without requiring access to private signing material. An external auditor can independently verify the pack's integrity without any cooperation from Sample Co Ltd or from Agent Audit.

Pack Manifest & Auditor Sign-Off

The machine-readable pack accompanies this document and contains the complete receipt log, integrity proofs, and signing certificates referenced throughout this pack.

File listing

pack-AA-2026Q1-CT3-0001/
├── 00-manifest.json (this pack's machine-readable index)
├── 01-system-inventory.json (Page 3 source data)
├── 02-receipts-2026-01.parquet (January receipts, 14,108 rows)
├── 03-receipts-2026-02.parquet (February receipts, 12,917 rows)
├── 04-receipts-2026-03.parquet (March receipts, 14,877 rows)
├── 05-material-events.json (3 events with full receipt references)
├── 06-tool-inventory.json (full tool registry snapshot)
├── 07-integrity-proof.json (hash chain + notarisation records)
├── 08-signing-certificates.pem (public keys + chain of trust)
└── 99-evidence-pack.pdf (this document)

Total uncompressed size: 187 MB
SHA-256 of manifest: sha256:a4f8d2e1b3c9d7e5f6a8b1c4d2e3f5a6b7c8d9e0f1a2b3c4
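
A recipient can confirm that the manifest they hold matches the published digest by recomputing it locally. The following is a minimal sketch using only the Python standard library; `sha256_of_file` and `manifest_matches` are hypothetical helper names. It assumes the full 64-character digest is available to compare against, since the value printed in the listing above is shorter than a complete SHA-256 hex digest.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return 'sha256:<hex digest>'."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def manifest_matches(path: Path, published: str) -> bool:
    """Compare the locally computed digest with the full published value."""
    return sha256_of_file(path) == published.lower()
```

For example, `manifest_matches(Path("pack-AA-2026Q1-CT3-0001/00-manifest.json"), published_digest)` would return True only if the file is byte-for-byte identical to the one that was hashed at issue.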

How to read this pack

  1. This PDF is the human-readable summary intended for compliance, auditor, and regulator review.
  2. The Parquet files contain the complete receipt log, queryable with any Parquet-aware tool (DuckDB, Apache Spark, Pandas).
  3. The integrity-proof JSON allows independent re-verification of the hash chain and notarisation without contacting Sample Co Ltd or Agent Audit.
  4. Signing certificates support verification of every receipt's signature against the customer-held key.

Auditor sign-off

To be completed by the external auditor on review of this pack.

Auditor name: ____________________________________
Auditing firm: ____________________________________
Audit reference: ____________________________________
Date pack received: ____________________________________
Date reviewed: ____________________________________
Outcome

☐ Accepted as evidence for Article 12 compliance period
☐ Accepted with observations (recorded separately)
☐ Requires clarification — see attached query register

Signature

Auditor signature

Pack generated automatically by Agent Audit at 2026-04-03T09:14:23Z. For verification methodology, see the Agent Audit Trust Centre at trust.agentaudit.co.uk. This format conforms to the proposed "AI Action Evidence Pack" submission standard v1.0.