This document describes a SCITT-based mechanism for creating verifiable records of AI content refusal events. It defines how refusal decisions can be encoded as SCITT Signed Statements, registered with Transparency Services, and verified by third parties using Receipts.¶
This specification provides auditability of refusal decisions that are logged, not cryptographic proof that no unlogged generation occurred. It does not define content moderation policies, classification criteria, or what AI systems should refuse; it addresses only the audit trail mechanism.¶
This note is to be removed before publishing as an RFC.¶
The latest version of this document, along with implementation resources and examples, can be found at [CAP-SRP].¶
Discussion of this document takes place on the SCITT Working Group mailing list (scitt@ietf.org).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 July 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document is NOT a content moderation policy. It does not prescribe what AI systems should or should not refuse to generate, nor does it define criteria for classifying requests as harmful. The mechanism described herein is agnostic to the reasons for refusal decisions; it provides only an interoperable format for recording that such decisions occurred. Policy decisions regarding acceptable content remain the domain of AI providers, regulators, and applicable law.¶
AI systems capable of generating content increasingly implement safety mechanisms to refuse requests deemed harmful, illegal, or policy-violating. However, these refusal decisions typically leave no verifiable audit trail. When a system refuses to generate content, the event vanishes—there is no receipt, no log entry accessible to external parties, and no mechanism for third-party verification.¶
This creates several problems: regulators auditing compliance must rely on provider self-reports; forensic investigators cannot establish the provenance of a claimed refusal; providers cannot substantiate their safety claims to third parties; and parties to legal proceedings lack evidence that a system declined a request.¶
The SCITT architecture [I-D.ietf-scitt-architecture] provides primitives—Signed Statements, Transparency Services, and Receipts—that can address this gap. This document describes how these primitives can be applied to AI refusal events.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document describes: a data model for ATTEMPT, DENY, GENERATE, and ERROR events; the encoding of these events as SCITT Signed Statements; their registration with SCITT Transparency Services; and their verification by third parties using Receipts.¶
This document does NOT define: content moderation policies or classification criteria; which requests AI systems should refuse; or any mechanism for detecting generation events that were never logged.¶
This specification provides auditability of refusal decisions that are logged, not cryptographic proof that no unlogged generation occurred. An AI system that bypasses logging entirely cannot be detected by this mechanism alone. Detection of such bypass requires external enforcement mechanisms (e.g., trusted execution environments, attestation) which are outside the scope of this document.¶
This document uses terminology from [I-D.ietf-scitt-architecture]. The following terms are specific to this profile: an ATTEMPT records that a generation request was received; a DENY records that a request was refused; a GENERATE records that content was produced; an ERROR records that a request failed due to system issues; an Outcome is the DENY, GENERATE, or ERROR event resolving an ATTEMPT; and a Verifiable Refusal Record is an ATTEMPT and DENY pair together with their Receipts.¶
This document focuses on refusal events because successful generation is already observable through content existence and downstream provenance mechanisms (e.g., C2PA manifests, watermarks). Refusal events, by contrast, are negative events that leave no external artifact unless explicitly logged. The GENERATE and ERROR outcomes are defined for completeness invariant verification but are not the primary focus of this specification. This document does not attempt to standardize generation provenance; it focuses solely on refusal events as a complementary profile.¶
This profile maps refusal event concepts directly to SCITT primitives, minimizing new terminology:¶
| This Document | SCITT Primitive |
|---|---|
| ATTEMPT | Signed Statement |
| DENY / GENERATE / ERROR | Signed Statement |
| AI System | Issuer |
| Inclusion Proof | Receipt |
Refusal events are registered with a standard SCITT Transparency Service; this document does not define a separate log type.¶
A regulatory authority investigating AI system compliance needs to verify that a provider's stated content policies are actually enforced. Without verifiable refusal events, the regulator must trust provider self-reports. With this mechanism, regulators can request Receipts for refusal events within a time range, verify ATTEMPT/Outcome completeness for logged events, and confirm refusal decisions are anchored in an append-only log.¶
When investigating whether an AI system refused a specific request, investigators need to establish provenance. A Verifiable Refusal Record (ATTEMPT + DENY + Receipts) demonstrates that a specific request was received, classified as policy-violating, refused, and the refusal was logged.¶
AI service providers may need to demonstrate to stakeholders that safety mechanisms function as claimed. Verifiable refusal events enable statistical reporting on logged refusal rates, third-party verification of safety claims, and auditable proof that specific requests were refused.¶
In legal proceedings concerning AI-generated content, parties may need evidence that a system declined a request. Verifiable Refusal Records provide such evidence, subject to the limitation that they demonstrate logged refusals, not the absence of unlogged generation.¶
This section defines requirements for implementations. To maximize interoperability while allowing implementation flexibility, only the core completeness invariant uses MUST; other requirements use SHOULD or MAY.¶
The completeness invariant is the central requirement of this profile: every logged ATTEMPT event MUST have a corresponding Outcome event (DENY, GENERATE, or ERROR) that references it via its attemptId field.¶
Verifiers SHOULD flag any logged ATTEMPT without a corresponding Outcome as potential evidence of incomplete logging or system failure.¶
This completeness invariant is defined at the event semantics level and applies only to logged events. It cannot detect ATTEMPT events that were never logged. Cryptographic detection of invariant violations depends on the properties of the underlying Transparency Service and verifier logic; it is discussed further in Section 8.¶
This profile does not require Transparency Services to enforce completeness invariants; such checks are performed by verifiers using application-level logic.¶
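The following non-normative Python sketch illustrates such an application-level check. The event retrieval step and the field names (taken from the illustrative data model in this document) are assumptions of the sketch; only the set comparison is essential.¶
OUTCOME_TYPES = {"DENY", "GENERATE", "ERROR"}

def check_completeness(events):
    """Return eventIds of logged ATTEMPTs with no corresponding Outcome."""
    attempts = {e["eventId"] for e in events if e["eventType"] == "ATTEMPT"}
    resolved = {e.get("attemptId") for e in events
                if e["eventType"] in OUTCOME_TYPES}
    # Orphaned ATTEMPTs are flagged as potential evidence of incomplete
    # logging or system failure; they are not proof of unlogged generation.
    return sorted(attempts - resolved)
¶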
To protect against tampering, implementations SHOULD link successive events through the prevHash field and record an eventHash binding each event's content, as illustrated in the sketch below.¶
PrevHash chaining is RECOMMENDED but not required because append-only guarantees are primarily provided by the Transparency Service. PrevHash provides an additional, issuer-local integrity signal that can detect tampering even before Transparency Service registration.¶
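A non-normative sketch of issuer-local chaining follows. The simplified canonical encoding (approximating [RFC8785] for simple payloads) and the choice to compute eventHash over the event with the eventHash field itself excluded are assumptions of the sketch.¶
import hashlib
import json

def canonical(obj):
    # Approximates RFC 8785 (JCS) for payloads of strings, booleans, and
    # integers; a conforming implementation should use a full JCS encoder.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

def chain_event(event, prev_event_hash):
    """Link an event to its predecessor and bind its content."""
    event = dict(event, prevHash=prev_event_hash)
    event["eventHash"] = "sha256:" + hashlib.sha256(canonical(event)).hexdigest()
    return event
¶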
SHA-256 for hashing and Ed25519 for signatures are RECOMMENDED. Other algorithms registered with COSE MAY be used.¶
Refusal events may be triggered by harmful or sensitive content. To avoid the audit log becoming a repository of harmful content, implementations SHOULD store a hash of the prompt rather than the prompt text, store hashes of reference inputs rather than the inputs themselves, and avoid quoting or describing prompt content in refusal reasons.¶
The hash function SHOULD be collision-resistant to prevent an adversary from claiming a benign prompt hashes to the same value as a harmful one.¶
Hashing without salting may be vulnerable to dictionary attacks if an adversary has a list of candidate prompts. Mitigations include access controls on event queries, time-limited retention policies, and monitoring for bulk query patterns. Salting may provide additional protection but introduces complexity; if used, implementations must ensure verification remains possible without requiring disclosure of the salt to third-party verifiers.¶
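A minimal non-normative sketch of promptHash computation, assuming the "sha256:" prefix convention used in the examples in this document:¶
import hashlib

def prompt_hash(prompt):
    # Deterministic, unsalted hashing: verifiable by anyone holding the
    # prompt, but exposed to dictionary attacks as described above.
    return "sha256:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()
¶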
To enable external verification without access to internal systems, implementations SHOULD register Signed Statements with a SCITT Transparency Service, retain the resulting Receipts, and support queries for events and Receipts (for example, by time range or eventId).¶
To maintain audit trail integrity, implementations SHOULD chain events via prevHash, monitor for orphaned ATTEMPTs, and support signing key rotation and revocation.¶
Some operational scenarios may require delayed outcomes; for example, a request escalated for human review (cf. the humanOverride field) may not receive its Outcome until the review completes.¶
Implementations SHOULD document expected latency bounds in their Registration Policy. Extended delays SHOULD trigger monitoring alerts.¶
An implementation conforms to this specification if every logged ATTEMPT event it produces has a corresponding Outcome event (the completeness invariant) and its events are encoded and signed as SCITT Signed Statements.¶
All other requirements (SHOULD, RECOMMENDED, MAY) are guidance for interoperability and security best practices but are not required for conformance.¶
Implementations MAY extend the data model with additional fields provided the core conformance requirements are satisfied.¶
This section defines the data model for ATTEMPT and DENY Signed Statements. These are encoded as JSON payloads. This data model is non-normative; implementations MAY extend or modify these structures provided the conformance requirements in Section 4.6 are satisfied.¶
An ATTEMPT records that a generation request was received:¶
{
"eventType": "ATTEMPT",
"eventId": "019467a1-0001-7000-0000-000000000001",
"timestamp": "2026-01-10T14:23:45.100Z",
"issuer": "urn:example:ai-service:img-gen-prod",
"promptHash": "sha256:7f83b1657ff1fc53b92dc18148a1d65d...",
"inputType": "text+image",
"referenceInputHashes": [
"sha256:9f86d081884c7d659a2feaa0c55ad015..."
],
"sessionId": "019467a1-0001-7000-0000-000000000000",
"actorHash": "sha256:e3b0c44298fc1c149afbf4c8996fb924...",
"modelId": "img-gen-v4.2.1",
"policyId": "content-safety-v2",
"prevHash": "sha256:0000000000000000000000000000000...",
"eventHash": "sha256:a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4..."
}
¶
Field definitions: eventType identifies the event kind ("ATTEMPT"); eventId is a UUIDv7, providing temporal ordering; timestamp records when the request was received; issuer identifies the AI system acting as SCITT Issuer; promptHash is the SHA-256 hash of the prompt text (the prompt itself is not stored); inputType describes the input modalities; referenceInputHashes are hashes of any reference inputs such as images; sessionId links related requests; actorHash is a pseudonymous identifier for the requesting actor; modelId and policyId identify the model and content policy in effect; prevHash is the eventHash of the Issuer's previous event (all zeros for the first event); and eventHash binds the event's content.¶
A DENY records that a request was refused:¶
{
"eventType": "DENY",
"eventId": "019467a1-0001-7000-0000-000000000002",
"timestamp": "2026-01-10T14:23:45.150Z",
"issuer": "urn:example:ai-service:img-gen-prod",
"attemptId": "019467a1-0001-7000-0000-000000000001",
"riskCategory": "POLICY_VIOLATION",
"riskScore": 0.94,
"refusalReason": "Content policy: prohibited category",
"modelDecision": "DENY",
"humanOverride": false,
"prevHash": "sha256:a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4...",
"eventHash": "sha256:e5f6g7h8i9j0e5f6g7h8i9j0e5f6g7h8..."
}
¶
Field definitions: attemptId is the eventId of the ATTEMPT that this outcome resolves; riskCategory and riskScore are implementation-defined classification outputs (see below); refusalReason is a human-readable explanation that SHOULD NOT quote or describe the prompt; modelDecision records the model's decision; and humanOverride indicates whether a human reviewer overrode that decision.¶
This specification does not standardize content moderation categories, risk taxonomies, or refusal reason formats. These are policy decisions that remain the domain of AI providers and applicable regulations.¶
An ERROR records that a request failed due to system issues:¶
{
"eventType": "ERROR",
"eventId": "019467a1-0001-7000-0000-000000000003",
"timestamp": "2026-01-10T14:23:45.200Z",
"issuer": "urn:example:ai-service:img-gen-prod",
"attemptId": "019467a1-0001-7000-0000-000000000001",
"errorCode": "TIMEOUT",
"errorMessage": "Model inference timeout after 30s",
"prevHash": "sha256:e5f6g7h8i9j0e5f6g7h8i9j0e5f6g7h8...",
"eventHash": "sha256:h8i9j0k1l2m3h8i9j0k1l2m3h8i9j0k1..."
}
¶
ERROR events indicate system failures, not policy decisions. A high ERROR rate may indicate operational issues or potential abuse (e.g., adversarial inputs designed to crash the system). Implementations SHOULD monitor ERROR rates and investigate anomalies.¶
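A non-normative sketch of such monitoring follows; the 5% threshold and the notion of a window of events are illustrative assumptions, not part of this profile.¶
def error_rate_alarm(window_events, threshold=0.05):
    """Flag an anomalous ERROR rate within a window of logged events."""
    attempts = sum(1 for e in window_events if e["eventType"] == "ATTEMPT")
    errors = sum(1 for e in window_events if e["eventType"] == "ERROR")
    return attempts > 0 and (errors / attempts) > threshold
¶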
ATTEMPT, DENY, GENERATE, and ERROR events are encoded as SCITT Signed Statements.¶
The JSON payload is canonicalized per [RFC8785] and signed as the COSE_Sign1 payload bytes. This ensures deterministic serialization for signature verification.¶
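The following non-normative Python sketch illustrates the payload preparation and signing steps. For brevity it signs the canonical bytes directly with Ed25519 (via the "cryptography" package); an actual implementation carries the signature in a COSE_Sign1 message, and the simplified canonicalization is an assumption of the sketch.¶
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical(obj):
    # Approximates RFC 8785 (JCS) for simple payloads.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

key = Ed25519PrivateKey.generate()            # Issuer signing key (illustrative)
payload = canonical({"eventType": "DENY", "eventId": "019467a1-..."})
signature = key.sign(payload)                 # in practice: COSE_Sign1 signature
key.public_key().verify(signature, payload)   # raises InvalidSignature on failure
¶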
After creating a Signed Statement, the Issuer SHOULD register it with a SCITT Transparency Service and retain the resulting Receipt as proof of registration.¶
The Transparency Service's Registration Policy MAY verify that required fields are present and timestamps are within acceptable bounds.¶
Registration may fail due to network issues, service unavailability, or policy rejection. Implementations SHOULD implement retry logic with exponential backoff. Persistent registration failures SHOULD be logged locally and trigger operational alerts.¶
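A non-normative sketch of such retry logic; the submit callable, attempt count, and delays are illustrative assumptions.¶
import time

def register_with_retry(submit, max_attempts=5):
    """Retry a Transparency Service registration call with backoff."""
    delay = 1.0
    for _ in range(max_attempts):
        try:
            submit()                      # e.g., POST the Signed Statement
            return True
        except (ConnectionError, TimeoutError):
            time.sleep(delay)
            delay *= 2                    # exponential backoff
    # Persistent failure: log locally and trigger an operational alert.
    return False
¶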
A Transparency Service operating as a refusal event log MAY implement a Registration Policy that validates that required fields are present, that eventType carries one of the defined values, and that timestamps fall within acceptable bounds, as sketched below.¶
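A non-normative sketch of such a Registration Policy check; the required-field set and the 24-hour acceptance window are illustrative assumptions.¶
from datetime import datetime, timedelta, timezone

REQUIRED = {"eventType", "eventId", "timestamp", "issuer", "eventHash"}
EVENT_TYPES = {"ATTEMPT", "DENY", "GENERATE", "ERROR"}
MAX_SKEW = timedelta(hours=24)

def registration_policy(event):
    """Accept a statement only if fields and timestamp bounds check out."""
    if not REQUIRED.issubset(event) or event["eventType"] not in EVENT_TYPES:
        return False
    ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    return abs(datetime.now(timezone.utc) - ts) <= MAX_SKEW
¶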
This profile does not require Transparency Services to enforce completeness invariants. A TS accepting refusal events is not expected to verify that every ATTEMPT has an Outcome; such verification is performed by auditors and verifiers at the application level.¶
A complete Verifiable Refusal Record consists of the ATTEMPT Signed Statement, the corresponding DENY Signed Statement, and a Receipt for each.¶
Verifiers can confirm that a refusal was logged by validating both Receipts and checking the ATTEMPT/DENY linkage. This demonstrates that the refusal decision was recorded in the Transparency Service, but does not prove that no unlogged generation occurred.¶
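A non-normative sketch of this verification; verify_receipt stands in for a SCITT client library's Receipt validation and is an assumption of the sketch.¶
def verify_refusal_record(attempt, deny, verify_receipt):
    """Check Receipts and ATTEMPT/DENY linkage for a refusal record."""
    # 1. Both statements must carry valid Receipts from the TS.
    if not (verify_receipt(attempt) and verify_receipt(deny)):
        return False
    # 2. The DENY must resolve this specific ATTEMPT.
    if deny.get("eventType") != "DENY":
        return False
    if deny.get("attemptId") != attempt.get("eventId"):
        return False
    # Confirms the refusal was logged; does not prove the absence of
    # unlogged generation.
    return True
¶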
For additional assurance, implementations MAY periodically anchor Merkle tree roots to external systems such as public blockchains, multiple independent Transparency Services, or regulatory authority registries. External anchoring provides defense against a compromised Transparency Service.¶
This document has no IANA actions at this time.¶
Future revisions may request registration of media types (e.g., "application/vnd.scitt.refusal-event+json") or establish registries for standardized event type values.¶
This specification assumes the following threat model: the Issuer (AI system) may selectively omit events; the Transparency Service may equivocate; signing keys may be compromised; and adversaries may replay events, mount dictionary attacks against stored hashes, or flood the system with requests. Each of these threats is analyzed below.¶
This specification does NOT protect against an AI system that bypasses logging entirely: ATTEMPTs that are never logged cannot be detected by this mechanism alone, nor can generation performed wholly outside the instrumented pipeline.¶
An adversary controlling the AI system might attempt to omit refusal events to hide policy violations or, conversely, omit GENERATE events to falsely claim content was refused. The completeness invariant provides detection for logged events: auditors can identify ATTEMPT Signed Statements without corresponding Outcomes. Hash chains detect deletion of intermediate events.¶
However, if an ATTEMPT is never logged, this specification cannot detect the omission. Complete prevention of omission attacks is beyond the scope of this specification and would require external enforcement mechanisms such as trusted execution environments, RATS attestation, or real-time external monitoring.¶
A malicious Transparency Service might present different views of the log to different parties (equivocation). For example, it might show auditors a log containing DENY events while providing a different view to other verifiers. Mitigations include cross-checking views among multiple verifiers, registering with multiple independent Transparency Services, and externally anchoring Merkle tree roots.¶
Detection of equivocation requires coordination between verifiers; a single verifier in isolation cannot detect it.¶
A malicious Issuer might maintain separate logs for refusals and generations, showing only the refusal log to auditors. The completeness invariant mitigates this by requiring every logged ATTEMPT to have an Outcome; if the GENERATE outcomes are hidden, auditors will observe orphaned ATTEMPTs.¶
Direct modification of log entries is prevented by cryptographic signatures on Signed Statements, hash chain linking, Merkle tree inclusion proofs in Receipts, and the append-only structure enforced by the Transparency Service.¶
An attacker might attempt to replay old refusal events to inflate refusal statistics or create false alibis. UUID v7 provides temporal ordering, timestamps are verified against Transparency Service registration time, and hash chain sequence numbers detect out-of-order insertion.¶
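As a non-normative illustration, a verifier can recover the millisecond timestamp embedded in a UUIDv7 eventId and compare it against the claimed event timestamp and the Transparency Service registration time:¶
import uuid
from datetime import datetime, timezone

def uuid7_time(event_id):
    """Extract the Unix-millisecond timestamp from a UUIDv7 string."""
    ms = uuid.UUID(event_id).int >> 80   # top 48 bits carry Unix time in ms
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
¶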
If an Issuer's signing key is compromised, an attacker could create fraudulent Signed Statements. Previously signed Signed Statements remain valid. Implementations SHOULD support key rotation and revocation. Transparency Service timestamps provide evidence of when Signed Statements were registered, which can help bound the impact of a compromise.¶
Although prompts are stored as hashes, an adversary with a dictionary of known prompts could attempt to identify which prompt was used by computing hashes and comparing. Mitigations include access controls on event queries, time-limited retention policies, monitoring for bulk query patterns, and rate limiting.¶
Salted hashing may provide additional protection but introduces operational complexity. If salting is used, the salt must be managed such that verification remains possible without disclosing the salt to third parties. This specification does not mandate salting.¶
An attacker could flood the system with generation requests to create a large volume of ATTEMPT Signed Statements, potentially overwhelming the Transparency Service or obscuring legitimate events. Standard rate limiting and access controls at the AI system level can mitigate this. The Transparency Service MAY implement its own admission controls.¶
This profile requires that harmful content not be stored. Prompt text is replaced with PromptHash, reference images are replaced with hashes, and refusal reasons SHOULD NOT quote or describe prompt content in detail. This prevents the audit log from becoming a repository of harmful content.¶
Actor identification creates tension between accountability and privacy. Implementations SHOULD use pseudonymous identifiers (ActorHash) by default, maintain a separate access-controlled mapping from pseudonyms to identities, define clear policies for de-pseudonymization, and support erasure of the mapping while preserving audit integrity (crypto-shredding).¶
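A non-normative sketch of crypto-shredding for the pseudonym mapping, using the "cryptography" package's Fernet recipe; the in-memory dictionaries stand in for access-controlled storage.¶
from cryptography.fernet import Fernet

actor_keys = {}   # actorHash -> per-actor key (access-controlled)
identities = {}   # actorHash -> encrypted identity record

def record_identity(actor_hash, identity):
    key = Fernet.generate_key()
    actor_keys[actor_hash] = key
    identities[actor_hash] = Fernet(key).encrypt(identity.encode())

def erase(actor_hash):
    # Destroying the key renders the mapping unrecoverable, while events
    # referencing actor_hash retain their cryptographic integrity proofs.
    actor_keys.pop(actor_hash, None)
¶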
Event metadata may enable correlation attacks. Timestamps could reveal user activity patterns, SessionIDs link multiple requests, and ModelIDs reveal which AI systems a user interacts with. Implementations SHOULD apply appropriate access controls and MAY implement differential privacy techniques for aggregate statistics.¶
Where personal data protection regulations apply (e.g., GDPR), implementations SHOULD support data subject access requests, erasure requests via crypto-shredding (destroying encryption keys for personal data while preserving cryptographic integrity proofs), and purpose limitation.¶
This section describes potential extensions and research directions that are outside the scope of this specification but may be addressed in future work.¶
Integration with Remote ATtestation procedureS (RATS) [RFC9334] could provide stronger guarantees that the AI system is operating as expected and logging all events. Hardware-backed attestation could reduce the trust assumptions on the Issuer.¶
High-volume AI systems may generate millions of events per day. Future work could explore batching mechanisms, rolling logs, and hierarchical Merkle structures to improve scalability while maintaining verifiability.¶
More sophisticated privacy mechanisms could be explored, including salted or keyed hashing schemes that preserve third-party verifiability and differential privacy techniques for aggregate statistics.¶
These mechanisms would add complexity and are not required for the core auditability goals of this specification.¶
Stronger completeness guarantees could be achieved through external enforcement mechanisms such as trusted execution environments, hardware-backed RATS attestation, and real-time external monitoring.¶
These approaches involve significant architectural changes and are outside the scope of this specification.¶
This appendix illustrates a complete flow from request receipt to Verifiable Refusal Record verification.¶
An auditor verifying the Verifiable Refusal Record validates the Receipt for the ATTEMPT and for the DENY, verifies the Issuer signatures on both Signed Statements, confirms that the DENY's attemptId matches the ATTEMPT's eventId, and optionally checks the prevHash linkage, as sketched below.¶
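The following non-normative sketch composes the earlier examples into the complete flow: hash the prompt, emit a chained ATTEMPT, emit a chained DENY resolving it, and run the auditor's linkage checks (Receipt validation is omitted, and the simplified canonical encoding is an assumption).¶
import hashlib
import json

def canonical(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

def digest(data):
    return "sha256:" + hashlib.sha256(data).hexdigest()

attempt = {"eventType": "ATTEMPT", "eventId": "a-1",
           "promptHash": digest(b"example prompt"),
           "prevHash": "sha256:" + "0" * 64}
attempt["eventHash"] = digest(canonical(attempt))

deny = {"eventType": "DENY", "eventId": "a-2",
        "attemptId": attempt["eventId"], "prevHash": attempt["eventHash"]}
deny["eventHash"] = digest(canonical(deny))

# Auditor's linkage checks:
assert deny["attemptId"] == attempt["eventId"]
assert deny["prevHash"] == attempt["eventHash"]
¶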
This verification confirms that a refusal was logged, but does not prove that no unlogged generation occurred.¶
The authors thank the members of the SCITT Working Group for developing the foundational architecture. This work builds upon the transparency log concepts from Certificate Transparency [RFC6962].¶