ERC-7007 - Verifiable AI-Generated Content Token

Created	2023-05-10
Status	Final
Category	ERC
Type	Standards Track
Authors	Conway (@0x1cc) Lee Ting Ting (@tina1998612) Cathie So (@socathie) Xiaohang Yu (@xhyumiracle) Kartin <kartin at hyperoracle.io>
Requires	EIP-165 EIP-721

Abstract

The verifiable AI-generated content (AIGC) non-fungible token (NFT) standard is an extension of the ERC-721 token standard for AIGC. It proposes a set of interfaces for basic interactions and enumerable interactions for AIGC-NFTs. The standard includes an addAigcData and verify function interface, a new AigcData event, optional Enumerable and Updatable extensions, and a JSON schema for AIGC-NFT metadata. Additionally, it incorporates Zero-Knowledge Machine Learning (zkML) and Optimistic Machine Learning (opML) capabilities to enable verification of AIGC data correctness. In this standard, the tokenId is indexed by the prompt.

Motivation

The verifiable AIGC-NFT standard aims to extend the existing ERC-721 token standard to accommodate the unique requirements of AI-generated content NFTs representing models in a collection. This standard provides interfaces to use zkML or opML to verify whether or not the AIGC data for an NFT is generated from a certain ML model with a certain input (prompt). The proposed interfaces allow for additional functionality related to adding AIGC data, verifying, and enumerating AIGC-NFTs. Additionally, the metadata schema provides a structured format for storing information related to AIGC-NFTs, such as the prompt used to generate the content and the proof of ownership.

This standard supports two primary types of proofs: validity proofs and fraud proofs. In practice, zkML and opML are commonly employed as the prevailing instances for these types of proofs. Developers can choose their preferred ones.

In the zkML scenario, this standard enables model owners to publish their trained model and its ZKP verifier to Ethereum. Any user can claim an input (prompt) and publish the inference task. Any node that maintains the model and the proving circuit can perform the inference and proving, and submit the output of inference and the ZK proof for the inference trace to the verifier. The user that initiates the inference task will own the output for the inference of that model and input (prompt).

In the opML scenario, this standard enables model owners to publish their trained model to Ethereum. Any user can claim an input (prompt) and publish the inference task. Any node that maintains the model can perform the inference and submit the inference output. Other nodes can challenge this result within a predefined challenge period. At the end of the challenge period, the user can verify that they own the output for the inference of that model and prompt, and update the AIGC data as needed.

This capability is especially beneficial for AI model authors and AI content creators seeking to capitalize on their creations. With this standard, every input prompt and its resulting content can be securely verified on the blockchain. This opens up opportunities for implementing revenue-sharing mechanisms for all AI-generated content (AIGC) NFT sales. AI model authors can now share their models without concerns that open-sourcing will diminish their financial value.

An example workflow of a zkML AIGC NFT project compliant with this proposal is as follows:

zkML Suggested Workflow

There are 4 components in this workflow:

ML model - contains weights of a pre-trained model; given an inference input, generates the output
zkML prover - given an inference task with input and output, generates a ZK proof
AIGC-NFT smart contract - contract compliant with this proposal, with full ERC-721 functionalities
Verifier smart contract - implements a verify function, given an inference task and its ZK proof, returns the verification result as a boolean

Specification

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Every compliant contract must implement the IERC7007, ERC721, and ERC165 interfaces.

The verifiable AIGC-NFT standard includes the following interfaces:

IERC7007: Defines an addAigcData function and an AigcData event for adding AIGC data to AIGC-NFTs. Defines a verify function to check the validity of the combination of prompt and aigcData using zkML/opML techniques.

pragma solidity ^0.8.18;

/**
 * @dev Required interface of an ERC7007 compliant contract.
 * Note: the ERC-165 identifier for this interface is 0x702c55a6.
 */
interface IERC7007 is IERC165, IERC721 {
    /**
     * @dev Emitted when `tokenId` token's AIGC data is added.
     */
    event AigcData(
        uint256 indexed tokenId,
        bytes indexed prompt,
        bytes indexed aigcData,
        bytes proof
    );

    /**
     * @dev Add AIGC data to token at `tokenId` given `prompt`, `aigcData`, and `proof`.
     */
    function addAigcData(
        uint256 tokenId,
        bytes calldata prompt,
        bytes calldata aigcData,
        bytes calldata proof
    ) external;

    /**
     * @dev Verify the `prompt`, `aigcData`, and `proof`.
     */
    function verify(
        bytes calldata prompt,
        bytes calldata aigcData,
        bytes calldata proof
    ) external view returns (bool success);
}

Optional Extension: Enumerable

The enumeration extension is OPTIONAL for ERC-7007 smart contracts. This allows your contract to publish its full list of mapping between tokenId and prompt and make them discoverable.

pragma solidity ^0.8.18;

/**
 * @title ERC7007 Token Standard, optional enumeration extension
 * Note: the ERC-165 identifier for this interface is 0xfa1a557a.
 */
interface IERC7007Enumerable is IERC7007 {
    /**
     * @dev Returns the token ID given `prompt`.
     */
    function tokenId(bytes calldata prompt) external view returns (uint256);

    /**
     * @dev Returns the prompt given `tokenId`.
     */
    function prompt(uint256 tokenId) external view returns (string calldata);
}

Optional Extension: Updatable

The updatable extension is OPTIONAL for ERC-7007 smart contracts. This allows your contract to update a token's aigcData in the case of opML, where aigcData content might change over the challenge period.

pragma solidity ^0.8.18;

/**
 * @title ERC7007 Token Standard, optional updatable extension
 * Note: the ERC-165 identifier for this interface is 0x3f37dce2.
 */
interface IERC7007Updatable is IERC7007 {
    /**
     * @dev Update the `aigcData` of `prompt`.
     */
    function update(
        bytes calldata prompt,
        bytes calldata aigcData
    ) external;

    /**
     * @dev Emitted when `tokenId` token is updated.
     */
    event Update(
        uint256 indexed tokenId,
        bytes indexed prompt,
        bytes indexed aigcData
    );
}

ERC-7007 Metadata JSON Schema for reference

{
  "title": "AIGC Metadata",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "Identifies the asset to which this NFT represents"
    },
    "description": {
      "type": "string",
      "description": "Describes the asset to which this NFT represents"
    },
    "image": {
      "type": "string",
      "description": "A URI pointing to a resource with mime type image/* representing the asset to which this NFT represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
    },
    "prompt": {
      "type": "string",
      "description": "Identifies the prompt from which this AIGC NFT generated"
    },
    "aigc_type": {
      "type": "string",
      "description": "image/video/audio..."
    },
    "aigc_data": {
      "type": "string",
      "description": "A URI pointing to a resource with mime type image/* representing the asset to which this AIGC NFT represents."
    },
    "proof_type": {
      "type": "string",
      "description": "validity (zkML) or fraud (opML)"
    }
  }
}

ML Model Publication

While this standard does not describe the Machine Learning model publication stage, it is natural and recommended to publish the commitment of the Model to Ethereum separately, before any actual addAigcData actions. The model commitment schema choice lies on the AIGC-NFT project issuer party. The commitment should be checked inside the implementation of the verify function.

Rationale

Unique Token Identification

This specification sets the tokenId to be the hash of its corresponding prompt, creating a deterministic and collision-resistant way to associate tokens with their unique content generation parameters. This design decision ensures that the same prompt (which corresponds to the same AI-generated content under the same model seed) cannot be minted more than once, thereby preventing duplication and preserving the uniqueness of each NFT within the ecosystem.

Generalization to Different Proof Types

This specification accommodates two proof types: validity proofs for zkML and fraud proofs for opML. Function arguments in addAigcData and verify are designed for generality, allowing for compatibility with both proof systems. Moreover, the specification includes an updatable extension that specifically serves the requirements of opML.

`verify` interface

We specify a verify interface to enforce the correctness of aigcData. It is defined as a view function to reduce gas cost. verify should return true if and only if aigcData is finalized in both zkML and opML. In zkML, it must verify the ZK proof, i.e. proof; in opML, it must make sure that the challenging period is finalized, and that the aigcData is up-to-date, i.e. has been updated after finalization. Additionally, proof can be empty in opML.

`addAigcData` interface

We specify an addAigcData interface to bind the prompt and aigcData with tokenId. This function provides flexibility for different minting implementations. Notably, it acts differently in zkML and opML cases. In zkML, addAigcData should make sure verify returns true. While in opML, it can be called before finalization. The consideration here is that, limited by the proving difficulty, zkML usually targets simple model inference tasks in practice, making it possible to provide a proof within an acceptable time frame. On the other hand, opML enables large model inference tasks, with a cost of longer confirmation time to achieve the approximate same security level. Mint until opML finalization may not be the best practice considering the existing optimistic protocols.

Naming Choice on `update`

We adopt "update" over "finalize" because a successful challenge happens rarely in practice. Using update could avoid calling it for every tokenId and save gas.

Backwards Compatibility

This standard is backward compatible with the ERC-721 as it extends the existing functionality with new interfaces.

Test Cases

The reference implementation includes sample implementations of the ERC-7007 interfaces under contracts/ and corresponding unit tests under test/. This repo can be used to test the functionality of the proposed interfaces and metadata schema.

Reference Implementation

ERC-7007 for zkML and opML
ERC-7007 Enumerable Extension

Security Considerations

Frontrunning Risk

To address the risk of frontrunning, where an actor could potentially observe and preemptively claim a prompt during the minting process, implementers of this proposal must incorporate a secure prompt-claiming mechanism. Implementations could include time-locks, commit-reveal schemes, or other anti-frontrunning techniques to ensure equitable and secured claim processes for AIGC-NFTs.

AIGC Data Change During Challenge Period

In the opML scenario, it is important to consider that the aigcData might change during the challenge period due to disputes or updates. The updatable extension defined here provides a way to handle these updates. Implementations must ensure that updates to aigcData are treated as critical state changes that require adherence to the same security and validation protocols as the initial minting process. Indexers should always check for any Update event emission.