Getting Started

Your first evaluation in under five minutes

CriteriaBot lets you define plain-English criteria and evaluate any content against them via a single API call. This guide walks you from zero to a working integration.

  1. Get your API token

    Create a free account, then open Settings → API Tokens and generate a token. Copy it somewhere safe right away, because you won't be able to see it again.

    shell
    # Store your token in an environment variable
    export CRITERIA_BOT_API_TOKEN="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
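
    Before making any calls, it can help to confirm that the variable is actually set in the current shell. A minimal sketch follows; the function name `require_token` is hypothetical and not part of CriteriaBot.

    ```shell
    # Hypothetical helper: fail early if the token variable is unset or empty.
    require_token() {
      if [ -z "${CRITERIA_BOT_API_TOKEN:-}" ]; then
        echo "CRITERIA_BOT_API_TOKEN is not set" >&2
        return 1
      fi
    }
    ```

    Calling `require_token` at the top of a script stops it before any request is sent without credentials.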

  2. Create a criterion

    A criterion is a plain-English statement that describes what you're looking for. When CriteriaBot evaluates content, it asks its AI models whether the content meets the criterion. The verdict is true if it does and false if it doesn't.

    POST /v1/criteria
    curl https://api.criteriabot.io/v1/criteria \
      -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Overly direct guidance",
        "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
      }'

    Response
    {
      "id": "<criterion_id>",
      "name": "Overly direct guidance",
      "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
    }
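
    You will likely want the returned id in later steps. The following is a minimal sketch of extracting it with jq. It assumes jq is installed (it is not part of CriteriaBot) and that the response has the shape shown above. The function name `extract_criterion_id` and the id `crit_123` are made up for illustration.

    ```shell
    # Hypothetical helper: read a criterion response on stdin and print its id.
    # jq -e exits non-zero if .id is missing or null, so failures are not silent.
    extract_criterion_id() {
      jq -er '.id'
    }

    # Example with a canned response; in practice, pipe the curl output in instead.
    response='{"id": "crit_123", "name": "Overly direct guidance"}'
    criterion_id=$(printf '%s' "$response" | extract_criterion_id)
    echo "$criterion_id"
    ```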

  3. Run your first evaluation

    Pass your content and one or more criteria to POST /v1/evaluations. CriteriaBot routes each request through the Arbiter, a multi-model consensus engine that evaluates the content against each criterion independently.

    POST /v1/evaluations (Arbiter)
    curl https://api.criteriabot.io/v1/evaluations \
      -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "issuer": "arbiter",
        "criteria": [{ "name": "Overly direct guidance" }],
        "content": {
          "body": "Set the levers right, left, left, then wind the mainspring twice and the door opens."
        }
      }'

    Response
    {
      "state": "completed",
      "verdicts": [
        {
          "criterion_id": "<criterion_id>",
          "meets_criterion": true
        }
      ]
    }
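
    Hand-written JSON inside single quotes becomes fragile once the content itself contains quotes or newlines. One option, sketched here, is to let jq build the request body. This assumes jq 1.6 or later is installed and that criteria are referenced by name, as in the request above. `build_evaluation_payload` is a hypothetical helper name.

    ```shell
    # Hypothetical helper: build an evaluation request body.
    # Usage: build_evaluation_payload CONTENT CRITERION_NAME [CRITERION_NAME...]
    build_evaluation_payload() {
      content="$1"
      shift
      # --arg and --args pass values as JSON strings, so quotes are escaped safely.
      jq -cn --arg content "$content" \
        '{issuer: "arbiter",
          criteria: ($ARGS.positional | map({name: .})),
          content: {body: $content}}' \
        --args "$@"
    }

    payload=$(build_evaluation_payload 'She said "pull the left lever".' "Overly direct guidance")
    echo "$payload"
    ```

    The result can then be passed to curl with `-d "$payload"` in place of the inline JSON.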

  4. Understand the response

    Every evaluation response has the same shape:

    state

    completed once the Arbiter has finished, evaluating while the evaluation is still in progress, or failed if something went wrong.

    verdicts

    One entry per criterion. In each entry, meets_criterion is true if the content met the criterion and false if it did not. Each verdict also includes the criterion_id, so you know exactly which criterion it refers to.
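
    Both fields can be handled with a short script. The following is a sketch rather than an official client. It assumes jq is installed and that the response matches the shape shown in step 3. The function name `summarize_evaluation` and the id `crit_123` are made up for illustration.

    ```shell
    # Hypothetical helper: read an evaluation response on stdin and summarize it.
    # Returns 0 when completed, 2 while still evaluating, 1 on failure or unknown state.
    summarize_evaluation() {
      response=$(cat)
      state=$(printf '%s' "$response" | jq -r '.state')
      case "$state" in
        completed)
          # One line per verdict: "<criterion_id> <true|false>"
          printf '%s' "$response" |
            jq -r '.verdicts[] | "\(.criterion_id) \(.meets_criterion)"'
          ;;
        evaluating)
          echo "still evaluating" >&2
          return 2
          ;;
        *)
          echo "evaluation did not complete (state: $state)" >&2
          return 1
          ;;
      esac
    }

    # Example with a canned response.
    sample='{"state":"completed","verdicts":[{"criterion_id":"crit_123","meets_criterion":true}]}'
    printf '%s' "$sample" | summarize_evaluation
    ```

    The distinct return codes let a calling script tell "not finished yet" apart from "failed". How to re-fetch an evaluation that is still in progress is not covered in this guide.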

  5. Go further

Ready to dive deeper?

The full API reference documents every endpoint, schema, and parameter.

View API Reference →