Getting Started

Your first evaluation in under five minutes

CriteriaBot lets you define plain-English criteria and evaluate any content against them via a single API call. This guide walks you from zero to a working integration.

1
Get your API token
Create a free account, then open Settings → API Tokens and generate a token. Copy it somewhere safe: you won't be able to see it again.
```shell
# Store your token in an environment variable
export CRITERIA_BOT_API_TOKEN="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
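In application code you would typically read the token back from that variable rather than hard-coding it. The following is a minimal Python sketch; the helper name `auth_headers` is hypothetical, while the variable name and header format are taken from the curl examples in this guide.

```python
import os


def auth_headers(env=os.environ):
    """Build the headers used by the curl examples in this guide.

    `auth_headers` is an illustrative helper, not part of any CriteriaBot SDK.
    """
    token = env.get("CRITERIA_BOT_API_TOKEN")
    if not token:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError("CRITERIA_BOT_API_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Passing a plain dict as `env` makes the helper easy to exercise without touching your real environment.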

2
Create a criterion

A criterion is a plain-English statement that describes what you're looking for. When CriteriaBot evaluates content, it asks the AI models whether the content meets this criterion: true if it does, false if it doesn't.

POST /v1/criteria

```shell
curl https://api.criteriabot.io/v1/criteria \
  -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Overly direct guidance",
    "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
  }'
```

```json
{
  "id": "<criterion_id>",
  "name": "Overly direct guidance",
  "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
}
```
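If you would rather make this call from Python, the same request can be prepared with the standard library. This is a sketch under assumptions: the function name `build_create_criterion_request` is hypothetical, and the URL, headers, and fields simply mirror the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.criteriabot.io/v1"  # base URL taken from the curl example


def build_create_criterion_request(name, body, token):
    """Prepare (but do not send) the POST /v1/criteria request.

    Illustrative helper; it mirrors the curl call and nothing more.
    """
    payload = json.dumps({"name": name, "body": body}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/criteria",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; that step needs a valid token and network access, so it is left out of the sketch.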

3
Run your first evaluation

Pass your content and one or more criteria to POST /v1/evaluations. CriteriaBot routes each request through the Arbiter, a multi-model consensus engine that evaluates the content against each criterion independently.

POST /v1/evaluations Arbiter

```shell
curl https://api.criteriabot.io/v1/evaluations \
  -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "issuer": "arbiter",
    "criteria": [{ "name": "Overly direct guidance" }],
    "content": {
      "body": "Set the levers right, left, left, then wind the mainspring twice and the door opens."
    }
  }'
```

```json
{
  "state": "completed",
  "verdicts": [
    {
      "criterion_id": "<criterion_id>",
      "meets_criterion": true
    }
  ]
}
```
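Hand-writing the JSON body gets error-prone once you evaluate against several criteria. A small Python sketch of a payload builder follows; the function name `build_evaluation_payload` is hypothetical, and the field names are copied from the curl example above.

```python
import json


def build_evaluation_payload(content_body, criterion_names, issuer="arbiter"):
    """Return the JSON body for POST /v1/evaluations as a string.

    Illustrative helper; criteria are referenced by name, as in the curl example.
    """
    if not criterion_names:
        # The endpoint expects one or more criteria.
        raise ValueError("at least one criterion is required")
    payload = {
        "issuer": issuer,
        "criteria": [{"name": name} for name in criterion_names],
        "content": {"body": content_body},
    }
    return json.dumps(payload)
```

The returned string can be sent as the request body with the same headers as the curl call.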

4
Understand the response

Every evaluation response has the same shape:

state
completed once the Arbiter has finished, evaluating while the evaluation is in progress, or failed if something went wrong.

verdicts
One entry per criterion. meets_criterion is true when the content matched the criterion and false when it did not. Each verdict also returns the criterion_id, so you know exactly which criterion it refers to.
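To act on a response in code, it helps to flatten the verdicts into a lookup keyed by criterion. A minimal Python sketch follows; the function name `verdicts_by_criterion` and the sample id `c1` are made up for illustration, and the field names come from the response shown earlier.

```python
import json


def verdicts_by_criterion(response_text):
    """Map each criterion_id to its meets_criterion boolean.

    Raises if the evaluation is not completed, since verdicts are only
    shown here for a completed evaluation.
    """
    evaluation = json.loads(response_text)
    state = evaluation["state"]
    if state != "completed":
        raise RuntimeError(f"evaluation is not complete (state: {state})")
    return {v["criterion_id"]: v["meets_criterion"] for v in evaluation["verdicts"]}


# "c1" is a placeholder id, not a real criterion.
sample = '{"state": "completed", "verdicts": [{"criterion_id": "c1", "meets_criterion": true}]}'
print(verdicts_by_criterion(sample))  # {'c1': True}
```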
5
Go further

Batch evaluations

Send up to 25 evaluations in a single request with POST /v1/evaluations/batch.
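If you have more than 25 evaluations queued up, you will need to split them across several batch requests. The request body for the batch endpoint is not shown in this guide, so the Python sketch below covers only the splitting step; `chunk_evaluations` is a hypothetical helper name.

```python
MAX_BATCH_SIZE = 25  # limit stated in this guide


def chunk_evaluations(evaluations, size=MAX_BATCH_SIZE):
    """Split a list of evaluation payloads into lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [evaluations[i:i + size] for i in range(0, len(evaluations), size)]
```

Each resulting chunk would then go out as one batch request.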

Async evaluations

Fire-and-forget with POST /v1/evaluations/async, then poll for results or receive them via webhook.
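A polling loop only needs the state values described earlier in this guide. The sketch below is written in Python under one loud assumption: how you fetch an evaluation's current state is not covered here, so `fetch` is a caller-supplied function, and `wait_for_evaluation` is a hypothetical helper name.

```python
import time


def wait_for_evaluation(fetch, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Call `fetch()` until the evaluation reports "completed" or "failed".

    `fetch` is supplied by the caller and must return the evaluation as a dict
    with a "state" key; the endpoint it calls is outside this sketch.
    """
    for attempt in range(max_attempts):
        evaluation = fetch()
        state = evaluation["state"]
        if state == "completed":
            return evaluation
        if state == "failed":
            raise RuntimeError("evaluation failed")
        if attempt < max_attempts - 1:
            # Still in progress: wait before asking again.
            sleep(interval)
    raise TimeoutError("evaluation still in progress after max_attempts polls")
```

Capping the number of attempts keeps a stuck evaluation from blocking the caller forever; a webhook avoids polling altogether.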

Criteria groups

Bundle related criteria into a group and reference the whole set by a single group id.

Ready to dive deeper?

The full API reference documents every endpoint, schema, and parameter.

View API Reference →
