Getting Started
Your first evaluation in under five minutes
CriteriaBot lets you define plain-English criteria and evaluate any content against them via a single API call. This guide walks you from zero to a working integration.
1. Get your API token
Create a free account, then open Settings → API Tokens and generate a token. Copy it somewhere safe - you won't be able to see it again.
    # Store your token in an environment variable
    export CRITERIA_BOT_API_TOKEN="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
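If you call the API from Python rather than curl, a small helper can read the token from the same environment variable and fail early when it is missing. This is a minimal sketch: the function name `auth_headers` is hypothetical, and the header layout simply mirrors the curl examples in this guide.

```python
import os


def auth_headers(env=os.environ):
    """Build the headers used by the curl examples in this guide.

    `auth_headers` is a hypothetical helper, not part of any CriteriaBot SDK.
    """
    token = env.get("CRITERIA_BOT_API_TOKEN", "").strip()
    if not token:
        # Fail before any request is made, rather than sending an empty token.
        raise RuntimeError("CRITERIA_BOT_API_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Passing the environment in as a parameter keeps the helper easy to exercise with a plain dictionary in tests.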
2. Create a criterion
A criterion is a plain-English statement that describes what you're looking for. When CriteriaBot evaluates content, it asks the AI models whether the content meets this criterion: true if it does, false if it doesn't.

POST /v1/criteria

    curl https://api.criteriabot.io/v1/criteria \
      -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Overly direct guidance",
        "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
      }'

Response:

    {
      "id": "<criterion_id>",
      "name": "Overly direct guidance",
      "body": "Helps the reader solve a puzzle beyond indirect hints or gentle guidance."
    }
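The same request can be sketched in Python using only the standard library. The URL, headers, and JSON fields are taken from the curl example above; the function names `build_criterion_request` and `create_criterion` are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

API_BASE = "https://api.criteriabot.io/v1"


def build_criterion_request(name, body, token):
    """Prepare (but do not send) the POST /v1/criteria request."""
    payload = json.dumps({"name": name, "body": body}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/criteria",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_criterion(name, body, token):
    """Send the request and return the decoded JSON response."""
    request = build_criterion_request(name, body, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_criterion("Overly direct guidance", "...", token)` should return a dictionary shaped like the response shown above, from which you can read the `id` field.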
3. Run your first evaluation
Pass your content and one or more criteria to POST /v1/evaluations. CriteriaBot routes each request through the Arbiter, a multi-model consensus engine that evaluates the content against each criterion independently.

POST /v1/evaluations (Arbiter)

    curl https://api.criteriabot.io/v1/evaluations \
      -H "Authorization: Bearer $CRITERIA_BOT_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "issuer": "arbiter",
        "criteria": [{ "name": "Overly direct guidance" }],
        "content": {
          "body": "Set the levers right, left, left, then wind the mainspring twice and the door opens."
        }
      }'

Response:

    {
      "state": "completed",
      "verdicts": [
        { "criterion_id": "<criterion_id>", "meets_criterion": true }
      ]
    }
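Because an evaluation accepts one or more criteria, it can help to build the request body programmatically. The sketch below assembles the same JSON structure as the curl example; `build_evaluation_payload` is a hypothetical helper, and referencing criteria by `name` follows the example above rather than a documented schema.

```python
import json


def build_evaluation_payload(content_body, criterion_names, issuer="arbiter"):
    """Assemble the JSON body for POST /v1/evaluations.

    Hypothetical helper: field names are copied from the curl example.
    """
    if not criterion_names:
        raise ValueError("at least one criterion is required")
    return {
        "issuer": issuer,
        "criteria": [{"name": name} for name in criterion_names],
        "content": {"body": content_body},
    }


payload = build_evaluation_payload(
    "Set the levers right, left, left, then wind the mainspring twice and the door opens.",
    ["Overly direct guidance"],
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be sent as the request body with the same Authorization and Content-Type headers used in the curl call.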
4. Understand the response
Every evaluation response has the same shape:
state: completed once the Arbiter has finished. Returns evaluating while in progress, failed if something went wrong.
verdicts: One entry per criterion. In each entry, meets_criterion is true if the content matched the criterion and false if it did not. Each verdict also returns the criterion_id so you know exactly which criterion it refers to.
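The three states described above suggest a simple handling pattern: return the verdicts when the state is completed, raise on failed, and retry while evaluating. The sketch below is one way to write that in Python. This guide does not show how to re-fetch an in-progress evaluation, so `fetch` is a caller-supplied, hypothetical function that returns the latest evaluation response as a dictionary; `summarize` and `wait_for_verdicts` are likewise illustrative names.

```python
import time


def summarize(evaluation):
    """Return {criterion_id: meets_criterion}, or None while still evaluating."""
    state = evaluation.get("state")
    if state == "failed":
        raise RuntimeError("evaluation failed")
    if state != "completed":
        return None  # treat anything else as still in progress
    return {v["criterion_id"]: v["meets_criterion"] for v in evaluation["verdicts"]}


def wait_for_verdicts(fetch, attempts=10, delay=1.0, sleep=time.sleep):
    """Call `fetch` until the evaluation completes or attempts run out.

    `fetch` is an assumption: any zero-argument callable that returns the
    current evaluation response as a dict.
    """
    for _ in range(attempts):
        result = summarize(fetch())
        if result is not None:
            return result
        sleep(delay)
    raise TimeoutError("evaluation did not complete in time")
```

Keying the result by `criterion_id` makes it straightforward to look up the verdict for a specific criterion when several are evaluated at once.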
5. Go further
Ready to dive deeper?
The full API reference documents every endpoint, schema, and parameter.