Eval Cookbooks

Here are some cookbooks we’ve prepared to make it easy to set up and run evals using Athina.

Run a preset eval: This cookbook shows you how to run a single eval on your dataset.
Run an eval suite: This cookbook shows you how to run a suite of evals.
Run an experiment: This cookbook shows you how to run an eval using Athina and also log the experiment configuration.

This is very similar to the first cookbook (“Run a preset eval”), but you also define an AthinaExperiment object, so the experiment is logged to your develop dashboard along with its metadata and experiment parameters (such as the prompt).

Run an eval with a custom grading criteria

Custom grading criteria are the easiest way to create your own eval. These evals take the format: “If X, then fail. Otherwise, pass.” The criteria are wrapped inside our CoT (chain-of-thought) prompt, which enforces a JSON output of pass / fail along with a reason. This approach is best suited to very simple conditional evals.
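To make the “If X, then fail. Otherwise, pass” pattern concrete, here is a minimal, library-free sketch of how such a criterion might be wrapped in a prompt and how the JSON verdict might be validated. It does not use the Athina SDK and does not reproduce Athina's actual CoT prompt: the template text, the function names `build_grading_prompt` and `parse_grading_output`, and the `result` / `reason` JSON keys are all assumptions made for illustration.

```python
import json

# Hypothetical wrapper prompt. Athina's real CoT prompt is not shown in this
# document, so this template is only an illustrative stand-in.
PROMPT_TEMPLATE = (
    "You are grading a response against a criterion.\n"
    "Criterion: {criteria}\n"
    "Response: {response}\n"
    "Think step by step, then reply with JSON only: "
    '{{"result": "Pass" or "Fail", "reason": "<short explanation>"}}'
)


def build_grading_prompt(criteria: str, response: str) -> str:
    """Wrap an 'If X, then fail. Otherwise, pass.' criterion in the template."""
    return PROMPT_TEMPLATE.format(criteria=criteria, response=response)


def parse_grading_output(raw: str) -> dict:
    """Validate the grader's JSON reply (assumed keys: 'result', 'reason')."""
    data = json.loads(raw)
    result = str(data.get("result", "")).strip().lower()
    if result not in {"pass", "fail"}:
        raise ValueError(f"Expected 'Pass' or 'Fail', got: {data.get('result')!r}")
    return {"passed": result == "pass", "reason": str(data.get("reason", ""))}


criteria = "If the response mentions pricing, then fail. Otherwise, pass."
prompt = build_grading_prompt(criteria, "Our plan costs $10 per month.")

# In practice the prompt would be sent to an LLM; here the reply is hard-coded.
verdict = parse_grading_output('{"result": "Fail", "reason": "Mentions pricing."}')
```

Keeping the criterion to a single condition, as in this sketch, is what makes a strict pass / fail output with a short reason practical to enforce and parse.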
