AthinaExperiment object, so the experiments will be logged to your develop dashboard, along with the metadata and experiment parameters (such as the prompt).
Writing custom grading criteria is the easiest way to create your own eval.
These evals take the format: “If X, then fail. Otherwise, pass.”
The criteria are wrapped inside our CoT (chain-of-thought) prompt, which enforces a JSON output of pass / fail along with a reason.
This approach is best suited to very simple conditional evals (like the one below).
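To make the wrapping step more concrete, here is a minimal Python sketch of how a criterion of the form “If X, then fail. Otherwise, pass” might be placed inside a chain-of-thought prompt, and how a JSON verdict could be validated afterwards. The template text, the helper names `build_grading_prompt` and `parse_grading_output`, and the JSON keys `result` and `reason` are all assumptions made for illustration; they are not Athina's actual prompt or API.

```python
import json

# Hypothetical template: the real CoT prompt used by the library is not
# shown in this document, so this wording is only an assumption.
COT_TEMPLATE = (
    "You are grading a response against a criterion.\n"
    "Criterion: {criteria}\n"
    "Response: {response}\n"
    "Think step by step, then answer with JSON only, in the form "
    '{{"result": "pass" | "fail", "reason": "<short explanation>"}}'
)


def build_grading_prompt(criteria: str, response: str) -> str:
    """Wrap an 'If X, then fail. Otherwise, pass.' criterion in the template."""
    return COT_TEMPLATE.format(criteria=criteria, response=response)


def parse_grading_output(raw: str) -> dict:
    """Validate the model's JSON verdict and normalise it to a small dict."""
    data = json.loads(raw)
    result = str(data.get("result", "")).strip().lower()
    if result not in ("pass", "fail"):
        raise ValueError(f"expected 'pass' or 'fail', got {result!r}")
    return {"passed": result == "pass", "reason": str(data.get("reason", ""))}


# Example criterion in the simple conditional format described above.
criteria = (
    "If the response tells the user to contact customer support, "
    "then fail. Otherwise, pass."
)
prompt = build_grading_prompt(criteria, "Please contact customer support.")

# A hand-written stand-in for what a model might return for this prompt.
verdict = parse_grading_output(
    '{"result": "fail", "reason": "The response defers to customer support."}'
)
```

Rejecting any verdict other than `pass` or `fail` keeps the eval strictly binary, which is one reason this style fits simple conditional checks better than nuanced, multi-factor judgments.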