There are a few ways to run evals using Athina:

Running evals in Athina IDE

For a more comprehensive video guide on running evals in Athina IDE, see this guide.

Running evals programmatically using the Python SDK

Here’s a 1-minute video tutorial showing how to run pre-built evals and view the results on the dashboard.

The easiest way to get started is to use one of our Example Notebooks as a starting point.

For more detailed guides, you can follow the links below to get started running evals using Athina.

Configure evals to run continuously on Production Traces

If you configure evaluations in the dashboard, they will run automatically against all logged inferences that match your filters.

Note: Logs may be sampled to ensure that evaluations run within your configured limits. You can adjust these limits in the Settings page.

Note: Continuous evaluation is only available on paid plans. Contact us to upgrade your plan.
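To make the interaction between filters and sampling concrete, here is a minimal sketch of the general idea: keep only the logs that match a filter, then sample down to a limit. This is an illustration only, not Athina's implementation. The function name `select_logs_for_eval`, the `environment` field, and the `max_evals` limit are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional

Log = Dict[str, str]


def select_logs_for_eval(
    logs: List[Log],
    matches_filters: Callable[[Log], bool],
    max_evals: int,
    seed: Optional[int] = None,
) -> List[Log]:
    """Filter logs, then randomly sample so at most max_evals are evaluated.

    Hypothetical helper: it illustrates filter-then-sample and does not
    reflect how the platform actually selects logs.
    """
    eligible = [log for log in logs if matches_filters(log)]
    if len(eligible) <= max_evals:
        # Under the limit: every matching log is evaluated.
        return eligible
    # Over the limit: evaluate a random subset of the matching logs.
    return random.Random(seed).sample(eligible, max_evals)


# Example data with a hypothetical "environment" field.
logs = [
    {"id": str(i), "environment": "production" if i % 2 == 0 else "staging"}
    for i in range(10)
]
selected = select_logs_for_eval(
    logs,
    matches_filters=lambda log: log["environment"] == "production",
    max_evals=3,
    seed=0,
)
```

In this sketch, raising the limit means more of the matching logs are evaluated, and narrowing the filter means fewer logs compete for the same limit.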

Running evals as guardrails around inference using athina.guard()

This is useful if you want to run evaluations at inference time to block bad user queries or bad responses.

Keep in mind that guard evaluations run at inference time, so they add latency to each request. We recommend running only low-latency evaluations if you’re using athina.guard().
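The guardrail pattern itself can be sketched in plain Python: run cheap checks on the query before inference and on the response after it, and fall back to a safe message if any check fails. This sketch does not use the Athina SDK and does not show the signature of athina.guard(). The names `guard`, `GuardFailed`, `no_prompt_injection`, `within_length`, and `answer` are hypothetical stand-ins; see the example notebook for the actual API.

```python
from typing import Callable, List

FALLBACK = "Sorry, I can't help with that request."


class GuardFailed(Exception):
    """Raised when a check rejects the text (hypothetical, not an Athina class)."""


def no_prompt_injection(text: str) -> bool:
    # Deliberately cheap string check; a real eval would be more thorough.
    return "ignore previous instructions" not in text.lower()


def within_length(text: str) -> bool:
    # Reject very long inputs or outputs (the limit is arbitrary for this sketch).
    return len(text) <= 2000


def guard(text: str, checks: List[Callable[[str], bool]]) -> None:
    """Run each check in order and raise on the first failure."""
    for check in checks:
        if not check(text):
            raise GuardFailed(f"{check.__name__} rejected the text")


def answer(query: str, llm: Callable[[str], str]) -> str:
    """Wrap an LLM call with a guard on the query and a guard on the response."""
    try:
        guard(query, [no_prompt_injection, within_length])
        response = llm(query)
        guard(response, [within_length])
    except GuardFailed:
        return FALLBACK
    return response
```

Because every check runs inline before the user sees a response, the total guard time is added to the request latency, which is why only fast checks belong here.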

Follow this example notebook

Run a single eval manually from the Inference (Trace) page

  1. Open the inference you want to evaluate, and click the “Run Eval” button (located towards the top-right).
  2. Choose the evaluation you want to run (Note: function evals cannot be run from the inference page).
  3. Choose the LLM engine for your evaluation.

Eval results will appear shortly in the Evals tab on the right.
