Why Athina Evals

Why should I use Athina’s Eval framework instead of writing my own evals? You could build your own eval system from scratch, but here’s why Athina might be better for you.

Athina provides you with 40+ plug-and-play preset evals that have been well-tested.
Athina allows you to write custom evals on top of the same well-tested framework.
Athina evals can run in development, CI / CD, and production, giving you consistent metrics for evaluating model accuracy and safety.
Athina removes the need for your team to write boilerplate loaders, implement LLM calls, normalize data formats, etc.
Athina offers a modular, extensible framework for writing and running evals.
Athina gives you a UI to compare datasets, run evaluations, and keep track of every run.
Athina calculates analytics like pass rate and flakiness, and segments them so you can compare across different prompts, models, topics, environments, and customers.
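
To make the last point concrete, here is a minimal sketch in plain Python of what segmented pass rate and flakiness could look like. It does not use Athina's API: the `results` tuples, the `summarize` helper, and the definition of flakiness used here (the share of inputs whose repeated runs disagree) are all assumptions made for illustration, and Athina's own definitions may differ.

```python
from collections import defaultdict

# Hypothetical eval results as (segment, input_id, passed).
# This is an illustrative shape, not Athina's data format.
results = [
    ("prompt_v1", "q1", True),
    ("prompt_v1", "q1", True),
    ("prompt_v1", "q2", True),
    ("prompt_v1", "q2", False),
    ("prompt_v2", "q1", True),
    ("prompt_v2", "q1", True),
    ("prompt_v2", "q2", True),
    ("prompt_v2", "q2", True),
]


def summarize(results):
    """Return pass rate and flakiness per segment.

    Assumed definitions:
      pass_rate = passed runs / total runs in the segment
      flakiness = inputs with mixed outcomes / distinct inputs in the segment
    """
    # segment -> input_id -> list of pass/fail outcomes
    by_segment = defaultdict(lambda: defaultdict(list))
    for segment, input_id, passed in results:
        by_segment[segment][input_id].append(passed)

    summary = {}
    for segment, runs in by_segment.items():
        outcomes = [p for run in runs.values() for p in run]
        # An input counts as flaky if it both passed and failed across runs.
        flaky_inputs = sum(1 for run in runs.values() if len(set(run)) > 1)
        summary[segment] = {
            "pass_rate": sum(outcomes) / len(outcomes),
            "flakiness": flaky_inputs / len(runs),
        }
    return summary


summary = summarize(results)
for segment, stats in summary.items():
    print(segment, stats)
```

Here the segment key is a prompt version, but the same grouping would work for a model name, topic, environment, or customer ID.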

The Athina Team is here for you

We are always improving our eval system.
We work closely with our users and can even help design custom evals.

If you want to talk, book a call with a founder directly.
