Online Evals
Athina users run millions of evals on their logged inferences every week.
Evaluating logs in production is the only way to know if your LLM application is working correctly in the real world.
Online evals are a critical part of running a successful LLM application.
They allow you to measure the quality of your LLM application over time, detect performance and safety issues, and prevent regressions.
Why use Athina for Online Evals?
- 50+ preset evals
- Support for custom evals
- Support for popular eval libraries like Ragas, Guardrails, etc.
- Sampling: run evals on a configurable subset of logs
- Filtering: run evals only on logs that match your filter conditions
- Rate limiting: intelligent throttling so eval traffic doesn't hit your LLM provider's rate limits
- Use any model provider for LLM evals
- View aggregate analytics
- View traces with eval results
- Track eval results over time
How does it work?
Here is a simplified view of the architecture used to run evals on logged inferences in production at scale: inferences are logged to Athina, a sampling and filtering step selects which logs to evaluate, the configured evals run asynchronously against them, and the results are stored alongside the original trace.
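The sketch below illustrates that flow in code form. It is a minimal, hypothetical illustration rather than Athina's implementation; every class and function name here is made up for the example.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Inference:
    """One logged LLM call (hypothetical schema)."""
    prompt_slug: str
    query: str
    response: str
    context: List[str] = field(default_factory=list)

def should_evaluate(inference: Inference, sample_rate: float, prompt_slug: str) -> bool:
    """Sampling + filtering gate: only a fraction of matching logs gets evaluated."""
    if inference.prompt_slug != prompt_slug:
        return False
    return random.random() < sample_rate

def run_online_evals(
    inferences: List[Inference],
    evaluators: List[Callable[[Inference], Dict]],
) -> List[Dict]:
    """Run every configured evaluator on each sampled inference and collect results."""
    results = []
    for inference in inferences:
        if not should_evaluate(inference, sample_rate=0.1, prompt_slug="rag-chat"):
            continue
        for evaluate in evaluators:
            results.append(evaluate(inference))  # each result: pass/fail verdict + reason
    return results
```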
Key Features
Run the same evaluations in dev, CI/CD, and prod
Athina’s core evaluation framework is open source and can be used to run the same evaluations in development, CI/CD, and production.
See it on GitHub: athina-evals
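For example, a single preset eval can be run on one datapoint from the SDK. This is a minimal sketch based on the athina-evals README; exact module paths, class names, and keyword arguments may differ across package versions.

```python
import os

from athina.evals import DoesResponseAnswerQuery
from athina.keys import OpenAiApiKey

# The grading LLM needs an API key (Athina also supports other model providers).
OpenAiApiKey.set_key(os.getenv("OPENAI_API_KEY"))

# Run one preset eval against a single query/response pair.
result = DoesResponseAnswerQuery().run(
    query="How do I reset my password?",
    response="Go to Settings > Security and click 'Reset password'.",
)
print(result)
```

Because the same evaluator classes run locally, in CI/CD, and in the production pipeline, a failing eval means the same thing in every environment.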
Observability
Athina provides a complete observability platform for your LLM application:
- Detailed trace inspection
- Manual annotation capabilities
- Unified online/offline metrics
- PagerDuty and Slack integrations
- Data export functionality
- API/GraphQL access
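As a rough illustration of programmatic access, the request below shows the general shape of querying eval results over GraphQL. The endpoint, auth header, query fields, and types are placeholders and assumptions, not Athina's actual schema.

```python
# Purely illustrative sketch of pulling eval results programmatically.
import os
import requests

ENDPOINT = os.environ["ATHINA_GRAPHQL_ENDPOINT"]  # assumption: set to your workspace endpoint
API_KEY = os.environ["ATHINA_API_KEY"]            # assumption: your Athina API key

QUERY = """
query FailedEvals {
  evalResults(filter: { promptSlug: "rag-chat", passed: false }) {
    evalName
    reason
    createdAt
  }
}
"""

response = requests.post(
    ENDPOINT,
    json={"query": QUERY},
    headers={"athina-api-key": API_KEY},  # assumed header name
    timeout=30,
)
print(response.json())
```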
Production-Grade Evaluation Without Ground Truth
Evaluate your LLM applications in production with confidence:
- Advanced LLM-based evaluation techniques for measuring retrieval and response quality
- State-of-the-art research-backed evaluation metrics
- Continuous improvement of evaluation methodologies
Cost Management
Maximize evaluation coverage while minimizing costs:
- Smart sampling strategies
- Configure evals to run on only a subset of logs based on filters (see the sketch below)
- Comprehensive cost tracking and optimization
- Configurable evaluation frequency
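Conceptually, each online eval combines these controls. The configuration below is hypothetical and only illustrates the knobs; the real settings are managed in the Athina dashboard.

```python
# Hypothetical configuration object: each online eval conceptually combines
# a sampling rate, filter conditions, and a run frequency.
eval_config = {
    "eval": "Faithfulness",
    "sample_rate": 0.2,               # evaluate roughly 20% of matching logs
    "filters": {
        "prompt_slug": "rag-chat",    # only this prompt
        "environment": "production",  # only production traffic
    },
    "run_frequency": "continuous",    # or periodic batches
}
```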
Flexible Evaluation Framework (Open Source)
Comprehensive evaluation capabilities:
- Rich library of 50+ preset evals
- Customizable evaluation configurations
- Build and deploy custom evals (see the sketch below)
- Multiple model provider support
- Seamless integration with popular eval libraries (Ragas, Guardrails, etc.)
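A custom eval is ultimately just code that returns a verdict and a reason for a datapoint. The class below is a hypothetical sketch of that shape; it does not use athina-evals base classes or APIs.

```python
class ResponseIsConcise:
    """Hypothetical custom eval: fail any response that exceeds a word budget."""

    name = "response_is_concise"

    def __init__(self, max_words: int = 150):
        self.max_words = max_words

    def evaluate(self, response: str) -> dict:
        word_count = len(response.split())
        return {
            "passed": word_count <= self.max_words,
            "reason": f"Response has {word_count} words (limit: {self.max_words}).",
        }

# Example: ResponseIsConcise(max_words=100).evaluate("Short answer.")  # -> passed
```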
Enterprise-Scale Automation
Fully automated evaluation pipeline:
- Scalable evaluation infrastructure
- Centralized eval configuration management
- Smart eval-to-prompt matching
- Intelligent rate limiting (see the sketch below)
- Multi-provider model support
- Historical log evaluation capabilities
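LLM-graded evals consume the same provider quota as your application, so throttling matters. The snippet below is an illustrative concurrency cap, not Athina's implementation; `evaluate_async` is a hypothetical method.

```python
import asyncio

# Cap concurrent eval calls so grading traffic does not exhaust the same
# provider rate limits your production application depends on.
EVAL_CONCURRENCY = asyncio.Semaphore(5)

async def run_eval_throttled(evaluator, inference):
    async with EVAL_CONCURRENCY:
        # 'evaluate_async' is a hypothetical evaluator method used for illustration.
        return await evaluator.evaluate_async(inference)
```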
Universal Architecture Support
Seamlessly adapt to any LLM stack:
- Multi-provider support (OpenAI, Gemini, etc.)
- Framework-agnostic (Langchain, Llama Index, custom)
- Complex trace and agent support
- Flexible architecture adaptation
- Standardized evaluation layer
View Traces with Eval Results
Inspect each logged trace alongside its eval results directly in the Athina dashboard.
Comprehensive Analytics Suite
Deep insights into your LLM application:
- Application performance metrics
- Retrieval quality analytics
- Resource utilization tracking
- Safety and security monitoring
- Temporal analysis
- Statistical distribution insights
- Multi-dimensional segmentation
Team Collaboration
Enterprise-ready collaboration features:
- Team workspaces
- Role-based access control
- Workspace isolation
- Shared evaluation insights
👋 Athina
We spent a lot of time working through these problems so you don’t need a dedicated team for this. You can see a demo video here.
Website: Athina AI (Try our sandbox).
Sign Up for Athina.
GitHub: Run any of our 40+ open source evaluations using our Python SDK to measure your LLM app.