This is a simplified view of the architecture we use to run evals in production at scale.

Major Challenges


👋 Athina

We spent a lot of time working through these problems so you don’t need a dedicated team for this. You can see a demo video here.

Website: Athina AI (try our sandbox).

Sign Up for Athina.

GitHub: Run any of our 40+ open-source evaluations using our Python SDK to measure your LLM app.