Getting Started Guides
It might be helpful to start by reading this: Athina Platform: Overview and Concepts.
If you don't have an Athina account yet, start by signing up and inviting your team.
Prompts
Create and commit your first Prompt Template
Flows
Create and share your first Flow
TODO
Observability
Log a sample LLM inference (Python Notebook) (see the sketch after this list)
Configure an online eval
Re-run LLM calls with a different prompt or model
View analytics and compare different segments
TODO
Create a dataset from your logs
Export your logs
TODO
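
To make the logging step above concrete, here is a minimal sketch using the athina-logger Python package. The prompt slug, model id, and the exact parameters of log_inference are illustrative assumptions; check the API / SDK Reference for the current signature.

```python
# Minimal sketch: log one OpenAI chat completion to Athina.
# Assumes `pip install athina-logger openai`; parameter names below
# (prompt_slug, language_model_id, ...) are illustrative assumptions,
# so verify them against the API / SDK Reference.
import os

from openai import OpenAI
from athina_logger.api_key import AthinaApiKey
from athina_logger.inference_logger import InferenceLogger

AthinaApiKey.set_api_key(os.environ["ATHINA_API_KEY"])

client = OpenAI()
messages = [{"role": "user", "content": "What does Athina do?"}]
completion = client.chat.completions.create(model="gpt-4o", messages=messages)

# Send the prompt/response pair to Athina for observability and online evals.
InferenceLogger.log_inference(
    prompt_slug="getting_started_demo",  # hypothetical slug to group these logs
    prompt=messages,
    response=completion.choices[0].message.content,
    language_model_id="gpt-4o",
)
```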
Datasets
Getting started: Datasets Overview & Concepts
Creating a dataset (see the SDK sketch after this list)
Running an offline eval in Athina IDE
Running a prompt column in Athina IDE
Running dynamic columns in Athina IDE
Running an experiment in Athina IDE
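
As a companion to the dataset guides above, here is a hedged sketch of creating a dataset programmatically. It assumes the athina-client Python package exposes a Dataset.create helper; the class, method, and field names are assumptions, so prefer the Datasets guide and SDK reference for exact usage.

```python
# Hedged sketch: create a small dataset in Athina IDE from code.
# Assumes `pip install athina-client`; Dataset.create, the key-setting call,
# and the returned `id` attribute are assumptions. Confirm in the SDK docs.
import os

from athina_client.keys import AthinaApiKey
from athina_client.datasets import Dataset

AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])

dataset = Dataset.create(
    name="getting_started_demo",  # hypothetical dataset name
    rows=[
        {
            "query": "What does Athina do?",
            "context": "Athina is a platform for LLM observability and evals.",
            "response": "Athina helps teams monitor and evaluate LLM apps.",
        },
    ],
)
print(dataset.id)  # use this id later to add rows or run evals
```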
Other Guides
Experimentation
Comparing different models and prompts
Comparing different datasets side-by-side
Prototyping a prompt chain in 3 mins without writing code
Evaluation
RAG Evaluation: A Guide
Measure and Improve retrieval in your RAGs
LLM-as-a-Judge Evaluation (see the sketch after this list)
Pairwise Evaluation
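
For the LLM-as-a-Judge guide above, a minimal sketch with Athina's open-source athina-evals package looks roughly like this. The loader and evaluator names follow the package's documented pattern, but treat the exact imports and signatures as assumptions and check the Open-Source Evals docs.

```python
# Rough sketch: run an LLM-as-a-judge evaluator from athina-evals over a
# small RAG dataset. Assumes `pip install athina-evals`; RagLoader and
# DoesResponseAnswerQuery follow the package docs, but verify locally.
import os

from athina.keys import AthinaApiKey, OpenAiApiKey
from athina.loaders import RagLoader
from athina.evals import DoesResponseAnswerQuery

OpenAiApiKey.set_key(os.environ["OPENAI_API_KEY"])
AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])  # optional: logs results to Athina

raw_data = [
    {
        "query": "What does Athina do?",
        "context": ["Athina is a platform for LLM observability and evals."],
        "response": "Athina helps teams monitor and evaluate LLM apps.",
    },
]
dataset = RagLoader().load_dict(raw_data)

# The judge model grades whether each response actually answers its query.
results = DoesResponseAnswerQuery(model="gpt-4o").run_batch(data=dataset)
print(results.to_df())
```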
Evaluation Best Practices
Improving eval performance
Different stages of evaluation
Safeguarding
Prompt Injection: Attack and Defense
Running evals as real-time guardrails