
- View the evaluators in the Athina IDE.
- View the evaluators on GitHub in Athina’s Open-Source Evaluation SDK.
Available Preset Evaluators
You can also create custom evaluators. See here for more information.
RAG Evals
These evals are useful for evaluating LLM applications with Retrieval Augmented Generation (RAG):
RAGAS Evals
RAGAS is a popular library with state-of-the-art evaluation metrics for RAG models:
- Context Precision
- Context Relevancy
- Context Recall
- Faithfulness
- Answer Relevancy
- Answer Semantic Similarity
- Answer Correctness
- Coherence
- Conciseness
- Maliciousness
- Harmfulness
Safety Evals
These evals are useful for evaluating LLM applications with safety in mind:
- PII Detection: Will fail if PII is found in the text
- Prompt Injection: Will fail if any known Prompt Injection attack is found in the text
- OpenAI Content Moderation: Will fail if text is potentially harmful
- Guardrails: A popular library of custom validators for LLM applications:
  - Safe for work: Checks if text has inappropriate/NSFW content
  - Not gibberish: Checks if response contains gibberish
  - Contains no sensitive topics: Checks for sensitive topics
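To illustrate the pass/fail behavior described for the safety evals, here is a minimal sketch of a check that fails when PII is found in the text. It is not Athina's implementation: the names `PII_PATTERNS`, `find_pii`, and `pii_eval_passes` are hypothetical, and the two regular expressions are deliberately simplistic stand-ins for a real PII detector.

```python
import re

# Illustrative patterns only (an assumption of this sketch);
# real PII detection covers many more categories and formats.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text: str) -> dict:
    """Return every pattern match found in the text, grouped by label."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def pii_eval_passes(text: str) -> bool:
    """The eval fails (returns False) if any PII-like string is found."""
    return not find_pii(text)
```

A response such as `"Contact me at jane@example.com"` would fail this check, while text with no email address or phone number would pass.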
Summarization Evals
These evals are useful for evaluating LLM-powered summarization performance:
JSON Evals
These evals are useful for validating JSON outputs:
Function Evals
Unlike the previous evaluators, which use an LLM for grading, function evals use simple functions to check whether:
- Text matches a given regular expression
- Text contains a link
- Text contains keywords
- Text contains no invalid links
- Text is missing keywords
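The checks listed above are deterministic and can be sketched in a few lines of Python. This is a rough illustration, not the Athina SDK's API: every function name here is hypothetical, and the keyword check assumes that "contains keywords" means at least one keyword appears, case-insensitively. The invalid-link check is left out because it would need network requests.

```python
import re

# Simplified URL pattern (an assumption of this sketch, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://\S+")


def matches_regex(text: str, pattern: str) -> bool:
    """Pass if the regular expression is found anywhere in the text."""
    return re.search(pattern, text) is not None


def contains_link(text: str) -> bool:
    """Pass if the text contains something that looks like an http(s) URL."""
    return URL_PATTERN.search(text) is not None


def contains_any_keyword(text: str, keywords: list) -> bool:
    """Pass if at least one keyword appears (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def missing_keywords(text: str, keywords: list) -> bool:
    """Pass if none of the keywords appear in the text."""
    return not contains_any_keyword(text, keywords)
```

Because these functions make no model calls, they are cheap to run on every response and always return the same result for the same input.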