Athina has a large library of preset evaluators to cover all kinds of common use cases.
You can also create custom evaluators. See here for more information.
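To make that distinction concrete, here is a minimal, library-agnostic sketch of the shape an evaluator typically takes: a callable that receives the model's output (plus any extra context) and returns a pass/fail result with a reason. This is not Athina's actual API; every name below is illustrative.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    passed: bool
    reason: str


class ResponseLengthEval:
    """Illustrative 'custom evaluator': fails when the response exceeds a length budget."""

    def __init__(self, max_chars: int = 500):
        self.max_chars = max_chars

    def run(self, response: str) -> EvalResult:
        if len(response) <= self.max_chars:
            return EvalResult(passed=True, reason="Response is within the length budget.")
        return EvalResult(
            passed=False,
            reason=f"Response is {len(response)} chars, over the {self.max_chars}-char budget.",
        )


print(ResponseLengthEval(max_chars=100).run("Short answer."))
```

Preset evaluators package checks like this for you; custom evaluators let you supply your own.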
These evals are useful for evaluating LLM applications built with Retrieval-Augmented Generation (RAG).
RAGAS is a popular library offering state-of-the-art evaluation metrics for RAG pipelines.
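As a hedged illustration of what running RAGAS looks like, here is a sketch following the ragas v0.1-style quickstart; the `evaluate` entry point and metric names may differ in other versions, and the sample data is made up.

```python
from datasets import Dataset           # pip install datasets
from ragas import evaluate             # pip install ragas
from ragas.metrics import faithfulness, answer_relevancy

# Toy RAG traces: one question, the retrieved contexts, and the generated answer.
data = {
    "question": ["Who wrote 'Pride and Prejudice'?"],
    "contexts": [["Pride and Prejudice is an 1813 novel by Jane Austen."]],
    "answer": ["'Pride and Prejudice' was written by Jane Austen."],
}

# evaluate() calls an LLM under the hood, so OPENAI_API_KEY must be set.
result = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(result)  # per-metric aggregate scores
```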
These evals are useful for evaluating LLM applications with safety in mind.
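One common safety check, for example, is scanning responses for PII before they reach a user. The sketch below is a deliberately simplified, regex-based illustration; it is not one of Athina's preset evals, and production PII detection needs far more robust patterns.

```python
import re

# Illustrative (and intentionally loose) PII patterns.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def pii_safety_eval(response: str) -> dict:
    """Fails the eval if the response appears to leak PII."""
    hits = {name: pat.findall(response) for name, pat in PII_PATTERNS.items()}
    hits = {name: found for name, found in hits.items() if found}
    return {"passed": not hits, "matches": hits}


print(pii_safety_eval("Sure, you can reach me at jane.doe@example.com."))
# -> {'passed': False, 'matches': {'email': ['jane.doe@example.com']}}
```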
These evals are useful for evaluating LLM-powered summarization performance.
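Summarization evals are typically LLM-graded: a grader model is asked whether the summary is faithful to the source document. Below is a minimal LLM-as-judge sketch using the OpenAI SDK; the prompt, the model name, and the PASS/FAIL protocol are illustrative choices, not Athina's grading prompts.

```python
from openai import OpenAI  # pip install openai; requires OPENAI_API_KEY

client = OpenAI()

GRADER_PROMPT = """You are grading a summary for faithfulness.
Source document:
{document}

Summary:
{summary}

Does the summary contain any claim that is not supported by the source?
Answer with exactly one word: PASS (fully supported) or FAIL (unsupported claims)."""


def summary_faithfulness_eval(document: str, summary: str, model: str = "gpt-4o-mini") -> bool:
    """Returns True when the grader model judges the summary faithful to the source."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": GRADER_PROMPT.format(document=document, summary=summary)}],
        temperature=0,
    )
    verdict = completion.choices[0].message.content.strip().upper()
    return verdict.startswith("PASS")
```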
These evals are useful for validating JSON outputs.
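JSON validation needs no LLM at all, so a useful mental model is a plain function over the raw output. Here is a stdlib-only sketch; the required-keys check stands in for whatever schema your application expects.

```python
import json


def json_output_eval(response: str, required_keys: set[str]) -> dict:
    """Passes only if the response parses as a JSON object with the required keys."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError as err:
        return {"passed": False, "reason": f"Invalid JSON: {err}"}
    if not isinstance(parsed, dict):
        return {"passed": False, "reason": "JSON is valid but not an object."}
    missing = required_keys - parsed.keys()
    if missing:
        return {"passed": False, "reason": f"Missing keys: {sorted(missing)}"}
    return {"passed": True, "reason": "Valid JSON with all required keys."}


print(json_output_eval('{"name": "Ada", "age": 36}', required_keys={"name", "age", "email"}))
# -> {'passed': False, 'reason': "Missing keys: ['email']"}
```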
Unlike the previous evaluators, which use an LLM for grading, function evals use simple deterministic functions to check properties of the response, such as whether it matches a regex or contains required keywords.
Head over to the function evaluators page for further details.
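To illustrate the idea, here are two toy function evals in the spirit of keyword and regex checks. The names and signatures are illustrative, not the SDK's presets.

```python
import re


def contains_all(response: str, keywords: list[str]) -> bool:
    """Passes if every required keyword appears in the response (case-insensitive)."""
    lowered = response.lower()
    return all(kw.lower() in lowered for kw in keywords)


def matches_regex(response: str, pattern: str) -> bool:
    """Passes if the response matches the given regular expression."""
    return re.search(pattern, response) is not None


response = "Your order #4821 ships on 2024-06-01."
print(contains_all(response, ["order", "ships"]))      # True
print(matches_regex(response, r"\d{4}-\d{2}-\d{2}"))   # True
```

Because these checks are deterministic, they run in milliseconds and cost nothing, which makes them a good first gate before any LLM-graded eval.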
These evaluators compare the response against reference data, such as a ground-truth expected response.
Head over to the grounded evaluators page for further details.
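As a concrete, non-SDK illustration, a grounded "answer similarity" check can score the response against an expected reference answer and apply a threshold. The stdlib sketch below uses difflib; real implementations typically use embedding-based semantic similarity, and the 0.7 threshold is an arbitrary choice.

```python
from difflib import SequenceMatcher


def answer_similarity_eval(response: str, expected_response: str, threshold: float = 0.7) -> dict:
    """Grounded eval: passes when the response is close enough to the reference answer."""
    score = SequenceMatcher(None, response.lower(), expected_response.lower()).ratio()
    return {"passed": score >= threshold, "score": round(score, 3)}


print(answer_similarity_eval(
    response="The capital of France is Paris.",
    expected_response="Paris is the capital of France.",
))
```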