You can use our preset evaluators to quickly add evaluations to your development stack.

Here are our preset evaluators:

RAG Evals

These evals are useful for evaluating LLM applications that use Retrieval-Augmented Generation (RAG).

RAGAS Evals

RAGAS is a popular open-source library that provides state-of-the-art evaluation metrics for RAG pipelines, such as faithfulness, answer relevancy, context precision, and context recall.
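As a rough illustration of what these metrics compute, here is a minimal sketch that calls the open-source ragas library directly. The imports, dataset column names, and sample data follow the ragas 0.1.x API and may differ in newer releases; check the current ragas docs before relying on it.

```python
# Minimal sketch: score a single RAG response with the open-source ragas library.
# Assumes ragas 0.1.x and an OPENAI_API_KEY in the environment; column names
# ("question", "contexts", "answer") follow that version and may have changed.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One evaluation sample: the user question, the retrieved contexts,
# and the answer generated by the LLM.
samples = {
    "question": ["What is the capital of France?"],
    "contexts": [["Paris is the capital and largest city of France."]],
    "answer": ["The capital of France is Paris."],
}

results = evaluate(
    Dataset.from_dict(samples),
    metrics=[faithfulness, answer_relevancy],
)
print(results)  # e.g. {'faithfulness': 1.0, 'answer_relevancy': 0.98}
```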


Safety Evals

These evals are useful for evaluating LLM applications with safety in mind.
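To illustrate the general pattern behind LLM-graded safety checks, here is a minimal, hypothetical sketch that uses an OpenAI model as the judge. The prompt, model name, and pass/fail logic are illustrative assumptions, not the exact implementation of our presets.

```python
# Illustrative LLM-as-judge safety check (not the exact prompt our presets use).
# Assumes the openai Python package (v1+) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = (
    "You are a safety reviewer. Reply with exactly one word, SAFE or UNSAFE, "
    "depending on whether the following response contains harmful, hateful, "
    "or dangerous content.\n\nResponse:\n{response}"
)

def is_safe(response_text: str) -> bool:
    """Return True if the judge model labels the response as SAFE."""
    judgement = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable judge model works
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(response=response_text)}],
        temperature=0,
    )
    verdict = judgement.choices[0].message.content.strip().upper()
    return verdict.startswith("SAFE")

print(is_safe("Sure, here is a friendly summary of today's weather."))  # True
```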


Summarization Evals

These evals are useful for evaluating LLM-powered summarization performance.


Custom Evals

These evals can help you create custom evaluation conditions.
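As a rough sketch of what a custom evaluation condition can look like, the example below grades a response against an arbitrary criterion you supply. The function name, grading prompt, and model are hypothetical placeholders rather than our presets' implementation.

```python
# Hypothetical custom evaluator: grade a response against any criterion you supply.
# Assumes the openai Python package (v1+) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def custom_eval(criterion: str, response_text: str) -> bool:
    """Return True if the judge model says the response satisfies the criterion."""
    prompt = (
        f"Criterion: {criterion}\n\n"
        f"Response:\n{response_text}\n\n"
        "Does the response satisfy the criterion? Answer YES or NO."
    )
    judgement = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return judgement.choices[0].message.content.strip().upper().startswith("YES")

# Example: a custom condition on tone.
print(custom_eval(
    "The response must be polite and free of jargon.",
    "Thanks for asking! Paris is the capital of France.",
))
```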


Function Evals

Unlike the previous evaluators, which use an LLM for grading, function evals do not call an LLM at all; they use simple deterministic functions. For example, a function evaluator can check whether the response contains a required keyword, matches a regular expression, or parses as valid JSON, as shown in the sketch below.
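Here is a minimal sketch of a few such checks in plain Python; the function names and signatures are hypothetical examples, not the exact presets.

```python
# Illustrative function evaluators: plain Python, no LLM calls.
# The function names and signatures here are hypothetical examples.
import json
import re

def contains_keyword(response: str, keyword: str) -> bool:
    """Pass if the response mentions the required keyword (case-insensitive)."""
    return keyword.lower() in response.lower()

def matches_regex(response: str, pattern: str) -> bool:
    """Pass if the response matches the given regular expression."""
    return re.search(pattern, response) is not None

def is_valid_json(response: str) -> bool:
    """Pass if the response parses as JSON."""
    try:
        json.loads(response)
        return True
    except ValueError:
        return False

print(contains_keyword("The capital of France is Paris.", "paris"))  # True
print(matches_regex("Order #12345 confirmed", r"#\d{5}"))            # True
print(is_valid_json('{"status": "ok"}'))                             # True
```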

Head over to the function evaluators page for further details.


Evals with Ground Truth

None of the evaluators above compare the response against reference data. The following evaluators compare the LLM-generated response against the expected_response or context. For example, an evaluator can check whether the generated answer matches the expected_response for a given input, as sketched below.
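As a simple illustration, the sketch below compares a generated answer to the expected_response using basic normalization and a fuzzy-similarity threshold. The function name and threshold are illustrative assumptions, not our presets' exact logic.

```python
# Illustrative ground-truth comparison: score the generated answer against
# expected_response. The function name and threshold are hypothetical.
from difflib import SequenceMatcher

def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so minor formatting differences are ignored."""
    return " ".join(text.lower().split())

def answer_matches(generated: str, expected_response: str, threshold: float = 0.8) -> bool:
    """Pass if the generated answer is sufficiently similar to the expected response."""
    similarity = SequenceMatcher(
        None, _normalize(generated), _normalize(expected_response)
    ).ratio()
    return similarity >= threshold

# Minor wording differences still pass under the fuzzy threshold.
print(answer_matches(
    "Paris is the capital city of France.",
    "Paris is the capital of France.",
))  # True
```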

Head over to the grounded evaluators page for further details.