Preset evaluators are turnkey evaluators for common use cases that you can use to evaluate your LLM applications.

You can also create custom evaluators; head over to the custom evaluators page for more information.

RAG Evals

These evals are useful for evaluating LLM applications that use Retrieval Augmented Generation (RAG).
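
For instance, a common RAG eval checks whether the retrieved context is actually relevant to the user's question. The sketch below is purely illustrative: the prompt, model name, and yes/no grading scheme are assumptions, not the preset evaluator's implementation, and it uses the OpenAI Python SDK as the judge.

```python
# Illustrative only: a toy "context relevance" judge, not the preset evaluator's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def context_relevance(question: str, context: str) -> bool:
    """Ask an LLM judge whether the retrieved context is relevant to the question."""
    prompt = (
        "Does the following context contain information relevant to the question?\n"
        f"Question: {question}\nContext: {context}\n"
        "Answer with exactly one word: yes or no."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; use whichever judge model you prefer
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

print(context_relevance(
    "What year was the Eiffel Tower completed?",
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
))
```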

RAGAS Evals

RAGAS is a popular library with state-of-the-art evaluation metrics for RAG pipelines.
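
As a rough sketch of what running RAGAS looks like directly (assuming the ragas and datasets packages are installed and an OpenAI API key is available, since RAGAS calls an LLM under the hood; exact metric names and required dataset columns vary by RAGAS version):

```python
# Minimal RAGAS sketch; metric names and required columns vary across ragas versions.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision

dataset = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["The capital of France is Paris."],
    "contexts": [["Paris is the capital and most populous city of France."]],
    "ground_truth": ["Paris is the capital of France."],
})

# Runs the LLM-backed metrics over the dataset; requires OPENAI_API_KEY by default.
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy, context_precision])
print(result)
```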

Safety Evals

These evals are useful for evaluating LLM applications with safety in mind.
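
As an illustration only, a basic safety check could route the model's output through a moderation endpoint. The example below uses OpenAI's moderation API as a stand-in; it is not how the preset safety evals are necessarily implemented.

```python
# Illustrative safety check using OpenAI's moderation endpoint as a stand-in.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def is_safe(llm_output: str) -> bool:
    """Return False if the moderation endpoint flags the output."""
    response = client.moderations.create(input=llm_output)
    return not response.results[0].flagged

print(is_safe("Here is a friendly summary of your document."))
```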

Summarization Evals

These evals are useful for evaluating LLM-powered summarization performance.
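
For a reference-based illustration, the sketch below scores a candidate summary against a reference summary with ROUGE via the rouge-score package. LLM-powered summarization evals typically grade faithfulness and coverage with a judge model instead, so treat this purely as an example.

```python
# Illustrative reference-based summary scoring with ROUGE (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)

reference = "The report finds revenue grew 12% in Q3, driven by strong cloud sales."
candidate = "Revenue rose 12% in the third quarter, mainly due to cloud sales."

# score(target, prediction) returns precision/recall/F1 for each ROUGE variant.
scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(name, round(score.fmeasure, 3))
```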

JSON Evals

These evals are useful for validating JSON outputs.
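
A JSON eval can be as simple as checking that the output parses and contains the expected keys. The sketch below is dependency-free and illustrative; the required keys are made up for the example.

```python
# Minimal JSON validation check; the expected keys are illustrative.
import json

def validate_json_output(llm_output: str, required_keys: set[str]) -> bool:
    """Return True if the output is valid JSON and contains all required keys."""
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed.keys())

print(validate_json_output('{"name": "Ada", "age": 36}', {"name", "age"}))  # True
print(validate_json_output('{"name": "Ada"}', {"name", "age"}))             # False
print(validate_json_output('not json at all', {"name", "age"}))             # False
```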

Function Evals

Unlike the previous evaluators, which use an LLM for grading, function evals use simple deterministic functions to run checks on the response.

Head over to the function evaluators page for further details.
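
For example, a function eval might verify that the response matches a regex or avoids a banned phrase. The checks below are hypothetical illustrations, not the library's built-in functions.

```python
# Hypothetical function evals: deterministic checks with no LLM involved.
import re

def contains_link(response: str) -> bool:
    """Check that the response includes at least one http(s) URL."""
    return re.search(r"https?://\S+", response) is not None

def no_apology(response: str) -> bool:
    """Check that the response does not open with a boilerplate apology."""
    return not response.lower().startswith(("i'm sorry", "i am sorry"))

response = "You can read the docs at https://example.com/docs."
print(contains_link(response), no_apology(response))  # True True
```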

Evals with Ground Truth

These evaluators compare the response against reference data.

Head over to the grounded evaluators page for further details.
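
As a sketch, a grounded eval might compare the response against a reference answer using exact match or token overlap. The simple F1-style overlap below is illustrative and is not the preset evaluators' exact scoring.

```python
# Illustrative grounded checks: exact match and token-overlap F1 against a reference.
def exact_match(response: str, reference: str) -> bool:
    """Case- and whitespace-insensitive exact match against the reference."""
    return response.strip().lower() == reference.strip().lower()

def token_f1(response: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred, ref = response.lower().split(), reference.lower().split()
    common = len(set(pred) & set(ref))
    if common == 0:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Paris", "paris"))  # True
print(round(token_f1("The capital is Paris", "Paris is the capital of France"), 2))
```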