This is an LLM-graded evaluator.

Info

This evaluator checks if the retrieved context contains enough information to answer the user’s query.

Required Args

  • query: The user’s query, ideally phrased as a question.
  • context: The retrieved data, which should contain the information needed to answer the user’s query.
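As a rough illustration of the two required args, the sketch below checks that a datapoint carries both before it is sent to the evaluator. The `REQUIRED_ARGS` tuple and the `missing_args` helper are hypothetical names introduced here and are not part of the athina library. Treating `context` as a plain string is also an assumption; some setups may pass a list of retrieved chunks instead.

```python
from typing import Any, Dict, List

# Hypothetical constant: the two args this evaluator requires, per the list above.
REQUIRED_ARGS = ("query", "context")


def missing_args(datapoint: Dict[str, Any]) -> List[str]:
    """Return the required args that are absent or empty in a datapoint."""
    return [name for name in REQUIRED_ARGS if not datapoint.get(name)]


# Values taken from the Example section of this page.
datapoint = {
    "query": "How much equity does Y Combinator take?",
    "context": "YC invests $500,000 in 200 startups twice a year.",
}

print(missing_args(datapoint))           # nothing is missing here
print(missing_args({"query": "Hi?"}))    # context is missing here
```

A check like this only confirms that the fields are present. Whether the context is sufficient is what the LLM-graded evaluator itself decides.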

Default Engine: gpt-4


Example

  • Query: How much equity does Y Combinator take?
  • Retrieved Context: YC invests $500,000 in 200 startups twice a year.

Eval Result

  • Result: Fail
  • Explanation: The context mentions that YC invests $500,000 but it does not mention how much equity they take, which is what the query is asking about.

Run the eval on a dataset

  1. Load your data with the Loader
from athina.loaders import Loader
 
# Load the data from CSV, JSON, Athina or Dictionary
dataset = Loader().load_json(json_file)
  2. Run the evaluator on your dataset
from athina.evals import ContextContainsEnoughInformation
 
# Checks if the context contains enough information to answer the user query provided
ContextContainsEnoughInformation().run_batch(data=dataset)
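The `json_file` passed to the loader in step 1 is not defined in the snippet. A minimal sketch of preparing such a file with the standard library follows. It assumes the loader accepts a JSON array of objects whose keys match the required args (`query`, `context`). The file name and the one-row dataset are hypothetical, and whether `context` should be a string or a list of strings is also an assumption, so check the loader documentation for your version.

```python
import json
import tempfile
from pathlib import Path

# One row per datapoint; values reuse the example from this page.
rows = [
    {
        "query": "How much equity does Y Combinator take?",
        "context": "YC invests $500,000 in 200 startups twice a year.",
    },
]

# Hypothetical file name, written to the system temp directory.
json_file = Path(tempfile.gettempdir()) / "context_eval_dataset.json"
json_file.write_text(json.dumps(rows, indent=2), encoding="utf-8")

# Read the file back to confirm it round-trips before handing it to the loader.
loaded = json.loads(json_file.read_text(encoding="utf-8"))
print(len(loaded), sorted(loaded[0]))
```

The resulting path could then be passed as `json_file` in step 1, for example as `str(json_file)` if the loader expects a string path.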

Run the eval on a single datapoint

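from athina.evals import ContextContainsEnoughInformation
 
# Example inputs taken from the Example section above.
# context is shown as a plain string; wrap it in a list if your athina version expects one.
query = "How much equity does Y Combinator take?"
context = "YC invests $500,000 in 200 startups twice a year."
 
# Checks if the context contains enough information to answer the user query provided
ContextContainsEnoughInformation().run(
    query=query,
    context=context
)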