Response Faithfulness
This is an LLM-graded evaluator.
This evaluator checks whether the LLM-generated response is faithful to the provided context.
In many RAG apps, you want to constrain the response to the context you provide, since you know that context to be true. Sometimes, however, the LLM draws on its pretrained knowledge to generate an answer instead. This is a common cause of "hallucinations".
Required Args
- context: The context that your response should be faithful to
- response: The LLM-generated response
Default Engine: gpt-4
Example
- Context: YC invests $500,000 in 200 startups twice a year.
- Response: YC takes 5-7% equity.
Eval Result
- Result: Fail
- Explanation: The response mentions that YC takes 5-7% equity, but this is not mentioned anywhere in the context.
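The check in the example above can be sketched in a few lines of Python. This is only an illustration of how an LLM-graded faithfulness check might work, not the evaluator's actual implementation: the prompt wording, the `grade_faithfulness` function, and the `llm` callable are all hypothetical, and a stub grader stands in for a real gpt-4 call so the sketch runs offline.

```python
import json
from typing import Callable, Dict

# Hypothetical grading prompt; the evaluator's real prompt is not shown on this page.
PROMPT_TEMPLATE = (
    "You are grading a response for faithfulness to a context.\n"
    "Context: {context}\n"
    "Response: {response}\n"
    "Fail the response if it makes any claim the context does not support.\n"
    'Reply with JSON: {{"result": "Pass" or "Fail", "explanation": "..."}}'
)


def grade_faithfulness(
    context: str, response: str, llm: Callable[[str], str]
) -> Dict[str, str]:
    """Build the grading prompt, call the grader, and validate its verdict."""
    prompt = PROMPT_TEMPLATE.format(context=context, response=response)
    raw = llm(prompt)
    verdict = json.loads(raw)
    if verdict.get("result") not in ("Pass", "Fail"):
        raise ValueError(f"Unexpected grader output: {raw!r}")
    return verdict


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: the model replies with JSON)."""
    return json.dumps(
        {
            "result": "Fail",
            "explanation": "The response mentions equity, which the context does not.",
        }
    )


verdict = grade_faithfulness(
    context="YC invests $500,000 in 200 startups twice a year.",
    response="YC takes 5-7% equity.",
    llm=stub_llm,
)
print(verdict["result"], "-", verdict["explanation"])
```

Passing the grader in as a callable keeps the prompt-building and verdict-parsing logic testable without network access; in practice `llm` would wrap a call to whichever model you use as the grading engine.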
Run the eval on a dataset
- Load your data with the Loader
- Run the evaluator on your dataset
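The two steps above can be pictured with a small, library-free sketch. It does not use the Loader or the evaluator's real API, whose call signatures are not shown on this page; `run_on_dataset` and `toy_evaluate` are hypothetical names, and the toy check is a plain substring match standing in for the LLM grader. The only thing taken from this page is the shape of each datapoint: a `context` and a `response`.

```python
from typing import Callable, Dict, Iterable, List

# Step 1: the dataset. Each datapoint carries the two required args.
dataset = [
    {
        "context": "YC invests $500,000 in 200 startups twice a year.",
        "response": "YC takes 5-7% equity.",
    },
    {
        "context": "YC invests $500,000 in 200 startups twice a year.",
        "response": "YC invests $500,000 in 200 startups twice a year.",
    },
]


def run_on_dataset(
    rows: Iterable[Dict[str, str]], evaluate: Callable[[str, str], bool]
) -> List[Dict[str, str]]:
    """Step 2: apply `evaluate` to every row and attach a Pass/Fail result."""
    results = []
    for index, row in enumerate(rows):
        missing = {"context", "response"} - row.keys()
        if missing:
            raise KeyError(f"Row {index} is missing required args: {sorted(missing)}")
        passed = evaluate(row["context"], row["response"])
        results.append({**row, "result": "Pass" if passed else "Fail"})
    return results


def toy_evaluate(context: str, response: str) -> bool:
    """Placeholder, NOT an LLM grader: passes only if the response text
    appears verbatim in the context."""
    return response.strip().lower() in context.lower()


results = run_on_dataset(dataset, toy_evaluate)
pass_rate = sum(r["result"] == "Pass" for r in results) / len(results)
for r in results:
    print(r["result"], "|", r["response"])
print(f"Pass rate: {pass_rate:.0%}")
```

Validating the required keys up front surfaces malformed rows before any grading calls are made, which matters more once `evaluate` is a paid model call rather than a string comparison.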