- Context Contains Enough Information: Does the retrieved context contain enough information to answer the query?
- Faithfulness: Is the response faithful to the context? (Unfaithful responses are correlated with hallucinations.)
- Does Response Answer Query: Does the response answer the user’s query? Checks for relevance and answer completeness.
Context Contains Enough Information
- Query: How much equity does Y Combinator take?
- Retrieved Context: YC invests $500,000 in 200 startups twice a year.
Eval Result
- Result: Fail
- Explanation: The context mentions that YC invests $500,000 but does not mention how much equity it takes, which is what the query asks about.
Faithfulness
- Retrieved Context: YC invests $500,000 in 200 startups twice a year.
- Response: YC takes 5-7% equity.
Eval Result
- Result: Fail
- Explanation: The response mentions that YC takes 5-7% equity, but this is not mentioned anywhere in the context.
Answer Completeness
- Query: Which spaceship landed on the moon first?
- Response: Neil Armstrong was the first man to set foot on the moon in 1969.
Eval Result
- Result: Fail
- Explanation: The query is asking which spaceship landed on the moon first, but the response only mentions the name of the astronaut, and does not say anything about the name of the spaceship.
This check fails in either of these cases:
- Response is irrelevant or tangential to the query.
- Response does not sufficiently answer the query.
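The three evaluators above share one shape: a question about the query, retrieved context, and response goes to a grader, which returns a Result and an Explanation. Below is a minimal sketch of that shape, not the actual implementation of any evaluation library. The question wording, the names `EVALUATOR_QUESTIONS`, `build_prompt`, `parse_verdict`, and `run_eval`, and the reply format are all assumptions made for illustration. The `judge` argument stands in for whatever LLM call you use.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical question templates, one per evaluator described above.
EVALUATOR_QUESTIONS = {
    "context_contains_enough_information": (
        "Does the retrieved context contain enough information to answer the query?"
    ),
    "faithfulness": "Is every claim in the response supported by the retrieved context?",
    "does_response_answer_query": (
        "Does the response answer the query directly and completely?"
    ),
}


@dataclass
class EvalResult:
    passed: bool
    explanation: str


def build_prompt(evaluator: str, query: str, context: str, response: str) -> str:
    """Assemble the grading prompt for one evaluator (assumed format)."""
    question = EVALUATOR_QUESTIONS[evaluator]
    return (
        f"{question}\n"
        f"Query: {query}\n"
        f"Retrieved Context: {context}\n"
        f"Response: {response}\n"
        "Reply in the form:\nResult: Pass or Fail\nExplanation: <one sentence>"
    )


def parse_verdict(text: str) -> EvalResult:
    """Parse a 'Result: ... / Explanation: ...' reply into an EvalResult."""
    passed = None
    explanation = ""
    for line in text.splitlines():
        # Split on the first colon only, so explanations may contain colons.
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key == "result":
            passed = value.strip().lower() == "pass"
        elif key == "explanation":
            explanation = value.strip()
    if passed is None:
        raise ValueError("judge reply has no 'Result:' line")
    return EvalResult(passed, explanation)


def run_eval(
    evaluator: str,
    query: str,
    context: str,
    response: str,
    judge: Callable[[str], str],
) -> EvalResult:
    """Send the prompt to the judge (any callable) and parse its reply."""
    return parse_verdict(judge(build_prompt(evaluator, query, context, response)))


def stub_judge(prompt: str) -> str:
    # Stand-in for a real LLM call; always returns a fixed verdict.
    return "Result: Fail\nExplanation: The context does not mention equity."


if __name__ == "__main__":
    result = run_eval(
        "context_contains_enough_information",
        query="How much equity does Y Combinator take?",
        context="YC invests $500,000 in 200 startups twice a year.",
        response="",
        judge=stub_judge,
    )
    print(result)
```

Passing the judge in as a plain callable keeps the prompt building and verdict parsing testable without a model, and lets the same harness run all three checks by changing only the evaluator key.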