document and an LLM-generated summary.
Questions are generated from the summary information.
A QuestionAnswerer LLM is used to answer each question given ONLY the summary as context.
A QuestionAnswerer LLM is used to answer each question given ONLY the source document as context.
The answers from the summary and the document are compared for each question to find contradictions.
document: The source document that contains the information that should be summarized.
response: The LLM-generated summary of the source document.
gpt-3.5-turbo
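The comparison of the two answer sets can be sketched roughly as follows. This is a minimal illustration, not the library's own code: the function names `classify` and `score` are hypothetical, and it assumes each answer is one of the strings "Y", "N", or "Unknown".

```python
from collections import Counter
from typing import Dict, List

# Assumption: answers are encoded as these exact strings.
DEFINITE = {"Y", "N"}
UNKNOWN = "Unknown"


def classify(summary_answer: str, document_answer: str) -> str:
    """Label one question by comparing its two answers (hypothetical helper)."""
    if summary_answer == document_answer:
        return "agreement"
    if summary_answer in DEFINITE and document_answer == UNKNOWN:
        # The summary asserts something the source document does not support.
        return "hallucination"
    if summary_answer in DEFINITE and document_answer in DEFINITE:
        # Both are definite but differ, e.g. "Y" versus "N".
        return "contradiction"
    # Remaining case: the summary is "Unknown" but the document is definite.
    return "other"


def score(summary_answers: List[str], document_answers: List[str]) -> Dict[str, float]:
    """Return the percentage of questions carrying each label."""
    if len(summary_answers) != len(document_answers):
        raise ValueError("both answer lists must cover the same questions")
    if not summary_answers:
        raise ValueError("at least one question is required")
    counts = Counter(
        classify(s, d) for s, d in zip(summary_answers, document_answers)
    )
    total = len(summary_answers)
    return {
        label: 100.0 * counts[label] / total
        for label in ("agreement", "hallucination", "contradiction")
    }
```

For four questions with summary answers Y, N, Y, Y and document answers Y, Unknown, N, Y, this sketch reports 50% agreement, 25% hallucination, and 25% contradiction.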
Metrics
Agreement Score: The percentage of questions that had identical answers for both contexts.
Hallucination Score: The percentage of questions where the summary gave a definitive answer (Y/N) but the source document answered “Unknown”.
Contradiction Score: The percentage of questions where the summary gave one definitive answer (Y/N) but the source document gave the opposite definitive answer.
SummaryLoader
n_questions: int
questions: List[str]
questions.
question_answerer: QuestionAnswerer
QuestionAnswererBulk (faster, cheaper, default): uses a single prompt to answer all the questions.
QuestionAnswererChainOfThought (slower, uses more tokens, better reasoning): prompts the LLM separately for each question, wrapped in a chain-of-thought prompt.
QuestionAnswererWithRetrieval (good for large documents): uses a simple similarity search to narrow down the context.
QuestionAnswererChainOfThought: