You can load your data for evals using llama-index:
from athina.loaders import Loader
import pandas as pd
from llama_index import VectorStoreIndex, ServiceContext, download_loader

# Load some example documents from Wikipedia
WikipediaReader = download_loader("WikipediaReader")
loader = WikipediaReader()
documents = loader.load_data(pages=['Berlin'])

# Build an index and a query engine over the documents
vector_index = VectorStoreIndex.from_documents(
    documents, service_context=ServiceContext.from_defaults(chunk_size=512)
)
query_engine = vector_index.as_query_engine()

raw_data_llama_index = [
    {
        "query": "Where is Berlin?",
        "expected_response": "Berlin is the capital city of Germany"
    },
    {
        "query": "What is the main cuisine of Rome?",
        "expected_response": "Pasta dish with a sauce made with egg yolks"
    },
]

llama_index_dataset = Loader().load_from_llama_index(raw_data_llama_index, query_engine)
That’s all you need to do to load your data!
To view the imported dataset as a pandas DataFrame:
pd.DataFrame(llama_index_dataset)
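As a quick illustration of what that DataFrame looks like, here is a sketch using a hand-written list of datapoints standing in for `llama_index_dataset` (the real one comes from the Loader): each field of a datapoint becomes a column.

```python
import pandas as pd

# Hand-written datapoints with the same shape the Loader returns;
# in practice you would pass llama_index_dataset instead.
rows = [
    {
        "query": "Where is Berlin?",
        "response": "Berlin is in Germany.",
        "expected_response": "Berlin is the capital city of Germany",
    },
]

df = pd.DataFrame(rows)

# One column per datapoint field, one row per datapoint
print(df.columns.tolist())  # → ['query', 'response', 'expected_response']
print(len(df))              # → 1
```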
The output format differs between Loaders. Each Loader returns a List[DataPoint] after you call the load function of your choice.
from typing import List, TypedDict

class RagDataPoint(TypedDict):
    query: str
    context: List[str]
    response: str
    expected_response: str
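Since a TypedDict is a plain dict at runtime, you can read the fields of each datapoint with ordinary key access. Here is a minimal sketch with a hand-written datapoint (a real one would come from `Loader().load_from_llama_index(...)`):

```python
from typing import List, TypedDict

class RagDataPoint(TypedDict):
    query: str
    context: List[str]
    response: str
    expected_response: str

# A hand-written datapoint with the same shape the Loader returns
datapoint: RagDataPoint = {
    "query": "Where is Berlin?",
    "context": ["Berlin is the capital and largest city of Germany."],
    "response": "Berlin is the capital of Germany.",
    "expected_response": "Berlin is the capital city of Germany",
}

# TypedDicts are ordinary dicts at runtime, so fields use [] access
print(datapoint["query"])         # → Where is Berlin?
print(len(datapoint["context"]))  # → 1
```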