This guide demonstrates how to log a dataset from HuggingFace into Athina using Python. We'll walk through the process step by step, explaining each part of the code and its purpose.

Prerequisites

Before you begin, make sure you have:

  1. An Athina account and API key (you can sign up for free at https://app.athina.ai)
  2. Python installed on your system
  3. The necessary Python libraries: datasets, athina-client

Step-by-Step Guide

0. Get your Athina API Key

You can get an Athina API key by signing up at https://app.athina.ai

1. Install Required Libraries

Install and import the required libraries to get started.

pip install datasets athina-client

Then import them in Python:

import os
from athina_client import AthinaApiKey
from athina_client.datasets import Dataset
from datasets import load_dataset

Also, set your Athina API key:

AthinaApiKey.set_key(os.getenv("ATHINA_API_KEY"))
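If ATHINA_API_KEY is not already exported in your environment (for example, when running in a Colab notebook), you can set it in-process before making the set_key call above. The key value below is a placeholder:

# Optional: set the environment variable in-process if it isn't exported in your shell.
# Replace the placeholder with your real key, and keep real keys out of version control.
os.environ["ATHINA_API_KEY"] = "your-athina-api-key"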

2. Load the Dataset from HuggingFace

HF_DATASET_ID = "openai/gsm8k"
SUBSET = "main"
SPLIT = "train"
LIMIT = 1000 # Number of rows to add - max. 1000

# Load a dataset from Hugging Face
hf_dataset = load_dataset(path=HF_DATASET_ID, name=SUBSET, split=SPLIT)

# Define rows to add (respecting the LIMIT set above)
rows = hf_dataset.to_list()[:LIMIT]

Currently, you can add a maximum of 1000 rows to a dataset in Athina.
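As a quick sanity check before uploading, you can confirm that the prepared rows respect this limit and inspect the first row. This is a minimal sketch that uses only the variables defined above:

# Confirm the prepared rows stay within Athina's current 1000-row limit
assert len(rows) <= 1000, "Athina datasets currently accept at most 1000 rows"
print(f"Prepared {len(rows)} rows; first row keys: {list(rows[0].keys())}")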

3. Log the Dataset to Athina

We’ll use the athina_client library to log the dataset to Athina.

# Create a dataset on Athina
athina_dataset = Dataset.create(name=f"{HF_DATASET_ID}-{SUBSET}-{SPLIT}", rows=rows)

# Print the dataset URL
print (f"View dataset on Athina: https://app.athina.ai/develop/{athina_dataset.id}")

Athina is a collaborative IDE that lets teams experiment, evaluate, and monitor AI applications in a spreadsheet-like UI.

What Can You Do After Creating a Dataset?

  • Run dynamic prompts on every row, using other columns as variables.
  • Transform the dataset by executing custom code.
  • Create custom evaluations or run 50+ preset evals and view metrics in a powerful dashboard.
  • Use dynamic columns to classify text, retrieve data, extract entities, transform data, fetch from external APIs, and more.
  • Experiment with multiple combinations of prompts and models simultaneously.