Getting Started with Athina
Athina IDE
Athina IDE is a collaborative editor for AI teams to prototype, experiment, and evaluate LLM-powered applications.
Watch a Demo Video
Watch a short demo video to learn more about Athina IDE.
Create Datasets
Create a dataset programmatically, import your existing logs, or upload a JSON / CSV file.
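For example, a dataset can be created from Python. A minimal sketch, assuming the `athina-client` package and a `Dataset.create` helper (the import paths and argument names here are illustrative; check the SDK reference for the exact interface):

```python
import os

# Assumed import paths -- verify against the athina-client docs.
from athina_client.keys import AthinaApiKey
from athina_client.datasets import Dataset

AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])

# Each row is a plain dict mapping column name -> value.
dataset = Dataset.create(
    name="support-bot-smoke-test",
    rows=[
        {
            "query": "How do I reset my password?",
            "response": "Go to Settings > Security and click Reset.",
        },
    ],
)
print(dataset.id)
```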
Dynamic Columns
Create dynamic columns to run LLM prompts, execute code, make API calls, and perform data transformations.
Run Evaluations
Run evaluations on a dataset. Choose from our library of 50+ presets or create your own.
Prototype Pipelines
Chain prompts, API calls and more to prototype complex pipelines without writing any code.
Compare Datasets
Compare and evaluate multiple datasets side by side.
Generate Synthetic Data
Generate synthetic datasets using your own documents.
Logging
Logging your data is the first step to using Athina’s Observability Platform.
OpenAI Logging
Using OpenAI? Get set up in 2 lines of code.
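The two lines are an API-key call and an import swap. A sketch, assuming the `athina_logger` package's OpenAI wrapper (see the OpenAI logging guide for the exact import for your openai version):

```python
import os

from athina_logger.api_key import AthinaApiKey
# Drop-in replacement for the openai module; calls made through it are logged.
from athina_logger.openai_wrapper import openai

AthinaApiKey.set_api_key(os.environ["ATHINA_API_KEY"])

# Use the client exactly as you would the plain openai module.
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```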
Langchain Logging
Using Langchain? Get set up in 2 lines of code.
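Here the integration is a callback handler attached to your chain or model. A sketch only; the handler's import path and name below are hypothetical, so copy the real ones from the Langchain logging guide:

```python
import os

from athina_logger.api_key import AthinaApiKey
# Hypothetical import -- substitute the handler name from the Langchain guide.
from athina_logger.langchain_handler import CallbackHandler
from langchain_openai import ChatOpenAI

AthinaApiKey.set_api_key(os.environ["ATHINA_API_KEY"])

# Every call made with this model is traced and logged to Athina.
llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[CallbackHandler()])
llm.invoke("Hello!")
```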
LiteLLM Logging
Using LiteLLM? Get set up in 2 lines of code.
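LiteLLM supports Athina as a logging callback, so the setup really is two lines: register the callback, then call any model as usual.

```python
import os

import litellm

os.environ["ATHINA_API_KEY"] = "your-athina-api-key"

# Line 1: register Athina as a success callback.
litellm.success_callback = ["athina"]

# Line 2: call any model through LiteLLM; the call is logged to Athina.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```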
Log via API
Log your inferences through our flexible API or SDK (works with all models!).
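A sketch of logging a single inference over HTTP; the endpoint URL, header, and field names below are assumptions from memory of the API reference, so verify them before use:

```python
import os

import requests

# Endpoint, header, and field names are assumptions -- check the API reference.
resp = requests.post(
    "https://log.athina.ai/api/v1/log/inference",
    headers={"athina-api-key": os.environ["ATHINA_API_KEY"]},
    json={
        "language_model_id": "gpt-4o-mini",
        "prompt": [{"role": "user", "content": "Hello!"}],
        "response": "Hi there!",
        "customer_id": "customer-123",  # optional: enables per-customer analytics
    },
)
resp.raise_for_status()
```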
Log complex traces
Use our SDK to log complex, multi-step traces.
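Conceptually, a trace is a tree of spans, one per pipeline step. The class and method names in this sketch are hypothetical; the traces guide has the real interface:

```python
# All names below are illustrative, not the SDK's actual API.
from athina_logger.tracing.trace import Trace  # assumed import path

trace = Trace(name="rag-query")

# One span per pipeline step, with its own inputs and outputs.
retrieval = trace.add_span(name="retrieval", input={"query": "reset password"})
retrieval.end(output={"chunks_retrieved": 3})

generation = trace.add_span(name="generation", input={"prompt": "..."})
generation.end(output={"response": "Go to Settings > Security."})

trace.end()  # flushes the trace to Athina
```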
Prompt Management
You might find these guides useful.
Athina Playground
Experiment with different prompts and models
Prompt Management
Learn how prompt management works in Athina
Version Control
Learn how to version control your prompts in Athina
Manage and Run Prompts Programmatically
Learn how to run your saved prompts via API or SDK
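For instance, a saved prompt might be fetched and executed by its slug. Everything in this sketch (import paths, class, method, argument names) is hypothetical; the programmatic-prompts guide has the real interface:

```python
import os

# Hypothetical imports and helper -- see the guide for the actual SDK calls.
from athina_client.keys import AthinaApiKey
from athina_client.prompts import Prompt

AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])

# Run the latest saved version of a prompt, filling in its template variables.
result = Prompt.run(
    slug="support-bot-greeting",
    variables={"customer_name": "Ada"},
)
print(result)
```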
Organize your Prompts
Learn how to organize your prompts into folders, favorites and more
Evaluations
Running evals using the Athina SDK
Run 40+ preset evals or your own custom evals in just a few lines of code using our Python SDK.
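A minimal sketch using one preset eval, based on the patterns in the athina-evals README (the exact data format may differ; the SDK also provides Loaders, so see the docs):

```python
import os

from athina.evals import DoesResponseAnswerQuery
from athina.keys import AthinaApiKey, OpenAiApiKey

OpenAiApiKey.set_key(os.environ["OPENAI_API_KEY"])
AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])

# The accepted data shape here is an assumption -- adapt to your loader.
data = [
    {
        "query": "How do I reset my password?",
        "response": "Go to Settings > Security and click Reset.",
    },
]

# Runs the preset eval over every row; results also appear in the Athina UI.
results = DoesResponseAnswerQuery().run_batch(data=data)
print(results)
```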
Running evals on Athina Platform
Run 40+ preset evals or your own custom evals on any dataset.
Comparing datasets using Athina
Compare retrievals and responses across datasets, and run evaluations on both.
Continuous evaluation in production
Configure evaluations to run continuously on production logs to measure quality, and detect hallucinations.
Setting up evals in CI/CD
Run evals in your CI/CD pipeline to prevent regressions and keep bad prompts or models out of production.
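One way to wire this up is a small script that the pipeline runs and that exits non-zero when an eval fails. A sketch reusing the SDK pattern above; `to_df()` and the `passed` column are assumptions about the result shape:

```python
# ci_evals.py -- fail the build when a preset eval fails on the golden dataset.
import os
import sys

from athina.evals import DoesResponseAnswerQuery
from athina.keys import AthinaApiKey, OpenAiApiKey

OpenAiApiKey.set_key(os.environ["OPENAI_API_KEY"])
AthinaApiKey.set_key(os.environ["ATHINA_API_KEY"])

golden_data = [
    {"query": "How do I reset my password?", "response": "Go to Settings > Security."},
]

# to_df() and the "passed" column are assumptions -- check the SDK's result API.
df = DoesResponseAnswerQuery().run_batch(data=golden_data).to_df()
if not df["passed"].fillna(False).all():
    print("Eval regression detected; blocking deploy.")
    sys.exit(1)
```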
Analytics
Analytics
Want to view analytics for production logs?
Cost Tracking
How to track inference costs by prompt, model, or customer ID.
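Cost breakdowns by prompt or customer only work if that metadata is attached at logging time. A sketch using the (assumed) OpenAI wrapper from the logging section above; the extra kwarg names are assumptions too:

```python
import os

from athina_logger.api_key import AthinaApiKey
from athina_logger.openai_wrapper import openai  # assumed import, as above

AthinaApiKey.set_api_key(os.environ["ATHINA_API_KEY"])

# The prompt_slug / customer_id kwargs are assumptions -- check the logging docs.
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    prompt_slug="support-bot-greeting",  # lets costs be grouped by prompt
    customer_id="customer-123",          # lets costs be grouped by customer
)
```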
Comparing Metrics
Compare usage analytics and evaluation metrics for different models, prompts, topics or customer IDs.