Logging Attributes
These are all the fields you can log to Athina.
Required Fields
Identifier for the language model used for inference. This is just a string label; all models are supported.
The prompt sent for inference. This can be either a string or the messages array sent to OpenAI. Note that in the case of a Tool message, the content can be either a string or an array.
The response from the LLM. This can be either a string or the ChatCompletion response object from OpenAI.
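For illustration, a minimal payload covering the required fields might look like the sketch below. The prompt and response key names here are assumptions for this example; language_model_id is the string label described above.

```python
# A minimal sketch of the required fields. The "prompt" and "response"
# key names are illustrative assumptions; language_model_id is just a
# string label, so any model name works.
log_payload = {
    "language_model_id": "gpt-4o",
    # Either a plain string, or the OpenAI-style messages array:
    "prompt": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is RAG?"},
    ],
    # Either a plain string, or the full ChatCompletion response object:
    "response": "RAG stands for retrieval-augmented generation.",
}
```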
Eval Fields
For most RAG evals, you must also log these fields:
The user’s query. For conversational applications, this is usually the user’s last message.
The retrieved context (if using RAG).
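A sketch of these eval fields, with assumed key names:

```python
# Illustrative eval fields for RAG evals; the key names are assumptions.
eval_fields = {
    "user_query": "What is our refund policy?",            # the user's last message
    "context": ["Refunds are issued within 30 days..."],   # retrieved chunks, if using RAG
}
```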
Latency
The response time of the inference, in milliseconds.
Status Code and Error
The HTTP status code of the inference call made to the LLM provider (e.g., 200, 500).
The error message if the inference call failed (e.g., “Internal Server Error”).
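A sketch of the latency and error fields, with assumed key names:

```python
# Illustrative latency and error fields; the key names are assumptions.
request_metadata = {
    "response_time": 1240,             # milliseconds
    "status_code": 500,                # HTTP status from the LLM provider
    "error": "Internal Server Error",  # only set when the call failed
}
```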
Token Usage and Cost
Athina will automatically calculate token usage and cost if the language_model_id is a known model.
However, you can also log the cost and token usage manually. These are used for analytics.
The number of input (prompt) tokens used.
The number of output (completion) tokens used.
The number of total tokens used. If this is not logged, we will simply add prompt_tokens and completion_tokens.
The cost of the inference. If this is not provided, we will automatically calculate the cost if the language_model_id is a known model.
Tip: If you log the entire OpenAI ChatCompletion response object to us, we’ll automatically extract the token counts and cost.
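For example, manually logged usage and cost might look like this. The prompt_tokens and completion_tokens names are described above; the cost key name is an assumption:

```python
# Manual token usage and cost. prompt_tokens and completion_tokens are
# the field names from this page; the "cost" key name is an assumption.
usage_fields = {
    "prompt_tokens": 512,
    "completion_tokens": 128,
    "total_tokens": 640,   # optional: defaults to prompt_tokens + completion_tokens
    "cost": 0.0042,        # auto-calculated for known models if omitted
}
```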
Segmentation Fields
Optionally, you can also add the following fields for better segmentation on the dashboard.
The identifier for the prompt used for inference. This is useful for segmenting inference calls by prompt.
The environment your app is running in (ex: production, staging, etc.). This is useful for segmenting inference calls by environment.
The customer ID. This is useful for segmenting inference calls by customer.
The end user ID. This is useful for segmenting inference calls by the end user.
A list of tags to associate with the log. This is useful for segmenting inference calls by tags.
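A sketch of these segmentation fields, with assumed key names (only the example environment values come from this page):

```python
# Illustrative segmentation fields; the key names are assumptions.
segmentation_fields = {
    "prompt_slug": "customer-support/v3",  # identifier for the prompt used
    "environment": "production",           # e.g. "production", "staging"
    "customer_id": "acme-corp",
    "external_user_id": "user-1234",
    "tags": ["beta", "support-bot"],
}
```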
Topics
For additional segmentation of data, you can log a topic string to associate with the log. This topic can then be used to filter logs on the dashboard, show comparisons, and drive granular analytics.
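For example (the topic key name is an assumption):

```python
# Illustrative topic field; the key name is an assumption.
log_fields = {"topic": "billing"}  # filter, compare, and slice analytics by topic
```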
Logging Conversations
To group inferences into a conversation or chat session, just include the session_id field.
The session or conversation ID. This is used for grouping different inferences into a conversation or chain. Read more.
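For example, every turn of the same chat would share one session_id (the prompt and response key names are again assumptions):

```python
# Group related inferences into one conversation via session_id.
chat_turn_log = {
    "language_model_id": "gpt-4o",
    "session_id": "chat-7f3a2b",  # same ID across all turns of this conversation
    "prompt": [{"role": "user", "content": "And what about exchanges?"}],  # key name assumed
    "response": "Exchanges follow the same 30-day window.",                # key name assumed
}
```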
Logging Custom Attributes
You can log any custom attributes to Athina to be shown as metadata.
An external reference ID for the inference. This can be used to update the logs later. Read more.
Additional metadata to add to this log.
Optionally, you can also log custom attributes with your prompt. Pass each attribute name and value as a key-value pair in the custom_attributes object.
Note: A prompt run cannot have duplicate attribute names.
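A sketch of custom attribute logging. The custom_attributes key is described above; the external_reference_id key name is an assumption:

```python
# custom_attributes takes arbitrary key-value pairs; attribute names
# must be unique within a prompt run.
log_fields = {
    "external_reference_id": "order-20240611-001",  # key name assumed; lets you update this log later
    "custom_attributes": {
        "plan_tier": "enterprise",
        "ab_test_variant": "B",
    },
}
```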
Ground Truth
If you have ground truth responses, you can log them here. Ground truth responses are required for some evals, like Answer Similarity.
Ground-truth or reference response.
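For example (the expected_response key name is an assumption):

```python
# Illustrative ground-truth field; the key name is an assumption.
ground_truth = {
    "expected_response": "Refunds are issued within 30 days of purchase.",
}
```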
Tools
For tools, you may log the following fields:
A list of tools (defined as JSON) that the model may call. This should be an array of tool definitions.
Controls the model’s tool calls. This can be none, auto, or a specific tool name.
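A sketch of these tool fields, following the OpenAI tool-definition shape (the tools and tool_choice key names are assumptions):

```python
# Illustrative tool fields; key names are assumptions. The tool
# definition itself follows the OpenAI function-tool JSON shape.
tool_fields = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # or "none", or a specific tool name
}
```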
Function Calling
For function calling, you may also log the following fields:
The function call request.
The function call response.
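A sketch of these fields, with assumed key names:

```python
# Illustrative function-calling fields; the key names are assumptions.
function_fields = {
    "function_call_request": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
    "function_call_response": '{"temperature_c": 21, "conditions": "sunny"}',
}
```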
Tip: To avoid adding any latency to your application, log your inference as a fire-and-forget request.
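One way to do this in Python is to post the log from a daemon thread, so a slow or failed logging call never blocks the user-facing response. The endpoint URL and header below are illustrative placeholders, not Athina's actual values:

```python
# Fire-and-forget logging sketch. The URL and header are placeholders;
# use the values from your Athina setup.
import threading

import requests

def log_to_athina(payload: dict) -> None:
    def _send():
        try:
            requests.post(
                "https://logging.example.com/api/v1/log/inference",  # illustrative URL
                json=payload,
                headers={"athina-api-key": "YOUR_API_KEY"},          # illustrative header
                timeout=5,
            )
        except requests.RequestException:
            pass  # never let a logging failure affect the app

    threading.Thread(target=_send, daemon=True).start()
```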
Feedback
Learn how to update logs by ID or by external reference ID.
A number representing the end user’s feedback. For example, 1 for positive feedback, -1 for negative feedback.
A comment from the end user about the response.
A number representing the grader’s feedback. For example, 1 for positive feedback, -1 for negative feedback.
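A sketch of these feedback fields, with assumed key names:

```python
# Illustrative feedback fields; the key names are assumptions.
feedback_update = {
    "user_feedback": 1,                              # 1 = positive, -1 = negative
    "user_feedback_comment": "Answer was spot on.",
    "grader_feedback": -1,
}
```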
Model Options
Log model options to get insights into how model behavior affects your end users.
The model_options key is optional, but it is required to reproduce LLM requests.
The model temperature, usually between 0.0 and 2.0.
The maximum number of tokens for the model to generate.
The stop sequence(s) for the model to use. This can be a single string or an array of strings that indicate where the model should stop generating further tokens.
Uses nucleus sampling for choosing tokens.
An object of key-value pairs. Log any key-value pairs that may help with recreating this LLM request.
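For example, a model_options object matching the fields above:

```python
# model_options is a free-form object; these keys match the fields
# described above.
model_options = {
    "temperature": 0.7,       # usually between 0.0 and 2.0
    "max_tokens": 512,
    "stop": ["\n\n", "END"],  # a single string or an array of strings
    "top_p": 0.9,             # nucleus sampling
}
```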