POST /api/v2/log/inference

These are all the fields you can log to Athina.

Required Fields

language_model_id
string
required

Identifier for the language model used for inference. This is just a string label; all models are supported.

prompt
string | array[object]
required

The prompt sent for inference. This can be either a string or the messages array sent to OpenAI. Note that for tool messages, the content can be either a string or an array.

response
string | object
required

The response from the LLM. This can be either a string or the ChatCompletion response object from OpenAI.
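
For orientation, here is a minimal sketch of a log request containing only the required fields. The path comes from this reference, but the base URL and the `athina-api-key` header name below are assumptions; check your Athina dashboard for the exact values.

```python
import requests

# The path /api/v2/log/inference comes from this reference; the base URL and
# API-key header name are assumptions -- confirm them in your Athina dashboard.
ATHINA_LOG_URL = "https://log.athina.ai/api/v2/log/inference"
HEADERS = {"athina-api-key": "YOUR_ATHINA_API_KEY"}

payload = {
    "language_model_id": "gpt-4o",  # free-form label; any model name works
    "prompt": [  # a plain string or an OpenAI-style messages array
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "response": "The capital of France is Paris.",  # string or ChatCompletion object
}

requests.post(ATHINA_LOG_URL, json=payload, headers=HEADERS, timeout=10)
```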

Eval Fields

For most RAG evals, you must also log these fields:

user_query
string

The user’s query. For conversational applications, this is usually the user’s last message.

Tip: Although this isn’t required, this is highly recommended as several evals depend on this field.

context
string | object

The retrieved context (if using RAG).

Tip: Although this isn’t required, this is highly recommended as several evals depend on this field.
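
As a sketch, these eval fields could be merged into the payload from the earlier example. The `documents` key inside the `context` object is only an illustrative shape, since the API accepts any string or JSON object here.

```python
# Eval fields to merge into the log payload shown earlier.
eval_fields = {
    "user_query": "What is the capital of France?",  # the user's last message
    "context": {
        # Illustrative shape -- any string or JSON object is accepted.
        "documents": [
            "Paris is the capital and most populous city of France.",
        ],
    },
}
```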

Latency

response_time
int

The response time of the inference call, in milliseconds.


Status Code and Error

status_code
int

The HTTP status code of the inference call made to the LLM provider, e.g. 200 or 500.

error
string

The error message if the inference call failed, e.g. “Internal Server Error”.
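
A sketch of how these fields might be captured around your LLM call; `call_llm` is just a placeholder for your actual client.

```python
import time

def call_llm(messages):
    # Placeholder for your actual LLM client call.
    raise NotImplementedError

start = time.time()
try:
    llm_response = call_llm([{"role": "user", "content": "Hi"}])
    status_code, error = 200, None
except Exception as exc:
    status_code, error = 500, str(exc)  # e.g. "Internal Server Error"

observability_fields = {
    "response_time": int((time.time() - start) * 1000),  # milliseconds
    "status_code": status_code,
}
if error is not None:
    observability_fields["error"] = error
```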


Token Usage and Cost

Athina will automatically calculate token usage and cost if the language_model_id is a known model.

However, you can also log the cost and token usage manually. These are used for analytics.

prompt_tokens
int

The number of input (prompt) tokens used.

completion_tokens
int

The number of output (completion) tokens used.

total_tokens
int

The total number of tokens used. If this is not logged, we will simply add prompt_tokens and completion_tokens.

cost
float

The cost of the inference. If this is not provided, we will automatically calculate the cost if the language_model_id is a known model.

Tip: If you log the entire OpenAI ChatCompletion response object to us, we’ll automatically extract the token counts and cost.
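
If you prefer to log usage manually, here is a sketch with purely illustrative numbers; in practice you would copy them from your provider's usage object.

```python
# Illustrative numbers only -- copy real values from your provider's usage object.
usage_fields = {
    "prompt_tokens": 58,
    "completion_tokens": 12,
    "total_tokens": 70,  # optional; defaults to prompt_tokens + completion_tokens
    "cost": 0.00021,     # optional; auto-calculated when language_model_id is a known model
}
```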


Segmentation Fields

You can optionally add the following fields for better segmentation on the dashboard.

prompt_slug
string

The identifier for the prompt used for inference. This is useful for segmenting inference calls by prompt.

environment
string

The environment your app is running in (e.g. production, staging). This is useful for segmenting inference calls by environment.

customer_id
string

The customer ID. This is useful for segmenting inference calls by customer.

customer_user_id
string

The end user ID. This is useful for segmenting inference calls by the end user.

tags
array[string]

A list of tags to associate with the log. This is useful for segmenting inference calls by tags.
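
An illustrative set of segmentation fields; all values below are placeholders.

```python
segmentation_fields = {
    "prompt_slug": "support/answer_question",  # identifier for the prompt used
    "environment": "production",
    "customer_id": "acme-corp",                # your customer
    "customer_user_id": "user-1234",           # the end user within that customer
    "tags": ["beta", "eu-region"],
}
```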


Topics

topic
string

For additional segmentation of data, you can log a topic string to associate with the log. This topic can then be used to filter logs on the dashboard, show comparisons, and view more granular analytics.


Logging Conversations

To group inferences into a conversation or chat session, just include the session_id field.

session_id
string

The session or conversation ID. This is used for grouping different inferences into a conversation or chain. Read more
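
For example, two turns of the same conversation would share one session_id so Athina can group them; the identifier format is up to you.

```python
session_id = "chat-7f3a2c"  # any stable identifier you generate per conversation

turn_1 = {
    "language_model_id": "gpt-4o",
    "prompt": "Hi, I need help with my order.",
    "response": "Sure, what's your order number?",
    "session_id": session_id,
}
turn_2 = {
    "language_model_id": "gpt-4o",
    "prompt": "It's 4521.",
    "response": "Thanks! Order 4521 shipped yesterday.",
    "session_id": session_id,  # same ID, so both turns land in one conversation
}
```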


Logging Custom Attributes

You can log any custom attributes to Athina to be shown as metadata.

external_reference_id
string

An external reference ID for the inference. This can be used to update the logs later. Read more.

custom_attributes
object

Additional metadata to add to this log.

Optionally, you can also log custom attributes with your prompt. Pass attribute names and values as key-value pairs in the custom_attributes object.

Note: A prompt run cannot have duplicate attribute names.
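
A sketch of both fields together; the attribute names and the external reference ID format are hypothetical.

```python
custom_fields = {
    "external_reference_id": "order-return-4521",  # your own ID, usable for updating the log later
    "custom_attributes": {
        # Attribute names must be unique within a single prompt run.
        "experiment": "prompt-v2-rollout",
        "retrieval_strategy": "hybrid",
    },
}
```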

Ground Truth

If you have ground truth responses, you can log them here. Ground truth responses are required for some evals like Answer Similarity.

expected_response
string

Ground-truth or reference response.

Tools

For tools, you may log the following fields:

tools
array[object]

A list of tools (defined as JSON) that the model may call. This should be an array of tool definitions.

tool_choice
string

Controls the model’s tool calls. This can be none, auto, or a specific tool name.

Function Calling

For function calling, you may also log the following fields:

functions
array[object]

The function call request (an array of function definitions).

function_call_response
object (JSON)

The function call response object.
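
A sketch of the tools fields, assuming the OpenAI-style tool schema since the reference only says tools are defined as JSON; the get_weather tool is hypothetical.

```python
tool_fields = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # "none", "auto", or a specific tool name
}
```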

Tip: To avoid adding any latency to your application, log your inference as a fire-and-forget request.
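
One way to do that is a daemon thread, sketched below with the same assumed base URL and header name as in the first example; an async task queue or a non-blocking HTTP client works just as well.

```python
import threading

import requests

def log_to_athina(payload: dict) -> None:
    # Assumed base URL and header name -- see the first sketch above.
    requests.post(
        "https://log.athina.ai/api/v2/log/inference",
        json=payload,
        headers={"athina-api-key": "YOUR_ATHINA_API_KEY"},
        timeout=10,
    )

payload = {
    "language_model_id": "gpt-4o",
    "prompt": "What is the capital of France?",
    "response": "Paris.",
}

# Fire-and-forget: the logging call never blocks the request path.
threading.Thread(target=log_to_athina, args=(payload,), daemon=True).start()
```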

Feedback

Learn how to update logs by ID or by external reference ID.

user_feedback
number

A number representing the end user’s feedback. For example, 1 for positive feedback, -1 for negative feedback.

user_feedback_comment
string

A comment from the end user about the response.

grader_feedback
number

A number representing the grader’s feedback. For example, 1 for positive feedback, -1 for negative feedback.

Model Options

Log model options to get insights into how model behavior affects your end users. The model_options key is optional, but it is required if you want to reproduce LLM requests.

model_options.temperature
number

The model temperature, usually between 0.0 and 2.0.

model_options.max_completion_tokens
number

The maximum number of tokens for the model to generate.

model_options.stop
string | array[string]

The stop sequence(s) for the model to use. This can be a single string or an array of strings that indicate where the model should stop generating further tokens.

model_options.top_p
number

The nucleus sampling (top-p) parameter used for choosing tokens.

model_options.extra_options
object

An object of key-value pairs. Log any key-value pairs that may help with recreating this LLM request.
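
A sketch of a full model_options object; the keys under extra_options (seed, frequency_penalty) are just examples of values that could help reproduce the request.

```python
model_options_fields = {
    "model_options": {
        "temperature": 0.2,
        "max_completion_tokens": 512,
        "stop": ["\n\n"],  # a single string or an array of strings
        "top_p": 0.9,      # nucleus sampling
        "extra_options": {
            # Free-form key-value pairs that help with recreating this request.
            "seed": 42,
            "frequency_penalty": 0.0,
        },
    }
}
```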