Logging Attributes
These are all the fields you can log to Athina.
Required Fields
Identifier for the language model used for inference. This is just a string label; all models are supported.
The prompt sent for inference. This can be either a string or the messages array sent to OpenAI. Note that in the case of a Tool message, the content can be either a string or an array.
The response from the LLM. This can be either a string or the ChatCompletion response object from OpenAI.
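For illustration, a minimal payload covering the required fields might look like the sketch below. The prompt and response key names here are assumptions for this example; language_model_id is the string label described above.

```python
# A minimal sketch of the required fields. The "prompt" and "response"
# key names are illustrative assumptions; language_model_id is just a
# string label, so any model name works.
log_payload = {
    "language_model_id": "gpt-4o",
    # Either a plain string, or the OpenAI-style messages array:
    "prompt": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is RAG?"},
    ],
    # Either a plain string, or the full ChatCompletion response object:
    "response": "RAG stands for retrieval-augmented generation.",
}
```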
Eval Fields
For most RAG evals, you must also log these fields:
The user’s query. For conversational applications, this is usually the user’s last message.
The retrieved context (if using RAG).
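A sketch of these eval fields, with assumed key names:

```python
# Illustrative eval fields for RAG evals; the key names are assumptions.
eval_fields = {
    "user_query": "What is our refund policy?",            # the user's last message
    "context": ["Refunds are issued within 30 days..."],   # retrieved chunks, if using RAG
}
```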
Latency
The response time of the inference, in milliseconds.
Status Code and Error
The HTTP status code of the inference call made to the LLM provider (e.g., 200, 500).
The error message if the inference call failed (e.g., “Internal Server Error”).
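A sketch of the latency and error fields, with assumed key names:

```python
# Illustrative latency and error fields; the key names are assumptions.
request_metadata = {
    "response_time": 1240,             # milliseconds
    "status_code": 500,                # HTTP status from the LLM provider
    "error": "Internal Server Error",  # only set when the call failed
}
```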
Token Usage and Cost
Athina will automatically calculate token usage and cost if the language_model_id is a known model.
However, you can also log the cost and token usage manually. These are used for analytics.
The number of input (prompt) tokens used.
The number of output (completion) tokens used.
The number of total tokens used. If this is not logged, we will simply add prompt_tokens and completion_tokens.
The cost of the inference. If this is not provided, we will automatically calculate the cost if the language_model_id is a known model.
Tip: If you log the entire OpenAI ChatCompletion response object to us, we’ll automatically extract the token counts and cost.
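For example, manually logged usage and cost might look like this. The prompt_tokens and completion_tokens names are described above; the cost key name is an assumption:

```python
# Manual token usage and cost. prompt_tokens and completion_tokens are
# the field names from this page; the "cost" key name is an assumption.
usage_fields = {
    "prompt_tokens": 512,
    "completion_tokens": 128,
    "total_tokens": 640,   # optional: defaults to prompt_tokens + completion_tokens
    "cost": 0.0042,        # auto-calculated for known models if omitted
}
```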
Segmentation Fields
Optionally, you can also add the following fields for better segmentation on the dashboard.
The identifier for the prompt used for inference. This is useful for segmenting inference calls by prompt.
The environment your app is running in (ex: production, staging, etc.). This is useful for segmenting inference calls by environment.
The customer ID. This is useful for segmenting inference calls by customer.
The end user ID. This is useful for segmenting inference calls by the end user.
A list of tags to associate with the log. This is useful for segmenting inference calls by tags.
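A sketch of these segmentation fields, with assumed key names (only the example environment values come from this page):

```python
# Illustrative segmentation fields; the key names are assumptions.
segmentation_fields = {
    "prompt_slug": "customer-support/v3",  # identifier for the prompt used
    "environment": "production",           # e.g. "production", "staging"
    "customer_id": "acme-corp",
    "external_user_id": "user-1234",
    "tags": ["beta", "support-bot"],
}
```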
Topics
For additional segmentation of data, you can log a topic string to associate with the log. This topic can then be used to filter logs on the dashboard, show comparisons, and drive granular analytics.
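For example (the topic key name is an assumption):

```python
# Illustrative topic field; the key name is an assumption.
log_fields = {"topic": "billing"}  # filter, compare, and slice analytics by topic
```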
Logging Conversations
To group inferences into a conversation or chat session, just include the session_id field.
The session or conversation ID. This is used for grouping different inferences into a conversation or chain. Read more.
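For example, every turn of the same chat would share one session_id (the prompt and response key names are again assumptions):

```python
# Group related inferences into one conversation via session_id.
chat_turn_log = {
    "language_model_id": "gpt-4o",
    "session_id": "chat-7f3a2b",  # same ID across all turns of this conversation
    "prompt": [{"role": "user", "content": "And what about exchanges?"}],  # key name assumed
    "response": "Exchanges follow the same 30-day window.",                # key name assumed
}
```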
Logging Custom Attributes
You can log any custom attributes to Athina to be shown as metadata.
An external reference ID for the inference. This can be used to update the logs later. Read more.
Additional metadata to add to this log.
Optionally, you can also log custom attributes with your prompt. Pass each attribute name and value as a key-value pair in the custom_attributes object.
Note: A prompt run cannot have duplicate attribute names.
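A sketch of custom attribute logging. The custom_attributes key is described above; the external_reference_id key name is an assumption:

```python
# custom_attributes takes arbitrary key-value pairs; attribute names
# must be unique within a prompt run.
log_fields = {
    "external_reference_id": "order-20240611-001",  # key name assumed; lets you update this log later
    "custom_attributes": {
        "plan_tier": "enterprise",
        "ab_test_variant": "B",
    },
}
```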
Ground Truth
If you have ground truth responses, you can log them here. Ground truth responses are required for some evals, like Answer Similarity.
Ground-truth or reference response.
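For example (the expected_response key name is an assumption):

```python
# Illustrative ground-truth field; the key name is an assumption.
ground_truth = {
    "expected_response": "Refunds are issued within 30 days of purchase.",
}
```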
Tools
For tools, you may log the following fields:
A list of tools (defined as JSON) that the model may call. This should be an array of tool definitions.
Controls the model’s tool calls. This can be none, auto, or a specific tool name.
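A sketch of these tool fields, following the OpenAI tool-definition shape (the tools and tool_choice key names are assumptions):

```python
# Illustrative tool fields; key names are assumptions. The tool
# definition itself follows the OpenAI function-tool JSON shape.
tool_fields = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # or "none", or a specific tool name
}
```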
Function Calling
For function calling, you may also log the following fields:
The function call request.
The function call response.
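A sketch of these fields, with assumed key names:

```python
# Illustrative function-calling fields; the key names are assumptions.
function_fields = {
    "function_call_request": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
    "function_call_response": '{"temperature_c": 21, "conditions": "sunny"}',
}
```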
Tip: To avoid adding any latency to your application, log your inference as a fire-and-forget request.
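One way to do this in Python is to post the log from a daemon thread, so a slow or failed logging call never blocks the user-facing response. The endpoint URL and header below are illustrative placeholders, not Athina's actual values:

```python
# Fire-and-forget logging sketch. The URL and header are placeholders;
# use the values from your Athina setup.
import threading

import requests

def log_to_athina(payload: dict) -> None:
    def _send():
        try:
            requests.post(
                "https://logging.example.com/api/v1/log/inference",  # illustrative URL
                json=payload,
                headers={"athina-api-key": "YOUR_API_KEY"},          # illustrative header
                timeout=5,
            )
        except requests.RequestException:
            pass  # never let a logging failure affect the app

    threading.Thread(target=_send, daemon=True).start()
```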
Feedback
Learn how to update logs by ID or by external reference ID.
A number representing the end user’s feedback. For example, 1 for positive feedback, -1 for negative feedback.
A comment from the end user about the response.
A number representing the grader’s feedback. For example, 1 for positive feedback, -1 for negative feedback.
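A sketch of these feedback fields, with assumed key names:

```python
# Illustrative feedback fields; the key names are assumptions.
feedback_update = {
    "user_feedback": 1,                              # 1 = positive, -1 = negative
    "user_feedback_comment": "Answer was spot on.",
    "grader_feedback": -1,
}
```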
Model Options
Log model options to get insights into how model behavior affects your end users.
The model_options key is optional, but it is required to reproduce LLM requests.
The model temperature, usually between 0.0 and 2.0.
The maximum number of tokens for the model to generate.
The stop sequence(s) for the model to use. This can be a single string or an array of strings that indicate where the model should stop generating further tokens.
Uses nucleus sampling for choosing tokens.
An object of key-value pairs. Log any key-value pairs that may help with recreating this LLM request.
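For example, a model_options object matching the fields above:

```python
# model_options is a free-form object; these keys match the fields
# described above.
model_options = {
    "temperature": 0.7,       # usually between 0.0 and 2.0
    "max_tokens": 512,
    "stop": ["\n\n", "END"],  # a single string or an array of strings
    "top_p": 0.9,             # nucleus sampling
}
```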