Other Evaluators
These are evaluators we have custom-built for specific customers. We are working on integrating them into our GitHub repository.
If you would like to use one of these, please contact us at hello@athina.ai, or sign up to stay notified as we release new evals.
API Call
Calls an external API of your choosing, so you can plug your own custom evals into Athina.
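As a rough illustration, here is a minimal sketch of what such a call might look like, assuming a hypothetical eval service that accepts the query/response pair and returns a verdict. The endpoint URL and response schema below are invented for demonstration, not Athina's actual contract.

```python
import requests

def run_custom_eval(query: str, llm_response: str) -> dict:
    """POST the datapoint to your own eval service and return its verdict.

    The endpoint URL and response schema here are hypothetical.
    """
    payload = {"query": query, "response": llm_response}
    resp = requests.post("https://evals.example.com/my-eval", json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()  # e.g. {"passed": True, "score": 0.92, "reason": "..."}
```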
Language Mismatch
Engine: gpt-3.5-turbo
Detect when the LLM response is in a different language from the user's query.
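The hosted eval uses gpt-3.5-turbo. Purely as an LLM-free illustration of the same check, here is a sketch using the langdetect library; this is our assumption for demonstration, not Athina's implementation.

```python
from langdetect import DetectorFactory, detect  # pip install langdetect

DetectorFactory.seed = 0  # make langdetect deterministic across runs

def language_mismatch(user_query: str, llm_response: str) -> bool:
    """Flag a mismatch when the detected language codes differ (e.g., 'en' vs 'fr')."""
    return detect(user_query) != detect(llm_response)

# An English query answered in French should be flagged:
print(language_mismatch("What is the capital of France?", "La capitale de la France est Paris."))
```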
Sensitive Data Leak
Engine: gpt-3.5-turbo
Detect when the user query or LLM response contains personally identifiable information (PII).
Examples: names, emails, phone numbers, social security numbers, credit card information, etc.
You can configure the eval for different types of PII leaks.
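The hosted eval uses gpt-3.5-turbo. Purely as an illustration of configurable PII types, here is a regex-only sketch; the patterns are simplified assumptions and will miss many real-world formats.

```python
import re

# Simplified, hypothetical patterns; real PII detection needs far more robust rules.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def detect_pii(text, enabled=None):
    """Return matches per enabled PII type; all types are checked by default."""
    enabled = enabled or PII_PATTERNS.keys()
    hits = {name: PII_PATTERNS[name].findall(text) for name in enabled}
    return {name: matches for name, matches in hits.items() if matches}

# Check only the types you care about:
print(detect_pii("Reach me at jane@example.com", enabled=["email", "ssn"]))
```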
Content Moderation
Engine: OpenAI
Uses OpenAI's content moderation endpoint to determine whether a response is harmful, toxic, violent, threatening, or sexual.
Eval thresholds can be configured.
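For reference, a minimal sketch of calling the moderation endpoint and applying per-category thresholds. The threshold logic is our illustration of configurability; Athina's exact configuration may differ.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def passes_moderation(text: str, thresholds: dict[str, float]) -> bool:
    """Return False if any configured category score exceeds its threshold."""
    scores = client.moderations.create(input=text).results[0].category_scores
    return all(getattr(scores, category) < limit for category, limit in thresholds.items())

# Hypothetical configuration: fail on violence or hate scores above 0.5.
print(passes_moderation("Have a nice day!", {"violence": 0.5, "hate": 0.5}))
```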
Prompt Injection Attacks
[Coming Soon]
Common Mistakes
[Coming Soon]
Restricted Keywords
String Match
Detect when your LLM output contains restricted keywords.
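A minimal sketch of the string-match check, assuming case-insensitive substring matching; the exact matching rules are an assumption.

```python
def restricted_keyword_hits(output: str, restricted: list[str]) -> list[str]:
    """Restricted keywords that appear in the LLM output (case-insensitive substring match)."""
    lowered = output.lower()
    return [kw for kw in restricted if kw.lower() in lowered]

# The eval fails if this list is non-empty:
print(restricted_keyword_hits("We guarantee a full refund.", ["guarantee", "lawsuit"]))
```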
Critical Keywords
String Match
Detect when your LLM output is missing critical keywords.
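The mirror image of the check above: fail when any required keyword is absent, again assuming case-insensitive substring matching.

```python
def missing_critical_keywords(output: str, critical: list[str]) -> list[str]:
    """Critical keywords the LLM output fails to mention (case-insensitive substring match)."""
    lowered = output.lower()
    return [kw for kw in critical if kw.lower() not in lowered]

# The eval fails if anything is missing:
print(missing_critical_keywords("In an emergency, call 911.", ["911", "emergency"]))
```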
Hallucinated Link
Engine: Regex | HTTP
If your LLM response contains a link, we check whether the link is invalid (e.g., returns a 404).
No LLM, just good old regex + an HTTP request.
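A minimal sketch of the regex + HTTP approach; the URL pattern and the use of HEAD requests are our assumptions about the implementation.

```python
import re
import requests

URL_PATTERN = re.compile(r"https?://[^\s)\"']+")

def find_broken_links(response_text: str) -> list[str]:
    """Return links in the LLM response that do not resolve (e.g., 404)."""
    broken = []
    for url in URL_PATTERN.findall(response_text):
        try:
            resp = requests.head(url, allow_redirects=True, timeout=5)
            if resp.status_code >= 400:
                broken.append(url)
        except requests.RequestException:
            broken.append(url)  # unreachable hosts count as invalid too
    return broken

print(find_broken_links("See https://example.com/this-page-does-not-exist for details."))
```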
Hallucinated Email
Engine: gpt-3.5-turbo | Regex
If your LLM response contains an email that was not part of your provided context, it is likely a hallucinated email.
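This eval combines gpt-3.5-turbo with regex. As a sketch of the regex half alone, one might extract emails from the response and check them against the context; the pattern below is a simplified assumption.

```python
import re

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def hallucinated_emails(llm_response: str, context: str) -> list[str]:
    """Emails present in the response but absent from the provided context."""
    context_emails = set(EMAIL_PATTERN.findall(context))
    return [e for e in EMAIL_PATTERN.findall(llm_response) if e not in context_emails]

# "support@acme.io" is flagged because it never appears in the context:
print(hallucinated_emails("Contact support@acme.io.", "Our email is help@acme.io."))
```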
…