These evaluators run a defined function on the response.
How does it work
A function evaluator runs a provided function, along with its arguments, on the response and returns whether the function passed or not.
Required Args
Your dataset must contain these fields:
response: The LLM-generated response for the user query.
Metrics
Passed: Boolean (True/False) value specifying whether the function passed or not.
Description: Checks if the response contains the regex pattern.
Arguments:
pattern: str - Pattern to search for.
Sample Code:
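The check can be pictured as a regex search over the response. The sketch below only illustrates that logic with Python's standard re module. The helper name regex_passes is hypothetical and is not the evaluator's actual API, and the sketch assumes the pattern may match anywhere in the response.

```python
import re


def regex_passes(response: str, pattern: str) -> bool:
    """Return True if `pattern` matches anywhere in `response`."""
    return re.search(pattern, response) is not None


# Hypothetical usage: pass when the response mentions an order number.
print(regex_passes("Order #12345 has shipped", r"#\d+"))
print(regex_passes("No order number here", r"#\d+"))
```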
Description: Checks if the response contains any word from the list of keywords.
Arguments:
keywords: List[str] - List of keywords.
case_sensitive: Optional[bool] - Defaults to False.
Sample Code:
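As a rough illustration of how the keywords and case_sensitive arguments interact, here is a minimal sketch. The function contains_any is a hypothetical stand-in, not the library's implementation, and it assumes plain substring matching, which this page does not confirm.

```python
from typing import List


def contains_any(response: str, keywords: List[str], case_sensitive: bool = False) -> bool:
    """Return True if at least one keyword occurs in the response."""
    if not case_sensitive:
        # Lower-case both sides so the comparison ignores case.
        response = response.lower()
        keywords = [keyword.lower() for keyword in keywords]
    return any(keyword in response for keyword in keywords)


print(contains_any("Your REFUND is on the way", ["refund", "return"]))
print(contains_any("Your REFUND is on the way", ["refund"], case_sensitive=True))
```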
Description: Checks if the response does not contain any of the specified substrings.
Arguments:
keywords: List[str] - Keywords to check for absence in the response.
Sample Code:
Contains
Description: Checks if the response contains the specified keyword.
Arguments:
keyword: string - String to check for presence in the response.
Sample Code:
ContainsAll
Description: Checks if all the provided keywords are present in the response.
Arguments:
keywords: List[str] - The list of keywords to search for in the response.
case_sensitive: bool, optional - If True, the comparison is case-sensitive. Defaults to False.
Sample Code:
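The following sketch shows one way the "all keywords present" check could work, and also which keywords are missing when it fails. Both missing_keywords and contains_all are hypothetical names, and substring matching is an assumption rather than documented behavior.

```python
from typing import List


def missing_keywords(response: str, keywords: List[str], case_sensitive: bool = False) -> List[str]:
    """Return the keywords that do not occur in the response."""
    haystack = response if case_sensitive else response.lower()
    return [
        keyword
        for keyword in keywords
        if (keyword if case_sensitive else keyword.lower()) not in haystack
    ]


def contains_all(response: str, keywords: List[str], case_sensitive: bool = False) -> bool:
    """Pass only when no keyword is missing."""
    return not missing_keywords(response, keywords, case_sensitive)


text = "Paris is the capital of France"
print(missing_keywords(text, ["paris", "france", "europe"]))  # ['europe']
print(contains_all(text, ["paris", "france"]))  # True
```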
ContainsJson
Description: Checks if the response contains valid JSON.
Arguments: None
Sample Code:
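To make "contains valid JSON" concrete, the sketch below scans the response for a position where a JSON object or array can be decoded. It is an illustration built on Python's standard json module, not the evaluator's implementation. The name contains_json is hypothetical, and counting only objects and arrays (not bare numbers or strings) is an assumption.

```python
import json


def contains_json(response: str) -> bool:
    """Return True if a JSON object or array can be decoded somewhere in the text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(response):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            decoder.raw_decode(response, start)
            return True
        except json.JSONDecodeError:
            continue
    return False


print(contains_json('Here is the result: {"status": "ok"} as requested.'))
print(contains_json("There is no structured data here."))
```

Note that under this assumption a bracketed citation such as [1] also counts, since it is a valid JSON array.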
ContainsEmail
Description: Checks if the response contains a valid email address.
Arguments: None
Sample Code:
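The difference between "contains an email address" and "is an email address" comes down to searching versus matching the whole string. The sketch below illustrates this with a deliberately simple pattern. The pattern and both helper names are assumptions for illustration, and real email validation rules are considerably looser and more complex than this.

```python
import re

# Simplified pattern (an assumption): local part, "@", domain, dot, TLD of 2+ letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contains_email(response: str) -> bool:
    """Return True if an email-like substring appears anywhere in the response."""
    return EMAIL_PATTERN.search(response) is not None


def is_email(response: str) -> bool:
    """Return True only if the whole response (ignoring outer whitespace) is an email."""
    return EMAIL_PATTERN.fullmatch(response.strip()) is not None


message = "Contact me at jane.doe@example.com today"
print(contains_email(message), is_email(message))
```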
IsJson
Description: Checks if the response is valid JSON.
Arguments: None
Sample Code:
IsEmail
Description: Checks if the response is a valid email address.
Arguments: None
Sample Code:
ContainsLink
Description: Checks if the response contains any links.
Arguments: None
Sample Code:
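A link check of this kind can be approximated by extracting URL-like substrings. The sketch below assumes that a "link" means an http or https URL, which this page does not specify. find_links and contains_link are hypothetical helpers, not part of the library.

```python
import re
from typing import List

# Assumption: a link starts with http:// or https:// and runs until whitespace or a quote.
LINK_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_links(response: str) -> List[str]:
    """Return URL-like substrings, with trailing sentence punctuation removed."""
    return [match.rstrip(".,;:!?)") for match in LINK_PATTERN.findall(response)]


def contains_link(response: str) -> bool:
    """Pass when at least one link is found."""
    return bool(find_links(response))


print(find_links("See https://example.com/docs, or http://example.org."))
# ['https://example.com/docs', 'http://example.org']
```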
ContainsValidLink
Description: Checks if the response contains valid links.
Arguments: None
Sample Code:
NoInvalidLinks
Description: Checks if the response does not contain any invalid links.
Arguments: None
Sample Code:
ApiCall
Description:
Performs an API call to a specified endpoint and picks up the evaluation result from the response. This evaluator is useful when you want to run some complex or custom logic on the response.
Arguments:
url: string - API endpoint to call. Note that this API should accept POST requests.
headers: dict - Headers to include in the API call.
payload: dict - Body to send with the API call. This payload will have the response added to it.
Sample Code:
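The sketch below illustrates the data flow around such a call without performing any network request: how the response might be merged into the payload, and how a reply carrying the result and reason keys described in this section could be checked. The function names and the "response" payload key are assumptions, not the evaluator's documented behavior.

```python
import json
from typing import Any, Dict, Tuple


def build_request_body(payload: Dict[str, Any], response: str) -> Dict[str, Any]:
    """Copy the user payload and add the response (key name is an assumption)."""
    body = dict(payload)
    body["response"] = response
    return body


def parse_evaluation(raw_body: str) -> Tuple[bool, str]:
    """Extract (result, reason) from the API reply and check their types."""
    data = json.loads(raw_body)
    result, reason = data["result"], data["reason"]
    if not isinstance(result, bool):
        raise ValueError("'result' must be a boolean")
    if not isinstance(reason, str):
        raise ValueError("'reason' must be a string")
    return result, reason


print(build_request_body({"threshold": 0.5}, "The answer is 42."))
print(parse_evaluation('{"result": true, "reason": "Answer matches policy"}'))
```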
The API response should contain two keys: result and reason.
- The result key should contain the evaluation result, which should be a boolean value.
- The reason key should contain the reason for the evaluation result, which should be a string.
- The dataset should contain the response and, optionally, the query, context, and expected_response to be passed to the API.
Description: Checks if the response is exactly equal to the specified string.
Arguments:
expected_response: str - String to compare the response with.
Sample Code:
Description: Checks if the response starts with the specified substring.
Arguments:
substring: str - String to check at the start of the response.
Sample Code:
Description: Checks if the response ends with the specified substring.
Arguments:
substring: str - String to check at the end of the response.
Sample Code:
Description: Checks if the length of the response is less than a maximum length.
Arguments:
max_length: int - The maximum allowable length for the response.
Sample Code:
Description: Checks if the length of the response is more than a minimum length.
Arguments:
min_length: int - The minimum allowable length for the response.
Sample Code:
Description: Checks if the length of the response is between the minimum and maximum length.
Arguments:
min_length: int - The minimum allowable length for the response.
max_length: int - The maximum allowable length for the response.
Sample Code:
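This page does not say whether length is measured in characters or whether the bounds are inclusive. The sketch below assumes both (character count, inclusive bounds) purely for illustration. length_between is a hypothetical helper.

```python
def length_between(response: str, min_length: int, max_length: int) -> bool:
    """Return True if the character count lies within [min_length, max_length]."""
    return min_length <= len(response) <= max_length


print(length_between("Hello", 1, 5))   # True: 5 characters, upper bound inclusive here
print(length_between("Hello", 6, 10))  # False: shorter than the minimum
```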
Description: Checks if the response is a single line.
Arguments: None
Sample Code:
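One plausible reading of "single line" is that the response has no line breaks once leading and trailing whitespace is ignored. The sketch below implements that reading. The treatment of trailing newlines and empty responses is an assumption, and is_one_line is a hypothetical name.

```python
def is_one_line(response: str) -> bool:
    """Return True if the stripped response has at most one line."""
    return len(response.strip().splitlines()) <= 1


print(is_one_line("Just one line.\n"))
print(is_one_line("First line\nSecond line"))
```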
Description: Runs custom code as an evaluator.
Arguments:
code: str - Code to be executed. The code should contain a function named main which takes **kwargs as input and returns a boolean value.
Sample Code:
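To show the shape of the code argument, the sketch below defines a code string with a main function that takes **kwargs and returns a boolean, then runs it with a small hypothetical runner. The runner run_custom_code and the "response" keyword are assumptions about how such code might be invoked. Note that exec is not safe for untrusted code.

```python
from typing import Any

# The code string follows the documented contract: a function named main(**kwargs) -> bool.
CODE = '''
def main(**kwargs):
    # Pass when the response is longer than 10 characters.
    return len(kwargs["response"]) > 10
'''


def run_custom_code(code: str, **kwargs: Any) -> bool:
    """Execute the code string and call its main function (illustrative runner only)."""
    namespace: dict = {}
    exec(code, namespace)  # Unsafe for untrusted input; shown only to illustrate the contract.
    main = namespace.get("main")
    if not callable(main):
        raise ValueError("code must define a function named main")
    return bool(main(**kwargs))


print(run_custom_code(CODE, response="A sufficiently long answer"))
print(run_custom_code(CODE, response="short"))
```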
Read more about CustomCodeEval
Description: Validates the JSON structure against a specified JSON schema.
Arguments:
schema: str - The JSON schema to validate against.
Sample Code:
Description: Validates the value of a JSON field against a specified condition.
Arguments:
validations: list - A list of validation rules. Each rule is a dictionary with the following keys:
json_path: str - The JSON path to the field to validate.
validating_function: str - The name of the validation function to use.
Sample Code:
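The sketch below illustrates how a list of rules with json_path and validating_function keys could be applied to a JSON response. Everything beyond those two key names is an assumption: the validator names (is_positive, is_non_empty_string), the simplified dotted path syntax such as $.user.age, and the helper functions are all hypothetical and do not describe the library's supported functions.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical registry of validation functions, keyed by name.
VALIDATORS: Dict[str, Callable[[Any], bool]] = {
    "is_positive": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool) and v > 0,
    "is_non_empty_string": lambda v: isinstance(v, str) and len(v) > 0,
}


def resolve_path(data: Any, json_path: str) -> Any:
    """Resolve a simplified dotted path like '$.user.age' (object keys only)."""
    path = json_path[2:] if json_path.startswith("$.") else json_path
    current = data
    for key in path.split("."):
        current = current[key]
    return current


def validate_json(response: str, validations: List[Dict[str, str]]) -> bool:
    """Return True only if every rule's field exists and passes its validator."""
    data = json.loads(response)
    for rule in validations:
        try:
            value = resolve_path(data, rule["json_path"])
        except (KeyError, TypeError):
            return False  # Field missing or path walks into a non-object.
        if not VALIDATORS[rule["validating_function"]](value):
            return False
    return True


rules = [
    {"json_path": "$.user.age", "validating_function": "is_positive"},
    {"json_path": "$.user.name", "validating_function": "is_non_empty_string"},
]
print(validate_json('{"user": {"age": 30, "name": "Ada"}}', rules))
```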