TestCase

A TestCase represents a single evaluation scenario for testing agent behavior.

Import

from openstackai.evaluation import TestCase

Constructor

TestCase(
    input: str,                   # Input prompt/question
    expected_output: str = None,  # Expected response (optional)
    criteria: list[str] = None,   # Criteria to evaluate
    metadata: dict = None,        # Additional metadata
    name: str = None,             # Test case name
    tags: list[str] = None,       # Tags for filtering
    timeout: float = 30.0         # Timeout in seconds
)
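In the signature above, only `input` is required; every other parameter has a default. As a rough illustration, the sketch below uses a hypothetical stand-in dataclass, `CaseSketch`, which is not the library's class and only mirrors the documented parameters and defaults. How the real `TestCase` stores or normalizes these values is not stated here, so treat this as an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseSketch:
    """Hypothetical stand-in mirroring the documented TestCase parameters."""

    input: str                            # the only required field
    expected_output: Optional[str] = None
    criteria: Optional[list] = None
    metadata: Optional[dict] = None
    name: Optional[str] = None
    tags: Optional[list] = None
    timeout: float = 30.0                 # seconds, per the documented default


# Supplying only the required argument leaves everything else at its default.
minimal = CaseSketch(input="What is the capital of France?")
print(minimal.expected_output, minimal.timeout)
```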

Creating Test Cases

Basic Test Case

test = TestCase(
    input="What is the capital of France?",
    expected_output="Paris",
    criteria=["accuracy"]
)

Open-Ended Test Case

test = TestCase(
    input="Write a haiku about coding",
    expected_output=None,  # No exact expected output
    criteria=["creativity", "format"]
)

With Metadata

test = TestCase(
    input="Solve: 15 * 7 + 3",
    expected_output="108",
    name="math_test_001",
    tags=["math", "arithmetic"],
    metadata={
        "difficulty": "easy",
        "category": "multiplication"
    }
)
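Since tags are described as being "for filtering", one plausible use is selecting a subset of cases before a run. The sketch below does this with plain list comprehensions over a hypothetical stand-in class, `CaseSketch`, rather than the real `TestCase`. The helper `select_by_tag` and the name `haiku_001` are illustrative assumptions; the library may offer its own filtering mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class CaseSketch:
    """Hypothetical stand-in for TestCase with only the fields used here."""

    input: str
    name: str = ""
    tags: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def select_by_tag(cases, tag):
    """Return the cases whose tag list contains `tag` (illustrative helper)."""
    return [case for case in cases if tag in case.tags]


cases = [
    CaseSketch(
        input="Solve: 15 * 7 + 3",
        name="math_test_001",
        tags=["math", "arithmetic"],
        metadata={"difficulty": "easy", "category": "multiplication"},
    ),
    CaseSketch(
        input="Write a haiku about coding",
        name="haiku_001",  # hypothetical name, not from the docs
        tags=["creative"],
    ),
]

math_cases = select_by_tag(cases, "math")

# Metadata can be filtered the same way; .get() tolerates missing keys.
easy_cases = [c for c in cases if c.metadata.get("difficulty") == "easy"]

print([c.name for c in math_cases])
```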

Class Methods

from_dict()

Create a test case from a dictionary whose keys match the constructor parameters:

data = {
    "input": "What is 2+2?",
    "expected_output": "4",
    "criteria": ["accuracy"]
}
test = TestCase.from_dict(data)
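To show what a dictionary-based constructor of this kind typically does, here is a minimal sketch on a hypothetical stand-in class, `CaseSketch`. It is not the library's implementation: rejecting unknown keys with a `ValueError` is an assumption made for illustration, and the real `from_dict()` may ignore or handle extra keys differently.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CaseSketch:
    """Hypothetical stand-in for TestCase; not the library's class."""

    input: str
    expected_output: Optional[str] = None
    criteria: Optional[list] = None

    @classmethod
    def from_dict(cls, data: dict) -> "CaseSketch":
        # Assumption: reject keys that do not correspond to a known field.
        known = {f.name for f in fields(cls)}
        unknown = set(data) - known
        if unknown:
            raise ValueError(f"unexpected keys: {sorted(unknown)}")
        # Dictionary keys map directly onto constructor parameters.
        return cls(**data)


case = CaseSketch.from_dict(
    {"input": "What is 2+2?", "expected_output": "4", "criteria": ["accuracy"]}
)
print(case.input, case.expected_output, case.criteria)
```

Validating keys up front tends to surface typos such as `expected_ouput` early, instead of silently producing a case with no expected output.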

from_yaml()

Load a test case from a YAML file:

# test_case.yaml
input: "Summarize this article"
expected_output: null
criteria:
  - relevance
  - conciseness
tags:
  - summarization

test = TestCase.from_yaml("test_case.yaml")

Batch Creation

# Create multiple test cases
test_cases = [
    TestCase(input="2+2", expected_output="4"),
    TestCase(input="3*4", expected_output="12"),
    TestCase(input="10/2", expected_output="5"),
]

# Or from a list of dicts
test_cases = TestCase.batch_create([
    {"input": "Hello", "expected_output": "Hi"},
    {"input": "Goodbye", "expected_output": "Bye"},
])

Properties

| Property | Type | Description |
| --- | --- | --- |
| input | str | The input prompt |
| expected_output | str | Expected response |
| criteria | list | Evaluation criteria |
| name | str | Test case identifier |
| tags | list | Categorization tags |
| metadata | dict | Additional data |