TestCase

A TestCase represents a single evaluation scenario for testing agent behavior.

Import

from openstackai.evaluation import TestCase

Constructor

TestCase(
    input: str,                      # Input prompt/question
    expected_output: str = None,     # Expected response (optional)
    criteria: list[str] = None,      # Criteria to evaluate
    metadata: dict = None,           # Additional metadata
    name: str = None,                # Test case name
    tags: list[str] = None,          # Tags for filtering
    timeout: float = 30.0            # Timeout in seconds
)

Creating Test Cases

Basic Test Case

test = TestCase(
    input="What is the capital of France?",
    expected_output="Paris",
    criteria=["accuracy"]
)

Open-Ended Test Case

test = TestCase(
    input="Write a haiku about coding",
    expected_output=None,  # No exact expected output
    criteria=["creativity", "format"]
)

With Metadata

test = TestCase(
    input="Solve: 15 * 7 + 3",
    expected_output="108",
    name="math_test_001",
    tags=["math", "arithmetic"],
    metadata={
        "difficulty": "easy",
        "category": "multiplication"
    }
)

Class Methods

from_dict()

Create from dictionary:

data = {
    "input": "What is 2+2?",
    "expected_output": "4",
    "criteria": ["accuracy"]
}
test = TestCase.from_dict(data)

from_yaml()

Load from YAML file:

# test_case.yaml
input: "Summarize this article"
expected_output: null
criteria:
  - relevance
  - conciseness
tags:
  - summarization

test = TestCase.from_yaml("test_case.yaml")

Batch Creation

# Create multiple test cases
test_cases = [
    TestCase(input="2+2", expected_output="4"),
    TestCase(input="3*4", expected_output="12"),
    TestCase(input="10/2", expected_output="5"),
]

# Or from a list of dicts
test_cases = TestCase.batch_create([
    {"input": "Hello", "expected_output": "Hi"},
    {"input": "Goodbye", "expected_output": "Bye"},
])

Properties

Property	Type	Description
`input`	str	The input prompt
`expected_output`	str	Expected response
`criteria`	list	Evaluation criteria
`name`	str	Test case identifier
`tags`	list	Categorization tags
`metadata`	dict	Additional data

Import​

Constructor​

Creating Test Cases​

Basic Test Case​

Open-Ended Test Case​

With Metadata​

Class Methods​

from_dict()​

from_yaml()​

Batch Creation​

Properties​

See Also​

Import

Constructor

Creating Test Cases

Basic Test Case

Open-Ended Test Case

With Metadata

Class Methods

from_dict()

from_yaml()

Batch Creation

Properties

See Also