Evaficy Smart Test

Writing Effective Inputs for AI Test Case Generation

The quality of your AI-generated test cases depends entirely on what you put in. Learn how to write test types, component fields, custom fields, and acceptance criteria that unlock better, more precise test suites — and how to iterate when the first generation is not quite right.

For QA Engineers and Product Owners setting up scenarios in Evaficy Smart Test.


Why Inputs Matter More Than the AI Model

AI test case generation is pattern recognition applied to the inputs you provide. The model analyzes your test type, affected component, custom fields, and acceptance criteria, then generates cases that are logically consistent with what you described. If you describe the wrong thing, you get the wrong cases. If you describe too little, you get too few.

This is the GIGO principle applied to QA: Garbage In, Garbage Out. A powerful AI model given vague inputs produces vague test cases. A weaker model given precise, structured inputs produces usable, specific test cases. The bottleneck is almost always the inputs, not the model.

The good news is that improving your inputs is a learnable, repeatable skill. The sections below cover each input field in order of impact — starting with test type selection and ending with the acceptance criteria patterns that unlock the most thorough AI-generated coverage.

The single highest-impact change you can make

If you only change one thing about how you write scenarios, make it this: add explicit "must not" constraints to your acceptance criteria. Positive-only AC produces happy-path cases. Negative constraints produce the failure mode cases that actually find bugs.


Choosing the Right Test Type

Test type is the highest-level signal the AI receives. It shifts what kinds of cases the model prioritises, how broad the scope of generation is, and what failure modes it looks for. Choosing the wrong type does not break generation — but it misaligns the output with your actual testing goal.

Functional

What the AI generates for this type:
  • Happy path flows (user completes the feature as designed)
  • Negative paths and invalid input handling
  • State transitions (e.g., draft → submitted → approved)
  • Boundary conditions at field limits
  • Expected error messages and validation responses
Input tip for this type

Use a precise component name ("Password Reset > Email delivery") and write acceptance criteria as individual "must" and "must not" behaviors. The more complete your AC, the more complete the generated case set.
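The state-transition and negative-path bullets above translate directly into executable checks. A minimal sketch, assuming a hypothetical `Document` workflow with the draft → submitted → approved states from the list — nothing here is Evaficy-specific:

```python
# Hypothetical stand-in for a feature with draft -> submitted -> approved states.
class Document:
    ALLOWED = {("draft", "submitted"), ("submitted", "approved")}

    def __init__(self):
        self.state = "draft"

    def transition(self, target):
        if (self.state, target) not in self.ALLOWED:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target

# Happy path: the full designed flow succeeds.
doc = Document()
doc.transition("submitted")
doc.transition("approved")
assert doc.state == "approved"

# Negative path: skipping a state must fail with a clear error,
# and the document must stay in its previous state.
doc2 = Document()
try:
    doc2.transition("approved")   # draft -> approved is not allowed
    raise AssertionError("expected ValueError")
except ValueError:
    pass
assert doc2.state == "draft"
```

Each ✓/✗ criterion in your AC should be recoverable as one assertion like these — if it cannot be, the criterion is probably too vague to generate from.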

Regression

What the AI generates for this type:
  • Integration boundary cases (areas that interact with the changed component)
  • Cross-feature dependency checks
  • Previously passing paths that could be affected
  • Data persistence and state retention scenarios
  • Permission and access control edge cases
Input tip for this type

Broaden the component scope intentionally ("User Authentication module" rather than a single button) and include requirement references or prior story IDs in the Requirement field. This signals to the AI that coverage across related areas matters.

Smoke

What the AI generates for this type:
  • Critical path steps only — login, core navigation, key actions
  • Primary happy paths with no variation
  • Build-breaking assertions (page loads, auth works, core data displays)
Input tip for this type

Keep acceptance criteria minimal — only the "must work" behaviors that indicate a stable build. Over-specifying AC for a smoke test causes the AI to generate too many cases, defeating the purpose of a fast build check.

Exploratory

What the AI generates for this type:
  • Unusual input combinations and character encodings
  • Multi-step user journeys with unexpected mid-flow state changes
  • Concurrent session or shared-state edge cases
  • Permission boundary probes (what can this role access?)
  • Recovery scenarios after failed or interrupted flows
Input tip for this type

In the Component field, describe user personas or session states ("Guest user with partially completed onboarding"). In Acceptance Criteria, add "must not" constraints — these unlock the negative path seed cases exploratory testers start from.


The Affected Page or Component Field

The component field tells the AI where in the product this scenario lives. It sets the spatial scope of generation — which UI elements, API endpoints, user states, and data flows are in scope. A vague component field forces the AI to guess at scope and produces cases that may apply to multiple unrelated parts of the product.

Weak — too broad
  • "Checkout"
  • "User profile"
  • "Forms"
  • "Settings"
  • "Admin"
Strong — precise scope
  • "Checkout > Payment step (logged-in user)"
  • "User profile > Avatar upload"
  • "Signup form > Email field validation"
  • "Settings > Email notification preferences"
  • "Admin > User role assignment"

Including user state context in the component field is particularly valuable. "Login page" and "Login page (user with MFA enabled)" generate substantially different cases. The AI uses the state context to scope precondition steps, test data requirements, and the specific failure modes that apply in that state.

Use the path structure your team already uses

If your team refers to features as "Checkout > Payment" or "Settings > Notifications" in your product backlog or design system, use that exact naming in the Component field. Consistency between your product language and your test scenarios makes review and validation faster for everyone.


Custom Fields That Shape Output

Beyond test type and component, four custom fields give the AI additional context that refines the scope, environment, and requirements of the scenario. Each field adds a dimension of specificity that directly translates into more targeted generated cases.

Feature
Narrows scope to a single user-facing capability

The Feature field tells the AI which product capability this scenario covers. A vague feature ("Settings") generates generic cases across the entire settings area. A specific one ("Settings > Email notification preferences") targets a single, testable capability and produces directly executable steps.

Weak: Settings
Strong: Settings > Email notification preferences
Browser / Environment
Generates environment-specific edge cases

Specifying an environment causes the AI to generate cases for rendering differences, input behaviour variations, and platform-specific constraints. Without it, the AI generates generic cases that assume a consistent environment — missing the real-world divergences that cause production defects.

Weak: (not specified)
Strong: Safari on iOS 17 / mobile viewport
Component
Targets a specific UI element or API layer

The Component field pins the AI to a specific implementation layer. Use it to distinguish between the UI surface ("Checkout form — payment step"), the API endpoint ("POST /orders"), or a backend process ("Order confirmation email trigger"). This prevents the AI from generating cases that belong to adjacent components.

Weak: Checkout
Strong: Checkout > Payment step (logged-in user, non-empty cart)
Requirement
Attaches user stories or acceptance criteria for richer output

The Requirement field accepts user story text, story IDs, or raw acceptance criteria. Linking a user story gives the AI the business context behind the feature — not just what it does, but why, and who for. This context is particularly valuable for generating UAT-style cases and business-logic validations.

Weak: (not specified)
Strong: JIRA-1042 — As a returning customer, I want to reuse my saved card so that I can complete checkout faster
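Taken together, the four custom fields form one structured scenario. The sketch below is a plain-Python snapshot of what strong values look like side by side — the dictionary keys mirror the guide's field names for illustration only, not any real Evaficy API, and the feature name is hypothetical:

```python
# Illustrative only: a plain-Python snapshot of one scenario's custom fields.
# Keys and the "feature" value are hypothetical, not Evaficy's data model.
scenario = {
    "test_type": "Functional",
    "feature": "Checkout > Saved card reuse",
    "browser_environment": "Safari on iOS 17 / mobile viewport",
    "component": "Checkout > Payment step (logged-in user, non-empty cart)",
    "requirement": "JIRA-1042 — As a returning customer, I want to reuse "
                   "my saved card so that I can complete checkout faster",
}

# A rough self-check before generating: the context fields should be
# multi-word and scoped, never a bare one-word value like "Checkout".
for name in ("feature", "component", "requirement"):
    assert len(scenario[name].split()) > 1, f"{name} looks too vague"
```

A quick lint like this catches the most common weak-input mistake — a one-word component or an empty requirement — before generation, not after.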

Writing Acceptance Criteria That the AI Can Use

Acceptance criteria is the most important input. It is the specification the AI generates from — the source of truth for what the feature must do, what it must not do, and what the boundaries of correct behavior are. Three principles separate AC that produces thorough test coverage from AC that produces surface-level cases.

01

Structured over free-text

Given/When/Then (Gherkin) criteria or bulleted one-line criteria significantly outperform paragraphs. Free-text AC forces the AI to infer intent from prose — it will produce cases, but they will be broader and less precise. Structured AC gives the AI discrete, testable statements that map directly to test steps and expected results.

# Paragraph (AI has to interpret)
"When the user submits the login form with valid credentials,
they should be authenticated and taken to their dashboard,
unless the account is locked or inactive."
# Structured (AI maps directly to test cases)
✓ Must log in with valid email + correct password
✓ Must redirect to /dashboard after successful login
✗ Must not log in with correct email + wrong password
✗ Must not log in if account status is "suspended"
✗ Must not log in if account status is "pending verification"
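To see why the structured form maps so cleanly, here is a sketch in which each criterion above becomes exactly one assertion. The `login` function and the account data are hypothetical stubs standing in for the real system under test:

```python
# Hypothetical stub for the system under test; accounts mirror the AC above.
ACCOUNTS = {
    "alice@example.com": {"password": "s3cret", "status": "active"},
    "bob@example.com": {"password": "hunter2", "status": "suspended"},
    "carol@example.com": {"password": "pass123", "status": "pending verification"},
}

def login(email, password):
    """Return the redirect path on success, None on failure."""
    account = ACCOUNTS.get(email)
    if account is None or account["password"] != password:
        return None
    if account["status"] != "active":
        return None
    return "/dashboard"

# One structured criterion -> one assertion, each able to fail independently.
assert login("alice@example.com", "s3cret") == "/dashboard"   # valid login + redirect
assert login("alice@example.com", "wrong") is None            # wrong password rejected
assert login("bob@example.com", "hunter2") is None            # suspended account rejected
assert login("carol@example.com", "pass123") is None          # pending verification rejected
```

The paragraph version of the same requirement leaves the suspended and pending-verification cases to the model's interpretation; the structured version makes them unavoidable.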

02

One criterion per line

Each line of acceptance criteria maps to one or more test cases. When you group multiple behaviors into a single line, the AI either generates one combined case (which is hard to execute and harder to fail independently) or misses some behaviors entirely. One behavior per line is the single most reliable way to increase both the count and precision of generated cases.

# Grouped (two behaviors, generates one case)
"User can upload profile photo and it must be under 5 MB"
# Separated (one behavior per line, generates complete coverage)
✓ Must accept JPEG, PNG, and WebP image formats
✓ Must accept files up to 5 MB in size
✗ Must reject files larger than 5 MB with an error message
✗ Must reject files in unsupported formats (PDF, GIF, SVG)
✗ Must not silently ignore an upload failure
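The same one-criterion-per-line discipline carries through to execution: each line can pass or fail on its own. A minimal sketch, assuming a hypothetical `validate_upload` function with the 5 MB limit and formats from the criteria above:

```python
# Hypothetical upload validator; limits mirror the criteria above.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_FORMATS = {"jpeg", "png", "webp"}

def validate_upload(filename, size_bytes):
    """Return (ok, error_message); never fail silently."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED_FORMATS:
        return False, f"unsupported format: {ext}"
    if size_bytes > MAX_BYTES:
        return False, "file exceeds 5 MB limit"
    return True, ""

# One assertion per criterion, so each can fail independently.
assert validate_upload("avatar.png", 1024) == (True, "")
assert validate_upload("avatar.webp", MAX_BYTES) == (True, "")  # boundary: exactly 5 MB accepted
ok, msg = validate_upload("avatar.jpeg", MAX_BYTES + 1)
assert not ok and msg                                           # oversized file rejected, with a message
ok, msg = validate_upload("avatar.pdf", 1024)
assert not ok and "unsupported" in msg                          # wrong format rejected
```

Note how the boundary check (exactly 5 MB) only exists because the size limit was written as its own line — the grouped version would never have generated it.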

03

Include "must not" constraints

Positive criteria ("must do X") generate happy path and positive validation cases. Negative criteria ("must not allow Y") generate negative paths, invalid input handling, and security boundary cases. Most AC is written positively, and AI generation reflects that bias: the cases most likely to catch real bugs (wrong password accepted, expired session reused, oversized file silently accepted) are generated only when the constraint is written explicitly.

# Positive-only AC (misses most failure modes)
✓ Session must expire after 30 minutes of inactivity
# With negatives (generates the cases that matter)
✓ Session must expire after 30 minutes of inactivity
✗ Must not allow access to authenticated routes after session expiry
✗ Must not extend session on page refresh without user interaction
✗ Must not store session token after explicit logout
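The session-expiry criteria above can be exercised the same way. A sketch assuming a hypothetical `Session` model with the 30-minute timeout; the real implementation will differ:

```python
# Hypothetical session model; 30-minute timeout as in the criteria above.
TIMEOUT_SECONDS = 30 * 60

class Session:
    def __init__(self, now):
        self.last_activity = now
        self.token = "tok-123"

    def is_valid(self, now):
        # Invalid once the token is gone or the inactivity window has elapsed.
        return self.token is not None and now - self.last_activity < TIMEOUT_SECONDS

    def logout(self):
        self.token = None   # must not retain the token after explicit logout

t0 = 1_000_000.0
s = Session(now=t0)
assert s.is_valid(now=t0 + 60)                       # active within the window
assert not s.is_valid(now=t0 + TIMEOUT_SECONDS + 1)  # must not allow access after expiry

s2 = Session(now=t0)
s2.logout()
assert not s2.is_valid(now=t0 + 1)                   # must not keep the session after logout
```

The positive-only AC would have produced only the first assertion; the two "must not" lines are what generate the expiry and logout checks.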

Before/After Examples: Weak Inputs vs. Strong Inputs

The following examples show the same four feature scenarios written with weak inputs and with strong inputs. The transformation is specific, repeatable, and directly reflects the three AC principles above.

Weak inputs
  Test type: Functional
  Component: Login
  Acceptance criteria: User should be able to log in with their credentials.

Strong inputs
  Test type: Functional
  Component: Login page (returning user, active account)
  Acceptance criteria:
  ✓ Must accept valid email address + correct password
  ✓ Must redirect to /dashboard after successful login
  ✓ Must show a persistent session (remain logged in on browser refresh)
  ✗ Must not accept correct email + wrong password
  ✗ Must not accept a deactivated account regardless of password
  ✗ Must not accept an unverified email account
  ✗ Must display a specific error for wrong password (not a generic failure)
  ✗ Must lock the account after 5 consecutive failed attempts

Why this matters

The weak version generates one or two generic cases. The strong version generates 8+ cases covering the full authentication surface — including the security edge cases most likely to cause production incidents.

Weak inputs
  Test type: Functional
  Component: Checkout
  Acceptance criteria: Payment should work correctly.

Strong inputs
  Test type: Functional
  Component: Checkout > Payment step (logged-in user, cart value > £0)
  Acceptance criteria:
  ✓ Must accept a valid Visa card with correct CVV and future expiry
  ✓ Must display an order confirmation page after successful payment
  ✓ Must send a confirmation email within 2 minutes of successful payment
  ✗ Must not accept an expired card
  ✗ Must not accept an incorrect CVV
  ✗ Must not proceed if the card is declined by the payment gateway
  ✗ Must not charge the card twice if the user refreshes the confirmation page
  ✗ Must display a clear, actionable error message for each failure type

Why this matters

"Payment should work correctly" is not a testable criterion — it tells the AI almost nothing. The strong version locks the scope to a specific step, user state, and cart state, then defines both success conditions and every meaningful failure mode the payment flow can encounter.

Weak inputs
  Test type: Functional
  Component: Signup form
  Acceptance criteria: Form validation should work. Required fields must be filled in. Email should be valid.

Strong inputs
  Test type: Functional
  Component: Signup form (unauthenticated user)
  Acceptance criteria:
  ✓ Must accept a valid email address format (user@domain.tld)
  ✓ Must accept a password with 8+ characters including at least one number
  ✓ Must show inline validation errors on blur (not only on submit)
  ✗ Must not accept an email address without an @ symbol
  ✗ Must not accept an email address with spaces
  ✗ Must not accept a password shorter than 8 characters
  ✗ Must not accept a password with no numeric characters
  ✗ Must not submit the form if any required field is empty
  ✗ Must not allow submission of an email address already registered

Why this matters

Grouping "form validation should work" into one criterion generates one or two surface-level cases. Separated criteria — each with a specific format rule — generate individual test cases for every validation path, including the boundary conditions that catch real user-facing bugs.

Weak inputs
  Test type: Functional
  Component: Notifications
  Acceptance criteria: Users should receive email notifications when relevant events happen.

Strong inputs
  Test type: Functional
  Component: Email notification — task assigned trigger
  Acceptance criteria:
  ✓ Must send an email to the assignee within 60 seconds of a task being assigned
  ✓ Email must include: task title, assigning user's name, and a direct link to the task
  ✓ Must send even if the assignee is currently logged in
  ✗ Must not send if the assignee has disabled task assignment notifications in settings
  ✗ Must not send a duplicate notification if the task is reassigned to the same user
  ✗ Must not send if the task is assigned to the currently authenticated user by themselves

Why this matters

The vague version generates a single case that checks "email received" and nothing else. The strong version generates 6+ cases covering trigger timing, content requirements, suppression rules, and duplicate prevention — the scenarios that actually fail in production.


Iterating: When to Regenerate vs. When to Edit

AI generation is rarely perfect on the first pass — and it should not need to be. The right approach is to treat generation as a first draft and decide quickly whether you need to improve the inputs and regenerate, or whether the draft is close enough to refine by editing. The two are not equivalent: editing a fundamentally misaligned generation wastes time; regenerating when the draft is 90% correct is unnecessary.

When to regenerate
  • The test type was wrong — you asked for Smoke but needed Functional (the scope is completely different)
  • Acceptance criteria were too vague to generate useful cases — the AI produced placeholder-level steps
  • The component scope was too broad — you got cases for the whole module instead of the specific page
  • Your requirements changed significantly since generation — the existing cases describe the old behavior
  • Generated cases test a path that does not match the actual UI — the AC was misaligned with the implementation
When to edit
  • AI output is 80–90% correct — a few step descriptions need adjusting to match the actual UI wording
  • One or two specific edge cases are missing that you know from domain knowledge
  • Step wording is accurate but could be clearer for the team executing the run
  • A generated case combines two assertions that should be separate test cases for cleaner failure isolation
  • You want to add manual exploratory cases alongside the AI-generated structural ones
Improve your inputs before regenerating — not after

The most efficient workflow is: write strong inputs first, generate once, then edit the 10–20% that needs adjustment. Teams that generate with weak inputs and then regenerate multiple times are doing rework that better inputs would have avoided entirely.


Related guides
AI Test Case Generation — How It Works
What the AI analyzes before it generates, the types of cases it produces, and how generation fits into the QA workflow.
Acceptance Criteria — The Foundation of Effective QA
How to write acceptance criteria in formats that drive test coverage — with before/after examples across common feature types.
How to Write Test Cases That Actually Catch Bugs
Anatomy of a good test case, common mistakes, and when to supplement AI output with manually written cases.
Try it yourself — create a scenario

Write your acceptance criteria, choose a test type, and let the AI generate your first complete test suite. Edit, extend, and execute — all in one platform.
