Evaficy Smart Test

Writing Effective Inputs for AI Test Case Generation

The quality of your AI-generated test cases depends entirely on what you put in. Learn how to write test types, component fields, custom fields, and acceptance criteria that unlock better, more precise test suites — and how to iterate when the first generation is not quite right.

For QA Engineers and Product Owners setting up scenarios in Evaficy Smart Test.


Why Inputs Matter More Than the AI Model

AI test case generation is pattern recognition applied to the inputs you provide. The model analyzes your test type, affected component, custom fields, and acceptance criteria, then generates cases that are logically consistent with what you described. If you describe the wrong thing, you get the wrong cases. If you describe too little, you get too few.

This is the GIGO principle applied to QA: Garbage In, Garbage Out. A powerful AI model given vague inputs produces vague test cases. A weaker model given precise, structured inputs produces usable, specific test cases. The bottleneck is almost always the inputs, not the model.

The good news is that improving your inputs is a learnable, repeatable skill. The sections below cover each input field in order of impact — starting with test type selection and ending with the acceptance criteria patterns that unlock the most thorough AI-generated coverage.

The single highest-impact change you can make

If you only change one thing about how you write scenarios, make it this: add explicit "must not" constraints to your acceptance criteria. Positive-only AC produces happy-path cases. Negative constraints produce the failure mode cases that actually find bugs.


Choosing the Right Test Type

Test type is the highest-level signal the AI receives. It shifts what kinds of cases the model prioritises, how broad the scope of generation is, and what failure modes it looks for. Choosing the wrong type does not break generation — but it misaligns the output with your actual testing goal.

Functional

What the AI generates for this type:
  • Happy path flows (user completes the feature as designed)
  • Negative paths and invalid input handling
  • State transitions (e.g., draft → submitted → approved)
  • Boundary conditions at field limits
  • Expected error messages and validation responses
Input tip for this type

Use a precise component name ("Password Reset > Email delivery") and write acceptance criteria as individual "must" and "must not" behaviors. The more complete your AC, the more complete the generated case set.
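The state-transition and negative-path bullets above translate directly into executable checks. A minimal sketch, assuming a hypothetical `Document` workflow with the draft → submitted → approved states from the list — nothing here is Evaficy-specific:

```python
# Hypothetical stand-in for a feature with draft -> submitted -> approved states.
class Document:
    ALLOWED = {("draft", "submitted"), ("submitted", "approved")}

    def __init__(self):
        self.state = "draft"

    def transition(self, target):
        if (self.state, target) not in self.ALLOWED:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target

# Happy path: the full designed flow succeeds.
doc = Document()
doc.transition("submitted")
doc.transition("approved")
assert doc.state == "approved"

# Negative path: skipping a state must fail with a clear error,
# and the document must stay in its previous state.
doc2 = Document()
try:
    doc2.transition("approved")   # draft -> approved is not allowed
    raise AssertionError("expected ValueError")
except ValueError:
    pass
assert doc2.state == "draft"
```

Each ✓/✗ criterion in your AC should be recoverable as one assertion like these — if it cannot be, the criterion is probably too vague to generate from.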

Regression

What the AI generates for this type:
  • Integration boundary cases (areas that interact with the changed component)
  • Cross-feature dependency checks
  • Previously passing paths that could be affected
  • Data persistence and state retention scenarios
  • Permission and access control edge cases
Input tip for this type

Broaden the component scope intentionally ("User Authentication module" rather than a single button) and include requirement references or prior story IDs in the Requirement field. This signals to the AI that coverage across related areas matters.

Smoke

What the AI generates for this type:
  • Critical path steps only — login, core navigation, key actions
  • Primary happy paths with no variation
  • Build-breaking assertions (page loads, auth works, core data displays)
Input tip for this type

Keep acceptance criteria minimal — only the "must work" behaviors that indicate a stable build. Over-specifying AC for a smoke test causes the AI to generate too many cases, defeating the purpose of a fast build check.

Exploratory

What the AI generates for this type:
  • Unusual input combinations and character encodings
  • Multi-step user journeys with unexpected mid-flow state changes
  • Concurrent session or shared-state edge cases
  • Permission boundary probes (what can this role access?)
  • Recovery scenarios after failed or interrupted flows
Input tip for this type

In the Component field, describe user personas or session states ("Guest user with partially completed onboarding"). In Acceptance Criteria, add "must not" constraints — these unlock the negative path seed cases exploratory testers start from.


The Affected Page or Component Field

The component field tells the AI where in the product this scenario lives. It sets the spatial scope of generation — which UI elements, API endpoints, user states, and data flows are in scope. A vague component field forces the AI to guess at scope and produces cases that may apply to multiple unrelated parts of the product.

Weak — too broad
  • "Checkout"
  • "User profile"
  • "Forms"
  • "Settings"
  • "Admin"
Strong — precise scope
  • "Checkout > Payment step (logged-in user)"
  • "User profile > Avatar upload"
  • "Signup form > Email field validation"
  • "Settings > Email notification preferences"
  • "Admin > User role assignment"

Including user state context in the component field is particularly valuable. "Login page" and "Login page (user with MFA enabled)" generate substantially different cases. The AI uses the state context to scope precondition steps, test data requirements, and the specific failure modes that apply in that state.

Use the path structure your team already uses

If your team refers to features as "Checkout > Payment" or "Settings > Notifications" in your product backlog or design system, use that exact naming in the Component field. Consistency between your product language and your test scenarios makes review and validation faster for everyone.


Custom Fields That Shape Output

Beyond test type and component, four custom fields give the AI additional context that refines the scope, environment, and requirements of the scenario. Each field adds a dimension of specificity that directly translates into more targeted generated cases.

Feature
Narrows scope to a single user-facing capability

The Feature field tells the AI which product capability this scenario covers. A vague feature ("Settings") generates generic cases across the entire settings area. A specific one ("Settings > Email notification preferences") targets a single, testable capability and produces directly executable steps.

Weak: Settings
Strong: Settings > Email notification preferences
Browser / Environment
Generates environment-specific edge cases

Specifying an environment causes the AI to generate cases for rendering differences, input behaviour variations, and platform-specific constraints. Without it, the AI generates generic cases that assume a consistent environment — missing the real-world divergences that cause production defects.

Weak: (not specified)
Strong: Safari on iOS 17 / mobile viewport
Component
Targets a specific UI element or API layer

The Component field pins the AI to a specific implementation layer. Use it to distinguish between the UI surface ("Checkout form — payment step"), the API endpoint ("POST /orders"), or a backend process ("Order confirmation email trigger"). This prevents the AI from generating cases that belong to adjacent components.

Weak: Checkout
Strong: Checkout > Payment step (logged-in user, non-empty cart)
Requirement
Attaches user stories or acceptance criteria for richer output

The Requirement field accepts user story text, story IDs, or raw acceptance criteria. Linking a user story gives the AI the business context behind the feature — not just what it does, but why, and who for. This context is particularly valuable for generating UAT-style cases and business-logic validations.

Weak: (not specified)
Strong: JIRA-1042 — As a returning customer, I want to reuse my saved card so that I can complete checkout faster
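Taken together, the four custom fields form one structured scenario. The sketch below is a plain-Python snapshot of what strong values look like side by side — the dictionary keys mirror the guide's field names for illustration only, not any real Evaficy API, and the feature name is hypothetical:

```python
# Illustrative only: a plain-Python snapshot of one scenario's custom fields.
# Keys and the "feature" value are hypothetical, not Evaficy's data model.
scenario = {
    "test_type": "Functional",
    "feature": "Checkout > Saved card reuse",
    "browser_environment": "Safari on iOS 17 / mobile viewport",
    "component": "Checkout > Payment step (logged-in user, non-empty cart)",
    "requirement": "JIRA-1042 — As a returning customer, I want to reuse "
                   "my saved card so that I can complete checkout faster",
}

# A rough self-check before generating: the context fields should be
# multi-word and scoped, never a bare one-word value like "Checkout".
for name in ("feature", "component", "requirement"):
    assert len(scenario[name].split()) > 1, f"{name} looks too vague"
```

A quick lint like this catches the most common weak-input mistake — a one-word component or an empty requirement — before generation, not after.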

Writing Acceptance Criteria That the AI Can Use

Acceptance criteria is the most important input. It is the specification the AI generates from — the source of truth for what the feature must do, what it must not do, and what the boundaries of correct behavior are. Three principles separate AC that produces thorough test coverage from AC that produces surface-level cases.

01

Structured over free-text

Given/When/Then (Gherkin) criteria or bulleted one-line criteria significantly outperform paragraphs. Free-text AC forces the AI to infer intent from prose — it will produce cases, but they will be broader and less precise. Structured AC gives the AI discrete, testable statements that map directly to test steps and expected results.

# Paragraph (AI has to interpret)
"When the user submits the login form with valid credentials,
they should be authenticated and taken to their dashboard,
unless the account is locked or inactive."
# Structured (AI maps directly to test cases)
✓ Must log in with valid email + correct password
✓ Must redirect to /dashboard after successful login
✗ Must not log in with correct email + wrong password
✗ Must not log in if account status is "suspended"
✗ Must not log in if account status is "pending verification"
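To see why the structured form maps so cleanly, here is a sketch in which each criterion above becomes exactly one assertion. The `login` function and the account data are hypothetical stubs standing in for the real system under test:

```python
# Hypothetical stub for the system under test; accounts mirror the AC above.
ACCOUNTS = {
    "alice@example.com": {"password": "s3cret", "status": "active"},
    "bob@example.com": {"password": "hunter2", "status": "suspended"},
    "carol@example.com": {"password": "pass123", "status": "pending verification"},
}

def login(email, password):
    """Return the redirect path on success, None on failure."""
    account = ACCOUNTS.get(email)
    if account is None or account["password"] != password:
        return None
    if account["status"] != "active":
        return None
    return "/dashboard"

# One structured criterion -> one assertion, each able to fail independently.
assert login("alice@example.com", "s3cret") == "/dashboard"   # valid login + redirect
assert login("alice@example.com", "wrong") is None            # wrong password rejected
assert login("bob@example.com", "hunter2") is None            # suspended account rejected
assert login("carol@example.com", "pass123") is None          # pending verification rejected
```

The paragraph version of the same requirement leaves the suspended and pending-verification cases to the model's interpretation; the structured version makes them unavoidable.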

02

One criterion per line

Each line of acceptance criteria maps to one or more test cases. When you group multiple behaviors into a single line, the AI either generates one combined case (which is hard to execute and harder to fail independently) or misses some behaviors entirely. One behavior per line is the single most reliable way to increase both the count and precision of generated cases.

# Grouped (two behaviors, generates one case)
"User can upload profile photo and it must be under 5 MB"
# Separated (one behavior per line, generates complete coverage)
✓ Must accept JPEG, PNG, and WebP image formats
✓ Must accept files up to 5 MB in size
✗ Must reject files larger than 5 MB with an error message
✗ Must reject files in unsupported formats (PDF, GIF, SVG)
✗ Must not silently ignore an upload failure
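The same one-criterion-per-line discipline carries through to execution: each line can pass or fail on its own. A minimal sketch, assuming a hypothetical `validate_upload` function with the 5 MB limit and formats from the criteria above:

```python
# Hypothetical upload validator; limits mirror the criteria above.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_FORMATS = {"jpeg", "png", "webp"}

def validate_upload(filename, size_bytes):
    """Return (ok, error_message); never fail silently."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED_FORMATS:
        return False, f"unsupported format: {ext}"
    if size_bytes > MAX_BYTES:
        return False, "file exceeds 5 MB limit"
    return True, ""

# One assertion per criterion, so each can fail independently.
assert validate_upload("avatar.png", 1024) == (True, "")
assert validate_upload("avatar.webp", MAX_BYTES) == (True, "")  # boundary: exactly 5 MB accepted
ok, msg = validate_upload("avatar.jpeg", MAX_BYTES + 1)
assert not ok and msg                                           # oversized file rejected, with a message
ok, msg = validate_upload("avatar.pdf", 1024)
assert not ok and "unsupported" in msg                          # wrong format rejected
```

Note how the boundary check (exactly 5 MB) only exists because the size limit was written as its own line — the grouped version would never have generated it.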

03

Include "must not" constraints

Positive criteria ("must do X") generate happy path and positive validation cases. Negative criteria ("must not allow Y") generate negative paths, invalid input handling, and security boundary cases. Most AC is written positively, and AI generation reflects that bias: the cases most likely to catch real bugs (wrong password accepted, expired session reused, oversized file silently accepted) are generated only when the constraint is written explicitly.

# Positive-only AC (misses most failure modes)
✓ Session must expire after 30 minutes of inactivity
# With negatives (generates the cases that matter)
✓ Session must expire after 30 minutes of inactivity
✗ Must not allow access to authenticated routes after session expiry
✗ Must not extend session on page refresh without user interaction
✗ Must not store session token after explicit logout
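The session-expiry criteria above can be exercised the same way. A sketch assuming a hypothetical `Session` model with the 30-minute timeout; the real implementation will differ:

```python
# Hypothetical session model; 30-minute timeout as in the criteria above.
TIMEOUT_SECONDS = 30 * 60

class Session:
    def __init__(self, now):
        self.last_activity = now
        self.token = "tok-123"

    def is_valid(self, now):
        # Invalid once the token is gone or the inactivity window has elapsed.
        return self.token is not None and now - self.last_activity < TIMEOUT_SECONDS

    def logout(self):
        self.token = None   # must not retain the token after explicit logout

t0 = 1_000_000.0
s = Session(now=t0)
assert s.is_valid(now=t0 + 60)                       # active within the window
assert not s.is_valid(now=t0 + TIMEOUT_SECONDS + 1)  # must not allow access after expiry

s2 = Session(now=t0)
s2.logout()
assert not s2.is_valid(now=t0 + 1)                   # must not keep the session after logout
```

The positive-only AC would have produced only the first assertion; the two "must not" lines are what generate the expiry and logout checks.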

Before/After Examples: Weak Inputs vs. Strong Inputs

The following examples show the same four feature scenarios written with weak inputs and with strong inputs. The transformation is specific, repeatable, and directly reflects the three AC principles above.

Weak inputs
  Test type: Functional
  Component: Login
  Acceptance criteria: User should be able to log in with their credentials.

Strong inputs
  Test type: Functional
  Component: Login page (returning user, active account)
  Acceptance criteria:
  ✓ Must accept valid email address + correct password
  ✓ Must redirect to /dashboard after successful login
  ✓ Must show a persistent session (remain logged in on browser refresh)
  ✗ Must not accept correct email + wrong password
  ✗ Must not accept a deactivated account regardless of password
  ✗ Must not accept an unverified email account
  ✗ Must display a specific error for wrong password (not a generic failure)
  ✗ Must lock the account after 5 consecutive failed attempts

Why this matters

The weak version generates one or two generic cases. The strong version generates 8+ cases covering the full authentication surface — including the security edge cases most likely to cause production incidents.

Weak inputs
  Test type: Functional
  Component: Checkout
  Acceptance criteria: Payment should work correctly.

Strong inputs
  Test type: Functional
  Component: Checkout > Payment step (logged-in user, cart value > £0)
  Acceptance criteria:
  ✓ Must accept a valid Visa card with correct CVV and future expiry
  ✓ Must display an order confirmation page after successful payment
  ✓ Must send a confirmation email within 2 minutes of successful payment
  ✗ Must not accept an expired card
  ✗ Must not accept an incorrect CVV
  ✗ Must not proceed if the card is declined by the payment gateway
  ✗ Must not charge the card twice if the user refreshes the confirmation page
  ✗ Must display a clear, actionable error message for each failure type

Why this matters

"Payment should work correctly" is not a testable criterion — it tells the AI almost nothing. The strong version locks the scope to a specific step, user state, and cart state, then defines both success conditions and every meaningful failure mode the payment flow can encounter.

Weak inputs
  Test type: Functional
  Component: Signup form
  Acceptance criteria: Form validation should work. Required fields must be filled in. Email should be valid.

Strong inputs
  Test type: Functional
  Component: Signup form (unauthenticated user)
  Acceptance criteria:
  ✓ Must accept a valid email address format (user@domain.tld)
  ✓ Must accept a password with 8+ characters including at least one number
  ✓ Must show inline validation errors on blur (not only on submit)
  ✗ Must not accept an email address without an @ symbol
  ✗ Must not accept an email address with spaces
  ✗ Must not accept a password shorter than 8 characters
  ✗ Must not accept a password with no numeric characters
  ✗ Must not submit the form if any required field is empty
  ✗ Must not allow submission of an email address already registered

Why this matters

Grouping "form validation should work" into one criterion generates one or two surface-level cases. Separated criteria — each with a specific format rule — generate individual test cases for every validation path, including the boundary conditions that catch real user-facing bugs.

Weak inputs
  Test type: Functional
  Component: Notifications
  Acceptance criteria: Users should receive email notifications when relevant events happen.

Strong inputs
  Test type: Functional
  Component: Email notification — task assigned trigger
  Acceptance criteria:
  ✓ Must send an email to the assignee within 60 seconds of a task being assigned
  ✓ Email must include: task title, assigning user's name, and a direct link to the task
  ✓ Must send even if the assignee is currently logged in
  ✗ Must not send if the assignee has disabled task assignment notifications in settings
  ✗ Must not send a duplicate notification if the task is reassigned to the same user
  ✗ Must not send if the task is assigned to the currently authenticated user by themselves

Why this matters

The vague version generates a single case that checks "email received" and nothing else. The strong version generates 6+ cases covering trigger timing, content requirements, suppression rules, and duplicate prevention — the scenarios that actually fail in production.


Iterating: When to Regenerate vs. When to Edit

AI generation is rarely perfect on the first pass — and it should not need to be. The right approach is to treat generation as a first draft and decide quickly whether you need to improve the inputs and regenerate, or whether the draft is close enough to refine by editing. The two are not equivalent: editing a fundamentally misaligned generation wastes time; regenerating when the draft is 90% correct is unnecessary.

When to regenerate
  • The test type was wrong — you asked for Smoke but needed Functional (the scope is completely different)
  • Acceptance criteria were too vague to generate useful cases — the AI produced placeholder-level steps
  • The component scope was too broad — you got cases for the whole module instead of the specific page
  • Your requirements changed significantly since generation — the existing cases describe the old behavior
  • Generated cases test a path that does not match the actual UI — the AC was misaligned with the implementation
When to edit
  • AI output is 80–90% correct — a few step descriptions need adjusting to match the actual UI wording
  • One or two specific edge cases are missing that you know from domain knowledge
  • Step wording is accurate but could be clearer for the team executing the run
  • A generated case combines two assertions that should be separate test cases for cleaner failure isolation
  • You want to add manual exploratory cases alongside the AI-generated structural ones
Improve your inputs before regenerating — not after

The most efficient workflow is: write strong inputs first, generate once, then edit the 10–20% that needs adjustment. Teams that generate with weak inputs and then regenerate multiple times are doing rework that better inputs would have avoided entirely.


Related guides
AI Test Case Generation — How It Works
What the AI analyzes before it generates, the types of cases it produces, and how generation fits into the QA workflow.
Acceptance Criteria — The Foundation of Effective QA
How to write acceptance criteria in formats that drive test coverage — with before/after examples across common feature types.
How to Write Test Cases That Actually Catch Bugs
Anatomy of a good test case, common mistakes, and when to supplement AI output with manually written cases.
Try it yourself — create a scenario

Write your acceptance criteria, choose a test type, and let the AI generate your first complete test suite. Edit, extend, and execute — all in one platform.
