The QA Testing Glossary: 20 Terms Explained the Way They Actually Work
Every QA team has its own version of the same problem: half the team uses “test plan” and “test suite” interchangeably, someone calls every unplanned session “exploratory testing,” and nobody can agree on whether what they ran before the release was a smoke test or a sanity check. It doesn't matter until it does — and then it really does.
These aren't textbook definitions. They're working definitions: what these terms mean on an actual team, why the distinctions matter, and where the common confusions live.
The core building blocks
Test case
A test case is a documented procedure that describes what to test, how to test it, and what the expected result is. It has a title, preconditions, numbered steps, and an expected outcome. That's the whole thing.
A bad test case is either too vague to reproduce (“check that login works”) or so granular it becomes maintenance debt (“click the username field, verify the cursor appears, type the letter A”). The goal is a level of detail where someone unfamiliar with the system can run it and get a consistent result.
The fastest way to know if a test case is well-written: hand it to a new team member and see if they run it the same way you would. If they don't, the case is the problem.
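When a test case graduates to automation, the same anatomy carries over: precondition, steps, one expected outcome. A minimal sketch, where `login` is a hypothetical stand-in for the system under test:

```python
# A well-scoped test case as an automated test: precondition, steps,
# and a single expected outcome. The login() helper is a stand-in for
# the real system, not any actual API.

def login(username, password):
    # Stand-in for the system under test: accepts one known account.
    return username == "qa@example.com" and password == "correct-horse"

def test_login_with_valid_credentials():
    # Precondition: a registered account exists (illustrative data).
    username, password = "qa@example.com", "correct-horse"
    # Step: submit the credentials.
    result = login(username, password)
    # Expected outcome: login succeeds.
    assert result is True
```

Notice it tests one behavior at one level of detail: not "check that login works," and not a keystroke-by-keystroke script.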
Test suite
A test suite is a collection of test cases grouped by a shared purpose. The grouping can be by feature (checkout suite), by type (regression suite), by priority (smoke suite), or by whatever logical boundary makes sense for your product.
One test case can live in multiple suites. A test for user login belongs in the smoke suite, the regression suite, and the authentication feature suite simultaneously. That's not duplication — that's flexibility. Tools like SmartRuns handle this with tagging and filtering. In a spreadsheet, you copy-paste and immediately lose track of which version is correct.
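The tagging idea is simple enough to sketch without any particular tool. This toy registry (all names are illustrative) shows one case registered in three suites at once, which is the same mechanism tag-based tools use under the hood:

```python
# A minimal sketch of one test case living in several suites via tags.
# SUITES and the tag() decorator are illustrative, not a real tool's API.
SUITES = {}

def tag(*suite_names):
    """Register a test case in every suite it is tagged with."""
    def wrap(fn):
        for name in suite_names:
            SUITES.setdefault(name, []).append(fn.__name__)
        return fn
    return wrap

@tag("smoke", "regression", "auth")
def test_user_can_log_in():
    ...

# The same case now appears in all three suites: no copies, no drift.
```

One definition, three memberships. Delete the copy-paste version in a spreadsheet and this is the problem it was papering over.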
Test plan
A test plan is a document describing the scope, approach, and schedule for a testing effort. It answers: what are we testing, who is testing it, when, and how do we know when we're done?
Most teams produce something between a formal test plan and nothing at all. That's fine. What matters is that the intent is somewhere explicit: what features are in scope for this release, which risk areas are getting extra attention, and what the exit criteria are. A test plan doesn't have to be a 30-page document. It has to answer those questions.
Test run
A test run is a specific execution of a test suite at a point in time. It's the event — not the collection of cases. Sprint 22, regression run, Tuesday afternoon before the release. The run records which cases passed, which failed, which were blocked, and who ran them.
This is where test management tools earn their keep. A test run in SmartRuns or Zephyr gives you a timestamped record of execution results tied to a specific build. A test run in a spreadsheet is a column you added called “Sprint 22 Status” that you'll overwrite next sprint.
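What a run actually stores is small and concrete. A sketch of the record, with field names invented for illustration rather than taken from any tool's schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# A minimal sketch of a test run record: which suite, against which
# build, started when, with a per-case result. Field names are
# illustrative, not any specific tool's schema.
@dataclass
class TestRun:
    suite: str
    build: str
    started_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    results: dict = field(default_factory=dict)

    def record(self, case, status, run_by):
        # status is one of "pass", "fail", "blocked"
        self.results[case] = {"status": status, "run_by": run_by}

run = TestRun(suite="regression", build="v2.14.0-rc3")
run.record("test_user_can_log_in", "pass", "dana")
run.record("test_checkout_total", "fail", "dana")
```

The point is the binding: results tied to a build and a timestamp, not a status column that gets overwritten next sprint.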
Test case → Test suite → Test plan → Test run
Cases are the atoms. Suites group them. Plans scope the effort. Runs execute it and record what happened. Confuse any of these and your team is talking about different things.
Testing approaches
Regression testing
Regression testing is verifying that existing functionality still works after changes. Every time you ship something new, you run regression tests to confirm you didn't break something old. This is the thing you skip when you're in a hurry and the thing you regret when production breaks on a flow that hasn't changed in six months.
Full regression is expensive. Most teams run a subset — the highest-risk flows for the changes made. Knowing which subset to run is one of the skills that genuinely take years to develop.
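The subset-selection judgment can be partially encoded. A toy sketch of risk-based selection, where the case-to-area mapping is invented for illustration:

```python
# A toy sketch of risk-based regression selection: map each case to
# the feature areas it exercises, then run only cases that touch what
# changed. CASE_AREAS is illustrative data, not a real mapping.
CASE_AREAS = {
    "test_login": {"auth"},
    "test_checkout_total": {"checkout", "pricing"},
    "test_profile_update": {"account"},
}

def select_regression(changed_areas):
    # Pick every case whose areas intersect the changed areas.
    return sorted(
        case for case, areas in CASE_AREAS.items()
        if areas & set(changed_areas)
    )

select_regression(["pricing"])  # -> ["test_checkout_total"]
```

The hard part in real life is maintaining the mapping honestly; the selection itself is trivial once the mapping exists.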
Smoke test vs. sanity test
These are not the same thing. Most teams use them as synonyms. That's wrong, and the distinction is useful.
A smoke test is a broad, shallow test of the entire system. Does the application start? Can you log in? Can you complete a core flow? It's named after hardware testing: you turn it on and check if anything starts smoking. A smoke test runs after a build to confirm it's worth testing further. It doesn't go deep — it goes wide.
A sanity test is narrow and deep. After a specific bug fix, you run a sanity check on that fix: does the thing that was broken now work correctly? You're not testing the whole system. You're verifying a specific change makes sense.
Smoke = broad. Sanity = specific. Run both. Name them correctly on your team.
Exploratory testing
Exploratory testing is unscripted investigation of a feature with the goal of finding issues that scripted tests miss. It's not random clicking. It's purposeful, experienced, hypothesis-driven testing without a predetermined checklist.
The confusion is calling everything that isn't scripted “exploratory.” Real exploratory testing requires a charter (what area, what risk, what time), note-taking, and a skilled tester who knows what “wrong” looks like. Junior testers clicking around an app is not exploratory testing. It's guessing.
AI test generation doesn't replace exploratory testing. It replaces the mechanical work of scripting obvious paths — which frees up time for the exploratory work that actually requires a human brain.
Shift-left testing
Shift-left means involving QA earlier in the development cycle — before code is written, not after. The “left” is a reference to a timeline where development is on the left and release is on the right.
In practice, shift-left means QA reviews acceptance criteria before a ticket enters development, raises edge cases during planning, and writes test cases from specs rather than finished features. Bugs caught in planning cost a fraction of bugs caught in production. Everyone knows this. Fewer teams actually do it.
Shift-left isn't a methodology. It's a habit. The QA engineer who reads every ticket before it enters the sprint, asks one awkward question, and saves a day of re-work — that's shift-left in practice.
The numbers game
Test coverage
Test coverage measures how much of the application is exercised by your test suite. It can be measured by feature area, by user flow, by code lines executed (code coverage), or by requirements covered.
100% coverage is not the goal. It's not achievable in any meaningful sense, and chasing it produces bloated test suites full of low-value cases. The goal is meaningful coverage of the areas that matter most — your critical user flows, your highest-risk features, your money paths. Coverage is a tool for prioritization, not a number to maximize.
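Measured by feature area rather than code lines, coverage reduces to a simple ratio. A sketch, with the flow lists invented for illustration:

```python
# Coverage by critical flow (not code-line coverage): the share of
# must-work flows that have at least one passing test. Both sets are
# illustrative data.
CRITICAL_FLOWS = {"login", "checkout", "refund", "search"}
COVERED = {"login", "checkout", "search"}

coverage = len(CRITICAL_FLOWS & COVERED) / len(CRITICAL_FLOWS)  # 0.75
missing = CRITICAL_FLOWS - COVERED  # {"refund"} — the actionable output
```

The useful number here isn't 0.75 — it's `missing`. Coverage as prioritization means the gap list drives the next test cases you write.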
Flaky test
A flaky test is a test that passes and fails inconsistently without any change to the code under test. It's one of the most corrosive problems in a test suite because it erodes trust in your results.
When a test flakes, the instinct is to re-run it. Then it passes. Then you ship. Then production breaks. Flaky tests are not a minor inconvenience — they're a reliability signal you've started ignoring. The correct response is to fix or delete the flaky test immediately, not to “keep an eye on it.”
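One of the most common flakiness sources is a hidden dependency on real wall-clock time, and the fix is usually to make the dependency explicit. A sketch (the function and its 9-to-5 rule are invented for illustration):

```python
import datetime

# A classic flakiness source: logic that reads the real clock. The
# is_business_hours() function and its rule are illustrative.
def is_business_hours(now=None):
    now = now or datetime.datetime.now()
    return 9 <= now.hour < 17

# Flaky version: passes or fails depending on when CI happens to run.
# def test_business_hours():
#     assert is_business_hours()

# Fixed version: inject the time, so the test is deterministic.
def test_business_hours_deterministic():
    assert is_business_hours(datetime.datetime(2024, 1, 8, 10, 0))
    assert not is_business_hours(datetime.datetime(2024, 1, 8, 22, 0))
```

The same pattern applies to random seeds, network calls, and shared state: find the nondeterministic input, then control it.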
Ignored flaky test: ~0% chance your team acts on its failures.
Fixed flaky test: 100% chance it means something when it fails.
Test management
Test management is the software category that covers organizing, executing, and reporting on test cases and test runs. Tools in this category include SmartRuns, TestRail, and Zephyr (Zephyr Scale and Zephyr Squad, both Jira-native). Jira itself is not a test management tool — it's an issue tracker. You can use it to track bugs, but tracking which test cases ran and what they produced requires dedicated tooling.
Terms that live in the overlap
Bug vs. defect
Technically, a defect is the correct term: a deviation from a requirement or expected behavior. A bug is slang — useful slang that everyone uses, including everyone at every QA tool company, but slang nonetheless.
The more useful distinction is between a defect and an enhancement. A defect is something that doesn't work as specified. An enhancement is something that works as specified but could work better. The spec is the arbiter. If the spec is wrong, that's a different conversation — usually a longer one.
Acceptance criteria
Acceptance criteria are the conditions a feature must meet to be accepted as complete. They live on the ticket, they're written before development starts (ideally), and they define what “done” means.
The relationship between acceptance criteria and test cases is direct: each acceptance criterion should map to at least one test case. If you have acceptance criteria with no test cases, you've defined “done” but you have no way to confirm you got there. If you have test cases with no acceptance criteria, ask what you're actually verifying.
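That mapping is checkable. A minimal traceability sketch, with ticket data invented for illustration:

```python
# A minimal traceability check: every acceptance criterion should map
# to at least one test case. The criteria and case names are
# illustrative data, not a real ticket.
CRITERIA = {
    "AC-1: user can reset password via email": ["test_reset_password_email"],
    "AC-2: reset link expires after 24h": ["test_reset_link_expiry"],
    "AC-3: old password stops working": [],  # gap: "done" with no proof
}

uncovered = [ac for ac, cases in CRITERIA.items() if not cases]
# -> ["AC-3: old password stops working"]
```

Some teams run a check like this as a review gate: a ticket can't close while `uncovered` is non-empty.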
Edge case
An edge case is a scenario that occurs at the boundary of normal operating parameters. The user who enters a 200-character name. The checkout flow at exactly midnight on a DST changeover. The API response that takes 29.9 seconds.
Edge cases are where software breaks in production and nobody saw it coming because nobody tested it. They're also the first thing cut when you're behind schedule. That tension never goes away. What you can do is document which edge cases you're explicitly skipping, so you know what you're betting on.
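Boundary testing has a standard shape: at the limit, one past it, and the degenerate case. A sketch using the 200-character-name example from above (the limit and validator are assumptions for illustration):

```python
# Boundary tests for the 200-character-name example. The limit and
# the is_valid_name() validator are illustrative assumptions.
MAX_NAME_LEN = 200

def is_valid_name(name):
    return 0 < len(name) <= MAX_NAME_LEN

def test_name_length_boundaries():
    assert is_valid_name("a" * MAX_NAME_LEN)            # exactly at the limit
    assert not is_valid_name("a" * (MAX_NAME_LEN + 1))  # one past it
    assert not is_valid_name("")                        # degenerate case
```

Three assertions, three boundaries. Off-by-one bugs live precisely at the first two.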
Happy path
The happy path is the ideal user flow through a feature, with no errors, edge cases, or unexpected inputs. User opens the app, logs in, completes the core action, succeeds. Everything works as intended.
Testing the happy path is necessary. It's not sufficient. The happy path is where you start, not where you stop. Every feature has multiple unhappy paths — wrong input, network failure, session expiry, concurrent access — and those paths are where real users end up when something goes wrong.
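Happy and unhappy paths for the same flow often share a shape. A sketch, where `checkout` is an invented stand-in for the system under test:

```python
# One happy path and one unhappy path for the same flow. The
# checkout() function is an illustrative stand-in, not a real API.
def checkout(cart, network_ok=True):
    if not network_ok:
        raise ConnectionError("payment gateway unreachable")
    if not cart:
        raise ValueError("empty cart")
    return {"status": "confirmed", "items": len(cart)}

def test_checkout_happy_path():
    assert checkout(["book"])["status"] == "confirmed"

def test_checkout_network_failure():
    try:
        checkout(["book"], network_ok=False)
    except ConnectionError:
        pass  # expected: the failure surfaces instead of silently succeeding
    else:
        raise AssertionError("expected ConnectionError")
```

Note the unhappy path asserts something specific — the right error, surfaced — not merely "it didn't crash."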
AI-generated test cases cover the happy path reliably. The edge cases and error states are where human QA judgment earns its place.
Put these terms into practice with SmartRuns
14-day free trial. 5-minute setup. No credit card required.
★ 4.9 rating · 500+ QA teams