---
name: qa-only
description: "Report-only QA testing. Systematically tests a web application and produces a structured report with health score, screenshots, and repro steps — but never fixes anything. Use when asked to "just re"
---

---

## Test Plan Context

Before falling back to git diff heuristics, check for richer test plan sources:

1. **Project-scoped test plans:** Check `~/.gstack/projects/` for recent `*-test-plan-*.md` files for this repo
   ```bash
   eval "$(${GSTACK_OPENCODE_DIR}/bin/gstack-slug 2>/dev/null)"
   ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
   ```
2. **Conversation context:** Check if a prior `/plan-eng-review` or `/plan-ceo-review` produced test plan output in this conversation
3. **Use whichever source is richer.** Fall back to git diff analysis only if neither is available.
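
The lookup in step 1 can be wrapped so the fallback order is explicit. This is a minimal sketch, not part of gstack: `latest_test_plan` is a hypothetical helper name, and `SLUG` is assumed to have been set by the `gstack-slug` eval shown above.

```bash
# Hypothetical helper: print the newest *-test-plan-*.md in a project directory.
# Prints nothing and returns 1 when no plan exists, so the caller can fall back
# to conversation context or git diff analysis.
latest_test_plan() {
  plan_dir="$1"
  # ls -t sorts newest first; the error from an unmatched glob is discarded
  newest=$(ls -t "$plan_dir"/*-test-plan-*.md 2>/dev/null | head -1)
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

# Usage: SLUG is assumed to come from the gstack-slug eval
if plan=$(latest_test_plan "$HOME/.gstack/projects/${SLUG:-unknown}"); then
  echo "Using test plan: $plan"
else
  echo "No project-scoped test plan found; check conversation context next"
fi
```

Returning a non-zero status instead of an empty string keeps the "neither is available" case easy to branch on.
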

---

## Modes

### Diff-aware (automatic when on a feature branch with no URL)

This is the **primary mode** for developers verifying their work. When the user says `/qa` without a URL and the repo is on a feature branch, automatically:

1. **Analyze the branch diff** to understand what changed:
   ```bash
   git diff main...HEAD --name-only
   git log main..HEAD --oneline
   ```

2. **Identify affected pages/routes** from the changed files:
   - Controller/route files → which URL paths they serve
   - View/template/component files → which pages render them
   - Model/service files → which pages use those models (check controllers that reference them)
   - CSS/style files → which pages include those stylesheets
   - API endpoints → test them directly with `${GSTACK_BROWSE} js "await fetch('/api/...')"`
   - Static pages (markdown, HTML) → navigate to them directly

   **If no obvious pages/routes are identified from the diff:** Do not skip browser testing. The user invoked /qa because they want browser-based verification. Fall back to Quick mode — navigate to the homepage, follow the top 5 navigation targets, check console for errors, and test any interactive elements found. Backend, config, and infrastructure changes affect app behavior — always verify the app still works.

3. **Detect the running app** — check common local dev ports:
   ```bash
   # Try each common dev port and stop at the first one that responds
   for port in 3000 4000 8080; do
     ${GSTACK_BROWSE} goto "http://localhost:$port" 2>/dev/null \
       && { echo "Found app on :$port"; break; }
   done
   ```
   If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.

4. **Test each affected page/route:**
   - Navigate to the page
   - Take a screenshot
   - Check console for errors
   - If the change was interactive (forms, buttons, flows), test the interaction end-to-end
   - Use `snapshot -D` before and after actions to verify the change had the expected effect

5. **Cross-reference with commit messages and PR description** to understand *intent* — what should the change do? Verify it actually does that.

6. **Check TODOS.md** (if it exists) for known bugs or issues related to the changed files. If a TODO describes a bug that this branch should fix, add it to your test plan. If you find a new bug during QA that isn't in TODOS.md, note it in the report.

7. **Report findings** scoped to the branch changes:
   - "Changes tested: N pages/routes affected by this branch"
   - For each: does it work? Screenshot evidence.
   - Any regressions on adjacent pages?

**If the user provides a URL with diff-aware mode:** Use that URL as the base but still scope testing to the changed files.
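
The file-to-page mapping in step 2 can be roughed out as a path classifier. This is a sketch under assumptions: `classify_change` and `classify_paths` are hypothetical names, and the glob patterns are illustrative guesses at common project layouts, so adjust them per repo.

```bash
# Hypothetical classifier: map a changed file path to the kind of check it suggests.
# Patterns are illustrative only; the first matching pattern wins.
classify_change() {
  case "$1" in
    *controller*|*/routes/*|*routes.*) echo "route" ;;
    *.css|*.scss|*.sass|*.less)        echo "style" ;;
    */views/*|*/templates/*|*/components/*|*.erb|*.tsx|*.jsx|*.vue) echo "view" ;;
    */models/*|*/services/*)           echo "model" ;;
    */api/*)                           echo "api" ;;
    *.md|*.html)                       echo "static" ;;
    *)                                 echo "other" ;;
  esac
}

# Reads file paths on stdin, prints "<category><TAB><path>" per line
classify_paths() {
  while IFS= read -r file; do
    printf '%s\t%s\n' "$(classify_change "$file")" "$file"
  done
}

# Usage inside a repo: git diff main...HEAD --name-only | classify_paths
printf '%s\n' app/controllers/users_controller.rb app/assets/main.css | classify_paths
```

Anything that lands in `other` is a candidate for the Quick-mode fallback described in step 2.
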

### Full (default when URL is provided)
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.

### Quick (`--quick`)
30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation.

### Regression (`--regression <baseline>`)
Run full mode, then load `baseline.json` from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report.

---

## Workflow

### Phase 1: Initialize

1. Find browse binary (see Setup above)
2. Create output directories
3. Copy report template from `qa/templates/qa-report-template.md` to output dir
4. Start timer for duration tracking
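
Steps 2 through 4 can be sketched as one initializer. The names `init_qa_run`, `qa_elapsed_seconds`, and `START_TIME` are assumptions made for illustration; only `REPORT_DIR` matches the variable used in later phases. Finding the browse binary (step 1) is not covered here.

```bash
# Hypothetical initializer: create output dirs, seed the report, start the timer.
# $1 = report directory, $2 = template path, $3 = report filename
init_qa_run() {
  REPORT_DIR="$1"
  template="$2"
  report_file="$3"
  mkdir -p "$REPORT_DIR/screenshots" || return 1
  # Seed the report only when a template exists, and never overwrite an existing report
  if [ -f "$template" ] && [ ! -e "$REPORT_DIR/$report_file" ]; then
    cp "$template" "$REPORT_DIR/$report_file"
  fi
  START_TIME=$(date +%s)
}

# Whole seconds elapsed since init_qa_run was called
qa_elapsed_seconds() {
  echo $(( $(date +%s) - START_TIME ))
}

# Usage:
#   init_qa_run .gstack/qa-reports qa/templates/qa-report-template.md qa-report-myapp-com-2026-03-12.md
#   ... run the QA phases ...
#   echo "Duration: $(qa_elapsed_seconds)s"
```
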

### Phase 2: Authenticate (if needed)

**If the user specified auth credentials:**

```bash
${GSTACK_BROWSE} goto <login-url>
${GSTACK_BROWSE} snapshot -i  # find the login form
${GSTACK_BROWSE} fill @e3 "user@example.com"
${GSTACK_BROWSE} fill @e4 "[REDACTED]"  # NEVER include real passwords in report
${GSTACK_BROWSE} click @e5  # submit
${GSTACK_BROWSE} snapshot -D  # verify login succeeded
```

**If the user provided a cookie file:**

```bash
${GSTACK_BROWSE} cookie-import cookies.json
${GSTACK_BROWSE} goto <target-url>
```

**If 2FA/OTP is required:** Ask the user for the code and wait.

**If CAPTCHA blocks you:** Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."
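
To keep real passwords out of repro steps, text can be passed through a literal-string filter before it reaches the report. This is a sketch, not a gstack feature: `redact_secret` is a hypothetical helper, and it only catches exact occurrences of the secret it is given.

```bash
# Hypothetical filter: replace every literal occurrence of a secret with [REDACTED].
# $1 = secret; text is read from stdin. The secret is passed through the
# environment so awk treats it as a plain string, not a regular expression.
redact_secret() {
  SECRET="$1" awk '
    BEGIN { s = ENVIRON["SECRET"]; n = length(s) }
    {
      line = $0; out = ""
      while (n > 0 && (i = index(line, s)) > 0) {
        out = out substr(line, 1, i - 1) "[REDACTED]"
        line = substr(line, i + n)
      }
      print out line
    }'
}

# Usage: scrub a recorded step before appending it to the report
printf '%s\n' 'fill @e4 "hunter2"' | redact_secret 'hunter2'
```
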

### Phase 3: Orient

Get a map of the application:

```bash
${GSTACK_BROWSE} goto <target-url>
${GSTACK_BROWSE} snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
${GSTACK_BROWSE} links  # map navigation structure
${GSTACK_BROWSE} console --errors  # any errors on landing?
```

**Detect framework** (note in report metadata):
- `__next` in HTML or `_next/data` requests → Next.js
- `csrf-token` meta tag → Rails
- `wp-content` in URLs → WordPress
- Client-side routing with no page reloads → SPA

**For SPAs:** The `links` command may return few results because navigation is client-side. Use `snapshot -i` to find nav elements (buttons, menu items) instead.
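
The HTML markers listed above can be checked with a small filter. This is a heuristic sketch: `detect_framework` is a hypothetical name, a `csrf-token` meta tag is not unique to Rails, and the SPA case cannot be decided from static HTML, so it falls through to `unknown`.

```bash
# Hypothetical detector: reads page HTML on stdin and prints a best-guess framework.
# Only the static markers are checked; client-side routing needs browser observation.
detect_framework() {
  html=$(cat)
  case "$html" in
    *__next*|*_next/data*) echo "Next.js" ;;
    *csrf-token*)          echo "Rails" ;;
    *wp-content*)          echo "WordPress" ;;
    *)                     echo "unknown" ;;
  esac
}

# Usage with curl (an assumption; any source of page HTML works):
#   curl -s <target-url> | detect_framework
printf '%s\n' '<div id="__next"></div>' | detect_framework
```
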

### Phase 4: Explore

Visit pages systematically. At each page:

```bash
${GSTACK_BROWSE} goto <page-url>
${GSTACK_BROWSE} snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
${GSTACK_BROWSE} console --errors
```

Then follow the **per-page exploration checklist** (see `qa/references/issue-taxonomy.md`):

1. **Visual scan** — Look at the annotated screenshot for layout issues
2. **Interactive elements** — Click buttons, links, controls. Do they work?
3. **Forms** — Fill and submit. Test empty, invalid, edge cases
4. **Navigation** — Check all paths in and out
5. **States** — Empty state, loading, error, overflow
6. **Console** — Any new JS errors after interactions?
7. **Responsiveness** — Check mobile viewport if relevant:
   ```bash
   ${GSTACK_BROWSE} viewport 375x812
   ${GSTACK_BROWSE} screenshot "$REPORT_DIR/screenshots/page-mobile.png"
   ${GSTACK_BROWSE} viewport 1280x720
   ```

**Depth judgment:** Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).

**Quick mode:** Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible?

### Phase 5: Document

Document each issue **immediately when found** — don't batch them.

**Two evidence tiers:**

**Interactive bugs** (broken flows, dead buttons, form failures):
1. Take a screenshot before the action
2. Perform the action
3. Take a screenshot showing the result
4. Use `snapshot -D` to show what changed
5. Write repro steps referencing screenshots

```bash
${GSTACK_BROWSE} screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
${GSTACK_BROWSE} click @e5
${GSTACK_BROWSE} screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
${GSTACK_BROWSE} snapshot -D
```

**Static bugs** (typos, layout issues, missing images):
1. Take a single annotated screenshot showing the problem
2. Describe what's wrong

```bash
${GSTACK_BROWSE} snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
```

**Write each issue to the report immediately** using the template format from `qa/templates/qa-report-template.md`.
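
The screenshot names used above follow a zero-padded `issue-NNN` pattern. A small helper keeps them consistent across issues. This is a sketch: `issue_shot` is a hypothetical name, and the default for `REPORT_DIR` is an assumption taken from the Output section.

```bash
# Default location is an assumption; override REPORT_DIR before calling if needed
: "${REPORT_DIR:=.gstack/qa-reports}"

# Hypothetical helper: build a screenshot path for an issue.
# $1 = issue number (plain integer), $2 = optional suffix such as "step-1" or "result"
issue_shot() {
  if [ -n "${2:-}" ]; then
    printf '%s/screenshots/issue-%03d-%s.png\n' "$REPORT_DIR" "$1" "$2"
  else
    printf '%s/screenshots/issue-%03d.png\n' "$REPORT_DIR" "$1"
  fi
}

# Usage:
#   ${GSTACK_BROWSE} screenshot "$(issue_shot 1 step-1)"
#   ${GSTACK_BROWSE} click @e5
#   ${GSTACK_BROWSE} screenshot "$(issue_shot 1 result)"
```
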

### Phase 6: Wrap Up

1. **Compute health score** using the rubric below
2. **Write "Top 3 Things to Fix"** — the 3 highest-severity issues
3. **Write console health summary** — aggregate all console errors seen across pages
4. **Update severity counts** in the summary table
5. **Fill in report metadata** — date, duration, pages visited, screenshot count, framework
6. **Save baseline** — write `baseline.json` with:
   ```json
   {
     "date": "YYYY-MM-DD",
     "url": "<target>",
     "healthScore": N,
     "issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
     "categoryScores": { "console": N, "links": N, ... }
   }
   ```

**Regression mode:** After writing the report, load the baseline file. Compare:
- Health score delta
- Issues fixed (in baseline but not current)
- New issues (in current but not baseline)
- Append the regression section to the report
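
The comparison above can be sketched with standard text tools. Several assumptions apply: issues are matched by `title` (IDs such as `ISSUE-001` are assumed to be per-run sequence numbers), `healthScore` is an integer, titles contain no escaped quotes, and the function names are hypothetical. For anything beyond a quick check, a real JSON parser such as `jq` is the safer choice.

```bash
# Sorted, de-duplicated issue titles from a baseline file
issue_titles() {
  grep -o '"title": *"[^"]*"' "$1" | sed 's/^"title": *"//; s/"$//' | sort -u
}

# Integer healthScore from a baseline file
baseline_score() {
  grep -o '"healthScore": *[0-9][0-9]*' "$1" | sed 's/.*: *//'
}

# $1 = previous baseline.json, $2 = current baseline.json
regression_summary() {
  old="$1"; new="$2"
  tmp=$(mktemp -d) || return 1
  issue_titles "$old" > "$tmp/old"
  issue_titles "$new" > "$tmp/new"
  echo "Score delta: $(( $(baseline_score "$new") - $(baseline_score "$old") ))"
  echo "Fixed:"
  comm -23 "$tmp/old" "$tmp/new" | sed 's/^/  - /'   # only in the old baseline
  echo "New:"
  comm -13 "$tmp/old" "$tmp/new" | sed 's/^/  - /'   # only in the current run
  rm -rf "$tmp"
}

# Usage: regression_summary previous/baseline.json .gstack/qa-reports/baseline.json
```
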

---

## Health Score Rubric

Compute each category score (0-100), then take the weighted average.

### Console (weight: 15%)
- 0 errors → 100
- 1-3 errors → 70
- 4-10 errors → 40
- 11+ errors → 10

### Links (weight: 10%)
- 0 broken → 100
- Each broken link → -15 (minimum 0)
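
The Console and Links rules reduce to two small functions. This is a sketch with hypothetical function names; it treats exactly 10 console errors as part of the 4-10 band.

```bash
# Console score from the number of console errors seen
console_score() {
  errors="$1"
  if   [ "$errors" -eq 0 ];  then echo 100
  elif [ "$errors" -le 3 ];  then echo 70
  elif [ "$errors" -le 10 ]; then echo 40
  else echo 10
  fi
}

# Links score: start at 100, subtract 15 per broken link, floor at 0
links_score() {
  score=$(( 100 - 15 * $1 ))
  if [ "$score" -lt 0 ]; then score=0; fi
  echo "$score"
}

console_score 2   # prints 70
links_score 2     # prints 70
```
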

### Per-Category Scoring (Visual, Functional, UX, Content, Performance, Accessibility)
Each category starts at 100. Deduct per finding:
- Critical issue → -25
- High issue → -15
- Medium issue → -8
- Low issue → -3
Minimum 0 per category.
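
The deduction rule can be written as a single function of the finding counts. This is a sketch; `category_score` is a hypothetical name and the argument order is an assumption.

```bash
# Category score from finding counts.
# $1 = critical, $2 = high, $3 = medium, $4 = low
category_score() {
  score=$(( 100 - 25 * $1 - 15 * $2 - 8 * $3 - 3 * $4 ))
  if [ "$score" -lt 0 ]; then score=0; fi
  echo "$score"
}

# One critical, two medium, and one low finding: 100 - 25 - 16 - 3
category_score 1 0 2 1   # prints 56
```
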

### Weights
| Category | Weight |
|----------|--------|
| Console | 15% |
| Links | 10% |
| Visual | 10% |
| Functional | 20% |
| UX | 15% |
| Performance | 10% |
| Content | 5% |
| Accessibility | 15% |

### Final Score
`score = Σ (category_score × weight)`
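
As a worked example of the formula, the weighted sum can be computed with `awk`, using the weights from the table as integer percentages. `final_health_score` is a hypothetical name, and the argument order (the table's row order) is an assumption.

```bash
# Weighted health score, printed with one decimal place.
# Args in table order: console links visual functional ux performance content accessibility
final_health_score() {
  awk -v c="$1" -v l="$2" -v v="$3" -v f="$4" -v u="$5" -v p="$6" -v n="$7" -v a="$8" 'BEGIN {
    # Weights are percentages and sum to 100
    total = c*15 + l*10 + v*10 + f*20 + u*15 + p*10 + n*5 + a*15
    printf "%.1f\n", total / 100
  }'
}

# Console 70, Links 85, Functional 75, everything else 100:
# (1050 + 850 + 1000 + 1500 + 1500 + 1000 + 500 + 1500) / 100
final_health_score 70 85 100 75 100 100 100 100   # prints 89.0
```
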

---

## Framework-Specific Guidance

### Next.js
- Check console for hydration errors (`Hydration failed`, `Text content did not match`)
- Monitor `_next/data` requests in network — 404s indicate broken data fetching
- Test client-side navigation (click links, don't just `goto`) — catches routing issues
- Check for CLS (Cumulative Layout Shift) on pages with dynamic content

### Rails
- Check for N+1 query warnings in console (if development mode)
- Verify CSRF token presence in forms
- Test Turbo/Stimulus integration — do page transitions work smoothly?
- Check for flash messages appearing and dismissing correctly

### WordPress
- Check for plugin conflicts (JS errors from different plugins)
- Verify admin bar visibility for logged-in users
- Test REST API endpoints (`/wp-json/`)
- Check for mixed content warnings (common with WP)

### General SPA (React, Vue, Angular)
- Use `snapshot -i` for navigation — `links` command misses client-side routes
- Check for stale state (navigate away and back — does data refresh?)
- Test browser back/forward — does the app handle history correctly?
- Check for memory leaks (monitor console after extended use)

---

## Important Rules

1. **Repro is everything.** Every issue needs at least one screenshot. No exceptions.
2. **Verify before documenting.** Retry the issue once to confirm it's reproducible, not a fluke.
3. **Never include credentials.** Write `[REDACTED]` for passwords in repro steps.
4. **Write incrementally.** Append each issue to the report as you find it. Don't batch.
5. **Never read source code.** Test as a user, not a developer.
6. **Check console after every interaction.** JS errors that don't surface visually are still bugs.
7. **Test like a user.** Use realistic data. Walk through complete workflows end-to-end.
8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions.
9. **Never delete output files.** Screenshots and reports accumulate — that's intentional.
10. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
11. **Show screenshots to the user.** After every `${GSTACK_BROWSE} screenshot`, `${GSTACK_BROWSE} snapshot -a -o`, or `${GSTACK_BROWSE} responsive` command, use the Read tool on the output file(s) so the user can see them inline. For `responsive` (3 files), Read all three. This is critical — without it, screenshots are invisible to the user.
12. **Never refuse to use the browser.** When the user invokes /qa or /qa-only, they are requesting browser-based testing. Never suggest evals, unit tests, or other alternatives as a substitute. Even if the diff appears to have no UI changes, backend changes affect app behavior — always open the browser and test.

---

## Output

Write the report to both local and project-scoped locations:

**Local:** `.gstack/qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md`

**Project-scoped:** Write test outcome artifact for cross-session context:
```bash
eval "$(${GSTACK_OPENCODE_DIR}/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
```
Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`

### Output Structure

```
.gstack/qa-reports/
├── qa-report-{domain}-{YYYY-MM-DD}.md    # Structured report
├── screenshots/
│   ├── initial.png                        # Landing page annotated screenshot
│   ├── issue-001-step-1.png               # Per-issue evidence
│   ├── issue-001-result.png
│   └── ...
└── baseline.json                          # For regression mode
```

Report filenames use the domain and date: `qa-report-myapp-com-2026-03-12.md`
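
The filename can be derived from the target URL. This is a sketch: `report_filename` is a hypothetical helper, and the rules it applies (strip scheme, path, and port, then turn dots into hyphens) are inferred from the single example above, not from a documented spec.

```bash
# Hypothetical helper: build the report filename from a URL and an optional date.
# $1 = target URL, $2 = date as YYYY-MM-DD (defaults to today)
report_filename() {
  url="$1"
  day="${2:-$(date +%Y-%m-%d)}"
  # Strip scheme, then path/query/fragment, then port; finally turn dots into hyphens
  domain=$(printf '%s\n' "$url" | sed \
    -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' \
    -e 's|[/?#].*$||' \
    -e 's|:[0-9]*$||' \
    -e 's|\.|-|g')
  printf 'qa-report-%s-%s.md\n' "$domain" "$day"
}

report_filename "https://myapp.com/dashboard" 2026-03-12   # prints qa-report-myapp-com-2026-03-12.md
```
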

---

## Additional Rules (qa-only specific)

13. **Never fix bugs.** Find and document only. Do not read source code, edit files, or suggest fixes in the report. Your job is to report what's broken, not to fix it. Use `/qa` for the test-fix-verify loop.
14. **No test framework detected?** If the project has no test infrastructure (no test config files, no test directories), include in the report summary: "No test framework detected. Run `/qa` to bootstrap one and enable regression test generation."