opencode-workflow/skills/challenge-architecture/SKILL.md

2026-04-10 09:24:59 +00:00
---
name: challenge-architecture
description: "Stress-test architecture decisions, check PRD traceability, detect over-engineering, and validate storage and pattern selections. Comparable to grill-me in the PM pipeline."
---
Interview the architect relentlessly about every aspect of this architecture until it passes quality gates. Walk down each branch of the architecture decision tree, validating traceability, necessity, and soundness one-by-one.
Focus on system design validation, not implementation details. If a question drifts into code-level patterns, library choices, or implementation specifics, redirect it back to architecture-level concerns.
**Announce at start:** "I'm using the challenge-architecture skill to validate and stress-test the architecture."
Ask the questions one at a time.
## Primary Input
- `docs/architecture/{feature}.md`
- `docs/prd/{feature}.md`
## Primary Output
- Updated `docs/architecture/{feature}.md`
## Process
### Phase 1: Traceability Audit
For every architectural element, verify it traces back to at least one PRD requirement:
- Does every API endpoint serve a PRD functional requirement?
- Does every DB table serve a data requirement from the functional requirements or non-functional requirements (NFRs)?
- Does every service boundary serve a domain responsibility from the PRD scope?
- Does every async flow serve a PRD requirement?
- Does every error handling strategy serve a PRD edge case or NFR?
- Does every idempotency design serve a PRD requirement?
Flag any architectural element that exists without PRD traceability as **potential over-engineering**.
### Phase 2: Requirement Coverage Audit
For every PRD requirement, verify it is covered by the architecture:
- Does every functional requirement have at least one architectural component serving it?
- Does every NFR have at least one architectural decision addressing it?
- Does every edge case have an error handling strategy?
- Does every acceptance criterion have architectural support?
- Are there PRD requirements that the architecture does not address?
Flag any uncovered PRD requirement as a **gap**.
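Phases 1 and 2 read the same traceability matrix in opposite directions. A minimal sketch of both checks, assuming each architectural element lists the PRD requirement IDs it serves; the `Element` type, the `audit` function, and IDs such as `FR-1` are hypothetical and not part of any existing template:

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One architectural element (endpoint, table, service, async flow, ...)."""
    name: str
    requirement_ids: set[str] = field(default_factory=set)


def audit(elements: list[Element], prd_requirements: set[str]) -> tuple[list[str], list[str]]:
    """Return (untraced element names, uncovered requirement IDs)."""
    # Phase 1: an element is traced only if it cites at least one known PRD requirement.
    untraced = [e.name for e in elements if not (e.requirement_ids & prd_requirements)]
    # Phase 2: a requirement is covered if at least one element cites it.
    covered = set().union(*(e.requirement_ids for e in elements))
    uncovered = sorted(prd_requirements - covered)
    return untraced, uncovered


# Hypothetical example data.
elements = [
    Element("POST /orders", {"FR-1"}),
    Element("orders table", {"FR-1", "NFR-2"}),
    Element("audit-log service"),  # cites no requirement
]
untraced, uncovered = audit(elements, {"FR-1", "FR-3", "NFR-2"})
print(untraced)   # ['audit-log service']  -> potential over-engineering
print(uncovered)  # ['FR-3']               -> gap
```

An untraced element is only a candidate for removal, and an uncovered requirement is only a candidate gap. Both still need to be put to the architect as questions.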
### Phase 3: Architecture Decision Validation
For each Architecture Decision Record (ADR), challenge:
- Is the decision necessary, or could a simpler approach work?
- Are the alternatives fairly evaluated, or is there a strawman?
- Is the rationale specific to this use case, or generic boilerplate?
- Are the consequences honestly assessed?
- Does the decision optimize for maintainability, scalability, reliability, clarity, and bounded responsibilities?
- Does the decision avoid over-engineering, premature microservices, unnecessary abstractions, and implementation leakage?
### Phase 4: Knowledge Domain Review
For each relevant knowledge domain, validate the architecture:
#### System Decomposition
- Are service boundaries aligned with domain responsibilities?
- Is each service's responsibility single and well-defined?
- Are there cyclic dependencies?
- Is coupling minimized while cohesion is maximized?
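The cyclic-dependency question can be checked mechanically once the service dependencies are written down. A minimal sketch, assuming the dependencies are available as a mapping from each service to the services it calls; `find_cycle` and the service names are hypothetical:

```python
from typing import Optional


def find_cycle(deps: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of service names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: dict[str, int] = {}
    path: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: the cycle closes here.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for service in deps:
        if color.get(service, WHITE) == WHITE:
            found = visit(service)
            if found:
                return found
    return None


# Hypothetical dependency map.
cyclic = {
    "orders": ["payments"],
    "payments": ["notifications"],
    "notifications": ["orders"],
}
print(find_cycle(cyclic))
# ['orders', 'payments', 'notifications', 'orders']
print(find_cycle({"orders": ["payments"], "payments": []}))  # None
```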
#### API & Contract Design
- Are API contracts complete and unambiguous?
- Are status codes appropriate and consistent?
- Is pagination defined for list endpoints?
- Are error responses consistent?
#### Data Modeling
- Are indexes justified by query patterns?
- Are relationships properly modeled?
- Is data ownership clear (each data item owned by exactly one service)?
- Is denormalization intentional and justified?
#### Distributed System Basics
- Are retry semantics clearly defined?
- Is timeout behavior specified?
- Is partial failure handled?
- Are consistency guarantees explicit?
#### Architecture Patterns
- Is each pattern necessary for the PRD requirements?
- Are patterns applied because they solve a real problem, not because they are fashionable?
- Is the chosen pattern the simplest option that works?
#### Storage Knowledge
- Is each storage selection justified by query patterns, write patterns, consistency requirements, or scale expectations?
- Is the storage choice the simplest option that meets requirements?
- Are there cases where a simpler storage option would suffice?
#### Async & Queue Design
- Is asynchronicity justified by PRD requirements?
- Are retry and dead-letter queue (DLQ) strategies defined for every async operation?
- Are ordering guarantees specified where needed?
#### Error Model Design
- Are error categories complete and non-overlapping?
- Is the distinction between retryable and non-retryable errors clear?
- Is partial failure behavior defined?
- Are fallback strategies specified?
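One way to judge whether the retryable versus non-retryable distinction and the retry and DLQ strategy are specified precisely enough is to try writing them down as a small executable contract. The sketch below is illustrative only; the exception classes, `RetryPolicy`, `deliver`, and the default values are assumptions, not requirements of this skill:

```python
from dataclasses import dataclass


class RetryableError(Exception):
    """Transient failure, e.g. a timeout; trying again may succeed."""


class NonRetryableError(Exception):
    """Permanent failure, e.g. a validation error; retrying cannot help."""


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3
    base_delay_s: float = 0.5
    max_delay_s: float = 30.0

    def delay_after(self, attempt: int) -> float:
        """Capped exponential backoff after a failed attempt (1-based)."""
        return min(self.base_delay_s * 2 ** (attempt - 1), self.max_delay_s)


def deliver(message, handler, policy, dead_letters, sleep=lambda seconds: None):
    """Run handler under the policy; park the message in dead_letters on final failure."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return handler(message)
        except NonRetryableError:
            break  # permanent failure: skip the remaining attempts
        except RetryableError:
            if attempt < policy.max_attempts:
                sleep(policy.delay_after(attempt))
    dead_letters.append(message)
    return None
```

If the architecture document cannot supply the values this sketch needs (attempt limit, backoff bounds, which errors are permanent, where failed messages end up), that is a finding to raise with the architect.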
#### Idempotency Design
- Are idempotent operations correctly identified from PRD requirements?
- Is the idempotency key strategy complete (source, format, TTL, storage)?
- Is duplicate request behavior specified?
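The four parts of the key strategy and the duplicate-request behavior can be made concrete with a small model. This is a sketch under stated assumptions: the key is supplied by the client, results are held in memory, and duplicates inside the TTL replay the stored result. `IdempotencyStore` and `run_once` are hypothetical names, and the class is not safe for concurrent use:

```python
import time
from typing import Any, Callable


class IdempotencyStore:
    """In-memory stand-in for the storage part of the key strategy."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}  # key -> (expiry, result)

    def run_once(self, key: str, operation: Callable[[], Any]) -> Any:
        """Execute operation unless key was already seen within the TTL."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # duplicate within TTL: replay the stored result
        result = operation()
        self._entries[key] = (now + self._ttl_s, result)
        return result


# Usage with a fake clock so the TTL can be exercised without waiting.
fake_now = [0.0]
store = IdempotencyStore(ttl_s=60, clock=lambda: fake_now[0])
charges = []


def charge() -> str:
    charges.append("charged")
    return "receipt-1"


store.run_once("order-42", charge)
store.run_once("order-42", charge)  # duplicate: charge() is not called again
print(len(charges))  # 1
fake_now[0] = 61.0   # the TTL has elapsed
store.run_once("order-42", charge)
print(len(charges))  # 2
```

The last call shows why the TTL belongs in the review: once the key expires, a late duplicate is executed again, so the TTL has to cover the longest retry window the PRD allows.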
### Phase 5: Over-Engineering Detection
Check for common over-engineering patterns:
- Services that could be modules
- Patterns applied "just in case" without PRD justification
- Storage choices that exceed what the requirements demand
- Async processing where sync would suffice
- Abstraction layers that add complexity without solving a real problem
- Idempotency on operations that do not need it
- Error handling complexity disproportionate to the risk
### Phase 6: Under-Engineering Detection
Check for common under-engineering patterns:
- Missing error handling for edge cases identified in the PRD
- Missing idempotency for operations the PRD marks as requiring it
- Missing NFR accommodations (scaling, latency, availability)
- Missing async processing for operations that the PRD requires to be non-blocking
- Missing security boundaries or authentication where the PRD requires it
- Missing observability for critical operations
## Validation Checklist
After challenging, verify the architecture satisfies:
1. Every architectural element traces to at least one PRD requirement
2. Every PRD requirement is covered by at least one architectural element
3. Every ADR is necessary, well-reasoned, and honestly assessed
4. No over-engineering without PRD justification
5. No under-engineering for PRD-identified requirements
6. All 9 architecture sections are present and substantive (or explicitly N/A with reason)
7. Service boundaries are aligned with domain responsibilities
8. API contracts are complete and consistent
9. Data model is justified by query and write patterns
10. Storage selections are the simplest option that meets requirements
11. Async processing is justified by PRD requirements
12. Error model covers all PRD edge cases
13. Idempotency is applied where the PRD requires it, and not where it does not
## Outcomes
For each issue found:
1. Document the issue
2. Propose a fix
3. Apply the fix to the architecture document
4. Re-verify the fix against the PRD
After all issues are resolved, the architecture is ready for handoff to the Planner.
## Guardrails
This is a validation skill: it challenges and corrects an existing architecture document rather than producing a new design.
Do:
- Challenge architectural decisions with evidence
- Validate traceability to PRD requirements
- Detect over-engineering and under-engineering
- Propose specific fixes for identified issues
Do not:
- Change PRD requirements or scope
- Design architecture from scratch
- Make implementation-level decisions
- Break down tasks or create milestones
- Write test cases