---
name: challenge-architecture
description: "Stress-test architecture decisions, check PRD traceability, detect over-engineering, and validate storage and pattern selections. Comparable to grill-me in the PM pipeline."
---

Interview the architect relentlessly about every aspect of this architecture until it passes quality gates. Walk down each branch of the architecture decision tree, validating traceability, necessity, and soundness one-by-one.

Focus on system design validation, not implementation details. If a question drifts into code-level patterns, library choices, or implementation specifics, redirect it back to architecture-level concerns.

**Announce at start:** "I'm using the challenge-architecture skill to validate and stress-test the architecture."

Ask the questions one at a time.

## Primary Input

- `docs/architecture/{feature}.md`
- `docs/prd/{feature}.md`

## Primary Output

- Updated `docs/architecture/{feature}.md`

## Process

### Phase 1: Traceability Audit

For every architectural element, verify it traces back to at least one PRD requirement:

- Does every API endpoint serve a PRD functional requirement?
- Does every DB table serve a data requirement from functional requirements or NFRs?
- Does every service boundary serve a domain responsibility from the PRD scope?
- Does every async flow serve a PRD requirement?
- Does every error handling strategy serve a PRD edge case or NFR?
- Does every idempotency design serve a PRD requirement?

Flag any architectural element that exists without PRD traceability as **potential over-engineering**.
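
The audit above can be pictured as a lookup from each architectural element to the PRD requirement IDs it claims to serve. The following is a minimal sketch only: the `Element` type, the `untraced_elements` helper, and the `FR-`/`NFR-` ID scheme are hypothetical illustrations, not part of this skill or of any PRD format.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical record for one architectural element."""
    name: str
    kind: str  # e.g. "endpoint", "table", "service", "async-flow"
    prd_refs: list[str] = field(default_factory=list)


def untraced_elements(elements, prd_ids):
    """Return elements that reference no known PRD requirement."""
    return [e for e in elements if not any(ref in prd_ids for ref in e.prd_refs)]


# Illustrative data; the requirement IDs are assumptions.
elements = [
    Element("POST /orders", "endpoint", ["FR-1"]),
    Element("orders", "table", ["FR-1", "NFR-2"]),
    Element("order-events queue", "async-flow", []),
]
prd_ids = {"FR-1", "FR-2", "NFR-2"}

flagged = untraced_elements(elements, prd_ids)
print([e.name for e in flagged])
```

Here the queue would be raised as potential over-engineering, and the architect would be asked which requirement it serves.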

### Phase 2: Requirement Coverage Audit

For every PRD requirement, verify it is covered by the architecture:

- Does every functional requirement have at least one architectural component serving it?
- Does every NFR have at least one architectural decision addressing it?
- Does every edge case have an error handling strategy?
- Does every acceptance criterion have architectural support?
- Are there PRD requirements that the architecture does not address?

Flag any uncovered PRD requirement as a **gap**.
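
Coverage is the inverse direction of the traceability check: start from the PRD and look for requirements that no element serves. A minimal sketch, assuming a hypothetical `coverage` mapping from element name to the requirement IDs it serves (the IDs, including `EC-1` for an edge case, are illustrative):

```python
def uncovered_requirements(prd_ids, coverage):
    """Return PRD requirement IDs that no architectural element serves.

    coverage maps an element name to the set of requirement IDs it serves.
    """
    covered = set().union(*coverage.values())
    return sorted(set(prd_ids) - covered)


# Illustrative data; names and IDs are assumptions.
coverage = {
    "POST /orders": {"FR-1"},
    "orders table": {"FR-1", "NFR-2"},
}
gaps = uncovered_requirements({"FR-1", "FR-2", "NFR-2", "EC-1"}, coverage)
print(gaps)
```

Each ID in `gaps` corresponds to a question for the architect about which component or decision addresses it.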

### Phase 3: Architecture Decision Validation

For each Architectural Decision Record, challenge:

- Is the decision necessary, or could a simpler approach work?
- Are the alternatives fairly evaluated, or is there a strawman?
- Is the rationale specific to this use case, or generic boilerplate?
- Are the consequences honestly assessed?
- Does the decision optimize for maintainability, scalability, reliability, clarity, and bounded responsibilities?
- Does the decision avoid over-engineering, premature microservices, unnecessary abstractions, and implementation leakage?

### Phase 4: Knowledge Domain Review

For each relevant knowledge domain, validate the architecture:

#### System Decomposition
- Are service boundaries aligned with domain responsibilities?
- Is each service's responsibility single and well-defined?
- Are there cyclic dependencies?
- Is coupling minimized while cohesion is maximized?
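
The cyclic-dependency question can be answered mechanically once the service call graph is written down. The sketch below uses a depth-first search; the `find_cycle` helper and the service names are hypothetical, and the graph is assumed to be extracted by hand from the architecture document.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of service names, or None.

    deps maps a service to the services it calls.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for nxt in deps.get(node, ()):
            if state.get(nxt, WHITE) == GREY:
                # nxt is already on the current path, so the path closes a cycle
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, WHITE) == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for service in deps:
        if state.get(service, WHITE) == WHITE:
            found = visit(service)
            if found:
                return found
    return None


# Illustrative graph; service names are assumptions.
deps = {
    "orders": ["payments"],
    "payments": ["notifications"],
    "notifications": ["orders"],
    "catalog": [],
}
cycle = find_cycle(deps)
print(cycle)
```

A reported cycle is a prompt to ask whether the services involved are really separate domain responsibilities.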

#### API & Contract Design
- Are API contracts complete and unambiguous?
- Are status codes appropriate and consistent?
- Is pagination defined for list endpoints?
- Are error responses consistent?

#### Data Modeling
- Are indexes justified by query patterns?
- Are relationships properly modeled?
- Is data ownership clear (each data item owned by exactly one service)?
- Is denormalization intentional and justified?

#### Distributed System Basics
- Are retry semantics clearly defined?
- Is timeout behavior specified?
- Is partial failure handled?
- Are consistency guarantees explicit?
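
One way to test whether retry and timeout behavior is specified precisely enough is to check that a worst-case latency can be computed from it and compared with the relevant NFR. A minimal sketch, assuming a hypothetical `RetryPolicy` with exponential backoff; the field names and numbers are illustrative, not taken from any architecture document:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical retry specification for one remote call."""
    max_attempts: int
    base_delay_s: float
    max_delay_s: float
    per_attempt_timeout_s: float

    def delays(self):
        """Backoff delays between attempts: exponential, capped at max_delay_s."""
        return [
            min(self.base_delay_s * 2 ** i, self.max_delay_s)
            for i in range(self.max_attempts - 1)
        ]

    def worst_case_latency_s(self):
        """Upper bound if every attempt runs until its timeout."""
        return self.max_attempts * self.per_attempt_timeout_s + sum(self.delays())


policy = RetryPolicy(
    max_attempts=4, base_delay_s=0.5, max_delay_s=2.0, per_attempt_timeout_s=1.0
)
print(policy.delays(), policy.worst_case_latency_s())
```

If the architecture cannot supply these four values for a call, its retry semantics are probably not yet defined.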

#### Architecture Patterns
- Is each pattern necessary for the PRD requirements?
- Are patterns applied because they solve a real problem, not because they are fashionable?
- Is the chosen pattern the simplest option that works?

#### Storage Knowledge
- Is each storage selection justified by query patterns, write patterns, consistency requirements, or scale expectations?
- Is the storage choice the simplest option that meets requirements?
- Are there cases where a simpler storage option would suffice?

#### Async & Queue Design
- Is asynchronicity justified by PRD requirements?
- Are retry and DLQ strategies defined for every async operation?
- Are ordering guarantees specified where needed?

#### Error Model Design
- Are error categories complete and non-overlapping?
- Is the distinction between retryable and non-retryable errors clear?
- Is partial failure behavior defined?
- Are fallback strategies specified?
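
An error model that passes these questions can usually be written as a single table with one row per category, stating whether the error is retryable and what the fallback is. The sketch below is illustrative only: the `ErrorCategory` members and the fallback descriptions are assumptions, not a prescribed taxonomy.

```python
from enum import Enum


class ErrorCategory(Enum):
    """Hypothetical error categories; a real set comes from the PRD edge cases."""
    VALIDATION = "validation"
    NOT_FOUND = "not_found"
    CONFLICT = "conflict"
    DEPENDENCY_TIMEOUT = "dependency_timeout"
    DEPENDENCY_UNAVAILABLE = "dependency_unavailable"


# category -> (retryable, fallback strategy)
ERROR_MODEL = {
    ErrorCategory.VALIDATION: (False, "reject the request"),
    ErrorCategory.NOT_FOUND: (False, "reject the request"),
    ErrorCategory.CONFLICT: (False, "return the current state"),
    ErrorCategory.DEPENDENCY_TIMEOUT: (True, "retry, then degrade"),
    ErrorCategory.DEPENDENCY_UNAVAILABLE: (True, "retry, then queue for later"),
}


def unclassified(model):
    """Return the names of categories that have no row in the model."""
    return [c.name for c in ErrorCategory if c not in model]


retryable = sorted(c.name for c, (retry, _) in ERROR_MODEL.items() if retry)
print(unclassified(ERROR_MODEL), retryable)
```

Keying the table by category gives each category exactly one row, and `unclassified` surfaces any category the model forgot.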

#### Idempotency Design
- Are idempotent operations correctly identified from PRD requirements?
- Is the idempotency key strategy complete (source, format, TTL, storage)?
- Is duplicate request behavior specified?
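
To make the four parts of a key strategy concrete, here is a minimal in-memory sketch in which the key comes from the client, entries expire after a TTL, and a duplicate within the TTL replays the stored response. The `IdempotencyStore` class and the key format are hypothetical; the sketch is not concurrency-safe, and a real design would name a shared store.

```python
import time


class IdempotencyStore:
    """Hypothetical single-process store: key -> (expires_at, response)."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}

    def execute(self, key, operation):
        now = self._clock()
        entry = self._entries.get(key)
        if entry and entry[0] > now:
            return entry[1]  # duplicate within TTL: replay the stored response
        response = operation()
        self._entries[key] = (now + self._ttl_s, response)
        return response


calls = []


def charge():
    """Stand-in for a side-effecting operation."""
    calls.append(1)
    return {"charge_id": len(calls)}


now = [0.0]  # fake clock value, advanced manually below
store = IdempotencyStore(ttl_s=60, clock=lambda: now[0])
first = store.execute("client-req-123", charge)
second = store.execute("client-req-123", charge)  # replayed, charge() not called
now[0] = 61.0
third = store.execute("client-req-123", charge)  # TTL expired, runs again
print(first, second, third)
```

The third call shows why the TTL belongs in the architecture: once it lapses, the same key triggers the side effect again.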

### Phase 5: Over-Engineering Detection

Check for common over-engineering patterns:

- Services that could be modules
- Patterns applied "just in case" without PRD justification
- Storage choices that exceed what the requirements demand
- Async processing where sync would suffice
- Abstraction layers that add complexity without solving a real problem
- Idempotency on operations that do not need it
- Error handling complexity disproportionate to the risk

### Phase 6: Under-Engineering Detection

Check for common under-engineering patterns:

- Missing error handling for edge cases identified in the PRD
- Missing idempotency for operations the PRD marks as requiring it
- Missing NFR accommodations (scaling, latency, availability)
- Missing async processing for operations that the PRD requires to be non-blocking
- Missing security boundaries or authentication where the PRD requires it
- Missing observability for critical operations

## Validation Checklist

After challenging, verify the architecture satisfies:

1. Every architectural element traces to at least one PRD requirement
2. Every PRD requirement is covered by at least one architectural element
3. Every ADR is necessary, well-reasoned, and honestly assessed
4. No over-engineering without PRD justification
5. No under-engineering for PRD-identified requirements
6. All 9 architecture sections are present and substantive (or explicitly N/A with reason)
7. Service boundaries are aligned with domain responsibilities
8. API contracts are complete and consistent
9. Data model is justified by query and write patterns
10. Storage selections are the simplest option that meets requirements
11. Async processing is justified by PRD requirements
12. Error model covers all PRD edge cases
13. Idempotency is applied where the PRD requires it, and not where it does not

## Outcomes

For each issue found:
1. Document the issue
2. Propose a fix
3. Apply the fix to the architecture document
4. Re-verify the fix against the PRD

After all issues are resolved, the architecture is ready for handoff to the Planner.

## Guardrails

This is a pure validation skill.

Do:
- Challenge architectural decisions with evidence
- Validate traceability to PRD requirements
- Detect over-engineering and under-engineering
- Propose specific fixes for identified issues

Do not:
- Change PRD requirements or scope
- Design architecture from scratch
- Make implementation-level decisions
- Break down tasks or create milestones
- Write test cases