---
name: challenge-architecture
description: "Silent audit and batch review of architecture decisions. Validates traceability, coverage, scalability, consistency, security, integration, observability, and data integrity, and detects over- and under-engineering. Updates the single architecture file in place."
---
Perform a silent, structured audit of the architecture document against the PRD. Produce a single batch review with fixed output groups. Apply all fixes directly to the architecture file. Do not ask interactive questions.
**Announce at start:** "I'm using the challenge-architecture skill to audit and review the architecture."
## Primary Input
- `docs/architecture/{feature}.md`
- `docs/prd/{feature}.md`
## Primary Output (STRICT PATH)
- Updated `docs/architecture/{feature}.md`
This is the **only** file artifact in the Architect pipeline. Review findings and fixes are applied directly to this file. No intermediate files are written.
## Audit Mode
This skill operates in **silent audit / batch review** mode:
- Read the architecture document and PRD in full
- Perform all validation phases silently
- Produce a single structured review with all findings grouped into fixed categories
- Apply all fixes directly to the architecture document
- Do NOT ask questions one at a time or interactively prompt the user
## Audit Phases
Perform the following validations silently, collecting all findings before producing the review.
### Phase 1: Traceability
For every architectural element, verify it traces back to at least one PRD requirement:
- Every API endpoint serves a PRD functional requirement
- Every DB table serves a data requirement from FRs or NFRs
- Every service boundary serves a domain responsibility from the PRD scope
- Every async flow serves a PRD requirement
- Every error handling strategy serves a PRD edge case or NFR
- Every consistency decision serves a PRD requirement
- Every security boundary serves a security or compliance requirement
- Every integration boundary serves an external system requirement
- Every observability decision serves an NFR
### Phase 2: Coverage
For every PRD requirement, verify it is covered by the architecture:
- Every functional requirement has at least one architectural component
- Every NFR has at least one architectural decision
- Every edge case has an error handling strategy
- Every acceptance criterion has architectural support
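Phases 1 and 2 are two directions of the same check, and can be thought of as set operations over requirement IDs. The following is a minimal sketch under stated assumptions: the `Element` type, the `audit_traceability` helper, and IDs such as `FR-1` are hypothetical names for illustration only, since this skill does not prescribe a data model or an ID scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical record for one architectural element (endpoint, table, ...)."""
    name: str
    traces_to: set[str] = field(default_factory=set)  # PRD requirement IDs


def audit_traceability(elements, requirement_ids):
    """Return (untraceable element names, uncovered requirement IDs)."""
    # Phase 1: an element is untraceable if it references no known PRD requirement.
    untraceable = [e.name for e in elements if not (e.traces_to & requirement_ids)]
    # Phase 2: a requirement is uncovered if no element references it.
    covered = set().union(*(e.traces_to for e in elements))
    uncovered = sorted(requirement_ids - covered)
    return untraceable, uncovered


# Illustrative data only; the names and IDs are assumptions.
elements = [
    Element("POST /orders", {"FR-1"}),
    Element("orders table", {"FR-1", "NFR-2"}),
    Element("recommendation cache"),
]
requirements = {"FR-1", "FR-2", "NFR-2"}
untraceable, uncovered = audit_traceability(elements, requirements)
```

Here `recommendation cache` would be reported as untraceable and `FR-2` as uncovered; both belong in the Traceability Gaps group of the review.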
### Phase 3: Scalability
- Can each service scale independently?
- Are there single points of failure?
- Are there bottlenecks that prevent horizontal scaling?
- Is database scaling addressed?
- Are there unbounded data growth scenarios?
### Phase 4: Consistency
- Is the consistency model explicit for each data domain?
- Are eventual consistency windows acceptable for the use case?
- Are race conditions identified and mitigated?
- Is idempotency designed for operations that require it?
- Are distributed transaction boundaries clear?
- Is the deduplication strategy sound?
- Are retry semantics defined for all async operations?
- Is the outbox pattern used where needed?
- Are saga/compensation patterns defined for multi-step operations?
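When judging the idempotency, deduplication, and retry questions above, it can help to keep a concrete shape in mind. The sketch below shows one common approach, deduplication by idempotency key; it is an illustration, not a pattern this skill mandates. `IdempotentProcessor`, `charge`, and the key format are hypothetical, and the in-memory dictionary stands in for whatever durable store the architecture specifies.

```python
class IdempotentProcessor:
    """Runs a handler at most once per idempotency key (illustrative sketch)."""

    def __init__(self, handler):
        self._handler = handler
        self._results = {}  # assumption: stand-in for a durable dedup store

    def process(self, idempotency_key, payload):
        # A retried message with a known key returns the stored result
        # instead of re-running the side effect.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._handler(payload)
        self._results[idempotency_key] = result
        return result


calls = []


def charge(payload):
    """Hypothetical side-effecting operation."""
    calls.append(payload)
    return {"charged": payload["amount"]}


processor = IdempotentProcessor(charge)
first = processor.process("order-42", {"amount": 100})
retry = processor.process("order-42", {"amount": 100})  # simulated redelivery
```

If the architecture document defines retries for an operation but no equivalent of the key and the store, that is a candidate finding for the Missing Decisions group.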
### Phase 5: Security
- Are authentication boundaries clearly defined?
- Is authorization modeled correctly?
- Is service-to-service authentication specified?
- Is token propagation defined?
- Is tenant isolation defined (for multi-tenant systems)?
- Is secret management addressed?
- Are there data exposure risks in API responses?
- Is audit logging specified for sensitive operations?
### Phase 6: Integration
- Are all external system integrations identified?
- Is the integration pattern appropriate for each?
- Are rate limits and quotas addressed?
- Are failure modes defined for each integration?
- Are retry strategies defined for transient failures?
- Is data transformation between systems addressed?
### Phase 7: Observability
- Are logs, metrics, and traces all specified?
- Is correlation ID propagation defined across services?
- Are SLOs defined for critical operations?
- Are alert conditions and thresholds specified?
- Can the system be debugged end-to-end from logs and traces?
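For the correlation ID question, the property to look for is that an ID is created once at the edge and reused unchanged on every downstream call. A minimal sketch, assuming an HTTP-style header map; the header name `X-Correlation-ID` is a common convention rather than a standard, and `ensure_correlation_id` is a hypothetical helper.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # assumption: conventional, not standardized


def ensure_correlation_id(headers):
    """Return a copy of headers that carries a correlation ID.

    An incoming ID is reused; a new one is generated only when none is present.
    """
    out = dict(headers)
    out.setdefault(CORRELATION_HEADER, str(uuid.uuid4()))
    return out


edge_request = ensure_correlation_id({})                # ID created at the edge
downstream_call = ensure_correlation_id(edge_request)   # same ID propagated
```

An architecture that logs per service but never states where the ID is created or how it is forwarded would likely fail the end-to-end debugging question.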
### Phase 8: Data Integrity
- Are there scenarios where data could be lost?
- Are transaction boundaries appropriate?
- Are there scenarios where data could become inconsistent?
- Is data ownership clear?
- Are cascading deletes or updates handled correctly?
### Phase 9: Over-Engineering Detection
- Services that could be modules
- Patterns applied without PRD justification
- Storage choices exceeding requirements
- Async processing where sync would suffice
- Abstraction layers without clear benefit
- Consistency guarantees stronger than requirements
- Security boundaries more complex than the threat model
- Observability granularity beyond operational need
### Phase 10: Under-Engineering Detection
- Missing error handling for PRD edge cases
- Missing idempotency for operations requiring it
- Missing NFR accommodations
- Missing async processing for non-blocking requirements
- Missing security boundaries where the PRD requires them
- Missing observability for critical operations
- Missing consistency model specification
- Missing integration failure handling
- Missing retry strategies for external dependencies
## Review Output Format
After completing all audit phases, produce a single structured review section. Append or update the `## Architecture Review` section in `docs/architecture/{feature}.md` with the following fixed groups:
```markdown
## Architecture Review
### Traceability Gaps
List every architectural element that cannot be traced to a PRD requirement, and every PRD requirement not covered by the architecture.
| Element / Requirement | Issue | Proposed Fix |
|----------------------|-------|-------------|
| ... | Untraceable / Uncovered | ... |
### Missing Decisions
List required architectural decisions that are absent or incomplete.
- [ ] ...
### Over-Engineering
List elements that exceed what the PRD requires.
- ... (specific item, why it is over-engineered, proposed simplification)
### Under-Engineering
List PRD requirements that lack adequate architectural support.
- ... (specific requirement, what is missing, proposed addition)
### Risks
| Risk | Impact | Likelihood | Mitigation |
|------|--------|-----------|------------|
| ... | High/Medium/Low | High/Medium/Low | ... |
### Required Revisions
Numbered list of all changes that MUST be applied before handoff:
1. ...
2. ...
```
After producing the review, apply all Required Revisions directly to `docs/architecture/{feature}.md`.
## Gate Decision
After applying revisions, evaluate the final state:
- **PASS** — All revisions applied, no remaining blockers
- **CONDITIONAL PASS** — Minor gaps remain but do not block Planner handoff
- **FAIL** — Significant revision required; return to `design-architecture`
Record the gate decision at the end of the Architecture Review section.
If FAIL, do NOT proceed to `finalize-architecture`. The architecture must be redesigned in `design-architecture` first.
If PASS or CONDITIONAL PASS, proceed to `finalize-architecture`.
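The gate rules above can be summarized as a small decision function. This is a sketch under assumptions: the skill does not define how findings are counted, so the `blockers` and `minor_gaps` inputs and the rule that any remaining blocker means FAIL are illustrative choices. Only the three outcomes and the skill names come from this document.

```python
from enum import Enum


class Gate(Enum):
    PASS = "PASS"
    CONDITIONAL_PASS = "CONDITIONAL PASS"
    FAIL = "FAIL"


def gate_decision(blockers: int, minor_gaps: int) -> Gate:
    """Map findings remaining after revisions to a gate outcome (assumed rule)."""
    if blockers > 0:
        return Gate.FAIL
    if minor_gaps > 0:
        return Gate.CONDITIONAL_PASS
    return Gate.PASS


def next_skill(gate: Gate) -> str:
    """FAIL returns to design; any pass outcome proceeds to finalization."""
    return "design-architecture" if gate is Gate.FAIL else "finalize-architecture"


decision = gate_decision(blockers=0, minor_gaps=2)
target = next_skill(decision)
```

With no blockers and two minor gaps, the outcome would be CONDITIONAL PASS and the next skill `finalize-architecture`.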
## Guardrails
This is a pure validation and revision skill.
Do:
- Audit the architecture silently and produce a single batch review
- Validate traceability, coverage, scalability, consistency, security, integration, observability, and data integrity
- Detect over-engineering and under-engineering
- Propose specific fixes for all identified issues
- Apply all fixes directly to `docs/architecture/{feature}.md`
- Record the gate decision
Do not:
- Ask questions interactively
- Change PRD requirements or scope
- Design architecture from scratch
- Make implementation-level decisions
- Break down tasks or create milestones
- Write test cases
- Produce any file artifact other than `docs/architecture/{feature}.md`
## Transition
If gate decision is PASS or CONDITIONAL PASS, invoke `finalize-architecture` for final completeness check and format validation.