---
name: design-architecture
description: "Design system architecture based on PRD requirements. The Architect pipeline's core step, producing the single strict output file with all deliverables: Architecture Doc, Mermaid Diagrams, API Contract, DB Schema, ADR, NFR, Security Boundaries, Integration Boundaries, Observability, Consistency Model."
---
This skill produces the complete architecture document for a feature, including all required deliverables.

**Announce at start:** "I'm using the design-architecture skill to design the system architecture."
## Primary Input
- `docs/prd/{feature}.md` (required)
## Primary Output (STRICT PATH)
- `docs/architecture/{feature}.md`
This is the **only** file artifact produced by the Architect pipeline. No intermediate files (research, analysis) are written to disk. All deliverables — diagrams, schemas, specs, ADRs — must be embedded within this single document.
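The input and output path patterns can be sketched as a small helper. This is only an illustration: the function name `artifact_paths` and the assumption that the feature slug is used verbatim as the filename stem are hypothetical and not defined by this skill.

```python
from pathlib import PurePosixPath


def artifact_paths(feature: str) -> tuple[PurePosixPath, PurePosixPath]:
    """Return (PRD input path, architecture output path) for a feature slug.

    Hypothetical helper: the skill fixes only the two path patterns,
    not how the slug is produced or validated.
    """
    if not feature or "/" in feature:
        raise ValueError("feature must be a non-empty slug without '/'")
    prd = PurePosixPath("docs/prd") / f"{feature}.md"
    architecture = PurePosixPath("docs/architecture") / f"{feature}.md"
    return prd, architecture
```

For a feature slug such as `checkout` (an example value), the helper yields `docs/prd/checkout.md` as input and `docs/architecture/checkout.md` as the single output.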
## Hard Gate
Do NOT start this skill if the PRD has unresolved ambiguities that block architectural decisions. Resolve them with the PM first.
## Process
You MUST complete these steps in order:
1. **Read the PRD** at `docs/prd/{feature}.md` end-to-end to understand all requirements
2. **Apply internal analysis** from the `analyze-prd` step (if performed) to understand which knowledge domains are relevant
3. **Design each architecture section** based on PRD requirements and relevant knowledge domains
4. **Apply knowledge contracts** as needed:
- `system-decomposition` when designing service boundaries
- `api-contract-design` when defining API contracts
- `data-modeling` when designing database schema
- `distributed-system-basics` when dealing with distributed concerns
- `architecture-patterns` when selecting architectural patterns
- `storage-knowledge` when making storage technology decisions
- `async-queue-design` when designing asynchronous workflows
- `error-model-design` when defining error handling
- `security-boundary-design` when defining auth, authorization, tenant isolation
- `consistency-transaction-design` when defining consistency model, idempotency, saga
- `integration-boundary-design` when defining external API integration patterns
- `observability-design` when defining logs, metrics, traces, alerts, SLOs
- `migration-rollout-design` when defining rollout strategy, feature flags, rollback
5. **Apply deliverable skills** to produce concrete artifacts:
- `generate_mermaid_diagram` when producing diagrams
- `design_database_schema` when producing database schema
- `generate_openapi_spec` when producing API specifications
- `write_adr` when documenting architectural decisions
- `evaluate_tech_stack` when evaluating technology choices
6. **Ensure traceability** — every architectural decision must trace back to at least one PRD requirement
7. **Run the completeness check** to verify that all 18 required sections are present and substantive
8. **Write the architecture document** to `docs/architecture/{feature}.md`
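The traceability requirement in step 6 can be checked mechanically. The sketch below is one possible approach, not part of the skill itself: it assumes PRD requirements carry identifiers such as `FR-1` (a hypothetical format) and that the Requirement Traceability table has exactly two columns, as in the template.

```python
def parse_traceability_rows(table: str) -> list[tuple[str, str]]:
    """Parse a two-column Markdown table into (requirement, component) pairs."""
    rows = []
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a row of the two-column table
        if set(cells[0]) <= set("-: "):
            continue  # separator row such as |---|---|
        rows.append((cells[0], cells[1]))
    return rows[1:]  # drop the header row


def untraced_requirements(requirement_ids, rows) -> list[str]:
    """Return requirement IDs that have no non-empty component in the table."""
    traced = {req for req, component in rows if component}
    return sorted(set(requirement_ids) - traced)
```

Any identifier returned by `untraced_requirements` points to a PRD requirement that still needs an architectural component, or to a table row whose component cell was left empty.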
## Architect Behavior Principles
Apply these principles in priority order when making design decisions:
1. **High Availability** — Design for fault tolerance and resilience over perfect consistency
2. **Scalability** — Design for horizontal scaling over vertical scaling
3. **Stateless First** — Prefer stateless services; externalize state to databases or caches
4. **API First** — Define contracts before implementation; APIs are the primary interface
5. **Event Driven First** — Prefer event-driven communication for cross-service coordination
6. **Async First** — Prefer asynchronous processing for non-realtime operations
## Architecture Document Template
```markdown
# Architecture: {Feature Name}
## Overview
High-level description of the system architecture. Map every major PRD requirement to an architectural component. Summarize the system's purpose, key design decisions, and architectural style.
### Requirement Traceability
| PRD Requirement | Architectural Component |
|----------------|------------------------|
| ... | ... |
## System Architecture
Describe the complete system architecture including all services, databases, message queues, caches, and external integrations. Show how components are organized, what technology stack each uses, and how they communicate.
### Technology Stack
| Layer | Technology | Justification |
|-------|-----------|---------------|
| Language | ... | ... |
| Framework | ... | ... |
| Database | ... | ... |
| Queue | ... | ... |
| Cache | ... | ... |
| Infrastructure | ... | ... |
If the feature has no backend component, write `N/A` with a brief reason.
### Component Architecture
Describe each major component, its responsibility, and how it fits into the overall system.
## Service Boundaries
Define service boundaries with clear responsibilities and communication patterns.
For each service or module:
- Name and single responsibility
- Owned data
- Communication patterns with other services (sync, async, event-driven)
- Potential coupling points and mitigation
### Communication Matrix
| From | To | Pattern | Protocol | Purpose |
|------|----|---------|----------|---------|
| ... | ... | ... | ... | ... |
## Data Flow
Describe how data moves through the system end-to-end. Include:
- Request lifecycle from entry point to response
- Background job processing flow
- Event propagation flow
- Data transformation and enrichment steps
## Database Schema
Define all database tables, columns, indexes, partition keys, constraints, and relationships. If the feature requires no database changes, write `N/A` with a brief reason.
### Table Definitions
For each table:
- Table name and purpose
- Column definitions (name, type, constraints, defaults)
- Indexes with justification based on query patterns
- Partition keys (where applicable)
- Foreign key relationships
### Entity Relationships
Describe relationships between tables.
### Denormalization Strategy
If denormalization is applied, document which fields are denormalized, why, and the consistency implications.
### Migration Strategy
Describe the migration approach if schema changes affect existing data.
## API Contract
Define all API endpoints with full specifications. Use OpenAPI-style definitions for REST APIs. For gRPC APIs, define the service and method specifications.
### Endpoint Catalog
| Method | Path | Description | PRD Requirement |
|--------|------|-------------|-----------------|
| ... | ... | ... | ... |
### Endpoint Details
For each endpoint:
- Method and path
- Request schema (headers, path params, query params, body)
- Response schema (success and error responses)
- Status codes
- Authentication requirements
- Idempotency requirements (when applicable)
- Rate limiting expectations (when applicable)
- Pagination and filtering (when applicable)
- PRD functional requirement it satisfies
### Error Codes
Define consistent error codes and error response format.
## Async / Queue Design
Define asynchronous operations and their behavior. If the feature has no asynchronous requirements, write `N/A` with a brief reason.
### Async Operations
For each async operation:
- Operation name and trigger
- Queue or event topic
- Producer and consumer
- Retry policy (max retries, backoff, DLQ)
- Ordering guarantees
- Timeout and cancellation behavior
## Consistency Model
Define the consistency guarantees of the system.
### Consistency Strategy
- Strong vs eventual consistency per data domain
- When eventual consistency is acceptable and why
- Conflict resolution strategies
### Idempotency Design
For each idempotent operation:
- Operation name
- Idempotency key source and format
- Key TTL and storage location
- Duplicate request behavior
- Collision handling
### Deduplication & Retry
- Deduplication strategy for messages and events
- Retry policies and backoff strategies
- Outbox pattern usage (when applicable)
- Saga / compensation patterns (when applicable)
If the feature has no consistency or idempotency requirements, write `N/A` with a brief reason.
## Error Model
Define error handling strategy across the system.
### Error Categories
- Client errors (4xx)
- Server errors (5xx)
- Business rule violations
- Timeout errors
- Cascading failure modes
### Error Propagation Strategy
- Fail-fast vs graceful degradation vs circuit breaker
- Fallback behavior
### Error Response Format
Define a consistent error response schema used across the system.
### PRD Edge Case Mapping
| Error Category | PRD Edge Case | Handling Strategy |
|---------------|---------------|-------------------|
| ... | ... | ... |
## Security Boundaries
Define security architecture for the system.
- Authentication mechanism
- Authorization model (RBAC, ABAC, etc.)
- Service identity and service-to-service auth
- Token propagation strategy
- Tenant isolation (multi-tenancy model)
- Secret management approach
- Audit logging requirements
If the feature has no security implications, write `N/A` with a brief reason.
## Integration Boundaries
Define all integrations with external systems.
For each external system integration:
- External system name and purpose
- Integration pattern (API call, webhook, polling, event subscription)
- Rate limits and quotas
- Failure modes and fallback behavior
- Retry strategy
- Data contract (request/response schemas)
- Authentication mechanism
If the feature has no external integrations, write `N/A` with a brief reason.
## Observability
Define observability strategy for the system.
### Logs
- Log levels and what to log
- Structured logging format
- Log aggregation strategy
### Metrics
- Key business metrics
- Key system metrics
- Metric naming conventions
### Traces
- Distributed tracing strategy
- Correlation ID propagation
- Span boundaries
### Alerts
- Alert conditions and thresholds
- Alert routing and escalation
### SLOs
- Availability SLOs
- Latency SLOs
- Error budget
## Scaling Strategy
Define how the system scales based on NFRs.
- Horizontal scaling approach (which components scale independently)
- Vertical scaling considerations
- Database scaling strategy (read replicas, sharding, partitioning)
- Cache scaling strategy
- Queue scaling strategy
- Auto-scaling policies (when applicable)
- Bottleneck analysis
## Non-Functional Requirements
Document all NFRs from the PRD and how the architecture addresses each one.
| NFR | Requirement | Architectural Decision | Verification Method |
|-----|-------------|----------------------|---------------------|
| Performance | ... | ... | ... |
| Availability | ... | ... | ... |
| Scalability | ... | ... | ... |
| Security | ... | ... | ... |
| Compliance | ... | ... | ... |
## Mermaid Diagrams
Produce at least the following diagrams, embedded in the document.
### System Architecture Diagram
```mermaid
graph TD
    A[Component A] --> B[Component B]
    B --> C[Database]
    B --> D[Queue]
```
### Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant Service
    participant DB
    Client->>Service: Request
    Service->>DB: Query
    DB-->>Service: Result
    Service-->>Client: Response
```
### Data Flow Diagram
```mermaid
graph LR
    A[Source] --> B[Processing]
    B --> C[Storage]
    B --> D[Output]
```
Add further diagrams as needed (event flow, state machine, etc.).
## ADR
Document significant architectural decisions.
### ADR-001: {Decision Title}
- **Context**: Why this decision was needed, including which PRD requirements drove it
- **Decision**: What was decided
- **Consequences**: What trade-offs or implications result
- **Alternatives**: What other options were considered
(Add additional ADRs as needed for each significant decision.)
## Risks
Identify and document architectural risks:
| Risk | Impact | Likelihood | Mitigation |
|------|--------|-----------|------------|
| ... | High/Medium/Low | High/Medium/Low | ... |
## Open Questions
List any unresolved questions that need PM or Engineering input:
1. ...
2. ...
```
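The template defines 18 top-level (`##`) sections. As a minimal sketch, assuming the finished document keeps the template's heading text unchanged, a script could report which required sections are missing. The function name `missing_sections` is hypothetical, and headings that appear inside fenced blocks are deliberately ignored.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built without a literal fence

# The 18 top-level sections of the template, in template order.
REQUIRED_SECTIONS = [
    "Overview", "System Architecture", "Service Boundaries", "Data Flow",
    "Database Schema", "API Contract", "Async / Queue Design",
    "Consistency Model", "Error Model", "Security Boundaries",
    "Integration Boundaries", "Observability", "Scaling Strategy",
    "Non-Functional Requirements", "Mermaid Diagrams", "ADR", "Risks",
    "Open Questions",
]


def missing_sections(document: str) -> list[str]:
    """Return required level-2 headings that do not appear in the document."""
    found = set()
    in_fence = False
    for line in document.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # toggle on every opening or closing fence
            continue
        if in_fence:
            continue  # skip diagram and code content
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match:
            found.add(match.group(1))
    return [name for name in REQUIRED_SECTIONS if name not in found]
```

An empty result means every required heading is present. It does not show that a section is substantive or correctly marked `N/A`, which still needs review.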
## Completeness Check
Before finalizing the architecture document, verify:
1. All 18 required sections are present (or explicitly marked N/A with reason)
2. Every PRD functional requirement is traced to at least one architectural component
3. Every PRD NFR is traced to at least one architectural decision
4. Every architecture section that is not N/A has substantive content
5. All API endpoints map to PRD functional requirements
6. All DB tables map to data requirements from functional requirements or NFRs
7. All async flows map to PRD requirements
8. All error handling strategies map to PRD edge cases
9. ADRs exist for all significant decisions (minimum 1)
10. At least 3 Mermaid diagrams are present (system, sequence, data flow)
11. Service boundaries are aligned with domain responsibilities
12. Security boundaries are defined
13. Integration boundaries are defined for all external systems
14. Observability strategy covers logs, metrics, and traces
15. Consistency model is explicit about strong vs eventual guarantees
16. No architectural element exists without traceability to a PRD requirement
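Checks 9 and 10 are countable, so they lend themselves to a quick automated pass. The following is a sketch under stated assumptions: diagrams are fenced blocks tagged `mermaid`, ADR headings follow the template's `### ADR-001: {Decision Title}` pattern, and the function names are hypothetical.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built without a literal fence


def count_mermaid_blocks(document: str) -> int:
    """Count fenced blocks whose info string is exactly `mermaid`."""
    pattern = rf"^{FENCE}mermaid[ \t]*$"
    return len(re.findall(pattern, document, flags=re.MULTILINE))


def count_adrs(document: str) -> int:
    """Count headings of the form `### ADR-<number>: <title>`."""
    return len(re.findall(r"^### ADR-\d+:", document, flags=re.MULTILINE))


def check_minimums(document: str) -> list[str]:
    """Return human-readable problems for completeness checks 9 and 10."""
    problems = []
    diagrams = count_mermaid_blocks(document)
    if diagrams < 3:
        problems.append(f"expected at least 3 Mermaid diagrams, found {diagrams}")
    adrs = count_adrs(document)
    if adrs < 1:
        problems.append(f"expected at least 1 ADR, found {adrs}")
    return problems
```

The counts only establish that the minimums are met. Whether the three diagrams are the required system, sequence, and data flow diagrams, and whether every significant decision has an ADR, remains a judgment call.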
## Guardrails
This is a pure Architecture skill.

Do:
- Design system structure and boundaries
- Define API contracts and data models
- Define error handling, retry, and consistency strategies
- Define security boundaries and integration patterns
- Produce Mermaid diagrams, DB schemas, API specs, and ADRs
- Make architectural decisions with clear rationale and alternatives
- Ensure traceability to PRD requirements

Do not:
- Change PRD requirements or scope
- Create task breakdowns, milestones, or deliverables
- Write test cases or test plans
- Write implementation code or pseudocode
- Choose specific libraries or frameworks at the implementation level
- Prescribe code patterns, class structures, or function-level logic
- Produce any file artifact other than `docs/architecture/{feature}.md`

The Architect defines HOW the system is structured.
Engineering defines HOW the code is written.
## Transition
After completing the architecture document, invoke `challenge-architecture` to validate and stress-test the architecture.