| name | description |
|---|---|
| design-architecture | Design system architecture based on PRD requirements. The Architect pipeline's core step, producing the single strict output file with all deliverables: Architecture Doc, Mermaid Diagrams, API Contract, DB Schema, ADR, NFR, Security Boundaries, Integration Boundaries, Observability, Consistency Model. |
This skill produces the complete architecture document for a feature, including all required deliverables.
Announce at start: "I'm using the design-architecture skill to design the system architecture."
## Primary Input
`docs/prd/{feature}.md` (required)
## Primary Output (STRICT PATH)
`docs/architecture/{feature}.md`
This is the only file artifact produced by the Architect pipeline. No intermediate files (research, analysis) are written to disk. All deliverables — diagrams, schemas, specs, ADRs — must be embedded within this single document.
## Hard Gate
Do NOT start this skill if the PRD has unresolved ambiguities that block architectural decisions. Resolve them with the PM first.
## Process
You MUST complete these steps in order:
1. Read the PRD at `docs/prd/{feature}.md` end-to-end to understand all requirements
2. Apply internal analysis from the `analyze-prd` step (if performed) to understand which knowledge domains are relevant
3. Design each architecture section based on PRD requirements and relevant knowledge domains
4. Apply knowledge contracts as needed:
   - `system-decomposition` when designing service boundaries
   - `api-contract-design` when defining API contracts
   - `data-modeling` when designing database schema
   - `distributed-system-basics` when dealing with distributed concerns
   - `architecture-patterns` when selecting architectural patterns
   - `storage-knowledge` when making storage technology decisions
   - `async-queue-design` when designing asynchronous workflows
   - `error-model-design` when defining error handling
   - `security-boundary-design` when defining auth, authorization, tenant isolation
   - `consistency-transaction-design` when defining consistency model, idempotency, saga
   - `integration-boundary-design` when defining external API integration patterns
   - `observability-design` when defining logs, metrics, traces, alerts, SLOs
   - `migration-rollout-design` when defining rollout strategy, feature flags, rollback
5. Apply deliverable skills to produce concrete artifacts:
   - `generate_mermaid_diagram` when producing diagrams
   - `design_database_schema` when producing database schema
   - `generate_openapi_spec` when producing API specifications
   - `write_adr` when documenting architectural decisions
   - `evaluate_tech_stack` when evaluating technology choices
6. Ensure traceability: every architectural decision must trace back to at least one PRD requirement
7. Run the completeness check: verify all 18 required sections are present and substantive
8. Write the architecture document to `docs/architecture/{feature}.md`
## Architect Behavior Principles
Apply these principles in priority order when making design decisions:
- High Availability — Design for fault tolerance and resilience over perfect consistency
- Scalability — Design for horizontal scaling over vertical scaling
- Stateless First — Prefer stateless services; externalize state to databases or caches
- API First — Define contracts before implementation; APIs are the primary interface
- Event Driven First — Prefer event-driven communication for cross-service coordination
- Async First — Prefer asynchronous processing for non-realtime operations
## Architecture Document Template
# Architecture: {Feature Name}
## Overview
High-level description of the system architecture. Map every major PRD requirement to an architectural component. Summarize the system's purpose, key design decisions, and architectural style.
### Requirement Traceability
| PRD Requirement | Architectural Component |
|----------------|------------------------|
| ... | ... |
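For illustration only (not part of the architecture document's output): the traceability table can be treated as a mapping from requirement to components, which makes gaps easy to spot. The sketch below assumes hypothetical requirement IDs such as `FR-1` and a hypothetical helper name `untraced_requirements`; neither comes from the PRD or this skill.

```python
from typing import Dict, List


def untraced_requirements(
    prd_requirements: List[str],
    traceability: Dict[str, List[str]],
) -> List[str]:
    """Return PRD requirement IDs that map to no architectural component."""
    return [req for req in prd_requirements if not traceability.get(req)]


# Hypothetical requirement IDs and component names, for illustration only.
prd = ["FR-1", "FR-2", "NFR-1"]
mapping = {
    "FR-1": ["Order Service"],
    "FR-2": [],                 # listed in the table but mapped to nothing
    "NFR-1": ["Read Replica"],
}

gaps = untraced_requirements(prd, mapping)  # requirements still needing a component
```

An empty result means every listed requirement has at least one component in the table.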
## System Architecture
Describe the complete system architecture including all services, databases, message queues, caches, and external integrations. Show how components are organized, what technology stack each uses, and how they communicate.
### Technology Stack
| Layer | Technology | Justification |
|-------|-----------|---------------|
| Language | ... | ... |
| Framework | ... | ... |
| Database | ... | ... |
| Queue | ... | ... |
| Cache | ... | ... |
| Infrastructure | ... | ... |
If the feature has no backend component, write `N/A` with a brief reason.
### Component Architecture
Describe each major component, its responsibility, and how it fits into the overall system.
## Service Boundaries
Define service boundaries with clear responsibilities and communication patterns.
For each service or module:
- Name and single responsibility
- Owned data
- Communication patterns with other services (sync, async, event-driven)
- Potential coupling points and mitigation
### Communication Matrix
| From | To | Pattern | Protocol | Purpose |
|------|----|---------|----------|---------|
| ... | ... | ... | ... | ... |
## Data Flow
Describe how data moves through the system end-to-end. Include:
- Request lifecycle from entry point to response
- Background job processing flow
- Event propagation flow
- Data transformation and enrichment steps
## Database Schema
Define all database tables, columns, indexes, partition keys, constraints, and relationships. If the feature requires no database changes, write `N/A` with a brief reason.
### Table Definitions
For each table:
- Table name and purpose
- Column definitions (name, type, constraints, defaults)
- Indexes with justification based on query patterns
- Partition keys (where applicable)
- Foreign key relationships
### Entity Relationships
Describe relationships between tables.
### Denormalization Strategy
If denormalization is applied, document which fields are denormalized, why, and the consistency implications.
### Migration Strategy
Notes on migration approach if schema changes affect existing data.
## API Contract
Define all API endpoints with full specifications. Use OpenAPI-style definitions for REST APIs. For gRPC APIs, define the service and method specifications.
### Endpoint Catalog
| Method | Path | Description | PRD Requirement |
|--------|------|-------------|-----------------|
| ... | ... | ... | ... |
### Endpoint Details
For each endpoint:
- Method and path
- Request schema (headers, path params, query params, body)
- Response schema (success and error responses)
- Status codes
- Authentication requirements
- Idempotency requirements (when applicable)
- Rate limiting expectations (when applicable)
- Pagination and filtering (when applicable)
- PRD functional requirement it satisfies
### Error Codes
Define consistent error codes and error response format.
## Async / Queue Design
Define asynchronous operations and their behavior. If the feature has no asynchronous requirements, write `N/A` with a brief reason.
### Async Operations
For each async operation:
- Operation name and trigger
- Queue or event topic
- Producer and consumer
- Retry policy (max retries, backoff, DLQ)
- Ordering guarantees
- Timeout and cancellation behavior
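For illustration only: a retry policy with max retries, backoff, and a DLQ could behave roughly as sketched below. The names `RetryPolicy` and `process_with_retry`, the default values, and the in-memory list standing in for a dead-letter queue are all assumptions made for this sketch, not prescribed choices.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy values; real values come from the PRD's NFRs."""
    max_retries: int = 3
    base_delay_s: float = 1.0
    multiplier: float = 2.0
    max_delay_s: float = 30.0

    def delay_for(self, attempt: int) -> float:
        """Capped exponential delay before retry number `attempt` (1-based)."""
        raw = self.base_delay_s * self.multiplier ** (attempt - 1)
        return min(raw, self.max_delay_s)


def process_with_retry(
    message: Any,
    handler: Callable[[Any], Any],
    policy: RetryPolicy,
    dead_letters: List[Any],
    sleep: Callable[[float], None] = lambda seconds: None,
) -> Any:
    """Run the handler once, retry up to max_retries, then dead-letter the message."""
    for attempt in range(policy.max_retries + 1):
        try:
            return handler(message)
        except Exception:  # broad on purpose in this sketch; narrow it in practice
            if attempt == policy.max_retries:
                dead_letters.append(message)  # retries exhausted: send to the DLQ
                return None
            sleep(policy.delay_for(attempt + 1))
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production consumer would normally rely on the queue's own redelivery delay instead.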
## Consistency Model
Define the consistency guarantees of the system.
### Consistency Strategy
- Strong vs eventual consistency per data domain
- When eventual consistency is acceptable and why
- Conflict resolution strategies
### Idempotency Design
For each idempotent operation:
- Operation name
- Idempotency key source and format
- Key TTL and storage location
- Duplicate request behavior
- Collision handling
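For illustration only: the key, TTL, and duplicate-request bullets above can be pictured with a small in-memory store. `IdempotencyStore` is a hypothetical name, the dictionary stands in for whatever storage location the design selects, and collision handling and concurrency are deliberately left out.

```python
import time
from typing import Any, Callable, Dict, Tuple


class IdempotencyStore:
    """In-memory sketch: replay the stored response while the key is still live."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, response)

    def execute(self, key: str, operation: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # duplicate request within the TTL: no re-execution
        response = operation()  # first request, or the key has expired
        self._entries[key] = (now + self._ttl_s, response)
        return response
```

A second call with the same key inside the TTL returns the first response without running the operation again; after expiry the operation runs again.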
### Deduplication & Retry
- Deduplication strategy for messages and events
- Retry policies and backoff strategies
- Outbox pattern usage (when applicable)
- Saga / compensation patterns (when applicable)
If the feature has no consistency or idempotency requirements, write `N/A` with a brief reason.
## Error Model
Define error handling strategy across the system.
### Error Categories
- Client errors (4xx)
- Server errors (5xx)
- Business rule violations
- Timeout errors
- Cascading failure modes
### Error Propagation Strategy
- Fail-fast vs graceful degradation vs circuit breaker
- Fallback behavior
### Error Response Format
Consistent error response schema across the system.
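For illustration only: one possible shape for a consistent error envelope is sketched below. The field names (`code`, `message`, `correlation_id`, `details`) and the `ORDER_NOT_FOUND` code are assumptions for this sketch; the actual schema is whatever the architecture document defines.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class ErrorResponse:
    """Hypothetical error envelope shared by every endpoint."""
    code: str                 # stable, machine-readable error code
    message: str              # human-readable summary
    correlation_id: str       # ties the error to logs and traces
    details: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Wrap in a top-level "error" key so clients can detect failures uniformly.
        return json.dumps({"error": asdict(self)}, sort_keys=True)


example = ErrorResponse(
    code="ORDER_NOT_FOUND",          # hypothetical business error code
    message="Order does not exist",
    correlation_id="abc-123",
).to_json()
```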
### PRD Edge Case Mapping
| Error Category | PRD Edge Case | Handling Strategy |
|---------------|---------------|-------------------|
| ... | ... | ... |
## Security Boundaries
Define security architecture for the system.
- Authentication mechanism
- Authorization model (RBAC, ABAC, etc.)
- Service identity and service-to-service auth
- Token propagation strategy
- Tenant isolation (multi-tenancy model)
- Secret management approach
- Audit logging requirements
If the feature has no security implications, write `N/A` with a brief reason.
## Integration Boundaries
Define all integrations with external systems.
For each external system integration:
- External system name and purpose
- Integration pattern (API call, webhook, polling, event subscription)
- Rate limits and quotas
- Failure modes and fallback behavior
- Retry strategy
- Data contract (request/response schemas)
- Authentication mechanism
If the feature has no external integrations, write `N/A` with a brief reason.
## Observability
Define observability strategy for the system.
### Logs
- Log levels and what to log
- Structured logging format
- Log aggregation strategy
### Metrics
- Key business metrics
- Key system metrics
- Metric naming conventions
### Traces
- Distributed tracing strategy
- Correlation ID propagation
- Span boundaries
### Alerts
- Alert conditions and thresholds
- Alert routing and escalation
### SLOs
- Availability SLOs
- Latency SLOs
- Error budget
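For illustration only: an availability error budget is the unavailability an SLO allows over a window. The sketch below assumes a 30-day window by default, and `error_budget_minutes` is a hypothetical helper name.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1], e.g. 0.999")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


# A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
budget = round(error_budget_minutes(0.999), 1)
```

Each additional nine shrinks the budget tenfold, so the SLO chosen here directly constrains the alert thresholds and rollout strategy.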
## Scaling Strategy
Define how the system scales based on NFRs.
- Horizontal scaling approach (which components scale independently)
- Vertical scaling considerations
- Database scaling strategy (read replicas, sharding, partitioning)
- Cache scaling strategy
- Queue scaling strategy
- Auto-scaling policies (when applicable)
- Bottleneck analysis
## Non-Functional Requirements
Document all NFRs from the PRD and how the architecture addresses each one.
| NFR | Requirement | Architectural Decision | Verification Method |
|-----|-------------|----------------------|---------------------|
| Performance | ... | ... | ... |
| Availability | ... | ... | ... |
| Scalability | ... | ... | ... |
| Security | ... | ... | ... |
| Compliance | ... | ... | ... |
## Mermaid Diagrams
Produce at minimum the following diagrams embedded in the document.
### System Architecture Diagram
```mermaid
graph TD
    A[Component A] --> B[Component B]
    B --> C[Database]
    B --> D[Queue]
```
### Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant Service
    participant DB
    Client->>Service: Request
    Service->>DB: Query
    DB-->>Service: Result
    Service-->>Client: Response
```
### Data Flow Diagram
```mermaid
graph LR
    A[Source] --> B[Processing]
    B --> C[Storage]
    B --> D[Output]
```
Additional diagrams as needed (event flow, state machine, etc.).
## ADR
Document significant architectural decisions.
### ADR-001: {Decision Title}
- Context: Why this decision was needed, including which PRD requirements drove it
- Decision: What was decided
- Consequences: What trade-offs or implications result
- Alternatives: What other options were considered
(Add additional ADRs as needed for each significant decision.)
## Risks
Identify and document architectural risks:
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| ... | High/Medium/Low | High/Medium/Low | ... |
## Open Questions
List any unresolved questions that need PM or Engineering input:
- ...
- ...
## Completeness Check
Before finalizing the architecture document, verify:
1. All 18 required sections are present (or explicitly marked N/A with reason)
2. Every PRD functional requirement is traced to at least one architectural component
3. Every PRD NFR is traced to at least one architectural decision
4. Every architecture section that is not N/A has substantive content
5. All API endpoints map to PRD functional requirements
6. All DB tables map to data requirements from functional requirements or NFRs
7. All async flows map to PRD requirements
8. All error handling strategies map to PRD edge cases
9. ADRs exist for all significant decisions (minimum 1)
10. At least 3 Mermaid diagrams are present (system, sequence, data flow)
11. Service boundaries are aligned with domain responsibilities
12. Security boundaries are defined
13. Integration boundaries are defined for all external systems
14. Observability strategy covers logs, metrics, and traces
15. Consistency model is explicit about strong vs eventual guarantees
16. No architectural element exists without traceability to a PRD requirement
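Checks 1 and 10 are mechanical and could be automated. The sketch below is one possible approach, not part of the skill: it assumes the 18 required sections are the template's top-level `##` headings, and `check_document` is a hypothetical helper name. The remaining checks need human or model judgment.

```python
import re
from typing import Any, Dict

# Assumption: the 18 required sections are the template's top-level headings.
REQUIRED_SECTIONS = [
    "Overview", "System Architecture", "Service Boundaries", "Data Flow",
    "Database Schema", "API Contract", "Async / Queue Design",
    "Consistency Model", "Error Model", "Security Boundaries",
    "Integration Boundaries", "Observability", "Scaling Strategy",
    "Non-Functional Requirements", "Mermaid Diagrams", "ADR", "Risks",
    "Open Questions",
]

# Built from parts so this sketch does not contain a literal fence line.
MERMAID_OPEN = re.compile(r"^" + "`" * 3 + r"mermaid[ \t]*$", re.MULTILINE)
H2 = re.compile(r"^## (.+)$", re.MULTILINE)


def check_document(markdown: str) -> Dict[str, Any]:
    """Report missing top-level sections and count Mermaid blocks."""
    headings = {match.group(1).strip() for match in H2.finditer(markdown)}
    missing = [name for name in REQUIRED_SECTIONS if name not in headings]
    mermaid_blocks = len(MERMAID_OPEN.findall(markdown))
    return {
        "missing_sections": missing,
        "mermaid_blocks": mermaid_blocks,
        "ok": not missing and mermaid_blocks >= 3,
    }
```

The check only confirms that headings and diagram blocks exist; it says nothing about whether a section is substantive or correctly marked `N/A`.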
## Guardrails
This is a pure Architecture skill.
Do:
- Design system structure and boundaries
- Define API contracts and data models
- Define error handling, retry, and consistency strategies
- Define security boundaries and integration patterns
- Produce Mermaid diagrams, DB schemas, API specs, and ADRs
- Make architectural decisions with clear rationale and alternatives
- Ensure traceability to PRD requirements
Do not:
- Change PRD requirements or scope
- Create task breakdowns, milestones, or deliverables
- Write test cases or test plans
- Write implementation code or pseudocode
- Choose specific libraries or frameworks at the implementation level
- Prescribe code patterns, class structures, or function-level logic
- Produce any file artifact other than `docs/architecture/{feature}.md`
The Architect defines HOW the system is structured.
Engineering defines HOW the code is written.
## Transition
After completing the architecture document, invoke `challenge-architecture` to validate and stress-test the architecture.