opencode-workflow/skills/security-boundary-design/SKILL.md

---
name: security-boundary-design
description: "Knowledge contract for security boundary design. Provides principles and patterns for authentication, authorization, service identity, token propagation, tenant isolation, secret management, and audit logging. Referenced by design-architecture when defining security boundaries."
---

This is a knowledge contract, not a workflow skill. It provides theoretical guidance that the Architect references when designing security boundaries. It does not produce artifacts directly.

## Core Principles

### Defense in Depth
- Never rely on a single security boundary
- Apply security at every layer: network, service, data, application
- Assume breach: design so that compromising one layer does not compromise the others

### Least Privilege
- Services and users should have the minimum permissions required
- Default deny: start with no access, grant explicitly
- Give every credential an expiry and rotate credentials on a defined schedule

### Zero Trust
- Don't trust internal network traffic by default
- Authenticate and authorize every service-to-service call
- Encrypt data in transit, even within the internal network

## Authentication

### Patterns
- **Token-based authentication**: JWT, OAuth2 tokens
- **API key authentication**: For service-to-service and public APIs
- **Certificate-based authentication**: mTLS for internal service communication
- **Session-based authentication**: For web applications with stateful sessions

### Design Considerations
- Define where authentication happens (edge gateway, service level, or both)
- Define token format, issuer, audience, and expiration
- Define token refresh and revocation strategy
- Define credential rotation strategy
- Consider token size impact on request headers
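As a rough illustration of the issuer, audience, and expiration checks listed above, the following sketch validates a simplified HMAC-signed token using only the Python standard library. The token layout, `SIGNING_KEY`, `EXPECTED_ISSUER`, and `EXPECTED_AUDIENCE` are assumptions made for this example; this is not a JWT implementation, and a real design would normally use a vetted JWT or OAuth 2.0 library.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical values, for illustration only.
SIGNING_KEY = b"demo-only-key"
EXPECTED_ISSUER = "https://auth.example.com"
EXPECTED_AUDIENCE = "orders-api"


def _sign(payload: str) -> str:
    """Return a URL-safe base64 HMAC-SHA256 signature of the payload."""
    digest = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def issue_token(subject: str, ttl_seconds: int, now=None) -> str:
    """Build a '<payload>.<signature>' token carrying iss, aud, sub, and exp."""
    now = time.time() if now is None else now
    claims = {
        "iss": EXPECTED_ISSUER,
        "aud": EXPECTED_AUDIENCE,
        "sub": subject,
        "exp": now + ttl_seconds,
    }
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    return f"{payload}.{_sign(payload)}"


def validate_token(token: str, now=None) -> dict:
    """Check signature first, then issuer, audience, and expiration."""
    now = time.time() if now is None else now
    payload, separator, signature = token.partition(".")
    if not separator:
        raise PermissionError("malformed token")
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        raise PermissionError("invalid signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims.get("iss") != EXPECTED_ISSUER:
        raise PermissionError("unexpected issuer")
    if claims.get("aud") != EXPECTED_AUDIENCE:
        raise PermissionError("unexpected audience")
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    return claims
```

Verifying the signature before parsing any claims keeps untrusted input away from the claim checks. Revocation and refresh are deliberately left out of this sketch.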

## Authorization

### Patterns
- **RBAC (Role-Based Access Control)**: Assign permissions to roles, assign roles to users
- **ABAC (Attribute-Based Access Control)**: Assign permissions based on attributes (user, resource, environment)
- **ACL (Access Control List)**: Explicit list of who can access what
- **ReBAC (Relationship-Based Access Control)**: Permissions based on relationships between entities

### Design Considerations
- Choose the simplest model that meets PRD requirements
- Define permission granularity: coarse-grained (role-level) vs fine-grained (resource-level)
- Define where authorization is enforced (gateway, service, or both)
- Define how permissions are stored and cached
- Consider multi-tenant authorization: can users in one tenant access resources in another?
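A minimal sketch of an RBAC check with default deny and a tenant boundary, to make the considerations above concrete. The role names, permission strings, and the `Principal` type are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping; anything not listed is denied.
ROLE_PERMISSIONS = {
    "viewer": {"invoice:read"},
    "editor": {"invoice:read", "invoice:write"},
}


@dataclass(frozen=True)
class Principal:
    user_id: str
    tenant_id: str
    roles: frozenset


def is_allowed(principal: Principal, permission: str, resource_tenant_id: str) -> bool:
    """Return True only if a role grants the permission within the same tenant."""
    # Tenant boundary first: no role can grant cross-tenant access here.
    if principal.tenant_id != resource_tenant_id:
        return False
    # Default deny: unknown roles map to an empty permission set.
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in principal.roles
    )
```

For example, a principal with the `viewer` role in tenant `acme` can read invoices in `acme` but cannot write them, and cannot read invoices that belong to any other tenant.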

## Service Identity

### Patterns
- **Service accounts**: Each service has its own identity with specific permissions
- **Workload identity**: Identity tied to the deployment (Kubernetes service accounts, cloud IAM roles)
- **Service mesh identity**: Identity managed by the service mesh (Istio, Linkerd)

### Design Considerations
- Each service should have its own identity (no shared credentials)
- Service identity should be short-lived and automatically rotated
- Service identity should be bound to the deployment environment
- Service identity permissions should follow least privilege

## Token Propagation

### Patterns
- **Pass-through**: Gateway validates the external token and forwards it unchanged to downstream services
- **Token exchange**: Gateway validates the external token and issues a separate internal token
- **Token relay**: Each service forwards the token it received to the services it calls
- **Impersonation**: A service calls downstream services on behalf of the user

### Design Considerations
- Define token format for internal vs external communication
- Define token lifecycle: creation, validation, refresh, revocation
- Consider token size when propagating through multiple hops
- Consider what context to propagate (user identity, tenant, permissions, correlation ID)
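The token exchange pattern and the propagated context described above could be sketched at the claims level as follows. This is an assumption-laden illustration: `INTERNAL_ISSUER`, the claim names (`tenant_id`, `scope`, `correlation_id`), and the 60-second default lifetime are hypothetical, and signing the resulting internal token is omitted.

```python
import uuid

# Hypothetical internal issuer name, for illustration only.
INTERNAL_ISSUER = "internal-gateway"


def exchange_claims(external_claims, downstream_audience, allowed_scopes,
                    now, ttl_seconds=60, correlation_id=None):
    """Derive short-lived internal claims from already-validated external claims."""
    requested = set(external_claims.get("scope", "").split())
    # Narrow scopes to what the downstream service is allowed to receive.
    granted = sorted(requested & set(allowed_scopes))
    return {
        "iss": INTERNAL_ISSUER,
        "aud": downstream_audience,
        "sub": external_claims["sub"],
        "tenant_id": external_claims["tenant_id"],
        "scope": " ".join(granted),
        "exp": now + ttl_seconds,
        # Reuse the caller's correlation ID so the request can be traced across hops.
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }
```

Issuing a narrower, shorter-lived internal token keeps the external token from travelling through every hop and limits what a compromised downstream service can replay.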

## Tenant Isolation

### Patterns
- **Database-level isolation**: Separate database per tenant
- **Schema-level isolation**: Separate schema per tenant, shared database
- **Row-level isolation**: Shared schema, tenant_id column with enforcement
- **Application-level isolation**: Shared infrastructure, application enforces isolation

### Design Considerations
- Choose isolation level based on PRD requirements (compliance, performance, cost)
- Row-level isolation is the simplest option, but it depends on every query filtering by tenant
- Database-level isolation provides the strongest isolation, but at the highest cost
- Define how tenant context is resolved (subdomain, header, token claim)
- Define how tenant isolation is enforced (middleware, query filter, database policy)
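One way to reduce the risk of a forgotten tenant filter in row-level isolation is to route all data access through a tenant-scoped object. The sketch below uses an in-memory SQLite database; the `invoices` table, its columns, and the `TenantScopedInvoices` class are hypothetical names for illustration.

```python
import sqlite3


class TenantScopedInvoices:
    """Data access object that applies the tenant filter to every query."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def list_all(self):
        # The tenant filter lives here, not in the calling code.
        return self._conn.execute(
            "SELECT id, amount FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        ).fetchall()

    def get(self, invoice_id: int):
        # Returns None when the row belongs to a different tenant.
        return self._conn.execute(
            "SELECT id, amount FROM invoices WHERE tenant_id = ? AND id = ?",
            (self._tenant_id, invoice_id),
        ).fetchone()


def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with rows for two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO invoices (id, tenant_id, amount) VALUES (?, ?, ?)",
        [(1, "acme", 100), (2, "globex", 250), (3, "acme", 75)],
    )
    return conn
```

A database policy (for example, row-level security where the database supports it) can back this up so that the application layer is not the only enforcement point.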

## Secret Management

### Patterns
- **Environment variables**: Simple, but rotating a value usually requires restarting or redeploying the service
- **Secret management service**: HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager
- **Platform-native secrets**: Kubernetes Secrets, cloud IAM role-based access
- **Configuration service**: Centralized configuration with encryption at rest

### Design Considerations
- Secrets must never appear in source code, in version-controlled configuration files, or in logs
- Define secret rotation strategy for each type of secret
- Define how services access secrets (sidecar, SDK, environment injection)
- Define audit trail for secret access
- Consider secret hierarchies (global, per-environment, per-service)
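For the environment-injection access path, a small sketch can show two of the points above: failing fast when a secret is missing, and keeping the value out of logs. The `Secret` wrapper, `load_secret`, and the `DB_PASSWORD` name are hypothetical; an SDK or sidecar for a secret management service would replace the lookup in a real design.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been injected."""


class Secret:
    """Wrapper that keeps the raw value out of repr() and str() output."""

    def __init__(self, value: str):
        self._value = value

    def reveal(self) -> str:
        # Call only at the point of use, e.g. when opening a connection.
        return self._value

    def __repr__(self) -> str:
        return "Secret(****)"

    __str__ = __repr__


def load_secret(name: str, env=None) -> Secret:
    """Read a secret injected by the platform; fail fast if it is absent."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        # The error names the secret but never includes a value.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return Secret(value)
```

Usage might look like `password = load_secret("DB_PASSWORD")` at startup, with `password.reveal()` called only where the value is actually needed.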

## Audit Logging

### Design Considerations
- Log all authentication and authorization events (success and failure)
- Log all data modification operations (who, what, when, from where)
- Log all administrative actions
- Define log retention period based on compliance requirements
- Define log format: structured JSON with consistent fields
- Audit logs must be tamper-evident or append-only where compliance requires it
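One common way to make a log tamper-evident is a hash chain, where each entry commits to the hash of the previous one. The sketch below is a minimal in-memory illustration; the entry layout and the event field names (`actor`, `action`) are assumptions, and a production design would also need durable storage and protection of the latest hash.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_event(log: list, event: dict) -> dict:
    """Append a structured event, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, event),
    }
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Serializing with sorted keys gives each event a stable byte representation, so verification does not depend on dictionary ordering.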

## Anti-Patterns

- **Shared credentials across services**: Each service must have its own identity
- **Hard-coded secrets**: Secrets must be externalized and rotated
- **Overly broad permissions**: Grant least privilege, not convenience privilege
- **Missing authentication for internal services**: Internal traffic must also be authenticated
- **Missing audit logging for sensitive operations**: All auth events and data modifications must be logged
- **Trust based on network location**: Don't assume the internal network is safe