---
name: consistency-transaction-design
description: "Knowledge contract for consistency and transaction design. Provides principles and patterns for strong vs eventual consistency, idempotency, deduplication, retry, outbox pattern, saga, and compensation. Referenced by design-architecture when defining consistency model. Subsumes idempotency-design."
---

This is a knowledge contract, not a workflow skill. It provides theoretical guidance that the Architect references when designing consistency and transaction models. It does not produce artifacts directly.

This knowledge contract subsumes the previous `idempotency-design` contract. All idempotency concepts are included here alongside broader consistency and transaction patterns.

## Core Principles

### CAP Theorem
- **Consistency**: Every read receives the most recent write or an error
- **Availability**: Every request receives a (non-error) response, without guarantee that it contains the most recent write
- **Partition tolerance**: The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network
- You cannot guarantee all three at once. Because network partitions cannot be ruled out in a distributed system, the practical choice is between consistency and availability while a partition lasts. Choose based on business requirements.

### Consistency Spectrum
- **Strong consistency**: Every read returns the latest write. Simplest mental model, but limits availability and scalability.
- **Causal consistency**: Reads respect causal ordering. Good for collaborative systems.
- **Eventual consistency**: Reads may return stale data, but replicas converge over time. Highest availability and scalability.
- **Session consistency**: Reads within a session see that session's own writes. Good compromise for user-facing systems.

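
As an illustration of session consistency, here is a minimal in-memory sketch. The `Replica`, `Session`, `write`, and `read` names are hypothetical and not part of any library; the sketch assumes a single primary whose version counter increases with every write and one replica that may lag behind it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A store with a monotonically increasing version; a replica may lag the primary."""
    version: int = 0
    data: dict = field(default_factory=dict)


@dataclass
class Session:
    """Remembers the highest version this session has written."""
    last_written_version: int = 0


def write(primary: Replica, session: Session, key: str, value: str) -> None:
    # All writes go to the primary; the session records the version it produced.
    primary.version += 1
    primary.data[key] = value
    session.last_written_version = primary.version


def read(primary: Replica, replica: Replica, session: Session, key: str):
    # Serve from the replica only if it has caught up with this session's writes;
    # otherwise fall back to the primary so the session sees its own write.
    caught_up = replica.version >= session.last_written_version
    source = replica if caught_up else primary
    return source.data.get(key)
```

A session that has just written reads from the primary until the replica catches up, while other sessions may still be served stale data from the replica.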
## Consistency Model Selection

### When to Use Strong Consistency
- Financial transactions (balances must be accurate)
- Inventory management (overselling is unacceptable)
- Unique constraint enforcement (duplicate records are unacceptable)
- Configuration data (wrong config causes system errors)

### When to Use Eventual Consistency
- Read-heavy workloads with high availability requirements
- Derived data (counts, aggregates, projections)
- Notification delivery (delay is acceptable)
- Analytics data (trend accuracy is sufficient)
- Search indexes (slight staleness is acceptable)

### Design Considerations
- Define the consistency model per data domain, not per system
- Document the expected replication lag and its business impact
- Define conflict resolution strategy for eventual consistency (last-write-wins, merge, manual)
- Define staleness tolerance per read pattern (how stale is acceptable?)

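
To make the conflict resolution options concrete, the sketch below contrasts last-write-wins with a merge. The `Versioned` type and both function names are hypothetical, and the merge assumes the value is a set (for example, shopping cart items) where a union is an acceptable business outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: frozenset
    timestamp: float
    replica_id: str


def last_write_wins(a: Versioned, b: Versioned) -> Versioned:
    # The higher timestamp wins; replica_id breaks ties so that every replica
    # picks the same winner regardless of argument order.
    return max(a, b, key=lambda v: (v.timestamp, v.replica_id))


def merge_union(a: Versioned, b: Versioned) -> Versioned:
    # Keeps both sides' elements instead of discarding the older write.
    return Versioned(
        value=a.value | b.value,
        timestamp=max(a.timestamp, b.timestamp),
        replica_id=max(a.replica_id, b.replica_id),
    )
```

Last-write-wins silently drops the losing write, so it fits data where the newest value fully replaces the old one. A merge preserves both writes but needs a domain-specific rule for what "combined" means.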
## Idempotency Design

### What is Idempotency?
An operation is idempotent if executing it once has the same effect as executing it multiple times.

### When Idempotency is Required
- Any operation triggered by user action (network retries, browser refresh)
- Any operation triggered by webhook (delivery may be duplicated)
- Any operation processed from a queue (at-least-once delivery)
- Any operation that modifies state (creates, updates, deletes)

### Idempotency Key Strategy
- **Source**: Where does the key come from? (client-generated, server-assigned, composite)
- **Format**: UUID, hash of request content, or composite key (user_id + action + timestamp). Every retry of the same logical request must carry the identical key, so any timestamp component must be fixed at the first attempt rather than regenerated per retry
- **TTL**: How long is the key stored? Must be long enough to catch retries, short enough to avoid storage bloat
- **Storage**: Where are idempotency keys stored? (database, Redis, in-memory)

### Idempotency Response Behavior
- **First request**: Process normally, return success response
- **Duplicate request**: Return the original response (stored alongside the idempotency key)
- **Concurrent request** (the original request is still processing): Return 409 Conflict or 425 Too Early. Note that 425 was standardized for TLS early-data replay protection (RFC 8470), so 409 is the less surprising choice

### Idempotency Collision Handling
- Different requests with the same key must be detected and rejected, for example by storing a hash of the request payload with the key and comparing it on reuse
- Keys must be unique per operation type and per client/tenant scope

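
The response behavior and collision rules above can be sketched as a small in-memory store. Everything here is illustrative: `IdempotencyStore`, `Record`, and `handle` are hypothetical names, the 422 status for a key collision is one possible choice (the contract only says "rejected"), and the plain dict stands in for a real store that would need an atomic insert (such as a unique constraint) and a TTL.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Record:
    fingerprint: str                 # hash of the request payload
    response: Optional[dict] = None  # None while the first request is still running


class IdempotencyStore:
    def __init__(self) -> None:
        # Keys are scoped per tenant so two tenants can reuse the same key value.
        self._records: dict[tuple[str, str], Record] = {}

    def handle(self, tenant: str, key: str, payload: dict,
               process: Callable[[dict], dict]) -> tuple[int, dict]:
        fingerprint = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        scoped = (tenant, key)
        record = self._records.get(scoped)

        if record is None:
            # First request: reserve the key, process, then store the response.
            self._records[scoped] = Record(fingerprint)
            try:
                response = process(payload)
            except Exception:
                del self._records[scoped]  # a failed attempt may be retried
                raise
            self._records[scoped].response = response
            return 200, response

        if record.fingerprint != fingerprint:
            # Collision: same key, different request.
            return 422, {"error": "idempotency key reused with a different payload"}
        if record.response is None:
            # Concurrent request: the original is still in flight.
            return 409, {"error": "original request still in progress"}
        # Duplicate request: replay the stored response without reprocessing.
        return 200, record.response
```

Deleting the record on failure is a design choice that lets clients retry after an error; a system that must not re-execute after partial side effects would store the failure instead.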
## Deduplication

### Patterns
- **Idempotency key**: For request-level deduplication
- **Content hash**: For message-level deduplication (hash the message content)
- **Sequence number**: For ordered message deduplication (track last processed sequence)
- **Tombstone**: Mark processed messages to prevent reprocessing

### Design Considerations
- Define deduplication window (how long to track processed messages)
- Define deduplication scope (per-producer, per-consumer, per-queue)
- Define storage for deduplication state (Redis with TTL, database table)
- Define cleanup strategy for deduplication state

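
Two of the patterns, sequence number and content hash, can be sketched as follows. Both classes are hypothetical and keep state in memory; the content-hash variant assumes a fixed deduplication window and cleans up expired entries on every call, which is one simple cleanup strategy among several.

```python
import hashlib
import time


class SequenceDeduplicator:
    """Per-producer scope: remembers only the last sequence number processed."""

    def __init__(self) -> None:
        self._last_seen: dict[str, int] = {}

    def accept(self, producer_id: str, sequence: int) -> bool:
        # Assumes each producer emits strictly increasing sequence numbers.
        if sequence <= self._last_seen.get(producer_id, -1):
            return False
        self._last_seen[producer_id] = sequence
        return True


class ContentHashDeduplicator:
    """Rejects messages whose content hash was seen within the window."""

    def __init__(self, window_seconds: float, clock=time.monotonic) -> None:
        self._window = window_seconds
        self._clock = clock
        self._seen: dict[str, float] = {}  # digest -> expiry time

    def accept(self, message: bytes) -> bool:
        now = self._clock()
        # Cleanup: forget digests whose window has passed.
        self._seen = {d: exp for d, exp in self._seen.items() if exp > now}
        digest = hashlib.sha256(message).hexdigest()
        if digest in self._seen:
            return False
        self._seen[digest] = now + self._window
        return True
```

The sequence approach needs constant state per producer but only works for ordered streams. The content-hash approach works for unordered messages but treats two legitimately identical messages inside the window as duplicates.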
## Retry

### Retry Patterns
- **Fixed interval**: Retry at fixed intervals (simple, but may overload recovering service)
- **Exponential backoff**: Increase delay between retries (recommended default)
- **Exponential backoff with jitter**: Add randomness to prevent thundering herd
- **Circuit breaker**: Stop retrying after consecutive failures, try again after cooldown

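
The circuit breaker entry can be illustrated with a minimal sketch. `CircuitBreaker` and `CircuitOpenError` are hypothetical names, the threshold and cooldown values are arbitrary, and the sketch is single-threaded; a production breaker would also need locking and metrics.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, cooldown_seconds: float = 30.0,
                 clock=time.monotonic) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._cooldown:
                raise CircuitOpenError("circuit open; call skipped")
            # Cooldown elapsed: fall through and allow one trial call (half-open).
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the consecutive failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of adding load to a service that is already struggling.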
### Design Considerations
- Define maximum retry count per operation
- Define backoff strategy (base, max, multiplier)
- Define retryable vs non-retryable errors
  - Retryable (transient conditions): network timeout, 503, 429; honor a `Retry-After` header when the response includes one
  - Non-retryable (the same request will fail again unchanged): 400, 401, 403, 404, 409
- Define retry budget (max retries per time window to prevent runaway retries)
- Define what to do after max retries (DLQ, alert, manual intervention)

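
A minimal sketch of these considerations, assuming exponential backoff with full jitter. `HttpError`, `backoff_delay`, and `call_with_retry` are hypothetical names, and the default base, multiplier, and cap are placeholder values rather than recommendations.

```python
import random
import time

RETRYABLE_STATUS = {429, 503}


class HttpError(Exception):
    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def backoff_delay(attempt: int, base: float = 0.5, multiplier: float = 2.0,
                  max_delay: float = 30.0, rng=random.random) -> float:
    # Full jitter: a random fraction of the capped exponential ceiling.
    ceiling = min(max_delay, base * multiplier ** attempt)
    return rng() * ceiling


def call_with_retry(operation, max_retries: int = 4, sleep=time.sleep, **backoff):
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except HttpError as err:
            # Non-retryable statuses and exhausted retries propagate immediately.
            if err.status not in RETRYABLE_STATUS or attempt == max_retries:
                raise
        except TimeoutError:
            if attempt == max_retries:
                raise
        sleep(backoff_delay(attempt, **backoff))
```

After the final attempt the exception propagates to the caller, which is where a dead-letter queue, alert, or manual intervention step would hook in.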
## Outbox Pattern

### When to Use
- When you need to atomically write to a database and publish a message
- When you cannot use a distributed transaction across database and message broker
- When you need at-least-once message delivery guarantee

### How It Works
1. Write business data and the outbox message in the same database transaction
2. A separate process (the outbox reader) reads the outbox table and publishes messages to the broker
3. The outbox reader marks messages as published after successful delivery
4. Failed deliveries are retried by the outbox reader

### Design Considerations
- Outbox table must be in the same database as business data
- The outbox reader may publish a message more than once (for example, if it crashes after publishing but before marking the row), so consumers must be idempotent
- Outbox reader polling interval affects delivery latency
- Define outbox message TTL and cleanup strategy

## Saga Pattern

### When to Use
- When a business operation spans multiple services and requires distributed transaction semantics
- When you need to undo already completed steps, through compensation, if a later step fails

### Choreography-Based Saga
- Each service publishes events that trigger the next step
- No central coordinator
- Services must listen for events and decide what to do
- Compensation: each service publishes a compensation event if a step fails

### Orchestration-Based Saga
- A central orchestrator calls each service in sequence
- Orchestrator maintains saga state and decides which step to execute next
- Compensation: orchestrator calls compensation operations in reverse order
- More visible and debuggable, but the orchestrator becomes a single point of failure unless it is made highly available and its saga state is persisted

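
A minimal orchestration sketch, assuming in-process callables stand in for remote service calls. `SagaStep`, `SagaFailed`, and `run_saga` are hypothetical names, and saga state is held in memory only; a real orchestrator would persist it so that compensation can resume after a crash.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]
    compensation: Callable[[], None]


class SagaFailed(Exception):
    def __init__(self, failed_step: str, compensated: list) -> None:
        super().__init__(f"saga failed at step {failed_step!r}")
        self.failed_step = failed_step
        self.compensated = compensated


def run_saga(steps: list) -> list:
    completed = []
    for step in steps:
        try:
            step.action()
        except Exception as err:
            # Undo every completed step, most recent first.
            compensated = []
            for done in reversed(completed):
                done.compensation()
                compensated.append(done.name)
            raise SagaFailed(step.name, compensated) from err
        completed.append(step)
    return [step.name for step in completed]
```

The failed step itself is not compensated here, on the assumption that a failed action left no effect; if a step can fail after a partial effect, its compensation must also run and must be safe to call when nothing was done.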
### Design Considerations
- Define saga steps and order
- Define compensation for each step (what to do if this step or a later step fails)
- Define saga timeout and expiration
- Define how to handle partial failures (which steps completed, which need compensation)
- Consider whether choreography or orchestration is more appropriate
  - Choreography: simpler, more decoupled, harder to debug
  - Orchestration: more visible, easier to debug, more coupled

## Anti-Patterns

- **Assuming strong consistency when using eventually consistent storage**: Be explicit about consistency guarantees
- **Missing idempotency for queue consumers**: Queue delivery is at-least-once, so consumers must be idempotent
- **Infinite retries without backoff**: Always use exponential backoff with a maximum retry count and a capped delay
- **Distributed transactions across services**: Use saga pattern instead of trying to enforce ACID across services
- **Outbox without deduplication**: Outbox pattern guarantees at-least-once delivery, so consumers must handle duplicates
- **Saga without compensation**: Every saga step must have a defined compensation action
- **Missing conflict resolution for eventually consistent data**: Define how conflicts are resolved when they inevitably occur