opencode-workflow/skills/consistency-transaction-design/SKILL.md


name: consistency-transaction-design
description: Knowledge contract for consistency and transaction design. Provides principles and patterns for strong vs eventual consistency, idempotency, deduplication, retry, the outbox pattern, saga, and compensation. Referenced by design-architecture when defining the consistency model. Subsumes idempotency-design.

This is a knowledge contract, not a workflow skill. It provides theoretical guidance that the Architect references when designing consistency and transaction models. It does not produce artifacts directly.

This knowledge contract subsumes the previous idempotency-design contract. All idempotency concepts are included here alongside broader consistency and transaction patterns.

Core Principles

CAP Theorem

  • Consistency: Every read receives the most recent write or an error
  • Availability: Every request receives a (non-error) response, without guarantee that it contains the most recent write
  • Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network
  • You cannot have all three simultaneously: during a network partition, the system must sacrifice either consistency or availability. Choose based on business requirements.

Consistency Spectrum

  • Strong consistency: Read always returns the latest write. Simplest mental model, but limits availability and scalability.
  • Causal consistency: Reads respect causal ordering. Good for collaborative systems.
  • Eventual consistency: Reads may return stale data, but converge over time. Highest availability and scalability.
  • Session consistency: Reads within a session see their own writes. Good compromise for user-facing systems.

Consistency Model Selection

When to Use Strong Consistency

  • Financial transactions (balances must be accurate)
  • Inventory management (overselling is unacceptable)
  • Unique constraint enforcement (duplicate records are unacceptable)
  • Configuration data (wrong config causes system errors)

When to Use Eventual Consistency

  • Read-heavy workloads with high availability requirements
  • Derived data (counts, aggregates, projections)
  • Notification delivery (delay is acceptable)
  • Analytics data (trend accuracy is sufficient)
  • Search indexes (slight staleness is acceptable)

Design Considerations

  • Define the consistency model per data domain, not per system
  • Document the expected replication lag and its business impact
  • Define conflict resolution strategy for eventual consistency (last-write-wins, merge, manual)
  • Define staleness tolerance per read pattern (how stale is acceptable?)

Idempotency Design

What is Idempotency?

An operation is idempotent if executing it once has the same effect as executing it multiple times.

When Idempotency is Required

  • Any operation triggered by user action (network retries, browser refresh)
  • Any operation triggered by webhook (delivery may be duplicated)
  • Any operation processed from a queue (at-least-once delivery)
  • Any operation that modifies state (creates, updates, deletes)

Idempotency Key Strategy

  • Source: Where does the key come from? (client-generated, server-assigned, composite)
  • Format: UUID, hash of request content, or composite key (user_id + action + timestamp)
  • TTL: How long is the key stored? Must be long enough to catch retries, short enough to avoid storage bloat
  • Storage: Where are idempotency keys stored? (database, Redis, in-memory)

Idempotency Response Behavior

  • First request: Process normally, return success response
  • Duplicate request: Return the original response (stored alongside the idempotency key)
  • Concurrent request: Return 409 Conflict or 425 Too Early (if the original request is still processing)
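The key strategy and response behavior above can be sketched as a minimal in-memory idempotency store. Names such as `IdempotencyStore` and `handle_payment` are illustrative; a real deployment would back the store with Redis or a database table and attach a TTL to each key.

```python
import threading


class IdempotencyStore:
    """Tracks idempotency keys and the response stored alongside each."""

    PROCESSING = object()  # sentinel: original request still in flight

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}  # key -> PROCESSING sentinel or stored response

    def begin(self, key):
        """Claim a key. Returns ('new', None), ('duplicate', response),
        or ('in_progress', None)."""
        with self._lock:
            if key not in self._records:
                self._records[key] = self.PROCESSING
                return ("new", None)
            value = self._records[key]
            if value is self.PROCESSING:
                return ("in_progress", None)  # caller returns 409/425
            return ("duplicate", value)       # replay the original response

    def complete(self, key, response):
        with self._lock:
            self._records[key] = response


store = IdempotencyStore()


def handle_payment(key, amount):
    state, cached = store.begin(key)
    if state == "duplicate":
        return cached  # same response as the first request
    if state == "in_progress":
        return {"status": 409, "error": "request already processing"}
    response = {"status": 201, "charged": amount}  # the real side effect
    store.complete(key, response)
    return response
```

Retrying `handle_payment` with the same key returns the stored first response instead of charging twice.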

Idempotency Collision Handling

  • Different requests with the same key must be detected and rejected
  • Keys must be unique per operation type and per client/tenant scope

Deduplication

Patterns

  • Idempotency key: For request-level deduplication
  • Content hash: For message-level deduplication (hash the message content)
  • Sequence number: For ordered message deduplication (track last processed sequence)
  • Tombstone: Mark processed messages to prevent reprocessing
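A content-hash deduplicator for the pattern above might look like the following sketch. The in-process dict stands in for what would normally be Redis entries with a TTL; the window constant and function names are illustrative.

```python
import hashlib
import json
import time

DEDUP_WINDOW_SECONDS = 3600  # how long processed hashes are remembered
_seen = {}  # content hash -> timestamp when first processed


def message_hash(message: dict) -> str:
    # Canonical JSON so key order does not change the hash.
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def process_once(message: dict) -> bool:
    """Return True if the message was processed, False if skipped as a duplicate."""
    now = time.time()
    # Evict entries that fell outside the deduplication window.
    for h, ts in list(_seen.items()):
        if now - ts > DEDUP_WINDOW_SECONDS:
            del _seen[h]
    h = message_hash(message)
    if h in _seen:
        return False
    _seen[h] = now
    # ... actual message handling goes here ...
    return True
```

Canonicalizing the JSON matters: two deliveries of the same logical message with different key order still hash identically.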

Design Considerations

  • Define deduplication window (how long to track processed messages)
  • Define deduplication scope (per-producer, per-consumer, per-queue)
  • Define storage for deduplication state (Redis with TTL, database table)
  • Define cleanup strategy for deduplication state

Retry

Retry Patterns

  • Fixed interval: Retry at fixed intervals (simple, but may overload recovering service)
  • Exponential backoff: Increase delay between retries (recommended default)
  • Exponential backoff with jitter: Add randomness to prevent thundering herd
  • Circuit breaker: Stop retrying after consecutive failures, try again after cooldown

Design Considerations

  • Define maximum retry count per operation
  • Define backoff strategy (base, max, multiplier)
  • Define retryable vs non-retryable errors
    • Retryable: network timeout, 503, 429
    • Non-retryable: 400, 401, 403, 404, 409
  • Define retry budget (max retries per time window to prevent runaway retries)
  • Define what to do after max retries (DLQ, alert, manual intervention)
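The recommended default (exponential backoff with jitter, a retry cap, and a retryable/non-retryable split) can be sketched as follows. The constants and the `TransientError` type are illustrative, not a prescribed API.

```python
import random
import time

RETRYABLE_STATUSES = {429, 503}  # plus network timeouts in practice
MAX_RETRIES = 5
BASE_DELAY = 0.5   # seconds
MAX_DELAY = 30.0
MULTIPLIER = 2.0


class TransientError(Exception):
    def __init__(self, status):
        self.status = status


def backoff_delay(attempt: int) -> float:
    """Full jitter: uniform delay in [0, min(max, base * multiplier**attempt)]."""
    cap = min(MAX_DELAY, BASE_DELAY * (MULTIPLIER ** attempt))
    return random.uniform(0, cap)


def call_with_retry(operation, sleep=time.sleep):
    for attempt in range(MAX_RETRIES + 1):
        try:
            return operation()
        except TransientError as exc:
            if exc.status not in RETRYABLE_STATUSES or attempt == MAX_RETRIES:
                raise  # non-retryable or budget exhausted: surface to DLQ/alerting
            sleep(backoff_delay(attempt))
```

The jitter spreads simultaneous retries across the whole backoff window, which is what prevents the thundering herd against a recovering service.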

Outbox Pattern

When to Use

  • When you need to atomically write to a database and publish a message
  • When you cannot use a distributed transaction across database and message broker
  • When you need an at-least-once message delivery guarantee

How It Works

  1. Write business data and outbox message to the same database transaction
  2. A separate process reads the outbox table and publishes messages to the broker
  3. Mark outbox messages as published after successful delivery
  4. Failed deliveries are retried by the outbox reader
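These steps can be sketched with SQLite standing in for the business database; the table names and the `publish` callback are illustrative. The key property is that the business row and the outbox row commit in one transaction, so a message exists if and only if the business change does.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT, payload TEXT, published INTEGER DEFAULT 0
    );
""")


def create_order(order_id, total):
    with conn:  # one transaction for both writes
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)",
                     (order_id, total))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"id": order_id, "total": total})),
        )


def relay_outbox(publish):
    """Outbox reader: publish unpublished rows, then mark them.
    A crash between publish and mark re-delivers the message, which is
    why consumers must be idempotent (at-least-once delivery)."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                         (row_id,))
```

In production the reader either polls this table or tails the database change log; polling interval directly sets the floor on delivery latency.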

Design Considerations

  • Outbox table must be in the same database as business data
  • Outbox reader must handle duplicate delivery (consumer must be idempotent)
  • Outbox reader polling interval affects delivery latency
  • Define outbox message TTL and cleanup strategy

Saga Pattern

When to Use

  • When a business operation spans multiple services and requires distributed transaction semantics
  • When you need to roll back if any step fails

Choreography-Based Saga

  • Each service publishes events that trigger the next step
  • No central coordinator
  • Services must listen for events and decide what to do
  • Compensation: each service publishes a compensation event if a step fails

Orchestration-Based Saga

  • A central orchestrator calls each service in sequence
  • Orchestrator maintains saga state and decides which step to execute next
  • Compensation: orchestrator calls compensation operations in reverse order
  • More visible and debuggable, but adds a single point of failure

Design Considerations

  • Define saga steps and order
  • Define compensation for each step (what to do if this step or a later step fails)
  • Define saga timeout and expiration
  • Define how to handle partial failures (which steps completed, which need compensation)
  • Consider whether choreography or orchestration is more appropriate
    • Choreography: simpler, more decoupled, harder to debug
    • Orchestration: more visible, easier to debug, more coupled
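An orchestration-based saga with reverse-order compensation can be sketched as below. Step names, the `SagaFailed` type, and the log format are illustrative; a real orchestrator would also persist saga state so it survives restarts.

```python
class SagaFailed(Exception):
    """Raised after compensations have run for all completed steps."""


def run_saga(steps, log):
    """steps: list of (name, action, compensation) tuples.
    Runs actions in order; on failure, runs compensations for the
    already-completed steps in reverse order, then raises SagaFailed."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            for done_name, comp in reversed(completed):
                comp()
                log.append(f"compensated:{done_name}")
            raise SagaFailed(name)
```

If `charge_payment` fails after `reserve_inventory` succeeded, only the inventory reservation is compensated; the failed step itself has nothing to undo.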

Anti-Patterns

  • Assuming strong consistency when using eventually consistent storage: Be explicit about consistency guarantees
  • Missing idempotency for queue consumers: Queue delivery is at-least-once, consumers must be idempotent
  • Infinite retries without backoff: Always use exponential backoff with a maximum
  • Distributed transactions across services: Use saga pattern instead of trying to enforce ACID across services
  • Outbox without deduplication: Outbox pattern guarantees at-least-once delivery, consumers must handle duplicates
  • Saga without compensation: Every saga step must have a defined compensation action
  • Missing conflict resolution for eventually consistent data: Define how conflicts are resolved when they inevitably occur