Patterns That Worked
Real, repeatable tactics from projects. Each pattern is a short narrative you can copy, adapt, and prove with metrics. Format: Context → Tactic → Steps → Evidence → Risks → Reuse checklist.
Use these as living notes. If your team runs a variation, add a new card with your outcomes.
How to contribute a pattern
- Copy `_templates/field-note.md` into `05-field-notes/` and rename it meaningfully (e.g., `boundary-first-on-pricing.md`).
- Fill in Context / Approach / Trade-offs / Outcome (metrics).
- Add a Reuse checklist and link any artifacts (sanitized).
- Open a PR and tag it with `field-note` and relevant domain tags (e.g., `api`, `payments`, `i18n`).
Evidence ideas: before/after defect rates, % coverage increase, time saved, incident reduction, user-facing metrics (refund accuracy, retried ops success).
Pattern 1 — Boundary-first on high-risk fields
Context Long forms and price/quantity fields were leaking edge-case bugs (off-by-one, illegal chars).
Tactic Start design with Boundary & Equivalence for the 2–3 highest-risk fields before writing flows.
Steps
- Define domain ranges and encodings (e.g., `0–10,000`, integers only).
- Pick the boundary set: `-1, 0, 1, 9,999, 10,000, 10,001`.
- Add cross-field constraint checks (e.g., `discount ≤ subtotal`).
- Turn into cases with explicit expected results.
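The steps above can be sketched as a small table-driven test. `validate_price` and `validate_order` are hypothetical names standing in for your own validation layer; the range and boundary set match the example values above.

```python
# Boundary & equivalence sketch for a price field: range 0–10,000, integers only.
def validate_price(value):
    return isinstance(value, int) and 0 <= value <= 10_000

# The limits themselves plus ±1 around each limit.
BOUNDARY_CASES = [
    (-1, False), (0, True), (1, True),
    (9_999, True), (10_000, True), (10_001, False),
]

def test_price_boundaries():
    for value, expected in BOUNDARY_CASES:
        assert validate_price(value) == expected, value

# Cross-field constraint: discount must not exceed subtotal.
def validate_order(subtotal, discount):
    return (validate_price(subtotal) and validate_price(discount)
            and discount <= subtotal)
```

Keeping the boundary set in a list makes the ±1 coverage visible at review time, which is where single-field-only testing usually gets caught.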
Evidence
- Defect leakage on pricing fields ↓ 42% over two releases.
- Review time per PR ↓ ~20 min (pre-baked cases).
Risks / anti-patterns
- Missing cross-field constraints; only testing single-field bounds.
Reuse checklist
- Numeric & length limits captured
- ±1 around each limit tested
- Cross-field constraints included
- Expected results observable
Pattern 2 — Error taxonomy mapped to UX messages
Context Users saw inconsistent, vague error messages; support tickets spiked after releases.
Tactic Create a canonical error taxonomy and map each error to a user-facing message.
Steps
- Define the taxonomy: `VALIDATION`, `AUTH`, `RATE_LIMIT`, `TRANSIENT`, `CONFLICT`, etc.
- Standardize error codes (e.g., `VALIDATION.amount.invalid`).
- Map each to UX text and a recovery hint.
- Write tests asserting code + message + hint.
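A minimal sketch of a checked-in catalog plus the test that guards it. The codes and copy below are illustrative, not from any real catalog; the test asserts exactly the code + message + hint contract described above.

```python
# Hypothetical error catalog: canonical code -> user-facing message + recovery hint.
ERROR_CATALOG = {
    "VALIDATION.amount.invalid": {
        "message": "Please enter a valid amount.",
        "hint": "Amounts must be whole numbers between 0 and 10,000.",
    },
    "RATE_LIMIT.checkout.exceeded": {
        "message": "Too many attempts. Please wait a moment.",
        "hint": "Try again in about a minute.",
    },
}

TAXONOMY = {"VALIDATION", "AUTH", "RATE_LIMIT", "TRANSIENT", "CONFLICT"}

def test_catalog_contract():
    for code, entry in ERROR_CATALOG.items():
        category = code.split(".")[0]
        assert category in TAXONOMY, code           # code belongs to the taxonomy
        assert entry["message"], code               # message present
        assert entry["hint"], code                  # recovery hint present
```

Checking the catalog into the repo (and asserting on it in CI) is what prevents the backend-code/frontend-copy drift called out under Risks.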
Evidence
- Duplicate support tickets about “failed checkout” ↓ 30%.
- Tests caught 3 mismapped messages pre-release.
Risks
- Drift between backend codes and frontend copy.
Reuse checklist
- Error code catalog checked in
- UX copy reviewed & versioned
- Contract tests assert code + message
Pattern 3 — Idempotency & retries for money-moving
Context Intermittent timeouts caused double charges on retries.
Tactic Enforce idempotency keys on payment/refund POSTs with backoff + jitter retries.
Steps
- Require `Idempotency-Key` for unsafe ops.
- Retry on `429`/`5xx`/timeout with `100 ms, 200 ms, 400 ms (+ jitter)`.
- Assert same key ⇒ same outcome, new key ⇒ new op.
- Add consumer-side deduplication as a safety net.
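A sketch of the retry side, assuming a requests-style `client.post`; `charge_with_retries` and the endpoint path are illustrative. The key property: one idempotency key is minted per logical operation and reused across every retry.

```python
import random
import time
import uuid

RETRYABLE = {429, 500, 502, 503, 504}

def charge_with_retries(client, payload, attempts=3):
    # One key for the whole logical operation, reused on every retry.
    key = str(uuid.uuid4())
    for attempt in range(attempts):
        resp = client.post("/charges", json=payload,
                           headers={"Idempotency-Key": key})
        if resp.status_code not in RETRYABLE:
            return resp
        # Exponential backoff: 100 ms, 200 ms, 400 ms base, plus full jitter.
        delay = 0.1 * (2 ** attempt)
        time.sleep(delay + random.uniform(0, delay))
    return resp
```

The same-key ⇒ same-outcome contract lives on the server; a consumer-side test can still assert that all retries carried an identical key.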
Evidence
- Duplicate-charge incidents dropped to 0 over 3 months.
- Observability logs show matched keys across retries.
Risks
- Forgetting to include idempotency on refunds as well as charges.
Reuse checklist
- Keys required for POST
- Retry policy documented & tested
- Same-key contract tests included
Pattern 4 — CRUD × Roles sweep for authorization edges
Context Role-based bugs kept escaping (e.g., “viewer” could export).
Tactic Run a CRUD grid across roles with a tiny matrix of allowed/denied cases.
Steps
- List resources and operations.
- For each role, mark allow/deny and rationale.
- Generate tests from the matrix (positive + negative).
- Assert HTTP code + UX notice + audit trail.
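The matrix-to-tests step can be as small as this sketch. The roles, resource, and `client.act` interface are hypothetical; the point is that every cell, positive and negative, becomes a check, including the export route that kept escaping.

```python
# Hypothetical (role, resource, operation) -> allowed matrix.
MATRIX = {
    ("viewer", "report", "read"):   True,
    ("viewer", "report", "export"): False,  # the edge that kept escaping
    ("editor", "report", "read"):   True,
    ("editor", "report", "export"): True,
    ("viewer", "report", "delete"): False,
    ("admin",  "report", "delete"): True,
}

def expected_status(allowed):
    return 200 if allowed else 403

def run_matrix(client):
    """One positive or negative check per matrix cell; returns failing cells."""
    failures = []
    for (role, resource, op), allowed in MATRIX.items():
        status = client.act(role=role, resource=resource, op=op)
        if status != expected_status(allowed):
            failures.append((role, resource, op))
    return failures
```

In a real suite each cell would also assert the UX notice and audit-trail entry; the grid structure stays the same.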
Evidence
- AuthZ defects in production ↓ 60%.
- Faster onboarding of new features (copy an existing matrix).
Risks
- Overlooking indirect routes (bulk actions, exports, API tokens).
Reuse checklist
- Matrix covers CRUD + exports/imports
- Negative tests per role
- Audit log checks added
Pattern 5 — Data variation grid for i18n & special chars
Context Intermittent failures on emoji/Chinese inputs; inconsistent normalization.
Tactic Use a data grid with variations for charset, length, locale, normalization.
Steps
- Build a table: ASCII / Latin-1 / UTF-8 (emoji), mixed scripts.
- Vary lengths: min / typical / max / overflow.
- Include normalization (NFC/NFD) and trimming cases.
- Assert storage, retrieval, and rendering invariants.
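A sketch of the grid as nested loops. `round_trip` stands in for your store-then-fetch path, and the length limits (1 / 50 / 255) and the NFC-storage invariant are assumptions for illustration, not values from the source projects.

```python
import unicodedata

# Charset variations: ASCII, Latin accents, CJK, emoji, mixed scripts.
SAMPLES = ["ascii", "café", "你好世界", "🙂🙂", "mixed café 你好 🙂"]
LENGTHS = [1, 50, 255]  # min / typical / max (assumed limits)

def check_round_trip(round_trip):
    for sample in SAMPLES:
        for n in LENGTHS:
            value = (sample * n)[:n]  # repeat then truncate to target length
            # Exercise both composed (NFC) and decomposed (NFD) input forms.
            for variant in (value, unicodedata.normalize("NFD", value)):
                stored = round_trip(variant)
                # Invariant: storage normalizes to NFC.
                assert stored == unicodedata.normalize("NFC", variant), variant
```

The same grid should also be run against exports and email templates, per the risk note below.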
Evidence
- i18n data-related incidents ↓ 70%.
- Found 2 DB collation misconfigs pre-release.
Risks
- Only testing input; forgetting export/CSV & email templates.
Reuse checklist
- Charset mix covered
- Length & normalization
- End-to-end rendering/export verified
Pattern 6 — Cursor pagination with stability invariants
Context Users saw missing/duplicated items during rapid updates.
Tactic Adopt cursor-based pagination with stable sort and test invariants.
Steps
- Define sort keys and a stability rule (e.g., `created_at DESC, id DESC`).
- Simulate inserts while paging.
- Assert no duplicates, no skips across pages.
- Verify next/prev cursor semantics.
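The insert-while-paging invariant can be modeled in memory before touching the database. This sketch uses an in-memory `page()` with a `(created_at, id)` descending cursor; it is an assumed model of the technique, not a specific project's implementation.

```python
def page(rows, cursor, size=3):
    """Return one page of rows ordered by (created_at, id) DESC, plus the next cursor."""
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]), reverse=True)
    if cursor is not None:
        # Cursor = last key seen; resume strictly after it in sort order.
        ordered = [r for r in ordered if (r["created_at"], r["id"]) < cursor]
    batch = ordered[:size]
    next_cursor = (batch[-1]["created_at"], batch[-1]["id"]) if batch else None
    return batch, next_cursor

def test_no_dupes_or_skips_under_inserts():
    rows = [{"created_at": t, "id": i} for i, t in enumerate([5, 5, 4, 3, 2, 1])]
    original_ids = {r["id"] for r in rows}
    seen, cursor = [], None
    while True:
        batch, cursor = page(rows, cursor)
        seen += [r["id"] for r in batch]
        # Simulate a newer row landing mid-pagination.
        rows.append({"created_at": 6, "id": 100 + len(rows)})
        if cursor is None:
            break
    pre_existing = [i for i in seen if i in original_ids]
    assert len(pre_existing) == len(set(pre_existing))  # no duplicates
    assert set(pre_existing) == original_ids            # no skips
```

Note the `id` tiebreaker in the sort key: dropping it reintroduces exactly the non-unique-key risk listed below.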
Evidence
- “Missing items” reports ↓ 95% after enforcing cursor rules.
- Found a race on secondary index before GA.
Risks
- Sorting on non-unique keys without tiebreaker.
Reuse checklist
- Stable (tie-broken) sort
- Insert-while-paging test
- Duplicate/skip assertions
Pattern 7 — Observability contracts for verifiable tests
Context Tests flaked because outcomes weren’t externally observable.
Tactic Define log/metric/tracing keys as part of the requirement (observability contract).
Steps
- Choose structured log keys (e.g., `order_id`, `idempotency_key`, `status`).
- Emit a metric for the critical path (e.g., `refund.success`).
- Add trace/span names for each step.
- Tests assert evidence artifacts (log line or metric delta).
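A sketch of asserting on an evidence artifact instead of sleeping. The `refund` function and field names follow the contract keys assumed above; the recording logger stands in for whatever log capture your test framework provides.

```python
import json

def refund(order_id, logger):
    # ... perform the refund ...
    # Observability contract: emit a structured line with the agreed keys.
    logger.info(json.dumps({"event": "refund",
                            "order_id": order_id,
                            "status": "success"}))

class RecordingLogger:
    """Minimal stand-in for captured structured logs."""
    def __init__(self):
        self.lines = []
    def info(self, msg):
        self.lines.append(msg)

def test_refund_emits_contracted_evidence():
    log = RecordingLogger()
    refund("ord_123", log)
    events = [json.loads(line) for line in log.lines]
    evidence = [e for e in events if e.get("event") == "refund"]
    assert evidence, "no refund evidence emitted"
    assert evidence[0]["order_id"] == "ord_123"
    assert evidence[0]["status"] == "success"
```

Because the test waits on a concrete artifact rather than a timer, it is deterministic — which is where the runtime savings under Evidence come from.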
Evidence
- Flaky checks stabilized; test runtime ↓ 18% by removing sleeps.
- Faster incident triage with correlation IDs.
Risks
- Logging PII or secrets.
Reuse checklist
- Structured logs (no secrets)
- Metrics & traces defined
- Evidence captured in tests
Pattern 8 — p95 budgets as acceptance criteria
Context Perf regressions slipped in small PRs.
Tactic Set p95 budgets per scenario and gate with a lightweight check.
Steps
- Define budgets (e.g., `p95 ≤ 500 ms` for search).
- Seed a synthetic or micro-benchmark for the path.
- Fail the gate if p95 breaches by > X%.
- Track trend over time.
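A lightweight gate can be a few lines in CI. This sketch uses the nearest-rank percentile; the 500 ms budget and 10% tolerance are illustrative, and the tolerance implements the "breach by > X%" rule so noisy infra does not fail the gate on tiny deltas.

```python
import math

def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(samples_ms)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]

def p95_gate(samples_ms, budget_ms=500.0, tolerance=0.10):
    """Pass unless p95 exceeds the budget by more than `tolerance` (relative)."""
    observed = p95(samples_ms)
    return observed <= budget_ms * (1 + tolerance), observed
```

Logging `observed` on every run gives you the trend line for free.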
Evidence
- p95 spikes caught pre-merge 4× in one quarter.
- Smoother release cadence; fewer “hotfix Friday” events.
Risks
- Measuring on noisy infra; use relative deltas or fixed rigs.
Reuse checklist
- Budget documented
- Synthetic or micro-bench in CI
- Trend visualized
Submit yours
- Keep stories short and specific.
- Include at least one metric.
- Offer a reuse checklist.
- Sanitize data and remove confidential details.
Use `_templates/field-note.md` or `_templates/case-study.md`. Thank you for helping others ship better tests.