Flows & State

Features aren’t just inputs and outputs—they’re journeys through states. Modeling flows (MAE: Main/Alt/Exception) and state machines gives you the scaffolding to write fewer tests with better coverage and clearer oracles.

1) Why flows & state?

Flow thinking (MAE) captures how users/systems move through a task.
State modeling captures what can be true at any moment and what must not be possible.
Together they surface invariants, guards, and observable outcomes, which give tests a deterministic oracle and keep them from becoming flaky.

If you can name the current state and the event that transitions it, you can design a precise, checkable test.

2) Quick workflow

Name the goal and actors (User, System, Scheduler, External Service).
Draft MAE flows (Main / Alt / Exception) as bullet steps.
Extract a state list from the flows (Nouns, not verbs).
Define transitions: FROM --(event/trigger [guard])--> TO {actions}.
Write invariants (“must always hold”) and terminal states.
Add observability: log keys, metrics, traces for each transition.
Pick edge cases: retries, timeouts, cancellations, concurrency.

3) MAE flow skeleton

Main:     User applies discount code → API verify → Applied badge → Total updated
Alt:      Member applies after address step → Verify still valid → Applied
Exception: Expired code → Show message → Keep input → No total change
Exception: Not combinable → Offer remove conflict item

Keep each step an observable action (UI change, API call, log/metric).

4) State modeling: the core table

Represent transitions in a single source of truth table. Tests can be generated from it.

| From | Event/Trigger | Guard/Condition | Action/Side-effects | To | Observables (oracle) |
|------|---------------|-----------------|---------------------|----|----------------------|
| `idle` | `apply(code)` | `len 1..16 && not expired` | mark `pending`, call `/verify` | `pending` | log: `event=apply`, `code_len`, trace `verify` |
| `pending` | `verify.ok` | | set `applied=true`, update totals | `applied` | resp 200; UI badge; total delta |
| `pending` | `verify.fail(expired)` | | keep input; show message | `idle` | message `VALIDATION.code.expired` |
| `pending` | `timeout` + `retry()` | `attempts<N` | backoff & retry | `pending` | metric `retry_count` inc; same correlation id |
| `pending` | `cancel()` | | abort verify | `idle` | log `cancelled=true` |

Add a tiebreaker where ordering matters (e.g., (created_at, id) for lists) and include it in your observables.
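
As a rough illustration of "tests can be generated from it", the discount-code table can be held as plain data and walked to produce one test name per row, plus a simple reachability check. This is a minimal sketch: the `Transition` class and the helpers `case_names` and `unreachable_states` are hypothetical names introduced here, and the observables are abbreviated from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Transition:
    from_state: str
    event: str
    guard: Optional[str]          # human-readable guard; None when unconditional
    to_state: str
    observables: Tuple[str, ...]  # what a test can check (the oracle)


# Rows mirror the discount-code table (observables abbreviated).
TRANSITIONS = [
    Transition("idle", "apply(code)", "len 1..16 && not expired", "pending",
               ("log event=apply", "trace verify")),
    Transition("pending", "verify.ok", None, "applied",
               ("resp 200", "UI badge", "total delta")),
    Transition("pending", "verify.fail(expired)", None, "idle",
               ("message VALIDATION.code.expired",)),
    Transition("pending", "timeout + retry()", "attempts<N", "pending",
               ("metric retry_count inc", "same correlation id")),
    Transition("pending", "cancel()", None, "idle",
               ("log cancelled=true",)),
]


def case_names(transitions):
    """One test per row: given FROM, when EVENT (and the guard holds), then TO."""
    names = []
    for t in transitions:
        guard = f" [{t.guard}]" if t.guard else ""
        names.append(f"{t.from_state} --{t.event}{guard}--> {t.to_state}")
    return names


def unreachable_states(transitions, start="idle"):
    """States that appear in the table but cannot be reached from `start`."""
    reachable, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        for t in transitions:
            if t.from_state == state and t.to_state not in reachable:
                reachable.add(t.to_state)
                frontier.append(t.to_state)
    all_states = {t.from_state for t in transitions} | {t.to_state for t in transitions}
    return all_states - reachable
```

Each generated name maps to one test whose assertions come from that row's observables; a non-empty result from `unreachable_states` usually means a transition is missing from the table.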

5) Draw it (ASCII is fine)

 idle
  | apply(code) [valid]
  v
 pending -- verify.ok --> applied
   |  \
   |   \-- verify.fail(expired) --> idle
   |
   |-- cancel() --> idle
   |
   \-- timeout --> retry (<= N) --> pending

You don’t need UML perfection; you need a model everyone understands.

6) Invariants (make them testable)

Totals never go negative.
An applied code implies a successful verify in logs with the same correlation_id.
Expired code ⇒ no total change and a visible message.
retry_count ≤ N; same idempotency key ⇒ same outcome.

Turn each invariant into an assertion or a monitor.
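
One way to turn the invariants above into assertions is a single checker that returns the names of any violated invariants. This is a sketch under assumptions: the `snapshot` dictionary shape (`state`, `total`, `total_before`, `retry_count`, `correlation_id`, `last_error`) and the log entry shape are hypothetical, chosen only to make the invariants checkable.

```python
def check_invariants(snapshot, logs, max_retries):
    """Return the names of violated invariants (an empty list means all hold).

    `snapshot` and `logs` use a hypothetical shape for this sketch.
    """
    violations = []

    # Totals never go negative.
    if snapshot["total"] < 0:
        violations.append("total_non_negative")

    # An applied code implies a successful verify with the same correlation_id.
    if snapshot["state"] == "applied":
        verified = any(
            entry["event"] == "verify.ok"
            and entry["correlation_id"] == snapshot["correlation_id"]
            for entry in logs
        )
        if not verified:
            violations.append("applied_implies_verify_ok")

    # Expired code implies no total change.
    if snapshot["last_error"] == "expired" and snapshot["total"] != snapshot["total_before"]:
        violations.append("expired_leaves_total_unchanged")

    # retry_count stays within the limit N.
    if snapshot["retry_count"] > max_retries:
        violations.append("retry_count_within_limit")

    return violations
```

The same function can back a test assertion (`check_invariants(...) == []`) or run periodically as a monitor over production snapshots.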

7) Idempotency, retries, and time

State machines shine when dealing with time and repeats:

Idempotency: The same request in the same state and with the same key yields the same result.
Retries: Only retry on retryable outcomes (timeouts, 429, 5xx) and back off (add jitter).
Timeouts: Decide the state change on timeout (stay pending vs fail fast).
Deadlines: Permit or block transitions after a cutoff.

Add these as lines in the transition table with an oracle (log key/metric).
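
The retry rule above (retry only on timeouts, 429, and 5xx, with backoff and jitter) might look roughly like this. It is a minimal sketch, not a production client: `send` is a hypothetical callable that returns either the string `"timeout"` or an HTTP status code, and the base delay and cap are arbitrary illustrative values.

```python
import random
import time


def is_retryable(outcome):
    """Retry only on timeouts, 429, and 5xx responses."""
    if outcome == "timeout":
        return True
    return isinstance(outcome, int) and (outcome == 429 or 500 <= outcome <= 599)


def backoff_delays(max_attempts, base=0.1, cap=5.0, rng=random.random):
    """Exponential backoff with full jitter: delay i is random in [0, min(cap, base * 2**i))."""
    return [rng() * min(cap, base * 2 ** i) for i in range(max_attempts)]


def call_with_retries(send, max_attempts, sleep=time.sleep):
    """Call send() up to max_attempts times; return (final_outcome, attempts_used)."""
    delays = backoff_delays(max_attempts)
    outcome = None
    for attempt in range(1, max_attempts + 1):
        outcome = send()
        if not is_retryable(outcome):
            return outcome, attempt
        if attempt < max_attempts:
            sleep(delays[attempt - 1])  # back off before the next attempt
    return outcome, max_attempts
```

Passing `sleep` as a parameter keeps tests fast: a test can inject a no-op and assert on the attempt count, which is the same number the `retry_count` metric should report.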

8) Concurrency & cancellation

Concurrent apply: Two quick applies should dedupe to the latest code or reject the second.
Cancel mid-verify: UI cancel should roll back to idle with no total change.
Race with cart updates: State transitions should reference a version (cart_version) to avoid lost updates.

Design a negative + recovery path for each concurrency risk.
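
The `cart_version` idea can be illustrated with an optimistic version check: a transition carries the version it read, and is rejected if the cart has changed since. The `Cart` class and `StaleVersion` exception below are hypothetical, and the sketch is single-threaded; a real system would need the compare-and-update to be atomic (for example, a conditional update in the data store).

```python
class StaleVersion(Exception):
    """Raised when a transition was computed against an outdated cart."""


class Cart:
    def __init__(self, total):
        self.total = total
        self.version = 0   # bumped on every mutation
        self.code = None

    def add_item(self, price):
        self.total += price
        self.version += 1

    def apply_discount(self, code, new_total, expected_version):
        """Apply only if the cart has not changed since the caller read it."""
        if expected_version != self.version:
            raise StaleVersion(
                f"expected version {expected_version}, found {self.version}")
        self.code = code
        self.total = new_total
        self.version += 1


# Negative path: the cart changes while a verify is in flight.
cart = Cart(total=100)
seen_version = cart.version
cart.add_item(20)                       # concurrent cart update
try:
    cart.apply_discount("SAVE10", 90, seen_version)
except StaleVersion:
    # Recovery path: re-read the cart and recompute against the new total.
    cart.apply_discount("SAVE10", 108, cart.version)
```

The rejected attempt leaves the total untouched, which gives the negative test a clear oracle; the recovery attempt re-reads the version and succeeds.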

9) Worked example — Refund workflow (money-moving)

States: requested → processing → succeeded | failed | cancelled

| From | Event/Trigger | Guard | Action | To | Observables |
|------|---------------|-------|--------|----|-------------|
| requested | submit(refund, k) | `valid && amount≤charge && auth` | create record; enqueue; log `key=k` | processing | resp 202; log `status=processing` |
| processing | gateway.ok | | persist txn id | succeeded | event `refund.succeeded`; log `txn_id` |
| processing | gateway.fail | retryable? | backoff + retry (max N) | processing | metric `retry_count` |
| processing | timeout | attempts<N | retry with same key k | processing | same `idempotency_key=k` |
| processing | cancel() | not yet terminal | mark cancelled | cancelled | event `refund.cancelled` |
| processing | duplicate submit(k) | | dedupe on `k` | processing | resp 200 w/ existing result (idempotency) |
| processing | gateway.final_fail | | record error | failed | error code; no duplicate payout |

Invariants

Never two payouts for the same charge+amount.
Same idempotency_key returns same refund regardless of retries.
Terminal states are absorbing (no outgoing transitions except view).

Test ideas (a starting set; extend to 8–12 for your feature)

Submit valid refund → processing → succeeded.
Timeout then retry (same key) → one payout, success.
Duplicate submit with same key → returns existing result.
Retryable fails N times → backoff observed; still ≤ N attempts.
Final fail → no payout; error code surfaced.
Cancel mid-processing → cancelled; no payout.
Idempotent verification after success → returns success again.
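
To make several of these test ideas concrete, here is a small in-memory sketch of the refund state machine. It is an illustration under assumptions, not the project's implementation: `RefundService` and its methods are hypothetical, the `payouts` list stands in for the gateway ledger, and the sketch assumes that a timeout or retryable failure arriving once `attempts` has reached N is treated as a final failure (the table does not say).

```python
TERMINAL = {"succeeded", "failed", "cancelled"}


class RefundService:
    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.refunds = {}   # idempotency key -> refund record
        self.payouts = []   # keys that were paid out (stand-in for the gateway ledger)

    def submit(self, key, amount, charge):
        """requested -> processing; a duplicate submit returns the existing record."""
        if key in self.refunds:
            return self.refunds[key]            # dedupe on the idempotency key
        if not 0 < amount <= charge:
            raise ValueError("amount must be in (0, charge]")
        record = {"key": key, "amount": amount, "state": "processing", "attempts": 0}
        self.refunds[key] = record
        return record

    def on_event(self, key, event):
        record = self.refunds[key]
        if record["state"] in TERMINAL:
            return record                       # terminal states are absorbing
        if event == "gateway.ok":
            self.payouts.append(key)
            record["state"] = "succeeded"
        elif event in ("timeout", "gateway.fail"):
            if record["attempts"] < self.max_attempts:
                record["attempts"] += 1         # retry with the same key
            else:
                record["state"] = "failed"      # assumption: retries exhausted
        elif event == "cancel":
            record["state"] = "cancelled"
        elif event == "gateway.final_fail":
            record["state"] = "failed"
        else:
            raise ValueError(f"unknown event: {event}")
        return record
```

A test for "timeout then retry (same key) → one payout" would submit once, send `timeout`, send `gateway.ok`, and assert that `payouts` holds exactly one entry for that key.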

Link to: ../40-api-and-data-contracts/idempotency-and-retries.md and ../70-mini-projects/refund-workflow/*.

10) Observability contract (make transitions visible)

For each transition, specify:

Log keys: event, from_state, to_state, correlation_id, idempotency_key, error_code
Metrics: counters for success/fail/retry; histograms for latency
Traces: span names per step, with attributes for keys above
Evidence: where to fetch (log query, dashboard, trace link)

Put these into your PRD/acceptance criteria so tests can assert them.
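
As one possible shape for the log side of this contract, each transition can emit a single structured line carrying the keys listed above, and a test can check that none are missing. The function names `transition_log` and `missing_keys` are hypothetical, and JSON is an assumed log format.

```python
import json

REQUIRED_KEYS = ("event", "from_state", "to_state",
                 "correlation_id", "idempotency_key", "error_code")


def transition_log(event, from_state, to_state, correlation_id,
                   idempotency_key=None, error_code=None):
    """Serialize one transition as a JSON log line; absent values are logged as null."""
    record = {
        "event": event,
        "from_state": from_state,
        "to_state": to_state,
        "correlation_id": correlation_id,
        "idempotency_key": idempotency_key,
        "error_code": error_code,
    }
    return json.dumps(record, sort_keys=True)


def missing_keys(log_line):
    """Contract check for tests: which required keys are absent from a log line?"""
    record = json.loads(log_line)
    return [key for key in REQUIRED_KEYS if key not in record]
```

Logging absent values as null rather than omitting the key keeps log queries uniform: every transition line has the same fields, so a missing key always signals a contract violation.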

11) Anti-patterns (avoid these)

Verb-based states (verifying, verifies) instead of noun-based (pending).
Hidden transitions via side-effects not captured in the model.
No guards on transitions (e.g., allow cancel after terminal).
Unobservable transitions (no logs/metrics/traces to prove them).
Ambiguous error handling (sometimes fail fast, sometimes retry).

12) Review checklist (quick gate)

MAE flows enumerated (at least one exception & one recovery)
States are nouns, transitions have clear events/guards/actions
Invariants listed and testable
Idempotency/retry behavior defined where relevant
Cancellation & concurrency cases included
Observability contract: logs, metrics, traces
Terminal states are absorbing
Links to cases and checklists are provided

13) Starter table template

Copy this into your feature folder and fill it.

| From | Event | Guard | Action | To | Observables |
|------|-------|-------|--------|----|-------------|
|      |       |       |        |    |             |

Next:

Build MAE flows in ../30-scenario-patterns/main-alt-exception.md
Add edge inputs with ../20-techniques/boundary-and-equivalence.md
Define contracts in ../40-api-and-data-contracts/*
Gate with ../60-checklists/*
