0050 — Provisioning, Generation, and Lifecycle Hardening (Batch 3)

Status

Accepted

Context

Eight pending decisions in the SK series cover backend correctness and operational hardening for the galaxy generator and region-provisioning surface (SK17–SK24). They share a theme — "the worldgen pipeline and the subscription pipeline both need a hardened state machine, idempotency, and recovery semantics" — but the load-bearing question among them is what happens to a region when its owner stops paying.

The shipped monetization spec (OPERATIONS/monetization.md) covers happy-path activation and a high-level "after 30 days suspended, eligible for termination" line, but not:

  • Whether other players in the region can take over the lapsed subscription.
  • What happens to player-owned ships, planets, stations, and captured holdings inside a terminating region.
  • How to preserve audit trails when sectors cascade-delete.
  • What the player gets back from the operator when their region is wiped.

The user's design intent on the cascade is explicit:

  • Other paying players should be able to assume the subscription before termination fires.
  • If no one does, paying residents above all else should have their belongings preserved as much as possible.
  • Things that can't physically move (planets) should produce in-fiction compensation.
  • Things that can move (ships, stations) should move.
  • Large asset transfers (planet safes) need a banking surface and a transport tax.
  • Relocating a station is a real operator cost, and the player pays for it.

The eight SK decisions themselves are tactical hardening on top of that cascade:

  • SK17 Crash-recovery verification mid-generation
  • SK18 Webhook replay + concurrent provisioning races
  • SK19 Generation-seed retention for reproducibility
  • SK20 Bang generator vs gameserver canonicality
  • SK21 Phase 13 rollback-of-rollback failure handling
  • SK22 Phase 14 attachment failure refund/recovery
  • SK23 Galaxy-cap concurrent compare-and-swap (deferred — see Decision below)
  • SK24 Audit-trail preservation across regenerations

Decision

Region lifecycle and termination cascade

Five-state region lifecycle:

| State | Trigger | What works | What's blocked | Takeover available? |
| --- | --- | --- | --- | --- |
| active | Subscription paid | Everything | Nothing | n/a (region has owner) |
| suspended | Payment failure event | All gameplay continues | New residents from outside cannot join. Owner UI shows payment-recovery prompt. ARIA narrates lapse to current residents. | Yes, opens immediately |
| grace | 7 days suspended, payment unrecovered | All gameplay continues | New construction blocked (no new station builds, no new gate beacons, no region-funded TradeDocks). Visible warning banner. | Yes, still open |
| terminated | 30 days from initial suspension, payment unrecovered | Region marked for cleanup; 7-day final notice fires | All region-content writes blocked | No, takeover window closed |
| generation_corrupt | Phase 13 rollback failure (per SK21) | n/a | All region access blocked; ops alert fires | n/a (handled out-of-band by ops) |

Once a region is terminated, the cleanup cascade runs (asset preservation rules below). Seven days after termination, the Region row and its dependent content hard-delete via CASCADE; audit rows persist via region_id_snapshot.

Total grace period from first payment failure to content deletion: ~44 days.

Region-owner takeover flow

While a region is suspended or grace, any Galactic Citizen subscriber (galaxy-wide, not just region residents) can offer to assume ownership.

POST /api/v1/regions/{id}/takeover
  Authorization: Bearer <player_token>

  Body: {} (empty — the act of POSTing is the offer)

  → Server validates:
      - Region.status ∈ {'suspended', 'grace'}
      - Caller is_galactic_citizen = True
      - Caller has no current Region Owner subscription (one-region-per-owner rule)

  → PayPal flow runs (standard $25/mo Region Owner subscription).

  → On payment success, atomic ownership flip:
      - Region.owner_id = caller.user_id
      - Region.paypal_subscription_id = new_subscription_id
      - Region.status = 'active'
      - Old owner's region-subscription line terminates (no refund — they got their service)
      - Realtime broadcast to residents: "{region_name} is now owned by {new_owner}"

  → If multiple players race, first-to-pay wins. Concurrent requests serialize via the
    Region.id advisory lock (see SK18 below); losers receive ERR_REGION_TAKEN.

The old owner, if they retain a Galactic Citizen subscription, keeps their ships, planets, stations, and credits inside the region as a regular resident. They go from "owner+resident" to "resident-only."
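The server-side checks and the ownership flip above can be sketched as follows. This is a minimal illustration, not the shipped handler: the `Caller.owned_region_id` field and every error code other than `ERR_REGION_TAKEN` are hypothetical names introduced here, and the PayPal step and the advisory lock are assumed to happen around these functions.

```python
from dataclasses import dataclass
from typing import Optional

TAKEOVER_STATES = {"suspended", "grace"}


@dataclass
class Region:
    id: str
    status: str
    owner_id: Optional[str] = None
    paypal_subscription_id: Optional[str] = None


@dataclass
class Caller:
    user_id: str
    is_galactic_citizen: bool
    # Hypothetical field used to model the one-region-per-owner rule.
    owned_region_id: Optional[str] = None


def validate_takeover(region: Region, caller: Caller) -> Optional[str]:
    """Return an error code, or None if the offer may proceed to payment."""
    if region.status == "active":
        return "ERR_REGION_TAKEN"  # from the ADR: the region already has a paying owner
    if region.status not in TAKEOVER_STATES:
        return "ERR_TAKEOVER_CLOSED"  # hypothetical code
    if not caller.is_galactic_citizen:
        return "ERR_NOT_GALACTIC_CITIZEN"  # hypothetical code
    if caller.owned_region_id is not None:
        return "ERR_ALREADY_REGION_OWNER"  # hypothetical code
    return None


def apply_takeover(region: Region, caller: Caller, new_subscription_id: str) -> None:
    """Ownership flip after payment success.

    Assumed to run inside the transaction that holds the Region.id advisory lock,
    so a losing racer re-validates against the already-flipped row.
    """
    error = validate_takeover(region, caller)
    if error is not None:
        raise PermissionError(error)
    region.owner_id = caller.user_id
    region.paypal_subscription_id = new_subscription_id
    region.status = "active"
    caller.owned_region_id = region.id
```

Re-validating inside the locked transaction is what turns "first-to-pay wins" into a simple status check: once the winner's flip commits, the region is active and every later offer sees `ERR_REGION_TAKEN`.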

Asset preservation cascade (when termination fires, no takeover)

The cleanup orchestrator runs immediately on Region.status = terminated. Each player with assets in the region is processed atomically.

| Asset type | Disposition | Cost / compensation |
| --- | --- | --- |
| Piloted ship | Evacuates with the player to Central Nexus Gateway Plaza (deterministic per-player sector hash). | None |
| Drifting / parked ship (player-owned, has hatch pin) | Routed to AbandonedHangar at Central Nexus Starport Prime. Owner claims free on next dock there. Insurance + cargo intact. | None |
| Abandoned ship (ownership relinquished; anyone can claim) | Lost (already operating as ownerless). | None |
| Hangared in a Carrier | Carrier evacuates with hangar contents intact. | None |
| Player-owned station | Relocates as a unit to a destination region the player has access to (or Central Nexus default if none specified). Structure + all upgrades + treasury + cargo + revenue history travel with it. Re-anchors at a destination sector with security_level = basic and tariffs reset to 5% defaults. | 30% of (acquisition cost + sum of upgrade capital costs) as relocation fee; see paths below. |
| Player-owned planet | Lost (planets cannot move). All colonists, infrastructure, accumulated planet treasury → forfeited with the region. | Genesis devices + credits per citadel-level table below. Plus: planet's safe vault contents → Central Nexus Bank with 20% transport loss (or transferred at 100% via prepay). |
| Captured pirate holding | Stations within: relocate per station rule. Planets within: lost per planet rule with safe → Bank transfer. Capture-record + medal preserved on player's history. | Per the underlying entity rules. |
| Player credits (Player.credits) | Survive on the Player row (region-independent). | None needed |
| Active contracts (player-issued, destination in region) | Cancelled. Escrow refunded to issuer minus partial-fulfillment payouts already disbursed. | Acceptor's pro-rata work paid out as if completed. |
| Active contracts (player-accepted, destination in region) | Cancelled. Acceptance fee refunded. No penalty. | None |
| Bounties placed by player (target in region) | Refunded pro-rata. | None |
| Ship-construction-in-progress at TradeDock | Cancelled per AU2-6 + ADR-0039 (resources forfeited; credits refunded per phase schedule). | Standard cancellation rules. |
| Player-built warp gates anchored to/from region | Per SK38 (Batch 5). | TBD |
| Cargo Wrecks, mines, deployed drones | Lost. | None |

Station relocation paths

| Path | Trigger | Outcome |
| --- | --- | --- |
| A. Automatic (default, no action) | Cascade fires | Fee debited from station treasury; treasury after fee + cargo move with the station at 100%. If treasury insufficient, deficit debits from player wallet. If wallet insufficient, station auto-strips upgrades (highest capital cost first) until math covers. If even base structure can't cover, station is lost with credit-compensation fallback (50% of acquisition + 30 days' average revenue paid into the Central Nexus Bank). |
| B. Pre-paid during grace | Player visits station during Suspended/Grace, opens "pre-pay relocation," pays the 30% fee from wallet | relocation_prepaid flag locks. On cascade: station relocates with full treasury + cargo + all upgrades intact. Refunded if takeover happens or player evacuates manually. |
| C. Manual disassembly (📐 design-only edge case, defer to post-launch) | Player explicitly disassembles during grace | Same fee + 7-day disassembly timer (station non-operational during). Useful for picking specific destination sectors. |
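The Path A funding waterfall can be sketched as below. Two readings here are assumptions rather than anything the ADR states: stripping an upgrade shrinks the fee basis (since the fee is 30% of acquisition plus installed upgrade costs) and yields no salvage credits, and the "lost" outcome leaves treasury and wallet untouched because the compensation fallback needs revenue data this sketch does not model. The `Station` class and `settle_path_a` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

FEE_RATE_PCT = 30  # relocation fee rate from the asset table


@dataclass
class Station:
    acquisition_cost: int
    upgrade_costs: List[int]  # capital cost of each installed upgrade
    treasury: int


def relocation_fee(acquisition_cost: int, upgrade_costs: List[int]) -> int:
    """30% of (acquisition cost + sum of upgrade capital costs), in whole credits."""
    return (acquisition_cost + sum(upgrade_costs)) * FEE_RATE_PCT // 100


def settle_path_a(station: Station, wallet: int) -> Tuple[str, int, int, List[int]]:
    """Return (outcome, treasury_after, wallet_after, surviving_upgrade_costs)."""
    upgrades = sorted(station.upgrade_costs, reverse=True)
    while True:
        fee = relocation_fee(station.acquisition_cost, upgrades)
        if station.treasury + wallet >= fee:
            # Treasury pays first; any deficit comes out of the player wallet.
            from_treasury = min(fee, station.treasury)
            from_wallet = fee - from_treasury
            return ("relocated", station.treasury - from_treasury,
                    wallet - from_wallet, upgrades)
        if not upgrades:
            # Even the base structure cannot cover the fee: station is lost.
            # The credit-compensation fallback is intentionally not modeled here.
            return ("lost", station.treasury, wallet, [])
        upgrades.pop(0)  # strip the highest-capital-cost upgrade, then retry
```

For example, a station with acquisition cost 100,000 and upgrades costing 50,000 and 20,000 owes 51,000 with everything installed; with only 40,000 available across treasury and wallet, stripping the 50,000 upgrade drops the fee to 36,000 and the relocation proceeds.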

Planet-safe transport paths

| Path | Trigger | Outcome |
| --- | --- | --- |
| A. Automatic (default) | Cascade fires | Safe contents transfer to player's PlayerCentralBankAccount at Central Nexus Bank. 20% loss applied: both credits and each commodity stack lose 20% (rounded down to whole units). Bank ledger line item: "Cascade transport: -20% (region {old_name} terminated)." |
| B. Pre-paid during grace | Player visits planet during Suspended/Grace, opens safe, clicks "pre-pay transport fee" | Server calculates 20% of safe value at current market prices. Player pays from wallet. transport_prepaid flag locks. On cascade: safe transfers at 100%. Refunded if takeover happens or player manually evacuates. |
| C. Manual evacuation | Player visits planet during Suspended/Grace, withdraws contents to ship cargo | No fee. Standard play. The "be proactive" path. |
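The Path A and Path B arithmetic can be illustrated with a small helper. This is a sketch under one stated assumption: "rounded down to whole units" is read as rounding the lost amount down, so small stacks lose slightly less than 20%. The function name and signature are hypothetical.

```python
from typing import Dict, Tuple

TRANSPORT_LOSS_PCT = 20  # cascade transport loss from the table above


def transfer_safe(credits: int, commodities: Dict[str, int],
                  prepaid: bool = False) -> Tuple[int, Dict[str, int]]:
    """Return (credits, commodities) as they arrive at the Central Nexus Bank."""
    if prepaid:
        # Path B: the fee was paid up front, so the safe transfers at 100%.
        return credits, dict(commodities)

    def after_loss(amount: int) -> int:
        lost = amount * TRANSPORT_LOSS_PCT // 100  # assumption: the loss is rounded down
        return amount - lost

    # Path A: credits and each commodity stack take the loss independently.
    return after_loss(credits), {name: after_loss(qty) for name, qty in commodities.items()}
```

With 1,000 credits, 5,000 ore, and 7 organics in the safe, Path A delivers 800 credits, 4,000 ore, and 6 organics; the prepaid path delivers the safe unchanged.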

Planet-loss compensation (citadel level → reward)

| Citadel level | Genesis devices | Credit compensation |
| --- | --- | --- |
| L1 Outpost | 1 Basic | 50,000 cr |
| L2 Settlement | 1 Basic + 1 Advanced | 250,000 cr |
| L3 Colony | 2 Advanced | 1,000,000 cr |
| L4 Major Colony | 3 Advanced | 5,000,000 cr |
| L5 Planetary Capital | 5 Advanced | 25,000,000 cr |

The Genesis devices give the player a real path to rebuild somewhere else — they can land in the Central Nexus, transport to another region, and seed a new colony with the recovered devices. The credits are a partial-loss buffer, paid into the player's Player.credits wallet (not the Bank).

Central Nexus Bank

A new player-facing surface at any Starport Prime dock in the Central Nexus, operated by the Galactic Concord. Holds credits + commodities indefinitely on behalf of players. Region-independent storage that survives any region-lifecycle event.

Schema (PlayerCentralBankAccount):

| Column | Type | Notes |
| --- | --- | --- |
| player_id | UUID PK / FK Player.id | One account per player |
| credits | Integer, default 0 | Liquid |
| commodities | JSONB, default {} | {ore: 5000, organics: 2000, ...} per the canonical commodity enum |
| created_at, updated_at | DateTime | server defaults |

Operations:

  • Deposit — only by the cascade orchestrator (planet-safe transfer); no normal player-side deposit. The Bank is a one-way recipient from cascade events.
  • Withdraw credits — any Starport Prime dock; instant, free.
  • Withdraw commodities — any Starport Prime dock; 1 turn per 100 units (rounded up), to player's ship cargo. Cannot exceed ship's free cargo space.
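The commodity-withdrawal rule (1 turn per 100 units, rounded up, capped by free cargo space) can be sketched as below. The function names and the choice to raise `ValueError` on an invalid request are illustrative assumptions, not part of the spec.

```python
from typing import Dict, Tuple

UNITS_PER_TURN = 100


def withdrawal_turns(units: int) -> int:
    """Turn cost for a commodity withdrawal: 1 turn per 100 units, rounded up."""
    return (units + UNITS_PER_TURN - 1) // UNITS_PER_TURN


def withdraw_commodity(commodities: Dict[str, int], name: str, units: int,
                       free_cargo: int) -> Tuple[Dict[str, int], int]:
    """Return (remaining bank commodities, turns spent) for a valid withdrawal."""
    if units <= 0:
        raise ValueError("units must be positive")
    if units > commodities.get(name, 0):
        raise ValueError("insufficient bank balance")
    if units > free_cargo:
        raise ValueError("exceeds ship's free cargo space")
    remaining = dict(commodities)  # leave the caller's mapping untouched
    remaining[name] -= units
    return remaining, withdrawal_turns(units)
```

Withdrawing 250 ore into a ship with 300 free cargo units costs 3 turns; withdrawing exactly 100 costs 1.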

The Bank is also where station credit-compensation fallbacks land (when a station can't pay its relocation fee and is lost), and where the ARIA-driven "you have unclaimed assets" prompts surface to players returning from a cascade.

SK17 — Per-phase row-count sentinels

The 14-phase generator commits a row-count sentinel at each phase boundary, stored in Region.generation_phase_checksums (JSONB):

{
  "phase_1": {"row_count": 1, "duration_ms": 12, "completed_at": "..."},
  "phase_2": {"row_count": 850, "duration_ms": 230, "completed_at": "..."},
  ...
}

Each phase's exit logic counts the rows it inserted (sectors, warps, formations, holdings, etc.) and writes the count. On generator restart at phase N+1, the orchestrator re-verifies the count by re-running the phase's count query against the live state. Mismatch → roll back to the last verified phase, retry from there. Hash sentinels are over-engineered for the 14-phase flow; row counts catch the realistic crash patterns.
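The restart check can be sketched as a walk over the recorded sentinels. In this illustration `count_rows` is a hypothetical callable standing in for "re-run the phase's count query against the live state"; the function returns the last phase whose recorded count still matches, which is where the orchestrator would resume from.

```python
from typing import Callable, Dict


def last_verified_phase(checksums: Dict[str, dict],
                        count_rows: Callable[[int], int],
                        total_phases: int = 14) -> int:
    """Return the highest phase N such that phases 1..N all match their sentinel.

    Returns 0 when not even phase 1 verifies (restart from scratch).
    """
    verified = 0
    for phase in range(1, total_phases + 1):
        entry = checksums.get(f"phase_{phase}")
        if entry is None:
            break  # phase never committed a sentinel
        if count_rows(phase) != entry["row_count"]:
            break  # live state disagrees: roll back to the last verified phase
        verified = phase
    return verified
```

If phases 1 and 2 verify but phase 3 recorded 40 rows while only 37 exist, the function returns 2 and generation retries from phase 3.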

SK18 — Webhook idempotency + provisioning lock

Two locks compose:

  • Idempotency key on every webhook event. PayPal includes webhook_event_id; server stores processed IDs in a webhook_event_log table (event_id UNIQUE, received_at, outcome). Replays return 200 OK with the original outcome's metadata.
  • DB-level advisory lock keyed on Region.id for any provisioning operation, including takeover transactions: SELECT pg_advisory_xact_lock(hashtext('region:' || region_id)). Concurrent webhook + admin force + takeover serialize cleanly.
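The idempotency half of this can be illustrated with a runnable sketch. It uses an in-memory SQLite database purely as a stand-in for the real store, so the advisory lock appears only as a comment; `handle_webhook` and the `process` callback are hypothetical names, and the table columns follow the description above.

```python
import sqlite3
from typing import Callable, Tuple


def open_event_log() -> sqlite3.Connection:
    """Create the webhook_event_log table in an in-memory SQLite stand-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhook_event_log ("
        " event_id TEXT NOT NULL UNIQUE,"
        " received_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " outcome TEXT NOT NULL)"
    )
    return conn


def handle_webhook(conn: sqlite3.Connection, event_id: str,
                   process: Callable[[], str]) -> Tuple[int, str, bool]:
    """Return (http_status, outcome, was_replay)."""
    row = conn.execute(
        "SELECT outcome FROM webhook_event_log WHERE event_id = ?", (event_id,)
    ).fetchone()
    if row is not None:
        return 200, row[0], True  # replay: answer with the original outcome

    # On PostgreSQL, the provisioning work would first take the per-region lock:
    #   SELECT pg_advisory_xact_lock(hashtext('region:' || region_id))
    outcome = process()
    # The UNIQUE constraint is the backstop if two deliveries slip past the SELECT.
    conn.execute(
        "INSERT INTO webhook_event_log (event_id, outcome) VALUES (?, ?)",
        (event_id, outcome),
    )
    conn.commit()
    return 200, outcome, False
```

Delivering the same event twice runs `process` once; the second delivery returns the stored outcome with the replay flag set.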

SK19 — Generation seed retention

Add Region.generation_seed BIGINT NOT NULL to schema (forward-only Alembic; backfill existing rows from bigint(uuid) of Region.id). Set at Phase 0 input validation. Persists for ops repro and bang reproducibility (per SK20).
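The ADR does not pin down what "bigint(uuid)" means for the backfill. One possible deterministic mapping, shown here strictly as an assumption, takes the low 64 bits of the UUID and reinterprets them as a signed integer so the value fits a PostgreSQL BIGINT. The function name is hypothetical.

```python
import uuid


def seed_from_region_id(region_id: uuid.UUID) -> int:
    """Derive a signed 64-bit seed from a UUID (assumed mapping: low 64 bits)."""
    low64 = region_id.int & 0xFFFFFFFFFFFFFFFF
    # Reinterpret as two's-complement so the result lies in [-2**63, 2**63 - 1].
    return low64 - (1 << 64) if low64 >= (1 << 63) else low64
```

Whatever mapping is chosen, the property that matters for reproducibility is that it is a pure function of Region.id, so re-running the backfill yields the same seeds.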

SK20 — Gameserver-canonical equivalence

The gameserver generator is canonical for runtime behavior. The bang generator (sw2102-bang) imports proceed as follows:

  1. Bang produces a Universe JSON dataset.
  2. Import endpoint runs the gameserver Phase 13 validation gate against the imported content. The validation logic is factored into services/gameserver/src/services/galaxy_validation.py and imported by both sides.
  3. Rule violation → reject the import with ERR_BANG_VALIDATION_FAILED listing the failing invariants.
  4. Successful import → content commits as if Phase 13 had passed natively.

Bang can experiment with new generation algorithms; it cannot ship output the gameserver wouldn't generate itself. This closes the drift question by making the validation set the contract.

SK21 — Generation-corrupt state

New Region.status = 'generation_corrupt' enum value. On Phase 13 strict-rollback failure (FK constraint hit, disk full, network blip during cleanup), the region flips to this state, a region_generation_corrupt event fires on the realtime bus along with an admin alert (email/Slack), and the region is removed from any provisioning surface. The operator decides per case whether to manually recover or hard-delete. Provisioning a fresh replacement region for the affected subscriber is the typical recovery path.

SK22 — Phase 14 retry + owner relocation

Phase 14 retry policy: at-least-once retry with idempotency. The region_attached realtime event carries an idempotency key (region.id + attempt_n); the warp tunnel insert is INSERT ... ON CONFLICT DO NOTHING. Retry runs with exponential backoff: 1s, 5s, 30s, 5min, 30min, 6h. After 5 failed attempts, the region flips to attachment_pending with an ops alert and ARIA narration to the owner: "Your home region's Nexus connection is taking longer than expected. You can travel there directly from your current location whenever you're ready."

Critical correction: Phase 14 failure is not "region unreachable." The region itself is fine (Phases 1–13 succeeded). What's missing is the cross-region warp from this region's Frontier outer reaches to the Nexus per ADR-0043. The owner is still placed in their region by the owner-relocation flow below, independent of Phase 14.

Owner relocation flow:

  • New player at first login → spawns at their region's Capital sector (Phase 11 anchor: TERRA welcome planet). Path doesn't depend on Phase 14.
  • Existing player who buys a new region → on first dock after purchase, ARIA offers a one-shot "Transport to your new home" action. The transport places them at the new region's Capital sector with their current ship + cargo + insurance intact. Independent of Phase 14. A small relocation fee covers the in-fiction journey (default 50,000 cr).

Refund only fires on generation_corrupt (Phase 11/12/13 fails) or attachment_pending after all retries exhausted and a persistent operator-confirmed infrastructure issue. A region with delayed-Nexus-warp but otherwise functional is not refundable.

SK23 — Drop the galaxy cap; per-region locks

The user's correction: a galaxy doesn't need a global sector cap. The galaxy size is the sum of all paying regions' total_sectors; it grows organically as subscribers buy regions.

Schema action: Galaxy.max_sectors becomes a soft observability metric (operator dashboard alert at thresholds for capacity planning) rather than a hard pre-check. The column is retained as nullable/soft target; provisioning does not enforce against it.

The original SK23 race (concurrent provisioning blowing past the cap) doesn't exist once the cap is dropped — different regions are independent. Concurrent provisioning of the same region row (webhook replay vs admin force) is handled by SK18's per-Region.id advisory lock.

SK24 — Audit-trail preservation

Every audit-trail table gains a region_id_snapshot UUID column (nullable, stored at row creation), and the existing sector FK becomes ON DELETE SET NULL:

  • combat_log → region_id_snapshot, sector_id ON DELETE SET NULL
  • enhanced_market_transactions → same
  • bounty_claim → same
  • npc_death_log → same
  • aria_observation_log → same
  • pirate_kill_log → same
  • cargo_wreck_log (📐 — when it lands) → same

On region re-generation (force=true) or termination cleanup, the sector rows go away but the audit rows persist with region_id_snapshot pointing at the (also-deleted) region. Audit queries handle the NULL gracefully ("sector unknown — region was regenerated/terminated"). Region-level history is still queryable via region_id_snapshot.

This preserves the audit history (legally and gameplay-narratively important) while letting force=true clear the operational state cleanly.
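The interaction between the CASCADE delete and the SET NULL foreign key can be demonstrated end to end. The sketch below uses in-memory SQLite as a stand-in for the production database, with a deliberately minimal three-table schema; the `summary` column and the `demo` function are illustrative assumptions.

```python
import sqlite3


def demo() -> tuple:
    """Delete a region and return the surviving combat_log row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE region (id TEXT PRIMARY KEY);
        CREATE TABLE sector (
            id INTEGER PRIMARY KEY,
            region_id TEXT NOT NULL REFERENCES region(id) ON DELETE CASCADE
        );
        CREATE TABLE combat_log (
            id INTEGER PRIMARY KEY,
            sector_id INTEGER REFERENCES sector(id) ON DELETE SET NULL,
            region_id_snapshot TEXT,  -- plain column, no FK: it outlives the region
            summary TEXT NOT NULL
        );
    """)
    conn.execute("INSERT INTO region (id) VALUES ('r1')")
    conn.execute("INSERT INTO sector (id, region_id) VALUES (1, 'r1')")
    conn.execute(
        "INSERT INTO combat_log (id, sector_id, region_id_snapshot, summary)"
        " VALUES (1, 1, 'r1', 'skirmish')"
    )
    # Region delete cascades to sector; the sector delete nulls combat_log.sector_id.
    conn.execute("DELETE FROM region WHERE id = 'r1'")
    conn.commit()
    return conn.execute(
        "SELECT sector_id, region_id_snapshot, summary FROM combat_log"
    ).fetchone()
```

After the delete, the audit row is still present with a NULL sector_id and its region_id_snapshot intact, which is the shape audit queries need to handle.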

Consequences

Positive:

  • Region ownership becomes a transferable, durable asset rather than a fragile "if your owner stops paying, the whole region dies" scenario. Communities of paying players inside a region can collectively keep it alive.
  • The asset preservation cascade respects player investment. Stations are portable; planet safes have a banking lifeline; planets compensate via Genesis devices. The 20% / 30% transport tax models the real operator cost without being punitive.
  • The Central Nexus Bank surface is load-bearing for the whole game, not just for cascades — it gives players a region-independent depository for the first time. ARIA can use it for "in case of emergency" recovery storage.
  • SK17–SK24 close the operational corner cases that block "we can confidently re-generate or terminate a region" — the PayPal pipeline, the Phase 14 failure, the Galaxy cap race, the audit-row orphaning are all accounted for.
  • Owner-relocation flow makes "I just bought a new region" a smooth experience for existing players, not a dead-end where they own a region they can't reach.
  • Termination cascade gives players a 44-day grace window with multiple intervention points (takeover, pre-pay safes/stations, manual evacuation) before content is gone forever. Plenty of time for legitimate emergencies (lost card, brief financial trouble).

Neutral:

  • New PlayerCentralBankAccount table; new Region.generation_seed, generation_phase_checksums, and extended Region.status enum; region_id_snapshot columns across 6+ audit tables. All forward-only Alembic migrations.
  • New runtime services: cascade orchestrator (cleanup), Bank transfer service, station relocation service. Each is a focused module; total LOC modest.
  • Two new realtime events: region_taken_over, region_terminated_cleanup_complete.

Negative:

  • The cascade is operationally complex. Many composing rules (station relocation paths × planet safe paths × player ship handling × audit preservation) mean a lot of test surface. Acceptable: the alternative is "lose everything," which is unconscionable when the player has been paying.
  • The 30% station relocation fee can be uncomfortable for owners with marginal stations. Some stations may cost more to relocate than they're worth, and the player chooses to let them be lost. This is intentional — the fee should be a real decision, not a free transfer.
  • Audit-row preservation means the database keeps growing even as regions are deleted. Acceptable for an MMO economy where player histories matter; can be revisited with archival policies post-launch if storage becomes a concern.
  • The Bank surface centralizes a lot of player-asset trust at the Central Nexus. If Starport Prime were ever destroyed (operator event, emergent gameplay), bank access would need a fallback. Out of scope for this ADR; flag as a future concern.

Alternatives considered

Hard-delete on termination, no compensation (rejected). Rejected because it punishes paying players for an event they didn't control (the region owner's payment lapse). The whole takeover + cascade + Bank design exists to honor the principle that paying players' belongings should be preserved.

Mandatory takeover by faction NPC if no player offers (rejected). Have the Galactic Concord take over a lapsed region as a permanent operator-managed region. Rejected because it puts unbounded operator load on the design (the operator now hosts every lapsed region forever) and dilutes the meaning of "Region Owner" subscription. The 44-day grace + cascade is the right limit.

Auction takeover instead of first-pay-wins (rejected). Hold a 7-day auction during grace; highest GC subscriber bid wins. Rejected as overcomplicated for launch — first-pay-wins is simpler and faster, the takeover cost is fixed at $25/mo, and a region's value to a buyer is mostly subjective (depends on what's in it for them). Auctions can come back as a post-launch enhancement if there's demand.

Charge the relocation fee from the player wallet up-front, no station-treasury option (rejected). Force every relocation to be wallet-paid. Rejected because some legitimate stations have huge treasuries but the owning player is broke at cascade time; treasury-based default with wallet fallback covers more cases gracefully.

Centralize audit-row preservation in a separate archive_audit table (rejected). Move audit rows out of their primary tables on region deletion. Rejected as more complex than region_id_snapshot + ON DELETE SET NULL. The snapshot column is one tiny addition per audit table; archival migration is a multi-table refactor.

Per-region max_sectors enforced cap (kept). Each region's total_sectors (CHECK 100–1500 per the schema) is the right granularity. Galaxy-wide cap dropped per SK23.

Phase 14 failure → automatic full subscription refund (rejected). Refund the whole subscription on any Phase 14 failure. Rejected because Phase 14 failure leaves the region functional in nearly all cases; refunding for a delayed-Nexus-warp is over-correcting. The refund only fires when the issue is persistent and operator-confirmed.