Design Decision Log

“You feel as if someone is watching you make choices. It must be the DevTeam.”

See also: DESIGN.md (architecture) | DEVELOPMENT.md (dev workflow) | LORE.md (porting lessons)

Decision 1: Async Game Loop via Promise Queue

“You are momentarily blinded by the Promise.”

Context: The C NetHack game loop calls nhgetch() which blocks the thread until a key is pressed. JavaScript in the browser cannot block.

Options considered:

  1. Web Workers with SharedArrayBuffer + Atomics.wait() (true blocking)
  2. Async/await with a Promise-based input queue
  3. State machine that returns control on each input request

Choice: Option 2 – Async/await with Promise queue.

Rationale:

Tradeoff: Every function in the call chain from moveloop_core to any input function must be async. This is pervasive but mechanical.


Decision 2: Single-Page HTML with Inline Terminal

“The walls of the room are covered in <span> tags.”

Context: How to render the 80×24 character display.

Options considered:

  1. Canvas-based rendering (draw each character as pixels)
  2. <pre> element with <span> per character cell
  3. CSS Grid of divs

Choice: Option 2 – <pre> with <span> elements.

Rationale:

Performance: 80×24 = 1,920 spans is trivial for modern browsers. Even full screen redraws at 60fps would be fine.


Decision 3: ES6 Modules Without Build Step

“You don’t need a build tool. You feel remarkably unburdened.”

Context: How to organize the JavaScript code.

Options considered:

  1. Single monolithic file
  2. ES6 modules with <script type="module">
  3. CommonJS with a bundler (webpack/rollup)

Choice: Option 2 – ES6 modules loaded natively.

Rationale:


Decision 4: Faithful DEC Symbols via Unicode

“You see here a box-drawing character (U+2502).”

Context: Classic NetHack uses DEC Special Graphics characters for wall drawing. These are box-drawing characters like ─│┌┐└┘├┤┬┴┼.

Choice: Use Unicode box-drawing characters (U+2500 block) which are the standard modern representation of DEC Special Graphics. The “IBMgraphics” symbol set uses similar characters.

NetHack’s symbols config option supports both “DECgraphics” and “IBMgraphics”. We default to the IBM set since Unicode box-drawing maps directly to it.

When the user hasn’t selected a symbol set, we use the plain ASCII defaults from defsym.h (|, -, +, #, etc.) which is the most faithful to the default TTY experience.


Decision 5: Simplified Vision for Initial Port (superseded — see update)

“It is dark. You can’t see anything but the pragmatic choice.”

Context: NetHack’s vision.c implements a complex raycasting algorithm for field of view.

Original choice: For the initial port, implement a rule-based FOV:

This produces correct results for the standard dungeon layout. The full raycasting algorithm will be ported in a later pass.

Justification: The room/corridor FOV rules are what 99% of gameplay uses. The raycasting is only needed for open areas, line-of-sight blocking by boulders, etc. Getting the core game playable is more valuable than perfect FOV initially.

Update (implemented): vision.js now implements the full Algorithm C raycasting from vision.c — the same recursive line-of-sight scanner that traces visibility along octant rays, handling walls, doors, boulders, and partial occlusion exactly as the C does. The simplified rule-based approach was completely replaced.


Decision 6: Monster & Object Data as Structured JS Arrays

“You see here 383 monsters and 478 objects. They have been carefully catalogued.”

Context: The C code defines monster data via C macros in monsters.h (3927 lines) and object data in objects.h (1647 lines).

Choice: Port as JavaScript arrays of plain objects, preserving field names from the C structures.

Example (C):

MON(NAM("giant ant"), S_ANT,
    LVL(2, 18, 3, 0, 0), (G_GENO | G_SGROUP),
    A(ATTK(AT_BITE, AD_PHYS, 1, 4), ...),
    SIZ(10, 10, MS_SILENT, MZ_TINY), 0, 0,
    M1_ANIMAL | M1_NOHANDS | M1_OVIPAROUS | M1_CARNIVORE,
    M2_HOSTILE, 0, CLR_BROWN, GIANT_ANT)

Example (JS — using canonical C field names):

{ // monsters.h:120
  mname: "giant ant", mlet: S_ANT,
  mlevel: 2, mmove: 18, ac: 3, mr: 0, maligntyp: 0,
  geno: G_GENO | G_SGROUP,
  mattk: [{aatyp: AT_BITE, adtyp: AD_PHYS, damn: 1, damd: 4}, ...],
  cwt: 10, cnutrit: 10, msound: MS_SILENT, msize: MZ_TINY,
  mflags1: M1_ANIMAL | M1_NOHANDS | M1_OVIPAROUS | M1_CARNIVORE,
  mflags2: M2_HOSTILE, mflags3: 0,
  mcolor: CLR_BROWN
}

The C line reference in the comment allows cross-referencing. Field names now match struct permonst in permonst.h exactly (Phase 2 normalization).


Decision 7: Global State Object Pattern

“You feel the weight of hundreds of global variables settle into a single namespace.”

Context: The C code uses many global variables (decl.c has hundreds). How to manage these in JS?

Options:

  1. True global variables
  2. Single namespace object (NH.u, NH.level, etc.)
  3. Class instances with encapsulation

Choice: Option 2 – single game object, accessed via gstate.js.

Rationale:

Implementation: gstate.js exports a single mutable game reference (export let game = null). allmain.js calls setGame(game) once before the game loop starts. All other modules import game from gstate.js and access state through game.u, game.level, game.flags, etc. — matching C’s naming where possible (u, level, flags, context, moves, fmon).

Evolution: Early versions used an NH namespace (this decision’s original choice) and also had multiple set*Context() / register*() calls to wire module dependencies at init time. All of that was replaced by gstate.js during Phase 4 of the architectural refactor. The final pattern is simpler: one object, no wiring, ESM live bindings handle the rest.


Decision 8: RNG – ISAAC64 for Exact C Compatibility

“You feel the hand of fate guiding you. It is using BigInt arithmetic.”

Context: NetHack uses a seeded RNG for reproducible games. The C code uses ISAAC64 (isaac64.c) for the main game RNG and the system RNG for initial seeding.

Choice (updated): Port ISAAC64 faithfully using JavaScript BigInt for 64-bit unsigned integer arithmetic. This gives bit-for-bit identical output to the C version for any given seed, enabling deterministic comparison tests.

Implementation notes:

Previous choice (initial port): xoshiro128** was used temporarily for the first prototype, but was replaced with ISAAC64 to enable exact C matching.


Decision 9: Corridor Algorithm – Faithful Port of join()

“You hear the rumble of distant digging. The corridors connect.”

Context: mklev.c’s join() function creates L-shaped corridors between rooms. This is a signature visual element of NetHack’s dungeon.

Choice: Port the exact algorithm from join() including:

This ensures the dungeon “looks like NetHack.”


Decision 10: Wizard Mode via URL Parameters

“You are in wizard mode. All query parameters are at your disposal.”

Context: NetHack’s wizard mode (debug mode) is invaluable for testing. The C version activates it via compile-time flags or special user names. We need a browser-friendly equivalent.

Choice: Activate wizard mode via URL query parameters:

Wizard commands implemented:

Rationale: URL parameters are the natural equivalent of command-line flags for a browser application. They can be bookmarked, shared, and don’t require any UI for activation. The ?seed=N parameter combined with wizard mode enables fully deterministic, reproducible test scenarios.

Testing: Wizard commands that don’t require user input (e.g., magic mapping) are tested via unit tests with mock game objects. Input-requiring commands (level change, teleport, genesis) are tested via E2E browser tests.


Decision 11: Lua Level Definitions – Direct Port to JS (No Interpreter)

You feel the mass of scripted levels pressing on your mind.

Context: NetHack 3.7 introduced Lua scripting for special level definitions. There are 141 .lua files totaling ~17,000 lines in dat/. The C code embeds a full Lua 5.4 interpreter and exposes a des.* API (35+ functions from sp_lev.c) plus a selection API for geometric map operations. The question: how does WebHack handle these?

Options considered:

  1. Embed a Lua interpreter in JS (~1,500 lines for a minimal subset)
  2. Transpile Lua→JS via an existing parser (e.g., luaparse) + code emitter
  3. Port the Lua scripts directly to JavaScript (mechanical syntax conversion)

Choice: Option 3 – Direct port to JavaScript.

Rationale:

The Lua scripts aren’t really “programs” – they’re a DSL for dungeon layout. Reading them reveals three complexity tiers:

Tier Files Lines What they do
Declarative ~130 ~3,000 des.room({...}), des.monster(), des.object() in sequence. No logic.
Procedural ~5 ~500 if percent(60) then ... end, for i=1,n do, shuffled tables
Complex ~3 ~4,600 Closures, selection set operations (\|, &), stored callbacks

90% of the code is declarative – nested tables and function calls. The syntax conversion is mechanical:

-- Lua                              // JavaScript
des.room({ type = "ordinary",       des.room({ type: "ordinary",
  x = 3, y = 3,                       x: 3, y: 3,
  contents = function()                contents: () => {
    des.monster()                        des.monster();
  end                                  }
})                                   });

An interpreter would be ~1,500 lines of code that exists purely to run scripts that are themselves being ported. That’s building a bridge when you could just walk across the river. Direct porting:

What IS needed:

Even without a Lua interpreter, we need the APIs that Lua scripts call. These become JS modules:

  1. des.* API (35+ functions from sp_lev.c): des.room(), des.monster(), des.object(), des.terrain(), des.map(), des.stair(), des.trap(), des.door(), des.level_flags(), des.level_init(), des.region(), des.altar(), des.fountain(), des.random_corridors(), etc.

  2. Selection API (geometric operations on map coordinates): selection.new(), set(), rndcoord(), percentage(), grow(), negate(), match(), floodfill(), iterate(), filter_mapchar(). Supports union, intersection, complement via operator overloading or method chaining.

  3. nhlib helpers: percent(), shuffle(), montype_from_name(), and a handful of utility functions (~242 lines, trivially ported).

Lua features NOT used (confirming an interpreter isn’t needed):

Porting order (tracked in GitHub Issues):

Phase 1 (foundation):
  ├── des.* API implementation (JS equivalent of sp_lev.c)
  └── Selection API (geometric operations)

Phase 2 (level files):
  ├── nhlib helpers
  ├── dungeon.lua (dungeon tree structure, pure data)
  ├── ~130 simple declarative levels (oracle, mines, sokoban, etc.)
  └── ~5 procedural levels (castle, medusa, astral)

Phase 3 (complex):
  ├── themerms.lua (theme rooms with closures + selection ops)
  └── quest.lua (quest text data + callbacks)

Tradeoff: If NetHack’s Lua usage expands significantly in future versions (e.g., user-scriptable levels, Lua-based game logic), we’d need to revisit this decision. For 3.7’s usage – which is firmly “configuration DSL with occasional procedural glue” – direct porting is the pragmatic choice.

“You hear the strident call of a scripting language. It seems far away.”


Decision 12: Pre-Makelevel PRNG Alignment via skipRng()

You sense the presence of 257 consumed random values.

Context: The C game consumes 257 ISAAC64 values between init_random() and the start of makelevel(). These come from:

Source Calls Purpose
o_init.c 198 Object class shuffling (randomize potion/scroll/wand appearances)
dungeon.c 53 Dungeon structure initialization
nhlua.c 4 Lua startup
u_init.c 1 Character creation
bones.c 1 Bones file check

JS doesn’t implement these subsystems yet. If we just call initRng(seed) and then makelevel(), the PRNG state is 257 values behind the C version.

Choice: skipRng(n) – consume n raw ISAAC64 values without logging, to fast-forward past C startup calls that JS doesn’t need yet.

Rationale: This is the pragmatic approach that the user specifically suggested. Instead of porting o_init() (which shuffles object appearances – a system we’ll need eventually but not for basic level generation fidelity), we advance the PRNG by the exact number of consumed values.

The skip count (257 for seed=42) was determined empirically using the C PRNG logger (003-prng-logging.patch), which logs every RNG call with file:line information.

Future: When we port o_init(), dungeon.c init, etc., the skip count drops accordingly. When all startup systems are ported, skipRng() is no longer needed. The PRNG log comparison tests track progress toward this goal.


Decision 13: PRNG Timing Parity

“You sense a disturbance in the random number plane.”

Context: Session comparison tests compare JS and C PRNG call sequences using strict index-based comparison. When one side makes an extra or missing RNG call, all subsequent indices become misaligned, making it impossible to tell whether the rest of the sequence matches. A single missing call early on causes hundreds of apparent mismatches downstream.

Principle: PRNG calls must happen at the same step as C. Screen parity forces this in 99%+ of cases — any call that affects visible output must happen on the same turn or the screen will diverge. Calls may happen later than C only for invisible internal state (e.g., deferred exercise checks). Calls must never happen earlier than C — an earlier call means JS is computing something C hasn’t yet, which always indicates a bug (premature computation, misplaced logic, or a missing guard).

Categories of divergence:

Diagnosis: Use node test/comparison/rng_shift_analysis.js to identify time-shifts across sessions. The tool uses bounded-lookahead alignment to distinguish shifts (insertions/deletions) from value mismatches, and aggregates the most-affected functions across all sessions.

See also: C_PARITY_WORKLIST.md (PRNG Timing section), Issue #143.


Decision 14: SYNCLOCK Boundary Ownership Over Fallback Auto-Sync

“You hear a click as the boundary lock engages.”

Context: The JS port uses async I/O, but NetHack C is single-threaded and blocking. During migration, run_command() had a compatibility fallback that would auto-sync legacy _pendingMore state into a synthetic owner=more boundary. This reduced breakage but obscured ownership bugs and made ordering errors harder to diagnose.

Choice: Move no-owner handling to a strict-mode contract while preserving a default compatibility path. If _pendingMore is set with no owner=more boundary, emit boundary.more.owner-missing; in strict mode (WEBHACK_STRICT_MORE_OWNER=1) ignore the key, otherwise perform compatibility auto-sync.

Rationale:

  1. Ownership bugs should fail loudly at the boundary, not be repaired silently.
  2. Deterministic replay/parity debugging is faster when boundary violations are explicit.
  3. Strict mode (WEBHACK_STRICT_MORE_OWNER=1) prevents accidental key consumption reordering and exposes latent ownership bugs.

Implementation shape:

Tradeoff: Strict mode surfaces latent boundary defects earlier; default mode keeps compatibility while migration completes.

See also: docs/SYNCLOCK_CAMPAIGN_PLAN.md, docs/SYNCLOCK_PROGRAMMING_GUIDE.md, Issue #277.


Decision 15: Game Loop Ordering — Structural Match Over Deferral Flags

“The monsters always move first. That is the law of the dungeon.”

Context: C’s moveloop_core() processes one game tick per call in a strict phase sequence: bookkeeping (A), monsters (B), display (C), then player input (D/E/F), then cleanup (G). Phase B (monster movement, spawning, turn-end processing) runs BEFORE Phase F (player command). The JS port initially ran Phase B AFTER the player command, inside finalizeTimedCommand. This caused the 5 remaining parity failures (seed031/032/033/328/theme27), all of which diverge because monsters and the player act in the wrong order.

Decision: Match C’s structure directly. Move advanceTimedTurn (Phase B) to the top of the _gameLoopStep while(true) loop, gated by context.move, so monsters move BEFORE the next player command. No deferral flags, no synthetic queuing, no continuation tokens.

Rejected alternative: pendingDeferredTimedTurn

An earlier experiment built a deferral mechanism: finalizeTimedCommand would set game.pendingDeferredTimedTurn = true instead of calling moveloop_core, and runPendingDeferredTimedTurn() would execute it at the start of the next runOneCommandCycle. This was abandoned because:

  1. C has no deferral — it’s for(;;) { moveloop_core(); } with Phase B at top
  2. Deferral adds a synthetic continuation flag, violating the execution model
  3. The flag was built but never activated (dead code)
  4. Structural reorder is simpler, more maintainable, and C-faithful

The pendingDeferredTimedTurn field, runPendingDeferredTimedTurn() method, and pendingTravelTimedTurn field have been deleted.

See also: docs/GAME_LOOP_REORDER_PLAN.md, docs/DESIGN.md §The C game loop.


“You have reached the bottom of the decision log. There are more decisions to come. The strident call of the DevTeam echoes in the distance.”