Moving Like Clockwork

A cozy life-sim lives or dies on the feeling that the town keeps turning when you’re not looking. Walk into the same building two mornings running and the same people should be there doing the same work — until the day something changes them. The naive way to build that is to simulate every villager all the time and write each one’s position into the save file. We don’t do either. This post walks through the architecture we landed on instead, why we made the calls we made, and a couple of the bugs the design is specifically shaped to avoid.

Fair warning: this is a background, inside-baseball post. If you came here for screenshots, this isn’t that one. If you came here to argue about save schemas, pull up a chair.

One more caveat before we dig in: this is work in progress. What follows is the architecture as we’ve designed it, and parts of it are running in the vertical slice today — but it’s not all battle-tested across the full game yet, and some of the open questions below are genuinely still open. Treat this as a snapshot of how we’re building NPC scheduling, not a victory lap. Some of these calls will probably look different by the time the game ships, and if they do, that’s the process working.

Four layers, kept separate on purpose

The first thing worth saying is that “NPC behavior” isn’t one system. We split it into four, and a lot of our sanity comes from never letting them bleed into each other:

  • Entity state — facts about the NPC. Whether they’ve met the player, their relationship standing, mood, per-NPC inventory, any active overrides. This lives on a component on the NPC, and only the fields we explicitly mark as save-relevant get persisted.
  • Schedule data — a static-ish table mapping a time block to a location, an activity, and the dialogue the NPC draws from while there. This is authored content, not runtime state.
  • StateTree — Unreal’s StateTree, carrying the behavior logic. What the NPC is doing right now: walking a path, claiming a workstation, standing idle. It reads the other layers as inputs and is rebuilt from scratch on load.
  • Persistence — the save struct and the load hooks that rehydrate everything.

The distinction people most often collapse is entity state versus the StateTree. They are not the same thing. Entity state is data; the StateTree is logic that consumes that data. Keeping them apart is what lets us throw the entire StateTree away on save and reconstruct it on load without losing anything that matters.
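
To make the data-versus-logic split concrete, here is a minimal sketch in plain Python rather than engine code. It is an illustration under assumptions, not the game's implementation: `EntityState`, the `SAVED` metadata marker, and the `is_highlighted` field are made-up names. The point it shows is that persistence copies out only the explicitly marked fields, and everything else is rebuilt or defaulted on load.

```python
from dataclasses import dataclass, field, fields

# Hypothetical marker: a field is persisted only if it carries this metadata.
SAVED = {"save": True}


@dataclass
class EntityState:
    """Facts about one NPC. Data only, no behavior logic."""
    met_player: bool = field(default=False, metadata=SAVED)
    relationship: int = field(default=0, metadata=SAVED)
    mood: str = field(default="neutral", metadata=SAVED)
    # Unmarked, so it never reaches the save file (made-up example field).
    is_highlighted: bool = False


def to_save_record(state: EntityState) -> dict:
    """Persistence layer: copy out only the explicitly marked fields."""
    return {
        f.name: getattr(state, f.name)
        for f in fields(state)
        if f.metadata.get("save")
    }


def from_save_record(record: dict) -> EntityState:
    """Rehydrate on load; unmarked fields come back as their defaults."""
    return EntityState(**record)
```

Anything behavior-related is absent from the record by construction, which is what makes it safe to discard and rebuild the logic layer around it.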

The reframe: positions are not a thing we save

Here’s the decision everything else hangs off of. We do not persist NPC positions. Where an NPC physically stands is never written to disk.

That sounds reckless until you flip it around: the schedule plus the NPC’s entity state already is the source of truth for where someone should be. If you know it’s mid-morning and you know this NPC’s schedule and current overrides, you know where they are. Saving the position too is not just redundant — it’s a second source of truth that can disagree with the first, and now you’ve got a bug class to chase.

So schedule-driven spawning isn’t a fallback we reach for when the saved position is missing. It’s the only spawn path. Load a save five minutes after you made it, or three in-game days later, and the same code resolves where everyone is. There’s no special “long gap” branch, because there’s no ticking we have to fast-forward through.
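
As a rough illustration of "one spawn path", the sketch below resolves a location purely from schedule rows, the game clock, and an optional override. The names (`ScheduleRow`, `resolve_location`) and the minutes-since-start clock are assumptions for the example, and it skips the compile step covered later in this post. What it shows is that the length of the gap since the save never enters the calculation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class ScheduleRow:
    start_minute: int   # minute of day, inclusive
    end_minute: int     # minute of day, exclusive
    location: str
    activity: str


def resolve_location(rows: Sequence[ScheduleRow], absolute_minute: int,
                     override_location: Optional[str] = None) -> Optional[str]:
    """Answer 'where should this NPC be?' from inputs alone, with no saved position."""
    if override_location is not None:
        return override_location
    minute_of_day = absolute_minute % MINUTES_PER_DAY
    for row in rows:
        if row.start_minute <= minute_of_day < row.end_minute:
            return row.location
    return None  # no row covers this time
```

Because the function only looks at the time of day and the current inputs, a save loaded at 9:30 on day one and a save loaded at 9:30 on day four go through identical code.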

Compiling a day

If schedules drive everything, the obvious question is: do you run a schedule lookup every frame to figure out what an NPC is doing? No. That would mean re-walking the schedule rows and the override priority chain constantly, and it scatters the “which rule won” logic all over the runtime.

Instead we compile. Once per NPC per day, we take the schedule rows plus whatever overrides are active and produce a concrete, ordered list of events for that day. The runtime then just plays through that list. It doesn’t reason about priorities or preconditions; it reads the next event and does it.

This buys us a few things. The “which override won” question gets answered exactly once, in one deterministic pass, instead of being smeared across every frame. The compiled events list becomes a real, inspectable object — you can dump it, diff it, and debug it independently of the inputs that produced it. And a compile-time bug looks visibly different from a runtime bug, which matters enormously when you’re staring at an NPC who’s in the wrong place and trying to figure out why.
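
A compile pass of this kind could look something like the sketch below. It is a simplified stand-in, not the project's compiler: `Block`, `Event`, and `compile_day` are hypothetical names, time is assumed to be minutes of the day, and preconditions are left out. The `Source` order follows the fixed source order described in the overrides section further down. The idea is to cut the day at every block boundary, pick one winner per slice, and merge neighbouring slices that end up in the same place.

```python
from dataclasses import dataclass
from enum import IntEnum


class Source(IntEnum):
    # Position in the fixed source order; lower value wins. Not an authored number.
    STORY = 0
    QUEST = 1
    CALENDAR = 2
    WORK = 3
    ROUTINE = 4


@dataclass(frozen=True)
class Block:
    source: Source
    start: int      # minute of day, inclusive
    end: int        # minute of day, exclusive
    location: str


@dataclass(frozen=True)
class Event:
    start: int
    end: int
    location: str


def compile_day(blocks):
    """Resolve every overlap once and return the day's events in time order."""
    cuts = sorted({b.start for b in blocks} | {b.end for b in blocks})
    events = []
    for lo, hi in zip(cuts, cuts[1:]):
        covering = [b for b in blocks if b.start <= lo and hi <= b.end]
        if not covering:
            continue  # nothing scheduled in this slice
        winner = min(covering, key=lambda b: b.source)
        if events and events[-1].end == lo and events[-1].location == winner.location:
            # Same place as the previous slice: extend instead of adding a new event.
            events[-1] = Event(events[-1].start, hi, winner.location)
        else:
            events.append(Event(lo, hi, winner.location))
    return events
```

The output is a plain list of value objects, so it can be printed or compared against yesterday's list without touching the inputs that produced it.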

The in-engine debugger view we’re building for this reflects the split: it’ll show both the compiled events list for the day and the raw inputs (schedule rows, active override slots) side by side. If the list is wrong but the inputs are right, that’s a compiler bug. If both look right and the NPC still misbehaves, the problem is downstream in the StateTree. Separating those two failure modes should save hours.

Overrides: priority by source, not by number

Most of the time an NPC runs their default routine. Sometimes a quest needs them positioned where the player can find them; sometimes a scripted story beat needs them at a specific spot; sometimes a town-wide calendar event pulls everyone somewhere. Those are overrides, and overrides need a priority order.

The tempting design is numeric priority — give each override a number, highest wins. We deliberately didn’t do that, because numeric priority has a nasty failure mode: two overrides both claim priority 100 and now your resolution order is undefined. Whoever wrote those two systems months apart never coordinated, and you get a nondeterministic bug that only shows up when both fire on the same NPC on the same frame.

Instead, priority is by source, in a fixed order: Story Event beats Quest Event beats Calendar Event beats Work Schedule beats Default Daily Routine. Each NPC carries one slot per source. A system writes into its own slot and nowhere else; resolution reads the slots top to bottom and takes the first one that’s filled. There’s no number to collide on, and the resolution code is trivial — a fixed walk down five slots.

The within-source rule is just as boring on purpose: at most one override per source is active at a time. A new override from the same source replaces the old one. No stacking, no sub-priority arithmetic. If a genuinely hard collision ever shows up — say a long-running background quest and a cutscene both wanting the same NPC at the same instant — we’d rather hit that as a loud, specific case to design around than paper over it with ad-hoc precedence rules today.
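
The slot scheme is small enough to sketch in a few lines. This is an illustrative version with assumed names (`OverrideSlots`, `write`, `clear`, `resolve`) and plain strings standing in for whatever an override actually carries; only the five sources and their order come from the description above.

```python
from enum import Enum


class Source(Enum):
    STORY = "story"
    QUEST = "quest"
    CALENDAR = "calendar"
    WORK = "work"
    ROUTINE = "routine"


# The fixed resolution order. There is no per-override number to collide on.
RESOLUTION_ORDER = (
    Source.STORY,
    Source.QUEST,
    Source.CALENDAR,
    Source.WORK,
    Source.ROUTINE,
)


class OverrideSlots:
    """One slot per source; each system writes only into its own slot."""

    def __init__(self):
        self._slots = {source: None for source in RESOLUTION_ORDER}

    def write(self, source, directive):
        # A new override from the same source replaces the old one. No stacking.
        self._slots[source] = directive

    def clear(self, source):
        self._slots[source] = None

    def resolve(self):
        """Walk the slots top to bottom and return the first one that is filled."""
        for source in RESOLUTION_ORDER:
            directive = self._slots[source]
            if directive is not None:
                return source, directive
        return None
```

Two systems written months apart cannot tie here, because the only way to outrank another override is to be a different source.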

One important conceptual note that took us a while to get crisp: quests don’t trigger story beats. A quest is a collection of goals; it tracks what you need to do. A story event is a scripted beat that fires when world and NPC state meet its preconditions. Finishing a quest can open a story event — but only because the quest changed some state that the story event was watching for, not because the quest reached out and fired the script. Keeping those decoupled is how we avoid the classic open-world bug where a quest and a cutscene both try to own the same character at the same moment.
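
One way to picture that decoupling is below. The names (`StoryEvent`, `complete_quest`, `poll_story_events`) and the dictionary of world flags are assumptions made for the example, not the project's API. The quest code only records a fact; the story event only watches state; neither holds a reference to the other.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StoryEvent:
    name: str
    precondition: Callable[[Dict[str, bool]], bool]
    fired: bool = False


def complete_quest(world_state, flag):
    """A quest finishing only records a fact. It knows nothing about story events."""
    world_state[flag] = True


def poll_story_events(events, world_state):
    """Fire each story event whose precondition now holds, at most once."""
    fired_now = []
    for event in events:
        if not event.fired and event.precondition(world_state):
            event.fired = True
            fired_now.append(event.name)
    return fired_now
```

If the same flag were set by something other than a quest, the story event would fire just the same, which is the property that keeps the two systems from fighting over a character.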

Offscreen, the clock doesn’t tick — it gets sampled

We don’t run a simulation for villagers you can’t see. There’s no per-NPC tick budget burning CPU on someone three zones away. The schedule is sampled on demand: when we need to know where an NPC is, we ask the schedule rather than a simulation that’s been grinding away in the background.

But “offscreen” isn’t uniform, so the representation splits:

  • Overworld NPCs still traverse the world’s navigation graph — ZoneGraph, in Unreal terms — and can show up on your map as a moving dot, so you can spot someone crossing town from a distance. That sense of a populated world is worth keeping.
  • Interior NPCs collapse down to “they’re inside here, doing this.” No path, no map presence, just a virtual location and an activity state.

These two “no full actor right now” cases have completely different reasons behind them, and we’re careful never to conflate them. An overworld NPC with no spawned puppet is a level-of-detail decision: the player is far away, so the visual representation is handed off to Unreal’s Mass representation system. An interior NPC with no puppet is a logical state: they’re inside a building. Different lifecycles, different triggers, different correctness conditions. The day we start reasoning about “outdoor virtual NPCs” as a category is the day we’ve introduced a bug, because that category doesn’t exist in our model.
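
A type-level way to keep the two cases apart might look like the sketch below. The class and field names are hypothetical, and the lane id is just a stand-in for however a position on the navigation graph is addressed. The thing to notice is that "puppet spawned or not" is a flag on the overworld case only, while the interior case has no path data to begin with.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class OverworldPresence:
    lane_id: str                 # hypothetical id of a navigation-graph lane
    distance_along_lane: float
    puppet_spawned: bool         # level-of-detail choice; does not change where they are


@dataclass(frozen=True)
class InteriorPresence:
    building: str
    activity: str


Presence = Union[OverworldPresence, InteriorPresence]


def map_marker(presence: Presence) -> Optional[Tuple[str, float]]:
    """Overworld NPCs get a map dot whether or not a puppet exists; interior NPCs never do."""
    if isinstance(presence, OverworldPresence):
        return (presence.lane_id, presence.distance_along_lane)
    return None
```

There is deliberately no third type for an outdoor NPC without a location on the graph, so that state cannot be represented by accident.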

NPCs in transit are their own problem

The trickiest case is the one in between: an NPC walking from one place to another when you save, load, or just look away and back.

The wrong move is to snap them to their destination. Autosave fires while someone’s mid-walk, and if loading snaps them to the end of the path, you get a visible teleport — and worse, you break any scripted moment that depends on watching them arrive. So in-transit is its own first-class state. We store where they’re coming from, where they’re going, and when they left, plus every piece of world state the pathfinder actually consumed to route them. On load, we regenerate the path from those inputs.

The hard requirement underneath this is path determinism: the path has to be a pure function of (origin, destination, world-state-at-departure). If anything that affects routing isn’t captured in that transit record, the recomputed path drifts away from the “remembered” one. To catch exactly that, we’re adding a debugger overlay that draws a line between where the interpolated path says the NPC should be and where the actual puppet is standing. That drift line is designed to be our canary — if it ever stretches, we’ve missed an input, and we can go find it before a player ever does.
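
Here is a toy version of the transit record and the drift check, to show the shape of the idea rather than the real pathfinder. Everything specific is an assumption: the 2D points, the `speed` field, the `bridge_open` flag standing in for "world state the pathfinder consumed", and the fixed `DETOUR` waypoint. What matters is that `plan_path` reads nothing except the record, so regenerating it on load gives the same path.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
DETOUR: Point = (5.0, 5.0)  # made-up waypoint used when the direct route is closed


@dataclass(frozen=True)
class TransitRecord:
    origin: Point
    destination: Point
    departed_at: float   # game time in seconds
    speed: float         # world units per second (assumed to be a routing input)
    bridge_open: bool    # stand-in for world state the pathfinder consumed


def plan_path(rec: TransitRecord) -> List[Point]:
    """Toy pathfinder: depends only on fields of the record, so it is repeatable."""
    if rec.bridge_open:
        return [rec.origin, rec.destination]
    return [rec.origin, DETOUR, rec.destination]


def expected_position(rec: TransitRecord, now: float) -> Point:
    """Walk the regenerated path by the distance covered since departure."""
    remaining = max(0.0, (now - rec.departed_at) * rec.speed)
    path = plan_path(rec)
    for a, b in zip(path, path[1:]):
        length = math.dist(a, b)
        if length > 0 and remaining <= length:
            t = remaining / length
            return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)
        remaining -= length
    return path[-1]  # already arrived


def drift(rec: TransitRecord, now: float, puppet_position: Point) -> float:
    """Length of the debug line between expected and actual position."""
    return math.dist(expected_position(rec, now), puppet_position)
```

If the pathfinder ever started reading a global that is not in the record, `plan_path` would stop being a pure function of its argument, and a nonzero `drift` after load is how that would show up.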

What we do save, and the asymmetry that decided it

So if positions aren’t saved, what is? Entity state’s persistent fields — met-player, relationship, mood, inventory, the override slots — and, perhaps surprisingly, the compiled events list for the current day.

We went back and forth on that last one. Saving only the inputs and recompiling on load is smaller. But the cost of getting a recompile wrong is asymmetric, and that’s what decided it. A drift bug on a single transit path is one visible teleport: annoying, obvious, easy to spot and fix. A drift bug in a recompiled events list silently corrupts an NPC’s entire day, in a way that’s much harder to notice and much worse when it bites. Picture loading a Tuesday save and finding a villager asleep at home when they should be at their workbench. No crash, no error, just someone quietly living the wrong day because the recompile on load resolved things differently than the compile behind the original save. Bytes are cheap; a quietly broken day is not. So we persist the list the runtime was actually playing and never recompile on load. A fresh compile from current inputs happens only at the day rollover or when an override lands mid-day.

That same compile step does double duty. When a story, quest, or calendar system drops an override mid-day, we don’t surgically splice it into the events list — we recompile the rest of the day from the current in-game time forward, preserving everything already played. It’s the exact same code path as the daily rollover compile, which means there’s one compile path to test and trust, not two that can subtly disagree.
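
The "recompile forward, keep the past" step can be sketched as a small merge. This is an assumption-laden illustration: `Event` and `recompile_rest_of_day` are invented names, time is minutes of the day, and `fresh` is taken to be the output of the ordinary daily compile run again on the current inputs. The function never edits the old list in place; it keeps what was played before `now` and takes everything from `now` onward from the fresh compile.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    start: int      # minute of day, inclusive
    end: int        # minute of day, exclusive
    location: str


def recompile_rest_of_day(playing: List[Event], fresh: List[Event], now: int) -> List[Event]:
    """Keep what was played before `now`; take everything from `now` on from `fresh`."""
    # The event in progress is cut at `now` so its played portion survives unchanged.
    played = [Event(e.start, min(e.end, now), e.location) for e in playing if e.start < now]
    # Whatever the fresh compile says about the past is ignored.
    upcoming = [Event(max(e.start, now), e.end, e.location) for e in fresh if e.end > now]
    return played + upcoming
```

Since `fresh` comes from the same compile used at day rollover, the only logic unique to the mid-day case is this clipping, which is easy to test on its own.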

Why bother with all this

Every one of these calls trades something. Source-based priority gives up fine-grained control to buy deterministic resolution. Not-saving-positions gives up a tiny bit of load-time convenience to buy a single source of truth. Recompile-on-override gives up a micro-optimization to buy one well-tested code path. Saving the compiled list spends bytes to buy a drift-free day.

The through-line is that we’d rather pay in places that are cheap and visible than in places that are expensive and silent. For a two-person studio that can’t afford to chase nondeterministic save-corruption bugs for a week, “boring and deterministic” is the whole strategy. The town keeps turning whether or not you’re watching — and, just as importantly, it turns the same way every time you come back to it.

All of which is to say: this is where the design stands today, mid-build. Ask us again in a few months and some of it will have changed under contact with the rest of the game — we’ll write that up too when it does.

More from the workbench soon. If you want the louder updates, that’s what the socials are for — this corner of the site is where we leave the long notes.

