Rolling Back Sound | Slow Rush Studios

Ahh, rollback networking: you're wonderful, and I love you, but you make so many things even harder.

Let's talk about how adding sound makes networked multiplayer harder to pull off - something that you probably never thought would be the case.

Rollback Recap

Here's how rollback networking works: ¹

60 times a second, the game "advances": it takes input (e.g. keypresses) from players and simulates a 60th of a second ("one tick") of gameplay.
Making games multiplayer over the internet is hard because keypresses take a lot of time to travel from remote players' games to our running game.
What do you do when a remote player's keypresses haven't arrived yet? You don't want to pause and wait (because the game would stutter a lot), so one approach is to predict their keypresses: we can optimistically assume remote players are pressing the same keys as in the previous tick.
Sometimes that assumption is wrong! We might later learn that a remote player actually held down a new key back in tick X: then we have to rewind the whole game to that historical tick X, and advance it forward again until the present tick.
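The "assume they're still pressing the same keys" prediction can be sketched roughly like this. It's a minimal illustration, not the game's actual code: the `Input` type and the `predict_input` function are hypothetical names.

```python
from typing import Dict

# Hypothetical input type: the set of keys a player holds during one tick.
Input = frozenset


def predict_input(received: Dict[int, Input], tick: int) -> Input:
    """Return the real input for `tick` if it has arrived; otherwise
    optimistically repeat the most recent input we do have."""
    if tick in received:
        return received[tick]
    earlier = [t for t in received if t < tick]
    if not earlier:
        return frozenset()  # nothing known yet: assume no keys are held
    return received[max(earlier)]


received = {1: frozenset({"left"}), 2: frozenset({"left", "jump"})}
guess = predict_input(received, 5)  # tick 5 hasn't arrived: reuse tick 2's keys
```

If the real input for tick 5 later turns out to differ from `guess`, that's the moment a rewind is needed.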

It works surprisingly well! (You can try it in my old web-based rollback-networking multiplayer demo.)

But there's a big hairy detail there which turns out to matter a lot for playing sounds in multiplayer: how does that rewinding actually work?

Rewinding for Rollbacks

To implement rewinding, I store "old game ticks":

Each time the game advances to a new tick, I copy the entire game state and store it as a 'historical tick'.
(Up to 8 of them; the more we keep, the more we can rewind... but it also takes more memory, and rewinds of more than 8 ticks tend to feel bad.)
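As a rough sketch of that bounded history (assuming a toy `GameState`; the `History` class and its method names are made up for illustration), a fixed-size deque of deep copies does the job:

```python
import copy
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional, Tuple

MAX_HISTORY = 8  # the post keeps up to 8 historical ticks


@dataclass
class GameState:  # hypothetical stand-in for the real game state
    tick: int = 0
    positions: Dict[str, Tuple[int, int]] = field(default_factory=dict)


class History:
    def __init__(self, max_ticks: int = MAX_HISTORY) -> None:
        # A deque with maxlen drops the oldest snapshot once it is full.
        self._snapshots: Deque[GameState] = deque(maxlen=max_ticks)

    def record(self, state: GameState) -> None:
        # Deep copy, so later changes to the live state can't alter history.
        self._snapshots.append(copy.deepcopy(state))

    def get(self, tick: int) -> Optional[GameState]:
        # Returns a copy of the snapshot for `tick`, or None if it's too old.
        for snapshot in self._snapshots:
            if snapshot.tick == tick:
                return copy.deepcopy(snapshot)
        return None

    def __len__(self) -> int:
        return len(self._snapshots)


history = History()
live = GameState()
for t in range(10):
    live.tick = t
    live.positions["p1"] = (t, 0)
    history.record(live)
```

After ten ticks only ticks 2 to 9 remain, so a rewind to tick 1 is no longer possible. Copying the whole state every tick is the simple option; a real game might use something cheaper.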

Secondly, each game tick remembers what inputs it used to advance to the next tick:

Diagram showing storing of inputs and using them to advance to new ticks

I store keypresses (whether optimistically predicted or real) and all other inputs that the game used to advance to the next tick.

And when we find out that we need to factor in a new input, it's just like traveling back in time in a movie² to branch off into a new canonical timeline:

Diagram showing resimulation of old tick states based on receiving a new input

The new canonical branch of the timeline uses the new input and the last-known-good tick to advance into a new set of ticks to work out what the present time is meant to look like - and all the ticks from the now-incorrect alternate timeline get thrown away.
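Here's a minimal sketch of that branch-and-resimulate step, assuming a toy game where the state is just each player's x position. `Timeline`, `advance`, and `correct` are hypothetical names. For brevity it skips pruning old ticks and doesn't re-predict later inputs after a correction, both of which a real implementation would need.

```python
from typing import Dict

State = Dict[str, int]   # hypothetical: player name -> x position
Inputs = Dict[str, int]  # hypothetical: player name -> movement (-1, 0 or +1)


def advance(state: State, inputs: Inputs) -> State:
    """Pure step function: builds a new state and never mutates the old one."""
    return {player: x + inputs.get(player, 0) for player, x in state.items()}


class Timeline:
    def __init__(self, initial: State) -> None:
        self.states: Dict[int, State] = {0: dict(initial)}  # state at each tick
        self.inputs: Dict[int, Inputs] = {}  # inputs used to leave each tick
        self.present = 0

    def step(self, inputs: Inputs) -> None:
        self.inputs[self.present] = dict(inputs)
        self.states[self.present + 1] = advance(self.states[self.present], inputs)
        self.present += 1

    def correct(self, tick: int, player: str, value: int) -> None:
        """A late input for `tick` arrived: patch it in, then resimulate
        from the last-known-good state up to the present."""
        self.inputs[tick][player] = value
        for t in range(tick, self.present):
            # Overwriting states[t + 1] throws away the now-incorrect timeline.
            self.states[t + 1] = advance(self.states[t], self.inputs[t])


timeline = Timeline({"a": 0, "b": 0})
for _ in range(3):
    timeline.step({"a": 1, "b": 0})  # "b" is predicted to be standing still
timeline.correct(1, "b", 1)          # we learn "b" actually moved during tick 1
```

The state at tick 1 is untouched (it was known good), while ticks 2 and 3 are replaced by the new canonical versions.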

This all happens in a 60th of a second, and the key to making it seamless(-ish) is that you're not supposed to see it happen.

Graphics-wise, for a few brief 60ths of a second you'll see the remote player running - then suddenly they snap into their correct position 3/60ths of a second into their jump arc; the incorrect state is so brief that you almost can't see it.

Rolling Back Side Effects

Thing is, in graphics land, you can take only the latest tick and just.. draw it. Easy. ³

Sound is more complicated because it is a "side effect": playing a sound from tick X has the side effect of, well, playing the sound, and that sound keeps playing in tick X+1 and so on.

Combine that with rollbacks forking off new timelines, and suddenly you have all sorts of issues to handle:

A sound played 100 milliseconds ago maybe shouldn't have been played?
Oops, in the new canonical timeline we had to play a sound 50 milliseconds ago, but we haven't played it at all!
In the now-incorrect alternative timeline we correctly played a sound - but now we gotta make sure not to double it up.
Uh oh, we stopped a sound from the past that we actually shouldn't have stopped.
We played that sound from over here, but it should have been played from over there.

So instead of actually playing sounds immediately, I have each historical tick also store which sounds it wants to play or stop: ⁴

Diagram showing storing plans of sound, which can be executed later.

We 'decouple' the decision making of whether to play/stop sounds (in the game's simulation logic) from the actual audible playing/stopping of a sound in the game's audio engine.

Then, we can avoid playing a sound until we know for suresies that there won't be new input arriving for that tick - so then we'll never have to "undo" playing a sound.
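A rough sketch of this "store the plan now, play it once confirmed" idea follows. The names `SoundCommand`, `DeferredSounds`, and `FakeAudioEngine` are all assumptions for illustration; the fake engine just logs what a real audio engine would do.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SoundCommand:  # hypothetical record of one sound decision
    action: str      # "play" or "stop"
    name: str        # e.g. "jump"


class FakeAudioEngine:  # stand-in for a real audio engine
    def __init__(self) -> None:
        self.log: List[str] = []

    def execute(self, command: SoundCommand) -> None:
        self.log.append(f"{command.action}:{command.name}")


class DeferredSounds:
    def __init__(self, engine: FakeAudioEngine) -> None:
        self.engine = engine
        self.plans: Dict[int, List[SoundCommand]] = {}
        self.next_tick_to_play = 0

    def set_plan(self, tick: int, commands: List[SoundCommand]) -> None:
        # Called by the simulation; resimulating a tick overwrites its plan.
        self.plans[tick] = list(commands)

    def flush(self, confirmed_tick: int) -> None:
        """Execute plans only for ticks whose remote inputs have all arrived."""
        while self.next_tick_to_play <= confirmed_tick:
            for command in self.plans.pop(self.next_tick_to_play, []):
                self.engine.execute(command)
            self.next_tick_to_play += 1


engine = FakeAudioEngine()
sounds = DeferredSounds(engine)
sounds.set_plan(0, [SoundCommand("play", "jump")])
sounds.set_plan(1, [SoundCommand("play", "land")])
sounds.flush(-1)         # nothing is confirmed yet, so nothing is audible
sounds.set_plan(1, [])   # a rollback decides tick 1 has no sound after all
sounds.flush(1)          # ticks 0 and 1 are now confirmed
```

Only the jump sound ever reaches the engine: the mispredicted landing sound was replaced before it became audible.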

Except... then all sounds from any tick are delayed by the time it takes for all remote players' inputs for that tick to reach you, and that's kind of noticeable:

Audio clip giving a loose impression of what it feels like when audio is delayed.
It's actually worse when you're playing, because we're used to our interactions making sounds immediately!

Instead, we can play sounds immediately, and when a rollback happens, we can "reconcile" the now-incorrect timeline's sound plans (which we've already actioned!) with the new canonical timeline's set of sound plans.

For example: if a sound was played that shouldn't have been played, I stop it, and if a sound needs to be played (and it hasn't already been played!⁵) then I can start it.
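At its core this reconciliation is a set difference. A minimal sketch, assuming each sound can be reduced to some hashable identity key (the `reconcile` function and the example keys are hypothetical):

```python
from typing import Hashable, Set, Tuple


def reconcile(
    already_played: Set[Hashable], should_play: Set[Hashable]
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Compare what the discarded timeline played with what the new
    canonical timeline wants, and work out the corrections."""
    to_stop = already_played - should_play   # played, but shouldn't have been
    to_start = should_play - already_played  # wanted, but never played
    return to_stop, to_start


old_timeline = {("footstep", 10, 4), ("jump", 12, 4)}
new_timeline = {("jump", 12, 4), ("splash", 20, 9)}
to_stop, to_start = reconcile(old_timeline, new_timeline)
```

The jump sound appears in both timelines, so it is left alone and keeps playing without being doubled up.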

It's still tricky! For example, for a sound meant to be played 3/60ths of a second ago, I don't fast forward that far into the sound - that might miss a key audible feature, so it still seems better for the sound to be "late" there. ⁶

But now there's no delay in playing sounds from your own interactions, even if you're playing with someone really far away.

Playable web build

You won't notice the new sound rollback stuff here in this singleplayer demo, but fear not, here's other new sound stuff:

Spatial audio works now: sounds are attenuated and panned appropriately.

Aside: There's surprisingly little info on handling sound spatialization for 2D games so maybe I should write about this too at some point?

Physics objects like barrels and ragdolls now make clanging and squishing collision noises.
When you're underwater, it now sounds like you're underwater: I use FMOD magic to apply a low-pass filter & reverb effect, and play a bubbly underwater sound too. (It's totally gratuitous polish, but it helped me learn some FMOD things. Also I think it's neat.)

Press F1 for help, including keyboard/mouse controls. Mobile devices probably won't work!

That said, sounds for atom interactions (like acid corroding, fire burning, etc - but also characters falling into water) are still conspicuously missing, as are spellcasting sounds.

¹

This old Ars Technica article on rollback networking goes into a lot more detail.

²

Or like how Marvel occasionally retroactively decides ("retcons") that superhero X isn't actually dead after all because folks still want to pay money to see them in theatres (or to maintain their Disney+ subscription).

³

Well, you can complicate it by trying to interpolate entity positions between their last-drawn position (possibly from a now-incorrect alternative timeline) and the current position. But I don't do that. (yet?)

⁴

In computer science parlance, this is like using a language's Effect System to make an otherwise side-effecting function actually be side-effect free: we make the game simulation be free of side-effects by having it build up a list of side effects that it would ideally like to have - and then dealing with that list at some later point.

⁵

Actually, storing which sound events to play and then being able to refer to them later is a tricky thing that took me a few technical design iterations: the game simulation can't rely on any state from the audio subsystem as it isn't going to be in sync between different players, but you also need to ensure that a sound played for a given jump will be identified as that same jump's sound if the game is rewound and played forward again... possibly with the jumping player positioned elsewhere! (And you need to prevent unbounded memory use too.)

Currently my approach is that when the game simulation wants to play a sound, it mints its own handle (current tick id + incrementing counter for that tick) for that sound, and the game simulation can reuse that handle to queue up requests to stop the sound or adjust its parameters (e.g. volume, mixing behavior, etc).
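A small sketch of what minting such handles could look like. `SoundHandle`, `HandleMinter`, and the explicit `begin_tick` call are assumptions for illustration, not the game's real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundHandle:
    tick: int   # tick in which the sound was requested
    index: int  # position of the request within that tick


class HandleMinter:
    def __init__(self) -> None:
        self._tick = 0
        self._next_index = 0

    def begin_tick(self, tick: int) -> None:
        # Called at the start of every simulated (or resimulated) tick, so
        # the counter restarts and the same requests get the same handles.
        self._tick = tick
        self._next_index = 0

    def mint(self) -> SoundHandle:
        handle = SoundHandle(self._tick, self._next_index)
        self._next_index += 1
        return handle


minter = HandleMinter()
minter.begin_tick(5)
first = minter.mint()
second = minter.mint()
minter.begin_tick(5)    # a rollback resimulates tick 5
replayed = minter.mint()
```

Because the handle depends only on simulation state (tick and request order), it never needs to consult the audio subsystem.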

But, from outside the game simulation, that's not enough to tell that a jump sound from timeline A is meant to be the same sound in timeline B! So I also store the name of the sound, and the spatial coordinates of the sound (bucketed into the nearest 4 pixels), and use that for this reconciliation.
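A sketch of such an identity key, assuming the bucketing is done by flooring coordinates to a 4-pixel grid (the exact rounding rule isn't stated above, and `sound_identity` is a made-up name):

```python
import math
from typing import Tuple

BUCKET_PX = 4  # bucket size in pixels, as mentioned above


def sound_identity(name: str, x: float, y: float) -> Tuple[str, int, int]:
    """Key used to match 'the same' sound across two timelines.
    Assumption: coordinates are floored into 4-pixel grid cells."""
    return (name, math.floor(x / BUCKET_PX), math.floor(y / BUCKET_PX))


before_rollback = sound_identity("jump", 101.0, 50.0)
after_rollback = sound_identity("jump", 102.5, 51.9)  # slightly different spot
elsewhere = sound_identity("jump", 104.0, 50.0)       # next bucket along x
```

With flooring, two positions that straddle a bucket boundary land in different buckets even when they are less than a pixel apart, so a match can still be missed occasionally.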

⁶

Also, currently I don't "re-play" a sound that was incorrectly stopped, or undo incorrect volume (or other parameter) changes: my hope is that these edge cases won't be noticeable in practice.