Saturday, February 24, 2024

Dare to Dream Again - Bringing Obsidian back to life - Part 4 of 4 (Mirror)

This is part 4 of a 4-part series, mirrored from the Gale Force Games page.

Part 1 is here.

Part 2 is here.

Part 3 is here.

Better than it was before.  Better. Stronger. Faster.

In the last installment, I discussed the process of getting Obsidian back to a point where it was, at least, no buggier than the original game.  That is still not quite the case: there is one bug that doesn't occur in the original version, which is inventory items getting swept across the screen during pan transitions.

Ironically, that bug isn't actually visible on modern machines running the original game because the pan transitions were all set to run at maximum rate, which is as fast as possible.  If you don't like the bug, you can always turn off the "Enable short transitions" option in the game options.  Today though, we're not here to talk about bugs, we are here to talk about making Obsidian in ScummVM the best Obsidian experience that ever Obsidianed.

The first topic, though, falls into both categories: MIDI.

A cacophony of bugs

Obsidian uses MIDI for music, played back through QuickTime.  This seems like an odd choice, given that Myst and most other games were using digital music, and QuickTime's tinny MIDI instruments were very limiting, but based on interviews, this may have been driven by Thomas Dolby's love of technology and MIDI in particular.  It sounds like a lot of the rationale was driven by a desire to have a dynamic music system, which didn't quite pan out.  Dynamic music is used in exactly two places in the full game: During the church puzzle, and after solving the statue puzzle (which cuts out the piano track).

Before we start, we have to talk about what MIDI is.  MIDI is an acronym for Musical Instrument Digital Interface, and it's designed to connect MIDI-compatible controllers (such as keyboards or programmable sequencers) to devices that consume MIDI commands, such as synthesizers, computers, and drum machines.

It was also commonly used in the early DOS days for various reasons related to being able to decode it using low-CPU-usage algorithms, or offloading it to sound hardware where it didn't use CPU at all, combined with its small size on disk, so ScummVM already had ample support for playing back MIDI - Sort of.

When you request a MIDI device from ScummVM, it might be a software backend like FluidSynth, it might be a physical MIDI playback device, it might be anything, but you only get one.  This is a problem because Obsidian often has multiple MIDI files playing at once.  The vidbot on/off sound for instance is a MIDI sound, and that plays simultaneously with the music.  There's also one even more surprising one which I'll get to in a moment.

This is a problem because how MIDI works is that there are several different "channels" each of which responds to a command, and a MIDI controller sends commands assigned to channels.  This allows a MIDI device to play multiple instruments at once by assigning instruments to different channels.

Because MIDI is a protocol designed for real-time use, commands that play notes do not have an associated duration; instead, there are separate commands for activating and deactivating a playing note.  In turn, the file format used for MIDI in Obsidian, known as SMF (Standard MIDI File), basically encodes commands to send to the MIDI device and what time to send them.
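As a rough illustration of the note on / note off split, here is a minimal Python sketch.  The status bytes (0x90 for note on, 0x80 for note off, with the channel in the low 4 bits) are standard MIDI, but the helper names and the list of (time, message) pairs are made up for this example, and the list is a simplification rather than the actual SMF byte layout.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a note-on message: status 0x90 plus the channel number."""
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """Build a note-off message: status 0x80 plus the channel number."""
    return bytes([0x80 | channel, note, 0])


# A middle C (note 60) held for one second on channel 0.  The note has no
# duration of its own: it lasts until the matching note-off is sent.
# (Hypothetical timed event list, in seconds, not real SMF delta-times.)
events = [
    (0.0, note_on(0, 60, 100)),
    (1.0, note_off(0, 60)),
]
```

If playback stops between the two events, the second message is never sent, which is exactly the stuck-note situation described next.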

However, this creates a problem: If we stop sending commands from a MIDI file while a note is playing, then we'll never send the command to stop playing the note.  When using QuickTime's software synthesizer, this isn't a problem, because it just stops running the software synthesizer for that MIDI source.  If we only have one continuously-active MIDI output though, then we can't do that.  Fortunately, most notes have finite duration anyway, but sometimes they don't, and for that, Obsidian has a gnarly hack: A MIDI file that sends an "all notes off" command to every channel in response to various commands, but also periodically on a 30-second timer.
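For reference, "all notes off" is a standard channel mode message: a control change (status 0xB0 plus the channel) with controller number 123 and value 0.  A sketch of what the note-stopper effectively sends, with a made-up function name, might look like this:

```python
ALL_NOTES_OFF = 123  # standard MIDI channel mode controller number


def all_notes_off_messages() -> list:
    """One 'all notes off' control change for each of the 16 MIDI channels."""
    return [bytes([0xB0 | channel, ALL_NOTES_OFF, 0]) for channel in range(16)]
```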

In fact, you may notice a moment in the intro where the music pauses dramatically before displaying the title - that pause is at exactly the 30 second mark, presumably timed exactly to prevent that music note-stopper from ruining the theme.

So, the good news is that despite not having a proper multi-source output, the game is still capable of functioning with a single-source output, as long as you're willing to tolerate abrupt cuts in the music every 30 seconds.

There is another problem: MIDI sources have individually-controlled volume, so how do we handle that?  Well, the simple way is that when a MIDI note is sent, it is sent with a "velocity" which roughly corresponds to the volume of the note, but really means something like "intensity."  It's not exactly a volume scale, because you might get more attack in the sound with higher velocity, for instance, with a higher-quality MIDI renderer.  But, it's close enough for what we need, so it was done by intercepting note on and note off commands, modulating the velocity, stuffing the new velocity back into the command, and sending it out.
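The velocity trick can be sketched roughly as follows.  This is an illustration in Python rather than the engine's actual code, and the function name and the clamping choices are assumptions: in particular, keeping a scaled note-on velocity at 1 or above is my own guard, because a note-on with velocity 0 is treated as a note-off by MIDI devices.

```python
def scale_velocity(message: bytes, volume: float) -> bytes:
    """Scale the velocity of note-on/note-off messages by volume (0.0 to 1.0).

    Any other message is returned unchanged.
    """
    if len(message) != 3:
        return message
    status, note, velocity = message
    kind = status & 0xF0
    if kind not in (0x80, 0x90):
        return message  # not a note message
    if kind == 0x90 and velocity == 0:
        return message  # note-on with velocity 0 already means note-off
    scaled = min(127, round(velocity * volume))
    if kind == 0x90:
        scaled = max(1, scaled)  # assumption: never turn a note-on into a note-off
    return bytes([status, note, scaled])
```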

Mixing things up a bit

Just because it works doesn't really mean it's good though, and the music periodically getting chopped is really annoying.  But, like I said, I like digging through file formats and standards almost as much as I hate having free time, so I decided to create the secret weapon to solve this problem: The dynamic MIDI mixer.  This fun feature is enabled if you enable the "Improved music mixing" option in the game options (it's on by default).

So, what does this actually do?  Well, basically we are going to implement multiple MIDI drivers that funnel into a single MIDI driver in a more intelligent way.  First, we have a way of tracking the full state of every single MIDI channel:

We track this for every channel of the output device and also every channel of each MIDI source.

Each MIDI source in the game is assigned to a MidiCombinerSource, which looks like a MIDI driver to the source but, in the dynamic music mixer, is a funnel into the combiner.

Aside from starting and stopping notes though, there are several other types of MIDI messages: There are controllers, which can alter the qualities of a channel in some way, there's the program (which affects what instrument to play), and also two features called "sustain" and "sostenuto."  So far, ScummVM doesn't support any games known to use sostenuto, but I wanted to get this right the first time.  Sustain and sostenuto are normally controlled by pedals, and if a pedal is holding a note, the note continues playing after its "note off" command until the pedal is released.  So, to track an active note, we need to track whether or not it is sustained as well.
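To make the sustain part concrete, here is a small sketch of per-channel note tracking.  The class and method names are invented for this example.  Controller 64 is the standard sustain pedal, with values of 64 and above meaning "down"; sostenuto (controller 66) is left out to keep the sketch short.

```python
class ChannelNotes:
    """Tracks which notes are still sounding on one channel (sustain only)."""

    def __init__(self):
        self.pedal_down = False
        self.held = set()       # notes whose key is currently down
        self.sustained = set()  # notes released while the pedal was down

    def note_on(self, note: int) -> None:
        self.held.add(note)
        self.sustained.discard(note)

    def note_off(self, note: int) -> None:
        if note in self.held:
            self.held.discard(note)
            if self.pedal_down:
                self.sustained.add(note)  # keeps sounding until pedal release

    def set_sustain(self, value: int) -> None:
        """Handle controller 64: values >= 64 mean the pedal is down."""
        self.pedal_down = value >= 64
        if not self.pedal_down:
            self.sustained.clear()

    def sounding(self) -> set:
        return self.held | self.sustained
```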

Internally, in order for anything to happen, a MIDI source channel has to be dynamically assigned to an available "physical" channel on the output device.  The way that works is: All channels are unassigned by default.  If a controller change happens, and the channel is unassigned, then it updates the MidiChannelState of the source channel, but otherwise does nothing.

When a note is played on an unassigned channel, the combiner tries to find an output channel that is the right channel type (i.e. one channel is typically reserved for percussion) and, among those, the one that stopped playing its last note the longest time ago.  If there is still an active source assigned to that channel, then that source channel is unassigned.  Then, the new source channel is assigned to that output channel, the channel state of the output channel and the source's channel are compared, anything that is different is adjusted by sending the necessary MIDI commands, and then the note is played.  If a source channel is still actively assigned to an output channel, then control commands are sent to the output immediately.
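The allocation rule can be sketched like this.  This is a simplified Python illustration, not the real combiner: the class names are invented, it only considers output channels with no notes currently playing (my assumption), and it skips the state comparison step entirely.  Channel 9 (channel 10 in 1-based numbering) is the conventional General MIDI percussion channel.

```python
from dataclasses import dataclass

PERCUSSION_CHANNEL = 9  # 0-based; General MIDI percussion convention


@dataclass
class OutputChannel:
    index: int
    owner: object = None                       # (source_id, source_channel)
    active_notes: int = 0
    last_note_off_time: float = float("-inf")  # never-used channels sort first


class Combiner:
    def __init__(self):
        self.outputs = [OutputChannel(i) for i in range(16)]
        self.assignment = {}  # (source_id, source_channel) -> OutputChannel

    def _pick_output(self, percussion: bool):
        idle = [o for o in self.outputs
                if (o.index == PERCUSSION_CHANNEL) == percussion
                and o.active_notes == 0]
        # Prefer the channel that has been silent the longest.
        return min(idle, key=lambda o: o.last_note_off_time) if idle else None

    def note_on(self, source_id, source_channel: int):
        key = (source_id, source_channel)
        out = self.assignment.get(key)
        if out is None:
            out = self._pick_output(source_channel == PERCUSSION_CHANNEL)
            if out is None:
                return None  # nothing suitable is free; drop the note
            if out.owner is not None:
                del self.assignment[out.owner]  # unassign the previous source
            out.owner = key
            self.assignment[key] = out
            # A real combiner would now compare channel states and send any
            # controller/program changes needed before playing the note.
        out.active_notes += 1
        return out.index

    def note_off(self, source_id, source_channel: int, now: float) -> None:
        out = self.assignment.get((source_id, source_channel))
        if out is None or out.active_notes == 0:
            return
        out.active_notes -= 1
        if out.active_notes == 0:
            out.last_note_off_time = now
```

With this, two sources that both use their own "channel 0" end up on different output channels instead of fighting over one.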

Because of this, the "all notes off" trigger actually doesn't do anything any more.  It updates the MIDI combiner source's channel state, but since nothing in that MIDI file ever plays a note, none of it ever goes to the output driver.

We can also use the MIDI gain control, which is an actual volume control, instead of having to modulate velocity, and since we know which output channels are being used by a source, we can completely quiet a source by just sending an "all sound off" command (which bypasses sustain) to silence the channel, and then deallocate it.
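Both of those are ordinary control change messages.  As a sketch (the function names are mine): channel volume is controller 7, and "all sound off" is controller 120 with value 0.

```python
CHANNEL_VOLUME = 7   # standard controller number, the "cc7" of the MIDI specs
ALL_SOUND_OFF = 120  # standard channel mode message, ignores sustain


def channel_volume(channel: int, value: int) -> bytes:
    """Control change that sets the channel volume (0 to 127)."""
    return bytes([0xB0 | channel, CHANNEL_VOLUME, value])


def silence_source(output_channels) -> list:
    """'All sound off' for every output channel a source currently occupies."""
    return [bytes([0xB0 | ch, ALL_SOUND_OFF, 0]) for ch in output_channels]
```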

La la la, I can't hear you!

Well that's unfortunate.  Individual sounds in Obsidian can have their own volume level, which is modulated by the global volume level, so setting the volume to the maximum doesn't necessarily mean that the MIDI source is set to the maximum possible volume.  As mentioned earlier, the vidbot sounds are MIDI, and by default their volume is set to 50%, but they do seem significantly quieter when using gain control vs. modulating the velocity.  What's going on here?

After a bunch of trawling around, it turns out the answer is in the General MIDI Level 2 specification at page 7: "gain in dB = 40 * log10(cc7/127)"

... What does that mean?  Unfortunately, I'm actually finding out that I did the fix wrong as I'm writing this, and need to fix it again!  "cc7" is the MIDI channel volume value.  The volume scale on MIDI sources is a linear scale, meaning a value half as large causes the amplitude to be halved.  Decibels (dB) are a logarithmic scale, meaning any 2 values the same distance apart differ by the same proportional magnitude.  One problem with this though is figuring out what we're measuring.  Many measurements in electrical and sound engineering are the square or square root of other measurements due to various physics interactions.  Decibels are a scale designed around factor-of-10 changes, but whether a 10x change is +10 or +20 depends on what quantity is being measured, due to that problem of quantities being squared.

What we want is the measure that the volume is supposed to be scaling, which is amplitude, which is on the +20 = 10x scale.


So converting normalized MIDI volume (a.k.a. the volume rescaled to a 0-1 scale) to modulation involves squaring it, so we need to figure out how to convert scaled modulation back into a new normalized volume.

... okay, great, so basically all we have to do to compute the new MIDI volume is scale the MIDI source volume to a 0-1 scale, and then multiply the original MIDI volume by the square root of that value.  Cool.  This means our vidbots playing at 50% volume are now a 0.7071x multiplier instead of 0.5.  Unfortunately, the original implementation of the decibel scale was wrong and was using a 4th-root, which I guess is better than it being too quiet, but still wrong.
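Spelled out as a small Python sketch (function names are my own): since 40 * log10(x) is the same as 20 * log10(x^2), the amplitude multiplier for a given cc7 is (cc7/127) squared, and scaling the amplitude by a source volume s means scaling cc7 by the square root of s.

```python
import math


def gm2_gain_db(cc7: int) -> float:
    """Gain in dB from the General MIDI Level 2 formula (requires cc7 > 0)."""
    return 40 * math.log10(cc7 / 127)


def amplitude(cc7: int) -> float:
    """Amplitude multiplier implied by that formula: (cc7 / 127) squared."""
    return (cc7 / 127) ** 2


def scaled_channel_volume(cc7: int, source_volume: float) -> int:
    """New cc7 whose amplitude is source_volume (0 to 1) times the original."""
    return max(0, min(127, round(cc7 * math.sqrt(source_volume))))
```

For example, a source at 50% volume turns a cc7 of 127 into 90, which works out to an amplitude of roughly half the original.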

Anyway, it's fixed-er now!

I've got 32 problems and my bit depth isn't one

Most monitors today try to run games in 32-bit 8-bits-per-channel (and 8 bits of waste to help with memory alignment) color mode, or sometimes HDR if the game supports it.  Attempting to run Obsidian in 32-bit mode will greet you with this error though:

Obsidian was released at a time when 16-bit color depth (or "thousands of colors" as it was called on Mac) was fairly new.  Most things ran in 8-bit depth with a color lookup table, so 16-bit was fairly cutting edge.

All of Obsidian's images are 16-bit images though, so we're not really losing anything by running with a 16-bit render target, are we?  Actually, we are.  The images are 16-bit.  The videos are not.  The videos are encoded with Cinepak, which in full-color mode can render out to 8 bits per channel, and even if the input images to Cinepak were in 16-bit color, the averaging out that occurs during the encoding process (and from the YUV-to-RGB perceptual transform) yields more accuracy per color channel.  So, we want to render in 32-bit.

That's accomplished by having an override that just lies to the scripts about what color depth the game is being run at.

A more cinematic experience

When Obsidian came out, most desktop monitors were 640x480, which is a 4:3 aspect ratio.  Most displays today are 16:9 widescreen.  However, the images and videos in Obsidian are all 640x360, also 16:9.  Well that's pretty neat.  It would be cool to play the game in widescreen and get rid of the letterboxing on both sides of the monitor, wouldn't it?

Okay, let's just offset the game frame up 60 pixels, cut down the resolution, and lie to the game about the resolution it's running at so it doesn't throw a startup error!  That's a great start, but what about the inventory items that display below the frame?

There's an internal system for overriding object behaviors, which can somewhat handle this.  It's more complicated than it could be, because with how Obsidian is broken up, sometimes you carry the item into different sections and subsections, which means the elements that display them are duplicated.

Fortunately, the inventory items are color-keyed already even though they're on a black background, so I didn't need to do anything to make them not render a black box outline, but they did run into a problem with the security survey in the maze.


Uh oh, the keycards are overlapping the survey, and I really need to let him know that I eat my ice cream straight from a bowl!  Well I guess we could just change the layer order so the form's on top of the cards, right?  Actually, no, because the security form image isn't just the security form, it also includes a bit to the left.

Ultimately, this was resolved by detecting this specific situation.  When the security form is displayed, the cards are moved off-screen, and moved back when the security form is dismissed.  Now I can eat my ice cream in peace.

There was one last problem in widescreen mode:


... the Rocket Science Games logo is too big!  Now you might be thinking, "so what, that company hasn't existed for 25 years, who caaaaaares!?"  Unfortunately, I care, so we need to fix this!  But what can we do?  Shrink it?  But then we get letterboxing on the sides again, and that doesn't look nice!  What we need, which you may have seen if you watch old TV shows re-broadcast on widescreen, is an anamorphic filter.

An anamorphic filter works by stretching out the sides of the image more than the center areas.  This was done by computing an exponential curve that has a derivative of 0 at the point where it's supposed to stop (meaning, basically, the rate of pixel coordinate change becomes normal where the curve tops, preventing a noticeable seam), and applying that to the pixel grid.  Here's what the filter applied to an 8x8 grid pattern looks like:

And here's the filter applied to the logo video:

Much better!

Forgetting to save is half the adventure

While practically a foreign concept these days, checkpoints weren't always a thing.  Like most games of its time, if you wanted to keep your progress, you had to save the game.  The game even reminds you to save when you try to quit!  Having auto-save would be really nice though, wouldn't it?

The auto-save feature works partly on a timer, like most ScummVM games, but there's an option (enabled by default) to also auto-save at progress points.  Most puzzle solutions in Obsidian set some variable, but you aren't always allowed to save, so this creates a bit of a problem: Finding how to detect puzzle solutions, and then finding a safe place to save.

Auto-save detection is done in two ways: One way is by detecting arrival in a specific scene while coming from a specific other scene, which is used to detect things like chapter transitions, but also things like beating the maze without the proper document.  The other way is detecting arrival in a specific scene with a puzzle completion variable set differently than what it was the last time the game was loaded or restarted.  Normally, the latter category is done by triggering in the scene you would wind up in after the puzzle.
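The two detection rules could be sketched like this.  The class, scene names, and variable names below are all hypothetical, and this is just an illustration of the logic in Python, not the engine's actual code.

```python
class AutoSaveDetector:
    def __init__(self, transition_triggers, variable_triggers, variables_at_load):
        # Set of (previous_scene, new_scene) pairs that always trigger a save.
        self.transition_triggers = set(transition_triggers)
        # Maps an arrival scene to the puzzle variable to check there.
        self.variable_triggers = dict(variable_triggers)
        # Snapshot of puzzle variables taken when the game was loaded/restarted.
        self.baseline = dict(variables_at_load)

    def should_autosave(self, previous_scene, new_scene, variables) -> bool:
        # Rule 1: arriving in a specific scene from a specific other scene.
        if (previous_scene, new_scene) in self.transition_triggers:
            return True
        # Rule 2: arriving in a scene whose puzzle variable differs from
        # what it was at the last load or restart.
        name = self.variable_triggers.get(new_scene)
        if name is not None:
            return variables.get(name) != self.baseline.get(name)
        return False


detector = AutoSaveDetector(
    transition_triggers=[("chapter1_end", "chapter2_start")],  # hypothetical
    variable_triggers={"after_puzzle": "puzzle_solved"},       # hypothetical
    variables_at_load={"puzzle_solved": False},
)
```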

There is a minor omission in this scheme: If you complete a puzzle, save the game, then reload it, the autosave won't trigger... but in that case, you've pretty much saved right there anyway, so who cares?

I'm playing a game, not a menu!

ScummVM shows screenshots of the game where you saved it in its save game UI, but there's one problem: If you save the game from the in-game UI, then you're not looking at the game, you're looking at a menu.  So, some hooks had to be added to take a screenshot before transitioning to the menu and using that screenshot for the save instead.

Some light reading

People seem to like subtitles, so why not add them too?  Well, they were added, and you can download them from ScummVM's add-ons page.

Subtitles mostly work by detecting when a specific sound asset is activated and popping up the subtitles.  There is however an option for a very small number of subtitles:

Without spoiling too much, there's one puzzle that involves sound, but popping up subtitles for the sound at all kind of gives away the answer, and it's one of the neatest puzzles in the game.  So, depending on what your rationale is for enabling subtitles, you can keep this option off if you want to have subtitles, and can hear sound, but don't want to spoil the challenge.

Overall, there are 686 voice lines in the game.  Deciding when to split a subtitle up, and where to split the lines, was an ongoing challenge that I lack the expertise to do, but I did the best I could.

In some cases, this involved getting help for figuring out lines I couldn't make out.  I'd never actually heard the term "lousing up" before in my life.  Also, I added speaker names to the subtitles, but in some cases, we don't really know the names of these characters and just have to guess.

The identity of the characters also clearly has some story implications, but without knowing for sure who they are, that has to be danced around, a problem made more difficult by Obsidian's casting.  While some of the characters are professional actors, many of them (especially the vidbots) are Rocket Science employees.

The bureau chief, for instance, is almost certainly Howard Cushnir, who also appears in a celebration video, and I think is also a character in a chapter intro cinematic, but it's not clear if that's because it's the same character, the same person playing multiple characters, or if I'm mistaken and it's not even the same person.  Adding to this, the brief appearance in a cinematic is as Max's teacher, but he appears to be the project administrator in the journal.  (At least, that seems to be the implication - that his appearance as the chief in the bureau realm is a reflection of his bureaucratic authority status in the real world.)  It would be odd for that to be the same person, then, but maybe he's a graduate professor at a research university and it is the same character?  There's no way to tell.

Another amusing case was the eye test in the bureau maze.  The voiceover there drops to a soft, barely audible level, and it's supposed to be a gag that it's unintelligible, so he tells you to go to the hearing test booth (which sends you back for an eye test).  The problem is that it is intelligible if you isolate the sound and turn the volume up, so the intent is that it's unintelligible, but there are actual coherent words in the line.  So, should the line show the words, or something like "<Unintelligible>" to keep the gag?  I kind of split the difference.

Eventually, it's time to move on

Doing this involved a lot of testing.  In many cases, I found things in the logic that looked like they caused bugs, confirmed that they caused bugs, and verified that the same bugs occurred in the original game.  That type of bug is much harder to fix, and ultimately only one of them was (a progression blocker that occurs if you save before logging into the journal).

The inventory panning bug is the only bug left that hasn't been fixed, and due to how the widescreen mod works, giving it a proper fix is difficult.  In non-widescreen mode, the fix is to only pan the part of the screen with main-scene elements.

In the end though, the two most important lessons I've learned in life are that shipping beats perfection, and motivation is a finite resource.  Working on this was always a race against burnout, and eventually it started reaching the point where it was hard to justify any further work vs. moving on to other things.  Eventually it's time to say "it's good, and that's good enough," stick a fork in it, and move on to other things.

The future of the mTropolis engine in ScummVM

mTropolis was used to ship a few dozen titles, and a few of them have been on the list as possible additions: Muppet Treasure Island, S.P.Q.R. - The Empire's Darkest Hour, Star Trek: The Game Show, and MindGym.  The first one is done and was added in ScummVM in the 2.8.0 release, the next two I have copies of.  S.P.Q.R. is the next one on the to-do.

However, as mentioned above, motivation is a finite resource, and unlike some members of the team, I don't really approach things with nostalgia, or as a historian that thinks it's not their place to judge the quality of things from the past.  My goal is to save valuable things from oblivion, and the further down the to-do list I go, the more questionable that "value" gets, in my opinion.  So, we'll see what happens, and when it happens, but that's the plan for now.

Jank beyond the dream worlds

It wouldn't be a good rant if I didn't leave you with some tales from other mTropolis games, would it?  Muppet Treasure Island has numerous hacks to deal with duplicated aliased compound variables needing to be linked up in a way that I still haven't made any sense of, and music doesn't work in S.P.Q.R. right now because it depends on sending messages to objects in a scene that is never loaded.

Also, I finally figured out what those extra 8 bytes in the catalog header are: Eventually mFactory realized that it was a problem to have separate formats for Mac and Windows, so they decided to make a cross-platform format.  Does the cross-platform format work by being in a common format that works on both platforms?  Of course not.  It exports the Mac and Windows versions into the same file, and de-duplicates the asset data.

That's all for now.  See you in the future, somewhere.

~~Fin~~
