VOGONS



Reply 80 of 122, by Jepael

Rank Oldbie

And sure, there are GUS MOD players for the 286. But the GUS is really a 386/486-era thing anyway.

Even with SB 1.x, the DAC could be better used for 8-bit PCM sound effects instead of using CPU horsepower for mixing even four channel music into 8-bit PCM.

Scali wrote:

Sadly it's not quite 'free' on a low-end system. The original OPL2 responded very slowly to commands, so you had to add in lots of delays (Yamaha recommended no less than 35 'dummy' reads after doing a write, to get the chip to settle). On a slow PC that means if you play music that updates every frame, you could end up spending about half your frame time updating the OPL2 registers.

Yes it's slow, but not that slow. The amount of dummy reads depends on the speed of the machine of course, there is no need to do 35 dummy reads on a 4.77 MHz 8088.

Based on my calculations, you could write more than 37000 registers per second, meaning writing all the (about) 120 sound registers would take 3.2 milliseconds, which does sound like a lot. But surely in any normal conditions, there's no need to write all 120 registers each frame 😀
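
As a rough check on those numbers, here is a small Python sketch. It assumes (this is not stated in the thread) that the OPL2 needs 12 master-clock cycles after an index write and 84 after a data write, with the chip clocked at 3.579545 MHz. The function names are made up for illustration, and CPU overhead is ignored.

```python
# Back-of-the-envelope OPL2 register write throughput.
# Assumption (not from this thread): 12 master-clock cycles of wait after an
# index write and 84 after a data write, with the chip at 3.579545 MHz.
OPL2_CLOCK_HZ = 3_579_545
INDEX_WAIT_CYCLES = 12
DATA_WAIT_CYCLES = 84


def writes_per_second(clock_hz=OPL2_CLOCK_HZ):
    """Upper bound on register writes per second, ignoring CPU overhead."""
    return clock_hz / (INDEX_WAIT_CYCLES + DATA_WAIT_CYCLES)


def time_for_registers_ms(count, clock_hz=OPL2_CLOCK_HZ):
    """Milliseconds needed to write `count` registers back to back."""
    return 1000.0 * count / writes_per_second(clock_hz)


if __name__ == "__main__":
    print(f"{writes_per_second():.0f} writes/s")
    print(f"{time_for_registers_ms(120):.2f} ms for 120 registers")
```

Under those assumptions it comes out at a little over 37000 writes per second, or about 3.2 ms for 120 registers.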

Reply 81 of 122, by Jepael

Rank Oldbie
SaxxonPike wrote:

I had no idea. That's kind of obnoxious for a delay. I wonder if one might be able to stagger some of those commands (especially for simultaneous notes and program changes) to reduce the impact that programming the OPL2 has on timing. Or maybe interleave writing registers with some other computation on the CPU in between - surely that number of reads isn't required, only the delay, correct?

Only a long enough delay is required, which means not accessing the chip at all during that time, so no staggering of register writes, for example to FM channels 1 and 2.

Sure you can do other stuff while waiting. But it would be hard to come up with an idea of what to do and how to do it for about 23 microseconds at a time, especially when the number of registers to write varies from one sound player advance call to the next.

Reply 82 of 122, by Scali

Rank l33t
Jepael wrote:

Yes it's slow, but not that slow. The amount of dummy reads depends on the speed of the machine of course, there is no need to do 35 dummy reads on a 4.77 MHz 8088.

Pretty sure it's not.
That is, port reads are more or less 'constant time'. That's why they can be used for delay, even on fast systems. They take the time of a 'classic' ISA bus cycle (4 clks at 4.77 MHz).
35 was literally what was in the programming manual of the AdLib in 1987, when 4.77 MHz 8088 systems or Turbo XTs were still commonplace.
Yes, an 8088 at 4.77 MHz could get away with slightly less than 35 reads perhaps, because it takes more time to decode the instructions... But how are you going to calibrate your code?
All replay routines I've seen for AdLib just bang the registers 35 times (unrolled).
Anyway, bottom line is that there's quite a bit of delay on register updates on AdLib, in absolute time. You can't 'optimize' it beyond the capabilities of the OPL2 chip. It just hurts slow systems more than fast systems.
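
To make the cost concrete, here is a minimal Python model of that kind of write routine. `FakeBus` is a hypothetical stand-in for real port I/O so the sketch can run anywhere. Ports 0x388/0x389 are the usual AdLib addresses, the 35 reads after the data write is the figure quoted above, and the 6 reads after the index write is an assumption taken from commonly cited AdLib programming notes.

```python
# Model of the "bang the port N times" delay. On DOS the inb/outb calls
# would be IN/OUT instructions; FakeBus only records what happened.
ADLIB_INDEX_PORT = 0x388   # index when written, status when read
ADLIB_DATA_PORT = 0x389
INDEX_DELAY_READS = 6      # assumed, not stated in the thread
DATA_DELAY_READS = 35      # figure quoted in the thread


class FakeBus:
    """Hypothetical port bus that counts reads and stores register values."""

    def __init__(self):
        self.registers = {}
        self.selected = None
        self.reads = 0

    def outb(self, port, value):
        if port == ADLIB_INDEX_PORT:
            self.selected = value
        elif port == ADLIB_DATA_PORT:
            self.registers[self.selected] = value

    def inb(self, port):
        self.reads += 1
        return 0


def opl2_write(bus, reg, value):
    """Select a register, wait, write the data byte, wait again."""
    bus.outb(ADLIB_INDEX_PORT, reg)
    for _ in range(INDEX_DELAY_READS):
        bus.inb(ADLIB_INDEX_PORT)
    bus.outb(ADLIB_DATA_PORT, value)
    for _ in range(DATA_DELAY_READS):
        bus.inb(ADLIB_INDEX_PORT)
```

Each register write costs 41 port reads in this model, regardless of how fast the CPU is.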

The problem is that the delay is long enough to be annoying, and too short to try to do much in the sense of 'clever' stuff, as in putting other processing in between, or doing register updates on a timer interrupt or such.

One thing that might work is what they sometimes did on C64 music as well: update only one channel per frame. So you get 'staggered' channel updates. This might throw the timing off by a bit, but usually too short to notice. It also loses a bit of detail, because some updates get 'lost', because they never get the time to be sent to the chip. But if you compose inside these limitations, it's quite usable.
C64 only has 3 channels, perhaps for AdLib it'd be better to update in groups of channels, eg 3 at a time, but the idea remains valid.
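
A rough Python sketch of that staggering idea, with 9 channels updated in groups of 3. All names are hypothetical, and `write` is whatever routine actually talks to the chip. Queued values for the same register overwrite each other, which is where the 'lost' updates come from.

```python
# Staggered updates: each frame flushes only one group of channels.
CHANNELS = 9      # OPL2 melodic channels
GROUP_SIZE = 3    # channels serviced per frame (assumed grouping)


def queue(pending, channel, reg, value):
    """Remember the latest value for a register; older values are dropped."""
    pending.setdefault(channel, {})[reg] = value


def channels_for_frame(frame):
    """Channels allowed to touch the chip on this frame (round-robin)."""
    groups = CHANNELS // GROUP_SIZE
    start = (frame % groups) * GROUP_SIZE
    return list(range(start, start + GROUP_SIZE))


def flush_frame(frame, pending, write):
    """Send the queued writes for this frame's channel group only."""
    for channel in channels_for_frame(frame):
        for reg, value in pending.pop(channel, {}).items():
            write(reg, value)
```

With this layout every channel gets serviced once every three frames, so the worst-case register traffic per frame drops to roughly a third.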

http://scalibq.wordpress.com/just-keeping-it- … ro-programming/

Reply 83 of 122, by beastlike

Rank Member

Could just go PC Speaker - there are some classics that have some awesome PC speaker soundtracks and sound effects. Programming is simple, and it should put little strain on even the earlier processors.

Reply 84 of 122, by Jepael

Rank Oldbie
Scali wrote:
Jepael wrote:

Yes it's slow, but not that slow. The amount of dummy reads depends on the speed of the machine of course, there is no need to do 35 dummy reads on a 4.77 MHz 8088.

Pretty sure it's not.
That is, port reads are more or less 'constant time'. That's why they can be used for delay, even on fast systems. They take the time of a 'classic' ISA bus cycle (4 clks at 4.77 MHz).
35 was literally what was in the programming manual of the AdLib in 1987, when 4.77 MHz 8088 systems or Turbo XTs were still commonplace.

Yes, an 8088 at 4.77 MHz could get away with slightly less than 35 reads perhaps, because it takes more time to decode the instructions... But how are you going to calibrate your code?
All replay routines I've seen for AdLib just bang the registers 35 times (unrolled).
Anyway, bottom line is that there's quite a bit of delay on register updates on AdLib, in absolute time. You can't 'optimize' it beyond the capabilities of the OPL2 chip. It just hurts slow systems more than fast systems.

I recall reading they had to increase the number of IO reads for faster machines, but can't find a better source now than this: http://www.oldskool.org/guides/oldonnew/sound

So 84 cycles of 3.579 MHz is same as 112 cycles of 4.773 MHz, and if IO read is 4 clks, that's only 28 reads instead of 35. Yeah, not a huge difference though.
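
The same conversion as a small Python sketch. It assumes both clocks are derived from a 14.31818 MHz crystal (divided by 3 for the 8088, by 4 for the OPL2), which is what makes the ratio come out exact. The 4 clocks per port read is the figure from the discussion above, and the function names are made up.

```python
import math
from fractions import Fraction

BASE_CLOCK_HZ = 14_318_180                # assumed master crystal
CPU_CLOCK = Fraction(BASE_CLOCK_HZ, 3)    # ~4.773 MHz 8088
OPL2_CLOCK = Fraction(BASE_CLOCK_HZ, 4)   # ~3.579 MHz OPL2
CLOCKS_PER_PORT_READ = 4                  # bus cycle length used above


def cpu_clocks_for(opl_cycles):
    """CPU clocks that elapse during the given number of OPL2 clock cycles."""
    return opl_cycles * CPU_CLOCK / OPL2_CLOCK


def dummy_reads_needed(opl_cycles):
    """Port reads needed to cover the wait, if a read is only a bus cycle."""
    return math.ceil(cpu_clocks_for(opl_cycles) / CLOCKS_PER_PORT_READ)
```

`dummy_reads_needed(84)` gives 28, and any instruction overhead on top of the bare bus cycle would only lower the count further.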

Monkey Island did calibrate the delay using a calibration loop. Something along the lines of counting how many push/pop operations fit in some period between timer interrupts, and then some math to come up with a magical value for how many loops of push/pop should be done between port writes.

Too error prone, and I think they got the index and data delay periods the wrong way around, but somehow it still works on every machine I've ever tried it on. Who knows if it's slower than it should be.

Scali wrote:

The problem is that the delay is long enough to be annoying, and too short to try to do much in the sense of 'clever' stuff, as in putting other processing in between, or doing register updates on a timer interrupt or such.

One thing that might work is what they sometimes did on C64 music as well: update only one channel per frame. So you get 'staggered' channel updates. This might throw the timing off by a bit, but usually too short to notice. It also loses a bit of detail, because some updates get 'lost', because they never get the time to be sent to the chip. But if you compose inside these limitations, it's quite usable.
C64 only has 3 channels, perhaps for AdLib it'd be better to update in groups of channels, eg 3 at a time, but the idea remains valid.

Yes, it's just best to waste the IO wait time for simplicity. Clever thought on staggering the updates, but wouldn't it require a far higher timer rate than just locking on to the vsync period? I mean, if the original code is called once per vsync, using say 3 channel groups would require a 3*vsync timer, and thus notes on the same row could be 5 to 11 ms apart if the vsync rate is 60Hz. So perhaps the timer rate should be much higher to alleviate that. Wolf3D uses 700Hz OPL IMF music (286 required), Monkey Island 473 Hz for OPL MIDI music (8088 playable).

Another issue is the overhead the timer interrupt takes.

Reply 85 of 122, by Scali

Rank l33t
Jepael wrote:

Monkey Island did calibrate the delay using a calibration loop. Something along the lines of counting how many push/pop operations fit in some period between timer interrupts, and then some math to come up with a magical value for how many loops of push/pop should be done between port writes.

Perhaps they were also 'lucky' that most people had a clone with OPL3 rather than a real OPL2 😀
For OPL3, the delays aren't required.

Jepael wrote:

Yes, it's just best to waste the IO wait time for simplicity. Clever thought on staggering the updates, but wouldn't it require a far higher timer rate than just locking on to the vsync period? I mean, if the original code is called once per vsync, using say 3 channel groups would require a 3*vsync timer, and thus notes on the same row could be 5 to 11 ms apart if the vsync rate is 60Hz.

Yea, that's the whole idea.

Jepael wrote:

So perhaps the timer rate should be much higher to alleviate that. Wolf3D uses 700Hz OPL IMF music (286 required), Monkey Island 473 Hz for OPL MIDI music (8088 playable).

To be honest, I have no idea why they'd choose such high update rates. Sounds like whoever designed that was on crack 😀
SID music generally updates at 50 Hz (okay, there's some 'multispeed' stuff that updates multiple times per frame, but that's not generally used in games, aside perhaps from title screens and such, for the simple reason that it takes too much CPU time, and messes up routines that have to be synced to screen position).
MOD music updates at 50 Hz.
Edlib updates at 50/60/70 Hz depending on your choice.

So why would you possibly go a factor 10 above that? I don't really see what it would add musically, aside perhaps from some really esoteric super-fast instrument parameter updates to trigger weird effects that you can't pull off at 60 Hz.

I mean, I can somewhat understand MIDI, since it is basically 'free tempo', it just records music data in realtime, never designed to run inside a game engine, let alone synced to a display refresh rate. The MIDI standard has a ~31 kbps bandwidth, so theoretically you could have pretty high resolution.
But I don't think MIDI is a good choice in terms of efficiency.

Jepael wrote:

Another issue is the overhead the timer interrupt takes.

Yup, more than 3-4 timer interrupts per frame is a no-no on 8088-class PCs. The interrupt overhead is really high.
I've been experimenting with background replay routines for MODs, and at 8 kHz you lose more than 50% of your CPU time to just playing the samples.
8 kHz would be 8000/60 = 133 interrupts per frame. Each interrupt takes more than a scanline worth of time.
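
The arithmetic, as a quick Python sketch. The 262 scanlines per frame is an assumption (NTSC-style CGA timing), not something stated here, and the names are just for illustration.

```python
SAMPLE_RATE_HZ = 8000
FRAME_RATE_HZ = 60
SCANLINES_PER_FRAME = 262   # assumption: NTSC-style CGA frame


def interrupts_per_frame(rate_hz=SAMPLE_RATE_HZ, frame_hz=FRAME_RATE_HZ):
    """How many timer interrupts land inside one video frame."""
    return rate_hz / frame_hz


def cpu_fraction(scanlines_per_interrupt=1.0):
    """Share of a frame used if each interrupt costs this many scanlines."""
    used = interrupts_per_frame() * scanlines_per_interrupt
    return used / SCANLINES_PER_FRAME
```

At exactly one scanline per interrupt, `cpu_fraction()` is already just over 0.5, which lines up with losing more than half the CPU.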


Reply 86 of 122, by Jepael

Rank Oldbie
Scali wrote:
Jepael wrote:

So perhaps the timer rate should be much higher to alleviate that. Wolf3D uses 700Hz OPL IMF music (286 required), Monkey Island 473 Hz for OPL MIDI music (8088 playable).

To be honest, I have no idea why they'd choose such high update rates. Sounds like whoever designed that was on crack 😀

IMF music files are just register dumps generated from original MIDI songs, so like you said, "free tempo". 700 Hz is also a multiple of the 70Hz frame rate, some multiple of the PC speaker sound effects rate, a sub-multiple of the 7000Hz sampling rate (for feeding about ten samples per tick into the DSS FIFO), and enough to play the sign bit of PCM samples to get some kind of resemblance of speech from the speaker. It might not do everything on each timer tick. That sort of crack 😀

The higher the rate, the less deviation there is from the actual time something should happen, like a note trigger. MI actually runs the tempo accumulator at 473 Hz, so it might call the playback routine only every other tick if the tempo is 128, or if the tempo is 255 then 255 out of 256 interrupts run the playback. So the granularity is about 2.1ms and the tempo "jitter" is never more than 4.1ms. Good enough.
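
A Python sketch of how such a tempo accumulator could work. This is a guess at the mechanism based on the 128 and 255 examples (an 8-bit accumulator whose overflow triggers playback), not Monkey Island's actual code, and `playback_ticks` is a hypothetical name.

```python
TIMER_HZ = 473                      # interrupt rate mentioned above
TICK_MS = 1000.0 / TIMER_HZ         # about 2.1 ms between interrupts


def playback_ticks(tempo, interrupts):
    """Count how many of `interrupts` timer ticks run the playback routine."""
    accumulator = 0
    calls = 0
    for _ in range(interrupts):
        accumulator += tempo
        if accumulator >= 256:      # 8-bit overflow, i.e. the carry flag
            accumulator -= 256
            calls += 1              # the playback routine would run here
    return calls
```

`playback_ticks(128, 256)` returns 128 (every other tick) and `playback_ticks(255, 256)` returns 255, matching the two cases described.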

Reply 87 of 122, by Scali

Rank l33t
Jepael wrote:

IMF music files are just register dumps generated from original MIDI songs

So that's where their problem is. MIDI sucks for games, as I already said.
Guess they couldn't be bothered to build a proper music tool.

Jepael wrote:

and enough to play sign bit of PCM samples to get some kind of resemblance of speech from speaker.

I hope they never tried to do that, that's got to be the most horrible idea ever. Thank god for RealSound 😀

Jepael wrote:

The higher the rate, the less deviation there is from the actual time something should happen, like a note trigger. MI actually runs the tempo accumulator at 473 Hz, so it might call the playback routine only every other tick if the tempo is 128, or if the tempo is 255 then 255 out of 256 interrupts run the playback. So the granularity is about 2.1ms and the tempo "jitter" is never more than 4.1ms. Good enough.

Yea, but if you design your music format to be based on the framerate, you don't have that problem in the first place. Music will only update at the framerate by definition.
So no need to do all sorts of kludges with high-frequency timer interrupts etc that won't really work on anything lower than a 286 anyway.
As I say, it works perfectly on eg C64 and Amiga, and various other systems. I guess it only shows the lack of skill and dedication on the PC side of things.


Reply 88 of 122, by K1n9_Duk3

Rank Member
Scali wrote:
Jepael wrote:

and enough to play sign bit of PCM samples to get some kind of resemblance of speech from speaker.

I hope they never tried to do that, that's got to be the most horrible idea ever. Thank god for RealSound 😀

They actually did in Wolf3D and SoD. There's just no menu option to enable that.

https://www.youtube.com/watch?v=1BtlsjJRnFU

Reply 89 of 122, by Scali

Rank l33t
K1n9_Duk3 wrote:
Scali wrote:
Jepael wrote:

and enough to play sign bit of PCM samples to get some kind of resemblance of speech from speaker.

I hope they never tried to do that, that's got to be the most horrible idea ever. Thank god for RealSound 😀

They actually did in Wolf3D and SoD. There's just no menu option to enable that.

https://www.youtube.com/watch?v=1BtlsjJRnFU

My point exactly, painful 😀


Reply 90 of 122, by VileR

Rank l33t

Easy to see why they removed it - 1-bit audio at these sample rates naturally sounds horrible. Then again it's a valid tribute to the original Castle Wolfenstein, which did its sfx the same way. 😀

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 91 of 122, by SaxxonPike

Rank Member
Scali wrote:

Yea, but if you design your music format to be based on the framerate, you don't have that problem in the first place. Music will only update at the framerate by definition.
So no need to do all sorts of kludges with high-frequency timer interrupts etc that won't really work on anything lower than a 286 anyway.
As I say, it works perfectly on eg C64 and Amiga, and various other systems. I guess it only shows the lack of skill and dedication on the PC side of things.

On C64 you don't need to delay between register writes to the SID though. High frequencies lead to the routine doing a whole lot of nothing, sure, but there's enough time between them that register write delays are just a detail of the hardware that becomes insignificant if you only do one write per tick.

Sound device guides:
Sound Blaster
Aztech
OPL3-SA

Reply 92 of 122, by Scali

Rank l33t
SaxxonPike wrote:

On C64 you don't need to delay between register writes to the SID though.

That wasn't the point though. The point was that on a slow CPU, interrupt overhead is quite significant. Therefore you want to avoid having more than 1-2 interrupts per frame.

SaxxonPike wrote:

High frequencies lead to the routine doing a whole lot of nothing, sure, but there's enough time between them that register write delays are just a detail of the hardware that becomes insignificant if you only do one write per tick.

The problem is that it's simply not an option on an 8088. I don't think you understand that.
If you want to target an 8088, you simply can't use a music system like in Wolfenstein 3D, which runs at 700 Hz, because it'd eat up lots of precious CPU time.
It'd probably be cheaper to do updates once per frame, and do some delays between writes (how many writes are you really going to do every frame?) than to fire off lots of interrupts that may or may not update a register.
Just firing an interrupt and doing an iret directly is already quite expensive on an 8088. If you need to also switch to your song data, check if you need to update anything, you'd also need to save and restore a bunch of registers, and the cost adds up very quickly. Doing that at 700 Hz is going to kill your CPU, even if it's not actually playing any note.


Reply 93 of 122, by keenmaster486

Rank l33t

OK guys, here's what I'm thinking on this:

We should definitely not do MOD music if we want to target 8088/286 systems. For 386 it might be OK, but still, the low CPU overhead of OPL2/3 register dumps sounds pretty good.

I'm thinking IMF files. But how much CPU time would we gain if we moved to, for instance, 280 Hz (half the 560 Hz rate used for Keen, which is too slow on 8088 anyway)? And I don't even know if we want to target 8088; I was thinking the minimum expected requirement would be 286/386/VGA.

World's foremost 486 enjoyer.

Reply 95 of 122, by Scali

Rank l33t
leileilol wrote:

A typical common 8088 system wouldn't be having a soundcard installed anyway. you'd have to beep it

Well yes, then again, since most games supported both PC speaker and sound cards, you could still play AdLib music on your 8088 if you happened to have a sound card.
I guess it's all about the targets you're going for.
If you want to go for an 'authentic' game from the era of say 1987-1992, then you're likely going to support a wide range of CPUs, graphics standards and audio.
Eg, 8088+, CGA/EGA/VGA/Hercules, PC speaker/AdLib/SB/Tandy.

A turbo XT at ~10 MHz with VGA and an original SB would not be that far-fetched, and would be an interesting gaming platform.
It would be more or less in the ballpark of a stock 6 MHz AT, where VGA and an SB would also make sense.

leileilol wrote:

Also you COULD use S3M for FM output... 😀

Did they ever release the replay routine for that though, and if so, is it 8088-compatible code?


Reply 96 of 122, by K1n9_Duk3

Rank Member
keenmaster486 wrote:

I'm thinking IMF files. But how much CPU time would we gain if we moved to, for instance, 280 Hz (half the 560 Hz rate used for Keen, which is too slow on 8088 anyway)?

Well, you can use 280 Hz (that rate was used in Duke Nukem II for IMF music) or even 140 Hz, 70 Hz or just 18 Hz. All you need to do is change the timer for INT 8 to the desired value; the IMF playback itself doesn't care at all about the timing. The only problem is creating IMF songs that work at that rate, since they are usually converted from MIDI. The MIDI files are usually not made for really low playback rates like 18 Hz. 140 Hz works for most things without noticeable differences, and so should 70 Hz. Worst case, timing will be off by 14 milliseconds, which should be fine with most players. And with clever IMF converting (like allowing the IMF song to be played slightly slower than originally intended) you could reduce timing issues to a minimum even at 70 Hz. I don't think I need to tell you by which factor the CPU overhead would decrease when using 70 Hz instead of 700 Hz or 560 Hz.
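
To illustrate both halves of that, here is a Python sketch: one helper computes the timer reload value for a given rate (assuming the standard 1193182 Hz PIT input clock), and another requantizes a song's per-event delays to a slower tick rate. The function names and the rounding strategy are my own assumptions, not taken from any existing IMF converter.

```python
PIT_INPUT_HZ = 1_193_182   # standard PC timer input clock


def pit_divisor(rate_hz):
    """16-bit reload value for PIT channel 0 (65536 is programmed as 0)."""
    divisor = round(PIT_INPUT_HZ / rate_hz)
    if not 1 <= divisor <= 65536:
        raise ValueError("rate out of range for a 16-bit divisor")
    return divisor


def rescale_delays(delays, src_hz, dst_hz):
    """Requantize per-event delays (in ticks) to a different tick rate.

    Each event is placed on the nearest destination tick based on its
    absolute time, so rounding errors do not accumulate over the song.
    """
    out = []
    src_time = 0
    dst_time = 0
    for delay in delays:
        src_time += delay
        target = round(src_time * dst_hz / src_hz)
        out.append(target - dst_time)
        dst_time = target
    return out
```

For example, `rescale_delays([7, 7, 7, 679], 700, 70)` gives `[1, 0, 1, 68]`: the total length is preserved (70 ticks at 70 Hz), but events that were 10 ms apart can end up on the same tick.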

BTW, the original IMF files always used a multiple of 140 Hz for playback, because 140 Hz was used for the PC Speaker / AdLib sound effects. This way, they could use a simple counter in the interrupt routine instead of more complicated timing routines. The 140 Hz rate was probably used because it was a multiple of the 70 Hz refresh rate of the CRT in 320x200 EGA/VGA modes.
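
A minimal Python model of that "simple counter" approach. `SoundTimer` and its fields are hypothetical, and the two call counters stand in for the real music and sound effect routines.

```python
class SoundTimer:
    """Hypothetical ISR state: music on every tick, sound effects at 140 Hz."""

    SFX_HZ = 140

    def __init__(self, music_hz):
        if music_hz % self.SFX_HZ:
            raise ValueError("music rate must be a multiple of 140 Hz")
        self.ticks_per_sfx = music_hz // self.SFX_HZ
        self.counter = 0
        self.music_calls = 0
        self.sfx_calls = 0

    def on_interrupt(self):
        """Called once per timer interrupt."""
        self.music_calls += 1        # stand-in for the IMF service routine
        self.counter += 1
        if self.counter == self.ticks_per_sfx:
            self.counter = 0
            self.sfx_calls += 1      # stand-in for the sound effect routine
```

At 560 Hz the counter wraps every 4 ticks and at 700 Hz every 5, so the sound effect code always sees a steady 140 Hz without any extra timing logic.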

Reply 97 of 122, by Scali

Rank l33t
K1n9_Duk3 wrote:

The 140 Hz rate was probably used because it was a multiple of the 70 Hz refresh rate of the CRT in 320x200 EGA/VGA modes.

VGA yes, and that might be the reason. Running at a multiple of the framerate ensures that you always have the same number of interrupts per frame, and always in the same place (although only if they resync to the vsync regularly, since VGA runs on its own crystal, and therefore VGA's '70 Hz' is not exactly the same as the PIT's '70 Hz', so you get drift).
Always having the interrupts in the same place means that you know you won't suddenly get your music routine firing during some time-critical stuff such as polling for vsync. So you don't get any jitter or other timing issues.
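
To put a rough number on that drift, here is a Python sketch. It assumes the usual 320x200 VGA timing (25.175 MHz dot clock, 800 clocks per line, 449 lines per frame) and the standard 1193182 Hz PIT input. Those figures and the function names are assumptions, not something measured for this thread.

```python
PIT_INPUT_HZ = 1_193_182
VGA_DOT_CLOCK_HZ = 25_175_000
VGA_CLOCKS_PER_LINE = 800     # assumed horizontal total
VGA_LINES_PER_FRAME = 449     # assumed vertical total


def vga_refresh_hz():
    """Nominal refresh rate implied by the assumed VGA timing."""
    return VGA_DOT_CLOCK_HZ / (VGA_CLOCKS_PER_LINE * VGA_LINES_PER_FRAME)


def pit_rate_hz(divisor):
    """Interrupt rate produced by a given PIT reload value."""
    return PIT_INPUT_HZ / divisor


def seconds_per_frame_slip(divisor):
    """Time until the timer has drifted one whole frame against vsync."""
    return 1.0 / abs(vga_refresh_hz() - pit_rate_hz(divisor))
```

With a divisor of 17045 (the nearest to a nominal 70 Hz), the timer slips a full frame against vsync roughly every 12 seconds under these assumptions, so without a resync the interrupt wanders through the whole frame fairly quickly.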

EGA however runs at 60 Hz. So it will always be desynced.

Then again, for 'fast' systems such as the 286+ that Wolf3D and such were targeted at, it's not that much of an issue.

In 8088 MPH we used 60 Hz music, which was either played by a timer interrupt that was synced to the vsync (so it always fired at a point where we weren't drawing anything... we could just wait for the timer interrupt instead of polling for vsync), or the music routine was called at the end of a cycle-counted frame routine.

In either scenario you basically had a reserved 'slot' somewhere in the frame where the music could run.
This way we could still race the beam and do whatever we wanted, with both screen and music updates running at 60 Hz and perfect accuracy.

On many other 8-bit platforms, such as the C64, a very similar strategy is used for combining music and graphics.


Reply 98 of 122, by VileR

Rank l33t
K1n9_Duk3 wrote:

The only problem is creating IMF songs that work at that rate, since they are usually converted from MIDI. The MIDI files are usually not made for really low playback rates like 18 Hz. 140 Hz works for most things without noticeable differences, and so should 70 Hz.

Back around 2005 or so I had a needlessly complicated toolchain for IMF composition, where the first step was 'MIDI Tracker'... at least back then it was pretty bad, but tracker-style quantized timing would surely help the songs work well in the 50-70Hz range.


Reply 99 of 122, by K1n9_Duk3

Rank Member
Scali wrote:

EGA however runs at 60 Hz. So it will always be desynced.

Oh. Didn't know that. I was just guessing since the DOSBox video capture usually records 320x200x16 at 70 (point something) fps.