The problem is mostly that of de-facto standardization and the terminology equivalences that spring from that.
There are a few recognized forms of synthesizing sound:
1) Additive. Take a simple waveform (like a sine or triangle wave) and add another simple waveform to it to make it more complex. Without getting too deep into sound theory, suffice it to say that you can create any sound by adding together enough sine waves of varied frequency. This is the basis of the Fourier Transform -- a way of breaking down a complex sound into a set of simple sine waves. Practical synthesizers are limited in the number of waveforms available for a given voice, though, so the resulting sound tends to be best suited to things like bells or organs, or imitations of semi-complex sounds like piano strings.
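To make that concrete, here's a minimal sketch (function and constant names are mine, not from any particular synth) of the Fourier-series idea: summing odd sine harmonics gets you closer and closer to a square wave.

```python
import math

SAMPLE_RATE = 44100  # samples per second

def additive_square(freq, n_harmonics, length):
    """Approximate a square wave by summing odd sine harmonics --
    the classic Fourier-series construction. More harmonics,
    closer to a true square."""
    out = []
    for i in range(length):
        t = i / SAMPLE_RATE
        s = 0.0
        for k in range(1, 2 * n_harmonics, 2):  # odd harmonics 1, 3, 5, ...
            s += math.sin(2 * math.pi * freq * k * t) / k
        out.append(4 / math.pi * s)
    return out

wave = additive_square(440.0, 8, 1024)  # an A-440 with 8 partials
```

A real additive synth is doing essentially this per voice, which is why the voice count limits how complex the result can get.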
2) Subtractive. Take a simple waveform and shape it with filters. This is the overarching classification for PWM synths like the C64 SID, or well-known musical instruments like the Roland Juno or Moog stuff. If you're generous, you could lump in even those that don't offer much or any filtering, like the CMS, NES, Game Boy, Sega Master System, or even the PC speaker.
3) Frequency Modulation (FM). Now we're leveraging not only additive synthesis (combining waveforms), but also creating a pipeline where we can alter the characteristics of one waveform based on the properties of another. Since it's kind of an extension of additive synthesis, the same restrictions apply. It takes a lot of simple waveforms to successfully reproduce a complex sound, and most synthesizers don't offer that much horsepower. The common Yamaha OPL2 and OPL3 can only throw a couple waveforms together. Yamaha's professional synths, like the DX-7, can add a few more -- and the resulting sound is much more sophisticated but still can't emulate real-world sounds very well. Interesting ones though, for sure.
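The smallest possible FM "pipeline" is two operators: one modulator perturbing the phase of one carrier (this is a sketch in the Chowning style; the parameter names are mine, and real OPL hardware adds envelopes, feedback, and operator routing on top).

```python
import math

def fm_tone(carrier_freq, ratio, index, length, sample_rate=44100):
    """Two-operator FM: the modulator's output perturbs the carrier's
    phase. `ratio` sets the modulator frequency relative to the carrier
    (it controls which overtones appear); `index` sets modulation depth
    (it controls brightness)."""
    out = []
    mod_freq = carrier_freq * ratio
    for i in range(length):
        t = i / sample_rate
        mod = math.sin(2 * math.pi * mod_freq * t)
        out.append(math.sin(2 * math.pi * carrier_freq * t + index * mod))
    return out

tone = fm_tone(440.0, 2.0, 3.0, 1024)  # bright, bell-ish A-440
```

An OPL2 voice is basically two such operators with hardware envelopes; a DX-7 voice chains six of them, which is where the extra sophistication comes from.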
4) Samples. Primarily, this takes a snapshot recording of a sound and can optionally play it back slower or faster to change both its duration and its pitch. Multi-sampling takes snapshots at various pitches to alleviate some of the duration-vs-pitch trade-off, but the resulting "voice" takes more memory / ROM / disk space to store. You can also define areas within a sample that can be looped to prolong the recorded sound indefinitely. You would then apply volume curves to restore the natural profile of that sound -- the initial attack, the decay to silence, etc.
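The duration-vs-pitch trade-off falls straight out of how sample playback works: step through the recording faster and you raise the pitch AND shorten the sound at the same time. A minimal sketch (my own function names, linear interpolation for simplicity):

```python
def resample_play(sample, rate):
    """Play a recorded sample back at `rate` times its original speed.
    rate > 1 raises the pitch AND shortens the duration; rate < 1 does
    the opposite -- the trade-off multi-sampling tries to alleviate."""
    out, pos = [], 0.0
    while pos < len(sample) - 1:
        i = int(pos)
        frac = pos - i
        # linear interpolation between adjacent source samples
        out.append(sample[i] * (1 - frac) + sample[i + 1] * frac)
        pos += rate
    return out

recording = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # one toy cycle
double_speed = resample_play(recording, 2.0)  # one octave up, half as long
```

Loop points and volume envelopes would then sit on top of this: repeat a sustain region of `sample` indefinitely, and scale the output by an attack/decay curve.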
5) The loose classification of techniques known as virtual-acoustic (or physical modeling). That is, mathematical algorithms that simulate the physics of an instrument. Typically, this will be tailored to a specific KIND of instrument, like Pianoteq's piano (and harp, and harpsichord) simulations, or Yamaha's VL70m, which can do awesome horn simulations (woodwinds, brass, etc.), or simulated strings like guitars, and so on. Not something you'll likely encounter in game music; it's more of a professional instrument thing.
Now here's where things get complicated. There are so many terms that sometimes mean the same thing, and other times only have meaning in context. For example, the term "wavetable" could be referring to:
1) The elementary waveforms available to additive / subtractive / FM sound generators. E.g., each oscillator can select from a wavetable of square, pulse, sine, triangle, sawtooth.
2) Samples stored in ROM that can be played back on a sample-based synthesizer, like a Wave Blaster, Sound Canvas, AWE32, etc.
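Sense (1) is easy to show concretely: a "wavetable" as a bank of single-cycle elementary waveforms that an oscillator indexes into. (This is an illustrative sketch; the 256-entry table size and the names are my choices, not any particular chip's.)

```python
import math

TABLE_SIZE = 256  # samples per single-cycle waveform

def one_cycle(fn):
    """Evaluate fn over one cycle, phase running 0.0 -> 1.0."""
    return [fn(i / TABLE_SIZE) for i in range(TABLE_SIZE)]

# The "wavetable" in sense (1): elementary shapes an oscillator selects from.
WAVETABLE = {
    "sine":     one_cycle(lambda p: math.sin(2 * math.pi * p)),
    "square":   one_cycle(lambda p: 1.0 if p < 0.5 else -1.0),
    "sawtooth": one_cycle(lambda p: 2.0 * p - 1.0),
    "triangle": one_cycle(lambda p: 1.0 - 4.0 * abs(p - 0.5)),
}

def oscillator(shape, freq, length, sample_rate=44100):
    """Walk through the selected table entry at a rate set by the pitch."""
    table = WAVETABLE[shape]
    out, phase = [], 0.0
    step = freq * TABLE_SIZE / sample_rate
    for _ in range(length):
        out.append(table[int(phase) % TABLE_SIZE])
        phase += step
    return out
```

Sense (2) is different in kind: there, the "table" holds entire recorded instruments (possibly megabytes each), not one cycle per shape.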
Then there's "FM". In the PC audio context, we know this is basically always referring to the Yamaha OPL2 / OPL3 chip on an AdLib, Sound Blaster, Pro Audio Spectrum, YMF, ESS, etc... OR, the emulation of that OPL chip in Creative's CQM, Crystal's equivalent, software like DOSBox, etc. In reality, "FM" is just a synthesis technique, but we know it's synonymous with OPL when talking about PC audio. Out in pro audio, you're more than likely talking about the DX-7 or its siblings, but could just be talking about a generic synthesizer like Native Instruments' FM8. People will often refer to FM and AdLib interchangeably, because the original AdLib was JUST an OPL2 chip and supporting electronics, so there wasn't much ambiguity. When the SB provided backwards compatibility with the AdLib, it muddied the waters somewhat, but you still know what someone means when they say "it's AdLib sound". It just means it uses an OPL2 FM synth.
One thing about FM is that, even though we're talking almost exclusively about the OPL, there are subtle variations in terms of what is supported. For example, the original AdLib used IO ports around 388. Some Sound Blasters use this range for backward compatibility, but may also present the OPL2 at IO 220. Or TWO OPL2s at 220 and up, like on the SB Pro 1. Or, the OPL3 at 220 and OPL2 emulation mode at 388. So, it might be necessary to say "this game supports AdLib (i.e., OPL2 @ 388) and SB Pro (2xOPL2 @ 220), or just SB (OPL2 @ 220)." Usually, though, you can just say "AdLib-compatible" and call it a day because that's close enough.
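The port-layout variations above can be summarized as data. (Hedged sketch: base addresses are the ones mentioned in the text; the 222 base for the SB Pro 1's second OPL2 is my addition and, like any per-card detail, should be checked against the card's actual documentation.)

```python
# Base I/O ports for the OPL configurations described above (hex).
# Register offsets within each base vary by chip and are omitted here.
OPL_PORTS = {
    "AdLib":        {"OPL2": [0x388]},
    "SB":           {"OPL2": [0x220], "AdLib compat": [0x388]},
    "SB Pro 1":     {"OPL2 pair": [0x220, 0x222], "AdLib compat": [0x388]},
    "SB Pro 2/16":  {"OPL3": [0x220], "OPL2 mode": [0x388]},
}

def cards_supporting(port):
    """Which of these configurations respond at a given base port?"""
    return sorted(name for name, ports in OPL_PORTS.items()
                  if any(port in bases for bases in ports.values()))
```

This is why "AdLib-compatible" is such a useful shorthand: 388 is the one base nearly everything answers at.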
The venerable MT-32 is a bit of an odd duck. It combines very short samples (because 80s -- ROM is expensive!) with simple waveform synthesis to create a realistic attack, and a convincing-enough sustain that it sounds more or less like whatever it's trying to emulate. Very effective for the time, and leaps and bounds beyond what most of us had. It's not totally unique, but most products were either synthesizers (creating sounds from scratch) or samplers, not both. And since the MT-32 is so prolific in game music soundtracks (thanks in large part to Sierra), we tend to refer to it as its own category. The various emulations of the MT-32 in other synths do little more than map the stock sounds to whatever that synth has on-board, and call it close enough. It's kind of like going to the paint store and asking for "purple". You'll get something that is the right general color, but probably not exactly what you had envisioned.
Now, when we talk about "MIDI", we're technically referring to sending musical data as sheet music. It's up to whatever the synthesizing device is to make sense of it and produce some kind of sound. Because that's such a loose interpretation, the MIDI Manufacturers Association came up with the "General MIDI" standard that at least dictates that a Piano is a Piano, and a Xylophone is a Xylophone. It doesn't say what that piano will sound like, just that it'll be a piano sound of some description.
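The "sheet music" really is just a few bytes per event. A sketch using the standard MIDI channel-message status bytes (0x90 note-on, 0xC0 program change; these are from the MIDI spec, the function names are mine):

```python
# MIDI channel messages: status byte (command | channel), then data bytes.
def note_on(channel, note, velocity):
    """0x90 = note-on. note 60 is middle C; velocity is how hard it's hit."""
    return bytes([0x90 | channel, note, velocity])

def program_change(channel, program):
    """0xC0 = program (instrument) change. General MIDI fixes the map:
    program 0 is Acoustic Grand Piano on every GM synth, even though
    each synth's piano will SOUND different."""
    return bytes([0xC0 | channel, program])

msgs = program_change(0, 0) + note_on(0, 60, 100)  # middle C on a GM piano
```

Notice there is no audio here at all: what comes out of the speaker is entirely up to the device on the other end, which is exactly the looseness General MIDI was created to tame.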
The MT-32 is a MIDI device, in that it can receive note data via MIDI. It is NOT a General MIDI device because it does not conform exactly to the GM instrument map. The Sound Canvas is a GM device. Roland and Yamaha both had their own standards -- Roland had GS, and Yamaha had XG. They each dictate similarities within that brand's portfolio, but mean little across brands.
The OPL2 / OPL3 is not a MIDI device. It is sent discrete commands via a proprietary programming interface. However -- you can create a driver that speaks OPL and translates MIDI (or even General MIDI standard maps) to OPL commands. This is how the Windows FM synth driver works, and drivers like it are often included in game sound engines like those used by Sierra, Apogee, etc. That way, they just create a MIDI soundtrack and send it to the appropriate driver to be acted upon by either your OPL sound chip, your MT-32, your Sound Canvas, or whatever you happen to have.
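A hedged sketch of the core of such a translation driver: turn a MIDI note number into the OPL2's block/F-number register values. The F-number formula and the 0xA0/0xB0 register addresses are from the OPL2's documented register map; the function names and the stub register writer are mine.

```python
def midi_note_to_hz(note):
    """Equal temperament: MIDI note 69 is A-440."""
    return 440.0 * 2 ** ((note - 69) / 12)

def hz_to_opl(freq):
    """Pick the lowest block (octave) whose 10-bit F-number can hold freq.
    49716 Hz is the OPL2's master sample rate (14.318 MHz / 288)."""
    for block in range(8):
        fnum = round(freq * 2 ** (20 - block) / 49716)
        if fnum < 1024:
            return block, fnum
    raise ValueError("frequency too high for OPL")

def opl_note_on(channel, midi_note, write_reg):
    """Translate one MIDI note-on into the two OPL2 frequency-register
    writes. write_reg(register, value) is a stand-in for real port I/O."""
    block, fnum = hz_to_opl(midi_note_to_hz(midi_note))
    write_reg(0xA0 + channel, fnum & 0xFF)                        # F-num low byte
    write_reg(0xB0 + channel, 0x20 | (block << 2) | (fnum >> 8))  # key-on + block
```

The real drivers also map GM program numbers to OPL operator patches, handle note-off, and juggle the chip's limited channel count, but this is the essential MIDI-to-OPL hop.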
The AWE music synthesizer (EMU8000) is also not a MIDI device. It takes software to translate MIDI data into EMU8000 commands. Same goes for the Gravis Ultrasound products.
The MPU-401 is a MIDI interface. It acts like a serial port -- a standard way of getting MIDI data out of a PC via a known programming interface. It's not the ONLY way to do it, it's just the only one that matters most of the time. Early Sound Blasters had a proprietary MIDI interface that little software supports. The synthesizer you use with an MPU-401 is up to you. It could be GM, it could be an MT-32, it could be a stage lighting rig controlled by MIDI data.
Even though specific cards, like the AWE or GUS, are not native MIDI devices, we still tend to refer to them as MIDI synths because, for most people most of the time, we're using them to play back MIDI data through some kind of driver that is doing the translation. It's accurate enough to call them MIDI synths for that reason. You're getting into the realm of technicalities by being a stickler about terminology.