This is off the top of my head, so it might be incorrect, but here goes. I would write a simple delay routine and test it in a standalone test program in DOSBox, changing the emulated CPU speed (the cycles setting) to make sure the delay stays the same. I would then inject it into the original executable in a suitable place. To find a suitable place (or places), run the game in DOSBox and use its debugger to find the main loop(s).
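To show what "works with different CPU speeds" could mean in practice, here is a rough Python model of a delay loop that waits on a timer instead of counting loop iterations. The real routine would be x86 assembly. Using the BIOS tick counter (the value at 0040:006C, which by default advances about 18.2 times per second) is my assumption, not something the game requires, and `ticks_for_ms`, `wait_ticks` and `make_fake_clock` are made-up names for illustration.

```python
PIT_HZ = 1_193_182          # input clock of the PC timer chip (8253/8254)
TICK_HZ = PIT_HZ / 65536    # default BIOS tick rate, roughly 18.2 per second


def ticks_for_ms(ms: float) -> int:
    """Convert a delay in milliseconds to whole BIOS ticks (at least one)."""
    return max(1, round(ms * TICK_HZ / 1000))


def wait_ticks(read_ticks, n: int) -> int:
    """Poll a tick counter until n ticks have passed.

    Only the low 16 bits are compared, and the subtraction is done
    modulo 65536 so that a wrap of the low word does not break the loop.
    Returns the number of polls that did not end the wait; a faster CPU
    simply polls more often.
    """
    start = read_ticks() & 0xFFFF
    polls = 0
    while ((read_ticks() - start) & 0xFFFF) < n:
        polls += 1
    return polls


def make_fake_clock(reads_per_tick: int, start: int = 0xFFFE):
    """Stand-in for the real counter: advances one tick every reads_per_tick reads."""
    state = {"reads": 0}

    def read() -> int:
        state["reads"] += 1
        return (start + state["reads"] // reads_per_tick) & 0xFFFF

    return read


if __name__ == "__main__":
    slow_cpu = wait_ticks(make_fake_clock(10), 3)
    fast_cpu = wait_ticks(make_fake_clock(1000), 3)
    print(slow_cpu, fast_cpu)  # different poll counts, same three ticks waited
```

The point of the model is that the number of loop iterations changes with CPU speed while the elapsed time does not. One tick is about 55 ms, so this only suits coarse delays, and the BIOS resets the counter at midnight, which could cut a single delay short.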
I haven't done this before, but this might work. The main loop probably calls several subroutines. Pick one of those calls and redirect it to a wrapper subroutine that calls the original subroutine and then your delay subroutine. You should be able to append the wrapper and the delay subroutine after the original code and data. There is plenty of room for the injected subroutines, since the executable is 40 KiB and a COM file can be almost 64 KiB.
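As a sketch of the wrapper idea, the Python below redirects one call and appends the wrapper and delay code to the end of the image. It assumes the call site is a plain near CALL (opcode E8 followed by a 16-bit relative displacement) and that the file is a COM image loaded at offset 0x100. `inject_wrapper`, `call_site` and `delay_code` are names I made up, and the delay bytes are whatever you assembled separately.

```python
import struct

COM_LOAD_OFFSET = 0x100  # DOS loads a COM image at offset 0x100 of its segment
MAX_COM_SIZE = 0xFF00    # 64 KiB minus the 256-byte PSP; the stack needs room too


def rel16(next_ip: int, target: int) -> bytes:
    """Displacement for a near CALL: target minus the address of the next instruction."""
    return struct.pack("<H", (target - next_ip) & 0xFFFF)


def inject_wrapper(image: bytes, call_site: int, delay_code: bytes) -> bytes:
    """Redirect the near CALL at file offset call_site to an appended wrapper.

    The wrapper calls the original target, then the delay routine, then returns.
    """
    if image[call_site] != 0xE8:
        raise ValueError("expected a near CALL (E8) at the call site")

    # Work out where the existing call goes.
    next_ip = COM_LOAD_OFFSET + call_site + 3
    disp = struct.unpack_from("<H", image, call_site + 1)[0]
    original_target = (next_ip + disp) & 0xFFFF

    # The wrapper is 7 bytes (call, call, ret); the delay code follows it.
    wrapper_addr = COM_LOAD_OFFSET + len(image)
    delay_addr = wrapper_addr + 7
    wrapper = (
        b"\xE8" + rel16(wrapper_addr + 3, original_target)  # call original
        + b"\xE8" + rel16(wrapper_addr + 6, delay_addr)     # call delay
        + b"\xC3"                                           # ret (near)
    )

    # Same-length patch: only the two displacement bytes of the call change.
    patched = bytearray(image)
    patched[call_site + 1:call_site + 3] = rel16(next_ip, wrapper_addr)

    result = bytes(patched) + wrapper + delay_code
    if len(result) > MAX_COM_SIZE:
        raise ValueError("patched image no longer fits in a COM file")
    return result
```

Two things this sketch does not check: the delay routine has to preserve every register and flag the caller looks at after the original subroutine returns, and some COM programs use the memory right after the end of the file for their own data, in which case the appended code would get overwritten at run time. Both are worth checking in the debugger.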
I'm not sure which tools are best for patching the executable. One option is to disassemble the COM file with IDA Freeware and try re-assembling it with your modifications.
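When modifying executables like this, it's important that you don't insert or remove bytes in the middle of a section/segment, because everything after that point shifts and the addresses used by jumps, calls and data references no longer match. A COM file is a single segment loaded at a fixed offset, so prepending bytes shifts everything as well; the safe options are to append at the end, or to replace things such as instructions or addresses in the middle as long as the number of bytes doesn't change. Also, your wrapper subroutine must look the same to the caller as the subroutine it replaces: it has to preserve the registers (and flags) the caller relies on, including anything the original subroutine returns in them.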
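The "same number of bytes" rule can be enforced mechanically. Here is a minimal sketch of a patch helper that refuses to change the file size and checks that the bytes being replaced are the ones you expect. `patch_in_place` is a made-up name, and padding a shorter replacement with NOP (0x90) is my addition; it only makes sense when the bytes being replaced are code, not data.

```python
NOP = 0x90  # single-byte x86 "no operation" instruction


def patch_in_place(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Overwrite expected at offset with replacement, keeping the file size unchanged."""
    if len(replacement) > len(expected):
        raise ValueError("replacement is longer than the bytes it overwrites")
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("bytes at offset do not match; wrong offset or wrong file?")

    # Pad a shorter replacement with NOPs so nothing after it moves (code only).
    padded = replacement + bytes([NOP]) * (len(expected) - len(replacement))
    patched = image[:offset] + padded + image[offset + len(expected):]
    assert len(patched) == len(image)
    return patched
```

Checking the expected bytes first is cheap insurance against patching the wrong offset or a different version of the game.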
Please ask if you have further questions - I'll try to answer. I've also struggled to understand this stuff by myself 😀