You can't intercept memory accesses with the tools you have on an IBM XT. There is no on-processor MMU (that appeared with the 386 and can be programmed by a virtualization manager like EMM386), and there is no external MMU either. Instead, the Hercules card has 64KB of memory, and a control bit makes the second 32K visible on the ISA bus exactly in the address range of the CGA card. Another control bit chooses whether the first 32K (in the MDA range on the ISA bus) or the second 32K (in the CGA range on the ISA bus) is displayed. So if you configure the card into a graphics mode that uses 80 bytes per scanline (that is, 640 monochrome pixels, as CGA does), has banks of 8K (as CGA does, and yeah, Hercules banks are always 8K as well, so no need to configure anything), and displays the contents of the "upper half" of Hercules memory, the result is directly CGA compatible without any kind of "interception".
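
To make the address arithmetic concrete, here is a minimal sketch in C of where a pixel lands under that layout. It assumes the usual CGA 640x200 arrangement (80 bytes per scanline, even and odd scanlines interleaved across two 8K banks, leftmost pixel in the most significant bit) and that the card's 64K starts at segment B000h, so its second 32K sits at B800h. The function and constant names are made up for illustration; this only computes offsets and does not program the card.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BYTES_PER_LINE = 80,      /* 640 pixels / 8 pixels per byte */
    BANK_SIZE      = 0x2000,  /* 8K stride between scanline banks */
    CGA_BANKS      = 2,       /* assumed: even/odd scanline interleave */
    HERC_PAGE1     = 0x8000   /* second 32K of the card's 64K */
};

/* Offset of the byte holding pixel (x, y), relative to segment B800h. */
static uint16_t cga_offset(unsigned x, unsigned y)
{
    assert(x < 640 && y < 200);
    return (uint16_t)((y % CGA_BANKS) * BANK_SIZE
                      + (y / CGA_BANKS) * BYTES_PER_LINE
                      + x / 8);
}

/* The same byte, as an offset into the Hercules card's 64K (based at B000h). */
static uint16_t hercules_offset(unsigned x, unsigned y)
{
    return (uint16_t)(HERC_PAGE1 + cga_offset(x, y));
}

/* Bit mask of pixel x within its byte; the leftmost pixel is the MSB. */
static uint8_t pixel_mask(unsigned x)
{
    return (uint8_t)(0x80u >> (x % 8));
}
```

A program written for CGA computes `cga_offset` and writes through segment B800h; on the Hercules card that write ends up at `hercules_offset` in the card's own memory, which is the half being displayed. No translation step sits in between, which is why nothing needs to be intercepted.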