So, there are quite a few different definitions of multiprocessing.
For the modern form of it, where normal end-user processes can run on any arbitrary CPU, all CPUs must be the same architecture, and the OS must know how to assign processes to said arbitrary CPUs.
Complicating that particular discussion is the topic's stipulation of home PCs. (Office PCs were also mentioned, and that's much simpler. I'll get to that, because office PCs include workstations.)
For home machines in particular, under the modern definition of multiprocessing, a Windows-based multiprocessing home PC was not possible before XP (because XP was the first home release of NT, and no multiprocessing of any form was supported under non-NT Windows). Likewise, a multiprocessing home Mac was not possible before OS X: OS 7.5 through 9 supported multiprocessing, but not in the modern sense. Only multiprocessor-aware applications could use it at all, and they had to specifically call a multiprocessing API to spin threads off, while the main process remained running on the first processor.
I'm going to go with the BeBox as the first modern multiprocessing home machine, and that only because they aimed it at home enthusiast users as a secondary market, content creation being the primary market.
For x86, all of the pre-Pentium D/Athlon 64 X2 hardware except for the Abit BP6 was marketed as workstation-class hardware, not enthusiast-class hardware. Therefore, the BP6 was the first home enthusiast multiprocessing x86 motherboard, although it needed BeOS, Linux, or NT4/2000 to take advantage of it. For Windows specifically, as a home platform, it'll be whatever non-workstation machine was first to get either an Athlon 64 X2 or a Pentium D slapped in its socket (because the Abit BP6 was out of production by the time XP came out).
Now, for office, this becomes far, far easier: the Sun SPARCstation 10 for a desktop RISC workstation, the NCR 3360 for a deskside x86 workstation (dual Pentiums), and the VTech Platinum SMP for a desktop x86 workstation (dual 486DX2s). (Also, the Platinum SMP configurations that didn't ship with NT 3.1 were single-CPU only: they lacked the SMP Power Board, which was the daughtercard that carried the second CPU and went into a modified VLB slot.)
And, really, everything above follows marketing lines more than anything. You could run any of the above machines for home use, although for an NCR 3360 or Platinum SMP, you may well have needed to reboot into DOS/Windows 3.1 to do a lot of tasks (but an Abit BP6, running Windows 2000, would have been fine for most things), and the SPARCstation 10 would confine you to the Unix world (although x86 emulators existed for running DOS and Windows 3.1 in a pinch, even back then).
Now, under other definitions, which allow for task offload (à la the pre-OS X Macs)... for home use, I'll concur with the C64 and its widespread use of the 1541's CPU for offloading certain tasks, although a ROM upgrade on both ends was needed to improve performance if those workloads required passing a lot of data between the host and 1541 CPUs. There were certainly many other asymmetric multiprocessor architectures at the time, but usually they were used for I/O offload (IIRC, the HP 9845 example was using the second processor, which was in the monitor, as a GPU). (And, while some of the Z80 cards for Apple IIs (PCPI Applicards and clones) supported running independently of the 6502, that wasn't usually used for actual multiprocessing workloads.)