spiroyster wrote on 2018-02-05, 14:32:
@SteveC @Jo22
Yes thanks. Matrox ftp does have PG-XXXX files, which I already had, but can't find anything on the SM-XXXX series (IIRC that's why I contacted Matrox UK). I may have to revisit this as a summer project this year.
Here are some low-res images (dug out of my piccies folder, cards are in storage somewhere) of the early ISA '3D accelerator' cards. The model number relates to the max screen resolution supported by said card.
SM-640 (IBM PGC compatible, unknown how this is actually implemented)... NOT MINE (found on VCF, IIRC)!!
[attachment: SM-640A_4.jpg]
Neat, so there were some 8-bit PC 3D accelerators out there, or at least that one.
I don't think any historical examples of 8-bit PC/XT-slot-compatible 3D accelerator or coprocessor boards had actually been posted in this thread before:
8 bit 3D card
Also, the 3DO Blaster is neat, but I assume it had all the same restrictions placed on it as the 3DO console, i.e. software could only be written against the proprietary libraries, then compiled and encrypted to The 3DO Company's standards. That means no low-level hardware access and no potential for third parties to develop (or port) other APIs to it, so performance was just as bottlenecked as on the console, with the PC itself acting as little more than a glorified power supply.
I'm pretty sure even the first wave of 'standard' PCI 3D accelerators allowed more low-level hardware access than that, as did the other 3D-oriented game consoles, and they needed it to get decent performance. Even on the PlayStation, where the GPU side of things fared pretty well under Sony's graphics API, devs still needed to write efficient code for the host CPU and manage the geometry coprocessor (the GTE) effectively. That typically meant assembly-language optimizations added to compiled code, and sometimes even custom in-house or game-specific C compilers to make the best use of the R3000A.
Otherwise the 3DO Blaster, even in ISA form, should have been able to benefit a great deal from the host PC's CPU working in parallel with the 3DO hardware, potentially relegating the 12.5 MHz ARM60 primarily to the role of controller/manager for the GPU (or rather, for the Cel engine texture/sprite/polygon processor and the geometry DSP) as well as acting as the audio controller.
The PC's CPU could run the entire logic and AI portion of the game engine on its side and offload all of that from the 3DO's rather limited CPU. You could also potentially split the 3D geometry overhead between the host x86 CPU and the geometry DSP onboard the card.
The 3DO itself is pretty bandwidth-choked: the ARM CPU has no cache and no local RAM, and shares the 2 MB of main memory with texture/sprite data (the 1 MB of dual-ported VRAM was used for the framebuffer). That also meant games using very low-res textures, or lots of untextured shaded polygons, ended up giving the ARM60 a lot more time to work in RAM. It's a bit like having a 12.5 MHz 386DX with 2 MB of zero-wait-state 80 ns DRAM installed, plus an AGP-style local bus to the video card, with DMA used to load pixel and texel data and render it to the framebuffer.
The Atari Jaguar has a similar problem, with an even slower 13.3 MHz 68000, though tricky programmers sometimes worked around that and pushed more than just T&L and polygon set-up/rasterization duties onto the RISC GPU core, at the expense of constantly paging chunks of code into the GPU's 4 KB of local program RAM, since the GPU can't normally execute code from main memory. Used as a pure 3D accelerator, the GPU should be handling the 3D vertex math as well as rasterization in concert with the blitter: setting up each line of a polygon and issuing blitter commands to do flat- or Gouraud-shaded line fills, or to draw flat- or Gouraud-shaded texture spans (the latter being substantially slower than the former). The GPU is also supposed to be tasked with building the complex object lists that sprite-heavy games feed to the object processor, but that's a whole other side of the hardware.
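To make that paging trick concrete, here's a rough C-flavoured sketch of the overlay idea (purely illustrative: real Jaguar GPU code is RISC assembly, the copy would normally be driven by the blitter or the GPU itself, and every name here is mine, not the Jaguar SDK's):

```c
#include <stdint.h>
#include <string.h>

/* The Jaguar GPU executes only from its local SRAM (G_RAM, 4 KB at
 * $F03000), so any GPU code beyond 4 KB has to be copied in on demand. */
#define GPU_RAM_BASE ((uint8_t *)0xF03000)
#define GPU_RAM_SIZE 4096u

/* An "overlay": a chunk of GPU code kept in main DRAM. */
typedef struct {
    const uint8_t *image; /* overlay code image in main memory */
    uint32_t       size;  /* must be <= GPU_RAM_SIZE           */
} gpu_overlay_t;

static const gpu_overlay_t *resident; /* overlay currently in G_RAM */

/* Page an overlay into GPU local RAM if it isn't already resident.
 * On real hardware this copy would typically be done by the blitter
 * or the GPU itself rather than the 68000, but the idea is the same. */
void gpu_load_overlay(const gpu_overlay_t *ov)
{
    if (resident == ov)
        return;                          /* skip the copy if resident */
    memcpy(GPU_RAM_BASE, ov->image, ov->size);
    resident = ov;
    /* ...then point the GPU's program counter at GPU_RAM_BASE and go. */
}
```

The cost the paragraph above alludes to is exactly that memcpy: every routine swap burns main-memory bandwidth that the 68000, blitter, and object processor are all competing for.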
Oh, and bear in mind the 3DO, like the Saturn and the Nvidia NV1, used forward texture mapping as well as quadrilateral primitives. The latter isn't that unusual (I believe a number of software renderers natively did quads), but forward texture mapping is a lot more exotic and has some quirks, namely overdraw and both the performance hits and rendering artifacts that come with it. On top of that, you're limited to one texture per quad: no stretching a texture across a polygon strip or fan, so textures need to be subdivided appropriately and also pre-processed accordingly. Unless the polygon is a true square or rectangle, you have to take the source texture as it should appear in its non-square quad or triangle form and stretch it out to fill a square or rectangular texture matrix or texture 'stamp', so that when rendered it gets skewed back to the intended shape and looks right.
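To illustrate that pre-processing step, here's a hedged C sketch of building such a 'stamp' (my own illustration, not the actual 3DO/Saturn tooling; the nearest-neighbour sampling and bilinear corner interpolation are simplifying assumptions):

```c
#include <stdint.h>

typedef struct { float x, y; } vec2;

/* Sample the source artwork at (possibly fractional) coordinates.
 * Nearest-neighbour to keep the sketch short. */
static uint16_t sample_source(const uint16_t *src, int src_w, vec2 p)
{
    return src[(int)p.y * src_w + (int)p.x];
}

/* Bilinear interpolation across the quad's four corners:
 * c[0]=top-left, c[1]=top-right, c[2]=bottom-left, c[3]=bottom-right. */
static vec2 quad_point(const vec2 c[4], float s, float t)
{
    vec2 top = { c[0].x + (c[1].x - c[0].x) * s, c[0].y + (c[1].y - c[0].y) * s };
    vec2 bot = { c[2].x + (c[3].x - c[2].x) * s, c[2].y + (c[3].y - c[2].y) * s };
    vec2 p   = { top.x + (bot.x - top.x) * t, top.y + (bot.y - top.y) * t };
    return p;
}

/* Fill a stamp_w x stamp_h rectangular stamp by sampling the source art
 * at the point each stamp texel will land on when forward-mapped onto
 * the destination quad, so the rendered quad looks undistorted. */
void build_stamp(uint16_t *stamp, int stamp_w, int stamp_h,
                 const uint16_t *src, int src_w, const vec2 corners[4])
{
    for (int t = 0; t < stamp_h; ++t)
        for (int s = 0; s < stamp_w; ++s) {
            vec2 p = quad_point(corners,
                                (s + 0.5f) / stamp_w,
                                (t + 0.5f) / stamp_h);
            stamp[t * stamp_w + s] = sample_source(src, src_w, p);
        }
}
```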
Typical reverse texture mapping has non-linear texel fetches and linear pixel writes, while forward texture mapping has linear texel reads and non-linear pixel writes.
Rather than rendering spans (rows of pixels), traversing the raster and making one texel fetch per screen pixel, you do a single texel fetch and then write it to the screen as many times as needed to fill the scaled/skewed destination area. This mostly works fine when you take a low-res texture and scale it up, but when a texture is being shrunk, you end up overwriting pixels: you scan through every texel once and write to the same screen location multiple times. Worse, you corrupt Gouraud shading and translucent blending effects, since overlapping pixels get blended with each other rather than just the final output pixel being blended with the framebuffer.
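A minimal C sketch of the two inner loops, assuming a plain 16 bpp framebuffer and a row-major texture (the names, affine stepping, and float coordinates are all illustrative, not real 3DO/Saturn code):

```c
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240, TEX_W = 64, TEX_H = 64 };

static uint16_t framebuffer[SCREEN_H * SCREEN_W]; /* 16 bpp destination   */
static uint16_t texture[TEX_H * TEX_W];           /* row-major source art */

/* Reverse mapping: walk the destination span one pixel at a time and
 * fetch whichever texel maps to it. Pixel writes are linear (friendly
 * to page-mode/burst DRAM); texel reads hop around the texture. */
void reverse_span(int y, int x0, int x1,
                  float u, float v, float du, float dv)
{
    for (int x = x0; x < x1; ++x) {
        framebuffer[y * SCREEN_W + x] = texture[(int)v * TEX_W + (int)u];
        u += du;   /* texture coordinate stepped once per *pixel* */
        v += dv;
    }
}

/* Forward mapping: walk one texture row one texel at a time and scatter
 * each texel to wherever it lands on screen. Texel reads are linear;
 * pixel writes hop around, and when the quad is being shrunk, several
 * texels land on the SAME screen pixel. With opaque overwrites (as
 * here) the last write simply wins, but if each write were translucent
 * it would blend with the previous overlapping write rather than with
 * the true background -- the artifact described above. */
void forward_row(int ty, float x, float y, float dx_dt, float dy_dt)
{
    for (int t = 0; t < TEX_W; ++t) {
        framebuffer[(int)y * SCREEN_W + (int)x] = texture[ty * TEX_W + t];
        x += dx_dt;   /* screen position stepped once per *texel* */
        y += dy_dt;
    }
}
```

The key difference is right there in the loop variables: the reverse loop visits each destination pixel exactly once, while the forward loop visits each texel exactly once and can hit the same destination pixel repeatedly.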
There's the additional issue that forward texture mapping makes it difficult to optimize burst/page-mode writes to the framebuffer; you effectively need a pixel destination cache/buffer to do that. OTOH, the linear texel reads mean a texture cache is largely unnecessary. (Implementing a tile-deferred rendering scheme with a destination buffer corresponding to a fixed tile size might have been an option for developing that system further... I wonder if Nvidia's NV2 GPU went that direction.)
A render output cache/buffer would also potentially have solved the alpha blending errors, since the blending stage could be applied to the resulting 2D output tile rather than during the 3D texture translation and rendering stages. (Making heavier use of light maps and avoiding Gouraud shading on textured polygons could avoid the distorted lighting effects, too.) Applying bilinear filtering to the render output tile would also probably be the way to go there. (Doing it in the 3D translation and rendering stage would have the same overdraw blending issues as translucency/alpha blending.)
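As a thought experiment, here's a minimal sketch of that output-buffer idea (entirely hypothetical, no relation to any real NV2 or 3DO internals): forward-mapped writes land in a small tile with a coverage mask, and blending with the framebuffer happens exactly once per covered pixel when the tile is resolved:

```c
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240, TILE_W = 32, TILE_H = 32 };

static uint16_t framebuffer[SCREEN_H * SCREEN_W];

/* On-chip output tile: forward-mapped writes land here first. */
typedef struct {
    uint16_t color[TILE_H][TILE_W];   /* last write wins, no blending yet */
    uint8_t  covered[TILE_H][TILE_W]; /* which pixels were touched        */
} tile_t;

/* A forward-mapped pixel write. Overlapping writes just overwrite,
 * so overdraw inside the tile can no longer double-blend. */
static void tile_write(tile_t *t, int x, int y, uint16_t c)
{
    t->color[y][x]   = c;
    t->covered[y][x] = 1;
}

/* 50/50 blend of two RGB555 pixels (one illustrative blend mode). */
static uint16_t avg555(uint16_t a, uint16_t b)
{
    return (uint16_t)(((a & 0x7BDE) >> 1) + ((b & 0x7BDE) >> 1));
}

/* Resolve: merge the finished tile into the framebuffer, exactly one
 * blend per covered pixel -- this is where translucency is applied. */
void tile_resolve(const tile_t *t, int ox, int oy)
{
    for (int y = 0; y < TILE_H; ++y)
        for (int x = 0; x < TILE_W; ++x)
            if (t->covered[y][x]) {
                uint16_t *dst = &framebuffer[(oy + y) * SCREEN_W + (ox + x)];
                *dst = avg555(*dst, t->color[y][x]);
            }
}
```

Bilinear filtering of the tile would slot into tile_resolve the same way: applied once to the finished 2D output, after all the overlapping forward-mapped writes have already settled.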