VOGONS

Search results

Re: Vogons Video Announcement Thread

in Milliways
Time for a 2 year update [1] on the CGA Graphics programming I've been doing. I announced some years ago on Vogons that I was writing a CGA graphics library. This took a back seat for a while as I tried to figure out how to do 3D rotation (which was a much harder project than I thought). [The …

Re: CGA Graphics library

in Milliways
@zorko Yes I saw that code. It works, but you are incurring the cost once per line. If we were to draw the sprite in columns instead of rows then the cost would be incurred every byte instead. That is what I meant when I said I don't see how to do it really fast.

Re: CGA Graphics library

in Milliways
@zorko Thanks for your efforts here. Sorry I have taken so long to reply, but it is sometimes quite difficult to keep up with everything. I agree that a sprite should not be a multiple of 4 pixels wide. One way to deal with this is to add an additional "colour" in the sprite which does not really …

Re: CGA Graphics library

in Milliways
Hi zorko, Thanks for your comments. Sorry that I didn't see your message until today. I think that a routine to put an image onscreen very fast is a great idea for the CGA library I'm writing. And I agree that sometimes one doesn't want the data format to be too complicated. I am planning on making …

Re: CGA Graphics library

in Milliways
At the risk of posting EVERY SECOND VIDEO from my channel.... I have finally achieved 3D rotation (of a tetrahedron) in CGA graphics mode, which is a bit of a milestone for the CGA Graphics library. If you want to see the video itself, it's here: https://youtu.be/3DQ7HfGN60s

Re: CGA Graphics library

in Milliways
For anyone interested, the new ellipse code is here: https://github.com/wbhart/CGAGraphics/blob/master/cga320x200/fast/ellipse.asm Unfortunately it didn't come down to 4000 cycles for the setup, but the total cost for an ellipse is around 8700 + 140 cycles per pixel. That 8700 seems high, but it's …
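As a rough sketch of how to read that cost figure: assuming the 8700 cycles are a fixed setup cost and the 140 cycles are paid per pixel plotted, the total can be estimated as below. The function names and the 500-pixel example are made up for illustration and are not part of the library. The 4.77 MHz clock is the stock 8088 speed.

```python
# Cost model taken from the figures quoted in the post:
# roughly 8700 cycles of setup plus 140 cycles per pixel.
SETUP_CYCLES = 8700
CYCLES_PER_PIXEL = 140
CPU_HZ = 4.77e6  # stock 8088 clock; an assumption for the timing estimate


def ellipse_cycles(pixels):
    """Estimated total cycles to draw an ellipse with the given pixel count."""
    return SETUP_CYCLES + CYCLES_PER_PIXEL * pixels


def ellipse_milliseconds(pixels, cpu_hz=CPU_HZ):
    """Estimated drawing time in milliseconds at the given clock speed."""
    return 1000.0 * ellipse_cycles(pixels) / cpu_hz


if __name__ == "__main__":
    n = 500  # hypothetical pixel count
    print(n, "pixels:", ellipse_cycles(n), "cycles,",
          round(ellipse_milliseconds(n), 1), "ms")
```

For small ellipses the setup term dominates, which is presumably why the 8700 figure looks high next to the per-pixel cost.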

Re: CGA Graphics library

in Milliways
There's an even more important advantage than making the coordinates fit into a byte, which I didn't mention. You have less computation to do as you move across the screen horizontally. In the case of drawing a straight line, for example, something like 3/4 of the time is spent computing which …

Re: CGA Graphics library

in Milliways
I have made some significant progress on the CGA graphics library I'm writing, after weeks of effort. I now have pixel perfect ellipse code that runs in 140 cycles per pixel on the 8088 @4.77MHz without turning off interrupts. On my 10MHz 8088 it's 147 cycles per pixel (I previously called this an …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
Last night I thought I would try adding an additional stosb every few rasters and see if it would fit. You can't add an extra stosb every raster or every two rasters, but it seems to be possible to add an extra one every three rasters, maybe. The problem seems to be that jitter caused by detecting …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
So there are two different possible types of wait states. One is "wait until a particular clock is at a particular phase" and the other is "wait a certain number of cycles". The first type can be replaced by an earlier nop but the second can't. It's likely that there are some of each type introduced …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
I tried the shl ax, cl trick. The 5, 2, 3, 2, 3, 4, 2 nops can be replaced with shl ax, cl with cl = 2, 1, 1, 1, 2, 2, 1. So I make that 16, 12, 12, 12, 16, 16, 12 cycles. So the gaps between the stosw's account for just 96 cycles total. The nops were just 63 cycles total, so at least we've found …
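The cycle counts above can be checked with a small sketch. It assumes the documented 8086/8088 timings of 3 cycles for nop and 8 + 4 cycles per bit for shl reg, cl, and it ignores prefetch and wait-state effects entirely. The variable and function names are illustrative only.

```python
# Documented 8086/8088 instruction timings (no bus or wait-state effects).
NOP_CYCLES = 3


def shl_cl_cycles(count):
    """Cycles for `shl reg, cl` with cl = count: 8 + 4 per bit shifted."""
    return 8 + 4 * count


# Runs of nops between the stosw's, and the cl values that replaced them,
# both as listed in the post.
nop_runs = [5, 2, 3, 2, 3, 4, 2]
shift_counts = [2, 1, 1, 1, 2, 2, 1]

nop_total = sum(n * NOP_CYCLES for n in nop_runs)
shl_per_gap = [shl_cl_cycles(c) for c in shift_counts]
shl_total = sum(shl_per_gap)

if __name__ == "__main__":
    print("nops:", nop_total, "cycles")
    print("shl ax, cl per gap:", shl_per_gap)
    print("shl ax, cl total:", shl_total, "cycles")
```

This reproduces the 63 and 96 cycle totals quoted in the post.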

Re: Raster bar implementation on Amstrad PC1512

in Milliways
Ah, the reason adding a single nop per frame made such a difference is instruction alignment. It probably matters whether "stosw nop" or "nop stosw" is prefetched, since if the nop has to be prefetched at the wrong time, it pushes everything else out.

Re: Raster bar implementation on Amstrad PC1512

in Milliways
There is no three raster pattern that is better than the one raster pattern I mentioned above. Both of the sequences of 3 nops can be replaced with 4, especially the second sequence of 3, and one almost gets away with it. There's just a tiny bit of jitter as the stosw's occasionally miss their …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
I know what I can do. I can use shl reg, cl instead of nops. Then the CPU won't need to access the bus and I can get variable length instruction timings. Perhaps the CGA wait states are really stopping the CPU from using the bus to prefetch the nops. I should have thought of this earlier. I have …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
That would mean the numbers really don't add up. A raster is 509.6 cpu cycles. If we take 11 cycles for stosw (granted, I used the wrong number here) and just 12 additional cycles for the part of the access not already included in the stosw time, then 7 stosw's and 21 nops should take 161 + 63 = …
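To make the mismatch concrete, here is a minimal sketch of the budget using only the figures quoted in the post: 509.6 cycles per raster, 11 cycles per stosw plus 12 assumed extra access cycles, and 3 cycles per nop. The names are illustrative, and the model ignores prefetch and alignment effects.

```python
# Figures as quoted in the post.
RASTER_CYCLES = 509.6      # CPU cycles per raster line
STOSW_CYCLES = 11          # stosw timing used in the post
EXTRA_ACCESS_CYCLES = 12   # assumed extra access cost per stosw
NOP_CYCLES = 3


def accounted_cycles(num_stosw, num_nops):
    """Cycles explained by the instruction timings alone."""
    return (num_stosw * (STOSW_CYCLES + EXTRA_ACCESS_CYCLES)
            + num_nops * NOP_CYCLES)


accounted = accounted_cycles(7, 21)
unexplained = RASTER_CYCLES - accounted

if __name__ == "__main__":
    print("accounted for:", accounted, "cycles")
    print("unexplained:", round(unexplained, 1), "cycles per raster")
```

Under these assumptions the instructions explain well under half of a raster, which is the gap the post is pointing at.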

Re: Raster bar implementation on Amstrad PC1512

in Milliways
2 stosbs = 2x11 cycles + 2x8 cycles of 8 bit memory accesses (including what the manual calls bus cycles) = 38 cycles
1 stosw = 15 cycles + 16 cycles of 16->8 bit conversion costs + 2x8 cycles of 8 bit memory accesses = 47 cycles
stosw is 11 cycles on 8086. I think the 2x8 cycles of 8 bit memory …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
The RAM is dual ported but that doesn't mean that the two ports are completely independent - the VDU controller still needs to tell the RAM which addresses to send to the raster when. That is easier to do if all the VRAM control logic is derived from the same clock, i.e. the one that comes from the …
