VOGONS

Search results

Re: Raster bar implementation on Amstrad PC1512

in Milliways
The reason why some references say 11 cycles for STOSW and some say 15 is that the former are for 8086 and the latter for 8088 - there is an additional 4-cycle penalty on 8088 due to the narrower bus. Of course there is, and I feel silly now for not noticing that. I actually originally wrote that …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
Another thing that bothers me is this statement in the manual: "The VDU display timing and system CPU/DMA timing are derived from different, unrelated reference frequencies. For this reason CPU accesses to the display RAM must be synchronised to the display timing by the VDU controller, and this is …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
Oh wow, that is really helpful! I had already concluded STOSW was probably taking 24 cycles, but for different reasons to you. I had 11 CPU cycles for STOSW without a REP prefix, not 16. There could be another 4 cycles for prefetch, presumably. But I suspect there isn't much going on on the CPU bus, …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
It does sound like it is worth giving a try. Hopefully I'll find some time at some point. First I want to investigate whether patterns that are longer than a single raster might be better than what I have. I'm not sure how to check that just yet, but I have a strategy that might work. Some of the …

Re: Raster bar implementation on Amstrad PC1512

in Milliways
What you say about the toggle bit is almost certainly the case. That all occurred to me too. I was going to mention it in the video, but it started to get a bit technical. I did actually at some point consider using the PIT. As I mentioned in the video, I actually got into this by trying to optimise …

Raster bar implementation on Amstrad PC1512

in Milliways
A few people have noticed that raster bars don't work on the Amstrad PC1512. I believe I figured out why and then I found a technique for making them work. I made a video about the whole thing here: https://youtu.be/mZSepcma948 I actually don't know if raster bars are supposed to work in graphics …

Re: CGA Graphics library

in Milliways
I've now written assembly code to precompute ellipses. It is actually now slightly faster to first precompute the ellipse then draw it from the precomputed information. This is not terribly surprising given that the precomputed data is used for both halves of the ellipse and no longer needs to be …

Re: CGA Graphics library

in Milliways
I wrote some code for drawing precomputed ellipses. It starts with an array of bits specifying which pixels of the verticalish and horizontalish parts should move horizontally/vertically respectively from their predecessors. The result is pretty fast at 91 cycles per pixel, which is about a 50% …

Re: CGA Graphics library

in Milliways
I made a little video on my channel about the fast ellipses, with a little "demo effect", albeit computed in real time, rather than precomputed: https://youtu.be/7o07XN6tucQ

Re: CGA Graphics library

in Milliways
The previous fastest code I had for general ellipses on the 8088, prior to this, was 342 cycles per pixel, so the new code is more than twice as fast. Update: the code for the full ellipse is now in the repository. It's 1400 lines of assembly code!

Re: CGA Graphics library

in Milliways
I finally have the ellipse code working, at least for the right hand half of the ellipse. So I can now give timings on the 8088 @ 4.77 MHz. It's taking 156 cycles per pixel. On the 8086 @ 8 MHz it takes 130 cycles per pixel. I actually think that it could be faster to split the algorithm into two …

Re: CGA Graphics library

in Milliways
I discovered a way to do all semiradii up to the (160, 100) needed for a full screen ellipse. It's not perfect, with about 50 pairs of semiradii leading to a single-pixel artifact, one leading to a two-pixel artifact and one leading to a three-pixel artifact. But I think all ellipses will still visually …

Re: CGA Graphics library

in Milliways
I discovered that my original Julia code had a dumb bug in it and the original algorithm doesn't quite work the way I thought. Fortunately I've been able to adapt it to something very similar to what I had come up with that works for semiradii less than 100 except for the following: 26 94 44 39 75 …

Re: CGA Graphics library

in Milliways
I finally found some more time to work on the horizontalish part of the ellipse code. Not all the same tricks worked, so I ended up using SS and ES to store regs temporarily. With the following register assignments:
; di, di+bx offsets of points above and below axis, ax: pixels
; dl: deltax (lo8), …

Re: CGA Graphics library

in Milliways
Actually, I have an idea how to fix these problems. I can switch the roles of al and ch. Then al will not be used in the non-a section and can be stored to immediate in CS so that the whole of ax is available as an accumulator in the non-a section. Then al can be restored from the immediate before …

Re: CGA Graphics library

in Milliways
Below is how the code works out for the verticalish part of the ellipse. Basically, if we number code sections 1, 1a, 2, 2a, 3, 3a, 4, 4a where section n is the section that deals with pixels that have x-coord equal to n-1 mod 4, and the "a" sections are the alternative endings for those sections, …

Re: CGA Graphics library

in Milliways
I think I may have found a reordering of the code which brings the maximum conditional jump down from over 160 to just over 110 bytes. It looks promising, but I'll have to implement it to make sure I didn't screw up. I'll probably do that tomorrow.

Re: CGA Graphics library

in Milliways
Below is what each pixel of the verticalish part of the ellipse looks like (modulo errors), with the following registers:
; di, di+bx offsets of points above and below axis, ah:accumulator/scratch
; dx:deltax (hi16), sp:yinc, bp:deltay (hi16),
; al: D (lo8), ch: deltax (lo8), cl: deltay (lo8), si: D …

Re: CGA Graphics library

in Milliways
Oh, except the following *always* draws the ellipse without any glitches (up to semiradii 160, 100)!!
function ellipse2(A, r::Int, s::Int)
    i = 1; x = r; y = 0; r_orig = r
    c = (s*s) << 8
    a = (r*r) << 8
    D = 0
    xdelta = 2*c*r_orig
    ydelta = a
    while (xdelta >> 8) >= (ydelta >> 8)
        A[i] = (x, y); i += 1; D …
