VOGONS


First post, by Harry Potter

Rank: Member

Hi! I'm working on several compression techniques for several systems and am having a performance issue: the LZ77 scan code is too slow. 🙁 On 8-bit computers it is written in assembler. A 16-bit version uses some assembler, and I need to convert the rest of it to assembler as well. I applied a suggestion to optimize the 8-bit version's inner loop, and it helped, but only slightly. I was also told about using hash tables, but I don't fully understand them. 🙁 Can somebody here explain them better for me? I have another way to optimize LZ77: an 8 KB array of bits, where each bit specifies whether an associated word has been seen or not. It helped slightly. How else can I optimize an LZ77 technique?

BTW, I have a text-compression scheme for 8-bit systems called printtok. It lets you compress strings using tokens and RLE of spaces. It shrinks the text of a text adventure I'm creating by about 25%. Should I port it to DOS?
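[Editor's note: the printtok format itself isn't shown in the thread. The following is a minimal sketch of the general idea described above, assuming a made-up encoding in which high-bit bytes index a token table and one reserved code introduces a run of spaces; the real printtok format may differ.]

#include <stdio.h>

#define RUN_CODE 0x7F                 /* next byte = number of spaces */

static const char *tokens[128] = { "the ", "you ", "ing ", /* ... */ };

static void detok(const unsigned char *in, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = in[i++];
        if (c == RUN_CODE) {          /* RLE of spaces */
            unsigned k = in[i++];
            while (k--) putchar(' ');
        } else if (c & 0x80) {        /* token reference */
            const char *t = tokens[c & 0x7F];
            if (t) fputs(t, stdout);
        } else {
            putchar(c);               /* ordinary character */
        }
    }
}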

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 2 of 14, by Harry Potter

Rank: Member

Very. An 880k file takes a few minutes to compress on a semi-modern Win11/64 laptop.

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 3 of 14, by GemCookie

Rank: Newbie

Define "semi-modern"; this could mean anything from an Athlon 64 X2 to a Coffee Lake i7.

Asus Maximus Extreme (X38) | Core 2 Quad Q9550 | GTX 750 Ti | 8 GiB DDR3 | 120 GB SSD + 640 GB HDD | Sound Blaster X-Fi Titanium | WinXP64, 7, 11
Fujitsu D1215 board | P3 866 | Riva TNT2 M64 | 256 MiB PC133 CL2 | 120 GB HDD | WfW 3.11, Win95, NT, 2k, XP

Reply 4 of 14, by Ringding

Rank: Member

What’s the purpose of this nit-picking? Every desktop CPU that runs Win11 is likely within the same order of magnitude, performance-wise.

A few minutes for less than 1 MB is embarrassingly slow, I agree 😉. I just tried gzip on a 4 MB file, and it took 0.1 seconds. I would have a look at how it or other open-source compressors (zlib, lzop, zstd, lzip, ...) handle their window search.

Reply 5 of 14, by Harry Potter

Rank: Member

I've attached the code that compresses using LZ77. It is for Digital Mars C and Win32. I believe my laptop is less than four years old. BTW, I believe part of the slowdown is due to printing the current position in the file as it is compressed.

Attachments

  • BR.C (14 KiB, 50 downloads, public domain)

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 6 of 14, by bakemono

Rank: Oldbie

How big is your search space for pattern matching? IIRC, things like PKZip only search a 32 KB window for duplicate strings (since Deflate includes an LZ77-like stage).

I believe the 'hash tables' refers to an algorithm where you do a single pass over the data, calculating a rolling checksum of strings of fixed length (e.g. 4 bytes), and build a table of offsets, where each table entry corresponds to a checksum/hash and contains the offset where the matching string was found.

So when a string comes into the compressor and you need to know whether it duplicates previous data, you hash its first 4 bytes and use that as an index into the table to get the offset. This is much faster than a brute-force search through the previous data. At that point you can verify whether the data actually matches (i.e. it is not a hash collision) and whether the match is longer than the 4-byte minimum. Or maybe the string isn't a duplicate at all, in which case the table entry would hold some pre-initialized 'empty' value.
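[Editor's note: a minimal C sketch of that idea, keeping only the most recent offset per hash bucket (real compressors usually chain several candidates, and a window-distance check like BR.C's 60,000-byte limit is omitted for brevity). All names, sizes, and the hash function are illustrative assumptions, not taken from BR.C.]

#include <stdint.h>
#include <string.h>

#define HASH_BITS 15
#define HASH_SIZE (1 << HASH_BITS)

static uint32_t head[HASH_SIZE];      /* last offset seen for each hash */

/* Call once before compressing: 0xFF bytes mark every entry "empty". */
static void init_table(void) { memset(head, 0xFF, sizeof head); }

static uint32_t hash4(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, 4);                 /* next four input bytes */
    return (v * 2654435761u) >> (32 - HASH_BITS);
}

/* Length of the match for src+pos against earlier data; 0 if none. */
static size_t find_match(const uint8_t *src, size_t pos, size_t end,
                         size_t *match_pos)
{
    size_t len = 0;
    uint32_t h, cand;

    if (pos + 4 > end)                /* not enough bytes left to hash */
        return 0;
    h = hash4(src + pos);
    cand = head[h];
    if (cand != 0xFFFFFFFFu && cand < pos &&
        memcmp(src + cand, src + pos, 4) == 0) {
        len = 4;                      /* real match, not a collision */
        while (pos + len < end && src[cand + len] == src[pos + len])
            len++;                    /* extend past the 4-byte minimum */
        *match_pos = cand;
    }
    head[h] = (uint32_t)pos;          /* remember this position */
    return len;
}

[This replaces the brute-force window scan entirely: one table lookup and one verification per position instead of comparing against every prior offset.]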

again another retro game on itch: https://90soft90.itch.io/shmup-salad

Reply 7 of 14, by Harry Potter

Rank: Member

So, if a match is found, start scanning from there, and if only the first two or three bytes match, continue scanning from that point, right? Also, if no match is found, don't compress with LZ77, right?

BTW, I'm using a 60,000-byte window for pattern-matching.

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 8 of 14, by Harry Potter

Rank: Member

I'm just about ready for the optimization. Are there any other recommendations to speed up LZ77 block-searching? I have a method that helps a little but needs an 8 KB array of bits, where each bit records whether an associated word has already been seen (a sketch follows below). Before every LZ77 scan, I first check the array to see whether the current word has been seen; if not, I skip the scan and just return no match. For every byte compressed, I set the bit in the array corresponding to the current word. What do you think?
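[Editor's note: in C, that filter might look like the following minimal sketch, using one bit per possible 16-bit word value (8192 bytes × 8 = 65536 bits). Names are illustrative.]

#include <stdint.h>
#include <string.h>

static uint8_t seen[8192];            /* one bit per 16-bit word value */

static int word_seen(const uint8_t *p)
{
    uint16_t w = (uint16_t)(p[0] | (p[1] << 8));
    return (seen[w >> 3] >> (w & 7)) & 1;
}

static void mark_word(const uint8_t *p)
{
    uint16_t w = (uint16_t)(p[0] | (p[1] << 8));
    seen[w >> 3] |= (uint8_t)(1 << (w & 7));
}

/* Usage: if (!word_seen(src + pos)) emit a literal and skip the scan;
   call mark_word(src + pos) for every byte compressed, and clear the
   array with memset(seen, 0, sizeof seen) before each file. */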

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 9 of 14, by bakemono

Rank: Oldbie

By 'word' you mean 16 bits, right? It sounds like it would speed up the compression. The only thing missing from this approach is any kind of hint on where to scan.

Do you scan beginning with the oldest (lowest address) data? Maybe it would be faster to search in the other direction. (Particularly on CPUs with an L1 cache, since more recent data is less likely to have already been evicted from the cache.)
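[Editor's note: an illustrative brute-force scan that tries the newest candidates first, per the suggestion above. Names and parameters are made up; a hash-table finder would replace this loop entirely.]

#include <stddef.h>
#include <stdint.h>

static size_t scan_backward(const uint8_t *src, size_t pos, size_t end,
                            size_t window, size_t min_len,
                            size_t *match_pos)
{
    size_t best = 0;
    size_t start = (pos > window) ? pos - window : 0;
    size_t i = pos;

    while (i-- > start) {             /* newest candidate first */
        size_t len = 0;
        while (pos + len < end && src[i + len] == src[pos + len])
            len++;
        if (len >= min_len && len > best) {
            best = len;
            *match_pos = i;           /* newest match of each length wins */
        }
    }
    return best;
}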

again another retro game on itch: https://90soft90.itch.io/shmup-salad

Reply 10 of 14, by Harry Potter

Rank: Member

By 'word' I do mean 16 bits, and I do scan from the most recent byte first. The last time I tried this technique, it only helped a little. I'm going to apply this now on an 8-bit version. 😀

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 11 of 14, by Harry Potter

Rank: Member

My idea didn't work. 🙁 And it cost me 8 KB of RAM on an 8-bit system, which is significant. Worse yet, it hurt the compression ratio. 🙁

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 12 of 14, by Harry Potter

Rank: Member

I was trying a version of LZW, and at first the compression ratio was very poor. 🙁 I've gained a lot of ground with it since, but the ratio is still poor. 🙁 I attached two code snippets of the technique:

Attachments

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 13 of 14, by Harry Potter

Rank: Member

I forgot to mention: I'm still trying to digest the docs on hash tables. While I'm figuring them out, is there anything else I can do to optimize LZ77?

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 14 of 14, by Harry Potter

Rank: Member

These files are a little dated. I've attached more recent copies.

Attachments

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community