VOGONS


Create better compression?


First post, by Harry Potter

Rank Oldbie

Hi! I am working hard on better compression techniques. So far, some Commodore 64 versions do much better than their Windows counterparts, and I don't know why. 🙁 Worse yet, the C64 versions pass a realism test, while the Windows ones don't. I might need some help with that issue. If I succeed, is there a call for a much better compression technique, one that can well exceed 7-Zip even on an 8-bit computer? So far, I'm getting that on 8-bit systems. If there is, whom can I serve? I need to compress a lot of floppies and Zip disks for myself on several different computers.

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 1 of 28, by megatron-uk

Rank Oldbie

I think you will probably find that there isn't much call to use compressed floppy images or compressed files on floppies/zip/cd or similar these days.

Modern solid state drives really give retro computer users as much storage as they would ever need.

I can see the use for compression in an application or game where you may want to fit data in a given amount of runtime memory, but for storage? Not so much these days.

Storage is cheap for old computers... but processor time is not as easily increased.

My collection database and technical wiki:
https://www.target-earth.net

Reply 2 of 28, by Harry Potter

Rank Oldbie

So no go? 🙁 I see a lot of ZIP and 7Z files on the internet. A better compressor should work better there, no? At the very least, very large files could be 5-10% smaller. Would that be doable?

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 3 of 28, by megatron-uk

Rank Oldbie

Why?

Storage is dirt cheap. Unless you are inventing your own compression libraries for fun, or for use inside an application or game, it's really not bringing any benefit whatsoever. Is your intention to unzip / uncompress all the content you already have, and then re-compress it using your own tool? What benefit will that give you? Yes, you may save a few bytes here and there for things that have to fit on a floppy disk... but are you *really* still using floppy disks to archive things?

My collection database and technical wiki:
https://www.target-earth.net

Reply 4 of 28, by Disruptor

Rank Oldbie

Wait, we're talking about a Commodore C64.
But what does need to be compressed on the C64? Or is it just the decompression that matters on that little machine?

Reply 5 of 28, by BitWrangler

Rank l33t++

Well, C64 to 1541 drive is almost like using another PC as a filesystem over a serial LapLink cable, i.e. it's not terribly fast, so if your data is compressed, it might give an apparent speed-up because you're getting 30% more data for the same transfer time. There is a break-even point, though: once your storage is fast enough relative to your CPU, the time to decompress exceeds the time saved in data transfer.
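The trade-off described above can be put into numbers with a rough sketch. It assumes that transfer and decompression happen one after the other with no overlap, and the function names and all figures (400 bytes/s transfer, a compressed size of 77% of the original) are illustrative assumptions, not measurements of a real 1541.

```python
def load_time(size_bytes, transfer_bps, ratio=1.0, decompress_bps=None):
    """Seconds to load a file of size_bytes (uncompressed size).

    ratio is compressed size / original size (1.0 = not compressed).
    decompress_bps is how many output bytes per second the CPU can unpack;
    None means no decompression step is needed.
    """
    transfer = size_bytes * ratio / transfer_bps
    unpack = 0.0 if decompress_bps is None else size_bytes / decompress_bps
    return transfer + unpack


def break_even_decompress_bps(transfer_bps, ratio):
    """Decompression speed at which compressed and plain loading take equally long."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be between 0 and 1 (exclusive)")
    return transfer_bps / (1.0 - ratio)


if __name__ == "__main__":
    size, drive = 10_000, 400.0   # hypothetical file size and drive speed
    ratio = 0.77                  # roughly "30% more data" per byte transferred
    print("plain load:     ", load_time(size, drive))
    print("fast unpacker:  ", load_time(size, drive, ratio, 5_000.0))
    print("slow unpacker:  ", load_time(size, drive, ratio, 1_000.0))
    print("break-even rate:", break_even_decompress_bps(drive, ratio))
```

Under these assumptions, compression only helps while the unpacker runs faster than the break-even rate; a faster drive raises that bar.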

But anyhoo, zip is not the optimal archiver, it just has the network effect utility of being most common and popular. Others like RAR and ARJ have outperformed it in compression ratio. There are others that shave a few percent off those. Usually though you are choosing speed OR high compression.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 6 of 28, by Harry Potter

Rank Oldbie

Well, I need it for several PCs and a C64C and Plus4, as I use a lot of floppies and Zip disks on them. Also, earlier today, I found one of my flash drives to be almost full, so I kind of need better compression. 😀 I really also want to publish my techniques. 😀

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 8 of 28, by Harry Potter

Rank Oldbie

Some of my compression techniques are doing exceptionally well, but others very poorly. 🙁 Right now, the C64 versions are doing much better than the PC versions, and I don't know why. Also, the C64 versions pass my realism tests, while the PC versions don't. Once I figure it out, I should be closer to debugging the PC versions. I thank you for your support! 😀

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 9 of 28, by root42

Rank l33t

Can you elaborate a bit on what you did so far? Have you written your own compression algorithm? Are you using off-the-shelf tools?
Do you know there are competitions for the most efficient algorithms, like this:
https://gdcc.tech/
Also, different algorithms perform differently depending on the input. There is obviously also a hard limit on how much you can compress arbitrary data, depending on its entropy. Specific datasets might compress even better, but that really depends on the domain the data is coming from.
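To make the entropy limit concrete, here is a minimal sketch that estimates the order-0 (byte frequency) Shannon entropy of a block of data. The function names are made up for this example. Note the caveat: this figure only bounds coders that treat every byte independently; a compressor that models context or repeated strings can do better than it on structured data, which is the "depends on the domain" point above.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def order0_bound_bytes(data: bytes) -> int:
    """Smallest size, in bytes, reachable by an ideal coder that only
    looks at individual byte frequencies (no context, no dictionary)."""
    return math.ceil(byte_entropy(data) * len(data) / 8)


if __name__ == "__main__":
    for label, sample in [
        ("one repeated byte", b"a" * 1000),
        ("two symbols, evenly mixed", b"ab" * 500),
        ("all 256 byte values once", bytes(range(256))),
    ]:
        print(label, byte_entropy(sample), order0_bound_bytes(sample))
```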

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 10 of 28, by Harry Potter

Rank Oldbie

Well...I'm working on my own compression techniques, and so far, some of my compression techniques are doing much better than 7-Zip--even some of my 8-bit compression techniques. However, they are nowhere near usable yet--I have to start debugging them soon. BTW, I have an early PC version I just found, and it was doing much better than 7-Zip on large files but much worse on others. I believe I deprecated it because of the latter. BTW, I'm also working on compression of individual strings using tokenization and RLE of spaces and applying compression of literals, but the Commodore community turned me down. 🙁
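For readers unfamiliar with the idea, run-length encoding of spaces in a short string can look roughly like the sketch below. This is not the poster's code, which was not shared in the thread; the byte layout (7-bit ASCII input, with bytes 0x80 and above standing for a run of spaces) and the function names are assumptions chosen only for illustration.

```python
def rle_spaces_encode(text: bytes) -> bytes:
    """Replace runs of 2..127 spaces with one byte: 0x80 | run length.

    Assumes 7-bit ASCII input so that bytes >= 0x80 are free to act as
    run markers (a hypothetical layout for this example).
    """
    if any(b >= 0x80 for b in text):
        raise ValueError("input must be 7-bit ASCII")
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != 0x20:
            out.append(text[i])
            i += 1
            continue
        run = 0
        while i < len(text) and text[i] == 0x20:
            run += 1
            i += 1
        while run > 0:
            chunk = min(run, 127)
            # A lone space is cheaper to store as itself.
            out.append(0x20 if chunk == 1 else 0x80 | chunk)
            run -= chunk
    return bytes(out)


def rle_spaces_decode(packed: bytes) -> bytes:
    """Inverse of rle_spaces_encode."""
    out = bytearray()
    for b in packed:
        if b >= 0x80:
            out.extend(b" " * (b & 0x7F))
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    line = b"NAME" + b" " * 12 + b"SCORE"
    packed = rle_spaces_encode(line)
    print(len(line), "->", len(packed), rle_spaces_decode(packed) == line)
```

A scheme like this only pays off on text with many padded columns or indentation; tokenization of common words would be a separate step on top.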

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 11 of 28, by Jo22

Rank l33t++
Harry Potter wrote on 2023-07-22, 23:40:

BTW, I'm also working on compression of individual strings using tokenization and RLE of spaces and applying compression of literals, but the Commodore community turned me down. 🙁

Don't let yourself be put down or discouraged by such people, they're simply like that.
I've noticed a similar attitude here on VOGONS in past years, too. People change, start a family, get old...
They simply are no longer the cheerful and energetic people they used to be. They need a practical purpose to justify doing things, too.
Anyway, it's not their fault, either.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel

//My video channel//

Reply 12 of 28, by root42

Rank l33t
Harry Potter wrote on 2023-07-22, 23:40:

Well...I'm working on my own compression techniques, and so far, some of my compression techniques are doing much better than 7-Zip--even some of my 8-bit compression techniques. However, they are nowhere near usable yet--I have to start debugging them soon. BTW, I have an early PC version I just found, and it was doing much better than 7-Zip on large files but much worse on others. I believe I deprecated it because of the latter. BTW, I'm also working on compression of individual strings using tokenization and RLE of spaces and applying compression of literals, but the Commodore community turned me down. 🙁

Do you have any code to share or try out? That would be interesting. What kind of compression are you applying? Anything akin to existing stuff? Entropy encoders, Huffman, Lempel-Ziv, whatever...? I only re-implemented a few existing ones a long time ago. But I find the topic quite interesting.

And what do you mean by "turned down"? How can they turn you down? If you simply release something for use on the C64 or so, people can just pick it up and use it. I mean, that's how all the popular IRQ loaders and packers work anyway.

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 13 of 28, by konc

Rank l33t

So you are developing not just one but several compression techniques that are doing not just better but much better than 7-Zip? On real data?
Honestly I very much doubt this. I'll be very happy to be proved wrong though and salute a new Galileo 😉

Reply 14 of 28, by Harry Potter

Rank Oldbie

Well, so far, it is, but I need to debug it. I thank all of you for your encouragement. 😀

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 15 of 28, by Harry Potter

Rank Oldbie

Earlier today, I reached 5.6% better than 7-Zip overall on my test files but was not doing quite as well on some small files. I am fixing that right now. Unfortunately, I lost the results--apparently, I didn't properly save the file that recorded them--and am retallying them. Soon, I have to start optimizing and debugging. 😀

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 16 of 28, by jakethompson1

Rank Oldbie
konc wrote on 2023-07-23, 08:54:

So you are developing not just one but several compression techniques that are doing not just better but much better than 7-Zip? On real data?
Honestly I very much doubt this. I'll be very happy to be proved wrong though and salute a new Galileo 😉

Especially since we are talking C64 here, I wonder if it's a situation of 7z header overhead making the 7z files bigger.

Reply 17 of 28, by Harry Potter

Rank Oldbie

Well, the technique about which I'm talking here is actually a 16-bit technique, even though I'm using 32-bit code on a 64-bit computer. I'm sorry about the confusion. 🙁

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 18 of 28, by Harry Potter

Rank Oldbie

Now, I'm doing much better on small files. I'm doing too well, but my realism test passed on them. The realism test involves trying to compress an incompressible file, in this case a 1.5k .zip file. If it gives me a small loss, the technique might be real or about real. If it costs too much, the technique's probably inefficient. If it gives me even a slight gain, there has to be a bug in the code. I want to give myself until the end of tomorrow to buy some more points here.
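The realism test described above could be automated along these lines. This is a sketch under assumptions, not the poster's actual test: seeded pseudo-random bytes stand in for the 1.5k .zip file, zlib stands in for the compressor under test, and the 2% overhead threshold and all names are invented for the example.

```python
import random
import zlib


def incompressible_sample(size=1536, seed=0):
    """Pseudo-random bytes as a stand-in for an already-compressed file."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def realism_verdict(compress, sample, max_overhead=0.02):
    """Classify a compressor by how it handles incompressible input.

    compress is any callable taking bytes and returning bytes.
    max_overhead is an assumed cut-off for an acceptable size increase.
    """
    out_len = len(compress(sample))
    if out_len < len(sample):
        # Shrinking data with no redundancy means something is being lost.
        return "bug suspected"
    overhead = (out_len - len(sample)) / len(sample)
    return "plausible" if overhead <= max_overhead else "inefficient"


if __name__ == "__main__":
    sample = incompressible_sample()
    print("zlib:", realism_verdict(zlib.compress, sample))
    print("truncating 'compressor':", realism_verdict(lambda d: d[:-10], sample))
```

A "plausible" verdict only says the size is believable; it does not show that the data can be restored, so it complements a round-trip check rather than replacing one.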

Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

Reply 19 of 28, by megatron-uk

Rank Oldbie

That's really not how you should be measuring effectiveness.

You take your source file and create a hash of it, using some standard tool like md5, sha or some other similar hashing algorithm.

You then compress the source file. Record the size.

Finally, uncompress the compressed file.

Now run the hashing algorithm on the resulting file. If the two fingerprints are the same, then you have got the same out as you put in, and you know (a) that the compression works, and (b) how effective it is.
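The steps above can be sketched in a few lines. SHA-256 from Python's hashlib is used as the fingerprint, and zlib is only a placeholder for whatever compressor is being tested; the function names and the returned dictionary layout are assumptions for this example.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Fingerprint of a block of data."""
    return hashlib.sha256(data).hexdigest()


def verify_round_trip(original: bytes,
                      compress=zlib.compress,
                      decompress=zlib.decompress) -> dict:
    """Hash, compress, decompress, hash again, and report the outcome."""
    before = sha256_hex(original)          # step 1: hash the source
    packed = compress(original)            # step 2: compress, record size
    restored = decompress(packed)          # step 3: uncompress
    after = sha256_hex(restored)           # step 4: hash the result
    return {
        "lossless": before == after,
        "original_size": len(original),
        "compressed_size": len(packed),
        "ratio": len(packed) / len(original) if original else float("nan"),
    }


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog\n" * 200
    print(verify_round_trip(data))
```

Only when "lossless" is true does the recorded compressed size mean anything; a size measured on a compressor that cannot restore its input says nothing about its effectiveness.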

My collection database and technical wiki:
https://www.target-earth.net