VOGONS


First post, by superfury

User metadata
Rank l33t++

I'm trying to diff two ~4GB text logs (common emulator logs) with diff, but it reports "insufficient memory" and quits. With kdiff, the entire computer hangs almost immediately after I open the two files with a regex added to filter the results (to make it ignore all general-purpose registers except ESP and EBP): hard disk access time goes to 100% and the mouse and keyboard stop responding, so I have to press the hard reset button on the computer case to reboot and make it responsive again.

Does anyone know of something I can use to diff those files (-u format) with regex filtering (for the above-mentioned registers) without making the entire machine lock up?

I'm using Windows 10 64-bit with 8GB of RAM (and an i7-4790K).

Author of the UniPCemu emulator.
UniPCemu Git repository
UniPCemu for Android, Windows, PSP, Vita and Switch on itch.io

Reply 1 of 9, by root42

User metadata
Rank l33t

I wonder if you can break down this problem into smaller chunks. First of all, you can filter the input files through grep or sed to remove the lines with those registers you do not want to see.

Second, maybe you can chunk the files into smaller pieces, if there are any kinds of milestones in the logs.

Also: what kind of diffs are you interested in? Is it only the contents of the registers and how they change? You can use grep to show only the part of the line that matches and then filter through uniq to remove duplicate lines. That way you can see a trace of how the values change and reduce the search space massively. But maybe you can give a few example lines from the logs so we can break down your problem further.
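For anyone without grep and uniq at hand, the same idea can be sketched in Python. This is only an illustration: the register syntax (tokens such as `ESP=0000fffe`) is an assumption, since no sample lines from the logs have been posted, and `stack_trace` is a made-up helper name. It keeps only the ESP and EBP values from each line and drops consecutive repeats, the way `uniq` would.

```python
import re
from itertools import groupby

# ASSUMPTION: registers appear as tokens like "ESP=0000fffe" or "EBP: 00000000".
# Adjust the pattern to whatever the real log format looks like.
REG = re.compile(r"\b(ESP|EBP)\s*[=:]\s*([0-9A-Fa-f]+)")


def stack_trace(lines):
    """Yield 'ESP=... EBP=...' strings, skipping consecutive repeats (like uniq)."""
    keys = (
        " ".join(f"{name}={value.lower()}" for name, value in REG.findall(line))
        for line in lines
    )
    # Lines without any ESP/EBP token produce an empty key and are dropped.
    for key, _group in groupby(k for k in keys if k):
        yield key
```

Because it works on any iterable of lines, an open file object can be passed in directly, so only one line is held in memory at a time regardless of the file size.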

YouTube and Bonus
80486DX@33 MHz, 16 MiB RAM, Tseng ET4000 1 MiB, SnarkBarker & GUSar Lite, PC MIDI Card+X2+SC55+MT32, OSSC

Reply 2 of 9, by superfury

User metadata
Rank l33t++

I know I can use programs like 7-zip to break it into smaller binary files of e.g. 512MB each, but the problem is that this doesn't take into account that it's a text file. So I'll need to break it into manageable chunks that contain whole records only. But the line containing the instruction varies in length, so a fixed-size binary split can't be used, can it?
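A split that respects line boundaries is not hard to sketch. The following is a minimal Python version, assuming one record per line (if a record spans several lines, the cut condition would need to look for the record's first line instead). The function name and the `.000`, `.001` suffix scheme are made up for illustration.

```python
def split_on_lines(path, chunk_bytes=512 * 1024 * 1024):
    """Write path.000, path.001, ... and return the list of chunk file names.

    A chunk is closed only between lines, so it can exceed chunk_bytes by
    at most one line, and no line is ever cut in half.
    """
    names = []
    out = None
    written = 0
    with open(path, "rb") as src:      # binary mode: bytes are copied verbatim
        for line in src:               # streams the file one line at a time
            if out is None or written >= chunk_bytes:
                if out is not None:
                    out.close()
                name = f"{path}.{len(names):03d}"
                names.append(name)
                out = open(name, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return names
```

One caveat: splitting both logs by size does not guarantee that chunk N of one log covers the same instructions as chunk N of the other, because the line lengths differ. Cutting after a fixed number of lines instead of a fixed number of bytes would keep the chunks aligned for as long as the two logs stay in step.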

Reply 4 of 9, by superfury

User metadata
Rank l33t++

Just found out about another tool like head/tail, called split. Could that be used to split the logs into chunks?

https://interworks.com/blog/dholm/2011/04/19/ … ead-tail-split/

Reply 5 of 9, by root42

User metadata
Rank l33t

Yeah, whatever works for you. However, I think you should familiarize yourself with the standard UNIX toolset, which can already do all those things. Also, I still suspect that with clever filtering via grep et al. you can already reduce your problem size massively. It would be good to understand the actual problem you are trying to solve.

Reply 6 of 9, by superfury

User metadata
Rank l33t++

Well, the actual problem is somewhere within my ~4GB logs of Windows 3.0 booting in standard/386 enhanced mode, as well as 3.1 in 386 enhanced mode. They both crash, and I need to compare them to known-good logs (from IBMulator, I think).

Both are in the common log format (without memory transactions). I need the comparison to ignore all general-purpose registers except ESP and EBP, but to report the entire row as changed when ESP/EBP differs. CR0 also differs because of CPU differences (80486 vs Compaq Deskpro 386), e.g. 60000011 vs 0000fff1, and the same values with bit 0 (PE, protected mode enable) cleared.
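Those requirements can be approximated without a real diff tool by walking both logs in lockstep and comparing a masked copy of each line. The Python sketch below rests on several assumptions: registers are written as `NAME=hex` tokens, the two logs have one line per instruction in the same order, and CR0 is ignored outright rather than normalized. The names `comparison_key` and `first_differences` are invented for this example. Note that this is not a diff algorithm: it cannot resynchronize after an inserted or deleted line, so it is only good for finding the first few points of divergence, and it does not produce `-u` output.

```python
import re
from itertools import zip_longest

# ASSUMPTION: register tokens look like "EAX=00000000"; adjust to the real format.
# ESP and EBP are deliberately absent, so differences in them are still detected.
IGNORED = re.compile(r"\b(EAX|EBX|ECX|EDX|ESI|EDI|CR0)=[0-9A-Fa-f]+")


def comparison_key(line):
    """Blank out the ignored register values so they cannot cause a mismatch."""
    return IGNORED.sub(lambda m: m.group(1) + "=*", line).rstrip("\r\n")


def first_differences(lines_a, lines_b, limit=10):
    """Compare two logs line by line; return up to `limit` (line_no, a, b) tuples.

    The full original rows are returned, not the masked keys.
    """
    found = []
    pairs = zip_longest(lines_a, lines_b, fillvalue="")
    for number, (a, b) in enumerate(pairs, 1):
        if comparison_key(a) != comparison_key(b):
            found.append((number, a.rstrip("\r\n"), b.rstrip("\r\n")))
            if len(found) >= limit:
                break
    return found
```

Passing two open file objects keeps memory use constant, since only the current pair of lines is held at any time. The `limit` matters because once the traces diverge, nearly every following line will differ.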

Reply 7 of 9, by kode54

User metadata
Rank Member

I sort of know the issue with Windows Subsystem for Linux and the dreaded Out of Memory error. Or maybe it was with MinGW-built tools. Basically, you have to turn your terminal window's scrollback buffer size waaaaay down, to like 100 lines or so, or else you get the aforementioned out of memory error.

Reply 8 of 9, by VileR

User metadata
Rank l33t

Get Swiss File Knife. It doesn't do everything that the GNU tools do, but what it does do, it does smarter and faster.
"sfk split -text" should be right up your alley here.

[ WEB ] - [ BLOG ] - [ TUBE ] - [ CODE ]

Reply 9 of 9, by kode54

User metadata
Rank Member

My post above is especially relevant now that Windows 10 defaults the Command Prompt scrollback buffer size to 9001 lines. The same issue applies to Vim for Windows as bundled with Git for Windows, where it is set as the default editor.