VOGONS


First post, by Licentious Howler

User metadata
Rank Newbie

Alright, so here's my funny situation...
I scanned a game manual, my first time doing this sort of thing.
I'm pretty happy with the results, I have these massive 140 MB PNG files, look good imo...

but I don't really know what is the norm that people use to make these, like, conveniently legible for common people.
I'm struggling to get these into a pdf format without tediously downsampling every single image externally, and even then I can't really make it look nice and professional like some replacementdocs stuff or gog manuals. I've tried a Foxit trial, PDFGear, LibreOffice (lol), and while I think GIMP can do what I want it'll probably take me 2 entire days of fighting it.
I feel like there has to be a better way.

I'm wondering how smart preservationist minded people go about this sort of thing, especially because this may just be the beginning for me.

Of course I'll offer the images as high quality jpgs as well, that's easy to do.

(My apologies if this has been discussed before or if this isn't the right place for the topic! I feel like I searched well enough)

Reply 1 of 7, by BitWrangler

User metadata
Rank l33t++

I have only "messed around" on and off over the years with this, but feel your pain, successive attempts at trying to get it down to compact and crisp low bit grayscales or even just 1 bit black and white leave it all eroded and speckled.... I swear there was stuff coming with 90s handscanners for Win 3.x that did a better job, convert to mono and despeckle, bam, done... I am not remembering the names of any real well, Textbridge OCR maybe, I know the actual OCR was a bit wonky sometimes, but the processing was great. Anyhoo, good luck finding something good, or the right combo of procedures.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 2 of 7, by DaveDDS

User metadata
Rank Oldbie

FWIW, in doing "Daves Old Computers", I scanned a lot of manuals, some of them quite large...

I used Adobe Acrobat (4.0 is the one I happened to have), which lets you scan directly into a PDF.
(I know newer versions can scan to slightly smaller files ... but I had scanning/conversion problems
with 6 and haven't bothered to buy/try newer ones ... so I stayed with 4).

Just looking at my "master copy" of the site, I have
413 PDF files, totaling 1,232,063,926 bytes.
Which works out to an average of <3M per file!

Given the amount of material, I don't scan at super-high resolution,
just enough to make the manuals reasonably readable.

Dave ::: https://dunfield.themindfactory.com ::: "Daves Old Computers"->Personal


Reply 3 of 7, by liqmat

User metadata
Rank l33t

A few things I’ve learned from scanning in large technical documents/manuals.

Images should all be the same resolution and dpi. That sounds obvious, but I constantly see PDFs with varied page sizes and it looks unsightly when scrolling through a document. If text from the other side of a page leaks through in your scans, up the contrast and it should start disappearing. That might make the page a tad harsher to look at so I compensate by lowering the brightness. I also have found with faint print where the letters seem thin and harder to read, upping the sharpness can thicken the font’s appearance. Experiment with that one as I’ve had varied results depending on the font and how faded the page was.
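
A quick way to catch pages that break the "same resolution" rule before building a PDF is to read the pixel dimensions straight out of each PNG header. This is only a minimal sketch, not anyone's actual workflow: the function names png_size and odd_pages are made up for illustration, and it assumes the scans are ordinary PNG files, which store width and height in the IHDR chunk at bytes 16 to 24.

```python
import struct
from collections import Counter

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes) -> tuple[int, int]:
    """Return (width, height) in pixels from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are two big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def odd_pages(sizes: dict[str, tuple[int, int]]) -> list[str]:
    """Return the names of pages whose size differs from the most common size."""
    if not sizes:
        return []
    common, _count = Counter(sizes.values()).most_common(1)[0]
    return sorted(name for name, size in sizes.items() if size != common)
```

To use it on a folder, read the first 24 bytes of each file (for example with open(path, "rb").read(24)), collect the results in a dict keyed by filename, and pass that dict to odd_pages. Anything it returns is a page that will look out of place when scrolling.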

Last edited by liqmat on 2025-02-20, 13:45. Edited 9 times in total.

Reply 4 of 7, by wbahnassi

User metadata
Rank Oldbie

Ask the owner of archive.org, he had that cool apparatus for scanning thicker books without breaking their spines. A "V" shaped glass surface that pushes down on the open book and is operated by a foot pedal.
I'm guessing he's using cameras to "scan" instead of a typical slow scanner. Scanning a book seems to take 3 seconds per 2 pages, so it's very fast. And I'm sure the software side is tuned to produce files without intervention on each scanned page.

Turbo XT 12MHz, 8-bit VGA, Dual 360K drives
Intel 386 DX-33, TSeng ET3000, SB 1.5, 1x CD
Intel 486 DX2-66, CL5428 VLB, SBPro 2, 2x CD
Intel Pentium 90, Matrox Millenium 2, SB16, 4x CD
HP Z400, Xeon 3.46GHz, YMF-744, Voodoo3, RTX2080Ti

Reply 5 of 7, by Yoghoo

User metadata
Rank Member

Can't give advice on what the best workflow is for this kind of thing. But before you delve into this it's maybe a good idea to check whether it's already available.

There is for example a very good DOS game bundle available (which can't be named on this restricted forum) which has 1000s of manuals for DOS games. Maybe look if it's available there (and what the quality is of course).

Don't know if something similar exists for other platforms but maybe someone else can give you a hint.

Reply 6 of 7, by ntalaec

User metadata
Rank Newbie

Today, for preservation purposes, the scans should be at 600 or 1200 dpi and the file format should be PNG.

A DIN-A4 page scanned at 300 dpi has roughly the same number of pixels as 4K, 600 dpi as 8K, 1200 dpi as 16K and so on. If you are going to view the images in portrait mode (using a tablet) 300 dpi should be enough for a 4K display but if you are going to view the images in landscape mode (using a computer monitor) and you want to fit width, you should use 600 dpi for a 4K display. When 8K becomes the standard, you should move up to 600 dpi and 1200 dpi respectively.
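
For anyone who wants to check that arithmetic, here is a rough sketch. It assumes an A4 page of 210 x 297 mm and takes "4K" to mean a 3840 x 2160 display; the function names are made up for illustration.

```python
A4_MM = (210, 297)   # A4 width and height in millimetres
MM_PER_INCH = 25.4


def a4_pixels(dpi: int) -> tuple[int, int]:
    """Pixel dimensions (width, height) of a full A4 page scanned at `dpi`."""
    width = round(A4_MM[0] / MM_PER_INCH * dpi)
    height = round(A4_MM[1] / MM_PER_INCH * dpi)
    return width, height


def min_dpi_for_width(display_width_px: int) -> float:
    """Lowest scan dpi at which an A4 page has one pixel per display pixel
    when the page width is fitted to `display_width_px`."""
    return display_width_px / (A4_MM[0] / MM_PER_INCH)


w, h = a4_pixels(300)
print(w, h, f"{w * h / 1e6:.1f} MP")          # A4 at 300 dpi
print(f"{3840 * 2160 / 1e6:.1f} MP")          # 4K display, for comparison
print(f"{min_dpi_for_width(2160):.0f} dpi")   # 4K held in portrait
print(f"{min_dpi_for_width(3840):.0f} dpi")   # 4K in landscape, fit to width
```

Fitting the page width to a 4K display in portrait needs less than 300 dpi, while landscape needs roughly 460 dpi, so 600 dpi is the next common scanner setting that covers it.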

To distribute, the best format is PDF (with loss of quality). Provided you keep the original scans, I would reduce the images to 300 dpi before saving them as PDF in order to keep the file size small.
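
As a small illustration of what "reduce to 300 dpi" means in pixels, the sketch below works out the target size of a page from its scan resolution. The function name and the default of 300 dpi are only illustrative assumptions.

```python
def downsampled_size(width_px: int, height_px: int,
                     scan_dpi: int, target_dpi: int = 300) -> tuple[int, int]:
    """Pixel size of a page after reducing it from `scan_dpi` to `target_dpi`."""
    if scan_dpi <= 0 or target_dpi <= 0:
        raise ValueError("dpi values must be positive")
    if target_dpi > scan_dpi:
        raise ValueError("target dpi is higher than the scan dpi (that would be upscaling)")
    scale = target_dpi / scan_dpi
    return round(width_px * scale), round(height_px * scale)


# A 600 dpi scan shrinks to half its width and height,
# a 1200 dpi scan to a quarter.
print(downsampled_size(4960, 7016, scan_dpi=600))
print(downsampled_size(9920, 14032, scan_dpi=1200))
```

Halving the dpi cuts the pixel count to a quarter, which is where most of the file size saving comes from.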

But there are a lot of printed manuals already preserved. Which titles are you going to scan?

Reply 7 of 7, by Licentious Howler

User metadata
Rank Newbie

Okay, sorry for the delay, but I asked around in just a couple of places, and I got a lot of answers to my questions!
It took a lot of time to parse and experiment with things.

One thing I want to share for anybody who might find this topic, is software that fits this use case basically perfectly:
https://imagemagick.org/
with a simple "$ magick *.jpg output.pdf" it auto generated a PDF that was formatted exactly like I imagined, all I had to do was batch downsize images to make the "sane quality" version.
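
For anyone who wants to script the two steps (batch downsize, then build the PDF), here is a minimal sketch. It is an assumption about how one might do it, not a record of the exact commands used above: it relies on ImageMagick 7's "magick mogrify" with the -path, -resize and -quality options, and the 50% and 85 values, the folder name and the function names are purely illustrative.

```python
import shutil
import subprocess


def downsize_command(pages: list[str], out_dir: str,
                     percent: int = 50, quality: int = 85) -> list[str]:
    """Build a command that writes resized copies of `pages` into `out_dir`.
    With -path, mogrify leaves the originals untouched. `out_dir` must exist."""
    return ["magick", "mogrify", "-path", out_dir,
            "-resize", f"{percent}%", "-quality", str(quality), *pages]


def pdf_command(pages: list[str], output_pdf: str) -> list[str]:
    """Build the equivalent of `magick *.jpg output.pdf` for an explicit page list."""
    return ["magick", *pages, output_pdf]


def run(command: list[str]) -> None:
    """Run a command built above, failing early if ImageMagick is not installed."""
    if shutil.which(command[0]) is None:
        raise RuntimeError("ImageMagick 7 ('magick') was not found on PATH")
    subprocess.run(command, check=True)


pages = ["page01.jpg", "page02.jpg"]        # hypothetical file names
print(downsize_command(pages, "small"))
print(pdf_command(pages, "output.pdf"))
```

Pass either command to run() to actually execute it. Page order in the PDF follows the order of the list, so zero-padded names (page01, page02, ...) keep a sorted listing in reading order.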

Anyway, so the art and documentation that I scanned was actually everything that came with a Blood CD jewel case, since so many of you were asking, not exactly a hidden gem, hah.

I don't know if what I provided is actually noticeably higher quality than literally everything available out there (I'm not perfect at searching), but feel free to peep it yourself, and I'm definitely open to criticism or suggestions:
https://archive.org/details/manual-9-crop

I'm also learning how archive.org works here on the fly, so I'm not sure if I did this or that suboptimally in terms of presentation or upload (I definitely didn't think that was the URL it was going to generate, for example, 🤣)...
shame about this scanner apparently having some scratch on its glass, but I'm likely to end up replacing it anyway.

I did not make the images in the PDF all the same resolution, liqmat, so that was a mistake, oops. In the future I'll scan them in full to make managing that easier. I at least used the same DPI settings, I just cropped out most of the white in the scanner software before clicking "scan".