VOGONS


First post, by evasive

User metadata
Rank Oldbie

Hi,

For the UH19 project we sometimes encounter files that will not convert properly to PDF. I now have 2 of those.

They appear to be some sort of Office/Word files but the headers don't make sense at all. They have PowerPoint 4 slides embedded.

We have tried Office for DOS, Office for Windows 3.x, Office 95, Office 97, LibreOffice, and WordPerfect for DOS 5/6. They are not WordStar files either.

Does anyone have a clue what these might be?

If they can be converted, we would like to know how, in case we encounter more of these.

Thanks in advance.

Attachments

  • Filename
    1545.zip
    File size
    29.39 KiB
    Downloads
    49 downloads
    File comment
    zipped .doc file 1545
    File license
    CC-BY-4.0
  • Filename
    1535.zip
    File size
    17.6 KiB
    Downloads
    28 downloads
    File comment
    zipped .doc file 1535
    File license
    CC-BY-4.0

Reply 1 of 18, by megatron-uk

User metadata
Rank Oldbie

Is it some type of printer-specific PS file? There are references to an HP LaserJet 4L, an HPPCL5E driver and the LPT1: port in some of the non-ASCII data towards the end of the file, after references to various fonts (Arial, Times New Roman, Wingdings, etc.):

000209e0 15 04 13 39 15 04 00 00 00 00 05 00 00 00 07 00 |...9............|
000209f0 00 00 12 00 00 00 13 21 14 ff 15 00 0c 00 09 44 |.......!.......D|
00020a00 44 45 5f 4c 49 4e 4b 31 57 06 00 00 01 51 00 00 |DE_LINK1W....Q..|
00020a10 00 00 79 06 00 00 01 51 00 00 00 00 48 50 20 4c |..y....Q....HP L|
00020a20 61 73 65 72 4a 65 74 20 34 4c 00 4c 50 54 31 3a |aserJet 4L.LPT1:|
00020a30 00 48 50 50 43 4c 35 45 00 48 50 20 4c 61 73 65 |.HPPCL5E.HP Lase|
00020a40 72 4a 65 74 20 34 4c 00 00 00 00 00 00 00 00 00 |rJet 4L.........|
00020a50 00 00 00 00 00 00 00 00 00 0a 03 04 00 44 00 78 |.............D.x|
00020a60 00 03 07 00 00 01 00 09 00 00 00 00 00 00 00 01 |................|
00020a70 00 01 00 fc ff 00 00 01 00 00 00 00 00 20 00 00 |............. ..|
00020a80 00 80 c9 93 94 03 00 00 00 02 00 00 00 00 00 00 |................|
00020a90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
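
The same kind of plain-text hints can be pulled out without paging through a hex editor. A minimal Python sketch, similar in spirit to the Unix strings tool; the function name and the 4-character minimum are illustrative choices, not anything taken from the files:

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII in data."""
    # Printable ASCII is 0x20..0x7E; shorter runs are usually noise.
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]
```

Reading 1545.doc in binary mode and passing the bytes to extract_strings() should list the printer, port and font names together with their offsets, which makes it easier to see where the readable text stops and the binary structures begin.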

My collection database and technical wiki:
https://www.target-earth.net

Reply 2 of 18, by megatron-uk

User metadata
Rank Oldbie

There's definite MS Word data in there - you can tell from the font definitions around the text and the !EMBED syntax for the slides... however nothing I've tried will parse the file correctly.


Reply 3 of 18, by Horun

User metadata
Rank l33t

Yes, I have run into some odd .DOC files before. One was from Phil's and it had a RIFF-formatted picture in it (which did not convert using standard PDF conversions).
This one has some PowerPoint content as mentioned, but none of the PowerPoint and web-based DOC converters can convert it properly.
If I can figure out the image types embedded might be able to hex edit and get the images back but that is a long shot.......

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor.

Reply 4 of 18, by evasive

User metadata
Rank Oldbie
Horun wrote on 2021-01-27, 00:17:

If I can figure out the image types embedded might be able to hex edit and get the images back but that is a long shot.......

That would be absolutely awesome. Thank you for even considering doing so 😀

Reply 5 of 18, by Horun

User metadata
Rank l33t

As an update: last night I tried a few more apps. One said 1535 was a Word 5 doc and 1545 was a Word for Windows 2 doc. I installed DOS Word 5 and it opened 1535, but still no images. The Word Viewer v6 (supposed to support viewing Word 6, WinWord 1.x through WinWord 6 and Mac Word 4+) would open both but without images, and Word 97 and Office 2003 with all the converters did the same.
It does appear from Edit++ and a hex editor that they were created by different versions of Word, with 1545 definitely a WinWord or MacWord version.
Maybe that is the issue! If it was created by an early MS Office for Mac version without Windows compatibility, that would explain the missing images in Windows.
I will try some other things in hopes of viewing them and at least getting a screenshot of the images, or getting an idea of the image types so I can try to hex edit them back out if possible.
Something to work on 😀
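
A version guess like the one those apps made can be cross-checked from the first few bytes of each file. The sketch below is only a rough aid: the signature table reflects my understanding of the file(1) magic database and public format notes, and should be treated as an assumption to verify rather than a reference. The function and table names are made up for illustration.

```python
# Leading-byte signatures, listed as assumptions to verify, not as a spec.
# Note: MS Write files are believed to share the 31 BE start with Word for DOS.
SIGNATURES = [
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (Word 6 and later, among others)"),
    (b"\x31\xbe\x00\x00", "Word for DOS"),
    (b"\x9b\xa5", "Word for Windows 1.x"),
    (b"\xdb\xa5", "Word for Windows 2.x"),
]


def sniff_word_format(header: bytes) -> str:
    """Guess the Word flavour from the first bytes of a file."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"
```

Passing the first 8 bytes of 1535.doc and 1545.doc to sniff_word_format() would show whether the headers agree with "Word 5" and "Word for Windows 2". A matching signature only says what the header claims, not that the rest of the file is intact.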

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor.

Reply 6 of 18, by debs3759

User metadata
Rank Oldbie

OpenOffice 4 doesn't recognise them either.

See my graphics card database at www.gpuzoo.com
Constantly being worked on. Feel free to message me with any corrections or details of cards you would like me to research and add.

Reply 8 of 18, by Horun

User metadata
Rank l33t

Got some Word 5 code documentation and am making minor progress towards extracting the images from 1535. 1545 has me really confused and I think it is corrupt; going by every tool and code reference, I think it is impossible so far to get the images out. It conforms to WinWord but has missing code from what I can tell so far. Just an update with no good news...


Reply 9 of 18, by evasive

User metadata
Rank Oldbie

Any news would be good news.
Getting the pics would be awesome.
Knowing the 1545 document was corrupted will give peace of mind.
Thank you for your efforts so far 😀

Reply 11 of 18, by Warlord

User metadata
Rank l33t

For those files, why don't you just use something like CutePDF with Ghostscript? Just wanted to say I was able to open them with a later version of WordPerfect using its converters. The big issue is that the docs use strange fonts, so you have to convert the Wingdings or whatever font into Arial or something else.

Reply 12 of 18, by astonsmith

User metadata
Rank Newbie
wiretap wrote on 2021-01-31, 20:09:

Print --> scan to PDF.

How can we print without being able to open the file to begin with?

Warlord wrote on 2021-01-31, 22:13:

For those files, why don't you just use something like CutePDF with Ghostscript? Just wanted to say I was able to open them with a later version of WordPerfect using its converters. The big issue is that the docs use strange fonts, so you have to convert the Wingdings or whatever font into Arial or something else.

Were you able to open 1545? I found that WP12 was able to open 1535.doc, but not 1545. I have attached a conversion to RTF. Older versions of WordPerfect wouldn't touch them ("Unknown file format").

Attachments

  • Filename
    1535_rtf.zip
    File size
    17.8 KiB
    Downloads
    26 downloads
    File license
    Fair use/fair dealing exception

Reply 13 of 18, by Horun

User metadata
Rank l33t
astonsmith wrote on 2021-02-01, 12:36:

I found that WP12 was able to open 1535.doc, but not 1545. I have attached a conversion to RTF. Older versions of WordPerfect wouldn't touch them ("Unknown file format").

Thanks for trying! A variety of different word processors can open both (by force), but with formatting errors, and the images/line art are just ASCII garbage.
I have been able to resurrect most of 1535 (including the Wingdings symbols for TM, copyright, etc.) but minus pictures.
From what I have been able to decipher in the headers and with certain tools, it appears they were written in Word 6 for Windows but saved for backward compatibility as Word 5 for DOS. That wrecked the images and left "Wingdings symbol" text, since Word 5 for DOS does not use Wingdings and also cannot keep embedded images if they are BMP, JPEG, TIFF, etc. That is my best guess as to why they are corrupt: I think it goes beyond some missing bits/bytes of code and is an actual bad conversion of the original DOC.
Printing the file raw in PCL5 to an HP LaserJet 4 "MAY" be possible, but I tried it on my Samsung laser using PCL6 to PCL5 conversion and it looks the same as opening the files in Word 97 or Word 2003, just junk 🤣....
When done with 1535 I will work on 1545 and see how much is salvageable.... Added: 1535 has over one hundred page breaks but is really only about 20 pages long = definite format corruption of some type ;p


Reply 14 of 18, by evasive

User metadata
Rank Oldbie

I'd have to check how things land inside a Word doc if embedded with DDE, as that seems to be the method they used. Worst case, the actual images were in a different location and not completely embedded as such...

Reply 16 of 18, by evasive

User metadata
Rank Oldbie

So will the file viewer from Total Commander.

The text we can retrieve, but it would be nice if we can get the pictures as well + some formatting for the bios screens.
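
If 1535 really is stored in the Word for DOS layout (one of the apps mentioned earlier in the thread said so), the text retrieval can be scripted. The sketch below rests on two assumptions about that layout that should be checked against a format reference: the text starts right after a 128-byte header, and a 32-bit little-endian value at offset 14 gives the offset just past the end of the text. The function name and the cp437 default are illustrative choices.

```python
import struct

HEADER_SIZE = 128   # assumed: text begins immediately after the header
FCMAC_OFFSET = 14   # assumed: 32-bit little-endian offset of the end of the text


def extract_dos_word_text(data: bytes, encoding: str = "cp437") -> str:
    """Return the text stream of a Word for DOS style file as a string."""
    if len(data) < HEADER_SIZE or data[:2] != b"\x31\xbe":
        raise ValueError("not a Word for DOS style header")
    (fc_mac,) = struct.unpack_from("<I", data, FCMAC_OFFSET)
    if not HEADER_SIZE <= fc_mac <= len(data):
        raise ValueError("end-of-text offset is out of range")
    text = data[HEADER_SIZE:fc_mac].decode(encoding)
    # Normalise DOS line endings; form feeds (0x0C) are left untouched.
    return text.replace("\r\n", "\n")
```

This only recovers the character stream. Formatting such as the layout of the BIOS screens lives in separate structures after the text, which this sketch does not try to interpret.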

Reply 17 of 18, by Horun

User metadata
Rank l33t

Noticed the ATC-1000 R.2 and R.3 docs are in the same funny format, but R.4 is in true WinWord 6 going by the headers, though it also has some formatting issues and odd code included.
Good point on possible DDE and why 1545 has links to PPT files. The .DOC sizes do suggest the images were not actually included, or else the files would be much larger, unless they are just line art.
Still would like to know what all that odd hex code near the end of each of them is supposed to do 😀 ....
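
One way to test the DDE theory is to scan each file for link markers and for anything that looks like a path to a PowerPoint file. This is a minimal sketch: the DDE_LINK pattern is based on the DDE_LINK1 string visible in the hex dump earlier in the thread, while the path pattern is a rough guess that will miss names containing spaces. All names in the code are illustrative.

```python
import re

# Bookmark-style names such as the DDE_LINK1 seen in the hex dump.
DDE_NAME = re.compile(rb"DDE_LINK\d+")
# Rough guess at a DOS/Windows path ending in .PPT (no spaces allowed).
PPT_PATH = re.compile(rb"[A-Za-z0-9_:\\~.\-]+\.ppt", re.IGNORECASE)


def find_link_hints(data: bytes):
    """Return (link names, .ppt paths) found anywhere in the raw bytes."""
    names = sorted({m.group().decode("ascii") for m in DDE_NAME.finditer(data)})
    paths = sorted({m.group().decode("ascii") for m in PPT_PATH.finditer(data)})
    return names, paths
```

If the scan turns up link names and external .PPT paths but no large binary blobs, that would fit the idea that the slides were linked rather than stored in the document, which would also explain the small file sizes.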


Reply 18 of 18, by Warlord

User metadata
Rank l33t

I don't think they ever had pictures, but I'm pretty sure they have tables and other formatting. Yeah, I can see what I can do. Here is the best I can do with 1535. I have never been good with word processors, but it's a good effort.

Attachments

  • Filename
    1535 mod.pdf
    File size
    119.89 KiB
    Downloads
    16 downloads
    File license
    Fair use/fair dealing exception