VOGONS


First post, by sofakng

Rank Member

I have a copy of King's Quest 1 (v2.0f) on a 720K floppy disk which I ripped to Kryoflux stream format and I'm trying to understand the copy protection format.

There is a bunch of information on the copy protection:
http://nerdlypleasures.blogspot.com/2021/03/b … tection-on.html
http://agi.sierrahelp.com/CopyProtection/index.html
http://agiwiki.sierrahelp.com/index.php?title … rypted_AGI_data

...and I've been analyzing the tracks using the HxC Floppy Emulator software and it seems to match these specifications:

  1. Sectors 2 through 8 must have a nonstandard sector size of 1024 bytes per sector (a normal DOS sector has 512 bytes).
  2. Sector 1 must have a nonstandard sector size of 8192 bytes per sector. This sector contains the decryption key.
  3. Sector 1 *must* have a CRC error. (BTW: This is what prevents the copy-protection from working under Windows 98, because Win98's Int13 handler returns incorrect error codes (0Ah instead of 10h for CRC error))
  4. All sectors on track 6 must overlap, i.e. the data for sector 2 also contains the beginning of sector 3, and so on. The data from sector 1 contains *all* the following sectors.

Sector 1 does have a sector size of 8192 but how is that possible?

Track 6 also has 8 sectors instead of 9 (like the rest of the tracks).

Can anybody explain how this non-standard track/sectors actually works?

Reply 1 of 8, by Disruptor

Rank Oldbie

Not all sectors of a track need to have the same size.
Having a bigger sector reduces the space wasted by gaps between sectors.

The XDF format, found on some IBM disks, uses HD media:
https://www.os2museum.com/wp/the-xdf-diskette-format/
Each track (on each side) contained one sector of 8 KB, one of 2 KB, one of 1 KB and one of 512 bytes.
That is the equivalent of 23 sectors of 512 bytes per track, compared to 21 sectors of 512 bytes on disks that required a 1:2 interleave, and 18 sectors of 512 bytes in the standard format.

I do not know how it was done on DD disks.

Special formats and errors on the disk made it more difficult to copy them. This was intended behaviour.

Reply 2 of 8, by mkarcher

Rank l33t
sofakng wrote on 2023-10-13, 12:05:

Sector 1 does have a sector size of 8192 but how is that possible?

The sector header contains a byte that indicates the data size of the sector. It's encoded like sector_size=2^(size_code+7) bytes. So if the size_code in the sector header is 2 (the usual value), the sector size is 2^9 (^ means exponentiation here), which is 512 bytes. If the floppy contains a sector header that has a 6 in this byte, the FDC tries to decode 8192 bytes after seeing the data address mark following the sector header.
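The size code formula above can be written as a tiny helper. This is only an illustrative sketch; the function name is made up, and the 0..7 range is an assumption based on the 128 to 16384 byte sizes a PC-type FDC deals with.

```python
def sector_size(size_code: int) -> int:
    """Data length in bytes announced by the size code byte of a sector header."""
    if not 0 <= size_code <= 7:
        raise ValueError("size code outside the range assumed here (0..7)")
    return 1 << (size_code + 7)  # same as 2 ** (size_code + 7)


for code in (0, 2, 3, 6):
    print(f"size code {code} -> {sector_size(code)} bytes")
```

Size code 2 gives the usual 512 bytes, 3 gives the 1024 bytes seen on sectors 2 to 8, and 6 gives the 8192 bytes claimed by sector 1.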

By the way: You can generate tracks with non-uniform sector sizes using the PC-compatible NEC µPD765-type FDC without extra hardware (that's how XDF or 2M disks can be formatted on a PC): When you send the "format track" command to the FDC, you tell the FDC the data size it will be writing for every sector (the data is typically filled with F6), but the size you tell the FDC is not what ends up in the sector header! Instead, when the formatting procedure starts, the FDC receives a 4-byte chunk for each sector via DMA (or PIO, but no one uses PIO for the FDC on PC-type systems, so let's ignore that option), and these 4 bytes get written to the sector header. So you can instruct the FDC to format a lot of 128-byte sectors, and write sensible sector numbers and size codes into the sector headers you actually want to keep (with the size code defining a sector way bigger than 128 bytes). Then, after formatting the track (which now contains no readable sectors, because each sector only has 128 bytes of actual data followed by a valid CRC, while the sector headers indicate way bigger sectors), you write some data into all the sectors you actually want to have, which overwrites the "stub" sector headers that are not needed anymore.
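To make the "4-byte chunk" concrete, here is a rough sketch of how such an ID buffer could be assembled in memory. It performs no floppy I/O. The function name and the example layout are made up for illustration; the byte order (cylinder, head, sector number, size code) is the one the µPD765 writes into each header. Nothing here is meant to describe how Sierra actually mastered their disks.

```python
def format_id_buffer(cylinder: int, head: int,
                     sectors: list[tuple[int, int]]) -> bytes:
    """Build the ID bytes handed to the FDC during "format track".

    `sectors` holds (sector_number, size_code) pairs in on-track order.
    Each header takes 4 bytes: cylinder, head, sector number, size code.
    """
    buf = bytearray()
    for sector_number, size_code in sectors:
        buf += bytes([cylinder, head, sector_number, size_code])
    return bytes(buf)


# Hypothetical layout: headers announcing sizes like the ones reported
# for track 6 (sector 1 = size code 6, sectors 2..8 = size code 3).
headers = format_id_buffer(6, 0, [(1, 6)] + [(n, 3) for n in range(2, 9)])
print(headers.hex(" "))
```

The size code in these bytes is independent of the data length given in the format command itself, which is the whole trick.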

You can also use this technique to fine-tune the location where 512-byte sectors are located on the track. So if there is a physical defect (like a dent) in the floppy medium, you can arrange the 18 sectors of an HD disk to skip the damaged location, and you still get 1.44MB usable capacity. The way VGACOPY is able to format marginal media indicates that VGACOPY does support some kind of "sector moving magic", but possibly it just formats 19 sectors and shuffles a dummy sector around until all relevant 18 sectors work.

Reply 3 of 8, by sofakng

Rank Member

Thanks for the information!

I understand the floppy disks can contain sectors of different sizes, but in the example of the Sierra AGI disks, track 6 contains the following sectors:

Sector 1 = 8192 bytes
Sector 2-8 = 1024 bytes

That's a total of 15,360 bytes or 15 KB for that track.

If I understand correctly, typical 3.5" floppy disks contained (9 * 512) or (18 * 512) so 4.5 KB or 9 KB.

Does that mean Sierra just wrote more data than usual to that single track? (similar to 2.88 MB disks that came much later?). If Sierra used this for every track that would be equivalent to a 2.4 MB disk, I think?

EDIT: Actually, I think I misunderstood this copy protection. I think sector 1 overlaps all the other sectors on that track so the entire track is 8192 bytes. That's only around double the 'normal' track size (4608 bytes) so I guess they just squeezed it all in there by essentially having no empty space between sectors?

Reply 4 of 8, by mkarcher

Rank l33t
sofakng wrote on 2023-10-14, 15:46:

I understand the floppy disks can contain sectors of different sizes, but in the example of the Sierra AGI disks, track 6 contains the following sectors:

Sector 1 = 8192 bytes
Sector 2-8 = 1024 bytes

That's a total of 15,360 bytes or 15 KB for that track.

If I understand correctly, typical 3.5" floppy disks contained (9 * 512) or (18 * 512) so 4.5 KB or 9 KB.

Does that mean Sierra just wrote more data than usual to that single track? (similar to 2.88 MB disks that came much later?). If Sierra used this for every track that would be equivalent to a 2.4 MB disk, I think?

You can't fit 15KB on a track. A HD track is 100'000 raw bits long, and these bits are allocated to the track index mark, the sector headers (including the ID address mark) and the data sectors themselves (including the data address mark). 100'000 raw bits is 12500 raw bytes, which is the ultimate limit of data storable on a single track of a 3.5" HD floppy. If the drive is rotating slowly, you might be able to write a couple of bytes more; if the drive rotates fast, the track might be over after 12400 raw bytes.
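The 100'000 figure follows from the data rate and the rotation speed. A rough sketch, assuming the standard HD values of 500 kbit/s at 300 RPM (the function name is made up for illustration):

```python
def raw_bits_per_track(data_rate_bps: int, rpm: int) -> float:
    """Raw bits that pass under the head during one revolution."""
    return data_rate_bps * 60 / rpm


hd_bits = raw_bits_per_track(500_000, 300)
hd_bytes = hd_bits / 8
claimed_bytes = 8192 + 7 * 1024  # sector 1 plus sectors 2-8, taken at face value

print(f"raw bits per track:  {hd_bits:.0f}")
print(f"raw bytes per track: {hd_bytes:.0f}")
print(f"claimed by headers:  {claimed_bytes}")
print("fits" if claimed_bytes <= hd_bytes else "does not fit without overlap")
```

Even before subtracting headers and gaps, the sizes announced in the sector headers add up to more than the raw capacity of the track.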

The answer to this mystery is that sector 1 overlaps a lot of the remaining sectors. I would guess the actual format of that track has space for 9 sectors of 1024 bytes, but sector 1 has been formatted with a "creative" sector header misrepresenting its size as 8192 bytes. This enables the copy protection to determine the exact location of sectors 2 to 6, and check whether the padding at that point looks correct.

Writing a big number into the sector header to be able to read "between the sectors" is the easiest approach, and works even when using BIOS calls, but as long as you only want to enlarge the first sector of a track, it's not necessary to write a creative header: The FDC has a "diagnostic read track" command that skips matching the sector header values. If you send a "diagnostic read track" command and tell it that you want to read a sector identified as "track 5, head 0, sector 1, size code 6", it will start reading at the first sector on that track no matter what the sector header says, and read as many bytes as the size you passed to the floppy controller.

Reply 5 of 8, by sofakng

Rank Member

Thanks again so much! I'm really learning a lot and this is fascinating.

You said an HD floppy disk is 100,000 raw bits (max). Are these also called cells?

For example, HxC shows the following information:

Track RPM : 300 RPM
Bitrate : VARIABLE
Track format : ISOIBM_MFM_ENCODING
Track len : 99847 cells
Number of side : 2
Interface mode: GENERIC_SHUGART_DD_FLOPPYMODE, Shugart Interface

If I understand, then this track uses 99,847 cells (bits) out of a max of 100,000. (ie. almost all of them?)

Here is the sector 1 data:

MFM Sector
Sector ID:001
Track ID:006 - Side ID:000
Size:08192 (ID:0x06)
DataMark:0xFB
Head CRC:0xAD72 (Ok)
Data CRC:0x0000 (BAD CRC!)
Start Sector cell:2036
Start Sector Data cell:2740
End Sector cell:34061
Number of cells:32025

This sector starts at position 2036 so can I assume the track header is position 0 to 2035? ...and is the sector header 2036 to 2739?

Therefore, the data is from 2740 to 32025 or 29286 bits but that's (29286/8) 3660.75 bytes?

Sorry for so many questions and thanks again for helping me understand all of this!

Reply 6 of 8, by mkarcher

Rank l33t
sofakng wrote on 2023-10-14, 17:33:

Thanks again so much! I'm really learning a lot and this is fascinating.

You said an HD floppy disk is 100,000 raw bits (max). Are these also called cells?

For example, HxC shows the following information:

Track RPM : 300 RPM
Bitrate : VARIABLE
Track format : ISOIBM_MFM_ENCODING
Track len : 99847 cells
Number of side : 2
Interface mode: GENERIC_SHUGART_DD_FLOPPYMODE, Shugart Interface

Please note that it says DD, not HD, in the interface mode. As a bit (as well as a "cell") is twice as long in DD mode, the capacity of a track on a 3.5" DD disk is just 50,000 "raw bits". The "cells" in HxC language are called "half-bits" by superformat, so two cells make one bit. A DD track has a nominal length of 50,000 bits or 100,000 cells.

sofakng wrote on 2023-10-14, 17:33:

If I understand, then this track uses 99,847 cells (bits) out of a max of 100,000. (ie. almost all of them?)

You should understand that cells on a track are not allocated in the sense that some of them are "used" and others are "free". Instead, the floppy controller writes a continuous stream of cells when it formats a track. Some cells are used as address marks, dividing the track into sectors, some cells are used for the sector headers, some more cells are used for the sector payload, and the remaining cells are padding (called "gap" in official floppy format discussions). In this case, the original disk that was dumped was likely rotating slightly fast when it was written, so that the controller writing the track only managed to write 99,847 cells before the disk made a complete revolution. This would be about 0.15% too fast, which is completely within the specification of floppy drives.
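The percentage can be checked in a couple of lines. This is just the arithmetic from the paragraph above, with the nominal DD track length of 100,000 cells as the reference; the names are made up for illustration.

```python
NOMINAL_DD_CELLS = 100_000  # nominal DD track length in cells


def track_shortfall_percent(measured_cells: int,
                            nominal_cells: int = NOMINAL_DD_CELLS) -> float:
    """How much shorter than nominal the track is, in percent."""
    return (nominal_cells - measured_cells) / nominal_cells * 100.0


print(f"{track_shortfall_percent(99_847):.3f} % short of nominal")
```

A track that is this much shorter than nominal corresponds to a drive that was spinning roughly that much too fast while writing.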

When a disk is dumped, all cells are read until the revolution is complete. This can be coarsely identified by the index signal, and fine matching can be obtained by actually looking at the data received from the drive. All magnetic pulses that are delivered from the floppy drive to the floppy controller are supposed to be on "cell boundaries". This enables the reading floppy controller to keep in sync with the cell structure on the floppy drive. This mechanism is called "clock recovery".

sofakng wrote on 2023-10-14, 17:33:

Here is the sector 1 data:

MFM Sector
Sector ID:001
Track ID:006 - Side ID:000
Size:08192 (ID:0x06)
DataMark:0xFB
Head CRC:0xAD72 (Ok)
Data CRC:0x0000 (BAD CRC!)
Start Sector cell:2036
Start Sector Data cell:2740
End Sector cell:34061
Number of cells:32025

This sector starts at position 2036 so can I assume the track header is position 0 to 2035? ...and is the sector header 2036 to 2739?

Every element on a floppy track starts with a synchronization pattern (also known as preamble), which should make sure that the clock recovery mechanism is solidly locked onto the floppy signal when relevant data arrives, then an address mark that uniquely identifies the end of the preamble and the type of data that is going to follow. For everything except the "index address mark", after the address mark, there will be data bytes (4 data bytes in the sector header, 128-16384 data bytes in the sector data) followed by a 2-byte CRC. Between all of these elements, there is always some padding. So you can't assume the "track header" (actually called "Index Address mark") to occupy the whole space. The CPC wiki has a page describing the common DD MFM floppy format used in many computers. This format is also used in the IBM PC (and HD is just the same stuff with the cells being half as long). Before the first sector ID address mark (and its preamble), you will find around 80 gap bytes (the end of the gap "around the clock" between the last sector and the index address mark), 16 bytes for the index address mark and 50 bytes gap between the index address mark and the first ID address mark. This is 146 bytes in total, which is 1168 bits or 2336 cells. The lower number in the HxC dump indicates that the dump likely started a little bit late in the revolution, so that some of the 80 bytes of the gap "around the clock" are allocated to the end of the track instead of the start. This is no issue.

The sector header is supposed to consist of 12 preamble bytes, 4 address mark bytes, 4 payload bytes and 2 bytes CRC. This is 22 bytes = 176 bits = 352 cells. Following that, there is supposed to be 22 bytes gap until the sector data begins (again, with a preamble). This makes another 352 cells, yielding 704 cells in total. This perfectly matches with the data displayed by HxC, so the gap size of this disk is obviously standard.
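The byte and cell counts from the last two paragraphs can be tallied like this. It is only a sketch of the arithmetic with nominal values; the dictionary labels are made up, and the two HxC numbers are the ones quoted for sector 1.

```python
CELLS_PER_BYTE = 16  # 8 data bits per byte, 2 cells per bit in MFM

# Nominal byte counts as described above (not measured values).
TRACK_LEAD_IN = {
    "gap before index address mark": 80,
    "index address mark": 16,
    "gap after index address mark": 50,
}
SECTOR_HEADER = {"preamble": 12, "address mark": 4, "ID payload": 4, "CRC": 2}
GAP_BEFORE_DATA = 22


def to_cells(byte_count: int) -> int:
    return byte_count * CELLS_PER_BYTE


lead_in_cells = to_cells(sum(TRACK_LEAD_IN.values()))
header_cells = to_cells(sum(SECTOR_HEADER.values()))
header_to_data_cells = header_cells + to_cells(GAP_BEFORE_DATA)

# Positions reported by HxC for sector 1 on track 6.
hxc_start_sector = 2036
hxc_start_data = 2740

print("nominal lead-in:", lead_in_cells, "cells")
print("sector header:", header_cells, "cells")
print("header start to data start:", header_to_data_cells, "cells")
print("HxC header start to data start:", hxc_start_data - hxc_start_sector, "cells")
```

The last two numbers agree, while the lead-in on this dump (2036 cells) is shorter than the nominal figure.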

sofakng wrote on 2023-10-14, 17:33:

Therefore, the data is from 2740 to 32025 or 29286 bits but that's (29286/8) 3660.75 bytes?

I'm unsure how HxC gets this "end sector cell" number, but the end is 34061 and the number of cells is 32025, which is about 2001.6 bytes. This does not match the sector header claiming there would be 8 KB of data. Actually, this disk is likely formatted like a normal 9-sector DD disk, with sector headers and data areas laid out perfectly normally. There are just fake values in the sector headers that claim bigger sector sizes and allow the software to read past the real sector end into the gap.

A regular DD floppy has 16 bytes preamble/data address mark + 512 bytes sector data + 40 bytes gap until the next sector, which only makes 568 bytes until the next sector header is expected. It's not impossible that Sierra increased the size of the gap after the first sector, and decreased gaps between the later sectors, so that there are up to 2000 really existing bytes before the sector header of sector 2 starts. The "CRC" displayed here is 0000 (which is likely not an actual CRC). The synchronization pattern for the next sector ID consists of 12 bytes, all reading as 00. So it is likely that the dump of sector 1 terminated as soon as the header for sector 2 arrived, which was a lot earlier than the 8192 bytes declared in the sector header.
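For comparison with the 0000 value, this is roughly how a genuine CRC on an IBM-style MFM track is computed: a CRC-16 with polynomial 0x1021 and initial value 0xFFFF, covering the three A1 sync bytes, the address mark and the bytes that follow. The function names are made up for illustration, and the sketch only recomputes the header CRC from the fields HxC printed for sector 1; it does not read the dump.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


SYNC = bytes([0xA1, 0xA1, 0xA1])  # sync bytes are included in the CRC


def id_field_crc(track: int, side: int, sector: int, size_code: int) -> int:
    """CRC expected after a sector header (ID address mark 0xFE)."""
    return crc16_ccitt(SYNC + bytes([0xFE, track, side, sector, size_code]))


def data_field_crc(payload: bytes, data_mark: int = 0xFB) -> int:
    """CRC expected after a data field with the given data address mark."""
    return crc16_ccitt(SYNC + bytes([data_mark]) + payload)


# Header fields as shown by HxC: track 6, side 0, sector 1, size code 6.
print(f"{id_field_crc(6, 0, 1, 6):04X}")
```

With those header fields the result is AD72, the same value HxC shows as "Head CRC". Running data_field_crc over the bytes actually decoded for sector 1 would show whether 0000 could be a real data CRC; that check is not done here.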

Reply 8 of 8, by mkarcher

Rank l33t
sofakng wrote on 2023-10-15, 01:01:

Lots of great information! Do you know if there is a way to convert a Kryoflux dump to hex bytes? (not just the sector data but the entire track?)

I'd like to see the pre-amble, gap bytes, etc.

While I do have a KryoFlux, I do not know of a tool for this. In general, this is impossible, but the idea makes a lot of sense. It should be perfectly possible to convert properly mastered floppies that had their data written while formatting into a hex byte stream. But as soon as you write data after formatting (which is what you usually do on PC-type floppy controllers, as the format+write instruction was introduced very late in the PC era), the floppy controller will look for the sector header written while formatting (reading the track from the drive), then switch over into write mode 22 bytes later (352 cell times that are no longer actively synchronized with the floppy data stream) and write a new preamble + data address mark + data + CRC. As this is around 550 bytes on standard PC sectors, this is 8800 cells. Hitting exactly the length of the 8800 cells that were written during formatting is extremely unlikely, because the rotational speed of floppies isn't controlled that well. So you will get two splice points. One is likely "not that much off", which is the splice point where the new preamble for the data starts, but there will definitely be a noticeable hiccup after the CRC. In particular, this means that the gap between a sector that has been re-written and the next sector header is likely not an integer number of bytes. Even worse: If you just continue reading cells after the data end, you might lock in on the "wrong phase". A cell is just half a bit, so locking in on cell boundaries alone might run the MFM decoder treating the second cell of one bit and the first cell of the next bit as the two cells that make up one data bit, which will return unusable data.
You can observe this effect when you use the "diagnostic read track" command to force the floppy controller into reading 8K/16K/32K without interruption starting at the first data address mark: On a disk which has re-written sectors, some sectors are clearly identifiable in the output, while other sectors are not. The PC floppy controller keeps trying to read cells without re-synchronizing the cell phase while it reads one sector (but it still keeps track of the cell clock). This makes complete sense, because there are no splice points inside one properly written sector. On the other hand, after reading one sector, the controller will use the next pre-amble to identify the cell phase and properly lock to the correct bit clock (and not the "bad phase" variant) during the next preamble. Even the "small" splice at the start of a sector might be enough to hit the wrong phase sometimes, so the PC floppy controller locks into the correct phase on every preamble (which is a trivial task given the expected preamble pattern).
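The "wrong phase" effect is easy to reproduce on paper. Below is a minimal sketch of plain MFM encoding (no address marks, no missing clock bits) and of a decoder that simply picks every other cell. The function names are made up; a real controller additionally runs clock recovery, which is not modelled here.

```python
def mfm_encode(data: bytes, prev_bit: int = 0) -> list[int]:
    """Encode bytes into MFM cells: each data bit becomes (clock cell, data cell)."""
    cells = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            # A clock pulse is written only between two consecutive 0 bits.
            clock = 1 if (prev_bit == 0 and bit == 0) else 0
            cells += [clock, bit]
            prev_bit = bit
    return cells


def mfm_decode(cells: list[int], phase: int) -> bytes:
    """Pick every other cell. phase=1 hits the data cells, phase=0 the clock cells."""
    bits = cells[phase::2]
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


payload = b"KQ1"
cells = mfm_encode(payload)
print("correct phase:", mfm_decode(cells, 1))
print("wrong phase:  ", mfm_decode(cells, 0).hex())
```

Both decodes see exactly the same cells on exactly the same cell boundaries; only the choice of which cell of each pair counts as "data" differs, and that is what the preamble is used to settle.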

So while it is not possible to convert a full track of a floppy after everyday use with re-written sectors into one continuous byte stream, it is very much possible to turn the 18 sectors including the preamble, data address mark and CRC into dedicated byte streams (this is what gets re-written if you write an "image" to a floppy disk), and to turn the data that has been unchanged since formatting (the 18 gaps between the sectors including the sector headers) into dedicated byte streams. The splice point at the end of a sector (after the CRC) might contain partially overwritten cells from multiple different writes to the same location, and usually contains "unintelligible garbage" for some cell times, which cannot be accurately represented by a bit stream or byte stream at all.

If you use a conversion tool to extract the MFM sectors from a Kryoflux image, this tool does the same thing as the PC floppy controller: It resynchronizes the MFM decoder on every new preamble; it does not construct a continuous "raw track byte stream".

In case you are not aware how the Kryoflux image works at the lowest level: A "normal floppy signal" consists of fixed-length cells, and at each cell boundary, there might be a pulse, or there might be no pulse. The "pulse" might not be as sharp as this explanation sounds (in fact, the duration of a pulse might be around a full cell time). On MFM encoded disks, no two consecutive cell boundaries may have a pulse. Instead, there are always one, two or three cell boundaries without a pulse between two pulses. Because there also may not be four or more cell boundaries without a pulse, it is easy to keep the reader in sync (as long as the medium is in sufficiently good condition to provide a clean signal). A standard floppy controller only checks at every other cell boundary whether there is a pulse or not. This directly extracts the binary data stream from an MFM encoded signal. The pulses at the boundaries that are not (digitally) read by the floppy controller are used only for the clock synchronization mechanism that tells the floppy controller when the "data" pulses are supposed to happen. So what a PC application sees when "reading" a floppy is not the raw signal, but already a lot of interpretation applied onto it. The Kryoflux directly samples the output of the floppy drive (which already contains some analog pre-processing to detect the pulses), and stores the time between pulses at a considerably higher resolution than the floppy controller does. If I remember correctly, the resolution of the KryoFlux is around 18.5ns. On the other hand, the floppy controller on a HD floppy is checking for the presence of a pulse every 2µs (500kbit/s). The cool thing about the KryoFlux format is that this format is able to store the data delivered by the floppy drive in a data-rate-independent way, so that it can be accurately replayed by a HxC/Gotek-type floppy emulator, even if it is not recorded at a standard PC data rate.
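As a rough illustration of the step from "time between pulses" to cells, here is a naive quantizer. It assumes a fixed, known cell time (2000 ns for DD) and made-up interval values; it does not parse the KryoFlux stream format, and real tools use a PLL-style clock recovery instead of simple rounding.

```python
def intervals_to_cells(intervals_ns: list[float], cell_ns: float) -> list[int]:
    """Turn pulse-to-pulse times into a cell stream (1 = pulse at that boundary)."""
    cells = []
    for interval in intervals_ns:
        n = round(interval / cell_ns)  # cell times until the next pulse
        if n not in (2, 3, 4):
            # Valid MFM has 1 to 3 pulse-free boundaries between two pulses.
            raise ValueError(f"interval of {interval} ns is not valid MFM")
        cells += [0] * (n - 1) + [1]
    return cells


# Hypothetical DD intervals, slightly off the ideal 4000/6000/8000 ns.
example = [4100, 5900, 8050, 3950]
print(intervals_to_cells(example, cell_ns=2000))
```

For HD media the same function would be called with a cell time of 1000 ns. Intervals outside the 2 to 4 cell range do occur on real disks (splice points, damaged areas), so a real converter has to tolerate them instead of raising an error.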