laurence dougal myers

Nearly two years after I last poked around in Robin Hood's guts (I'm referring to the game, of course), I got the urge to investigate further, and I now have a rough idea of how the data in the rules files is stored.

I will be discussing the format based on the "ERULES.PRG" file, which contains English text. The other rules files should be in exactly the same format; only the string values should differ. All multi-byte values are little-endian (unless otherwise stated).

I'll say it up here at the start of this post: if anyone wants to join forces in exploring the intricacies of the game, feel free to send me an e-mail: jestarjokin@jestarjokin.net

View after the break, IF YOU DAAAAARE...!


ERULES.PRG

The rules file is split into 10 sections (based on the number of functions called to read in the data). The file begins with a small 2-byte header (containing a value of 0x0000); the 10 sections follow.

Section      Start        End          Size
Header       0x00000000   0x00000002   0x00000002
Section 1    0x00000002   0x00000666   0x00000664
Section 2    0x00000666   0x000012E6   0x00000C80
Section 3    0x000012E6   0x0000628E   0x00004FA8
Section 4    0x0000628E   0x00006502   0x00000274
Section 5    0x00006502   0x00007260   0x00000D5E
Section 6    0x00007260   0x000110E6   0x00009E86
Section 7    0x000110E6   0x00011122   0x0000003C
Section 8    0x00011122   0x000113BE   0x0000029C
Section 9    0x000113BE   0x000114E8   0x0000012A
Section 10   0x000114E8   0x00011562   0x0000007A
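
The table above can be turned into a quick sanity check. This is only a sketch: the names SECTIONS, check_layout and split_sections are made up for illustration, and the offsets are the ones listed for ERULES.PRG, with "End" treated as exclusive.

```python
# (name, start, end) copied from the table above; "end" is exclusive.
SECTIONS = [
    ("Header",     0x00000000, 0x00000002),
    ("Section 1",  0x00000002, 0x00000666),
    ("Section 2",  0x00000666, 0x000012E6),
    ("Section 3",  0x000012E6, 0x0000628E),
    ("Section 4",  0x0000628E, 0x00006502),
    ("Section 5",  0x00006502, 0x00007260),
    ("Section 6",  0x00007260, 0x000110E6),
    ("Section 7",  0x000110E6, 0x00011122),
    ("Section 8",  0x00011122, 0x000113BE),
    ("Section 9",  0x000113BE, 0x000114E8),
    ("Section 10", 0x000114E8, 0x00011562),
]


def check_layout(sections):
    """Check that each section starts where the previous one ended.

    Returns the offset just past the last section (the expected file size).
    """
    expected_start = 0
    for name, start, end in sections:
        if start != expected_start or end < start:
            raise ValueError(f"{name} does not follow the previous section")
        expected_start = end
    return expected_start


def split_sections(data, sections=SECTIONS):
    """Slice raw file bytes into one bytes object per section."""
    return {name: data[start:end] for name, start, end in sections}
```

These offsets are specific to ERULES.PRG. Since the other rules files carry different strings, their offsets will presumably differ from Section 3 onwards, so a real reader would walk the sections in order rather than rely on fixed positions.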

Section 1

Section 1 is pretty simple. The number of entries is given as a 2-byte value, and that many single-byte entries then follow.

Section 1
Data          Size
num_entries   2
entry+        1 * num_entries

Section 2

Section 2 starts with 2 bytes for the number of entries.

Each entry is made up of a number of components, starting with s2WordA, which is 2 bytes. If this value is not 0xFFFF, it is multiplied by 8 and then has 4 added to it as it is read. The same adjustment is applied to s2WordB, another 2 bytes. s2WordC and s2WordD are each read as plain 2-byte values.

After this, there are 10 individual bytes, followed by two arrays, each 32 bytes long.

In total, one entry will be 82 bytes long. Robin Hood contains 39 entries.

Section 2
Data          Size
num_entries   2
entries       82 * num_entries

Section 2 Entry
Data       Size
s2WordA    2
s2WordB    2
s2WordC    2
s2WordD    2
s2ByteA    1
s2ByteB    1
s2ByteC    1
s2ByteD    1
s2ByteE    1
s2ByteF    1
s2ByteG    1
s2ByteH    1
s2ByteI    1
s2ByteJ    1
s2ArrayA   32
s2ArrayB   32
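
As a rough illustration of the entry layout, here is a minimal Python sketch using the standard struct module. The function names and the dictionary layout are my own invention, not anything taken from the game, and I have assumed the adjusted words are not wrapped back to 16 bits.

```python
import struct

# 4 little-endian words, 10 single bytes, two 32-byte arrays: 82 bytes total.
ENTRY_FORMAT = "<4H10B32s32s"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)


def adjust_word(value):
    """Apply the "* 8 + 4" adjustment, leaving 0xFFFF untouched."""
    # Assumption: no 16-bit wrap-around is applied to the result.
    return value if value == 0xFFFF else value * 8 + 4


def parse_section2(data, offset=0):
    """Return (entries, offset just past the section)."""
    (num_entries,) = struct.unpack_from("<H", data, offset)
    offset += 2
    entries = []
    for _ in range(num_entries):
        fields = struct.unpack_from(ENTRY_FORMAT, data, offset)
        entries.append({
            "s2WordA": adjust_word(fields[0]),
            "s2WordB": adjust_word(fields[1]),
            "s2WordC": fields[2],
            "s2WordD": fields[3],
            "s2Bytes": fields[4:14],   # s2ByteA .. s2ByteJ
            "s2ArrayA": fields[14],
            "s2ArrayB": fields[15],
        })
        offset += ENTRY_SIZE
    return entries, offset
```

With 39 entries this consumes 2 + 39 * 82 = 3200 (0xC80) bytes, which matches the size listed for Section 2.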

Section 3

Section 3 contains string data! It begins with the number of entries, and the total size of the string data (in sub-section 2). Following are two sub-sections.

The first sub-section contains 2-byte values; these values represent the start of each string, given as an offset from the beginning of the string data chunk (in sub-section 2). For example, if the first string in sub-section 2 is "Wh\xCEps!\x00", this is 7 characters long (the "\x" values are my way of denoting raw ASCII values). Therefore, the first entry in sub-section 1 is 0x00, and the second entry is 0x07.

The second sub-section contains all of the string values. The string values are all null-terminated. They are also compressed in an unusual manner; characters with an ASCII value over 127 (possibly higher) are treated specially, and converted based on a lookup table hardcoded into the executable. For example, the value 0xC8 is translated to "oo" (double O). This compresses the string "Whoops!" down to "Wh.ps!", with "." representing the value of 0xC8. Other common letter combinations are compressed too, such as "me", "or", "I am", etc.

Strings are also interesting in that they can contain multiple variations of one line of dialogue. These are represented by nesting one or more entries at the start of the string, enclosed by square brackets. e.g.

"[[Hello my sweet.]Hello darling.]Good morrow, my love."

It seems the game will randomly (or perhaps linearly) choose which variation to display.
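
A possible way to pull the variations apart, as a sketch only. It assumes the brackets always nest at the very start of the string, as in the example above, and that a "]" never appears inside a variation; split_variations and pick_variation are hypothetical names.

```python
import random


def split_variations(text):
    """Split "[[A]B]C" into ["A", "B", "C"]; plain strings come back as one item."""
    # The number of leading "[" tells us how many "]" separators to expect.
    depth = len(text) - len(text.lstrip("["))
    return text[depth:].split("]", depth)


def pick_variation(text, rng=random):
    """Choose one variation at random (the game may well do it differently)."""
    return rng.choice(split_variations(text))
```

For the example string this yields "Hello my sweet.", "Hello darling." and "Good morrow, my love." as three separate lines.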

Section 3
Data               Size
num_entries        2
string_data_size   2
sub-section 1      2 * num_entries
sub-section 2      string_data_size

Section 3 Sub-Section 1
Data            Size
string_offset   2 * num_entries

Section 3 Sub-Section 2
Data          Size
string_data   variable size * num_entries (totalling string_data_size)
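
Putting the two sub-sections together, a reader for this section might look like the sketch below. The function names are hypothetical, and the lookup table in the usage note contains only the single 0xC8 mapping mentioned above; the real table lives in the executable and is not reproduced here.

```python
import struct


def parse_section3(data, offset=0):
    """Return (raw strings without their null terminators, offset past the section)."""
    num_entries, string_data_size = struct.unpack_from("<HH", data, offset)
    offset += 4
    # Sub-section 1: one 2-byte offset per string, relative to the string data.
    offsets = struct.unpack_from(f"<{num_entries}H", data, offset)
    offset += 2 * num_entries
    # Sub-section 2: the null-terminated (and still compressed) strings.
    blob = data[offset:offset + string_data_size]
    strings = []
    for start in offsets:
        end = blob.index(b"\x00", start)
        strings.append(blob[start:end])
    return strings, offset + string_data_size


def expand(raw, lookup):
    """Expand compressed bytes using a {byte value: text} table.

    Bytes missing from the table are passed through as single characters.
    """
    return "".join(lookup.get(b, chr(b)) for b in raw)
```

For instance, expand(b"Wh\xC8ps!", {0xC8: "oo"}) gives "Whoops!".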

Section 4

Section 4 starts with 2 bytes for the number of entries, followed by that many entries. Each entry is a word (16-bit).

Section 4
Data          Size
num_entries   2
entry+        2 * num_entries

Section 5

Section 5 starts with 2 bytes for the number of entries, followed by that many entries. Each entry is a word (16-bit). This seems to be the same format as Section 4.

Section 5
Data          Size
num_entries   2
entry+        2 * num_entries
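
Since Sections 4 and 5 appear to share a format, one small helper could read both. This is a sketch; read_word_array is a made-up name.

```python
import struct


def read_word_array(data, offset=0):
    """Read a 2-byte count followed by that many little-endian 16-bit words.

    Returns (list of words, offset just past the array).
    """
    (count,) = struct.unpack_from("<H", data, offset)
    values = struct.unpack_from(f"<{count}H", data, offset + 2)
    return list(values), offset + 2 + 2 * count
```

Because the helper returns the offset it stopped at, the two sections can be read back to back: pass the offset returned for Section 4 as the starting offset for Section 5.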

Section 6

Section 6 starts with 2 bytes for the number of entries, followed by that many entries. Each entry is a word (16-bit). It is then followed by another 2-byte count for a second lot of entries. Each of these second entries is only one byte long.

Section 6
Data              Size
num_entries_one   2
entry_one+        2 * num_entries_one
num_entries_two   2
entry_two+        1 * num_entries_two

Section 7

Section 7 is hardcoded as being 60 (0x3C) bytes long. Each value in this section seems to be a single byte.

Section 7
Data      Size
s7ByteA   1 * 60

Section 8

Section 8 starts with 1 byte for the number of entries. The number of entries can be 0. There are two sub-sections here.

Section 8 Sub-Section 1 contains an array of bytes, one per entry. Each byte gives the size of the corresponding data chunk in Sub-Section 2. This array is summed to get the total size of the data.

Section 8 Sub-Section 2 contains the variable-sized data. The size of this section is determined by the summed value generated when reading Sub-Section 1.

Section 8
Data          Size
num_entries   1

Section 8 Sub-Section 1
Data           Size
size_of_data   1 * num_entries

Section 8 Sub-Section 2
Data         Size
data_chunk   num_entries * variable size (totalling the sum of the size_of_data values)
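
The size-table-then-data arrangement could be read along these lines. It is a sketch that assumes the chunks are stored in the same order as their size bytes; parse_section8 is a hypothetical name.

```python
def parse_section8(data, offset=0):
    """Return (list of data chunks, offset just past the section)."""
    num_entries = data[offset]          # a single byte; may be 0
    offset += 1
    # Sub-section 1: one size byte per chunk.
    sizes = data[offset:offset + num_entries]
    offset += num_entries
    # Sub-section 2: the chunks themselves, back to back.
    chunks = []
    for size in sizes:
        chunks.append(bytes(data[offset:offset + size]))
        offset += size
    return chunks, offset
```

The offset it returns has advanced by 1 + num_entries + sum(sizes), which is the summed total described above plus the count byte and the size table.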

Section 9

Section 9 starts with 2 bytes for the number of entries. Each entry consists of four 16-bit words (8 bytes in total).

In Robin Hood, there are 37 entries.

Section 9
Data          Size
num_entries   2
entry+        8 * num_entries

Section 9 Entries
Data      Size
s9WordA   2
s9WordB   2
s9WordC   2
s9WordD   2
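
Four 16-bit words make 8 bytes per entry, and 2 + 37 * 8 = 298 (0x12A) bytes agrees with the size listed for Section 9. A minimal sketch of a reader, with Section9Entry and parse_section9 being names I have made up:

```python
import struct
from typing import NamedTuple


class Section9Entry(NamedTuple):
    s9WordA: int
    s9WordB: int
    s9WordC: int
    s9WordD: int


def parse_section9(data, offset=0):
    """Return (list of entries, offset just past the section)."""
    (num_entries,) = struct.unpack_from("<H", data, offset)
    offset += 2
    end = offset + 8 * num_entries
    # iter_unpack walks the slice in 8-byte steps, one entry per step.
    entries = [Section9Entry(*words)
               for words in struct.iter_unpack("<4H", data[offset:end])]
    return entries, end
```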

Section 10

Section 10 is an odd one. It begins with a 2-byte value, followed by an array of 20 bytes, then an array of 20 words, then another array of 20 words, then another array of 20 bytes.

The values of the last array are manipulated when each value is read in, like so:

    if val == 0x20:
        val = 0x39
    elif val == 0x06:
        val = 0x1C
    else:
        val = val - 0x41

Section 10
Data        Size
s10WordA    2
s10ArrayA   1 * 20
s10ArrayB   2 * 20
s10ArrayC   2 * 20
s10ArrayD   1 * 20
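
The whole section is a fixed 122 (0x7A) bytes, so it can be read in one go. The sketch below uses invented function names and applies the remapping described above to the last array only. It leaves the subtraction as a plain signed result, since I don't know whether the game wraps values below 0x41 back into a byte.

```python
import struct


def remap(val):
    """Apply the adjustment used for the values of the last array."""
    if val == 0x20:
        return 0x39
    if val == 0x06:
        return 0x1C
    # Assumption: no byte wrap-around; values under 0x41 come out negative.
    return val - 0x41


def parse_section10(data, offset=0):
    """Return (dict of fields, offset just past the section)."""
    (word_a,) = struct.unpack_from("<H", data, offset)
    array_a = list(data[offset + 2:offset + 22])                    # 20 bytes
    array_b = list(struct.unpack_from("<20H", data, offset + 22))   # 20 words
    array_c = list(struct.unpack_from("<20H", data, offset + 62))   # 20 words
    array_d = [remap(v) for v in data[offset + 102:offset + 122]]   # 20 bytes
    fields = {
        "s10WordA": word_a,
        "s10ArrayA": array_a,
        "s10ArrayB": array_b,
        "s10ArrayC": array_c,
        "s10ArrayD": array_d,
    }
    return fields, offset + 122
```
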

That was a massive wall of text, right? I've also added it to the [http://rewiki.regengedanken.de/wiki/.PRG|REWiki] site.