The following details the .EXS file compression (largely how the dictionary is stored). I haven't 100% confirmed it but I plan to do so tonight by translating S1_1_3.EXS. Some of it repeats stuff you already know but I wanted to write it out for my own benefit.
The below should hold for, at least, the smallest files (2 KB). Larger files tend to have multiple dictionaries.
With the application of the roman font patch, it should be possible to begin translating .EXS files now, at least the 2 KB ones, by either constructing a null dictionary or using one of the available DTE utilities to compress your file and then encoding the dictionary. If you are unsure how to edit the text and pointers in the decompressed .EXS.txt file, I'll probably detail that process as soon as I start translating.
Suikogaiden Compression
.EXS refers to a compressed file in the SCRIPT.R2 file.
.EXS.txt refers to the decompressed version of an .EXS file.
This tutorial assumes you have a perfect and properly formatted .EXS.txt file, with a 0x34 byte header, ready to be compressed.
The compressed .EXS file has the following structure:
- a twelve byte header
- the dictionary
- the compressed .EXS.txt file
- zeroes as necessary to pad the file
HEADER
Bytes 1 - 4 are the magic word TEN2 in ASCII: "54 45 4E 32" in hex.
Bytes 5 - 8 are the length of the compressed .EXS file as a little-endian word (4 bytes), excluding this header and excluding the zero padding at the end. It seems to vary by 1 in order to hit an odd or even number? (In the example below, 0x35A - 0xC = 0x34E, but the header stores 0x34F.)
Bytes 9 - 12 are the length of the decompressed .EXS.txt file as a little-endian word, including header and all.
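The three header fields above could be read with something like the following. This is only a rough Python sketch: the function name parse_exs_header is made up for illustration, and the sample bytes are the S1_1_3.EXS header quoted in this section.

```python
import struct

def parse_exs_header(data):
    """Return (compressed_length, decompressed_length) from a 12-byte header."""
    # "<4sII" = 4 raw bytes of magic, then two little-endian 32-bit words
    magic, comp_len, decomp_len = struct.unpack_from("<4sII", data, 0)
    if magic != b"TEN2":
        raise ValueError(f"unexpected magic word: {magic!r}")
    return comp_len, decomp_len

header = bytes.fromhex("54 45 4E 32 4F 03 00 00 66 06 00 00")
comp_len, decomp_len = parse_exs_header(header)
print(hex(comp_len), hex(decomp_len))  # 0x34f 0x666
```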
For example, S1_1_3.EXS is 0x35A bytes long. S1_1_3.EXS.txt is 0x666 bytes long.
The header of S1_1_3.EXS is
"54 45 4E 32 4F 03 00 00 66 06 00 00".
DICTIONARY
Initially the dictionary is blank: all hex values represent themselves.
The dictionary chunk of the compressed file, consisting of everything after byte 12 and before the ASCII word "TESL" ("54 45 53 4C"), is read in and sequentially fills in the blank dictionary with any dual-tile codes.
The cursor for the dictionary begins at "letter" 0x00. The dictionary ends when a Type J block moves the cursor to 0x100. This block is special because instead of writing a code, the two bytes YY YY are a halfword (big-endian) equal to the number of bytes to be decompressed.
The dictionary code is composed of two different kinds of block. The first byte of each is always a control byte, and the rest of the bytes are entries for letters. Control bytes >= 0x80 are Type J, and control bytes < 0x80 are Type M.
After writing a block, the cursor moves forward one spot and reads in the next block.
Type J - JUMP BLOCK - length 3 bytes
[XX YY YY]
This instruction immediately moves the cursor forward XX - 0x7F letters, and writes the code YY YY for that letter.
For example, the dictionary in S1_1_3.EXS begins with three Type J blocks: "88 C3 C3 80 55 B4 84 F3 A6".
Instead of writing to "00", the cursor immediately jumps forward 0x88 - 0x7F = 9 spots to "09" and writes the code "C3 C3". This means that all instances of 09 will be replaced by C3 C3 during decompression.
After this, the cursor moves to "0A".
Next, the cursor jumps forward one spot to "0B" and writes the code "55 B4".
After this, the cursor moves to "0C".
Next, the cursor jumps forward five spots to "11" and writes the code "F3 A6".
After this, the cursor moves to "12".
At this point, the dictionary looks like this:
00 00
01 01
02 02
03 03
04 04
05 05
06 06
07 07
08 08
09 C3 C3
0A 0A
0B 55 B4
0C 0C
0D 0D
0E 0E
0F 0F
10 10
11 F3 A6
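The Type J walkthrough above can be replayed with a few lines of Python. This is just a sketch of the rule as described here; the helper name apply_jump_block and the dict-based table are my own choices for illustration, not anything from the game's code.

```python
def apply_jump_block(table, cursor, block):
    """Apply one 3-byte Type J block; return the cursor position after it."""
    control, first, second = block
    if control < 0x80:
        raise ValueError("not a Type J control byte")
    cursor += control - 0x7F            # jump forward XX - 0x7F letters
    table[cursor] = bytes([first, second])
    return cursor + 1                   # then the cursor moves forward one spot

table = {i: bytes([i]) for i in range(0x100)}   # blank dictionary
stream = bytes.fromhex("88 C3 C3 80 55 B4 84 F3 A6")
cursor = 0
for offset in range(0, len(stream), 3):
    cursor = apply_jump_block(table, cursor, stream[offset:offset + 3])

print(table[0x09].hex(), table[0x0B].hex(), table[0x11].hex(), hex(cursor))
# c3c3 55b4 f3a6 0x12
```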
Type M - MULTIPLE CODE BLOCK - length 3+ bytes
[XX AA AA BB BB CC DD DD]
This instruction writes XX + 1 letters at the current cursor location.
Letters are listed sequentially. A letter with no dual-tile code is listed as its own value (a single byte) and left unchanged, but it still counts as one of the XX + 1 letters.
For example, after the three Type J blocks in S1_1_3.EXS, the next block is
"0A 8E 96 94 89 14 22 89 BB 82 17 B3 82 AE 82 DE 82 27 85 00 84", or, split into letters,
"0A | 8E 96 | 94 89 | 14 | 22 89 | BB 82 | 17 | B3 82 | AE 82 | DE 82 | 27 85 | 00 84".
Looks pretty awful! But it's just 0A followed by eleven letter codes (0A + 1).
This block writes the following codes to the dictionary:
12 8E 96
13 94 89
14
15 22 89
16 BB 82
17
18 B3 82
19 AE 82
1A DE 82
1B 27 85
1C 00 84
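Putting both block types together, a dictionary reader might look roughly like the sketch below. Some assumptions on my part: a byte equal to the current cursor value is treated as a self-listed letter (one byte) in either block type, the function name read_dictionary is made up, and the last six bytes of the test stream are invented purely so the example terminates; they are not from S1_1_3.EXS.

```python
def read_dictionary(data, pos=0):
    """Read dictionary blocks; return (table, size, position after the halfword)."""
    table = {i: bytes([i]) for i in range(0x100)}   # blank dictionary
    cursor = 0
    while True:
        control = data[pos]
        pos += 1
        if control >= 0x80:                 # Type J: jump, then one letter
            cursor += control - 0x7F
            if cursor >= 0x100:             # terminating jump ends the dictionary
                break
            count = 1
        else:                               # Type M: XX + 1 letters
            count = control + 1
        for _ in range(count):
            if data[pos] == cursor:         # ASSUMPTION: self-listed letter, 1 byte
                pos += 1
            else:                           # dual-tile code, 2 bytes
                table[cursor] = bytes(data[pos:pos + 2])
                pos += 2
            cursor += 1
    # Big-endian halfword that follows the terminating jump
    size = int.from_bytes(data[pos:pos + 2], "big")
    return table, size, pos + 2

example = bytes.fromhex(
    "88 C3 C3 80 55 B4 84 F3 A6 "
    "0A 8E 96 94 89 14 22 89 BB 82 17 B3 82 AE 82 DE 82 27 85 00 84"
)
made_up_tail = bytes.fromhex("FF 41 42 E1 00 05")   # invented, for the demo only
stream = example + made_up_tail
table, size, next_pos = read_dictionary(stream)
print(table[0x12].hex(), table[0x14].hex(), table[0x1C].hex(), size)
# 8e96 14 0084 5
```

The invented tail jumps from 0x1D to 0x9D, writes one throwaway code, then jumps to 0x100 and gives a halfword of 5.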
COMPRESSED FILE
This is the entirety of the .EXS.txt file with recursive DTE (dual-tile encoding) compression applied according to the preceding dictionary. Yup.
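"Recursive" here means a code can itself contain letters that have codes. A minimal Python sketch of the expansion, assuming the dictionary is held as a mapping from letter to replacement bytes (the function name expand and the toy dictionary are made up for illustration and are not taken from a real file):

```python
def expand(data, table):
    """Expand DTE-compressed bytes using table (letter -> replacement bytes)."""
    out = bytearray()
    stack = list(reversed(data))        # bytes still to process, next one on top
    while stack:
        byte = stack.pop()
        code = table.get(byte, bytes([byte]))
        if code == bytes([byte]):       # letter represents itself: emit it
            out.append(byte)
        else:                           # dual-tile code: process both halves
            stack.extend(reversed(code))
    return bytes(out)

# Toy dictionary for illustration only:
toy = {
    0x01: bytes([0x41, 0x42]),          # 01 -> "AB"
    0x02: bytes([0x01, 0x43]),          # 02 -> 01 "C" -> "ABC"
}
print(expand(bytes([0x02, 0x44, 0x01]), toy))  # b'ABCDAB'
```

An explicit stack is used instead of recursion so that deeply nested codes don't run into Python's recursion limit.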
ZEROES
The file will be padded to a multiple of 2K (2,048 bytes). For example, S1_1_3.EXS has 858 bytes of data followed by 1,190 zero bytes.
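The padding amount is simple arithmetic; here is a quick Python sketch (the names BLOCK_SIZE and pad_to_block are just mine):

```python
BLOCK_SIZE = 2048

def pad_to_block(data):
    """Append zero bytes until the length is a multiple of BLOCK_SIZE."""
    padding = -len(data) % BLOCK_SIZE   # 0 if already aligned
    return data + bytes(padding)

padded = pad_to_block(bytes(858))       # stand-in for 858 bytes of file data
print(len(padded), len(padded) - 858)   # 2048 1190
```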


