Technical Help

Suikogaiden Translation Project Technical Work Division

Re: Technical Help

Postby Pokeytax » Fri Nov 18, 2011 7:43 am

The following details the .EXS file compression (largely how the dictionary is stored). I haven't 100% confirmed it but I plan to do so tonight by translating S1_1_3.EXS. Some of it repeats stuff you already know but I wanted to write it out for my own benefit.

The below should hold for, at least, the smallest files (2 KB). Larger files tend to have multiple dictionaries.

With the application of the roman font patch, it should be possible to begin translating .EXS files now, at least the 2 KB ones, by either constructing a null dictionary or using one of the available DTE utilities to compress your file and then encoding the dictionary. If you are unsure how to edit the text and pointers in the decompressed .EXS.txt file, I'll probably detail that process as soon as I start translating.



Suikogaiden Compression

.EXS refers to a compressed file in the SCRIPT.R2 file.
.EXS.txt refers to the decompressed version of an .EXS file.

This tutorial assumes you have a perfect and properly formatted .EXS.txt file, with a 0x34 byte header, ready to be compressed.

The compressed .EXS file has the following structure:

- a twelve byte header
- the dictionary
- the compressed .EXS.txt file
- zeroes as necessary to pad the file

HEADER

Bytes 1 - 4 are the magic word TEN2 in ASCII: "54 45 4E 32" in hex.
Bytes 5 - 8 are the length of the compressed .EXS file as a little-endian word, excluding this header and excluding the zeroes used for padding at the end. It seems to vary by 1 in order to hit an odd or even number?
Bytes 9 - 12 are the length of the decompressed .EXS.txt file as a little-endian word, including header and all.

For example, S1_1_3.EXS is 0x35A bytes long. S1_1_3.EXS.txt is 0x666 bytes long.

The header of S1_1_3.EXS is
"54 45 4E 32 4F 03 00 00 66 06 00 00".

DICTIONARY

Initially the dictionary is blank: all hex values represent themselves.

The dictionary chunk of the compressed file, consisting of everything after byte 12 and before the ASCII word "TESL" ("54 45 53 4C"), is read in and sequentially fills in the blank dictionary with any dual-tile codes.

The cursor for the dictionary begins at "letter" 0x00. The dictionary ends when a Type J block moves the cursor to 0x100. This block is special because instead of writing a code, the two bytes YY YY are a halfword (big-endian) equal to the number of bytes to be decompressed.

The dictionary code is composed of two different kinds of block. The first byte of each is always a control byte, and the rest of the bytes are entries for letters. Control bytes >= 0x80 are Type J, and control bytes < 0x80 are Type M.

After writing a block, the cursor moves forward one spot and reads in the next block.

Type J - JUMP BLOCK - length 3 bytes

[XX YY YY]

This instruction immediately moves the cursor forward XX - 0x7F letters, and writes the code YY YY for that letter.

For example, the first code in S1_1_3.EXS is "88 C3 C3 80 55 B4 84 F3 A6".

Instead of writing to "00", the cursor immediately jumps forward 0x88 - 0x7F = 9 spots to "09" and writes the code "C3 C3". This means that all instances of 09 will be replaced by C3 C3 during decompression.

After this, the cursor moves to "0A".

Next, the cursor jumps forward one spot (0x80 - 0x7F = 1) to "0B" and writes the code "55 B4".

After this, the cursor moves to "0C".

Next, the cursor jumps forward five spots to "11" and writes the code "F3 A6".

After this, the cursor moves to "12".

At this point, the dictionary looks like this:

00 00
01 01
02 02
03 03
04 04
05 05
06 06
07 07
08 08
09 C3 C3
0A 0A
0B 55 B4
0C 0C
0D 0D
0E 0E
0F 0F
10 10
11 F3 A6
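
The three jumps above can be replayed in Python as a sanity check. This is a simplified sketch, not the game's routine: apply_jump_blocks is a made-up name, and it assumes every block is a plain 3-byte Type J block that writes a dual code (it ignores the special block that ends the dictionary).

```python
def apply_jump_blocks(blocks: bytes, cursor: int = 0):
    """Apply consecutive 3-byte Type J blocks; return (dictionary, cursor)."""
    dictionary = {}
    pos = 0
    while pos < len(blocks):
        control = blocks[pos]
        if control < 0x80:
            raise ValueError("not a Type J block")
        cursor += control - 0x7F                              # jump forward
        dictionary[cursor] = bytes(blocks[pos + 1:pos + 3])   # code YY YY
        cursor += 1                                           # move on one spot
        pos += 3
    return dictionary, cursor

# First nine dictionary bytes of S1_1_3.EXS as quoted in the post.
jump_table, jump_cursor = apply_jump_blocks(bytes.fromhex("88 C3 C3 80 55 B4 84 F3 A6"))
for letter, code in sorted(jump_table.items()):
    print(f"{letter:02X} {code.hex(' ').upper()}")
print(f"cursor now at {jump_cursor:02X}")
```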


Type M - MULTIPLE CODE BLOCK - length 3+ bytes

[XX AA AA BB BB CC DD DD]

This instruction writes XX + 1 letters at the current cursor location.

Letters are listed sequentially. Letters that aren't dual-entry are listed as themselves and skipped, but still count as one of the XX + 1 letters.

For example, after the three Type J blocks in S1_1_3.EXS, the next block is
"0A 8E 96 94 89 14 22 89 BB 82 17 B3 82 AE 82 DE 82 27 85 00 84", or, split into letters,
"0A | 8E 96 | 94 89 | 14 | 22 89 | BB 82 | 17 | B3 82 | AE 82 | DE 82 | 27 85 | 00 84".

Looks pretty awful! But it's just 0A followed by eleven letter codes (0A + 1).

This block writes the following codes to the dictionary:

12 8E 96
13 94 89
14 14
15 22 89
16 BB 82
17 17
18 B3 82
19 AE 82
1A DE 82
1B 27 85
1C 00 84
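
Putting both block types together, here is a rough Python sketch that fills the dictionary from the bytes quoted so far. The name fill_dictionary is hypothetical. One assumption goes beyond the description: a Type J block whose target letter is listed as itself is treated as a one-byte literal, the same way Type M handles letters that aren't dual-entry.

```python
def fill_dictionary(data: bytes, cursor: int = 0):
    """Process Type J / Type M blocks until the bytes run out or the
    cursor reaches 0x100. Returns (pairs, cursor, bytes_consumed)."""
    pairs = {}
    pos = 0
    while pos < len(data) and cursor < 0x100:
        control = data[pos]
        pos += 1
        if control >= 0x80:              # Type J: jump, then one letter
            cursor += control - 0x7F
            count = 1
            if cursor >= 0x100:          # end-of-dictionary jump
                break
        else:                            # Type M: control + 1 letters
            count = control + 1
        for _ in range(count):
            first = data[pos]
            pos += 1
            if first != cursor:          # dual entry takes two bytes
                pairs[cursor] = (first, data[pos])
                pos += 1
            cursor += 1                  # literals are skipped but counted
    return pairs, cursor, pos

# The three Type J blocks plus the Type M block quoted in the post.
blocks = bytes.fromhex(
    "88 C3 C3 80 55 B4 84 F3 A6 "
    "0A 8E 96 94 89 14 22 89 BB 82 17 B3 82 AE 82 DE 82 27 85 00 84"
)
pairs, cursor, used = fill_dictionary(blocks)
for letter in sorted(pairs):
    left, right = pairs[letter]
    print(f"{letter:02X} {left:02X} {right:02X}")
```

Letters 14 and 17 do not show up in the output because they stay as themselves.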

COMPRESSED FILE

This is the entirety of the .EXS.txt file with recursive DTE (dual-tile encoding) compression applied according to the preceding dictionary. Yup.
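
A minimal Python sketch of the expansion step, assuming "recursive" means that the two bytes of a code may themselves be codes that need expanding. The expand function and the tiny demo dictionary are invented for illustration and are not taken from any game file.

```python
def expand(data: bytes, pairs: dict) -> bytes:
    """Replace every dictionary letter by its two-byte code, repeatedly,
    until only literal bytes remain."""
    out = bytearray()
    for byte in data:
        stack = [byte]
        while stack:
            current = stack.pop()
            if current in pairs:
                left, right = pairs[current]
                stack.append(right)      # pushed first so it is emitted second
                stack.append(left)
            else:
                out.append(current)      # literal byte: copy as is
    return bytes(out)

# Hypothetical dictionary: 01 -> "AB", 02 -> 01 + "C" (a nested code).
demo_pairs = {0x01: (0x41, 0x42), 0x02: (0x01, 0x43)}
expanded = expand(b"\x02\x44", demo_pairs)
print(expanded)
```

A dictionary that refers back to itself would loop forever here, so a real tool would probably want a depth limit.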

ZEROES

The file will be padded to a multiple of 2K (2,048 bytes). For example, S1_1_3.EXS has 858 bytes of data followed by 1,190 zero bytes.
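
The padding arithmetic can be checked with a short Python sketch. The helper name pad_to_sector is made up, and the 2,048-byte unit is taken from the sentence above.

```python
SECTOR = 2048  # padding unit described in the post

def pad_to_sector(data: bytes, sector: int = SECTOR) -> bytes:
    """Append zero bytes until the length is a multiple of the sector size."""
    padding = -len(data) % sector
    return data + b"\x00" * padding

# S1_1_3.EXS has 858 bytes of data; stand-in bytes are used here.
padded = pad_to_sector(b"\x01" * 858)
zero_count = len(padded) - 858
print(len(padded), zero_count)
```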
Last edited by Pokeytax on Thu Dec 08, 2011 7:50 pm, edited 2 times in total.
Pokeytax
 
Posts: 69
Joined: Wed Oct 26, 2011 1:30 pm

 

Re: Technical Help

Postby Rufas » Fri Nov 18, 2011 9:25 am

You might be more interested in the tracer version of the decompression tool. David translated it verbatim from the PSX-EXE, and I added printf calls to echo out the variables.

The script (in C)
http://suikogaiden.wikispaces.com/sgdecode-tracer.c

Quote back the relevant part
Sample output from files:
S1_1_1.EXS : https://sites.google.com/site/rufaswan/S1.1.1-tracer.zip
ADV1_1.EXS : https://sites.google.com/site/rufaswan/adv1.1-tracer.zip

Some random notes:
1) Smaller files mean fewer loops. S1_1_1.EXS loops only 4 times, but ADV1_1.EXS loops 24 times!
2) Each loop produces 5000 bytes of output. (NOTE: test with the first loop only, with any file.)
3) Each loop begins by recreating the dictionary (a.k.a. Part 1), with a section of code executed 64 times.


Might be useful to you.

- Rufas
Rufas
 
Posts: 185
Joined: Wed Nov 12, 2008 8:44 am

Re: Technical Help

Postby Pokeytax » Fri Nov 18, 2011 10:46 pm

I finally successfully compressed a file, using the "blank dictionary + uncompressed file" method. I still prefer the idea of using compression, but this is easier for testing purposes... if there aren't any issues rebuilding R2 archives, it might be a more realistic option than I thought.

Here is the "dummy header" that creates a blank dictionary and simply reads in the file after the header:

54 45 4E 32  XX XX XX XX  YY YY YY YY  FF 80 FE ZZ ZZ


XX XX XX XX is the length of your uncompressed file plus 5 as a little-endian word.
YY YY YY YY is the length of your uncompressed file as a little-endian word.
ZZ ZZ is the length of your uncompressed file as a big-endian halfword.

If your uncompressed file size is 0x666 (S1_1_3), then your dummy header would be:

54 45 4E 32  6B 06 00 00  66 06 00 00  FF 80 FE 06 66
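
The dummy header is easy to generate. Below is a hedged Python sketch; dummy_compress is a made-up name. My reading of why "FF 80 FE" gives a blank dictionary: FF jumps 0x80 letters to "80", which is listed as itself, and FE then jumps 0x7F letters from "81" to 0x100, which ends the dictionary. The halfword limits this method to files of at most 0xFFFF bytes, and larger files may need multiple segments anyway.

```python
import struct

def dummy_compress(raw: bytes) -> bytes:
    """Wrap an uncompressed .EXS.txt image in a TEN2 header plus a blank
    dictionary (the "dummy header" method from the post)."""
    if len(raw) > 0xFFFF:
        raise ValueError("length does not fit in a big-endian halfword")
    header = b"TEN2" + struct.pack("<II", len(raw) + 5, len(raw))
    blank_dictionary = bytes([0xFF, 0x80, 0xFE]) + struct.pack(">H", len(raw))
    return header + blank_dictionary + raw

# Stand-in data with the same size as S1_1_3.EXS.txt (0x666 bytes).
dummy = dummy_compress(bytes(0x666))
print(dummy[:17].hex(" ").upper())
```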

Re: Technical Help

Postby Pokeytax » Sat Nov 19, 2011 5:42 pm

I hope you like a constant hum of white noise posting until this game is translated because that's what you're getting!

Image

There's some kind of hiccup with the text routine - because it's built off of a double-byte routine, it crashes out when a printed string has an odd length. You can always just add a space, I guess, but ugh.
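
The "just add a space" workaround could be automated along these lines. This is a sketch only: pad_to_even is a hypothetical helper, and it assumes one byte per character once the roman font patch is applied.

```python
def pad_to_even(line: str) -> str:
    """Append one space when the string has an odd number of characters."""
    return line if len(line) % 2 == 0 else line + " "

samples = ["Nash", "Hello"]          # made-up strings for illustration
fixed = [pad_to_even(s) for s in samples]
print([len(s) for s in fixed])
```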

Those quote marks are not terrific - they could stand to be spaced out a bit, and periods are the same way. Right now I'm thrilled to be clear of that part, but maybe some basic kerning should be on the list.

Now that I've figured that even-length thing out, I'm going to take a second go at translating this full scene. Once that works, I'll try to figure out the multiple dictionaries/loops/whatever is going on with longer .EXS files, so I can start testing on S1_1_1.EXS instead of S1_1_3.EXS, which I've only been using because it's small enough to handle. With the number of times I have fast-forwarded and mashed through S1_1_1 and S1_1_2 I could have carried Sierra to the Nameless Lands and back.

Rufas, that is a cool script! It looks like the first part is just a fancy mathematical way of writing a dictionary to the PSX scratchpad, the second part reads in the dictionary, and the third part reads in XXXX bytes based on the last code in the dictionary... and then loops through again based on a parameter I haven't nailed down yet. I am glad you guys can work with C/automation because really I find assembly easier to follow.

EDIT: Multiple Dictionaries

It looks like the file format is as follows:
HEADER
DICTIONARY 1
SEGMENT 1
DICTIONARY 2
SEGMENT 2
...
ZEROES

Once the number of bytes specified by dictionary 1 is decompressed, if there are still more bytes in the file according to the header, the routine immediately begins reading in a new dictionary from the next byte and loops until all segments are read.
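
Here is a rough Python sketch of the whole loop, based only on the format as described in this thread and not on the actual game code. All function names are hypothetical. Two assumptions are baked in: the halfword that ends each dictionary counts the compressed bytes in that segment, and the loop stops once the decompressed length from the header has been produced. The sample file at the bottom is hand-built for illustration.

```python
import struct

def read_dictionary(data: bytes, pos: int):
    """Read one dictionary; return (pairs, segment_len, new_pos)."""
    pairs, cursor = {}, 0
    while cursor < 0x100:
        control = data[pos]
        pos += 1
        if control >= 0x80:              # Type J: jump, then one letter
            cursor += control - 0x7F
            count = 1
        else:                            # Type M: control + 1 letters
            count = control + 1
        if cursor >= 0x100:              # jump past the last letter: done
            break
        for _ in range(count):
            first = data[pos]
            pos += 1
            if first != cursor:          # dual entry takes two bytes
                pairs[cursor] = (first, data[pos])
                pos += 1
            cursor += 1
    (segment_len,) = struct.unpack_from(">H", data, pos)  # assumed: input bytes
    return pairs, segment_len, pos + 2

def expand(data: bytes, pairs: dict) -> bytes:
    """Expand dictionary letters recursively using an explicit stack."""
    out = bytearray()
    for byte in data:
        stack = [byte]
        while stack:
            current = stack.pop()
            if current in pairs:
                left, right = pairs[current]
                stack.extend((right, left))
            else:
                out.append(current)
    return bytes(out)

def decode_exs(data: bytes) -> bytes:
    """Decode HEADER + (DICTIONARY + SEGMENT)* into the .EXS.txt bytes."""
    magic, _compressed_len, decompressed_len = struct.unpack_from("<4sII", data, 0)
    if magic != b"TEN2":
        raise ValueError("missing TEN2 magic word")
    pos, out = 12, bytearray()
    while len(out) < decompressed_len:   # assumed stopping rule
        pairs, segment_len, pos = read_dictionary(data, pos)
        out += expand(data[pos:pos + segment_len], pairs)
        pos += segment_len
    return bytes(out)

# Hand-built two-segment file: a blank dictionary with "ABC", then a
# dictionary mapping letter 00 to "DE", with segment bytes 00 46.
segment1 = bytes.fromhex("FF 80 FE 00 03") + b"ABC"
segment2 = bytes.fromhex("00 44 45 FF 81 FD 00 02") + bytes.fromhex("00 46")
body = segment1 + segment2
sample = b"TEN2" + struct.pack("<II", len(body), 6) + body
decoded = decode_exs(sample)
print(decoded)
```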

Re: Technical Help

Postby RinUzuki » Sun Nov 20, 2011 2:58 pm

WOW! A screenshot?!! You guys are really making progress!!! This is brilliant...!! Thank you so much for your hard work, everyone! You're an inspiration to all us translators and proofreaders and things!

(Sorry for the interruption, heheh, but I had to say something!)
RinUzuki
Site Admin
 
Posts: 829
Joined: Mon Jun 16, 2008 10:05 pm

Re: Technical Help

Postby Pokeytax » Tue Nov 22, 2011 3:55 pm

Don't be sorry for the interruption - it's a little scary doing so much talking to myself.

The lack of posting is because I successfully translated and compressed S1_1_3.EXS (using the dummy header method). That means, theoretically, everything needing doing is done. Niceties like customizing number of lines per message box would be great, but I don't want to delay a translation for stuff that can be added later. So I'm moving on and automating the process of script insertion/pointer modification/compression, which is going fine but won't generate much to share for a while.

If anyone has questions in the meantime please ask away.

Miscellaneous notes:
- Some further work will have to be done to make saving the game work with the text hack. Amazingly this isn't gamebreaking given the existence of save states, but yes, that's really something that needs to work.
- The maximum line length is 48 characters.
- Dummy compressing the files is easier, but requires working with the archive and possibly expanding the ISO/fiddling with the table of contents. True compression of the files requires writing a compression routine, but assuming that yields a solid ratio, it might mean we can just overwrite the existing locations in SCRIPT.R2 and ignore the virtual and real filesystem. I am still leaning toward the latter but we need this script/pointer tool regardless.
- Rin, your translated English text seems to run about 100% - 120% length of the Japanese text (accounting for 2:1 storage). I would guess it will compress well enough that it won't be an issue (most files have some free space) and we can always figure out how to rebuild things. Having read through the whole thing it looks quite good!
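
The 48-character limit from the notes above is easy to check mechanically. A small sketch with an invented helper name and made-up sample text, assuming the limit counts plain characters per line:

```python
MAX_LINE = 48  # maximum line length given in the notes

def overlong_lines(text: str, limit: int = MAX_LINE):
    """Return (line_number, length) for every line longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]

script = "A short line.\n" + "x" * 50 + "\n" + "y" * 48
too_long = overlong_lines(script)
print(too_long)
```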

Re: Technical Help

Postby Rufas » Wed Nov 23, 2011 6:17 am

As I recall, some scripts are not meant to be translated and must remain in SJIS Latin characters, specifically this screen:

Image

This is loaded from BIOS.

Unlike Suikoden 2, the Suikogaiden series doesn't hardcode LBA addresses (or a table of contents) to find its files. It uses normal filenames. As long as all the required files are present with the correct filenames, there will be no problem.

To demonstrate this, I PM'd you two links to download rebuilt versions of the game. Try comparing the LBA addresses in both; they are different.

- Rufas

Re: Technical Help

Postby Pokeytax » Wed Nov 23, 2011 5:24 pm

Okay, that sounds great then. I'm not used to tinkering with that stuff but if you know what to do already that makes things a lot easier! I probably won't have much time to work for the next couple weeks, but after that I'll finish automating a set of .EXS files based on the wiki translation and work from there.

Re: Technical Help

Postby RinUzuki » Thu Nov 24, 2011 1:53 pm

Pokeytax wrote:That means, theoretically, everything needing doing is done.

Wow! Seriously?! That's fantastic! You're a hero, Pokeytax! Everybody here is, working so hard...! Man, these advancements in the technical side of things really breathe a breath of fresh air into this whole project! Whoo! I'm pumped to be working on this, and I've finally got some time to! The perfect combination!!

Pokeytax wrote:- Rin, your translated English text seems to run about 100% - 120% length of the Japanese text (accounting for 2:1 storage). I would guess it will compress well enough that it won't be an issue (most files have some free space) and we can always figure out how to rebuild things. Having read through the whole thing it looks quite good!

I was a little concerned when I read that the maximum line length was 48 characters, but this point sounds like good news! Forgive me; I'm no math whiz. Wouldn't 100% - 120% be a bad thing? In that case, we'd generally want the English script to be shorter and more concise, right? But this 2:1 storage bit means... we can get two English characters in for every one Japanese?

And the free space sounds like a lovely bit of leeway!

Re: Technical Help

Postby Pokeytax » Fri Dec 02, 2011 7:47 pm

Status update: working on a spreadsheet that will handle most of the heavy lifting.

Easily edited pointers and script: done
Producing a decompressed file: done
Compression routine: half done (yes I could probably just wrangle the SCRIPT.R2 maker instead of writing a complicated function... but, I wanna)
Cleaning up the script (primarily slicing 100-character lines into three 45/20/35 lines): not done

I'm guessing probably another week to complete the above and then a week of debugging and fixing before I get to anything worth playtesting.

RinUzuki wrote:Wouldn't 100% - 120% be a bad thing? In that case, we'd generally want the English script to be shorter and more concise, right? But this 2:1 storage bit means... we can get two English characters in for every one Japanese?


Well, it's more along the lines of 2.2 English characters for every Japanese character. Even counting that we can go 2:1, that's still slightly larger, but it shouldn't be a problem with adequate compression.

The script cuts will come from dialogue box constraints: a two line dialogue box will hold 96 characters. That's enough to render the meaning, but often the script expands on the Japanese to convey attitude and adequately render Nash's chatty idiom. Some of those bits might need to be trimmed, but I don't see it impacting the script much.

For example, the first line of S1_1_3 is

ナッシュ
「ずいぶんと良い造りの部屋じゃないか。
家具とかも値打ちもんだぜ。」

which is translated as

Nash
"Now this is what I call a room!
This furniture must be worth a pretty penny!"

You may have noticed that the screenshot above renders this as

Nash
"Now this is what I call a room!
This kind of furniture doesn't come cheap!"

in order to fit. In general I will take more care when trimming than that; I just wanted to get something onscreen. But that sort of thing is the primary impact space constraints will have.

If we have a wealth of extra time before the translation is complete, I'll see about adding and subtracting lines.

Re: Technical Help

Postby RinUzuki » Wed Dec 07, 2011 7:46 pm

Given how slow I am, I suspect we will have a wealth of time, as you put it. But I will definitely do all I can to work as quickly and efficiently as possible!

Playing through the first chapter, I did see you'd changed some lines to fit. (Which is fine, of course.) I guess what I'd like to do is go through and find any lines that won't fit, and then do the editing myself--though I'm absolutely fine with some of the changes you made, given that there's several changes that will need to be made throughout the script, it'd probably be best to do as many at a time as possible.

So... what would be the best possible method of seeing when lines are too long? (Is there one?)

Re: Technical Help

Postby Pokeytax » Wed Dec 07, 2011 8:04 pm

RinUzuki wrote:I guess what I'd like to do is go through and find any lines that won't fit, and then do the editing myself...
So... what would be the best possible method of seeing when lines are too long? (Is there one?)


Well, if you have access to Microsoft Excel, open up the Suikogaiden Hacking Utility.xls file. You should just be able to type text into column Y with the yellow background. The cells will be red when you have too many characters. Scroll over to 4_10 or so if you want to see what the script pulled straight from the wiki looks like. This is the most feasible method I found.

If you don't have Excel, LibreOffice or another free spreadsheet is probably just fine for editing the script. I just need Excel to run the compression, etc. on my end.

If even that is too scary, we'll work something out. Don't worry too much now, I will do the prep work on a document for you to make the final edits on when you're ready.

Re: Technical Help

Postby RinUzuki » Wed Dec 07, 2011 8:13 pm

I thought I had Excel, but I see now it's just opening up in Word, so it's no wonder that it wasn't making any sense to me, huh... I guess it's too much for me to tackle right now, but as you say, we'll figure something out in the end!

Thank you again for everything!

Re: Technical Help

Postby Raww Le Klueze » Wed Dec 07, 2011 11:41 pm

Get OpenOffice then, specifically the Calc program. It's compatible with Excel files and free.

http://www.openoffice.org/

I've looked it over and the space constraints are a bit more rigid than I suspected they'd be, which is why I warned against needlessly flourished translations.
Raww Le Klueze
 
Posts: 108
Joined: Sun Jul 05, 2009 12:39 am

Re: Technical Help

Postby Pokeytax » Sun Dec 18, 2011 8:14 pm

I have a build running a custom 8x15 font (right now the custom font is... the exact same BIOS font pasted into NISUI.BIN).

Any ideas on what we should use for a custom variable-width font? Visuals are not my comfort zone. We need something black-on-white (no fancy drop shadows).

Ideally, the font would have a max width of 8 pixels. That's really small, I know. Anything wider will have trouble fitting into the expanded NISUI.BIN (2 KB). Although, maybe we can rebuild the ISO as Rufas said...

If nobody has anything I can just finagle the BIOS font a little so it's variable-width friendly, it would at least look better than the status quo (those hideous left quotes that run into the next letter especially) and allow lines over 48 characters.
