Where does the MSDOS engine (and other releases) store the text strings? #15

Open
felipesanches opened this issue Mar 25, 2022 · 7 comments
felipesanches commented Mar 25, 2022

We need to figure this out and then write a script that generates a pair of str_data.rom and str_index.rom files, which AWVM_trace.py uses to annotate the generated disassembly listings with comments containing the string contents.
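
As a rough illustration of what such a generator script could look like, here is a minimal sketch. The on-disk layout is an assumption made only for this example (little-endian index records of string ID and offset, and NUL-terminated ASCII text in the data file); it is not necessarily the layout AWVM_trace.py or the FPGA project expects. The function names and the string IDs in the usage note are hypothetical too.

```python
import struct

def build_string_roms(strings):
    """Build (index_bytes, data_bytes) from a dict of {string_id: text}.

    ASSUMED layout (hypothetical, chosen for illustration only):
      index: one <u16 id, u16 offset> little-endian record per string
      data:  ASCII text, each string terminated by a 0x00 byte
    """
    index = bytearray()
    data = bytearray()
    for string_id in sorted(strings):
        # The offset is the position where this string starts in the data blob.
        index += struct.pack("<HH", string_id, len(data))
        data += strings[string_id].encode("ascii") + b"\x00"
    return bytes(index), bytes(data)

def lookup(index, data, string_id):
    """Return the text for string_id, or None if the ID is not in the index."""
    for pos in range(0, len(index), 4):
        entry_id, offset = struct.unpack_from("<HH", index, pos)
        if entry_id == string_id:
            end = data.index(b"\x00", offset)
            return data[offset:end].decode("ascii")
    return None
```

For example, `build_string_roms({1: "HELLO", 2: "SURE ?"})` returns two byte strings that could be written to str_index.rom and str_data.rom, and `lookup(index, data, 2)` reads the second string back.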

These files will also be used on the FPGA project at: https://github.com/felipesanches/AnotherWorld_FPGA

@fabiensanglard's VM (https://github.com/fabiensanglard/Another-World-Bytecode-Interpreter) keeps the strings hardcoded. For this AW-VMTools project, we instead want to extract absolutely everything from the original game files.

@felipesanches

I have a hunch that the text strings may be encoded in one of these places:

  • In the ANOTHER.EXE main executable (hardcoded in the original VM implementation)
  • In the "unknown" resource number 0x11 (which has type=6)

felipesanches commented Mar 25, 2022

However, resource 0x11 appears to contain polygon data (see fabiensanglard/Another-World-Bytecode-Interpreter#23).

My AnotherWorldVM driver for MAME also loads resource 0x11 as "video2":
https://github.com/felipesanches/mame/blob/d7ea76b1b731b69fa9cb4d7c34e31fdd3e0f8333/src/mame/drivers/another_world_vm.cpp#L232-L233
Screenshot from 2022-03-25 07-49-32

So perhaps the text strings are indeed in the ANOTHER.EXE file.

felipesanches commented Mar 25, 2022

For the record, the hardcoded string data is available at:
https://github.com/fabiensanglard/Another-World-Bytecode-Interpreter/blob/dea6914a82f493cb329188bcffa46b9d0b234ea6/src/staticres.cpp#L123-L265
Screenshot from 2022-03-25 08-02-06

There is also hardcoded font data, and we likewise don't know where or how it was stored in the original game files. In @fabiensanglard's VM it is declared at:
https://github.com/fabiensanglard/Another-World-Bytecode-Interpreter/blob/dea6914a82f493cb329188bcffa46b9d0b234ea6/src/staticres.cpp#L71-L120
Screenshot from 2022-03-25 08-03-09

felipesanches added a commit that referenced this issue Mar 25, 2022
while we haven't yet figured out how to extract the data directly from the original game files.
(issue #15)
@toymak3r toymak3r self-assigned this Apr 2, 2022
@felipesanches

@toymak3r, this task is easier on some releases. In the "SEGA Genesis - Europe" release, the string data is stored uncompressed in the ROM (it is easy to spot with the Unix strings command), so it makes a good initial target before trying to figure out the data decompression used in other releases, such as the SNES cartridge ROM.
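
For scripting this instead of running the strings command by hand, a minimal Python sketch of the same idea follows: scan a binary blob for runs of printable ASCII and report each run with its offset. The function name, the minimum length of 4, and the ROM filename in the comment are assumptions for illustration; the offsets would still need to be matched against the VM's string IDs by other means.

```python
import re

def find_ascii_strings(blob, min_length=4):
    """Return (offset, text) for each run of printable ASCII in blob.

    Mimics the basic behaviour of the Unix strings command:
    a run counts only if it has at least min_length printable bytes.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [(m.start(), m.group().decode("ascii"))
            for m in pattern.finditer(blob)]

# Hypothetical usage (the filename is a placeholder, not a real dump name):
# with open("genesis_europe.bin", "rb") as rom:
#     for offset, text in find_ascii_strings(rom.read()):
#         print(f"{offset:06X}: {text}")
```

This only works for releases that ship the text uncompressed; on a compressed or obfuscated release it would find fragments at best.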

@felipesanches

And also the MSDOS release seems to involve some sort of compression (or obfuscation), which makes it more challenging than the releases that shipped with raw text string data.

An interesting caveat is that some releases seem to provide multiple sets of text strings (to support multiple languages instead of only English).

felipesanches changed the title from "Where does the MSDOS engine store the text strings?" to "Where does the MSDOS engine (and other releases) store the text strings?" on Apr 5, 2022
@felipesanches

Fun fact discovered by c9d4618:

It seems that the SEGA Genesis Europe cartridge has a typo in the string "SURE ?" (the letter E is missing), while the same text in the MSDOS release does not have that typo.

Screenshot from 2022-04-06 01-29-33

felipesanches commented Apr 17, 2022

> And also the MSDOS release seems to involve some sort of compression

Yes! The strings in the MSDOS release are in the ANOTHER.EXE file, which is compressed using something similar to LZSS, but I haven't fully decoded it yet. This is a work in progress. I'm not yet sure how the 16-bit control words work (EC8F, 1F07, 807F, etc.):

photo_2022-04-11_22-34-10
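
For reference while decoding, here is a minimal sketch of a generic, textbook-style LZSS decoder. It is not the scheme used by ANOTHER.EXE, which is still undecoded: this variant assumes 8-bit flag bytes (not the 16-bit control words seen above), a set bit meaning "literal", and 2-byte back-references made of a 12-bit distance and a 4-bit length. All of those parameters are assumptions picked for illustration.

```python
def lzss_decode(src):
    """Decode a generic LZSS stream (illustrative variant, NOT ANOTHER.EXE's).

    ASSUMED format:
      - a flag byte precedes each group of up to 8 items, LSB first
      - flag bit 1: the item is one literal byte
      - flag bit 0: the item is a 16-bit little-endian token, where
        distance = (token >> 4) + 1 and length = (token & 0x0F) + 3
    """
    out = bytearray()
    pos = 0
    while pos < len(src):
        flags = src[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(src):
                break  # the stream ended inside the last group
            if flags & (1 << bit):
                out.append(src[pos])
                pos += 1
            else:
                if pos + 1 >= len(src):
                    raise ValueError("truncated back-reference")
                token = src[pos] | (src[pos + 1] << 8)
                pos += 2
                distance = (token >> 4) + 1
                length = (token & 0x0F) + 3
                if distance > len(out):
                    raise ValueError("back-reference before start of output")
                # Copy byte by byte so overlapping copies repeat the pattern.
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out)
```

A real decoder for this file would differ at least in the control word width and probably in the token layout, so this is mainly useful as a baseline to compare hex dumps against.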
