The files
This part goes over the files used by thcrap and explains the concepts behind them as we go. It may look like a short part, but every thcrap feature is driven by some file, so covering the files already gives a broad overview of the core thcrap features. As I write these lines, I don't even know if the "code" part will be needed. Let's start by looking at the thcrap directory.
thcrap_configure.exe and thcrap_loader.exe
Contrary to what you might think, these 2 files are not thcrap_configure and thcrap_loader. They are just tiny launchers. They have 2 jobs:
- Install the C runtime if needed.
- Run the relevant exe from the bin directory.
We need these 2 launchers in order to stuff all the DLLs in a subdirectory, where the user doesn't see them. More details on thcrap_configure and thcrap_loader in the "bin" section.
bin
Programs
thcrap_configure
Let's start with an easy one. thcrap_configure is the thcrap configuration tool. You use it the first time you run thcrap, in order to choose which patches you want and where your games are. You can use it again if you want a new patch stack, or if you want to add a new game. You can also use it in a discussion to complain about our lack of UI design skills. Because patch stacks are stored in files, you can have more than one. Making a new one won't erase the old one, unless you create one with the same name. It has the following steps:
1. Splash screen
2. Download the repository list
3. Let the user create a patch stack
4. Download common files
5. Let the user select their games
6. Ask about shortcut creation
7. Download game-specific files
8. End screen
I'll describe step 2 when describing the "neighbors" field of repo.js.
There are 2 distinct download steps because:
- We need step 3 for steps 4 and 5.
- We need step 4 for step 5.
- We need step 5 for step 7.
In step 5, we look for games on the user's computer. For that, we need the hashes of the game executables. All these hashes are stored in nmlgc/base_tsa/versions.js and nmlgc/base_tasofro/versions.js. Other patches (even 3rd-party ones) could also provide a versions.js with new games if they wanted to. So, for step 5, we need to download at least the common files (as in, files that aren't game-specific) for base_tsa and base_tasofro.
And because we treat these 2 patches just like any patch, we need the user to choose them before we do that.
As for step 7, if a user only has th06, they don't need to download 3 terabytes of Tasofro fighter translations; they only need to download the th06 translations. So we want to do this step after we know which games the user has.
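The game detection in step 5 can be sketched roughly as follows. This is a minimal Python sketch, not the thcrap implementation: it assumes SHA-256 hashes and a flat hash-to-game-name mapping, while the real layout of versions.js may differ. The names sha256_of, identify_games and known_hashes are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large executables are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_games(candidates, known_hashes):
    """Map game names to exe paths for every candidate whose hash is known.

    known_hashes is an assumed {hex_digest: game_name} dictionary, standing in
    for whatever the downloaded versions.js files actually contain.
    """
    found = {}
    for exe in candidates:
        game = known_hashes.get(sha256_of(exe))
        if game is not None:
            found[game] = exe
    return found
```

This also shows why the common files must be downloaded before the game search: without the hash table, no candidate can be identified.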
From a user interaction PoV, I think we would benefit from moving the game selection before the patch stack creation. This is the first (and only) question that traditional patching tools ask, so this is the only one users expect. Also, we patch games. How could we patch them if we don't even know where they are? For us, it makes sense because we only download the patch files into our directory and apply them at runtime, but it doesn't match what our users expect. I even had a friend who knows a bit about the thcrap internals and still got confused about the game selection step not being at the start. By putting the game selection first, we could merge the 2 download steps. We could also hide patches that aren't compatible with the user's games from the patch selection list. On the implementation side, I guess the most flexible way to do this would be to move the game hashes from versions.js to repo.js. Anyway, all this describes a potential design improvement, not the current implementation.
thcrap_loader
The 2nd executable. You never run this one directly, yet you run it indirectly all the time: the shortcuts generated by thcrap point to thcrap_loader with a few arguments.
The thcrap_loader process has 3 parts:
Command line parsing
Usage: thcrap_loader.exe [config.js ...] game
config.js is the run configuration generated by thcrap_configure. You can specify several run configurations, but only the last one will be used.
For backward compatibility reasons, if the run configuration is a relative path, it is relative to the config directory.
game can be either a game name or a game path. thcrap_loader will run this game and patch it. If it is a game name, it will use games.js to resolve it to a game path.
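The resolution rules above can be sketched like this. It is only an illustration in Python under stated assumptions: arguments ending in ".js" are treated as run configurations, anything else as the game, and games is a plain dictionary standing in for the contents of games.js. The function name parse_loader_args is hypothetical.

```python
import os


def parse_loader_args(args, config_dir, games):
    """Return (run_configuration_path, game_path) for a loader command line."""
    run_cfg = None
    game = None
    for arg in args:
        if arg.lower().endswith(".js"):
            # Several run configurations may be given; the last one wins.
            run_cfg = arg
        else:
            game = arg
    # Backward compatibility: a relative run configuration lives in config/.
    if run_cfg is not None and not os.path.isabs(run_cfg):
        run_cfg = os.path.join(config_dir, run_cfg)
    # A game name is looked up in games.js; a path is used unchanged.
    game_path = games.get(game, game)
    return run_cfg, game_path
```

For example, `parse_loader_args(["en.js", "th11"], "config", games)` would pick config/en.js and whatever path games maps "th11" to.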
Update (optional)
If updates are enabled, the update UI will be displayed here. The UI starts by checking for updates for thcrap itself. Then, it checks for updates for "core files", which are game-independent files. They tend to depend on each other, and new versions of thcrap itself tend to depend on them being up to date, so we update them first. Then we update the files for the game the user is running, and then we run the game.
Run and patch the game
The goal of thcrap_loader is to run and patch a game, so this step is the most important. The patching is done in-process by thcrap.dll, so we need to inject thcrap.dll into the game.
We start by running the game in suspended mode. Windows loads the program and suspends it before it executes any instruction. We find the program entry point and replace 2 bytes there with the instruction for an infinite loop. Then we let the program run for a bit, until it reaches its entry point. This lets it execute the Windows in-process loading code, which initializes a lot of things we will need later. Then we allocate some executable memory in the game process, write some code in there that basically says "load thcrap.dll and run its init function", and create a thread in the game process with this executable memory as its entry point. With this, the game loads thcrap.dll and runs its init function. Then we resume the game process, and we leave. Our work is done.
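To make the "2 bytes for an infinite loop" trick concrete, here is a small Python sketch that works on an in-memory copy of some code bytes instead of a real process. It assumes the loop is the x86 short jump EB FE (a jump to itself), which is the usual 2-byte encoding; the text does not say which bytes thcrap writes. It also assumes the original bytes are saved and put back before the game really starts, which the description implies but does not spell out. Both function names are hypothetical.

```python
# x86 "jmp short -2": the instruction jumps back to its own first byte.
INFINITE_LOOP = bytes([0xEB, 0xFE])


def patch_entry_point(image: bytearray, entry_offset: int) -> bytes:
    """Overwrite 2 bytes at the entry point and return the bytes they replaced."""
    original = bytes(image[entry_offset:entry_offset + 2])
    image[entry_offset:entry_offset + 2] = INFINITE_LOOP
    return original


def restore_entry_point(image: bytearray, entry_offset: int, original: bytes) -> None:
    """Put the saved bytes back (assumed step) so the real entry code can run."""
    image[entry_offset:entry_offset + 2] = original
```

In the real loader these writes would target the memory of the suspended game process rather than a local buffer.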
Plugins
thcrap.dll
The core of thcrap. All the in-game patching is done from here. All the common functions used by plugins are here. Basically, it contains everything that doesn't need to live somewhere else.
thcrap_tsa and thcrap_tasofro
The game support plugins. thcrap_tsa provides support for Team Shanghai Alice games (the ones made by ZUN), and thcrap_tasofro provides support for Tasofro games (the official fighters, th175, and some Tasofro fangames). When patching a game, only the relevant game support plugin is loaded.
thcrap_update
The update plugin. It updates thcrap itself and all the patches.
It can be disabled by renaming it or removing it.
In order to make it easily removable, no binary links directly to it. thcrap.dll provides wrapper functions for every function in thcrap_update.dll. These wrapper functions will try to load thcrap_update.dll, and will provide a fallback implementation if the dll doesn't exist. Of course, these fallback implementations will never access internet, and will try to do whatever makes sense in a no-internet scenario. RepoDiscover_wrapper won't try to download new repositories, and will just load the local ones. stack_update_wrapper won't do anything.
win32_utf8
A UTF-8 library. For many Windows functions with A/W variants (for example, CreateFileA and CreateFileW), it adds a new U variant (for example, CreateFileU). This variant can take UTF-8 strings, and also supports strings encoded with a fallback encoding.
In thcrap, we need to support many languages, so we need to support some Unicode variant. And because the original games don't support any form of Unicode, we need to somehow inject some Unicode into them. The native Unicode format on Windows is UTF-16, but UTF-16 tends to have a NUL byte every 2 bytes (at least for English), and the game uses a NUL byte as a string delimiter (because it works with plain C strings in SHIFT-JIS encoding). UTF-8 was designed to be easier to use in these scenarios. It also uses a NUL byte as its string delimiter, and won't allow a NUL byte anywhere else in the string. So, when we replace the game's SHIFT-JIS strings with our UTF-8 strings, the end-of-string marker doesn't change, basic string manipulation functions like strlen and strcpy continue to work, and looking for an ASCII substring also works. There are a few edge cases where it doesn't work, but overall, it works rather well. This is why thcrap uses UTF-8 everywhere internally, forces the game to use UTF-8, and converts to UTF-16 at the Windows API boundary.
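The NUL-byte argument is easy to check with a few lines of Python. The helper c_strlen is a hypothetical stand-in for C's strlen, and cp932 is used as Python's name for the Windows flavor of SHIFT-JIS; the sample string is arbitrary.

```python
def c_strlen(buf: bytes) -> int:
    """Count bytes up to the first NUL byte, like C's strlen."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


text = "Reimu 霊夢"

# Each buffer is terminated the way a C string in that encoding would be.
utf8 = text.encode("utf-8") + b"\x00"
sjis = text.encode("cp932") + b"\x00"
utf16 = text.encode("utf-16-le") + b"\x00\x00"

utf8_len = c_strlen(utf8)    # whole string: no NUL before the terminator
sjis_len = c_strlen(sjis)    # whole string as well
utf16_len = c_strlen(utf16)  # stops right after "R": its high byte is NUL

# Searching for an ASCII substring works on the raw UTF-8 bytes.
ascii_found_at = utf8.find(b"Reimu")
```

A byte-oriented function therefore sees the full UTF-8 and SHIFT-JIS strings, but truncates the UTF-16 one after its first character.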
Thcrap also replaces every call in the game to an A function with a call to a U function. For example, if the game tries to call TextOutA, thcrap will replace that with a call to TextOutU. This is needed because, after we pass a UTF-8 string to the game, it needs to be able to use it. If it calls TextOutA with our UTF-8 string, Windows will try to use it as a SHIFT-JIS string and it won't work. So we replace it with TextOutU, which will try to use the string parameter as a UTF-8 string first, then as a SHIFT-JIS string if that doesn't work. That way, it can work with both patched and non-patched strings.
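The "UTF-8 first, fallback encoding second" behavior of a U function can be sketched in Python as below. This only models the decoding decision, not the real win32_utf8 code; the function name decode_like_u_function is hypothetical, and cp932 is again used as the fallback codec.

```python
def decode_like_u_function(raw: bytes, fallback: str = "cp932") -> str:
    """Interpret raw as UTF-8 if it is valid UTF-8, else as the fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume an unpatched string in the game's encoding.
        return raw.decode(fallback)
```

Both `decode_like_u_function("東方".encode("utf-8"))` and `decode_like_u_function("東方".encode("cp932"))` give back the same text. A fallback-encoded byte sequence that happens to also be valid UTF-8 would be misread, which is one plausible source of edge cases with this approach.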
Dependencies
- jansson: a JSON library, used all over the place. We use it for all our JSON I/O (and we have a lot of it). We also use it too much when passing objects between modules, and we plan to change that.
- libpng: a PNG library, used mostly for image patching.
- zlib: a deflate library. Dependency of libpng, and used during updates to decompress the zip file.
Update file
update.json is the update description file. It provides instructions for non-trivial updates, which are executed after downloading and extracting the update. So far, we have only used it once, when we moved everything from the root directory to subdirectories. We wanted everything to be migrated automatically when running the update.
This file is an array of objects. Every object describes a step, and has a "detect" key which determines if this step needs to be executed. The other keys describe operations that will be done by the update step.
Example:
[
    {
        detect: {
            exist: "a.txt",
            dont_exist: "b.txt"
        },
        delete: [
            "c.txt"
        ],
        move: {
            "*.js": "config/"
        },
        update_repo_paths: {
            cfg_files: "config/*.js",
            old_path: "/",
            new_path: "repos/"
        },
        fix_repo_paths_post_restructuring: {
            cfg_files: "config/*.js",
            broken_prefix: "repos/"
        }
    }
]
This example describes an update step that contains all the supported fields. The step is executed only if a.txt exists and b.txt doesn't exist. Both fields can also be arrays. When the step executes, c.txt is removed, and all the .js files in the root directory are moved to the config/ directory (which is created if it doesn't exist). Note that the "move" operation is the only one which supports wildcards. In config/*.js, all the run configurations are edited to change the repo paths from "/" to "repos/". And, for all the run configurations in "config/*.js", we look for repeated "repos/" prefixes on repo paths and keep only one of them, fixing errors left behind if update_repo_paths has been executed several times during a failed update (it happened).
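The "detect", "delete" and "move" parts of such a step could be interpreted roughly like this. It is a Python sketch under stated assumptions, not the real updater: when "exist" or "dont_exist" is an array, every entry is required to match (the text does not say whether all or any is meant), and the two repo path operations are left out. The function names are hypothetical.

```python
import glob
import os
import shutil


def _as_list(value):
    """Accept a missing value, a single string or a list of strings."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def step_applies(root, detect):
    """True if every 'exist' path exists and no 'dont_exist' path does."""
    must_exist = _as_list(detect.get("exist"))
    must_not_exist = _as_list(detect.get("dont_exist"))
    return (all(os.path.exists(os.path.join(root, p)) for p in must_exist)
            and not any(os.path.exists(os.path.join(root, p)) for p in must_not_exist))


def run_step(root, step):
    """Run one update step inside root; return False if it was skipped."""
    if not step_applies(root, step.get("detect", {})):
        return False
    for name in step.get("delete", []):
        path = os.path.join(root, name)
        if os.path.exists(path):
            os.remove(path)
    for pattern, dest in step.get("move", {}).items():
        dest_dir = os.path.join(root, dest)
        os.makedirs(dest_dir, exist_ok=True)  # created if it doesn't exist
        # Only "move" expands wildcards; escape root so it is taken literally.
        for src in glob.glob(os.path.join(glob.escape(root), pattern)):
            shutil.move(src, os.path.join(dest_dir, os.path.basename(src)))
    return True
```

Running the same step twice is harmless here as long as the "detect" condition stops matching after the first run, which is the role "dont_exist" can play.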
The shortcuts
They are generated by thcrap_configure, and provide a convenient way to run patches. They don't point to the games. They all run thcrap_loader with a few arguments.
For example, a shortcut for running Touhou 11 with the English patch would look like this:
C:\Path\To\thcrap\thcrap_loader.exe en.js th11
logs
Well, these are the thcrap logs. We rotate them, which means thcrap_log.txt is always the most recent log file, followed by thcrap_log.1.txt, thcrap_log.2.txt, etc. We clean up older log files. They can be logs for thcrap_configure, thcrap_loader or an in-process thcrap. Nothing more to say about them.
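The rotation scheme can be sketched in a few lines of Python. The number of log files kept (keep) is an assumption, since the text only says that older files are cleaned up, and the function name rotate_logs is hypothetical.

```python
import os


def rotate_logs(log_dir, base="thcrap_log", ext=".txt", keep=5):
    """Shift thcrap_log.txt to .1.txt, .1.txt to .2.txt, and so on.

    After the call, thcrap_log.txt no longer exists and can be created
    fresh for the new run. At most `keep` files remain once it is written.
    """
    def name(index):
        suffix = "" if index == 0 else f".{index}"
        return os.path.join(log_dir, f"{base}{suffix}{ext}")

    # The oldest file we are allowed to keep is dropped to make room.
    oldest = name(keep - 1)
    if os.path.exists(oldest):
        os.remove(oldest)
    # Rename from the oldest to the newest so nothing gets overwritten.
    for index in range(keep - 2, -1, -1):
        if os.path.exists(name(index)):
            os.replace(name(index), name(index + 1))
```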
config
There are 3 kinds of configuration files in this folder: config.js, games.js and the run configurations.
config.js
This file just contains a few configuration settings, mostly used by the thcrap_loader UI. We may document them eventually, but they aren't that important.
games.js
games.js is an optional list of games. It only contains a list of key/value pairs in the form of:
th10: path/to/th10.exe
This file is used for the game parameter of thcrap_loader. This parameter can be either the path to an exe file or a game name. If it is a game name, games.js provides the path to the exe file.
Run configurations
All the other files in the config directory are run configurations. The run configuration is the other parameter of thcrap_loader. While the game path describes which game you want to run, the run configuration describes how you want to run it. The most important part is the patch stack, which describes which patches to apply. The concept of a patch stack has already been described in the introduction. Every patch in the run configuration has an "archive" field, which is the path to the patch root directory. It can be either relative to the thcrap root directory (most common) or absolute.
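As an illustration, resolving the "archive" fields of a run configuration could look like the Python sketch below. Only the "archive" field comes from the description above; the top-level "patches" key, the example patch paths and the function name patch_dirs are assumptions made for the sketch.

```python
import json
import os

# Assumed shape of a run configuration; only "archive" is described above.
EXAMPLE_RUN_CFG = """
{
    "patches": [
        {"archive": "repos/nmlgc/base_tsa/"},
        {"archive": "repos/thpatch/lang_en/"}
    ]
}
"""


def patch_dirs(run_cfg_text, thcrap_root):
    """Return the patch root directories of a run configuration, in stack order."""
    cfg = json.loads(run_cfg_text)
    dirs = []
    for patch in cfg.get("patches", []):
        archive = patch["archive"]
        # Relative archives are resolved against the thcrap root directory.
        if not os.path.isabs(archive):
            archive = os.path.join(thcrap_root, archive)
        dirs.append(os.path.normpath(archive))
    return dirs
```

The order of the returned list mirrors the order of the patch stack in the file.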
repos
Repositories contain a lot of things that need to be documented. To learn more about them, [continue to the next part].