
[tutorial] Slightly streamlined handling of tra files via HANDLE_CHARSETS



This is an attempt to make handling languages on EE and non-EE games a bit smoother, building on work by @AL|EN (here) and myself (here).

You need to put the attached file into the 'lib' subfolder of your mod folder (i.e. mymod/lib).

Then, providing

(1) you keep your .tra files in 'mymod/tra' or 'mymod/lang'

(2) the names of your components, and similar strings used by WEIDU while installing, live in setup.tra

(3) you don't want to automatically load any tra files except setup.tra (you're going to load them all using LOAD_TRA and WITH_TRA, and/or use USING and AUTO_TRA)

(4) your .tra files are encoded in UTF-8

(5) you wrote the mod in English originally (so all tra files are definitely there in English)

all you need to do is

BACKUP ~weidu_external/backup/mymod~
AUTHOR "DavidW [insert link to forum page]"
VERSION ~[whatever]~

ALWAYS
	INCLUDE "%MOD_FOLDER%/lib/charset_wrapper.tph"
	LAF charset_wrapper RET out_path END
END 
AUTO_TRA "%out_path%/%s"

LANGUAGE "English" english "mymod/lang/english/setup.tra"
LANGUAGE "Old English" oldenglish "mymod/lang/english/setup.tra" "mymod/lang/oldenglish/setup.tra"

If any of (1)-(5) isn't true, you can adjust for it through the inputs to the charset_wrapper function, like this:

BACKUP ~weidu_external/backup/mymod~
AUTHOR "DavidW [insert link to forum page]"
VERSION ~[whatever]~

ALWAYS
	INCLUDE "%MOD_FOLDER%/lib/charset_wrapper.tph"
	LAF charset_wrapper 
		INT_VAR 
			from_utf8=1 // set this to 0 if your tra files are encoded for non-EE games and need to be converted to utf-8
			overwrite=0 // set this to 1 if you want the function to redo the conversion every time you install a new
				// component. (I recommend doing this while developing so that changes you make to the .tra files
				// get caught immediately.)
			verbose=0 // set this to 1 to get more feedback from HANDLE_CHARSETS
		STR_VAR 
			tra_path="" // set this to the folder containing your tra files, if it's not 'tra' or 'lang'. It doesn't matter
					// whether or not you include the mod folder root, i.e. 'mymod/mytra' and 'mytra' get treated the same
			setup_tra="setup" // set this to the name of the tra file with your component names in, if it's not setup.tra
			load="" // set this to a list of tra files you want to always be loaded, separated by spaces. (You can include 
				// the '.tra' or not as you like, it doesn't matter)
			default_language="english" // set this to the language you wrote the mod in
	RET out_path END
END 
AUTO_TRA "%out_path%/%s"

LANGUAGE "English" english "mymod/lang/english/setup.tra"
LANGUAGE "Old English" oldenglish "mymod/lang/english/setup.tra" "mymod/lang/oldenglish/setup.tra"
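
For illustration, here is a rough sketch of how a component might then pick up the copied/converted files through out_path. It is not part of the library. The file names (mydialog.d, mydialog.tra, items.tra), the 'dlg' folder, the item resource and the string numbers @1 and @100 are all hypothetical.

```weidu
BEGIN @1 // component name, assumed to be string 1 in setup.tra

// With AUTO_TRA "%out_path%/%s" in place, compiling mydialog.d should pick up
// mydialog.tra from the out_path folder for the selected language.
COMPILE "%MOD_FOLDER%/dlg/mydialog.d"

// WITH_TRA makes items.tra available only to the tp2 code inside this block.
WITH_TRA "%out_path%/%LANGUAGE%/items.tra" BEGIN
	COPY_EXISTING "sw1h01.itm" override // hypothetical item to rename
		SAY NAME2 @100 // hypothetical string number in items.tra
END

// LOAD_TRA keeps the strings loaded for the rest of the component instead.
LOAD_TRA "%out_path%/%LANGUAGE%/items.tra"
```

The point is only that every reference goes through out_path and %LANGUAGE%, so the tp2 never needs to know whether the files were converted or just copied.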

Functionally, the main differences with earlier versions are:

  • we copy tra files over to weidu_external/lang/%MOD_FOLDER%/%LANGUAGE% whether or not iconv is run. The main point of doing so is to make sure all the tra files are present, falling back to the default-language version of any file that is missing in the chosen language. HANDLE_CHARSETS does this automatically, but it is good to have it happen even if we don't need to convert to/from UTF-8.
  • the whole thing only runs once (not once per component), which saves time.
  • hopefully the interface is a bit easier for novices.

Here's the actual function:

DEFINE_ACTION_FUNCTION charset_wrapper
	INT_VAR from_utf8=1//boolean // set to 0 if your tra files aren't in UTF8
			overwrite=0//boolean	 // set to 1 if you want to regenerate the converted files every run
			verbose=0//boolean	// set to 1 to get more feedback from HANDLE_CHARSETS
			silent=0//boolean // set to 1 if you don't want to be warned when tra files from a folder in the extra_tra_folders list overlap with the main list
	STR_VAR	tra_path="" // set to where your tra files are (with or without '%MOD_FOLDER%')
						// if they're not in %MOD_FOLDER%/tra or %MOD_FOLDER%/lang
			setup_tra="setup"	// set to whatever you're keeping your WEIDU installation strings in
			load=""	// set to a space-separated list of any tra files you want loaded 
			default_language="english" // set to whatever language you wrote the mod in
			extra_tra_folders="" // set to a space-separated list of any additional folders (relative to %MOD_FOLDER%) containing TRA files
	RET out_path
BEGIN
	// set out path
	OUTER_SPRINT out_path "weidu_external/lang/%MOD_FOLDER%"
	// find the tra location
	ACTION_IF "%tra_path%" STR_EQ "" BEGIN
		// try to guess 'tra' or 'lang'
		ACTION_IF DIRECTORY_EXISTS "%MOD_FOLDER%/lang" && DIRECTORY_EXISTS "%MOD_FOLDER%/tra" BEGIN
			FAIL "charset_wrapper error: %MOD_FOLDER% contains both tra and lang subfolders(!) You need to specify which one you use for your tra files."
		END ELSE
		ACTION_IF DIRECTORY_EXISTS "%MOD_FOLDER%/lang" BEGIN
			OUTER_SPRINT tra_path "%MOD_FOLDER%/lang"
		END ELSE
		ACTION_IF DIRECTORY_EXISTS "%MOD_FOLDER%/tra" BEGIN
			OUTER_SPRINT tra_path "%MOD_FOLDER%/tra"
		END ELSE BEGIN
			FAIL "charset_wrapper error: you didn't specify tra_path and it's not 'tra' or 'lang'."
		END
	END ELSE 
	// add '%MOD_FOLDER%' if it's not there already
	ACTION_IF !(INDEX ("%MOD_FOLDER%" "%tra_path%")=0) BEGIN
		OUTER_SPRINT tra_path "%MOD_FOLDER%/%tra_path%"
	END
	ACTION_IF overwrite BEGIN 
		OUTER_SET proceed=1 // always proceed if overwrite=1
	END ELSE BEGIN
		OUTER_SET proceed=1 // otherwise proceed only if out_path has no tra files for this language yet
		ACTION_BASH_FOR "%out_path%/%LANGUAGE%" ".*\.tra$" BEGIN
			OUTER_SET proceed=0 // found an existing tra file, so skip the work
		END
	END
	ACTION_IF proceed BEGIN
		// set the no-convert array (the setup tra file is left unconverted)
		ACTION_CLEAR_ARRAY noconvert_array
		OUTER_SPRINT $noconvert_array(0) "%setup_tra%"
		// set the reload array
		ACTION_CLEAR_ARRAY reload_array
		OUTER_PATCH "%load% " BEGIN
			ind=0
			REPLACE_EVALUATE "\([^ ]+\) " BEGIN
				SPRINT $reload_array("%ind%") "%MATCH1%"
				++ind
			END
			""
		END
		// get extra_tra into an array
		OUTER_PATCH "%extra_tra_folders% " BEGIN
			CLEAR_ARRAY extra_tra_array
			REPLACE_EVALUATE "\([^ %TAB%]+\)[ %TAB%]+" BEGIN
				SPRINT $extra_tra_array("%MOD_FOLDER%/%MATCH1%") ""
			END
			""
		END
		// are we on enhanced edition?
		// later versions of EE have bgee.lua. Very early versions of BGEE have monkfist.2da
		OUTER_SET enhanced_edition=( FILE_EXISTS_IN_GAME bgee.lua || FILE_EXISTS_IN_GAME monkfist.2da )
		// are we converting?
		OUTER_SET convert=!(enhanced_edition=from_utf8)
		ACTION_IF convert && !FILE_EXISTS "%tra_path%/iconv/iconv.exe" BEGIN
			OUTER_SET convert=0
			WARN "charset_wrapper warning: can't find iconv.exe"
		END
		ACTION_IF convert BEGIN
			OUTER_SPRINT iconv_path "%tra_path%/iconv"
			// run HANDLE_CHARSETS
			LAF HANDLE_CHARSETS
				INT_VAR from_utf8=from_utf8
						infer_charsets=1
						verbose=verbose
				STR_VAR tra_path = EVAL "%tra_path%" // I hate not being able to assume AUTO_EVAL_STRINGS
						out_path = EVAL "%out_path%"
						noconvert_array = noconvert_array
						reload_array = reload_array
						default_language = EVAL "%default_language%"
			END
			ACTION_PHP_EACH extra_tra_array AS extra_tra_path=>discard BEGIN
				ACTION_IF !silent BEGIN
					ACTION_BASH_FOR "%extra_tra_path%/%default_language%" ".*\.tra$" BEGIN
						ACTION_IF FILE_EXISTS "%tra_path%/%default_language%/%BASH_FOR_FILE%" BEGIN
							WARN "charset_wrapper: namespace conflict (%BASH_FOR_FILE% exists in both %tra_path% and %extra_tra_path%)"
						END
					END
				END
				LAF HANDLE_CHARSETS
					INT_VAR from_utf8=from_utf8
							infer_charsets=1
							verbose=verbose
					STR_VAR tra_path = EVAL "%extra_tra_path%" 
							out_path = EVAL "%out_path%"
							iconv_path = EVAL "%iconv_path%"
							default_language = EVAL "%default_language%"
				END			
			END
			// run it again for the default language
			ACTION_IF !("%default_language%" STR_EQ "%LANGUAGE%") BEGIN
				LAF HANDLE_CHARSETS
					INT_VAR from_utf8=from_utf8
							infer_charsets=1
							verbose=verbose
					STR_VAR tra_path = EVAL "%tra_path%" 
							out_path = EVAL "%out_path%"
							noconvert_array = noconvert_array
							reload_array = reload_array
							language = EVAL "%default_language%"
				END
				ACTION_PHP_EACH extra_tra_array AS extra_tra_path=>discard BEGIN
					LAF HANDLE_CHARSETS
						INT_VAR from_utf8=from_utf8
								infer_charsets=1
								verbose=verbose
						STR_VAR tra_path = EVAL "%extra_tra_path%" 
								out_path = EVAL "%out_path%"
								iconv_path = EVAL "%iconv_path%"
								language = EVAL "%default_language%"
					END			
				END
			END			
		END ELSE BEGIN
			// do our own copies
			MKDIR "%out_path%"
			MKDIR "%out_path%/%LANGUAGE%"
			MKDIR "%out_path%/%default_language%"
			ACTION_BASH_FOR "%tra_path%/%default_language%" ".*\.tra$" BEGIN
				ACTION_IF !("%BASH_FOR_RES%" STR_EQ "%setup_tra%") BEGIN // don't bother copying the setup tra file
					COPY "%tra_path%/%default_language%/%BASH_FOR_FILE%" "%out_path%/%default_language%"
					ACTION_IF !("%default_language%" STR_EQ "%LANGUAGE%") BEGIN
						ACTION_IF FILE_EXISTS "%tra_path%/%LANGUAGE%/%BASH_FOR_FILE%" BEGIN
							OUTER_SPRINT tra "%tra_path%/%LANGUAGE%/%BASH_FOR_FILE%"
						END ELSE BEGIN
							OUTER_SPRINT tra "%tra_path%/%default_language%/%BASH_FOR_FILE%"
						END
						COPY "%tra%" "%out_path%/%LANGUAGE%"
					END
				END
			END	
			ACTION_PHP_EACH extra_tra_array AS extra_tra_path=>discard BEGIN
				ACTION_BASH_FOR "%extra_tra_path%/%default_language%" ".*\.tra$" BEGIN
					ACTION_IF !silent && FILE_EXISTS "%out_path%/%default_language%/%BASH_FOR_FILE%" BEGIN
						WARN "charset_wrapper: namespace conflict (%BASH_FOR_FILE% exists in both %tra_path% and %extra_tra_path%)"						
					END
					COPY "%extra_tra_path%/%default_language%/%BASH_FOR_FILE%" "%out_path%/%default_language%"
					ACTION_IF !("%default_language%" STR_EQ "%LANGUAGE%") BEGIN
						ACTION_IF FILE_EXISTS "%extra_tra_path%/%LANGUAGE%/%BASH_FOR_FILE%" BEGIN
							OUTER_SPRINT tra "%extra_tra_path%/%LANGUAGE%/%BASH_FOR_FILE%"
						END ELSE BEGIN
							OUTER_SPRINT tra "%extra_tra_path%/%default_language%/%BASH_FOR_FILE%"
						END
						ACTION_IF verbose BEGIN // debug output, shown only when verbose=1
							PRINT "tra is %tra%; language is %LANGUAGE%; default language is %default_language%"
						END
						COPY "%tra%" "%out_path%/%LANGUAGE%"
					END					
				END
			END
			// do our own reloads
			ACTION_PHP_EACH reload_array AS int=>tra BEGIN
				ACTION_IF "%tra%" STRING_MATCHES_REGEXP ".+\.tra$" = 0 BEGIN
					LOAD_TRA "%out_path%/%LANGUAGE%/%tra%"				
				END ELSE BEGIN
					LOAD_TRA "%out_path%/%LANGUAGE%/%tra%.tra"
				END
			END
		END
	END
END

Feedback (and bug fixes!) welcomed.

 

 

charset_wrapper.tph

On 10/10/2023 at 12:14 AM, DavidW said:

(3) you don't want to automatically load any tra files except setup.tra (you're going to load them all using LOAD_TRA and WITH_TRA, and/or use USING and AUTO_TRA)

Is this a prerequisite for your function to work, or could this be expanded upon? It's common for mods to have all install strings in setup.tra but in-game strings in another file (e.g. game.tra). Then there is the EE-specific game_ee.tra, which needs to be reloaded (in case of ANSI -> UTF-8 conversion), and journal entries are sometimes kept in a journal.tra. All of these are loaded in the LANGUAGE definition, because the strings are used either in the tp2 or across several d/baf files (journal entries need to match exactly for EraseJournalEntry to work).
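
For reference, the 'load' input described in the first post looks like the intended hook for files that should always be loaded. A minimal sketch, assuming the game.tra and journal.tra names from the question sit next to setup.tra in each language folder under a hypothetical mymod/lang (BACKUP, AUTHOR and VERSION omitted):

```weidu
ALWAYS
	INCLUDE "%MOD_FOLDER%/lib/charset_wrapper.tph"
	LAF charset_wrapper
		STR_VAR
			load="game journal" // game.tra and journal.tra; the '.tra' suffix is optional
	RET out_path END
END
AUTO_TRA "%out_path%/%s"

// only setup.tra is named here; the other files come in through 'load'
LANGUAGE "English" english "mymod/lang/english/setup.tra"
```

One thing worth checking in the function as posted: the loading step sits inside the 'proceed' check, so with overwrite=0 it may be skipped once the output folder has been populated. An EE-only file such as game_ee.tra is not covered by this sketch.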


I've uploaded an updated version with two changes:

(1) The previous version didn't deal well with situations where a tra file was only partially translated; it tended to choke rather than using the strings from the default-language version when it couldn't find them in the translated version.

(2) The function can now handle a situation where your tra files are in more than one folder: you can feed it a STR_VAR 'extra_tra_folders' which is a space-separated list of folders (relative to %MOD_FOLDER%) where other tra files exist. You still need to avoid namespace conflicts. (I have a somewhat-esoteric use case in mind for this.)
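
As a sketch of (2), assuming two hypothetical extra folders, mymod/shared/lang and mymod/addon/lang, each laid out with the same per-language subfolders (english, and so on) as the main tra folder:

```weidu
ALWAYS
	INCLUDE "%MOD_FOLDER%/lib/charset_wrapper.tph"
	LAF charset_wrapper
		INT_VAR
			silent=0 // leave at 0 to be warned when a file name exists in more than one folder
		STR_VAR
			tra_path="lang"
			// hypothetical folders, given relative to %MOD_FOLDER% and separated by spaces
			extra_tra_folders="shared/lang addon/lang"
	RET out_path END
END
AUTO_TRA "%out_path%/%s"
```

Files from all of the folders end up together under out_path, which is why identically named tra files in different folders would clash.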


@DavidW I wonder if you could add an option that places the converted translations in "weidu_external/%MOD_FOLDER%/lang" instead of "weidu_external/lang/%MOD_FOLDER%"? While my wish would be to have one standard for the weidu_external directory, it appears that people already prefer the former for their mods, so it could make adopting the function easier.

EDIT: typo

Edited by AL|EN

I could, but what's the point? It doesn't make any actual difference (both are guaranteed-unique) and it's transparent to the user: you reference it via the out_path variable anyway so you don't actually need to know where it lives. If people care that much about implementation details they can write their own code!

As for 'one standard', again, why does it matter? My original architecture for weidu_external has categories (backup, data, batch, lang) at the top level, MOD_FOLDER below that, and I still prefer that, but nothing is going to break if people decide they prefer @CamDawg's convention and put MOD_FOLDER at the top level.


Understandably, you write code to suit your preferences. I deal with many mods from many different authors, so I constantly have to check that I'm looking in the right folder. This is also about convenience: it's easier to check everything inside one folder per mod than to keep going back up two levels. Now, let's say a modder who uses "BACKUP ~weidu_external/tod/backup~" adopts the function. The mod-related files are then placed in two different folder schemas. A 'standard' matters because it makes debugging easier.

I'm nitpicking here, I will gladly make some tiny modifications to this lib to cover my use case. Can I?


Right, but since WEIDU doesn't enforce any standard here, and it's unrealistic to imagine enforcing one on the whole mod ecosystem, any external tool would have to allow for multiple possibilities anyway.

Sure, you're welcome to borrow and modify the code; maybe give the function a different name ('charset_wrapper_alien' or something) just to disambiguate.


Hi,

I'd like to use this for my French translation of an IWD2 mod: IWD2EE.

I'm currently obliged to use ANSI encoding for my French .tra files, otherwise I get character errors in game.

I've looked at a few other IWD2 mods translated into French and they all use ANSI for their French .tra files.

I'd like to work in VS Code instead of Notepad++, but VS Code doesn't have the ability to open a file with ANSI encoding.
It suggests Windows 1252 but I can't commit anything in this encoding so it's not a solution.

I found HANDLE_CHARSETS in the weidu documentation which can convert UTF8 to/from ANSI but I had no idea how to use this function. Then I found AL|EN's topic and this one from DavidW.

I don't need to touch .tra files in languages other than French, so how can I use your code to convert only my French .tra files so as not to overload the installation for non-French speakers?

Thank you

 
