
Fails to install 3020 (Replace many +1 magic weapons) in french



This one involves a substring lookup, I think.

In the "lib_ietools.tph" file there is a function

DEFINE_DIMORPHIC_FUNCTION handle_unusable

It searches inside an item description for the strref of the "unusable by:" string, like this:

OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522

For example, on bg2ee in French it will pick

@74251 = ~Non utilisable par :~

The little space between "par" and the colon is a non-breaking space.

Then it looks up (IIUC) this string inside an item description.

index=INDEX_BUFFER ("%unusable_string%")

For the first one (in the order of fineweapon.2da, created in fine_weapons.tpa), ax1h01, it apparently obtains

[...]
Non utilisable par :
[...]

But there the space between "par" and the colon is a basic ascii space (so not the same character).

So it finds nothing and index is -1.

The DELETE operation that follows ends up with

ERROR: [ax1h01.itm] -> [override/ax1h02.itm] Patching Failed (COPY) (Invalid_argument("String.sub / Bytes.sub"))
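
To make the failure mode concrete, here is a small Python model of it (not WeiDU). It assumes the lookup behaves like a plain substring search that returns -1 on a miss, and that the delete step rejects an out-of-range start. `delete_bytes` is a hypothetical stand-in, the description text is invented apart from the marker, and the model works on characters rather than raw bytes.

```python
NBSP = "\u00a0"

# Marker as read from the game's dialog.tlk: non-breaking space before the colon.
marker_from_tlk = "Non utilisable par" + NBSP + ":"

# Description as written in the mod's tra file: plain ASCII space (invented text).
description = "Hache de qualité.\n\nNon utilisable par :\n Druide"


def delete_bytes(buffer, start, count):
    """Hypothetical strict delete: refuses ranges that fall outside the buffer."""
    if start < 0 or count < 0 or start + count > len(buffer):
        raise ValueError("invalid range")
    return buffer[:start] + buffer[start + count:]


# The two markers differ by one character, so the search misses.
index = description.find(marker_from_tlk)

try:
    result = delete_bytes(description, index, len(description) - index)
except ValueError:
    result = None  # the model's equivalent of the install error

# With the ASCII-space marker the same two steps succeed.
ascii_index = description.find("Non utilisable par :")
trimmed = delete_bytes(description, ascii_index, len(description) - ascii_index)
```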

 

NOTE: The French version is not actually installable at the moment (too many tra entries are missing), so this is not easily reproducible right now. Maybe just this component (3020) is installable by itself.

Link to comment

There is a similar problem with the price change requested by Gaelan Bayle in v34.3 (not yet tested in v35): https://github.com/Gibberlings3/SwordCoastStratagems/pull/60

Since REPLACE_TEXTUALLY works with non-breaking spaces, I suggest this as a starting point for finding a cleaner solution:

DEFINE_DIMORPHIC_FUNCTION handle_unusable
	STR_VAR arguments=""
	RET value
BEGIN
	ACTION_IF enhanced_edition BEGIN
		OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522 // string with common space
		OUTER_SET strref_no_breaking_space=is_iwd?34753:is_bg2?742519:31522 // string with no-breaking space
		ACTION_GET_STRREF strref unusable_string
		ACTION_IF strref_no_breaking_space != strref BEGIN
		    ACTION_GET_STRREF strref_no_breaking_space unusable_string_no_breaking_space
		    OUTER_PATCH_SAVE arguments "%arguments%" BEGIN
		        REPLACE_TEXTUALLY "%unusable_string_no_breaking_space%" "%unusable_string%"
		    END
		END
		OUTER_PATCH_SAVE value "%arguments%" BEGIN
			index=INDEX_BUFFER ("%unusable_string%")
			DELETE_BYTES index (BUFFER_LENGTH - index)
		END
	END ELSE BEGIN
		OUTER_SPRINT value "%arguments%"
	END
END

 

Link to comment

Sooo I finally understood.

On obg2,

DEFINE_DIMORPHIC_FUNCTION handle_unusable

will skip everything because of

ACTION_IF enhanced_edition BEGIN

and just return the string that was given as argument (i.e. it leaves the description from fine_weapons.tra as is).

On the other hand, on bg2ee

OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522

will select @74251

@74251 = ~Non utilisable par ̲:~

The " ̲" character is actually U+00A0, a non-ASCII character and (alas) one that does not exist in cp1252, the charset used for French by obg1/2.

The function then looks in the item description for the @74251 string in order to remove it. The description _does not_ come from the bg2ee dialog.tlk but from fine_weapons.tra, which is encoded so as to be compatible with obg2. That means it _can't_ contain the U+00A0 character. It uses (as obg2 itself does) an ASCII space, U+0020.

So, as I said in the OP, the lookup of the substring fails, index is -1, DELETE_BYTES fails, and the component errors out.

At the time I didn't go far enough and I thought, on your advice, "I should correct the fine_weapons.tra file to use nbsp instead".

The component now works correctly on bg2ee. It even installs on obg2. But then it adds a U+00A0 character to the item description, which shows up in-game as a "dot" character instead of a space.

I don't see a way to work around this only with the translation file. I'd say the cause is a "source mismatch" where we're modifying data aimed at obg2 with data from bg2ee and we can't really reconcile both.

I see some ways to fix it (there are probably others)

  • the option offered by Alywena above
  • using different versions of fine_weapons.tra (I think iwdification has both a xxx.tra and a xxx_ee.tra in some places, for example)

But maybe someone else can propose something that involves only tra files...

Link to comment

I'm not sure I follow (yet).

The intention for tra files in v35 is that they're all encoded in UTF-8, and then if you're on oBG2 they're all converted by HANDLE_CHARSETS to oBG2 format. It sounds as if you're saying that @74251 is encoded in a different framework than fine_weapons.tra - but that shouldn't be possible. 

Can you explain further? (I can try to look directly, since my French is not quite 0%, but I'm still terrible enough that I'd rather get expert advice.)

Link to comment

Ok

So I was wrong on one point: U+00A0 _is_ a non-breaking space in cp1252, so HANDLE_CHARSETS is correctly _not_ converting it into anything else when on obg2.
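
As a quick check of that point, the Python sketch below shows how U+00A0 is represented in each encoding. It assumes the charset step amounts to a straightforward UTF-8 to cp1252 re-encoding, which is a simplification; `utf8_to_cp1252` is a hypothetical helper, not part of WeiDU or SCS.

```python
NBSP = "\u00a0"

utf8_bytes = NBSP.encode("utf-8")     # two bytes in UTF-8
cp1252_bytes = NBSP.encode("cp1252")  # one byte in cp1252
ascii_space = " ".encode("cp1252")    # the ordinary space is a different byte


def utf8_to_cp1252(data: bytes) -> bytes:
    """Hypothetical model of the charset conversion: decode UTF-8, re-encode as cp1252."""
    return data.decode("utf-8").encode("cp1252")


marker = "Non utilisable par" + NBSP + ":"
converted = utf8_to_cp1252(marker.encode("utf-8"))

print(utf8_bytes.hex(), cp1252_bytes.hex(), ascii_space.hex())
```

Under that assumption the non-breaking space survives the conversion as the single byte 0xA0, so whatever the obg2 engine draws for that byte is what the player sees.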

BUT the obg2 engine appears to render the non-breaking space not as a space but as a dot sitting a bit above the baseline.

To be honest, I didn't see it (the dot) myself; someone ( @Jazira ) had to point it out to me (my sight being less sharp than it once was).

(To be even more honest, I don't use obg2 or this component myself, and testing was inconvenient because tobex under linux seems to have some... quirks, unless it's not linux-specific of course)

Screenshot_20231226_232916.png

 

EDIT: and my screenshot is not even really convincing either, being made on a low res game 😕

Edited by mickabouille
Link to comment

French translators use this rule: EE Baldur's Gate uses non-breaking spaces (before [;:?!%»], in the "middle" of [0-9]{4,} (1 500, 10 000, 25 000) and after [«]), vanilla Baldur's Gate does not.

If a mod is compatible for both, then there are two options to translate:

- Either the mod provides two sets of files/entries for every item, one for EE, one for vanilla. (This is the best option, for all languages, since the format of anything below "STATISTICS:" (and the item title) is not handled the same by the engine in EE and vanilla)

- Or, there is a single file/entry for every item. This means that the translator has to adapt the EE format of anything below "STATISTICS:", be sure to remove every non-breaking space, and add a "(not )usable by:" for the vanilla users. BUT, the EE users will have "(not )usable by:" twice because it is automatically generated by the engine.

Same for the item "title", the item name has to be directly in the description for vanilla users. But it is automatically generated on EE, so EE users will have 2 titles.

In my opinion it would be best, for all languages, if we got two sets of files/entries for any item present in any mod compatible for both vanilla and EE. ;)
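
For anyone who wants to script the spacing rule described above, here is a rough Python sketch. It covers only the punctuation part of the rule (before `;:?!%»` and after `«`) and leaves out the digit grouping. `to_ee` and `to_vanilla` are hypothetical helper names, and replacing each non-breaking space with an ASCII space for vanilla is an assumption about the desired output.

```python
import re

NBSP = "\u00a0"

# An ASCII space sitting where the EE convention expects a non-breaking space:
# directly before ; : ? ! % » or directly after «
EE_SPACING = re.compile(r" (?=[;:?!%»])|(?<=«) ")


def to_ee(text: str) -> str:
    """Turn ASCII spaces into non-breaking spaces at the positions named by the rule."""
    return EE_SPACING.sub(NBSP, text)


def to_vanilla(text: str) -> str:
    """Replace every non-breaking space with an ASCII space."""
    return text.replace(NBSP, " ")


ee_marker = to_ee("Non utilisable par :")
vanilla_marker = to_vanilla(ee_marker)
quoted = to_ee("« Bonjour ! »")
```

Running `to_vanilla` over a single shared entry is the "remove every non-breaking space" step from the second option; it does nothing about the duplicated title or "(not )usable by:" block.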

Edited by Jazira
Link to comment
On 12/10/2023 at 6:21 PM, Alywena said:

There is a similar problem with the price change requested by Gaelan Bayle in v34.3 (not yet tested in v35): https://github.com/Gibberlings3/SwordCoastStratagems/pull/60

Since REPLACE_TEXTUALLY works with non-breaking spaces, I suggest this as a starting point for finding a cleaner solution:

DEFINE_DIMORPHIC_FUNCTION handle_unusable
	STR_VAR arguments=""
	RET value
BEGIN
	ACTION_IF enhanced_edition BEGIN
		OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522 // string with common space
		OUTER_SET strref_no_breaking_space=is_iwd?34753:is_bg2?742519:31522 // string with no-breaking space
		ACTION_GET_STRREF strref unusable_string
		ACTION_IF strref_no_breaking_space != strref BEGIN
		    ACTION_GET_STRREF strref_no_breaking_space unusable_string_no_breaking_space
		    OUTER_PATCH_SAVE arguments "%arguments%" BEGIN
		        REPLACE_TEXTUALLY "%unusable_string_no_breaking_space%" "%unusable_string%"
		    END
		END
		OUTER_PATCH_SAVE value "%arguments%" BEGIN
			index=INDEX_BUFFER ("%unusable_string%")
			DELETE_BYTES index (BUFFER_LENGTH - index)
		END
	END ELSE BEGIN
		OUTER_SPRINT value "%arguments%"
	END
END

 

This option could work too.

But it brings me to something I've wished to express for a while: a bit of criticism of how SCS (and any mod) uses scripts to edit tiny parts of an entry instead of replacing the whole. I understand this is to maximize eventual compatibility, but I find it rather counterproductive, and very difficult for translators, who usually don't have the same level of knowledge (of regex, for example) as, say, a modder.

We have this example with the Spellhold cost component and non-breaking spaces, as Alywena stated.

But there are many others too, like in dw_iwdspells.tra:

Quote

 

@201 = ~Peut utiliser la capacit[^ ]+ de[^:]+:~ // 1-4 must exactly match the appropriate strings from the BG2 description of the bard song (other than replacing '.' with '\.')
@202 = ~ *[Aa]ugmente \(le \)?moral [^ ]+ \(sa \)?valeur moyenne,?~
@203 = ~ *[Pp]rotection contre la peur\( et\)?~
@204 = ~ *[Aa]paisement de la peur\.?~

@222 = ~Inconv[^n]+nients[^:]*:\|D[^s]+savantages[^:]*:~ // must match appropriate string from kit descriptions
// bg1ee/bg2ee -> "Inconvénients :" <- this is a non-breakable space
// obg2 -> "Désavantages :" <- this is an ascii space
// correctfrbg1ee/correctfrbg2ee -> "Désavantages :" <- this is a non-breakable space

@223 = ~Cette capacité remplace le Chant du Barde actuel.~ // must exactly match string from description of EE spcl920.spl
@224 = ~Cette faculté remplace la chanson du barde.~ // must exactly match string from description of vanilla spcl920.spl
@225 = ~Cette capacité remplace le Chant du barde.~ // must exactly match string from description of correctfrbg2ee spcl920.spl

 

I find it overly complicated. It adds tons of investigative work (BG2 alone, for example, has 4 sets of French localization: obg2, CorrectfrBG2, bg2ee and CorrectfrBG2EE, with massive differences), and we don't even know if it covers every possible case. AND regex, on top of the non-breaking-space problem, does not handle special characters (éèïîêôûù, etc.), which are very common in some languages, at all.

I'd rather have something that replaces the whole entry than that. And eventually, we (the French translators) could remove the non-breaking spaces directly on unique entries to ensure compatibility for both EE and vanilla since it's our "shit" to deal with and it does not matter that much to have or not non-breaking spaces at the end. ;)

Edited by Jazira
Link to comment

Addressing the specifics (I'll comment on the more philosophical issues separately when I have a chance).

I'm still a bit unsure what the problem is with descriptions, but let me describe it more explicitly (I should probably put this in a 'note to translators' too). An SCS item description consists of

  1. some text
  2. One of two unique strings, which in English are 'Usable by:' or 'Unusable by:' 
  3. some more text

On an EE install, SCS strips out (2) and (3).

To handle this in a translation, you need to make sure that the unique strings, which occur in shared.tra at @100400 and @100401, occur exactly in the item descriptions. They can have whatever spaces, special characters, or the like they need; all that matters is that they occur in the item description exactly as they occur in shared.tra. Everything is internal to SCS's tra files; what the dialog.tlk file says is irrelevant. If there's some language-specific way this could break, I can't see what it is.
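
The stripping described above can be modelled in a few lines of Python (not the actual SCS code). `strip_usability` is a hypothetical name, the marker strings are the English ones quoted above, and returning the text unchanged when no marker is found is a choice made for this sketch.

```python
def strip_usability(description: str, markers: list[str]) -> str:
    """Cut the description at the first exact occurrence of any marker.

    The markers must appear character for character in the description;
    if none does, the text is returned unchanged.
    """
    cut = len(description)
    for marker in markers:
        index = description.find(marker)
        if index >= 0:
            cut = min(cut, index)
    return description[:cut]


# English markers as quoted in the post; the description text is invented.
MARKERS = ["Usable by:", "Unusable by:"]

ee_text = strip_usability("A fine axe.\n\nUnusable by:\n Druid", MARKERS)
untouched = strip_usability("A fine axe.", MARKERS)
```

Because both the markers and the descriptions come from the mod's own tra files, a translation only has to keep the two in step, whatever kind of space it uses.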

Link to comment

On the general issue Jazira raises: I certainly appreciate that swapping out descriptions can cause language problems (especially in languages that have multiple dialog.tlk files). There may be places where this can be avoided, and I'll try to keep a lookout for them. 

In general, though, SCS (and, even more so, ToF) does this sort of thing because it's needed for compatibility. The example Jazira gives above (where a string is matched to the bard kit description) is an example of this: it needs to patch *all* bard descriptions, including potentially descriptions added by third-party mods, to add 'cannot use additional bard songs' to the disadvantages list of the kit.

 

Link to comment
