mickabouille Posted December 10, 2023

This one involves substring lookup, I think.

In the "lib_ietools.tph" file there is a function:

    DEFINE_DIMORPHIC_FUNCTION handle_unusable

It searches inside an item for the strref for "unusable by:" like this:

    OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522

For example, on bg2ee in French it will settle on

    @74251 = ~Non utilisable par :~

The little space between "par" and the colon is a non-breaking space.

Then it looks up (IIUC) this string inside an item description:

    index=INDEX_BUFFER ("%unusable_string%")

For the first one (order in fineweapon.2da created in fine_weapons.tpa), ax1h01, it apparently obtains

    [...] Non utilisable par : [...]

But there the space between "par" and the colon is a basic ASCII space (so not the same character). So it finds nothing and index is -1.

The DELETE_BYTES operation that follows ends up with

    ERROR: [ax1h01.itm] -> [override/ax1h02.itm] Patching Failed (COPY) (Invalid_argument("String.sub / Bytes.sub"))

NOTE: The French version is not actually installable at the moment (too many tra entries are missing), so this is not easily reproducible right now. Maybe just this component (3020) is installable by itself.
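The lookup failure described in this post can be reproduced outside WeiDU. The following is a minimal Python sketch, not SCS code: the sample description and the function name `strip_from_marker` are hypothetical, and Python's literal `str.find` only stands in for `INDEX_BUFFER` (which matches a regexp). It shows that a marker containing U+00A0 is not found in a text that uses an ASCII space, and how a guard on the "not found" result avoids cutting at an invalid offset.

```python
NBSP = "\u00a0"  # non-breaking space, as reported for the bg2ee French string

# Hypothetical stand-ins for the two strings discussed in the post.
unusable_string = f"Non utilisable par{NBSP}:"  # marker with a non-breaking space
description = "Hache de qualité.\nNon utilisable par :\nMage"  # ASCII space


def strip_from_marker(text: str, marker: str) -> str:
    """Cut text at the first occurrence of marker; return it unchanged if absent."""
    index = text.find(marker)
    if index < 0:
        # Without this guard, the cut would be attempted from offset -1.
        return text
    return text[:index]


# The two markers look identical on screen but differ by one character.
print(description.find(unusable_string))                  # marker with NBSP
print(description.find(unusable_string.replace(NBSP, " ")))  # marker with ASCII space
print(repr(strip_from_marker(description, unusable_string)))
```

With the guard in place, a mismatched translation leaves the description untouched instead of aborting; whether silently skipping or warning is preferable is a design choice for the mod author.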
DavidW Posted December 10, 2023

OK, I'll deprecate the French translation for now, until someone who speaks French can sort it out.
Alywena Posted December 10, 2023

There is a similar problem with the price change requested by Gaelan Bayle in v34.3 (not yet tested in v35): https://github.com/Gibberlings3/SwordCoastStratagems/pull/60

Since REPLACE_TEXTUALLY works with non-breaking spaces, I suggest this as a starting point for finding a cleaner solution:

    DEFINE_DIMORPHIC_FUNCTION handle_unusable
        STR_VAR arguments=""
        RET value
    BEGIN
        ACTION_IF enhanced_edition BEGIN
            OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522 // string with common space
            OUTER_SET strref_no_breaking_space=is_iwd?34753:is_bg2?742519:31522 // string with non-breaking space
            ACTION_GET_STRREF strref unusable_string
            ACTION_IF strref_no_breaking_space != strref BEGIN
                ACTION_GET_STRREF strref_no_breaking_space unusable_string_no_breaking_space
                OUTER_PATCH_SAVE arguments "%arguments%" BEGIN
                    REPLACE_TEXTUALLY "%unusable_string_no_breaking_space%" "%unusable_string%"
                END
            END
            OUTER_PATCH_SAVE value "%arguments%" BEGIN
                index=INDEX_BUFFER ("%unusable_string%")
                DELETE_BYTES index (BUFFER_LENGTH - index)
            END
        END ELSE BEGIN
            OUTER_SPRINT value "%arguments%"
        END
    END
DavidW Posted December 10, 2023

If I'm understanding things correctly, this is something to fix at the level of the translation files: both the 'not usable by:' string and the weapon descriptions appear in SCS's tra files, they just need to be translated the same way each time.
mickabouille (Author) Posted December 10, 2023

Ah. So that's not modifying the (original) description from the vanilla dialog.tlk?
DavidW Posted December 10, 2023

No: IIRC this is internal code, used for SCS's own tra files. (It basically lets me enter an oBG2-type item description, including 'not usable by', and then remove it on EE.)
mickabouille (Author) Posted December 10, 2023

Ok, thanks, that'll be easy to fix then.
mickabouille (Author) Posted December 26, 2023

On 12/10/2023 at 9:44 PM, mickabouille said:

    Ok, thanks, that'll be easy to fix then.

Meh, famous last words indeed. Somehow I can make it work on bg2ee OR obg2, but I struggle to get it working on both.
mickabouille (Author) Posted December 26, 2023

Sooo, I finally understood.

On obg2, DEFINE_DIMORPHIC_FUNCTION handle_unusable will skip everything because of

    ACTION_IF enhanced_edition BEGIN

and just return the string that was given as argument (i.e. it leaves the description from fine_weapons.tra as is).

On the other hand, on bg2ee

    OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522

will select @74251:

    @74251 = ~Non utilisable par ̲:~

The " ̲" character is actually U+00A0, a non-ASCII character and (alas) a character that does not exist in cp1252, the charset used for French by obg1/2.

The function then looks in the item description for the @74251 string in order to remove it. The description does _not_ come from the bg2ee dialog.tlk but from fine_weapons.tra, which is encoded so as to be compatible with obg2. Which means it _can't_ contain the U+00A0 character. It uses (as obg2 does, actually) an ASCII space, U+0020.

So, as I said in the OP, the lookup of the substring fails, index is -1, DELETE_BYTES fails, and the component ERRORs out.

At the time I didn't go far enough and I thought, on your advice, "I should correct the fine_weapons.tra file to use nbsp instead". The component now works correctly on bg2ee. It even installs on obg2. But then it adds a U+00A0 character to the item description, and it shows in-game as a "dot" character instead of a space.

I don't see a way to work around this with the translation file alone. I'd say the cause is a "source mismatch": we're modifying data aimed at obg2 with data from bg2ee, and we can't really reconcile both.

I see some ways to fix it (there are probably others):
- the option offered by Alywena above
- using different versions of fine_weapons.tra (I think iwdification has both a xxx.tra and a xxx_ee.tra in some places, for example)

But maybe someone else can propose something that involves only tra files...
DavidW Posted December 26, 2023

I'm not sure I follow (yet). The intention for tra files in v35 is that they're all encoded in UTF-8, and then if you're on oBG2 they're all converted by HANDLE_CHARSETS to oBG2 format. It sounds as if you're saying that @74251 is encoded in a different framework than fine_weapons.tra - but that shouldn't be possible. Can you explain further? (I can try to look directly, since my French is not quite 0%, but I'm still terrible enough that I'd rather get expert advice.)
mickabouille (Author) Posted December 26, 2023

Ok. So I was wrong on one point: U+00A0 _is_ a non-breaking space in cp1252, so HANDLE_CHARSETS is correctly _not_ converting it into anything else when on obg2.

BUT the obg2 engine appears to render the non-breaking space not as a space but as a dot sitting a bit above the baseline.

To be honest, I didn't see it (the dot) myself; I had to have someone (@Jazira) point it out (my sight being less precise than it once was).

(To be even more honest, I don't use obg2 or this component myself, and testing was inconvenient because tobex under linux seems to have some... quirks, unless it's not linux-specific of course.)

EDIT: and my screenshot is not even really convincing either, being made on a low res game.

Edited December 26, 2023 by mickabouille
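The encoding point in this post can be checked quickly. Below is a small Python sketch (an illustration only; it does not reproduce what HANDLE_CHARSETS does internally) showing that U+00A0 has a valid single-byte representation in cp1252, so a UTF-8 to cp1252 conversion keeps it as a non-breaking space rather than turning it into an ASCII space.

```python
NBSP = "\u00a0"   # NO-BREAK SPACE
SPACE = "\u0020"  # ordinary ASCII space

# Byte representations of the non-breaking space in the two encodings.
utf8_bytes = NBSP.encode("utf-8")     # as stored in a UTF-8 tra file
cp1252_bytes = NBSP.encode("cp1252")  # as stored after conversion to cp1252

print(utf8_bytes.hex(), cp1252_bytes.hex(), SPACE.encode("cp1252").hex())

# Converting a marker from UTF-8 to cp1252 preserves the non-breaking space,
# so it still differs from the same marker typed with an ASCII space.
marker = f"Non utilisable par{NBSP}:"
converted = marker.encode("utf-8").decode("utf-8").encode("cp1252")
plain = "Non utilisable par :".encode("cp1252")
print(converted == plain)
```

So the conversion itself is lossless for this character; the visible "dot" is a matter of how the original engine draws byte 0xA0, as reported above.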
Jazira Posted December 27, 2023

French translators use this rule: EE Baldur's Gate uses non-breaking spaces (before [;:?!%»], in the "middle" of [0-9]{4,} (1 500, 10 000, 25 000) and after [«]); vanilla Baldur's Gate does not.

If a mod is compatible with both, then there are two options for translation:
- Either the mod provides two sets of files/entries for every item, one for EE, one for vanilla. (This is the best option, for all languages, since the format of anything below "STATISTICS:" (and the item title) is not handled the same way by the engine in EE and vanilla.)
- Or there is a single file/entry for every item. This means that the translator has to adapt the EE format of anything below "STATISTICS:", be sure to remove every non-breaking space, and add a "(not )usable by:" for the vanilla users. BUT the EE users will have "(not )usable by:" twice, because it is automatically generated by the engine. Same for the item "title": the item name has to be directly in the description for vanilla users, but it is automatically generated on EE, so EE users will have 2 titles.

In my opinion it would be best, for all languages, if we got two sets of files/entries for any item present in any mod compatible with both vanilla and EE.

Edited December 27, 2023 by Jazira
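As an illustration of the spacing rule in this post, here is a hedged Python sketch of converting a string between the two conventions. The function names `to_vanilla` and `to_ee` are hypothetical, the sketch assumes that "removing" a non-breaking space means replacing it with an ASCII space, and it covers only the punctuation part of the rule (the thousands-separator case is deliberately left out).

```python
import re

NBSP = "\u00a0"


def to_vanilla(text: str) -> str:
    """Replace every non-breaking space with an ordinary ASCII space."""
    return text.replace(NBSP, " ")


def to_ee(text: str) -> str:
    """Use a non-breaking space before ; : ? ! % » and after «.

    Only ASCII spaces already present are converted; none are inserted.
    """
    text = re.sub(r" (?=[;:?!%»])", NBSP, text)
    return re.sub(r"(?<=«) ", NBSP, text)


vanilla = "Il dit : « oui » !"
ee = to_ee(vanilla)
print(repr(ee))
print(to_vanilla(ee) == vanilla)
```

A helper of this kind could let a single source text serve both game variants, though it would not address the duplicated "usable by" and title lines mentioned above.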
Jazira Posted December 27, 2023

On 12/10/2023 at 6:21 PM, Alywena said:

    There is a similar problem with the price change requested by Gaelan Bayle in v34.3 (not yet tested in v35): https://github.com/Gibberlings3/SwordCoastStratagems/pull/60

    Since REPLACE_TEXTUALLY works with non-breaking spaces, I suggest this as a starting point for finding a cleaner solution:

        DEFINE_DIMORPHIC_FUNCTION handle_unusable
            STR_VAR arguments=""
            RET value
        BEGIN
            ACTION_IF enhanced_edition BEGIN
                OUTER_SET strref=is_iwd?34753:is_bg2?74251:31522 // string with common space
                OUTER_SET strref_no_breaking_space=is_iwd?34753:is_bg2?742519:31522 // string with non-breaking space
                ACTION_GET_STRREF strref unusable_string
                ACTION_IF strref_no_breaking_space != strref BEGIN
                    ACTION_GET_STRREF strref_no_breaking_space unusable_string_no_breaking_space
                    OUTER_PATCH_SAVE arguments "%arguments%" BEGIN
                        REPLACE_TEXTUALLY "%unusable_string_no_breaking_space%" "%unusable_string%"
                    END
                END
                OUTER_PATCH_SAVE value "%arguments%" BEGIN
                    index=INDEX_BUFFER ("%unusable_string%")
                    DELETE_BYTES index (BUFFER_LENGTH - index)
                END
            END ELSE BEGIN
                OUTER_SPRINT value "%arguments%"
            END
        END

This option could work too. But it brings me to something I've wished to express for a while: a bit of criticism of how SCS (and any mod) uses scripts to edit tiny parts of an entry instead of replacing the whole. I understand this is to maximize eventual compatibility, but I find it rather counterproductive, and very difficult for translators, who usually don't have the same level of knowledge (of regex, for example) as, say, a modder.

We have this example with the Spellhold cost component and non-breaking spaces, as Alywena stated. But there are many others too, like in dw_iwdspells.tra:

    @201 = ~Peut utiliser la capacit[^ ]+ de[^:]+:~ // 1-4 must exactly match the appropriate strings from the BG2 description of the bard song (other than replacing '.' with '\.')
    @202 = ~ *[Aa]ugmente \(le \)?moral [^ ]+ \(sa \)?valeur moyenne,?~
    @203 = ~ *[Pp]rotection contre la peur\( et\)?~
    @204 = ~ *[Aa]paisement de la peur\.?~
    @222 = ~Inconv[^n]+nients[^:]*:\|D[^s]+savantages[^:]*:~ // must match appropriate string from kit descriptions
    // bg1ee/bg2ee -> "Inconvénients :" <- this is a non-breakable space
    // obg2 -> "Désavantages :" <- this is an ascii space
    // correctfrbg1ee/correctfrbg2ee -> "Désavantages :" <- this is a non-breakable space
    @223 = ~Cette capacité remplace le Chant du Barde actuel.~ // must exactly match string from description of EE spcl920.spl
    @224 = ~Cette faculté remplace la chanson du barde.~ // must exactly match string from description of vanilla spcl920.spl
    @225 = ~Cette capacité remplace le Chant du barde.~ // must exactly match string from description of correctfrbg2ee spcl920.spl

I find it overly complicated, it adds tons of investigative work (BG2 alone, for example, has 4 sets of French localization: obg2, CorrectfrBG2, bg2ee and CorrectfrBG2EE, with massive differences), and we don't even know if it covers every possible case. AND regex, on top of the non-breaking space problem, does not handle special characters (éèïîêôûù, etc.), which are very common in some languages, at all.

I'd rather have something that replaces the whole entry than that. And eventually, we (the French translators) could remove the non-breaking spaces directly on unique entries to ensure compatibility with both EE and vanilla, since it's our "shit" to deal with and whether or not we have non-breaking spaces does not matter that much in the end.

Edited December 27, 2023 by Jazira
DavidW Posted December 29, 2023

Addressing the specifics (I'll comment on the more philosophical issues separately when I have a chance).

I'm still a bit unsure what the problem is with descriptions, but let me describe it more explicitly (I should probably put this in a 'note to translators' too). An SCS item description consists of
(1) some text
(2) one of two unique strings, which in English are 'Usable by:' or 'Unusable by:'
(3) some more text

On an EE install, SCS strips out (2) and (3). To handle this in a translation, you need to make sure that the unique strings, which occur in shared.tra at @100400 and @100401, occur exactly in the item descriptions. They can have whatever spaces, special characters, or the like they need; all that matters is that they occur in the item description exactly as they occur in shared.tra. Everything is internal to SCS's tra files; what the dialog.tlk file says is irrelevant.

If there's some language-specific way this could break, I can't see what it is.
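The requirement stated in this post (the marker strings must occur verbatim in every item description) lends itself to an automated check by a translator. The following Python sketch is only an illustration: the marker texts, item names and descriptions are hypothetical sample data, not the actual contents of shared.tra or any SCS file, and `find_mismatches` is an invented helper name. It lists descriptions that contain no marker verbatim and, for each, the markers that would match if non-breaking and ordinary spaces were treated alike.

```python
NBSP = "\u00a0"


def find_mismatches(markers, descriptions):
    """Return {name: near_matches} for descriptions lacking every marker verbatim.

    near_matches lists the markers that match once non-breaking spaces
    are treated as ordinary spaces.
    """
    def norm(s):
        return s.replace(NBSP, " ")

    report = {}
    for name, text in descriptions.items():
        if any(marker in text for marker in markers):
            continue  # at least one marker occurs exactly as written
        report[name] = [m for m in markers if norm(m) in norm(text)]
    return report


# Hypothetical sample data for illustration only.
markers = [f"Utilisable par{NBSP}:", f"Non utilisable par{NBSP}:"]
descriptions = {
    "ax1h01": "Hache de qualité.\nNon utilisable par :\nMage",
    "sw1h01": f"Épée de qualité.\nNon utilisable par{NBSP}:\nMage",
    "bow01": "Arc de qualité.",
}

report = find_mismatches(markers, descriptions)
for name, near in report.items():
    print(name, [m.replace(NBSP, "<NBSP>") for m in near])
```

An entry with a non-empty list differs from its marker only in the kind of space used; an entry with an empty list has no marker at all.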
DavidW Posted December 29, 2023

On the general issue Jazira raises: I certainly appreciate that swapping out descriptions can cause language problems (especially in languages that have multiple dialog.tlk files). There may be places where this can be avoided, and I'll try to keep a lookout for them. In general, though, SCS (and, even more so, ToF) does this sort of thing because it's needed for compatibility. The example Jazira gives above (where a string is matched to the bard kit description) is an example of this: it needs to patch *all* bard descriptions, including potentially descriptions added by third-party mods, to add 'cannot use additional bard songs' to the disadvantages list of the kit.