boorlakov
Posted April 10, 2023 (edited)

Hey guys! I'm a machine learning researcher, and I'm working on a language model that can talk like a non-player character (NPC) in video games. Basically, I want to teach it to talk like the characters in Baldur's Gate and Planescape: Torment.

I found a way to get the dialogue out of the game files by decompiling it with WeiDU and converting the result to JSON. The problem is that I can't figure out how to match the dialogue to the actual names of the characters in the game. I tried searching for a table that maps the filenames to the NPC names, but no luck so far.

Can you guys help me out with this? Any tips on how to extract this info from the game files would be much appreciated! Thanks a bunch!
jmerry
Posted April 10, 2023 (edited)

It's not direct. Creature files have a "name" field, which is a string reference number. The strings being referenced are in dialog.tlk, and vary based on the game's language. Different creatures can have the same name, and different creatures can also have the same dialogue. Script actions can change creatures' dialogue files as well, and joinable NPCs always have multiple dialogue files. It's not one-to-one anywhere in the process.

As for a general convention, DLG files most often have the same name as the CREs they're attached to. The exceptions come when DLG files are shared across multiple CREs, and when a CRE (usually a recruitable companion) has multiple DLG files. For the recruitable NPCs, their additional DLG files (banter, dialog while joined, dialog after leaving) are defined in 2DA files such as PDIALOG and INTERDIA.
boorlakov (Author)
Posted April 10, 2023 (edited)

I've seen references to the .tlk file in the .d files. After parsing them into JSON in the format [NPC turn] -> [Player turn], I resolved those references into strings. But I haven't seen any name field in the .d files, only things like this:

BEGIN ~SCSARLES~

So anyway, if I could get a set of possible names, that would be great! I asked because I've seen some one-to-one name conversion in a translation project, and I want to get the NPC names in a semi-automatic way. Also, many creatures don't have much dialogue with the player character, or a clear description, so I want to focus on NPCs that are described in wikis with alignment and lore.

Thanks anyway!
jmerry
Posted April 10, 2023

Of course .D/.DLG files don't have creature names attached to them. There's nothing stopping you from attaching the same dialogue to multiple creatures with different names, after all. For example, Ashatiel's followers in SoD (Wormgums, Small Kimble, Marius of Tethyr, Zoe Kryn, Sarginson, and Balvin Steadyhand) all share the same BDCRUS.DLG dialogue file. And they're not even the only creatures with that dialogue.

The .d format also supports modifying multiple DLGs in a single file; while you won't have this if you're generating .d files by decompiling .dlg files, it's extremely common in the mods the .d format was designed to enable.

If you're going to try to connect dialogue to creature names, you'll have to connect dialogue to creatures and creatures to names. Overlap is possible at both stages.
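The two-stage join described above could look something like the sketch below. It assumes you already have one row per creature of the form (creature resref, dialogue resref, resolved name); the function names and the creature resrefs in the example are hypothetical placeholders, and only the BDCRUS dialogue name and the follower names come from this thread.

```python
from collections import defaultdict

def names_by_dialogue(creatures):
    """Group resolved creature names under the dialogue file they point at.

    creatures: iterable of (cre_resref, dialog_resref, name) tuples.
    Returns {DIALOG_RESREF: sorted list of distinct names}.
    """
    grouped = defaultdict(set)
    for _cre, dialog, name in creatures:
        # Skip creatures with no dialogue or no usable name.
        if dialog and name:
            grouped[dialog.upper()].add(name)  # resrefs are case-insensitive
    return {dialog: sorted(names) for dialog, names in grouped.items()}

def unique_speakers(mapping):
    """Keep only dialogue files that resolve to exactly one name."""
    return {dialog: names[0] for dialog, names in mapping.items() if len(names) == 1}

# Hypothetical rows; the creature resrefs here are placeholders.
rows = [
    ("CRE_A", "BDCRUS", "Wormgums"),
    ("CRE_B", "BDCRUS", "Small Kimble"),
    ("CRE_C", "BDCRUS", "Zoe Kryn"),
]
shared = names_by_dialogue(rows)
```

For training data where each line needs a single identifiable speaker, filtering with `unique_speakers` is one conservative option: shared files such as BDCRUS drop out rather than being attributed to an arbitrary creature.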
boorlakov (Author)
Posted April 10, 2023

Thank you for your replies, I'm going to give it a try.