Alternatives to WeiDU? Or, how to write a smarter sound set installer?

Taylan · June 23, 2021

WeiDU is a monster, both in a good and a bad way...

The good thing about it, I suppose, is that you don't need to be a programmer to be able to understand and modify TP2 files. While TP2 is technically a programming language (I'm 99% sure it's Turing complete), it's not really a programming language in the conventional sense and includes a massive amount of convenience commands that are fine-tuned to the task of patching IE game files.

The bad thing is, since it's not a proper programming language, you will easily hit very annoying limitations where a little bit of custom code in a proper language would have solved the issue.

Currently I'm trying to write a sound set installer that will work for all game types: since the first column in CHARSND.2da specifies which sound the row is for, it should be possible for the script to fully automate the patching of CHARSND.2da based on the files you put in the SND directory of your sound set mod. For instance, if I put the files MySound6.wav, MySound9.wav, and MySound20.wav into the SND directory, and I supply a TRA file that defines @6, @9, and @20, then the installer script should automatically patch CHARSND.2da the right way, inserting the values from the TRA into the rows identified by 6, 9, and 20.
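A minimal sketch of that discovery step, written in Python rather than TP2 and assuming the file names end in the row number as in the example above. The function name `find_slots` and the prefix argument are made up for illustration; nothing here touches CHARSND.2da itself.

```python
import re

def find_slots(file_names, prefix):
    """Map CHARSND.2da row numbers to WAV names of the form <prefix><number>.wav."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.wav", re.IGNORECASE)
    slots = {}
    for name in file_names:
        match = pattern.fullmatch(name)
        if match:
            slots[int(match.group(1))] = name
    return slots

# Files that do not follow the naming scheme are simply ignored
slots = find_slots(
    ["MySound6.wav", "MySound9.wav", "MySound20.wav", "readme.txt"], "MySound"
)
print(sorted(slots))  # [6, 9, 20]
```

The resulting dictionary is what would drive the patching: one row per key, with the matching TRA entry supplying the subtitle.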

I've been reading the WeiDU documentation to figure out whether this is possible, but it seems very difficult. The documentation is also not very well organized I'm afraid. (A single massive table of alphabetically sorted commands is not very nice for navigation, or searching for commands that are useful for a particular purpose, like 2da patching.)

So I wonder: is there an alternative to WeiDU that's more for programmers? I've been thinking something like a Python library for instance, that contains a bunch of classes and functions that more or less correspond to all those useful WeiDU commands, but it would allow me to read a 2da file into a Dictionary object, so I can modify it with regular Python code, then tell the library to turn it back into a 2da. (I'm not actually a Python programmer, but I think Python would be a good fit for this.)
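To make the idea concrete, here is a rough sketch of the "2da to dictionary and back" round trip in plain Python. It assumes the usual plain-text 2DA layout (a signature line, a default-value line, a column header line, then one row per line starting with the row label) and that the default value stands in for missing cells. The function names and the sample table are invented for illustration and are not taken from any existing library.

```python
def parse_2da(text):
    """Parse 2DA text into (signature, default, columns, rows)."""
    lines = [line for line in text.splitlines() if line.strip()]
    signature, default = lines[0].strip(), lines[1].strip()
    columns = lines[2].split()
    rows = {}
    for line in lines[3:]:
        label, *values = line.split()
        # Assumption: cells missing at the end of a row take the default value
        values += [default] * (len(columns) - len(values))
        rows[label] = dict(zip(columns, values))
    return signature, default, columns, rows

def format_2da(signature, default, columns, rows):
    """Turn the parsed structure back into 2DA text."""
    lines = [signature, default, " ".join(["", *columns])]
    for label, row in rows.items():
        lines.append(" ".join([label, *(row.get(c, default) for c in columns)]))
    return "\n".join(lines) + "\n"

# Made-up sample table, not real game data
SAMPLE = """\
2DA V1.0
-1
     VOICE_A  VOICE_B
6    100      200
9    101
"""

signature, default, columns, rows = parse_2da(SAMPLE)
columns.append("MYSOUND")
new_refs = {"6": "3001"}  # hypothetical string reference for row 6
for label, row in rows.items():
    row["MYSOUND"] = new_refs.get(label, default)
patched = format_2da(signature, default, columns, rows)
```

Once the table is an ordinary dictionary keyed by row label, adding a column is a plain loop. The formatter does not try to align the columns the way PRETTY_PRINT_2DA does.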

Sometimes I wish I was rich and unemployed. I would love to create something like that myself, or fork WeiDU to create a more powerful variant that supports a proper programming language.

If you don't know of any tool/library like that, maybe you can help me brainstorm how a smart sound set installer can be written in the TP2 format.

EDIT: Looks like someone beat me to it when it comes to writing a smart sound set installer:

https://forums.beamdog.com/discussion/82008/

https://github.com/GraionDilach/Planescape-Torment-Voice-Pack-for-EE-2.6

But that kinda proves my point when you look at the code. It's so over-complicated, defining huge tables and all. (Maybe there's still a better way to do it, dunno. Code clarity might not have been the top priority of the author.) And with that much complexity, you may as well use a regular programming language anyway.

Edited June 23, 2021 by Taylan

subtledoctor · June 23, 2021

I believe some people (@AL|EN and @Aquadrizzt) have put some work into something like that. But I don’t know much about it and don’t want to spoil anything in case it’s more than just a gleam in their eye.

Cutting to the chase, though, my broad takeaway from discussions around that subject is that any replacement for WeiDU would likely be better, but would likely take years to get to the point of being a viable replacement. Not to mention you would want to include legacy support for .tp2 scripts.

kjeron · June 23, 2021

You can give this a try. I can't guarantee I have all of the legacy slots accurate (the char_slot array), but it should take a folder of WAV files suffixed 01-99 and assign them to the rows 1-99 that are available in CHARSND.2DA. It takes a separate TRA file, named the same as the WAV files without the number suffix, with entries @01 - @99 (entries are only required for those that have a matching WAV file). It may not work as is for the gendered languages.

DEFINE_ACTION_FUNCTION  ADD_SND  INT_VAR  label = 0  STR_VAR  res = ~~  res_path = ~~  tra_path = $res(path)  BEGIN
	ACTION_DEFINE_ASSOCIATIVE_ARRAY  char_slot BEGIN
		09=>A  06=>B  07=>C  08=>D  20=>E  26=>F 27=>G  28=>H  32=>I  33=>J  34=>K  18=>L  19=>M  21=>N  22=>O  23=>P 24=>Q  25=>R  75=>S  76=>T  77=>U  78=>V  53=>W
		63=>X  64=>Y  65=>Z  29=>0  66=>1  67=>2  68=>3  69=>4  70=>5  71=>6  72=>7  10=>8  11=>9
	END
	ACTION_CLEAR_ARRAY  CHECK  ACTION_CLEAR_ARRAY  WRITE
	LOAD_TRA  ~%tra_path%/%res%.tra~  OUTER_SPRINT  col ~$ $ %res%~
	COPY_EXISTING  ~CHARSND.2DA~  override  READ_2DA_ENTRIES_NOW  READ 3
		FOR  (i = 1; i < READ; ++i)  BEGIN  SET  $CHECK($READ(~%i%~ 0)) = 0  END
	BUT_ONLY
	ACTION_PHP_EACH  CHECK  AS x => y  BEGIN
		OUTER_WHILE (STRING_LENGTH ~%x%~) < 2  BEGIN  OUTER_SPRINT x ~0%x%~  END
		ACTION_IF  FILE_EXISTS ~%res_path%/%res%%x%.WAV~  BEGIN
			ACTION_MATCH  ~%x%~  WITH
				09 06 07 08 20 26 27 28 32 33 34 18 19 21 22 23 24 25 75 76 77 78 53 63 64 65 29 66 67 68 69 70 71 72 10 11
					BEGIN  OUTER_SPRINT dest $char_slot(~%x%~)  OUTER_SPRINT  dest_path ~sounds~  END
				DEFAULT  OUTER_SPRINT dest ~%x%~              OUTER_SPRINT  dest_path ~override~
			END
			COPY  ~%res_path%/%res%%x%.WAV~ ~%dest_path%/%res%%dest%.WAV~  SPRINT  text (AT x)  SET  $WRITE(~%x%~) = RESOLVE_STR_REF (~%text%~ [%res%%dest%])
		END  ELSE  BEGIN
			OUTER_SET  $WRITE(~%x%~) = ~-1~
		END
	END
	ACTION_PHP_EACH  WRITE  AS  x => y  BEGIN  OUTER_SPRINT   col ~%col% %y%~  END
	APPEND_COL ~CHARSND.2DA~ ~%col%~  COPY_EXISTING ~CHARSND.2DA~ override PRETTY_PRINT_2DA
	APPEND  ~BGEE.LUA~  ~filenames_stringrefs['%res%'] = {%label%, 2}~
END

LAF	ADD_SND INT_VAR label = RESOLVE_STR_REF (~Male: Test Voice~) STR_VAR res = ~TEST~ res_path = EVAL ~%MOD_FOLDER%/SOUNDS~ tra_path = EVAL ~%MOD_FOLDER%/LANG~ END
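For readers who do not speak TP2, the slot-naming part of the function above can be paraphrased in Python roughly as follows. This is only a sketch of the naming rule (legacy slots get a single-character suffix and go to the sounds folder, everything else keeps its two-digit number and goes to override). The mapping is copied from the char_slot array, which may contain a few best guesses as noted, and the function name `destination` is made up.

```python
# Legacy slot suffixes, copied from the char_slot array in the TP2 function above
CHAR_SLOT = {
    9: "A", 6: "B", 7: "C", 8: "D", 20: "E", 26: "F", 27: "G", 28: "H",
    32: "I", 33: "J", 34: "K", 18: "L", 19: "M", 21: "N", 22: "O", 23: "P",
    24: "Q", 25: "R", 75: "S", 76: "T", 77: "U", 78: "V", 53: "W", 63: "X",
    64: "Y", 65: "Z", 29: "0", 66: "1", 67: "2", 68: "3", 69: "4", 70: "5",
    71: "6", 72: "7", 10: "8", 11: "9",
}

def destination(res, slot):
    """Return (folder, file name) for the WAV that belongs to a CHARSND row."""
    if slot in CHAR_SLOT:
        # Legacy slot: fixed single-character suffix, installed to sounds
        return "sounds", f"{res}{CHAR_SLOT[slot]}.WAV"
    # Any other row: keep the zero-padded row number, installed to override
    return "override", f"{res}{slot:02d}.WAV"

print(destination("TEST", 9))  # ('sounds', 'TESTA.WAV')
print(destination("TEST", 1))  # ('override', 'TEST01.WAV')
```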

Magus · June 23, 2021

As @subtledoctor said, you'd need to make the tool backwards compatible with WeiDU. (Not necessarily re-implementing TP2, but at least installation/uninstallation routines).

We briefly discussed this with @AL|EN a year or two ago. In general, I would say it's possible. However, if your only goal is to package soundsets, it's 100% not worth it. Just bite the bullet and learn WeiDU (and just look for ready-made functions; almost any idea has already been implemented in one form or another).

But if you're really serious about development, and have time to spare, the place to start is iesh.

Jarno Mikkola · June 23, 2021

49 minutes ago, Magus said:

(Not necessarily ... but at least... uninstallation routines).

And you would need this for exactly WHY ?

PS: The BWS does fine with "let's smoke all the files that are extra and overwrite the few that need to be". The uninstall takes about 20 MBs space for a game that can be 5 to 15 GBs with everything installed, biffed and so forth.

But back to the point, which is: Why would you need to uninstall sound files ? Did downloading make you stupi...

suy · June 23, 2021

15 hours ago, Taylan said:

So I wonder: is there an alternative to WeiDU that's more for programmers? I've been thinking something like a Python library for instance, that contains a bunch of classes and functions that more or less correspond to all those useful WeiDU commands, but it would allow me to read a 2da file into a Dictionary object, so I can modify it with regular Python code, then tell the library to turn it back into a 2da. (I'm not actually a Python programmer, but I think Python would be a good fit for this.)

Take a look at iesh (https://github.com/gemrb/iesh). Maybe it's enough to get something started for your use case, but it wasn't created for making mods.

There are several projects that have implemented parsing of some file formats of the engine, and some can also write them back. But I've seen no mod (yet) using any of those projects.

Yes, WeiDU is fairly bad, but it is what we have so far.

Jarno Mikkola · June 23, 2021

1 hour ago, suy said:

But I've seen no mod (yet) using any of those projects.

Here, Infinity Engine on steroids on a... say, (jailbroken) iPhone. You are not looking at the bird, but its genes are there.

suy · June 23, 2021

I know of GemRB. I meant a mod. Something that programmatically alters the game data files, like WeiDU.

Graion Dilach · June 23, 2021

17 hours ago, Taylan said:

EDIT: Looks like someone beat me to it when it comes to writing a smart sound set installer:

https://forums.beamdog.com/discussion/82008/

https://github.com/GraionDilach/Planescape-Torment-Voice-Pack-for-EE-2.6

But that kinda proves my point when you look at the code. It's so over-complicated, defining huge tables and all. (Maybe there's still a better way to do it, dunno. Code clarity might not have been the top priority of the author.) And with that much complexity, you may as well use a regular programming language anyway.

Hey, that's my thread!

Code quality is actually a priority in that project - it is based on https://github.com/Gibberlings3/EE_soundset_tool which is supposed to be the tutorial behind it - so could you explain the issues you have with it? I understand that there are some complications in it (and we just discussed this morning in the G3 Discord how to abstract away the prettyname additions with an inlined plaintext and normalize this; I'll clean that up this week as soon as I find time for it and try to PR that into the tutorial mod). It is my first mod for the Infinity Engine though, so I can imagine some of that could be even better (esp. considering that the PST pack was what I started with, I'm realllllly not proud of some of the commits in its history).

The main aim there is to expose the least amount of mess to the novice modder, though, and the fact that this starts traified (and that I include the optional prettyname UI edit, which exposes a human-friendly name in the selection menu instead of the internal filename) requires some complication. The strategy is to minimize the tp2 content to just what the soundset creator needs to declare and to hide everything else in the tpa macro (again, I know that the prettyname function violates this, but that's because I've always seen it as a messy haxxx workaround and I didn't know until this morning how to fix it properly).

Sure, this can be even simpler; you don't even need WeiDU for soundmods. WeiDU only becomes necessary if you want to use a sound slot which doesn't have a single-character suffix and/or you want subtitle support.

Yes, WeiDU isn't really a common language, but I don't hold that against it. I mean, you bring Python up and I couldn't stand that one at all back at the uni (disclaimer: I work as a Linux admin). In fact, back at the uni I had neural networks as a course and working with WeiDU somewhat reminds me of that... both need a different mindset compared to the more common and widely-known languages. It's still fairly human-readable for the most part, so I guess it's still better than dealing with the formats in binary (and it uses regexp, which can be helpful in the long term).

EDIT: Oh, @kjeron just FYI, 2.6 heavily expanded CHARSND.2DA. It has more than 70 elements by now, although some of them are padding, but basically the NPC slots are all available and listed now.

Edited June 23, 2021 by Graion Dilach

kjeron · June 23, 2021

41 minutes ago, Graion Dilach said:

EDIT: Oh, @kjeron just FYI, 2.6 heavily expanded CHARSND.2DA. It has more than 70 elements by now, although some of them are padding, but basically the NPC slots are all available and listed now.

Yes, my code will fill in any of the 99 rows it may have for which a WAV and tra entry are provided (even if some are still unused by the game). I don't see it expanding past 99, as that would require reformatting the CRE structure.

critto · June 23, 2021

Taylan, check out SCS. It uses Perl to do a lot of heavy lifting when it comes to code generation. IIRC, the Perl binary is stand-alone and packaged with the mod, at least for the Windows version. Linux/OS X usually have it by default at /usr/bin/perl or something. You can similarly use Perl or another light-weight portable language (Lua comes to mind) to code custom logic.
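As a rough illustration of the code-generation idea (this is not how SCS itself does it), a small script in another language can write out plain TP2 actions that the main installer then includes. The sketch below uses Python and emits one COPY action per sound file. The function name, the folder and the file names are all hypothetical.

```python
def generate_copy_actions(mod_folder, files):
    """Return TP2 text with one COPY action per (source, destination) pair."""
    lines = [
        f"COPY ~{mod_folder}/{source}~ ~override/{dest}~"
        for source, dest in files
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file names; the generated text would be saved to a file
# that the mod's TP2 includes
snippet = generate_copy_actions(
    "mymod/snd",
    [("greeting1.wav", "MYSETA.wav"), ("battlecry1.wav", "MYSETB.wav")],
)
print(snippet)
```

The custom logic (deciding which file goes where) lives in the general-purpose language, while WeiDU still performs the actual installation and keeps its uninstall bookkeeping.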

Taylan · June 24, 2021

Wow, this thread got more attention than I expected.

Indeed iesh sounds like what I had in mind. Funny that it's indeed in Python! Guess I'll look into it in the future, but if it requires a lot of extra development to be WeiDU compatible then I'm not sure if/when I could deliver. Too many hobby projects I'm working on already...

On 6/23/2021 at 4:09 PM, kjeron said:

You can give this a try. [...]

That code is... monstrous. I could try it out, but honestly I'd feel terrible about using it. In any decent programming language this task wouldn't be nearly so complicated.

On 6/23/2021 at 7:03 PM, Graion Dilach said:

Hey, that's my thread!

[...]

Note that my criticism isn't necessarily aimed at the author of the code. I'm no TP2 expert, so for all I know, that might be the cleanest possible way to write that code in TP2.

But let's think about how else that code could look, and how sound set "packages" for our theoretical code could look, if we had a full general-purpose programming language at our disposal.

Let's start from the end goal of how we want our sound set packages to look. In essence, a sound set consists of:

1. Name of the sound set.
2. A set of WAV files.
3. A mapping from WAV files to subtitle strings.
4. A mapping from sound IDs (greeting1, battle_cry3, etc.) to WAV files.

Ideally, that's ALL our package would contain. As you said: we want to hide all complexity from the person creating sound set packages.

(Digression: We could "encode" the information mentioned in point 4 directly into the names of the WAV files --i.e., the name of each WAV file would be the sound ID it's for-- but that would force us to duplicate WAV files when we want to use the same WAV for multiple sound IDs. For instance, if we only have 2 battle cry WAV files but the game supports 3 battle cry sounds, then we might use one of the two battlecry WAV files for 2 of the 3 battlecry sound IDs to fill the gap. Therefore, let's scrap the idea of sound IDs as filenames and say that our WAV files can be named however we want; the package will contain an explicit mapping from sound ID to file name.)

Our sound set packages could be zipped directories like:

my_sound_pack
|- greeting1.wav
|- greeting2.wav
|- battlecry1.wav
|- battlecry2.wav
|- ...
|- definitions.txt (file containing the mappings)
|- installer.py (theoretical Python installer)

Where the file definitions.txt contains text like the following:

[config]
name = MyCustomSoundSet
language = en_us

[sounds]
greeting1 = greeting1.wav
greeting2 = greeting2.wav
greeting3 = greeting2.wav
battlecry1 = battlecry1.wav
battlecry2 = battlecry1.wav
battlecry3 = battlecry2.wav
...

[subs]
greeting1.wav = Yes?
greeting2.wav = What is it?
battlecry1.wav = You will fall by my hand!
battlecry2.wav = No mercy for enemies!
...

And that's... literally all "code" the sound set packager would have to write!

(The installer.py in this example is the readily provided sound set installer, like your TPH. The sound set packager doesn't have to touch it.)
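A definitions file in that shape can be read with nothing but Python's standard library. The sketch below parses a shortened copy of the example and checks that every WAV named under [sounds] also has a subtitle under [subs]. The helper name `load_definitions` and the consistency check are additions for illustration, not part of the proposal above.

```python
import configparser

# Shortened copy of the definitions.txt example
DEFINITIONS = """\
[config]
name = MyCustomSoundSet
language = en_us

[sounds]
greeting1 = greeting1.wav
greeting2 = greeting2.wav
greeting3 = greeting2.wav

[subs]
greeting1.wav = Yes?
greeting2.wav = What is it?
"""

def load_definitions(text):
    """Return (name, sounds, subs, missing) for one sound set definition."""
    # interpolation=None keeps a literal '%' in subtitle text from being special
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive instead of lowercasing them
    parser.read_string(text)
    sounds = dict(parser["sounds"])
    subs = dict(parser["subs"])
    # WAV files that are used by a sound ID but have no subtitle
    missing = sorted(set(sounds.values()) - set(subs))
    return parser["config"]["name"], sounds, subs, missing

name, sounds, subs, missing = load_definitions(DEFINITIONS)
print(name, missing)  # MyCustomSoundSet []
```

Note that greeting2.wav is used for two sound IDs but needs only one subtitle entry, which is the point of keeping the two mappings separate.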

Sounds unrealistic? It isn't. I wrote the definitions.txt in an INI format, which Python can parse with its configparser library. The following is an example of how the code could look, if we had a Python library called "weidu" which provides some of the WeiDU functionality in the form of Python classes.

import configparser
import glob
import weidu  # theoretical library, does not exist (yet)

# Initialize our theoretical WeiDU library
# This would do some magic to find the game files and such
w = weidu.Init()

# Define mapping from human-readable sound IDs to game-specific integers
if w.game == 'bgee':
  soundIds = {
    'greeting1': 1,
    'greeting2': 2,
    'greeting3': 3,
    'battlecry1': 4,
    'battlecry2': 5,
    # ...
  }
elif w.game == 'bg2ee':
  soundIds = {
    # ...
  }
# elif ...

# Inverse mapping, so a row's integer can be turned back into a sound ID
soundNames = {intId: soundId for soundId, intId in soundIds.items()}

# Read the definitions for the sound set
defs = configparser.ConfigParser()
defs.read('definitions.txt')

soundSetName = defs['config']['name']
language = defs['config']['language']
sounds = defs['sounds']
subs = defs['subs']

# Copy the WAV files to where they belong
for file in glob.glob('./*.wav'):
  w.installSoundFile(file, language)

# In this dictionary we will save mappings from WAV file name to string reference
strRefs = {}

# Open the TLK file for our language
tlk = w.openTLK(language)

for fileName, text in subs.items():
  # Add the string to the game, receiving its numeric ref
  strRef = tlk.addString(text, sound=fileName)
  # Remember what WAV file name it belongs to
  strRefs[fileName] = strRef

# Save the modified TLK file
tlk.save()

# Let's say the function parse2DA gives us a "2DA object"
charsnd = w.parse2DA('CHARSND')

# We will build up a new column to add to the 2DA
newCol = []

# The 2DA object lets us get the rows via the rows property
# And each row in turn has a cols property
for row in charsnd.rows:
  # The sound's game-specific integer id is in the first column
  intId = int(row.cols[0])
  # Get the human-readable sound ID for that integer
  soundId = soundNames[intId]
  # Get the WAV file name for that sound ID
  fileName = sounds[soundId]
  # Get the string ref for that WAV file name
  strRef = strRefs[fileName]
  # Add that to the column we're building
  newCol.append(strRef)

# The function .addColumn takes a column title and the list of values for it
charsnd.addColumn(soundSetName, newCol)

# Install the modified 2DA file into the game directory
charsnd.installOverride()

Did I miss any steps? I think I just spent about 2-3 hours writing that, but when I began I didn't even know how string references work so that time includes learning about TLK. Also had to Google some Python stuff as I'm no regular Python programmer.
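One step the listing leaves open is what to write for a row when the sound set has no WAV for it. The sketch below isolates the column-building loop as a plain function over ordinary dictionaries so it can be tried without any game files. It falls back to -1 for missing entries, which is the value the TP2 function earlier in the thread writes; the function name and the sample string references are made up.

```python
def build_column(row_ids, sound_ids, sounds, str_refs, missing=-1):
    """Return one value per CHARSND row: a string ref, or `missing` if there is no sound."""
    # sound_ids maps name -> integer, so invert it for lookups by row integer
    names_by_int = {int_id: name for name, int_id in sound_ids.items()}
    column = []
    for int_id in row_ids:
        file_name = sounds.get(names_by_int.get(int_id))
        column.append(str_refs.get(file_name, missing))
    return column

sound_ids = {"greeting1": 1, "greeting2": 2, "greeting3": 3}
sounds = {"greeting1": "greeting1.wav", "greeting3": "greeting2.wav"}
str_refs = {"greeting1.wav": 3001, "greeting2.wav": 3002}  # made-up refs

# Row 2 has no WAV in this set and row 4 is unknown to the mapping
print(build_column([1, 2, 3, 4], sound_ids, sounds, str_refs))  # [3001, -1, 3002, -1]
```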

Note that I've written the code in a super friendly way with lots of comments. A more condensed form might look like:

import configparser
import glob
import weidu

w = weidu.Init()

if w.game == 'bgee':
  soundIds = {
    # ...
  }
elif w.game == 'bg2ee':
  soundIds = {
    # ...
  }
# elif ...

# Inverse mapping: integer in the 2DA -> human-readable sound ID
soundNames = {intId: soundId for soundId, intId in soundIds.items()}

defs = configparser.ConfigParser()
defs.read('definitions.txt')

soundSetName = defs['config']['name']
language = defs['config']['language']
sounds = defs['sounds']
subs = defs['subs']

for file in glob.glob('./*.wav'):
  w.installSoundFile(file, language)

strRefs = {}

tlk = w.openTLK(language)
for fileName, text in subs.items():
  strRefs[fileName] = tlk.addString(text, sound=fileName)
tlk.save()

charsnd = w.parse2DA('CHARSND')
newCol = []
for row in charsnd.rows:
  soundId = soundNames[int(row.cols[0])]
  fileName = sounds[soundId]
  newCol.append(strRefs[fileName])
charsnd.addColumn(soundSetName, newCol)
charsnd.installOverride()

This assumes one sound set package = one sound set. But it would be fairly easy to allow many sound sets per package. And that still without requiring the packager to modify the Python code. For example every sound set could reside in its own sub-directory, with the WAV files plus the definitions.txt file, and the Python code would be modified to wrap all the code above into a function (like in your TPH) and that function would be called once per sub-directory in the sound set package.
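A sketch of that per-sub-directory loop, using only the standard library. It assumes each sound set lives in its own sub-directory next to a definitions.txt; `find_sound_sets`, `install_package` and the `install_one` callback are hypothetical names, with the callback standing in for the installer logic wrapped into a function.

```python
from pathlib import Path

def find_sound_sets(package_dir):
    """Return the sub-directories that contain a definitions.txt, in name order."""
    return sorted(
        path
        for path in Path(package_dir).iterdir()
        if path.is_dir() and (path / "definitions.txt").is_file()
    )

def install_package(package_dir, install_one):
    """Call install_one(directory) once per sound set and collect the results."""
    return [install_one(path) for path in find_sound_sets(package_dir)]
```

A package with sub-directories `a_set` and `b_set` would then be installed with a single call such as `install_package("my_sound_pack", install_one)`, and directories without a definitions.txt are skipped.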

Tell me that's not a lot nicer!

All the pseudo-functions I've used above from the made-up "weidu" Python library should be easy to implement, either by using code from iesh, or with a bit of manual coding.

Edited June 24, 2021 by Taylan
Put long code snippets in spoilers.

kjeron · June 24, 2021

1 hour ago, Taylan said:

Tell me that's not a lot nicer!

It's really not. I think you're mistaking IE compatibility requirements for WeiDU complexity.

On any game prior to v2.6, one sound file per slot is mandatory, as their suffix is fixed; handling that is about 30% of the code I posted (since your original post had them named with other suffixes). Storing the original files by row# is actually very practical, especially now.

Edited June 24, 2021 by kjeron

Taylan · June 24, 2021

1 hour ago, kjeron said:

On any game prior to v2.6, one sound file per slot is mandatory, as their suffix is fixed, and which is about 30% of the code I posted (since your original post had them named with other suffixes). Storing the original files by row# is actually very practical, especially now.

Hmm, help me understand if I got this right: for <=2.5 compatibility, the WAV files need to have a suffix of one character (letter/digit/symbol) in their name, is that right? And the character is equivalent to one of the integer sound IDs, as per the mapping in your code?

If I understood correctly, then I think that's very easy to resolve in my code.

First of all I'd make sure that my imaginary installSoundFile() function supports an optional named argument withName=... which tells the function to rename the file when installing it in the game directory. (That would be a very simple addition to its implementation, like 2-3 lines.)
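To show how small that addition is, here is a stand-alone sketch of such a function built on shutil. It is not the imaginary library's real implementation: the snake_case name, the explicit destination directory (instead of a language argument) and the rule of reusing the source extension when the new name has none are all assumptions made for this example.

```python
import shutil
from pathlib import Path

def install_sound_file(source, dest_dir, with_name=None):
    """Copy a WAV into dest_dir, optionally under a different name."""
    source = Path(source)
    target_name = with_name or source.name
    if not Path(target_name).suffix:
        # A bare name such as "MYSETA" inherits the source file's extension
        target_name += source.suffix
    target = Path(dest_dir) / target_name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target
```

Calling it twice with the same source and two different `with_name` values yields two copies, which is how one WAV can fill several fixed-suffix slots.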

Then at the start, I'd define the mapping from integer ID to character, just like in your code, just with Python syntax:

soundIdChars = {
  9: 'A', 6: 'B', 7: 'C',
  # ...
}

Then, drop the for-loop with the glob, and instead move the installation of the WAV files into the for-loop over charsnd.rows, like this:

for row in charsnd.rows:
  # Row IDs are compared as integers, matching the keys of soundIdChars
  intId = int(row.cols[0])
  # soundNames is the inverse of soundIds (integer -> human-readable sound ID)
  soundId = soundNames[intId]
  fileName = sounds[soundId]
  # Use a <=2.5 compatible name if possible
  if intId in soundIdChars:
    name = soundSetName + soundIdChars[intId]
  else:
    name = fileName
  w.installSoundFile(fileName, language, withName=name)
  strRef = strRefs[fileName]
  newCol.append(strRef)

Honestly I'm liking this idea so much that I might try to implement those few imaginary "weidu" Python functions over the weekend and see if I can get a super simple sound set package installer like this working.

Edit: Full updated code:

import configparser
import weidu

w = weidu.Init()

if w.game == 'bgee':
  soundIds = {
    # ...
  }
elif w.game == 'bg2ee':
  soundIds = {
    # ...
  }
# elif ...

# Inverse mapping: integer in the 2DA -> human-readable sound ID
soundNames = {intId: soundId for soundId, intId in soundIds.items()}

soundIdChars = {
  9: 'A', 6: 'B', 7: 'C',
  # ...
}

defs = configparser.ConfigParser()
defs.read('definitions.txt')

soundSetName = defs['config']['name']
language = defs['config']['language']
sounds = defs['sounds']
subs = defs['subs']

strRefs = {}

tlk = w.openTLK(language)
for fileName, text in subs.items():
  strRefs[fileName] = tlk.addString(text, sound=fileName)
tlk.save()

charsnd = w.parse2DA('CHARSND')
newCol = []
for row in charsnd.rows:
  intId = int(row.cols[0])
  soundId = soundNames[intId]
  fileName = sounds[soundId]
  # Use a <=2.5 compatible name if possible
  if intId in soundIdChars:
    name = soundSetName + soundIdChars[intId]
  else:
    name = fileName
  w.installSoundFile(fileName, language, withName=name)
  strRef = strRefs[fileName]
  newCol.append(strRef)
charsnd.addColumn(soundSetName, newCol)
charsnd.installOverride()

Edited June 24, 2021 by Taylan

kjeron · June 24, 2021

13 minutes ago, Taylan said:

Hmm, help me understand if I got this right: for <=2.5 compatibility, the WAV files need to have a suffix of one character (letter/digit/symbol) in their name, is that right? And the character is equivalent to one of the integer sound IDs, as per the mapping in your code?

Yes, though as I originally mentioned I don't know if my mapping is 100% accurate, one or two were a best guess.
