
diff tool for IE formats?


critto


Hmm... I was more referring to the fact that it outputs the diff in human-readable form which is immensely useful by itself for debugging and sanity checks (without losing your own sanity reading hex diffs between files or messing around in GUI). It was part of my plan to add offsets to the ieparse.py's output somehow :) Didn't realise it already does that (should've paid more attention to your commit log example).

 

How can I enable output of the offsets? I played around with diff and ieparse, but haven't figured it out yet. I suspect there may be some command line option, but it's definitely not in iediff, since it appears to pass any arguments to the actual diff command.

Link to comment

Thanks. I've managed to build and run it on Mac OS X:

 

 

MBP-Mac-2:gemrb-ielister critto$ g++ ielister.cpp -o ielister -pedantic -Wall
ielister.cpp:3173:5: warning: variable 'controlcount' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
        if(winoffset==os.fileoffset)
           ^~~~~~~~~~~~~~~~~~~~~~~~
ielister.cpp:3194:31: note: uninitialized use occurs here
        int *ctable = (int *) malloc(controlcount*sizeof(int));
                                     ^~~~~~~~~~~~
ielister.cpp:3173:2: note: remove the 'if' if its condition is always true
        if(winoffset==os.fileoffset)
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
ielister.cpp:3163:28: note: initialize the variable 'controlcount' to silence this warning
        int ctoffset, controlcount;
                                  ^
                                   = 0
1 warning generated.

Not sure if the warning would lead to something catastrophic, but the tool appears to be functioning properly. I'll try to use them both.

And you were right, the offset info is indeed already present in ieparse.py, so no need for further hacking.

Link to comment

Something's wrong with ieparse.py's output when it displays the structure of an ITM v1 file. For all four kit usability bytes, whenever bits are set, the mask names from the first byte are shown. Example (that's iediff's output):

0x002F Kit usability 4: 6 (Cleric of Helm|Cleric of Lathlande 0x002F Kit usability 4: 6 (Cleric of Helm|Cleric of Lathlande

Offset 0x2F does not contain those masks. In fact, these are the bits for Wizardslayer & Kensai. They are set and displayed correctly, but the labels are wrong. It's been a while since I wrote something in Python. If you can offer a quick solution, that'd be great. Otherwise, I'll see if I can get it figured out later.

 

Format.py, most probably line 546. A single byte is read, and when a lambda function iterates over the set bits, it concatenates some of the first eight records from the kit_usability_mask dictionary (or is it a hash?). The structure is not processed as a long. Either the structure of the format should be changed, or a quick hack of kit_usability_mask_1 / kit_usability_mask_2 / etc. should be implemented somehow.

Link to comment

You're right, splitting the mask into 4 pieces is the simplest way (just reassign them lower in the format description in itm_v1.py). I doubt it's worth adding an extra structural member just to specify the mask byte offset.
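
A minimal sketch of the per-byte split discussed here, not the actual itm_v1.py code. The table names and `decode_mask` are hypothetical, and only the bits mentioned in this thread are filled in (the bit order for byte 4 is an assumption); the real tables would have eight entries per byte.

```python
# Hypothetical per-byte label tables; only the bits named in this thread are listed.
KIT_USABILITY_MASK_1 = {0x02: "Cleric of Helm", 0x04: "Cleric of Lathander"}
KIT_USABILITY_MASK_4 = {0x02: "Wizardslayer", 0x04: "Kensai"}  # bit order assumed


def decode_mask(value, labels):
    """Return '|'-joined labels for every set bit of a single byte."""
    names = []
    for bit in range(8):
        flag = 1 << bit
        if value & flag:
            names.append(labels.get(flag, "unknown bit %d" % bit))
    return "|".join(names)


# The same raw value needs a different table depending on which byte it came from.
print(decode_mask(6, KIT_USABILITY_MASK_1))
print(decode_mask(6, KIT_USABILITY_MASK_4))
```

Keeping one table per byte means the decoder never needs to know the byte's position inside the 32-bit field, which is why no extra "mask byte offset" member is required.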

Link to comment

Yes, indeed. Now it functions properly. By the way, bit 5 of the Flags DWORD in the same ITM v1 format is labeled as an unknown bit, although it is the Copyable bit according to IESDP.

 

Another question, now about ielister. Is there a simple way to teach it to print the size of the record located at a specific offset? Its output is better suited as a base for a generator of WeiDU patch sequences, which I want to build in order to process my files and turn them into WeiDU-style patches. Having the size of each record (byte, short, etc.) would be helpful so that I don't have to re-create the same sort of dictionary that is listed in itm_v1.py et al. I'll try to figure something out, but it's been even longer since I wrote something serious in C++.

 

EDIT: OK, disregard that, I haven't woken up yet. The actual value written at the offset already lets me figure out its length.
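
A rough sketch of that idea, assuming the dumped value is zero-padded hex whose digit count reflects the field width (2, 4 or 8 digits). The helper name `write_command` and the lookup table are hypothetical; string fields would need separate handling with WRITE_ASCII.

```python
# Assumption: the number of hex digits in the dumped value equals twice the field size in bytes.
WRITE_BY_HEX_DIGITS = {2: "WRITE_BYTE", 4: "WRITE_SHORT", 8: "WRITE_LONG"}


def write_command(offset, hex_value, comment=""):
    """Build one WeiDU write line from an offset and a zero-padded hex value."""
    command = WRITE_BY_HEX_DIGITS.get(len(hex_value))
    if command is None:
        raise ValueError("cannot infer field size from %r" % hex_value)
    line = "%-12s 0x%04x   %d" % (command, offset, int(hex_value, 16))
    if comment:
        line += "   // " + comment
    return line


print(write_command(0x0C, "00001135", "ID name"))
```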

Link to comment

One more cosmetic thing. Hex values are converted to decimals for display: "Saving throw bonus: 4294967294"

These are actually signed values, so it would be more appropriate to display the value above as "-2". The quick solution would be to patch stream.py at line 145, but I am not sure how appropriate this is in the grand scheme of things:

-        return struct.unpack ('<I', v)[0]
+        return struct.unpack ('<i', v)[0]
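
To illustrate what that one-character change does, here is a small standalone example using Python's `struct` module. The byte sequence is simply the little-endian encoding of the value quoted above; the variable names are for illustration only.

```python
import struct

raw = bytes([0xFE, 0xFF, 0xFF, 0xFF])  # little-endian bytes of the example value

unsigned_value = struct.unpack("<I", raw)[0]  # 'I' = unsigned 32-bit integer
signed_value = struct.unpack("<i", raw)[0]    # 'i' = signed 32-bit integer

print(unsigned_value)  # 4294967294
print(signed_value)    # -2
```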

One more thing I am thinking about. Neither ielister nor ieparse takes into consideration the actual size in bytes of the two files. For example, I hacked together a quick tool that WeiDU-style patches an item according to the diff.

The vanilla item had 34 equipping effects whereas the modded item (that was copied over in the past) had only 14. I've generated a sequence of write_somethings to patch the vanilla item so that it equals the mod's version. What happens is I write a new amount of effects and appropriate feature/ext. header table offsets into respective positions according to the ITM format and patch up the first 14 effects of the original item so that they match the 14 effects of the modded version.

Now, technically I have a proper diff (both ieparse and ielister confirm this), but there is a huge chunk of garbage bytes still left in the file where the old 20 effects used to be. Tools such as NI show them as "Unused bytes".

Two issues now:

1) how to teach iediff / ielister to show these leftovers?

2) how to get rid of them?

I remember seeing some sort of WeiDU tools that validate and fix the structure of IE file formats. Do such tools exist, preferably in WeiDU format, so that I can append it to my auto-generated patch sequence? I can see from documentation that WeiDU includes things like FJ_CRE_REINDEX, but none for item files.

Link to comment

For the unpack, there's no simple answer. However, I think most unsigned values are pretty low, so making that conditional on the unpacked value being > UINT_MAX/2 could be a good compromise.
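
A minimal sketch of that conditional, kept separate from the reader so it can be applied at display time. `display_dword` is a hypothetical helper name, not something from ieparse.

```python
UINT_MAX = 0xFFFFFFFF


def display_dword(value):
    """Show an unsigned 32-bit value as negative when it exceeds UINT_MAX/2."""
    if value > UINT_MAX // 2:
        return value - (UINT_MAX + 1)
    return value


print(display_dword(4294967294))
print(display_dword(30))
```

Note that for a 32-bit field this threshold yields the same number as unpacking with '<i', so the practical difference is only where the conversion is applied: in the reader for every field, or in a helper called for selected fields.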

 

As for the removal of extra cruft, I'd just calculate where the last feature block ends and truncate the file there. Ideally WeiDU would do this for you once you set a lower feature count than before. Oh, I'm sure it has some feature deletion function — wouldn't it work if you first wiped them?
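
A sketch of "calculate where the last feature block ends", assuming the ITM V1 layout as I read it in IESDP (header size 0x72, extended headers 0x38 bytes each, feature blocks 0x30 bytes each, and the header/extended-header field offsets used below). All constants and function names here are assumptions to verify against the format description before use.

```python
import struct

# ITM V1 layout constants as read from IESDP; treat them as assumptions to verify.
HEADER_SIZE = 0x72
EXT_HEADER_SIZE = 0x38
FEATURE_SIZE = 0x30


def expected_itm_size(data):
    """Return the offset just past the last extended header or feature block."""
    ext_off, ext_count = struct.unpack_from("<IH", data, 0x64)
    feat_off, _equip_index, equip_count = struct.unpack_from("<IHH", data, 0x6A)
    total_features = equip_count
    for i in range(ext_count):
        # Each extended header is assumed to store its own feature count at +0x1e.
        (count,) = struct.unpack_from("<H", data, ext_off + i * EXT_HEADER_SIZE + 0x1E)
        total_features += count
    return max(HEADER_SIZE,
               ext_off + ext_count * EXT_HEADER_SIZE,
               feat_off + total_features * FEATURE_SIZE)


def unused_tail(data):
    """Number of bytes after the last structure the header refers to."""
    return len(data) - expected_itm_size(data)
```

With this, `unused_tail(data)` would report the leftover bytes and `data[:expected_itm_size(data)]` would be the truncated file. Taking the maximum of both table ends avoids assuming that the feature blocks are always the last thing in the file.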

Link to comment

No, WeiDU does not appear to do so, probably because I am not using any of the smarter functions such as alter_item_effect. The patches I generate are sequences of write_byte / write_long / write_short / write_ascii parsed from diffs generated with the help of ielister:

  WRITE_LONG   0x0412   91            // Parameter 2
  WRITE_BYTE   0x0416   0             // Timing metho
  WRITE_BYTE   0x0417   1             // Resistance
  WRITE_LONG   0x0418   30            // Duration
  WRITE_LONG   0x042e   1             // S. throw typ
  WRITE_LONG   0x0432   "-2"          // S. throw bon
...

Damn, the post editor removed everything I wrote after the code block. Long story short, I ended up calculating the size difference between the two files, and if the modded one is shorter, I truncate the patched file via DELETE_BYTES.
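
A sketch of how such a truncation line could be generated, assuming the surplus always sits at the end of the file. `truncate_patch` is a hypothetical helper; the sizes in the usage line are just example numbers.

```python
def truncate_patch(vanilla_size, modded_size):
    """Return a WeiDU DELETE_BYTES line that trims the patched file to modded_size."""
    surplus = vanilla_size - modded_size
    if surplus <= 0:
        return None  # the modded file is not shorter, nothing to trim
    # DELETE_BYTES takes the starting offset and the number of bytes to remove.
    return "DELETE_BYTES 0x%04x %d" % (modded_size, surplus)


print(truncate_patch(930, 162))
```

In a real generator the two sizes would come from the files themselves, for example via `os.path.getsize`.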

Link to comment

I knew this was not going to be simple. One of the files has been severely truncated (from 930 to 162 bytes), and its offsets are now printed with only 2 digits, as shown in the output below. This is messing up the whole diff for me.

0000h Signature    ITM V1            | 00h Signature    ITM V1  
0008h UnID name    00001121          | 08h UnID name    00001121
000ch ID name      00009b23          | 0ch ID name      00001135
0010h Used up                        | 10h Used up              
0018h Attributes   0000006c          | 18h Attributes   00000068

I can make a local patch for myself (lines 222-223), but is there any practical reason for the offset being two digits wide for files below 256 bytes in size? I see that the width is used in memory reallocation for the output buffer, so there might be an issue there, theoretically. But I am not competent enough to draw a definitive conclusion. The tool seems to work fine so far after recompilation.
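
As an alternative to patching the tool, the dump lines could be normalized before diffing. This is only a sketch that assumes each line starts with a hex offset followed by `h`, as in the output above; `normalize_offset` and `OFFSET_RE` are hypothetical names.

```python
import re

# Assumed line shape: hex offset, 'h', then the field name and value.
OFFSET_RE = re.compile(r"^([0-9a-fA-F]+)h\b")


def normalize_offset(line, width=4):
    """Rewrite a leading 'NNh' offset to a fixed number of hex digits."""
    match = OFFSET_RE.match(line)
    if not match:
        return line  # leave lines without a leading offset untouched
    offset = int(match.group(1), 16)
    return "%0*xh%s" % (width, offset, line[match.end():])


print(normalize_offset("0ch ID name      00001135"))
```

Running both dumps through this step first makes the offsets compare equal regardless of the size of the file each dump came from.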

Link to comment
