Weidu strings and regexes


mickabouille

For some reason, I find the string and regex support in WeiDU very unwieldy. Actually, the whole value/variable notion seems somewhat fuzzy.

Most of what I state below is what I gathered in a very limited, short and incomplete attempt at learning to use it. These are _suspicions_, not certainties, and I am posting them to get confirmation or counter-examples (or maybe a "don't do that").

  1. WeiDU strings are byte arrays: no encoding or decoding is done when reading files, so the bytes in memory are exactly the bytes in the file.
  2. As a consequence, you can't compare or match strings coming from files with different source encodings, except for pure ASCII text (characters with codes 0-127) in an ASCII-compatible charset (mostly UTF-8 and the Latin charsets ISO-8859-XX).
  3. Another probable consequence: STRING_LENGTH doesn't return the character count but the byte count (?), which differs when the text is UTF-8 and contains non-ASCII characters.
  4. WeiDU (probably?) doesn't normalize Unicode "strings".
  5. (I suppose UTF-8 is the only Unicode encoding that is used, but I have not checked that.)
  6. So you probably can't compare/match two strings that are in different normalization forms either (for example, compare the precomposed character é with e followed by a combining acute accent).
  7. STRING_COMPARE_CASE and FILE_CONTAINS probably don't ignore case for characters outside ASCII 0-127.
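
To make the suspicions above concrete, here is a minimal Python sketch of what "strings are just bytes" implies. It is not WeiDU code and says nothing about what WeiDU actually does internally; it only shows the byte-level effects (length, encoding mismatch, normalization, ASCII-only case folding) that would follow if the assumptions in the list hold.

```python
import unicodedata

text = "\u00e9"  # é as a single precomposed code point

# The same character has different bytes in different encodings.
utf8_bytes = text.encode("utf-8")      # two bytes: C3 A9
latin1_bytes = text.encode("latin-1")  # one byte: E9
same_bytes = utf8_bytes == latin1_bytes

# Character count versus byte count for UTF-8 text.
char_count = len(text)
byte_count = len(utf8_bytes)

# Precomposed é versus e + combining acute accent (U+0301).
decomposed = unicodedata.normalize("NFD", text)
equal_raw = text.encode("utf-8") == decomposed.encode("utf-8")
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == text

# Case folding on raw bytes only touches ASCII letters,
# so the UTF-8 bytes of É are left unchanged by lower().
upper_bytes = "\u00c9".encode("utf-8")
ascii_folded_equal = upper_bytes.lower() == utf8_bytes

print(char_count, byte_count, same_bytes, equal_raw,
      equal_after_nfc, ascii_folded_equal)
```

A byte-oriented tool would see the two-byte and one-byte forms as unrelated strings, and would see the precomposed and decomposed forms as different unless something normalizes them first.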

These are probably more a case of "I didn't find how":

  1. How do you do non-capturing groups in regexes?
  2. How do you check if a string contains another one? (STRING_CONTAINS_REGEXP doesn't do that: its second parameter is interpreted as a regex, not a literal string, so dots, for example, need to be escaped.)

1. You can't by default. You'd have to use an external program that supports PCRE-like regexes. It's more a limit of the OCaml stdlib than anything else.

2. You suggested a workaround yourself: you can have an escaping function mangle the string first.
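
As a rough illustration of that escaping idea, here is a minimal Python sketch. It is not WeiDU code. The function name `escape_for_str_regex` is hypothetical, and the set of special characters is an assumption: it is the set I believe OCaml's Str regex syntax treats as special, which is worth checking against the WeiDU documentation before relying on it. Python's `re` is used only as a stand-in to show the effect.

```python
import re

# Assumption: characters that are special in OCaml Str-style regexes.
# In that syntax, plain ( ) | are literal, so they are not escaped here.
STR_SPECIAL = set("$^.*+?[]\\")

def escape_for_str_regex(literal: str) -> str:
    """Prefix each special character with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in STR_SPECIAL else ch for ch in literal)

escaped = escape_for_str_regex("dialog.tlk")

# Unescaped, the dot matches any character; escaped, it matches only a dot.
loose_match = re.search("dialog.tlk", "dialogXtlk") is not None
strict_match = re.search(escaped, "dialogXtlk") is not None
literal_match = re.search(escaped, "dialog.tlk") is not None

print(escaped, loose_match, strict_match, literal_match)
```

The same transformation written as a WeiDU function would let the escaped result be passed to STRING_CONTAINS_REGEXP as a plain "contains" check, assuming the regex dialect matches the assumption above.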
