RegExpr: proposal to support national letters in "\w"

Original Reporter info from Mantis: Alextp

Reporter name: CudaText man

Description:

This function is from ATSynEdit. It recognizes Unicode letters, not only ASCII ones, for example:
Russian
Greek
German
Japanese
...
It has a fast path for ASCII chars (code < 128).

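uses UnicodeData;

 function IsCharWord(ch: WideChar): boolean;
 var
   NType: byte;
 begin
   // Fast path: ASCII word characters need no table lookup.
   case ch of
     '0'..'9',
     'a'..'z',
     'A'..'Z',
     '_':
       exit(true);
   end;

   // Any other ASCII character is not a word character.
   if Ord(ch)<128 then
     exit(false)
   else
   // Code units from LOW_SURROGATE_BEGIN upward are rejected.
   if Ord(ch)>=LOW_SURROGATE_BEGIN then
     exit(false)
   else
   begin
     // Look up the Unicode general category; the check accepts
     // categories up to and including UGC_OtherNumber.
     NType:= GetProps(Ord(ch))^.Category;
     Result:= (NType<=UGC_OtherNumber);
   end;
 end;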

Use it in RegExpr.pas. I did this in a local copy of regexpr.pas:

- comment out the var fWordChars and the prop WordChars
- replace every Pos(...., fWordChars) with a call to IsCharWord(..)
- one line will look odd: it calls

  EmitNNNNNNN(fWordChars)
  Replace fWordChars there with the const RegExprWordChars.

My test shows that the CudaText editor now finds Russian/Greek/German letters with \w,
even with that EmitNNNNN() call left in place.
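
To illustrate the second step above, here is a minimal sketch that puts a list-based check next to IsCharWord. It is not taken from regexpr.pas: AsciiWordChars, IsWordCharByList and the sample characters are hypothetical names and values chosen for this example, and the ASCII-only list is only an assumption about what a default word-character list looks like.

```pascal
program WordCharDemo;
{$mode objfpc}{$H+}

uses
  UnicodeData;

const
  // Hypothetical stand-in for an ASCII-only word-character list.
  AsciiWordChars: UnicodeString =
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_';

  // Sample characters: 'a', '-', Cyrillic zhe, Greek alpha.
  Samples: array[0..3] of WideChar =
    (WideChar($0061), WideChar($002D), WideChar($0436), WideChar($03B1));

// Old style: membership test against a fixed character list.
function IsWordCharByList(ch: WideChar): Boolean;
begin
  Result := Pos(ch, AsciiWordChars) > 0;
end;

// New style: IsCharWord as posted in the description.
function IsCharWord(ch: WideChar): Boolean;
var
  NType: Byte;
begin
  case ch of
    '0'..'9', 'a'..'z', 'A'..'Z', '_':
      Exit(True);
  end;
  if Ord(ch) < 128 then
    Exit(False);
  if Ord(ch) >= LOW_SURROGATE_BEGIN then
    Exit(False);
  NType := GetProps(Ord(ch))^.Category;
  Result := NType <= UGC_OtherNumber;
end;

var
  ch: WideChar;
begin
  // Print both results side by side for each sample character.
  for ch in Samples do
    WriteLn('U+', HexStr(Ord(ch), 4),
            ' list=', IsWordCharByList(ch),
            ' unicode=', IsCharWord(ch));
end.
```

The samples are chosen so that the two checks should agree on the ASCII characters and disagree on the Cyrillic and Greek letters, which is the behaviour change this proposal is about.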

Mantis conversion info:

Mantis ID: 34084
Version: 3.0.4
Fixed in version: 3.1.1
Fixed in revision: 39564 (#71bbab35)
Target version: 3.2.0
