regXwild

⏱ Superfast ^Advanced wildcards++? *,|,?,^,$,+,#,>,++??,##??,>c in addition to slow regex engines and more.

✔ regex-like quantifiers, amazing meta symbols, and speed...

Unique algorithms implemented in native unmanaged C++, yet easily accessible from .NET through Conari (recommended because it caches 0x29 opcodes and applies related optimizations), and from other environments such as Python.

Samples               | regXwild filter      | n
----------------------|----------------------|---------
number = '1271';      | `number = '????';`   | 0 - 4
year = '2020';        | `'##'\|'####'`       | 2 \| 4
year = '20';          | `= '##??'`           | 2 \| 4
number = 888;         | `number = +??;`      | 1 - 3

Samples               | regXwild filter
----------------------|----------------------
everything is ok      | `^everything*ok$`
systems               | `system?`
systems               | `sys###s`
A new 'X1' project    | `^A*'+' pro?ect`
professional system   | `pro*system`
regXwild in action    | `pro?ect$\|open*source+act\|^regXwild`

Why regXwild ?

It was designed to be faster than just fast, for features that usually go beyond typical wildcards. Seriously, we love regex; I love it, you love it. 2013 is far behind us, but regXwild is still relevant for its speed and its powerful wildcard-like features, such as ##?? (which means 2 or 4 characters) ...

🔍 Easy to start

Unmanaged native C++ or managed .NET project. It doesn't matter, just use it:

C++

#include <regXwild.h>
using namespace net::r_eg::regXwild;
...
EssRxW rxw;
if(rxw.match(_T("regXwild"), _T("reg?wild"))) {
    // ...
}

C# with Conari

using dynamic l = new ConariX("regXwild.dll");
...
if(l.match<bool>("regXwild", "reg?wild")) {
    // ...
}

🏄 Amazing meta symbols

ESS version (advanced EXT version)

metasymbol | meaning
-----------|----------------
`*`        | {0, ~}
`\|`       | str1 or str2 or ...
`?`        | {0, 1}, ??? {0, 3}, ...
`^`        | [str... or [str1... \|[str2...
`$`        | ...str] or ...str1]\| ...str2]
`+`        | {1, ~}, +++ {3, ~}, ...
`#`        | {1}, ## {2}, ### {3}, ...
`>`        | Legacy `>` (F_LEGACY_ANYSP = 0x008) as `[^/]*str \| [^/]*$`
`>c`       | 1.4+ Modern `>` as `[^c]*str \| [^c]*$`

EXT version (more simplified than ESS)

metasymbol | meaning
-----------|----------------
`*`        | {0, ~}
`>`        | as `[^/\]+`
`\|`       | str1 or str2 or ...
`?`        | {0, 1}, ??? {0, 3}, ...

🧮 Quantifiers

1.3+ ++??; ##??

regex              | regXwild   | n
-------------------|------------|---------
`.*`               | `*`        | 0+
`.+`               | `+`        | 1+
`.?`               | `?`        | 0 \| 1
`.{1}`             | `#`        | 1
`.{2}`             | `##`       | 2
`.{2, }`           | `++`       | 2+
`.{0, 2}`          | `??`       | 0 - 2
`.{2, 4}`          | `++??`     | 2 - 4
`(?:.{2}\|.{4})`   | `##??`     | 2 \| 4
`.{3, 4}`          | `+++?`     | 3 - 4
`(?:.{1}\|.{3})`   | `#??`      | 1 \| 3

and similar ...
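To make the mapping in the table concrete, the sketch below translates a filter into the equivalent `std::regex` pattern and checks a few of the samples listed earlier. It is only an illustration of the documented meanings, not regXwild's algorithm: `runToRegex`, `filterToRegex` and `wildMatch` are hypothetical helper names, the `>` and `>c` metasymbols are not handled, and quantifier runs that the table does not cover (for example a mix of `+` and `#`) are assumed to be unsupported and are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Illustrative only: these helpers are not part of regXwild.

// Converts one run of '+', '#', '?' (for example "++??") into a regex fragment
// following the quantifier table. Returns "" for runs the table does not cover.
std::string runToRegex(const std::string& run)
{
    assert(!run.empty());
    std::size_t plus = 0, hash = 0, opt = 0;
    for(char c : run)
    {
        if(c == '?') { ++opt; continue; }
        if(opt > 0) return "";            // '+' or '#' after '?': not in the table
        if(c == '+') ++plus;
        else if(c == '#') ++hash;
        else return "";                   // not a quantifier symbol
    }
    if(plus > 0 && hash > 0) return "";   // mixed '+' and '#': not in the table

    const std::string lo = std::to_string(plus + hash);
    const std::string hi = std::to_string(plus + hash + opt);

    if(plus > 0) return opt > 0 ? ".{" + lo + "," + hi + "}" : ".{" + lo + ",}";
    if(hash > 0) return opt > 0 ? "(?:.{" + lo + "}|.{" + hi + "})" : ".{" + lo + "}";
    return ".{0," + hi + "}";             // only '?' symbols
}

// Translates a whole filter: quantifier runs via runToRegex, '*' to ".*",
// '^', '$' and '|' passed through, everything else treated as a literal.
std::string filterToRegex(const std::string& filter)
{
    const std::string quant = "+#?";
    const std::string escaped = "\\.[]{}()";
    std::string out;
    for(std::size_t i = 0; i < filter.size(); )
    {
        const char c = filter[i];
        if(quant.find(c) != std::string::npos)
        {
            std::size_t end = filter.find_first_not_of(quant, i);
            if(end == std::string::npos) end = filter.size();
            out += runToRegex(filter.substr(i, end - i));
            i = end;
            continue;
        }
        if(c == '*') out += ".*";
        else if(c == '^' || c == '$' || c == '|') out += c;
        else
        {
            if(escaped.find(c) != std::string::npos) out += '\\';
            out += c;
        }
        ++i;
    }
    return out;
}

// Searches the text with the translated pattern (unanchored unless the filter
// itself uses '^' or '$').
bool wildMatch(const std::string& text, const std::string& filter)
{
    const std::regex re(filterToRegex(filter));
    return std::regex_search(text, re);
}
```

For example, `filterToRegex("= '##??'")` produces `= '(?:.{2}|.{4})'`, so `wildMatch("year = '20';", "= '##??'")` succeeds while a three-character year does not.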

Play with our actual Unit-Tests.

🚀 Awesome speed

  • Up to ~2000 times faster than C++11 regex when used from native C++ (see the Speed section).
  • For .NET (including modern .NET Core), Conari provides optional caching of 0x29 opcodes (Calli) and more, to get results as close to native C++ as possible.

Match result and Replacements

1.4+

EssRxW::MatchResult m;
rxw.match
(
    _T("number = '8888'; //TODO: up"),
    _T("'+'"),
    EssRxW::EngineOptions::F_MATCH_RESULT,
    &m
);
// m.start = 9  (index of the opening quote)
// m.end = 15   (one past the closing quote)
...
// input is assumed to be a tstring holding the text matched above
input.replace(m.start, m.end - m.start, _T("'9777'"));

tstring str = _T("year = 2021; dd = 17;");
...
if(rxw.replace(str, _T(" ##;"), _T(" 00;"))) {
    // str is now: year = 2021; dd = 00;
}
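The offsets in the first snippet form a half-open range: `start` is the index of the first matched character and `end` is one past the last. The sketch below shows that arithmetic on a plain `std::string`, without the library. `Span` and `replaceSpan` are hypothetical stand-ins introduced only for this illustration; `Span` is not `EssRxW::MatchResult`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a match result: [start, end) offsets into the input.
struct Span
{
    std::size_t start;
    std::size_t end;
};

// Replaces the matched range with new text; the length of the range is end - start.
std::string replaceSpan(std::string input, const Span& m, const std::string& with)
{
    assert(m.start <= m.end && m.end <= input.size());
    input.replace(m.start, m.end - m.start, with);
    return input;
}
```

With the values from the snippet, `replaceSpan("number = '8888'; //TODO: up", Span{9, 15}, "'9777'")` yields `number = '9777'; //TODO: up`.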

🍰 Open and Free

Open Source project; MIT License, Enjoy 🎉

License

The MIT License (MIT)

Copyright (c) 2013-2021  Denis Kuzmin <x-3F@outlook.com> github/3F

[ ☕ Make a donation ]

regXwild contributors: https://github.com/3F/regXwild/graphs/contributors

We're waiting for your awesome contributions!

Speed

Procedure of testing

  • Use the algo subproject as the tester for the main algorithms (Release cfg, x32 & x64).
  • The calculation is simple: each measurement is i = (t2 - t1), and the reported value is the average (sum(i) / n), where:
    • i - one iteration of searching by filter, represented by the time delta t2 - t1
    • n - the number of repeats of the matching used to get the average.

e.g.:

{
    Meter meter;
    int results = 0;

    for(int total = 0; total < average; ++total)
    {
        meter.start();
        for(int i = 0; i < iterations; ++i)
        {
            if((alg.*method)(data, filter)) {
                //...
            }
        }
        results += meter.delta();
    }

    TRACE((results / average) << "ms");
}
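The `Meter` class used in the loop above is not shown here. A minimal stand-in, assuming it simply records a start time and reports the elapsed milliseconds, could look like the sketch below. Both this `Meter` and the `averageMs` wrapper are hypothetical and may differ from what the algo subproject actually does.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the Meter used by the benchmark loop.
class Meter
{
    using Clock = std::chrono::steady_clock;
    Clock::time_point begin_ = Clock::now();

public:
    // Remembers the current time as t1.
    void start() { begin_ = Clock::now(); }

    // Returns t2 - t1 in whole milliseconds, where t2 is "now".
    long long delta() const
    {
        return std::chrono::duration_cast<std::chrono::milliseconds>(
            Clock::now() - begin_).count();
    }
};

// Runs op() `iterations` times per measurement, repeats the measurement
// `average` times, and returns sum(i) / n in milliseconds.
template<typename Fn>
long long averageMs(Fn op, int average, int iterations)
{
    assert(average > 0 && iterations >= 0);
    Meter meter;
    long long results = 0;
    for(int total = 0; total < average; ++total)
    {
        meter.start();
        for(int i = 0; i < iterations; ++i) op();
        results += meter.delta();
    }
    return results / average;
}
```

A monotonic clock (`steady_clock`) is used here so that system time adjustments cannot produce negative deltas.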

For the regex results, the test additionally prepares a basic_regex from the filter, but only once for all iterations:

meter.start();

auto rfilter = tregex(
    filter,
    regex_constants::icase | regex_constants::optimize
);

results += meter.delta();
...
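As a self-contained sketch of that "compile once, search many times" pattern, the function below builds a `std::wregex` with the same flags and reuses it inside the loop. `countRegexHits` is a hypothetical name, and the regex pattern used in the usage note is an assumed translation of the wildcard filter, not something taken from the test project.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Compiles the pattern once, then reuses the compiled regex for every iteration.
// Returns how many iterations produced a match.
int countRegexHits(const std::wstring& data, const std::wstring& pattern, int iterations)
{
    assert(iterations >= 0);

    // Construction is the expensive step, so it stays outside the loop.
    const std::wregex rfilter(
        pattern,
        std::regex_constants::icase | std::regex_constants::optimize
    );

    int hits = 0;
    for(int i = 0; i < iterations; ++i)
    {
        if(std::regex_search(data, rfilter)) ++hits;
    }
    return hits;
}
```

For instance, `countRegexHits(L"Anime: HARU 02 magica", L"nime.*haru.*02.*Magica", 5)` searches the same string five times with a single compiled regex.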

Please note:

  • +icase means case is ignored when matching the filter (pattern) within the searched string, i.e. ignoreCase = true. Without it, everything is of course much faster; icase always adds complexity.
  • Below, MultiByte can be faster than Unicode (for the same platform and the same way of using the module), but this depends on the specific architecture: roughly ~2 times faster in native C++, and ~4 times faster with .NET + Conari and related.
  • The results below can differ between machines. Look only at the difference (in milliseconds) between algorithms for a specific target.
  • To calculate data like that in the table below, you need to run algo.exe.

Sample of speed for Unicode

340 Unicode Symbols and 10^4 iterations (340 x 10000); Filter: L"nime**haru*02*Magica"

algorithms (see impl. from algo)          | +icase [x32] | +icase [x64]
------------------------------------------|--------------|-------------
Find + Find                               | ~58ms        | ~44ms
Iterator + Find                           | ~57ms        | ~46ms
Getline + Find                            | ~59ms        | ~54ms
Iterator + Substr                         | ~165ms       | ~132ms
Iterator + Iterator                       | ~136ms       | ~118ms
main :: based on Iterator + Find          | ~53ms        | ~45ms
                                          |              |
Final algorithm - EXT version:            | ~50ms        | ~26ms
Final algorithm - ESS version:            | ~50ms        | ~27ms
                                          |              |
regexp-c++11(regex_search)                | ~59309ms     | ~53334ms
regexp-c++11(only as ^match$ like a '==') | ~12ms        | ~5ms
regexp-c++11(regex_match with endings .*) | ~59503ms     | ~53817ms

ESS vs EXT

350 Unicode Symbols and 10^4 iterations (350 x 10000);

Operation (+icase)    | EXT [x32]  | ESS [x32]  | EXT [x64]  | ESS [x64]
----------------------|------------|------------|------------|------------
ANY                   | ~54ms      | ~55ms      | ~32ms      | ~34ms
ANYSP                 | ~60ms      | ~59ms      | ~37ms      | ~38ms
ONE                   | ~56ms      | ~56ms      | ~33ms      | ~35ms
SPLIT                 | ~92ms      | ~94ms      | ~58ms      | ~63ms
BEGIN                 | ---        | ~38ms      | ---        | ~19ms
END                   | ---        | ~39ms      | ---        | ~21ms
MORE                  | ---        | ~44ms      | ---        | ~23ms
SINGLE                | ---        | ~43ms      | ---        | ~22ms

For .NET users through Conari engine:

Same test Data & Filter: 10^4 iterations

Release cfg; x32 or x64 regXwild (Unicode)

Attention: for more speed, upgrade to Conari 1.3 or higher!

algorithms (see impl. from snet)            | +icase [x32] | +icase [x64] |
--------------------------------------------|--------------|--------------|---
regXwild via Conari v1.2 (Lambda) - ESS     | ~1032ms      | ~1418ms      | x
regXwild via Conari v1.2 (DLR) - ESS        | ~1238ms      | ~1609ms      | x
regXwild via Conari v1.2 (Lambda) - EXT     | ~1117ms      | ~1457ms      | x
regXwild via Conari v1.2 (DLR) - EXT        | ~1246ms      | ~1601ms      | x
                                            |              |              |
regXwild via Conari v1.3 (Lambda) - ESS     | ~58ms        | ~42ms        | <<
regXwild via Cona
