'Matching strings with wildcard

I would like to match strings with a wildcard (*), where the wildcard means "any". For example:

*X = string must end with X
X* = string must start with X
*X* = string must contain X

Also, some compound uses such as:

*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.

Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).

To clarify, the users will type in the above to a filter box (as simple a filter as possible), I don't want them to have to write regular expressions themselves. So something I can easily transform from the above notation would be good.



Solution 1:[1]

Often, wild cards operate with two type of jokers:

  ? - any character  (one and only one)
  * - any characters (zero or more)

so you can easily convert these rules into appropriate regular expression:

// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$"; 
}

// If you want to implement "*" only
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$"; 
}

And then you can use Regex as usual:

  String test = "Some Data X";

  Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
  Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
  Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));

  // Starts with S, ends with X, contains "me" and "a" (in that order) 
  Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));

Solution 2:[2]

Using of WildcardPattern from System.Management.Automation may be an option.

pattern = new WildcardPattern(patternString);
pattern.IsMatch(stringToMatch);

Visual Studio UI may not allow you to add System.Management.Automation assembly to References of your project. Feel free to add it manually, as described here.

Solution 3:[3]

For those using .NET Core 2.1+ or .NET 5+, you can use the FileSystemName.MatchesSimpleExpression method in the System.IO.Enumeration namespace.

string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);

Both parameters are actually ReadOnlySpan<char> but you can use string arguments too. There's also an overloaded method if you want to turn on/off case matching. It is case insensitive by default as Chris mentioned in the comments.

Solution 4:[4]

A wildcard * can be translated as .* or .*? regex pattern.

You might need to use a singleline mode to match newline symbols, and in this case, you can use (?s) as part of the regex pattern.

You can set it for the whole or part of the pattern:

X* = > @"X(?s:.*)"
*X = > @"(?s:.*)X"
*X* = > @"(?s).*X.*"
*X*YZ* = > @"(?s).*X.*YZ.*"
X*YZ*P = > @"(?s:X.*YZ.*P)"

Solution 5:[5]

*X*YZ* = string contains X and contains YZ

@".*X.*YZ"

X*YZ*P = string starts with X, contains YZ and ends with P.

@"^X.*YZ.*P$"

Solution 6:[6]

It is necessary to take into consideration, that Regex IsMatch gives true with XYZ, when checking match with Y*. To avoid it, I use "^" anchor

isMatch(str1, "^" + str2.Replace("*", ".*?"));  

So, full code to solve your problem is

    bool isMatchStr(string str1, string str2)
    {
        string s1 = str1.Replace("*", ".*?");
        string s2 = str2.Replace("*", ".*?");
        bool r1 = Regex.IsMatch(s1, "^" + s2);
        bool r2 = Regex.IsMatch(s2, "^" + s1);
        return r1 || r2;
    }

Solution 7:[7]

public class Wildcard
{
    private readonly string _pattern;

    public Wildcard(string pattern)
    {
        _pattern = pattern;
    }

    public static bool Match(string value, string pattern)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end);
    }

    public static bool Match(string value, string pattern, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end, toLowerTable);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end);
    }

    public bool IsMatch(string str, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str, ref int start, ref int end)
    {
        if (_pattern.Length == 0) return false;
        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;
                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (str[si] == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && str[si] != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }

    public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
    {
        if (_pattern.Length == 0) return false;

        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;

                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    char c = toLowerTable[str[si]];
                    if (c == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && c != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    continue;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }
}

Solution 8:[8]

To support those one with C#+Excel (for partial known WS name) but not only - here's my code with wildcard (ddd*). Briefly: the code gets all WS names and if today's weekday(ddd) matches the first 3 letters of WS name (bool=true) then it turn it to string that gets extracted out of the loop.

using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;

...
string weekDay = DateTime.Now.ToString("ddd*");

Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

            static String WildCardToRegular(String value)
            {
                return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
            }

            string wsName = null;
            foreach (Worksheet works in sourceWorkbook4.Worksheets)
            {
                Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));

                    if (startsWithddd == true)
                    {
                        wsName = works.Name.ToString();
                    }
            }

            Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);

...

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Community
Solution 3
Solution 4 Wiktor Stribiżew
Solution 5 Avinash Raj
Solution 6
Solution 7 nb.duong
Solution 8