C#: get non-alphanumeric characters in a string as a string
The string looks like this:
Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.
I want to extract the non-alphanumeric characters from this string, like this:
,?_,.
But how? I tried this:
var r = new Regex("[^a-zA-Z0-9]");
var m = r.Match(textBox1.Text);
var a = m.Value;
But it returns only a single non-alphanumeric character.
Solution 1:[1]
Try this:
private static string TakeOutTheTrash(string Source, string Trash)
{
    // Delete every character that matches the Trash pattern.
    return new Regex(Trash).Replace(Source, string.Empty);
}
private static string Output(string Source, string Trash)
{
    return TakeOutTheTrash(Source, Trash);
}
// Letters, digits and whitespace are the "trash" to remove.
var InvertedTrash = @"[a-zA-Z0-9\s]";
var str = Output(textBox1.Text, InvertedTrash);
// ,?_,.
Solution 2:[2]
You can try LINQ as an alternative to regular expressions. All we have to do is filter out letters, digits and, probably, whitespace (with the help of Where) and then Concat the remaining characters into a string:
using System.Linq;
...
var str = string.Concat(textBox1
    .Text
    .Where(c => !char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c)));
If you insist on regular expressions, you have to combine all matches, e.g.
using System.Linq;
using System.Text.RegularExpressions;
...
// Whitespace (\s) is excluded from the match
var str = string.Concat(Regex
    .Matches(textBox1.Text, @"[^A-Za-z0-9\s]+")
    .Cast<Match>()
    .Select(match => match.Value));
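The two variants above agree on the sample string, but they are not equivalent in general: char.IsLetterOrDigit is Unicode-aware, while [A-Za-z0-9] only covers ASCII. The following is a minimal console sketch of that difference; the class and method names (NonAlphanumericDemo, WithLinq, WithRegex) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NonAlphanumericDemo
{
    // LINQ variant: keep characters that are neither letters/digits nor whitespace.
    public static string WithLinq(string text) =>
        string.Concat(text.Where(c => !char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c)));

    // Regex variant: keep characters outside ASCII A-Z, a-z, 0-9 and whitespace.
    public static string WithRegex(string text) =>
        string.Concat(Regex.Matches(text, @"[^A-Za-z0-9\s]+")
            .Cast<Match>()
            .Select(m => m.Value));

    public static void Main()
    {
        const string sample = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
        Console.WriteLine(WithLinq(sample));   // ,?_,.
        Console.WriteLine(WithRegex(sample));  // ,?_,.

        // \u00E9 is the letter e with an acute accent: a letter for
        // char.IsLetterOrDigit, but outside [A-Za-z0-9] for the regex.
        const string accented = "caf\u00E9!";
        Console.WriteLine(WithLinq(accented));  // only "!" is returned
        Console.WriteLine(WithRegex(accented)); // the accented letter and "!" are returned
    }
}
```

Which behaviour is preferable depends on whether accented or non-Latin letters should count as alphanumeric in your input.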
Solution 3:[3]
Maybe this:
string input = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
Regex rgx = new Regex("[a-zA-Z0-9 -]");
string output = rgx.Replace(input, "");
Console.WriteLine(output);
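One detail of the pattern above is worth checking against your own data: the trailing - inside the character class is a literal hyphen, so hyphens are removed along with letters, digits and spaces. The sketch below contrasts it with a class that omits the hyphen; HyphenDemo, Strip and the sample text are made up for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenDemo
{
    // Remove every character that matches the given character class.
    public static string Strip(string text, string pattern) =>
        Regex.Replace(text, pattern, "");

    public static void Main()
    {
        const string sample = "well-known, right?";

        // The hyphen is part of the class, so it is stripped too.
        Console.WriteLine(Strip(sample, "[a-zA-Z0-9 -]")); // ,?

        // Without the hyphen in the class, it survives in the output.
        Console.WriteLine(Strip(sample, "[a-zA-Z0-9 ]"));  // -,?
    }
}
```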
Solution 4:[4]
Note that your regex also matches whitespace. If this is not intentional, you can use [^a-zA-Z0-9\s] instead.
You can get the collection of all matches with r.Matches(textBox1.Text); (See https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matches)
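If you want to use regex for this, you can try:
// Requires: using System.Linq; using System.Text.RegularExpressions;
var text = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
// Verbatim string (@) so that \s reaches the regex engine unchanged.
var regex = new Regex(@"[^a-zA-Z0-9\s]");
var result = string.Concat(regex.Matches(text).Cast<Match>().Select(match => match.Value));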
Solution 5:[5]
You might also make the match a bit more specific and match punctuation characters:
var s = "Lorem, ipsum? dolor_ sit amet, consectetur adipiscing elit.";
var r = new Regex(@"\p{P}");
Console.WriteLine(String.Join("", r.Matches(s)));
Output
,?_,.
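Keep in mind that \p{P} covers only the Unicode punctuation categories. Characters such as $ and + belong to the symbol categories (\p{S}) and are skipped. The sketch below shows the difference on a made-up sample; PunctuationDemo and Collect are hypothetical names. Cast<Match>() is used so that the code also compiles on runtimes where MatchCollection only implements the non-generic IEnumerable.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PunctuationDemo
{
    // Concatenate all matches of the pattern, in order of appearance.
    public static string Collect(string text, string pattern) =>
        string.Concat(Regex.Matches(text, pattern)
            .Cast<Match>()
            .Select(m => m.Value));

    public static void Main()
    {
        const string sample = "Price: $5 + tax (10%)?";

        // Punctuation only: '$' and '+' are symbols, so they are skipped.
        Console.WriteLine(Collect(sample, @"\p{P}"));        // :(%)?

        // Punctuation and symbols together.
        Console.WriteLine(Collect(sample, @"[\p{P}\p{S}]")); // :$+(%)?
    }
}
```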
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Habip Oğuz |
| Solution 2 | |
| Solution 3 | Dmitry Bychenko |
| Solution 4 | |
| Solution 5 | The fourth bird |
