An elegant way in C# to separate a comma-separated list of email addresses
Looking on SO, there are various approaches to this problem. However, the recommended solution, for instance, does not deal with "Last, First", and the suggestion posted by richard in that post is missing the code for SetUpTextFieldParser().
I have the following list of email addresses as a string:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
The current code does:
str.Split(',');
which produces an incorrect list because of the comma in:
"Last, First"
Anyone got something elegant here to share so that I end up with an array of strings in the form:
Last, First <[email protected]>
[email protected]
First Last <[email protected]>
"First Last" <[email protected]>
"Last, First" <[email protected]>
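To make the failure concrete, here is a minimal sketch of the naive split. The class name NaiveSplitDemo and the example.com addresses are hypothetical and used only for illustration; they are not the addresses from the question.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits on every comma, including commas inside quoted display names.
    public static string[] NaiveSplit(string input)
    {
        return input.Split(',');
    }

    public static void Main()
    {
        // Hypothetical sample: two addresses, one with a comma in the display name.
        string sample = "\"Last, First\" <a@example.com>, b@example.com";
        string[] parts = NaiveSplit(sample);

        // The display name is cut in half, so three pieces come back instead of two.
        Console.WriteLine(parts.Length);
        foreach (string part in parts)
        {
            Console.WriteLine("[" + part.Trim() + "]");
        }
    }
}
```

The quoted name is broken into `"Last` and `First" <a@example.com>`, which is exactly the problem described above.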
EDIT - SOLUTION
I ended up using Yacoub Massad's solution as it was simple (regular expressions would be hard to maintain in my dev group as not everyone understands them). Below is the code (Fiddle) with some additions and simplistic testing to make sure all was well:
- Trailing comma in case someone got careless
- Addition of (comment) email address formats from MSDN page
using System;
using System.Collections.Generic;
using System.Net.Mail;

public class Program
{
    public static void Main()
    {
        //https://msdn.microsoft.com/en-us/library/system.net.mail.mailaddress(v=vs.110).aspx
        //Some esoteric "comment" formats as well as a trailing comma in case someone did not tidy up
        string emails = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>, (comment)\"First, Last\"(comment)<(comment)joe(comment)@(comment)there.com(comment)>(comment),";
        List<string> result = new List<string>();
        Console.WriteLine("LOOP");
        while (true)
        {
            int position_of_at = emails.IndexOf("@");
            if (position_of_at == -1)
            {
                break;
            }
            int position_of_comma = emails.IndexOf(",", position_of_at);
            if (position_of_comma == -1)
            {
                result.Add(emails);
                break;
            }
            string email = emails.Substring(0, position_of_comma);
            result.Add(email);
            emails = emails.Substring(position_of_comma + 1);
        }
        Console.WriteLine("/LOOP");
        //Do some very basic validation of above code
        var i = 1;
        if (result.Count == 6)
            Console.WriteLine("SUCCESS: " + result.Count);
        else
            Console.WriteLine("FAILURE: " + result.Count);
        foreach (string emailAddress in result)
        {
            Console.WriteLine("==== " + i.ToString());
            Console.WriteLine(emailAddress);
            Console.WriteLine("/====");
            MailAddress mailAddress = new MailAddress(emailAddress);
            Console.WriteLine(mailAddress.DisplayName);
            Console.WriteLine("---- " + i.ToString());
            i++;
        }
    }
}
Solution 1:[1]
Here is a nice and elegant short method that will do what you ask using a regular expression:
private IEnumerable<string> GetEmails(string input)
{
    if (String.IsNullOrWhiteSpace(input)) yield break;
    MatchCollection matches = Regex.Matches(input, @"[^\s<]+@[^\s,>]+");
    foreach (Match match in matches) yield return match.Value;
}
You would call it like this:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
IEnumerable<string> emails = GetEmails(str);
Please note that this regular expression does not validate the email addresses. For instance, 1@h will be accepted and returned as a match.
Writing a regex that really validates addresses would be a difficult job and is probably not the best option.
For extracting the addresses, though, I think a regex is the ideal tool.
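If a light sanity check on the extracted matches is wanted, one option is to pass each match through System.Net.Mail.MailAddress, which the question's own test code already uses. The following is only a sketch under that assumption: the class and method names (EmailExtractionSketch, GetParsableEmails) are hypothetical, and MailAddress only rejects strings it cannot parse, so this is a filter rather than full validation.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class EmailExtractionSketch
{
    // Same pattern as in the answer: characters other than whitespace or '<',
    // then '@', then characters other than whitespace, ',' or '>'.
    private static readonly Regex Candidate = new Regex(@"[^\s<]+@[^\s,>]+");

    // Hypothetical helper: returns only the matches that MailAddress can parse.
    public static List<string> GetParsableEmails(string input)
    {
        var result = new List<string>();
        if (string.IsNullOrWhiteSpace(input))
        {
            return result;
        }

        foreach (Match match in Candidate.Matches(input))
        {
            try
            {
                // The constructor throws FormatException for strings it cannot parse.
                result.Add(new MailAddress(match.Value).Address);
            }
            catch (FormatException)
            {
                // Skip candidates that are not parsable as an address.
            }
        }

        return result;
    }
}
```

A call such as `EmailExtractionSketch.GetParsableEmails("First Last <a@example.com>, b@example.com")` (hypothetical addresses) returns the two bare addresses without the display name.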
Solution 2:[2]
Not exactly elegant, but try this:
private static IEnumerable<string> GetEntries(string str)
{
    List<string> entries = new List<string>();
    StringBuilder entry = new StringBuilder();
    while (str.Length > 0)
    {
        char ch = str[0];
        //If the first character of the string is a comma and the entry already contains an '@',
        //add this entry to the entries list and clear the temporary entry item.
        if (ch == ',' && entry.ToString().Contains("@"))
        {
            entries.Add(entry.ToString());
            entry.Clear();
        }
        //Otherwise, just add the character to the temporary entry item.
        else
        {
            entry.Append(ch);
        }
        str = str.Remove(0, 1);
    }
    //Add the last entry, which is still in the buffer because it doesn't end with a ',' character.
    entries.Add(entry.ToString());
    return entries;
}
It splits the string on commas, but only on a comma that comes after an '@' character in the current entry.
You would call it like this:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
var entries = GetEntries(str);
Solution 3:[3]
The shortest method would be:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
string[] separators = new string[] { "com>,", "com,", "com>", "com" };
var outputEmail = str.Split(separators, StringSplitOptions.RemoveEmptyEntries)
    .Where(s => s.Contains("@"))
    .Select(s => s.Contains('<') ? (s + "com>").Trim() : (s + "com").Trim());
foreach (var email in outputEmail)
{
    MessageBox.Show(email);
}
Solution 4:[4]
You can use Regex.Split with the pattern @"(?<=@\S*)\s+", which splits on a space (or spaces) preceded by a word containing @:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
string[] arr = Regex.Split(str, @"(?<=@\S*)\s+");
foreach (var s in arr)
    Console.WriteLine(s);
Output:
Last, First <[email protected]>,
[email protected],
First Last <[email protected]>,
"First Last" <[email protected]>,
"Last, First" <[email protected]>
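As the output shows, every piece except the last keeps its trailing comma. A small sketch of one way to strip it follows; the class and method names (RegexSplitSketch, SplitAddresses) are hypothetical, and the approach assumes that no display name contains an '@' followed by whitespace, since that would also trigger a split.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // Hypothetical helper: splits on whitespace that follows a word containing '@',
    // then removes the separator comma left at the end of each piece.
    public static string[] SplitAddresses(string input)
    {
        return Regex.Split(input, @"(?<=@\S*)\s+")
            .Select(s => s.Trim().TrimEnd(','))
            .Where(s => s.Length > 0)
            .ToArray();
    }
}
```

Commas inside a quoted display name are untouched, because TrimEnd only removes characters at the very end of each piece.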
Solution 5:[5]
Here's a version that handles a few more edge cases and has fewer allocations:
public static List<string> ExtractEmailAddresses(string text)
{
    var items = new List<string>();
    if (String.IsNullOrEmpty(text))
    {
        return items;
    }
    int start = 0;
    bool foundAt = false;
    int comment = 0;
    for (int i = start; i < text.Length; i++)
    {
        switch (text[i])
        {
            case '@':
                if (comment == 0) { foundAt = true; }
                break;
            case '(':
                comment++;
                break;
            case ')':
                comment--;
                break;
            case ',':
                HandleLastBlock(i);
                break;
        }
    }
    HandleLastBlock(text.Length);
    return items;

    void HandleLastBlock(int end)
    {
        if (comment == 0 && foundAt && start < end - 1)
        {
            var email = new System.Net.Mail.MailAddress(text.Substring(start, end - start));
            items.Add(email.Address);
            start = end + 1;
            foundAt = false;
        }
    }
}
Solution 6:[6]
Try:
UserEmails?.Split(';',',',' ','\n','\t').Where(x => !string.IsNullOrWhiteSpace(x)).ToList();
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | |
| Solution 4 | |
| Solution 5 | |
| Solution 6 | ohavryl |
