How do I set a symbol inside the record separator of awk

How do I include symbols in the record separator of awk? I know the basic syntax, like this:

awk 'BEGIN{RS="[:.!]"}{if (tolower($0) ~ "$" ) print $0 }'

which splits a single line into separate records on !, . and :. I also want to split on symbols such as the green checkmark, ✅. I am having trouble understanding the syntax, so I put it in like this:

awk 'BEGIN{RS="[:.!\u2705]"}{if (tolower($0) ~ "$" ) print $0 }'

which doesn't seem to work.

Sample input is this:

✅  Team collaboration  ✅  Project organisation✅  SSO support✅  API Access✅  Priority Support 


Solution 1:[1]

You need to use a regex with an alternation operator (|) because the character you want to split on is encoded as three separate UTF-8 code units (bytes): E2, 9C and 85.
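To check which bytes a symbol is made of before writing it as hex escapes, something like the following sketch can help. It assumes the script is saved as UTF-8, and utf8_hex is a made-up helper name, not a standard command.

```shell
# utf8_hex is a hypothetical helper: print the bytes of its argument as hex.
# Assumes this file (and so the literal checkmark) is encoded as UTF-8.
utf8_hex() {
    # od -An -tx1 dumps each byte as two hex digits without an offset column;
    # tr removes the spaces and newlines od puts between them.
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

utf8_hex '✅'   # prints e29c85, i.e. the bytes E2 9C 85
```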

You can use

awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"'

See the online demo:

#!/bin/bash
s='✅ Team collaboration ✅ Project organisation✅ SSO support✅ API Access✅ Priority Support'
awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"' <<< "$s"

Output:


 Team collaboration 
 Project organisation
 SSO support
 API Access
 Priority Support

Note that print $0 is the default action, so there is no need to write it explicitly.
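As a variation on the command above, here is a sketch that writes the checkmark literally in RS instead of as hex escapes, trims the whitespace around each record, and skips empty records. It assumes an awk that treats a multi-character RS as a regex (gawk and mawk do; POSIX leaves a multi-character RS unspecified) and a script saved as UTF-8. The function name split_records is made up for illustration.

```shell
# split_records is a hypothetical helper name, not part of awk.
# Assumes an awk that accepts a regex in RS (for example gawk or mawk).
split_records() {
    awk 'BEGIN { RS = "[:.!]|✅" }            # separators: one of : . ! or the checkmark
    { gsub(/^[ \t\n]+|[ \t\n]+$/, "") }      # trim leading and trailing whitespace
    NF                                       # default action: print non-empty records
    '
}

s='✅  Team collaboration  ✅  Project organisation✅  SSO support✅  API Access✅  Priority Support'
printf '%s\n' "$s" | split_records
```

With the sample input from the question, this should print the five feature names one per line, without the leading empty record or the surrounding spaces.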

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew