Regex to detect if character is repeated more than three times
I've tried to follow the solution described here: https://stackoverflow.com/a/17973873/2149915 to match a string with the following requirement:
- More than 3 identical characters repeated sequentially in the string should be matched and returned.
Examples:
- hello how are you... -> VALID
- hello how are you............. -> INVALID
- hiii -> VALID
- hiiiiii -> INVALID
And so on; the idea is to detect text that is nonsensical.
So far my solution was to modify the regex in the link as follows.
ORIGINAL: ^(?!.*([A-Za-z0-9])\1{2})(?=.*[a-z])(?=.*\d)[A-Za-z0-9]+$
ADAPTED: ^(?!.*([A-Za-z0-9\.\,\/\|\\])\1{3})$
Essentially I removed the requirement for numbers and alphanumerics seen here: (?=.*[a-z])(?=.*\d)[A-Za-z0-9]+ and tried to add detection of extra characters such as ./,\ etc., but it doesn't seem to match at all with any characters...
Any ideas on how I can achieve this?
Thanks in advance :)
EDIT:
I found this regex: ^.*(\S)(?: ?\1){9,}.*$ on this question https://stackoverflow.com/a/44659071/2149915 and have adapted it to match only 3 repetitions, like so: ^.*(\S)(?: ?\1){3}.*$.
Now it detects things like:
- aaaa -> INVALID
- hello....... -> INVALID
- /////.... -> INVALID
However, it does not take into account whitespace-separated repetitions such as this:
. . . . .
Is there a modification that can be done to achieve this?
Solution 1:
To disallow four or more consecutive identical chars in the string, you need
^(?!.*(.)\1{3,}).*
See the regex demo. If you do not allow an empty string, replace the last .* with .+. Details:
- ^ - start of string
- (?!.*(.)\1{3,}) - a negative lookahead that fails the match if there are zero or more chars other than line break chars, as many as possible, and then a char captured into Group 1 that is followed with three or more occurrences of the same char
- .* - any zero or more chars other than line break chars, as many as possible (not necessary if you use a method that does not require a full string match, like Matcher#find or Regex.containsMatchIn).
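As the last point suggests, when a partial-match method is available the check can also be inverted: search for the offending run directly instead of validating the whole string. The following is only a minimal sketch of that idea, assuming Kotlin on the JVM; the names `repeatedRun` and `hasLongRun` are made up for illustration and do not come from the answer.

```kotlin
// Hypothetical names for illustration; not part of the original answer.
// Matches any char (except line breaks) followed by three or more copies of itself,
// i.e. a run of four or more identical chars.
val repeatedRun = Regex("""(.)\1{3,}""")

// True when the text contains at least one such run (the text is INVALID).
fun hasLongRun(text: String): Boolean = repeatedRun.containsMatchIn(text)

fun main() {
    val texts = listOf("hello how are you...", "hiii", "hello how are you.............", "hiiiiii")
    for (text in texts) {
        // find() returns the first offending run, or null when there is none
        val run = repeatedRun.find(text)?.value
        println("$text -> ${if (run == null) "VALID" else "INVALID ($run)"}")
    }
}
```

Note that the meaning of the boolean is flipped compared with the lookahead pattern: here a match means the text should be rejected.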
Here is a Kotlin demo (for a change):
fun main() {
    val texts = arrayOf("hello how are you...", "hiii", "hello how are you.............", "hiiiiii")
    // The negative lookahead rejects any string with a char repeated four or more times in a row
    val re = """^(?!.*(.)\1{3,}).*""".toRegex()
    for (text in texts) {
        // true means the text is valid (no run of four or more identical chars)
        val isValid = re.containsMatchIn(text)
        println("${text}: ${isValid}")
    }
}
Output:
hello how are you...: true
hiii: true
hello how are you.............: false
hiiiiii: false
NOTE:
If you do not want to limit the check to consecutive repetitions, modify the pattern above as follows:
^(?!.*(.)(?:.*?\1){3,}).*
See this regex demo. The (?:.*?\1){3,} regex matches three or more occurrences of any zero or more chars other than line break chars as few as possible and then the Group 1 value.
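To get a feel for how strict the non-consecutive variant is, here is a small sketch, assuming Kotlin on the JVM; `spreadOut` and `withinLimit` are hypothetical names chosen for this example. Because any chars may sit between the repetitions, it also rejects the whitespace-separated input `. . . . .` from the question, but it rejects ordinary sentences as soon as any single char occurs four times.

```kotlin
// Hypothetical names for illustration; the pattern is the one given in the note above.
// Rejects a string when any single char occurs four or more times anywhere in it.
val spreadOut = Regex("""^(?!.*(.)(?:.*?\1){3,}).*""")

// True when no char occurs more than three times in the text.
fun withinLimit(text: String): Boolean = spreadOut.containsMatchIn(text)

fun main() {
    val texts = listOf(
        "hiii",                    // 'i' occurs three times
        "hello how are you...",    // no char occurs more than three times
        ". . . . .",               // '.' occurs five times, separated by spaces
        "hello how are you doing"  // 'o' occurs four times across the sentence
    )
    for (text in texts) {
        println("$text: ${withinLimit(text)}")
    }
}
```

The last input shows the trade-off: a perfectly sensible sentence fails the check, so this variant is probably too aggressive for filtering free-form text on its own.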
To match across line breaks, replace . with [\s\S] or add (?s) at the start of the pattern.
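The effect of line breaks is easy to miss, so here is a brief sketch comparing the variants on a two-line input. It assumes Kotlin on the JVM, where the inline `(?s)` flag is supported; the variable names are hypothetical.

```kotlin
// Hypothetical names for illustration.
// Pattern from the answer: '.' stops at line breaks, so only the first line is inspected.
val firstLineOnly = Regex("""^(?!.*(.)\1{3,}).*""")

// Same pattern with the inline (?s) flag, so '.' also matches line breaks.
val dotAll = Regex("""(?s)^(?!.*(.)\1{3,}).*""")

// Same idea without flags: [\s\S] matches any char, including line breaks.
val anyChar = Regex("""^(?![\s\S]*([\s\S])\1{3,})[\s\S]*""")

fun main() {
    // The repeated run sits on the second line
    val text = "ok\nhiiiiii"
    println(firstLineOnly.containsMatchIn(text)) // true: the second line is never checked
    println(dotAll.containsMatchIn(text))        // false: the run on the second line is found
    println(anyChar.containsMatchIn(text))       // false: same result without the flag
}
```

One side effect of the last two variants is that the line break itself becomes a candidate char, so four or more consecutive identical line breaks would also be rejected.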
To limit the repetitions to letters, replace (.) with ([a-zA-Z]) or (\p{L}), and if you need to only check repeated digits, replace (.) with (\d) or ([0-9]).
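The character-class substitutions can be compared side by side. This is an illustrative sketch only, assuming Kotlin on the JVM; the variable names are hypothetical and the sample inputs are made up for the example.

```kotlin
// Hypothetical names for illustration; each pattern swaps (.) for a narrower class.
val lettersAscii = Regex("""^(?!.*([a-zA-Z])\1{3,}).*""")
val lettersUnicode = Regex("""^(?!.*(\p{L})\1{3,}).*""")
val digitsOnly = Regex("""^(?!.*(\d)\1{3,}).*""")

fun main() {
    // "h\u00e9\u00e9\u00e9\u00e9" is "h" followed by four copies of the letter e with an acute accent
    val texts = listOf("wait......", "hiiiiii", "10000", "h\u00e9\u00e9\u00e9\u00e9")
    for (text in texts) {
        // Each flag is true when the text passes that particular check
        val ascii = lettersAscii.containsMatchIn(text)
        val unicode = lettersUnicode.containsMatchIn(text)
        val digits = digitsOnly.containsMatchIn(text)
        println("$text: ascii=$ascii unicode=$unicode digits=$digits")
    }
}
```

With these classes, repeated punctuation such as `......` is no longer flagged, and the accented input is only caught by the `\p{L}` version, which may or may not be what you want.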
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
