Matching numeric substring in a URL to flag

I've been scouring the net and Stack Overflow for the past week and have tried everything I know, but I can't find my error.

I want to flag these URLs (in PySpark) as Brand:

  1. https://aaa.com/en-GB/GB/c10092.html
  2. https://aaa.com/en-GB/GB/c10040-p0.html
  3. https://aaa.aaa.com/en-GB/GB/p/100713

The fixed pattern I noticed is that "/c" and "p/" are each followed by at least 3 digits, so I wrote this:

f1 = df1.withColumn("Flag", when((col("uni_referer").rlike("%https://aaa.aaa.com/en-GB/GB/p/\d{3}%")),'Brand'))

But it's not flagging anything. Can someone please help? Thanks.
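
A likely cause, offered as a sketch rather than a confirmed fix: `rlike` takes a regular expression, not a SQL `LIKE` pattern, so the `%` characters are treated as literal percent signs, and none of the URLs contain one. The attempted pattern also covers only the "p/" form on `aaa.aaa.com`, not the "/c" URLs. The example below assumes the three URLs above are representative; `BRAND_PATTERN`, `is_brand`, and `flag_brand` are hypothetical names introduced for illustration. The pattern is checked with Python's `re.search`, which, like `rlike`, matches anywhere in the string; the constructs used here behave the same way in the Java regex engine that Spark uses.

```python
import re

# Regular expression, not a SQL LIKE pattern: no "%" wildcards.
# Matches "c" or "p/" after /en-GB/GB/, followed by at least 3 digits.
# The raw string keeps "\d" and "\." intact for the regex engine.
BRAND_PATTERN = r"^https://(?:aaa\.)?aaa\.com/en-GB/GB/(?:c|p/)\d{3,}"

# The three URLs from the question.
urls = [
    "https://aaa.com/en-GB/GB/c10092.html",
    "https://aaa.com/en-GB/GB/c10040-p0.html",
    "https://aaa.aaa.com/en-GB/GB/p/100713",
]


def is_brand(url):
    """Check the pattern locally, without a Spark session."""
    return re.search(BRAND_PATTERN, url) is not None


def flag_brand(df, column="uni_referer"):
    """Hypothetical helper: add a Flag column set to 'Brand' on a match.

    PySpark is imported here so the regex check above runs without Spark.
    Rows that do not match get null, because there is no otherwise().
    """
    from pyspark.sql.functions import col, when

    return df.withColumn("Flag", when(col(column).rlike(BRAND_PATTERN), "Brand"))
```

With a DataFrame like the one in the question, the call would be `f1 = flag_brand(df1)`. If other hosts or locales also need flagging, the prefix could be loosened to something like `r"/(?:c|p/)\d{3,}"`, at the cost of matching more URLs than intended.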



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
