pandas: identifying whether the first n characters are letters followed by numbers
I have a dataframe with a column that contains multiple reference values. I am trying to filter a certain group of references that follow this format:
ABCD12345678
Basically, the first 4 characters are letters, followed by 8 digits.
I tried:
df_new=df[df['col'].str.match('[a-zA-Z]', na = False)]
and
bew_2=df[df['col'].str.slice(0,4).str.contains('[a2-3]', na = False)]
But neither worked. It would be great if someone could guide me through this.
Solution 1:[1]
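I think you can use str.match with an explicit letter class. Note that \w would also match digits and underscores, so [a-zA-Z] is used for the letters:
# letters followed by digits, any length
m = df['col'].str.match(r'^[a-zA-Z]+\d+$', na=False)
# if the lengths are fixed: exactly 4 letters, then exactly 8 digits
m = df['col'].str.match(r'^[a-zA-Z]{4}\d{8}$', na=False)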
print(m)
0 True
Name: col, dtype: bool
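As a fuller illustration, here is a minimal runnable sketch of the fixed-length check. The sample values are made up for demonstration (only ABCD12345678 comes from the question), and it assumes pandas 1.1 or newer, where `Series.str.fullmatch` is available. On older versions, `str.match` with `^` and `$` anchors should behave similarly.

```python
import pandas as pd

# Hypothetical sample data; only 'ABCD12345678' appears in the question.
df = pd.DataFrame({
    "col": [
        "ABCD12345678",  # 4 letters + 8 digits -> keep
        "abcd12345678",  # lowercase letters also accepted -> keep
        "1234ABCD5678",  # starts with digits -> drop
        "ABC123456789",  # only 3 leading letters -> drop
        "ABCD1234567",   # only 7 digits -> drop
        None,            # missing value -> drop (na=False)
    ]
})

# Exactly 4 ASCII letters followed by exactly 8 ASCII digits.
pattern = r"[A-Za-z]{4}[0-9]{8}"

# fullmatch requires the whole string to match, so no ^ or $ is needed.
mask = df["col"].str.fullmatch(pattern, na=False)
df_new = df[mask]

print(df_new["col"].tolist())
```

If only uppercase references should pass, the letter class could be narrowed to `[A-Z]`.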
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ynjxsjmh |
