How to get the first 9 characters starting from the number 5?
I have data that looks like this:
0 504189219
1 500618053
2 0537533477
3 966581566618
4 00536079946
I want the output to look like this:
504189219
500618053
537533477
581566618
536079946
Solution 1:[1]
Use str.extract:
df['Col'] = df['Col'].str.extract(r'(5\d{8})', expand=False)
print(df)
# Output
Col
0 504189219
1 500618053
2 537533477
3 581566618
4 536079946
Setup:
df = pd.DataFrame({'Col': ['504189219', '500618053', '0537533477',
'966581566618', '00536079946']})
print(df)
# Output
Col
0 504189219
1 500618053
2 0537533477
3 966581566618
4 00536079946
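To see what the pattern in Solution 1 matches without pandas, here is a minimal sketch using the standard-library re module on the sample values from the question. The helper name first_match is hypothetical and only serves the illustration.

```python
import re

# Sample values copied from the question.
values = ['504189219', '500618053', '0537533477',
          '966581566618', '00536079946']

# A '5' followed by exactly eight more digits: nine characters in total.
pattern = re.compile(r'5\d{8}')

def first_match(value):
    """Return the first 9-digit run starting with '5', or None if there is none."""
    m = pattern.search(value)
    return m.group(0) if m else None

extracted = [first_match(v) for v in values]
print(extracted)
```

A value with no such run yields None here, which roughly mirrors the missing value that str.extract produces for a row that does not match.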
Solution 2:[2]
There is a library called phonenumbers that is designed for this kind of job; see this post.
Solution 3:[3]
Using the same setup as Corralien, this method is also possible:
df = pd.DataFrame({'Col': ['504189219', '500618053', '0537533477',
'966581566618', '00536079946']})
def getNumber(n):
    # Slice 9 characters starting at the first '5'
    start = n.find('5')
    return n[start:start + 9]
df['Col'] = df['Col'].apply(getNumber)
print(df)
The same result can be achieved with a lambda expression as well.
Other answers originally did not take into account the constraint of 9 digits.
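As a rough sketch of the lambda variant mentioned above, the same slicing can be written inline. It is shown here on a plain list so it runs without pandas; the name get_number is hypothetical, and the same callable could presumably be passed to df['Col'].apply.

```python
# Sample values copied from the question.
values = ['504189219', '500618053', '0537533477',
          '966581566618', '00536079946']

# Same slicing as getNumber, written inline as a lambda.
get_number = lambda n: n[n.find('5'):n.find('5') + 9]

result = [get_number(v) for v in values]
print(result)
```

Note that str.find returns -1 when the value contains no '5', so this version assumes every value has one.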
Solution 4:[4]
This may be a more robust approach:
import pandas as pd
def fix(col):
    return col[-9:] if len(col) > 8 and col[-9] == '5' else col
df = pd.DataFrame({'Col': ['0404189219', '500618053', '0537533477',
'966581566618', '00536079946']})
df['Col'] = df['Col'].apply(fix)
print(df)
Output:
Col
0 0404189219
1 500618053
2 537533477
3 581566618
4 536079946
Note how, when the ninth character from the end is not '5' (as in row 0), the original value remains intact.
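To illustrate how anchoring on the end of the string differs from searching for the first '5', here is a small comparison sketch. The function from_first_five is a hypothetical stand-in for the find-based approach, and '965581566618' is a made-up value, not part of the original data.

```python
def fix(col):
    # Keep the last 9 characters only when they start with '5'.
    return col[-9:] if len(col) > 8 and col[-9] == '5' else col

def from_first_five(n):
    # Slice 9 characters starting at the first '5' (the find-based approach).
    start = n.find('5')
    return n[start:start + 9]

# '0404189219' is from the answer above; '965581566618' is a made-up value
# in which a '5' appears before the trailing nine digits.
for value in ['0404189219', '965581566618']:
    print(value, fix(value), repr(from_first_five(value)))
```

With these inputs the two strategies disagree: the find-based slice depends on where the first '5' happens to be, while fix only looks at the trailing nine characters.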
Solution 5:[5]
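# Slice each value from its first '5', keep 9 characters, and write the result back (assumes the default integer index)
for r in range(len(df.Col)): df.loc[r, 'Col'] = df.Col[r][df.Col[r].find("5"):][:9]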
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Corralien |
| Solution 2 | KingOtto |
| Solution 3 | Titouan L |
| Solution 4 | Albert Winestein |
| Solution 5 | Haider Ali |
