How to remove the domain of websites in a pandas DataFrame
Here's the dataset
Id Websites
1 facebook.com
2 linked.in
3 stackoverflow.com
4 harvard.edu
5 ugm.ac.id
Here's my expected output
Id Name
1 facebook
2 linked
3 stackoverflow
4 harvard
5 ugm
Solution 1:[1]
You can split each website on "." and take the part that appears before the first dot.
df['Names'] = df['Websites'].str.split('.').str[0]
Output:
Id Websites Names
1 facebook.com facebook
2 linked.in linked
3 stackoverflow.com stackoverflow
4 harvard.edu harvard
5 ugm.ac.id ugm
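For reference, here is a minimal self-contained sketch of the approach above. The DataFrame is rebuilt by hand from the sample data in the question, and the helper variable `names` is only there for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Websites": ["facebook.com", "linked.in", "stackoverflow.com",
                 "harvard.edu", "ugm.ac.id"],
})

# Split each string on "." and keep the first piece.
df["Names"] = df["Websites"].str.split(".").str[0]

# Illustrative helper: the new column as a plain list.
names = df["Names"].tolist()
print(df)
```

Because only the first piece is kept, multi-part suffixes such as "ac.id" are dropped as a whole.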
Solution 2:[2]
You can use rsplit to split on the last occurrence of "." and keep everything before it. This removes only the final suffix, so a value like abc.cde.com becomes abc.cde. Note that ugm.ac.id would become ugm.ac, which differs from the expected output in the question.
df['Name'] = df['Websites'].str.rsplit('.', n=1).str[0]
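To see how the two solutions differ, the sketch below runs both on a few values. The Series `sites` and the variable names are made up for illustration; "abc.cde.com" is the example mentioned in this answer.

```python
import pandas as pd

# Hypothetical sample values: one simple domain, one with a
# two-part suffix, and one with a subdomain-like prefix.
sites = pd.Series(["facebook.com", "ugm.ac.id", "abc.cde.com"])

# Solution 1 style: keep everything before the FIRST dot.
first_label = sites.str.split(".", n=1).str[0].tolist()

# Solution 2 style: keep everything before the LAST dot.
without_last = sites.str.rsplit(".", n=1).str[0].tolist()

print(first_label)
print(without_last)
```

Which one fits depends on the data: splitting from the left matches the expected output in the question, while splitting from the right preserves inner labels.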
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DF.Richard |
| Solution 2 | |
