Grabbing the first element of an array when manipulating an entire column in pandas [duplicate]
I am working with a pandas dataframe and reformatting the data so that it is easier to analyze. Each value in the column is a string of the form "something | something", for example "Accident | repairable-damage".
I want to create two new columns in the dataframe by splitting each string on the separator and assigning the first part to one column and the second part to the other.
Incident_Category |
------------------------------
Accident | repairable-damage
Accident | repairable-damage
Accident | hull-loss
This is what the expected output is:
Incident_Category            | Incident_Type | Incident_Damage   |
------------------------------------------------------------------
Accident | repairable-damage | Accident      | repairable-damage
Accident | repairable-damage | Accident      | repairable-damage
Accident | hull-loss         | Accident      | hull-loss
This is the code that I currently have:
print(dropped_dataset['Incident_Category'].unique())
dropped_dataset['Incident_type_array'] = dropped_dataset['Incident_Category'].str.split("|")
dropped_dataset['Incident_type'] = dropped_dataset['Incident_type_array'][0][0]
dropped_dataset['Incident_damage'] = dropped_dataset['Incident_type_array'][[1]]
dropped_dataset.head(7)
It is currently grabbing the first record and assigning the first row's details to every row of the new columns.
I want each row's Incident_Category to be split and assigned individually.
Solution 1:[1]
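We can use pandas.Series.str.split with expand=True, which returns one column per piece. Passing regex=False makes " | " a literal separator; read as a regular expression, the | character would mean alternation: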
dropped_dataset[['Incident_Type', 'Incident_Damage']] = dropped_dataset.Incident_Category.str.split(" | ", expand=True, regex=False)
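To try this out in isolation, here is a minimal sketch that rebuilds the question's column from the three sample rows (the DataFrame contents are an assumption based on the example above, and the names parts, first_elements and second_elements are introduced only for illustration). It also shows what the question's title asks for: taking the first element of every row's list with the .str[i] accessor, rather than [0][0], which selects a single value from the first row and broadcasts it to the whole column.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's Incident_Category column.
dropped_dataset = pd.DataFrame(
    {
        "Incident_Category": [
            "Accident | repairable-damage",
            "Accident | repairable-damage",
            "Accident | hull-loss",
        ]
    }
)

# Split each row on the literal " | " separator. expand=True returns one
# column per piece, so both new columns can be assigned at once.
# Note: the regex keyword of str.split needs pandas 1.4 or later.
dropped_dataset[["Incident_Type", "Incident_Damage"]] = (
    dropped_dataset["Incident_Category"].str.split(" | ", expand=True, regex=False)
)

# Alternative: keep a column of lists and index into every row's list.
# .str[0] picks element 0 of each list; .str.strip() removes the spaces
# left around the pieces when splitting on "|" alone.
parts = dropped_dataset["Incident_Category"].str.split("|")
first_elements = parts.str[0].str.strip()
second_elements = parts.str[1].str.strip()

print(dropped_dataset)
```

Both approaches give the same values here. The expand=True form is shorter when every row has exactly two parts, while the .str[i] form may be more convenient when only one of the pieces is needed.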
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
