'Create and trip report with end latitude and logitude
Please help, I have a data set structured like below
ss={'ride_id': {0: 'ride1',1: 'ride1',2: 'ride1',3: 'ride2',4: 'ride2',
5: 'ride2',6: 'ride2',7: 'ride3',8: 'ride3',9: 'ride3',10: 'ride3'},
'lat': {0: 5.616526,1: 5.623686, 2: 5.616555,3: 5.616556,4: 5.613834, 5: 5.612899,
6: 5.610804,7: 5.616614,8: 5.644431,9: 5.650771, 10: 5.610828},
'long': {0: -0.231901,1: -0.227248,2: -0.23192,3: -0.23168,4: -0.223812,
5: -0.22869,6: -0.226193,7: -0.231461,8: -0.237549,9: -0.271337,10: -0.226157},
'distance': {0: 0.0,1: 90.021,2: 138.0751,3: 0.0,4: 90.0041,5: 180.0293,6: 180.562, 7:0.0,8: 90.004,9: 180.0209,10: 189.0702},}
df=pd.DataFrame(ss)
the ride_id column indicates the number of trips taken in a window to make up the ride.
For example, ride1 consists of 2 trips, the first trip starts at index 0 and ends at index 1, then trip 2 starts at index 1 and ends at index 2.
I want to create a new data frame of trip reports, where each row will have the start coordinates (lat, long) and trip end coordinates(end_lat,end_long) taken from the next row and then distance. The results should look like the data frame below
sf={'ride_id': {0: 'ride1',1: 'ride1',2: 'ride2',3: 'ride2',4: 'ride2',},
'lat': {0: 5.616526,1: 5.623686,2: 5.616556,3: 3.613834, 4: 5.612899},
'long': {0: -0.231901,1: -0.227248,2: -0.23168,3: -0.223812,4: -0.22869},
'end_lat':{0: 5.623686,1: 5.616555,2: 5.613834,3: 5.612899,4: 5.610804},
'end_long':{0: -0.227248,1: -0.23192,2: -0.223812,3: -0.22869,4: -0.226193},
'distance': {0: 90.02100,1: 138.07510,2: 90.00410,3: 180.02930,4: 180.5621},}
df_s=pd.DataFrame(sf)
df_s
OUT:
ride_id lat long end_lat end_long distance
0 ride1 5.616526 -0.231901 5.623686 -0.227248 90.0210
1 ride1 5.623686 -0.227248 5.616555 -0.231920 138.0751
2 ride2 5.616556 -0.231680 5.613834 -0.223812 90.0041
3 ride2 3.613834 -0.223812 5.612899 -0.228690 180.0293
4 ride2 5.612899 -0.228690 5.610804 -0.226193 180.5621
I tried to group the data frame by the ride_id to isolate each ride_id, but I'm stuck, any ideas are warmly welcomed.
Solution 1:[1]
We can do groupby with shift then dropna
df['start_lat'] = df.groupby('ride_id')['lat'].shift()
df['start_long'] = df.groupby('ride_id')['long'].shift()
df = df.dropna()
df
Out[480]:
ride_id lat long distance start_lat start_long
1 ride1 5.623686 -0.227248 90.0210 5.616526 -0.231901
2 ride1 5.616555 -0.231920 138.0751 5.623686 -0.227248
4 ride2 5.613834 -0.223812 90.0041 5.616556 -0.231680
5 ride2 5.612899 -0.228690 180.0293 5.613834 -0.223812
6 ride2 5.610804 -0.226193 180.5620 5.612899 -0.228690
8 ride3 5.644431 -0.237549 90.0040 5.616614 -0.231461
9 ride3 5.650771 -0.271337 180.0209 5.644431 -0.237549
10 ride3 5.610828 -0.226157 189.0702 5.650771 -0.271337
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BENY |
