Python/Numpy get average of array based on index
I have two numpy arrays: the first holds the values and the second holds the indexes. I want to average the entries of the values array that share the same entry in the indexes array.
For example:
values = [1,2,3,4,5]
indexes = [0,0,1,1,2]
get_indexed_avg(values, indexes)
# should give me
# [1.5, 3.5, 5]
Here, the values in the indexes array represent the indexes in the final array. Hence:
- The first two items in the values array are averaged to form index 0 of the final array.
- The 3rd and 4th items in the values array are averaged to form index 1 of the final array.
- Finally, the last item is used for index 2 of the final array.
I do have a Python solution to this, but it is ugly and very slow. Is there a better way, perhaps using numpy or a similar library?
Solution 1:[1]
import pandas as pd
pd.Series(values).groupby(indexes).mean()
# OR
# pd.Series(values).groupby(indexes).mean().to_list()
# 0    1.5
# 1    3.5
# 2    5.0
# dtype: float64
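Since the question asks about numpy specifically, here is a minimal numpy-only sketch as an alternative to the pandas approach. It is not part of the original answer. It uses np.bincount twice: once with weights to get the per-index sums and once without to get the per-index counts. The function name get_indexed_avg is taken from the question. The sketch assumes the indexes are non-negative integers and that every index from 0 up to the maximum appears at least once. An index that never occurs would produce a 0/0 division and a nan in the result.

```python
import numpy as np


def get_indexed_avg(values, indexes):
    """Average `values` grouped by the integer labels in `indexes`."""
    values = np.asarray(values, dtype=float)
    indexes = np.asarray(indexes)
    # Sum of the values that fall into each index bucket.
    sums = np.bincount(indexes, weights=values)
    # Number of values that fall into each index bucket.
    counts = np.bincount(indexes)
    # Assumes every bucket is non-empty; an empty bucket gives nan.
    return sums / counts


values = [1, 2, 3, 4, 5]
indexes = [0, 0, 1, 1, 2]
result = get_indexed_avg(values, indexes)
print(result.tolist())  # [1.5, 3.5, 5.0]
```

This avoids a Python-level loop, so it should scale well to large arrays. If the indexes are not already small non-negative integers, they would need to be mapped to such labels first, for example with np.unique(indexes, return_inverse=True).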
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | d.b |
