Data accumulation with pandas
I'd like to accumulate the strength values like this:
| timestamp | id | strength |
|---|---|---|
| 1383260400000 | 1 | sum of strength over rows where squareid1 or squareid2 == 1 |
| 1383260400000 | 2 | sum of strength over rows where squareid1 or squareid2 == 2 |
If the timestamp is different, a new row should be created.
Here is my code, followed by the result it produces.
df2["strength"] = pd.to_numeric(df["strength"])
for i in range(len(df)):
    if (df.iloc[i - 1, 0] != df.iloc[i, 0]) or \
            df.iloc[i - 1, 1] != df.iloc[i, 1]:
        k = k + 1
        print(k)
    df2.loc[k, "timestamp"] = df.iloc[i, 0]
    df2.loc[k, "id"] = df.iloc[i, 1]
    df2.loc[k, "strength"] = df2.loc[k, "strength"] + df.iloc[i, 3]
| timestamp | id | strength |
|---|---|---|
| 4.890249263573871e-05 | ||
| 1383260400000 | 1 | 0.00692127440351541 |
| 1383260400000 | 2 | 0.0031095399526760983 |
| 0.0001136447068307 | ||
| 0.0001557162945607 | ||
| 0.0002134019939307 | ||
Here is the raw input data.
| timestamp | squareid1 | squareid2 | strength |
|---|---|---|---|
| 1383260400000 | 1 | 751 | 1.0024511991972666e-06 |
| 1383260400000 | 1 | 752 | 4.636474149834957e-05 |
| 1383260400000 | 1 | 753 | 0.0001289563565283 |
| 1383260400000 | 1 | 754 | 6.849944595888879e-05 |
| 1383260400000 | 1 | 755 | 9.220729726979178e-07 |
| 1383260400000 | 2 | 1 | 5.1182385382982246e-05 |
| 1383260400000 | 2 | 55 | 5.1182385382982246e-05 |
| 1383260400000 | 2 | 56 | 3.0469237948520085e-06 |
| 1383260400000 | 2 | 152 | 8.513520849740889e-05 |
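To make the sample reproducible, the rows above can be loaded into a DataFrame along these lines. This is only a sketch: the column names are taken from the table header, and the variable name `rows` is introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from the raw input table above.
rows = [
    (1383260400000, 1, 751, 1.0024511991972666e-06),
    (1383260400000, 1, 752, 4.636474149834957e-05),
    (1383260400000, 1, 753, 0.0001289563565283),
    (1383260400000, 1, 754, 6.849944595888879e-05),
    (1383260400000, 1, 755, 9.220729726979178e-07),
    (1383260400000, 2, 1, 5.1182385382982246e-05),
    (1383260400000, 2, 55, 5.1182385382982246e-05),
    (1383260400000, 2, 56, 3.0469237948520085e-06),
    (1383260400000, 2, 152, 8.513520849740889e-05),
]

# Column names follow the table header (an assumption about the real file).
df = pd.DataFrame(rows, columns=["timestamp", "squareid1", "squareid2", "strength"])
print(df.dtypes)
```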
Solution 1:[1]
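# Assign each row to a group: squareid2 when it equals squareid1 - 1, otherwise squareid1.
df['group_me'] = df.apply(lambda x: x.squareid1 if x.squareid1 - 1 != x.squareid2 else x.squareid2, axis=1)
# Sum strength within each (timestamp, group) pair.
output = df.groupby(['timestamp', 'group_me'], as_index=False)['strength'].sum()
print(output)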
Output:
timestamp group_me strength
0 1383260400000 1.0 0.000297
1 1383260400000 2.0 0.000139
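Note that the `group_me` rule above only reassigns a row when `squareid2` equals `squareid1 - 1`, and each row lands in exactly one group. If the goal is read literally as "sum strength over every row where squareid1 or squareid2 equals the id", one possible alternative is to reshape with `melt` so each row appears once per id it mentions. The sketch below is an interpretation of the question, not the answerer's method. The names `rows`, `long_df`, `totals`, and `wanted` are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question's raw input table.
rows = [
    (1383260400000, 1, 751, 1.0024511991972666e-06),
    (1383260400000, 1, 752, 4.636474149834957e-05),
    (1383260400000, 1, 753, 0.0001289563565283),
    (1383260400000, 1, 754, 6.849944595888879e-05),
    (1383260400000, 1, 755, 9.220729726979178e-07),
    (1383260400000, 2, 1, 5.1182385382982246e-05),
    (1383260400000, 2, 55, 5.1182385382982246e-05),
    (1383260400000, 2, 56, 3.0469237948520085e-06),
    (1383260400000, 2, 152, 8.513520849740889e-05),
]
df = pd.DataFrame(rows, columns=["timestamp", "squareid1", "squareid2", "strength"])

# Stack squareid1 and squareid2 into a single "id" column, keeping the
# original row number so a row with squareid1 == squareid2 is counted once.
long_df = (
    df.reset_index()
      .melt(id_vars=["index", "timestamp", "strength"],
            value_vars=["squareid1", "squareid2"],
            value_name="id")
      .drop_duplicates(subset=["index", "id"])
)

# Sum strength per (timestamp, id); a new timestamp yields a new row.
totals = long_df.groupby(["timestamp", "id"], as_index=False)["strength"].sum()

# Keep only the ids of interest (1 and 2 here, as in the question).
wanted = totals[totals["id"].isin([1, 2])]
print(wanted)
```

Under this reading, the row with `squareid1 == 2` and `squareid2 == 1` contributes to both id 1 and id 2, so the total for id 2 is larger than in the output of Solution 1.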
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BeRT2me |
