'Is there a way to concatenate columns by groups of 2 same values in pandas
Say i have something like this in a pandas dataframe :
| Entity | Type | Doc | Proj |
|---|---|---|---|
| Daniel | PER | 1 | 1 |
| Daniel | PER | 4 | 2 |
| Daniel | PER | 5 | 3 |
| Daniel | PER | 9 | 6 |
| Daniel | LOC | 7 | 4 |
| 905-888-8988 | ID | 3 | 1 |
| 905-888-8988 | ID | 4 | 2 |
| 905-888-8988 | ID | 14 | 8 |
For each combo of entity and type that is recurring, i'd like to add two new columns for the doc and proj corresponding to the match. I'd like to do this to all possible match between combo of entity.
Edit 1 More detailed explanation to get to the expected outcome
Step 1 - Identify if an "Entity" and "Type" combo has more than 1 occurence in the dataframe.
Step 2 - For each combo that have more than 1 occurence, i would need to represent all possible combination of "Doc" and "Proj" of the combos.
Step 3 - All these possible combination should be represented in pairs of "doc" and "proj"
So the result would look like this in the pandas dataframe :
| Entity | Type | Doc1 | Proj1 | Doc2 | Proj2 |
|---|---|---|---|---|---|
| Daniel | PER | 1 | 1 | 4 | 2 |
| Daniel | PER | 1 | 1 | 5 | 3 |
| Daniel | PER | 1 | 1 | 9 | 6 |
| Daniel | PER | 4 | 2 | 5 | 3 |
| Daniel | PER | 4 | 2 | 9 | 6 |
| Daniel | PER | 5 | 3 | 9 | 6 |
| 905-888-8988 | ID | 3 | 1 | 4 | 2 |
| 905-888-8988 | ID | 3 | 1 | 14 | 8 |
| 905-888-8988 | ID | 4 | 2 | 14 | 8 |
Thanks all for the help
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
