Join two columns of integers in a pandas dataframe to a column of tuples

I want to combine two columns of features into one column, where each row will represent a data point as a tuple.

For example, here is my data frame:

      Weather  Temp  Play
0         2     1     0
1         2     1     0
2         0     1     1
3         1     2     1
4         1     0     1
5         1     0     0

I want it to look something like this:

                 x     Play
0              (2,1)     0
1              (2,1)     0
2              (0,1)     1
3              (1,2)     1
4              (1,0)     1
5              (1,0)     0

I then want to pass this to model.fit(df['x'], df['Play']) for Bernoulli Naive Bayes.

Is this at all possible? I am trying to avoid using lists. How can I do this for n columns next time?

Solution 1:^[1]

You can use zip to pair the two columns row by row:

df['x'] = list(zip(df.Weather, df.Temp))

Output (from a different sample frame than the one in the question):

   Weather  Temp  Play       x
0        1     1     4  (1, 1)
1        2     1     5  (2, 1)
2        3     1     6  (3, 1)
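
The question also asks about n columns. A minimal sketch of one way to generalize the zip approach, assuming the column names are collected in a list (feature_cols is a hypothetical name, not from the original answer):

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# Hypothetical list of feature columns; any number of names works.
feature_cols = ["Weather", "Temp"]

# Unpack one Series per column into zip to build one tuple per row.
df["x"] = list(zip(*(df[col] for col in feature_cols)))

# An equivalent alternative: plain tuples straight from the selected columns.
alt = list(df[feature_cols].itertuples(index=False, name=None))

print(df)
```

Both variants should produce the same tuples; itertuples with name=None returns plain tuples rather than namedtuples.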

Solution 2:^[2]

df.apply() with axis=1 calls a function on each row, which makes it useful for unusual cases such as this one:

df['x'] = df.apply(lambda x: (x.Weather, x.Temp), axis=1)

Output:

   Weather  Temp  Play       x
0        2     1     0  (2, 1)
1        2     1     0  (2, 1)
2        0     1     1  (0, 1)
3        1     2     1  (1, 2)
4        1     0     1  (1, 0)
5        1     0     0  (1, 0)
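
For more than two columns, the lambda can be replaced by the built-in tuple applied to a column subset. This is a sketch under the assumption that the feature columns are listed in feature_cols (a hypothetical name); row-wise apply is generally slower than zip on large frames, so it is probably best kept for small data or more complex per-row logic.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# Hypothetical list of feature columns.
feature_cols = ["Weather", "Temp"]

# Select the feature columns, then convert each row into a tuple.
df["x"] = df[feature_cols].apply(tuple, axis=1)

print(df)
```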

Solution 3:^[3]

To complement @SruthiV's answer: if you want the format shown in the question (where the new column replaces the two original ones), you can remove the columns as you use them with pop:

df['x'] = list(zip(df.pop('Weather'), df.pop('Temp')))

Output:

   Play       x
0     0  (2, 1)
1     0  (2, 1)
2     1  (0, 1)
3     1  (1, 2)
4     1  (1, 0)
5     0  (1, 0)

Similarly, to insert the new column at the position of the first of the original columns:

df.insert(df.columns.get_loc('Weather'), 'x',
          list(zip(df.pop('Weather'), df.pop('Temp'))))

NB. This operation modifies df in place.

Output:

        x  Play
0  (2, 1)     0
1  (2, 1)     0
2  (0, 1)     1
3  (1, 2)     1
4  (1, 0)     1
5  (1, 0)     0
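
The pop and insert steps can be wrapped in a small function that handles any number of columns. This is only a sketch: combine_columns and its parameters are hypothetical names, not part of pandas, and the function assumes unique column labels.

```python
import pandas as pd


def combine_columns(df, cols, new_name):
    """Replace `cols` with one tuple column at the leftmost of their positions.

    Hypothetical helper, not part of pandas; it modifies `df` in place.
    """
    # Look up the target slot before any column is removed.
    position = min(df.columns.get_loc(col) for col in cols)
    # pop removes each column from df and returns it as a Series.
    tuples = list(zip(*(df.pop(col) for col in cols)))
    df.insert(position, new_name, tuples)
    return df


# Sample frame copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

combine_columns(df, ["Weather", "Temp"], "x")
print(df)
```

Taking the minimum position keeps the insert index valid even when the columns are listed in a different order than they appear in the frame.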

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Sruthi V
Solution 2	BeRT2me
Solution 3	mozway
