How to find the number of categorical and numerical columns in a dataset

I need to count how many numerical and categorical columns my dataset has, using this mapping:

Categorical - object type

Numerical - int, float

Boolean - bool

import pandas as pd

df = pd.read_csv("titanic.csv")

df._get_numeric_data().columns only gives me the names of the numeric columns. I need the number of columns of each kind.



Solution 1:[1]

You can use columns = df.applymap(np.isreal).all() (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map). The output will be:

PassengerId     True
Pclass          True
Name           False
Sex            False
Age             True
SibSp           True
Parch           True
Ticket         False
Fare            True
Cabin          False
Embarked       False
dtype: bool

Columns in which every value is numeric return True; all other columns return False.

You can also get the True and False counts with:

print(columns.value_counts())

Output :

True     6
False    5
dtype: int64

This means df has 6 numerical and 5 categorical columns.

Solution 2:[2]

First check dtype per column:

df = pd.DataFrame({'float': [1.0],
                   'int': [1],
                   'datetime': [pd.Timestamp('20180310')],
                   'string': ['foo'],
                   'float2': [1.0]
                  })
df.dtypes
float              float64
int                  int64
datetime    datetime64[ns]
string              object
float2             float64
dtype: object

Then count how many of each type you have:

df.dtypes.value_counts()
float64           2
datetime64[ns]    1
object            1
int64             1
dtype: int64
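
The per-dtype counts above still list int64 and float64 separately. The sketch below folds them into the coarse groups the question asks for. The helper name column_kind and its labels are assumptions made for illustration, not part of the original answer. Note that pandas reports bool as a numeric dtype, so the boolean check has to come first.

```python
import pandas as pd
from pandas.api.types import (
    is_bool_dtype,
    is_datetime64_any_dtype,
    is_numeric_dtype,
)

df = pd.DataFrame({'float': [1.0],
                   'int': [1],
                   'datetime': [pd.Timestamp('20180310')],
                   'string': ['foo'],
                   'float2': [1.0]})


def column_kind(dtype):
    """Map a dtype to a coarse label (hypothetical helper; labels are this sketch's choice)."""
    if is_bool_dtype(dtype):            # check bool first: is_numeric_dtype is also True for bool
        return 'boolean'
    if is_numeric_dtype(dtype):         # int and float columns
        return 'numerical'
    if is_datetime64_any_dtype(dtype):  # keep datetimes out of the categorical bucket
        return 'datetime'
    return 'categorical'                # everything else, e.g. object/string columns


# Label each column by its dtype, then count the labels.
kind_counts = df.dtypes.map(column_kind).value_counts()
print(kind_counts)
```

Individual counts can then be read off with, for example, kind_counts['numerical'].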

Solution 3:[3]

df = pd.DataFrame({'a': [1, 2],
                   'b': [True, False],
                   'c': [1.0, 2.0],
                   'd': ['play', 'draw']})

You can use df.dtypes.value_counts() to count the columns of each dtype.

Output:

int64      1
bool       1
float64    1
object     1
dtype: int64

To get all numeric columns: df.select_dtypes(include='number')

Output:

    a   c
0   1   1.0
1   2   2.0

To get all categorical columns: df.select_dtypes(include='object')

Output:

    d
0   play
1   draw
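
Since the question asks for counts rather than the columns themselves, one way to get them is to take the column count of each selection. This is a minimal sketch on the same example frame. Treating every column that is neither numeric nor boolean as categorical is an assumption of this sketch; it would also sweep in datetime columns if the data had any.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'b': [True, False],
                   'c': [1.0, 2.0],
                   'd': ['play', 'draw']})

# shape[1] is the number of columns in each selected sub-frame.
n_numerical = df.select_dtypes(include='number').shape[1]   # int and float; bool is not included
n_boolean = df.select_dtypes(include='bool').shape[1]
# Assumption: whatever is left over counts as categorical.
n_categorical = df.select_dtypes(exclude=['number', 'bool']).shape[1]

print(n_numerical, n_boolean, n_categorical)
```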

Solution 4:[4]

Try this:

# From https://scikit-learn.org/stable/auto_examples/ensemble/plot_stack_predictors.html
import numpy as np
from sklearn.compose import make_column_selector

# Names of the object-dtype (categorical) columns
cat_cols = make_column_selector(dtype_include=object)(df)
print(cat_cols)

# Names of the numeric columns
num_selector = make_column_selector(dtype_include=np.number)
num_cols = num_selector(df)
print(num_cols)
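
The selectors return lists of column names, so the counts the question asks for are just the lengths of those lists. The sketch below assumes a small stand-in DataFrame (not the Titanic data) and uses dtype_exclude to collect the non-numeric, non-boolean columns. That grouping is a choice made for this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector

# Hypothetical stand-in data for illustration.
df = pd.DataFrame({'a': [1, 2],
                   'b': [True, False],
                   'c': [1.0, 2.0],
                   'd': ['play', 'draw']})

# Each selector call returns a list of matching column names.
num_cols = make_column_selector(dtype_include=np.number)(df)
# Assumption: columns that are neither numeric nor bool are treated as categorical.
cat_cols = make_column_selector(dtype_exclude=[np.number, 'bool'])(df)

print(len(num_cols), "numerical:", num_cols)
print(len(cat_cols), "categorical:", cat_cols)
```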

 

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 alejandro
Solution 3
Solution 4 user96265