Normalizing rows of a pandas DataFrame when there are string columns?

I'm trying to normalize a pandas DataFrame by row, but one column holds string values, which is causing me a lot of trouble. Does anyone have a neat way to make this work?

For example:

               system  Fluency  Terminology  No-error  Accuracy  Locale convention  Other
19  hyp.metricsystem2      111           28       219        98                  0    133
18  hyp.metricsystem1       97           22       242        84                  0    137
22  hyp.metricsystem5      107           11       246        85                  0    127
17   hyp.eTranslation       49           30       262        80                  0    143
20  hyp.metricsystem3       86           23       263        89                  0    118
21  hyp.metricsystem4       74           17       274        70                  0    111
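
To make the example reproducible, the frame above could be rebuilt roughly as follows. This is only a sketch: the name `bad_models` matches the command used in this question, and the dtypes (plain integers plus one string column) are an assumption based on the printed values.

```python
import pandas as pd

# Rebuild the sample frame; index labels and values are copied from the printout above.
bad_models = pd.DataFrame(
    {
        "system": [
            "hyp.metricsystem2",
            "hyp.metricsystem1",
            "hyp.metricsystem5",
            "hyp.eTranslation",
            "hyp.metricsystem3",
            "hyp.metricsystem4",
        ],
        "Fluency": [111, 97, 107, 49, 86, 74],
        "Terminology": [28, 22, 11, 30, 23, 17],
        "No-error": [219, 242, 246, 262, 263, 274],
        "Accuracy": [98, 84, 85, 80, 89, 70],
        "Locale convention": [0, 0, 0, 0, 0, 0],
        "Other": [133, 137, 127, 143, 118, 111],
    },
    index=[19, 18, 22, 17, 20, 21],
)

# The 'system' column is the only non-numeric one (assumed dtypes, see above).
print(bad_models.dtypes)
```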

I want to normalize the numeric columns (Fluency, Terminology, ..., Other) by the row total. In other words, divide each integer entry by the sum of its row (Fluency[0]/total_row[0], Terminology[0]/total_row[0], ...).

I tried the following command, but it raises an error because one of the columns contains strings:

bad_models.div(bad_models.sum(axis=1), axis = 0)

Any help would be greatly appreciated...



Solution 1:[1]

Use select_dtypes to select only the numeric columns, normalize those by row, and assign the result back:

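# Keep only the numeric columns; the string column 'system' is left out
subset = bad_models.select_dtypes('number')

# Divide each row by its own total, then write the result back
bad_models[subset.columns] = subset.div(subset.sum(axis=1), axis=0)
print(bad_models)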

# Output
               system   Fluency  Terminology  No-error  Accuracy  Locale convention     Other
19  hyp.metricsystem2  0.188455     0.047538  0.371817  0.166384                0.0  0.225806
18  hyp.metricsystem1  0.166667     0.037801  0.415808  0.144330                0.0  0.235395
22  hyp.metricsystem5  0.185764     0.019097  0.427083  0.147569                0.0  0.220486
17   hyp.eTranslation  0.086879     0.053191  0.464539  0.141844                0.0  0.253546
20  hyp.metricsystem3  0.148532     0.039724  0.454231  0.153713                0.0  0.203800
21  hyp.metricsystem4  0.135531     0.031136  0.501832  0.128205                0.0  0.203297
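
If you would rather not modify the frame in place, the same idea can be wrapped in a small helper. This is a minimal sketch, not the only way to do it: `normalize_rows` is a hypothetical name, the tiny `sample` frame is made up for illustration, and returning NaN for rows whose total is zero is an assumption about what you want in that case.

```python
import pandas as pd


def normalize_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with each numeric column divided by the row's numeric total."""
    numeric = df.select_dtypes("number")
    totals = numeric.sum(axis=1)
    # Rows that sum to zero get a NaN total, so their result is NaN
    # rather than a division by zero (assumed behaviour, adjust as needed).
    totals = totals.where(totals != 0)
    normalized = numeric.div(totals, axis=0)
    # Swap the numeric columns for their normalized versions and
    # restore the original column order.
    return df.drop(columns=numeric.columns).join(normalized)[df.columns]


# Hypothetical two-row frame: one ordinary row and one all-zero row.
sample = pd.DataFrame({"system": ["a", "b"], "Fluency": [1, 0], "Other": [3, 0]})
result = normalize_rows(sample)
print(result)
```

The input frame is left untouched, and non-numeric columns such as system pass through unchanged.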

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1