TypeError: '<=' not supported between instances of 'str' and 'float'
I want to count the rows of the clin dataframe where the OS_MONTHS value is <= 12.0. The values in OS_MONTHS are floats.
This seems like a trivial question.
import pandas as pd
len(clin["OS_MONTHS"] <= 12.0)
Traceback:
TypeError: '<=' not supported between instances of 'str' and 'float'
Data type:
type(clin["OS_MONTHS"])
pandas.core.series.Series
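Note that type(clin["OS_MONTHS"]) only reports the container type (a Series), not the type of the values stored in it. A minimal sketch for inspecting the element types is shown below. The small clin dataframe is a hypothetical stand-in, since the full data is not shown, and it assumes the column holds strings, as the error message suggests.

```python
import pandas as pd

# Hypothetical stand-in for the real `clin` dataframe: the column holds
# numeric strings plus a placeholder string, so it is not a float column.
clin = pd.DataFrame({"OS_MONTHS": ["11.76", "4.73", "[Not Available]", "23.16"]})

# dtype of the column itself (typically object or a string dtype here,
# depending on the pandas version, rather than float64).
print(clin["OS_MONTHS"].dtype)

# Count how many values of each Python type the column holds.
type_counts = clin["OS_MONTHS"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

If the count shows str rather than float, the comparison with 12.0 fails exactly as in the traceback.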
Dataframe:
| | SEX | KPS | A header | AGE | OS_MONTHS |
|---|---|---|---|---|---|
| 0 | 1 | 80 | 44 | 1 | 11.76 |
| 1 | 0 | 100 | 50 | 1 | 4.73 |
| 2 | 1 | 80 | 40 | 1 | 23.16 |
| 3 | 1 | 80 | 61 | 1 | 10.58 |
| 4 | 1 | 80 | 20 | 1 | 35.38 |
Solution 1:[1]
clin["OS_MONTHS"].astype(float) <= 12.0
To count the matching rows, look at the True entry of:
(clin["OS_MONTHS"].astype(float) <= 12.0).value_counts()
or
s = clin["OS_MONTHS"]
len(s[s.astype(float) <= 12.0])
Inspect the unique values of your data with unique(): some values are not in float format, and you must handle them in some way, for example:
clin["OS_MONTHS"][clin["OS_MONTHS"] != '[Not Available]']
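Putting these pieces together, here is a minimal sketch that drops the placeholder rows, converts the rest to float, and counts the matches. The sample values are hypothetical, and it assumes "[Not Available]" is the only non-numeric value in the column.

```python
import pandas as pd

# Hypothetical sample data; "[Not Available]" is the placeholder mentioned above.
clin = pd.DataFrame(
    {"OS_MONTHS": ["11.76", "4.73", "[Not Available]", "23.16", "10.58", "35.38"]}
)

# Keep only the rows whose value is not the placeholder, then convert to float.
valid = clin.loc[clin["OS_MONTHS"] != "[Not Available]", "OS_MONTHS"].astype(float)

# A boolean mask has one entry per row, so len(mask) is the number of rows,
# while mask.sum() counts only the True entries.
mask = valid <= 12.0
n_rows = int(mask.sum())
print(n_rows)  # 3
```

Summing the mask avoids a common pitfall: calling len() on the comparison result returns the total number of rows, not the number of matches.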
Solution 2:[2]
Check this out:
clin.loc[~clin["OS_MONTHS"].str.replace('.', '', regex=False).str.isdigit(), "OS_MONTHS"] = float('NaN')
# Then you can apply @MoRe's solution
clin["OS_MONTHS"].astype(float) <= 12.0
Solution 3:[3]
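You could try ._convert(numeric=True). Unlike .astype(float), this turns every value it cannot convert to a float into NaN. Note that _convert is a private pandas method (leading underscore), so it may not be available in every pandas version.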
So that would be:
len(clin[clin["OS_MONTHS"]._convert(numeric=True) <= 12.0])
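The same idea can be sketched with the public function pd.to_numeric, whose errors="coerce" option replaces unparseable values with NaN. This is an alternative to the method used in this answer, not the answer's own code, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data with one value that cannot be parsed as a number.
clin = pd.DataFrame(
    {"OS_MONTHS": ["11.76", "4.73", "[Not Available]", "23.16", "10.58", "35.38"]}
)

# errors="coerce" turns every value that cannot be parsed into NaN.
months = pd.to_numeric(clin["OS_MONTHS"], errors="coerce")

# Comparisons with NaN evaluate to False, so coerced rows are never counted.
n_rows = int((months <= 12.0).sum())
n_missing = int(months.isna().sum())
print(n_rows, n_missing)  # 3 1
```

Checking n_missing is a cheap way to see how many values were silently coerced before trusting the count.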
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | Daniel Weigel |
