RandomForestClassifier with large feature datatypes

Is it possible to mix small datatypes (such as single bits) and large datatypes (such as 256-bit hashes) as input features for a scikit-learn model such as RandomForestClassifier?

I have the following scenario:

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()

X = [[1, 2, 3, 'verylongfeature1'], [1, 1, 2, 'verylongfeature2']]
y = [1, 0]

clf.fit(X, y)

which gives the following error:

ValueError: could not convert string to float: 'verylongfeature1'

Is the RandomForestClassifier limited to 64-bit float input features?
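
For context, here is a minimal sketch of one possible workaround. It assumes the long feature is a categorical identifier rather than a quantity. The scikit-learn documentation states that tree-based estimators convert the input to float32 internally, so a raw string (or a full 256-bit integer) cannot be passed as-is. Using OrdinalEncoder and the variable names below is my own assumption for illustration, not something taken from the original post.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

# Same toy data as in the question; the last column holds the long string feature.
X = [[1, 2, 3, 'verylongfeature1'], [1, 1, 2, 'verylongfeature2']]
y = [1, 0]

# Keep mixed types intact by using an object array, then split the columns.
X_arr = np.array(X, dtype=object)
numeric_part = X_arr[:, :3].astype(np.float64)

# Map each distinct string to a small numeric code (assumption: the string
# is a category label, so only its identity matters, not its content).
encoder = OrdinalEncoder()
encoded_part = encoder.fit_transform(X_arr[:, 3:])

# Recombine into a purely numeric feature matrix that the forest accepts.
X_encoded = np.hstack([numeric_part, encoded_part])

clf = RandomForestClassifier(random_state=0)
clf.fit(X_encoded, y)
predictions = clf.predict(X_encoded)
```

Note that an ordinal code imposes an arbitrary order on the categories, and a hash that is unique per row carries little signal for a tree to split on. Depending on the data, one-hot encoding or feature hashing may be a better fit.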



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
