I am trying to follow the scikit-learn example on decision trees: from sklearn.datasets import load_iris from sklearn import tree X, y = load_iris(return_X_y=True)
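A minimal sketch of how that example usually continues, assuming the goal is simply to fit a tree on the iris data. The excerpt is cut off, so the `random_state` value and the `export_text` call are illustrative choices, not part of the question.

```python
from sklearn.datasets import load_iris
from sklearn import tree

# Load the iris features and labels as plain arrays.
X, y = load_iris(return_X_y=True)

# Fit an unpruned decision tree; random_state is fixed only for repeatability.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, y)

# Accuracy on the training data (not an estimate of generalization).
train_accuracy = clf.score(X, y)

# Plain-text rendering of the learned split rules.
text_rules = tree.export_text(clf)
print(train_accuracy)
print(text_rules[:200])
```

`tree.plot_tree(clf)` draws the same structure graphically when matplotlib is available.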
I have a set of training data that consists of X, which is a set of n columns of data (features), and Y, which is one column of target variable. I am trying to
I have read many blogs but was not satisfied with the answers. Suppose I train a tf-idf model on a few documents, for example: " John like horror movie." " Ryan w
I want to remove all non-dictionary English words from a text corpus. I have removed stopwords, tokenized and count-vectorized the data. I need to extract only the E
I am trying to pickle a sklearn machine-learning model, and load it in another project. The model is wrapped in a pipeline that does feature encoding, scaling etc
I am using SKLearn to run SVC on my data. from sklearn import svm svc = svm.SVC(kernel='linear', C=C).fit(X, y) I want to know how I can get the distance of
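A hedged sketch, assuming the distance meant here is the signed distance of each sample to the separating hyperplane of a linear-kernel SVC. The toy data and `C=1.0` are made up for illustration. For a linear kernel, `decision_function` returns `w·x + b`, so dividing by the norm of `coef_` gives the geometric distance.

```python
import numpy as np
from sklearn import svm

# Toy, linearly separable data (illustrative only).
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([0, 0, 1, 1])

svc = svm.SVC(kernel="linear", C=1.0).fit(X, y)

# Raw decision values w.x + b for each sample.
scores = svc.decision_function(X)

# Signed geometric distance to the hyperplane: (w.x + b) / ||w||.
distances = scores / np.linalg.norm(svc.coef_)
print(distances)
```

The sign indicates the side of the hyperplane. With a non-linear kernel `coef_` is not available, so this normalization applies only to the linear case.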
I'm reading about decision trees and bagging classifiers, and I'm trying to show the first decision tree that is used in the bagging classifier. I'm confused a
I am trying to use Manhattan distance for SpectralClustering() in Sklearn. I am trying to set the affinity parameter to be manhattan, but I am getting the following
I want to process quite big ARFF files in scikit-learn. The files are in a zip archive and I do not want to unpack the archive to a folder before processing. He
I'm searching for the most appropriate tool for python3.x on Windows to create a Bayesian Network, learn its parameters from data and perform
I'm getting this weird error: classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
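A small sketch of when this warning appears, assuming it comes from `f1_score` with a label that occurs in `y_true` but is never predicted. The labels below are invented. The `zero_division` parameter exists only in newer scikit-learn releases (0.22 and later), so on the older version implied by the traceback only the `labels=` variant would apply.

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 1]  # label 2 is never predicted

# Per-label F1; zero_division=0 sets the undefined score to 0 without warning.
per_label = f1_score(y_true, y_pred, average=None, zero_division=0)

# Alternative: average only over labels that were actually predicted.
macro_predicted = f1_score(
    y_true, y_pred, average="macro", labels=np.unique(y_pred)
)
print(per_label, macro_predicted)
```

Restricting `labels` hides the missing class from the average, so it changes what the reported score means. Whether that is acceptable depends on the evaluation goal.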
I try to train and test several scikit-learn models and attempt to print off the accuracy. Only some of these models work, others fail with th
I use a CatBoostClassifier and my classes are highly imbalanced. I applied a scale_pos_weight parameter to account for that. While training with an evaluation d
When using the scikit-learn library in Python, I can use the CountVectorizer to create ngrams of a desired length (e.g. 2 words) like so: from sklearn.metrics.
I want to get the coefficients of my sklearn polynomial regression model in Python so I can write the equation elsewhere, i.e. ax1^2 + ax + bx2^2 + bx2 + c I'
I'm trying to implement SMOTENC inside a column transformer. However, I'm getting an error. The code and the error are provided below. #Create a mask for categorical
I have a dataset with 7 labels in the target variable. X = data.drop('target', axis=1) Y = data['target'] Y.unique() array(['Normal_Weight', 'Overweight_Level_
I am trying to determine roc_auc_score for a fit model on a validation set. I am seeing some conflicting information on function inputs. Documentation says: "y_
Usually, we apply cross_val_score to Sklearn models in the following way. scores = cross_val_score(clf, X, y, cv=5, scoring='f1_macro') Now I have my
I am using truncated SVD from the scikit-learn package. In the definition of SVD, an original matrix A is approximated as a product A ≈ UΣV* where
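A sketch of how the three factors map onto `TruncatedSVD` attributes, assuming the question is about recovering U, Σ and V* (the excerpt is truncated). The random matrix and `n_components=3` are illustrative. The `arpack` solver is used because with its default tolerance `fit_transform` returns exactly UΣ; with the randomized solver the returned matrix may only approximate UΣ.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Illustrative dense matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 8))

svd = TruncatedSVD(n_components=3, algorithm="arpack", random_state=0)

# fit_transform returns U @ diag(Sigma), shape (n_samples, n_components).
U_Sigma = svd.fit_transform(A)

Sigma = svd.singular_values_   # the diagonal of Σ
V_T = svd.components_          # rows are right singular vectors, i.e. V*
U = U_Sigma / Sigma            # divide each column by its singular value

# Rank-3 approximation A ≈ U Σ V*.
A_approx = U_Sigma @ V_T
print(np.linalg.norm(A - A_approx))
```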