Category "scikit-learn"

Scikit-learn pipeline: Non-finite test scores error / Inconsistent number of samples

I have a dataframe with two columns of texts and only the POS tags (of the same texts), which I want to use for language classification. I am trying to use both

Loop over hidden layer's nodes and create model based on MLP

I want to build an MLP classifier on the iris dataset. Actually, I want to build a function that runs the model with N hidden units in the hidden layer and a loop t

Multivariate Linear Regression, coefficients don't match

I'm facing a problem with different linear models from scikit-learn. Here is my code: from sklearn.linear_model import LinearRegression reg = LinearRegression()

Model works perfectly but GridSearch causes error

While working on a project I have come across a weird error, where fitting my model works perfectly but when I apply gridsearch it gives me an error. The code p

Is there a way to use mutual information as part of a pipeline in scikit learn?

I'm creating a model with scikit-learn. The pipeline that seems to be working best is: mutual_info_classif with a threshold - i.e. only include fields whose mut
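
The excerpt above is cut off, so the asker's exact setup is unknown. As a minimal sketch, assuming the goal is to run mutual-information feature selection as a pipeline step, `mutual_info_classif` can be passed as the `score_func` of `SelectKBest`. Note that this keeps the top `k` features rather than applying a score threshold (a threshold would need a custom selector). The iris data, `k=2`, and the `LogisticRegression` step are illustrative assumptions, not part of the question.

```python
from functools import partial

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Illustrative data only; the question's dataset is not shown.
X, y = load_iris(return_X_y=True)

# Fix the random seed so the estimated scores are reproducible.
score_func = partial(mutual_info_classif, random_state=0)

pipe = Pipeline([
    # Top-k selection by mutual information (assumption: k=2).
    ("select", SelectKBest(score_func=score_func, k=2)),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)

# Boolean mask of the features that survived selection.
selected_mask = pipe.named_steps["select"].get_support()
predictions = pipe.predict(X)
```

Because the selector is a pipeline step, it is refit on each training fold during cross-validation, so the feature scores never see the held-out data.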

Compute class weight function issue in 'sklearn' library when used in 'Keras' classification (Python 3.8, only in VS code)

The classifier script I wrote was working fine, and I recently added weight balancing to the fitting. Since I added the weight estimate function using 'sklearn' lib

How to write a custom wrapper for a prediction function in xgboost or other estimators

So I want to manipulate the result of my prediction and I need to do it within the estimator. I tried to write a wrapper like this, but my kernel just dies when

Yellowbrick: is it possible to pass in different pairwise distance metrics for scoring methods

sklearn defines a large number of pairwise distance metrics for something like silhouette score: https://scikit-learn.org/stable/modules/generated/sklearn.metri

scikit-learn neural net beginner - results not what I expect

I have a simple example for which I am attempting to perform a classification using the MLPClassifier. from sklearn.neural_network import MLPClassifier # What

how to use the train_x and train_y from sklearn k-fold split generator

I am using the sklearn k-fold generator to split some data 10 times. When I run the code below I expect train_x,train_y,test_x,test_y to contain all 10 splits h
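
The excerpt is truncated, so the asker's code is not visible. As a hedged sketch of one common pattern: `KFold.split` yields one pair of index arrays per fold, so variables assigned inside the loop only hold the last fold unless each fold is stored. The toy arrays and the `folds` list below are hypothetical names used for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (assumption): 20 samples, 2 features.
X = np.arange(40).reshape(20, 2)
y = np.arange(20)

kf = KFold(n_splits=10, shuffle=True, random_state=0)

# Collect every fold instead of overwriting the same variables.
folds = []
for train_idx, test_idx in kf.split(X):
    train_x, test_x = X[train_idx], X[test_idx]
    train_y, test_y = y[train_idx], y[test_idx]
    folds.append((train_x, train_y, test_x, test_y))
```

Each entry of `folds` is then one complete train/test split that can be passed to a model in turn.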

Elbow Method for K-Means in python

I'm using the K-Means algorithm (in sklearn) to cluster a 1-D array of values, and I want to decide the optimal number of clusters (K) in my script. I'm familiar with
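
As a rough sketch of the elbow method for this case, assuming synthetic 1-D data in place of the asker's values: fit `KMeans` for a range of K and record `inertia_` (the within-cluster sum of squares). The 1-D array has to be reshaped to a single column because scikit-learn expects 2-D input. The data, the range of K, and the variable names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 1-D values with three well-separated groups (assumption).
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.normal(0.0, 0.5, 50),
    rng.normal(10.0, 0.5, 50),
    rng.normal(20.0, 0.5, 50),
])

# scikit-learn expects shape (n_samples, n_features).
X = values.reshape(-1, 1)

ks = list(range(1, 8))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# The "elbow" is the K after which inertia stops dropping sharply;
# plotting ks against inertias is the usual way to inspect it.
```

Picking the elbow automatically needs an extra heuristic, which this sketch deliberately leaves out.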

Looping through each row in array to calculate cosine similarity

I have a subset of a dataframe that looks like: <OUT> PageNumber english_only_tags 175 flower architecture people 162 hair red bobbles

Polynomial Expansion without sklearn

I want to try and recreate this function from scratch (without using sklearn): # The matrix is M which is 1000x10 matrix. from sklearn.preprocessing import Po

Pass information between pipeline steps in sklearn

I am working on a simple text generation problem with LSTMs. To make the preprocessing more compact and reproducible, I decided to implement everything in sklea

Cosine similarity and SVC using scikit-learn

I am trying to use the cosine similarity kernel for text classification with an SVM on a raw dataset of 1000 words: # Libraries import numpy as np from sklear

Is this a valid approach to scale your target in machine learning without leaking information? [closed]

Consider a housing price dataset, where the goal is to predict the sale price. I would like to do this by predicting the "Sale price per Squar

TypeError: 'module' object is not iterable in django 4

I am getting the above error. It has persisted long enough that at this point I really need help. I am u

XGBoost model quantization - Sklearn model quantization

I am looking for solutions to quantize sklearn models. I am specifically looking for XGBoost models. I did find solutions to quantize pytorch and tensorflow mod

How to slice a XGBClassifier/XGBRegressor model into sub-models?

This document shows that a model trained with the XGBoost API can be sliced with the following code: from sklearn.datasets import make_classification import xgboost as xgb bo
