I want to build a pipeline to process the PUBG data on Kaggle, but running the pipeline raises an IndexError

I want to set up a preprocessing pipeline for the PUBG data on Kaggle, but when I run the pipeline I get this error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_35/3879657662.py in <module>
      8     ])
      9 
---> 10 pubg_num_tr = num_pipeline.fit_transform(pubg_num)

/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
    424         """
    425         fit_params_steps = self._check_fit_params(**fit_params)
--> 426         Xt = self._fit(X, y, **fit_params_steps)
    427 
    428         last_step = self._final_estimator

/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params_steps)
    353                 message_clsname="Pipeline",
    354                 message=self._log_message(step_idx),
--> 355                 **fit_params_steps[name],
    356             )
    357             # Replace the transformer of the step with the fitted

/opt/conda/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
    347 
    348     def __call__(self, *args, **kwargs):
--> 349         return self.func(*args, **kwargs)
    350 
    351     def call_and_shelve(self, *args, **kwargs):

/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
    891     with _print_elapsed_time(message_clsname, message):
    892         if hasattr(transformer, "fit_transform"):
--> 893             res = transformer.fit_transform(X, y, **fit_params)
    894         else:
    895             res = transformer.fit(X, y, **fit_params).transform(X)

/opt/conda/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
    845         if y is None:
    846             # fit method of arity 1 (unsupervised transformation)
--> 847             return self.fit(X, **fit_params).transform(X)
    848         else:
    849             # fit method of arity 2 (supervised transformation)

/tmp/ipykernel_35/2077244363.py in transform(self, X)
     13         total_distance = X[:, walkDistance_ix] + X[:, rideDistance_ix]+X[:, swimDistance_ix]
     14         if self.add_total_distance_per_seconda:
---> 15             add_total_distance_per_seconda = X[:, total_distance] / X[:, matchDuration_ix]
     16             return np.c_[X, walk_distance_per_seconda, total_distance,
     17                          add_total_distance_per_seconda]

IndexError: arrays used as indices must be of integer (or boolean) type
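The message on the last line of the traceback can be reproduced with NumPy alone. The snippet below is a minimal sketch on a made-up array (the values and column positions are assumptions, not the PUBG data): a float array of computed values cannot be used inside `X[:, ...]` as a column index, but it can be used directly in arithmetic.

```python
import numpy as np

# Made-up data for illustration only; three float columns.
X = np.array([[10.0, 2.0, 4.0],
              [30.0, 5.0, 7.0]])

# A float array of computed values, one per row.
total = X[:, 0] + X[:, 1]

try:
    # Using the float values as column indices is what raises the error.
    X[:, total]
except IndexError as exc:
    message = str(exc)

# The values themselves can be divided directly, with no indexing and no cast.
ratio = total / X[:, 2]
```

In this sketch `message` holds the same kind of text as the traceback, and `ratio` is computed without converting anything to integers.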

My pipeline code is:

from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# CombinedAttributesAdder is my own transformer, defined in an earlier cell.
num_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy="median")),
    ('attribs_adder', CombinedAttributesAdder()),
    ('std_scaler', StandardScaler()),
])

pubg_num_tr = num_pipeline.fit_transform(pubg_num)

I implemented an attribute adder and it worked properly on its own, but when I run it inside the pipeline it fails. I need a solution that does not require converting floats to integers, because that would damage the data.
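The traceback points at line 15 of `transform`, where `total_distance` (already an array of float values computed on line 13) is used as a column index in `X[:, total_distance]`. A possible fix, sketched below, is to divide the computed values directly, which needs no integer conversion. The column positions are hypothetical placeholders, the local variable names are mine, and `walk_distance_per_seconda` is left out because its definition is not shown in the traceback.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

# Hypothetical column positions (assumption); replace them with the
# real positions of these columns in pubg_num.
walkDistance_ix, rideDistance_ix, swimDistance_ix, matchDuration_ix = 0, 1, 2, 3


class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, add_total_distance_per_seconda=True):
        self.add_total_distance_per_seconda = add_total_distance_per_seconda

    def fit(self, X, y=None):
        # Nothing to learn; the transformer only derives new columns.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        total_distance = (X[:, walkDistance_ix] + X[:, rideDistance_ix]
                          + X[:, swimDistance_ix])
        if self.add_total_distance_per_seconda:
            # Use the computed values directly instead of X[:, total_distance].
            total_distance_per_second = total_distance / X[:, matchDuration_ix]
            return np.c_[X, total_distance, total_distance_per_second]
        return np.c_[X, total_distance]


# Small made-up sample to check the transformer outside the pipeline.
sample = np.array([[100.0, 50.0, 10.0, 80.0],
                   [200.0, 0.0, 0.0, 100.0]])
result = CombinedAttributesAdder().transform(sample)
```

The same class can then be used as the `'attribs_adder'` step. Rows where the match duration is zero would produce `inf` or `nan` in the new column, so they may need to be handled before scaling.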



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
