'i want to establish a pipe line to pubg data on kaggle to procces it but when i implement a pipe line this error get to me
i want to establish a pipe line to pubg data on kaggle to procces it but when i implement a pipe line this error get to me:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/tmp/ipykernel_35/3879657662.py in <module>
8 ])
9
---> 10 pubg_num_tr = num_pipeline.fit_transform(pubg_num)
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
424 """
425 fit_params_steps = self._check_fit_params(**fit_params)
--> 426 Xt = self._fit(X, y, **fit_params_steps)
427
428 last_step = self._final_estimator
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params_steps)
353 message_clsname="Pipeline",
354 message=self._log_message(step_idx),
--> 355 **fit_params_steps[name],
356 )
357 # Replace the transformer of the step with the fitted
/opt/conda/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
347
348 def __call__(self, *args, **kwargs):
--> 349 return self.func(*args, **kwargs)
350
351 def call_and_shelve(self, *args, **kwargs):
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
891 with _print_elapsed_time(message_clsname, message):
892 if hasattr(transformer, "fit_transform"):
--> 893 res = transformer.fit_transform(X, y, **fit_params)
894 else:
895 res = transformer.fit(X, y, **fit_params).transform(X)
/opt/conda/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
845 if y is None:
846 # fit method of arity 1 (unsupervised transformation)
--> 847 return self.fit(X, **fit_params).transform(X)
848 else:
849 # fit method of arity 2 (supervised transformation)
/tmp/ipykernel_35/2077244363.py in transform(self, X)
13 total_distance = X[:, walkDistance_ix] + X[:, rideDistance_ix]+X[:, swimDistance_ix]
14 if self.add_total_distance_per_seconda:
---> 15 add_total_distance_per_seconda = X[:, total_distance] / X[:, matchDuration_ix]
16 return np.c_[X, walk_distance_per_seconda, total_distance,
17 add_total_distance_per_seconda]
IndexError: arrays used as indices must be of integer (or boolean) type
my pipeline code is:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
num_pipeline = Pipeline([
('imputer', SimpleImputer(strategy="median")),
('attribs_adder', CombinedAttributesAdder()),
('std_scaler', StandardScaler())
])
pubg_num_tr = num_pipeline.fit_transform(pubg_num)
i implemented an attribute adder and it worked properly but when i turn on the pipline it fails, i need a solution without the need to converse a float to integers because it harms data.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
