Error: "ValueError: could not convert string to float: 'Private Sector/Self Employed'"

Output: "ValueError: could not convert string to float: 'Private Sector/Self Employed'"

I get this error every time I run the code below, and I need help fixing it.

    import numpy as np # linear algebra
    import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
    import os
    for dirname, _, filenames in os.walk('/kaggle/input'):
        for filename in filenames:
            print(os.path.join(dirname, filename))
    pd.options.mode.chained_assignment = None # disabled chaining errors as some columns overwritten below
    import sys
    print(sys.version)
    import matplotlib.pyplot as plt
    %matplotlib inline
    from sklearn.preprocessing import LabelEncoder
    from scipy.stats import levene
    import seaborn as sns
    from scipy.stats import shapiro
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.decomposition import KernelPCA
    
    
    dataset_df = pd.read_csv("TravelInsurancePrediction.csv")
    dataset = dataset_df.loc[:, ~dataset_df.columns.str.contains('^Unnamed')]
    
    X = dataset.iloc[:,:-1].values
    y = dataset.iloc[:, -1].values
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state = 188)
    cKNN = KNeighborsClassifier(n_neighbors = 10, metric = 'minkowski', p = 2).fit(X_train, y_train)


Solution 1:[1]

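The error shows that your data contains string values, such as 'Private Sector/Self Employed'. Since you pass `dataset.iloc[:,:-1].values` straight to `fit`, these strings most likely sit in the feature matrix X. KNeighborsClassifier computes distances between samples, so every feature must be numeric; scikit-learn tries to convert X to floats and fails when it meets a string. Process your data so that it contains only numbers before fitting the model, i.e. apply feature encoding (for example one-hot encoding) to the categorical columns.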
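As a minimal sketch of what feature encoding could look like here, the example below one-hot encodes the non-numeric columns and scales the numeric ones before fitting a KNN classifier. The toy DataFrame, its column names (`Age`, `EmploymentType`, `Target`) and `n_neighbors=3` are assumptions made up for illustration; they are not taken from your CSV, so adapt them to your actual columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for the real CSV: column names and values are invented.
toy = pd.DataFrame({
    "Age": [25, 31, 34, 28, 45, 39, 52, 23],
    "EmploymentType": [
        "Private Sector/Self Employed", "Government Sector",
        "Private Sector/Self Employed", "Government Sector",
        "Private Sector/Self Employed", "Government Sector",
        "Private Sector/Self Employed", "Government Sector",
    ],
    "Target": [0, 1, 0, 1, 1, 0, 1, 0],
})

# Keep X as a DataFrame (no .values) so columns can be selected by name.
X = toy.iloc[:, :-1]
y = toy.iloc[:, -1]

# Treat every non-numeric column as categorical.
numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = [c for c in X.columns if c not in numeric_cols]

# One-hot encode the string columns and standardize the numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ("num", StandardScaler(), numeric_cols),
])

# The pipeline applies the same encoding at fit time and at predict time.
model = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=3))
model.fit(X, y)
predictions = model.predict(X)
```

Wrapping the encoder in a pipeline means the categories learned from the training data are reused for the test data, which avoids mismatched columns after `train_test_split`. `pd.get_dummies` on the DataFrame is a simpler alternative if you only need a quick one-off encoding.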

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Evgeny Kovalev