UCI dataset: how to extract features and convert the data into a usable format after reading it in Python
I want to apply some machine learning algorithms to the University data set from https://archive.ics.uci.edu/ml/datasets/University. The data is unstructured, and I want it in tabular form, with the features as columns and the observations as rows. I need help parsing this dataset into that shape.
Any help will be appreciated. Thanks.
Below is what I have tried:
import pandas as pd

# Column names I want in the final table
column_names = [
    "University-name",
    "State",
    "location",
    "Control",
    "number-of-students",
    "male:female (ratio)",
    "student:faculty (ratio)",
    "sat-verbal",
    "sat-math",
    "expenses",
    "percent-financial-aid",
    "number-of-applicants",
    "percent-admittance",
    "percent-enrolled",
    "academics",
    "social",
    "quality-of-life",
    "academic-emphasis",
]

data_list = []
data = ['https://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data',
        'https://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data',
        ...]  # further entries elided

# Read each file with the column names above and collect the resulting frames
for file in data:
    df = pd.read_csv(file, names=column_names)
    data_list.append(df)
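A minimal sketch of the reshaping described above, under a loud assumption: that the raw file is not comma-separated, but stores each university as a parenthesised record of the form `(def-instance NAME (key value) ...)`. The `SAMPLE` text, the university names in it, the `parse_records` helper, and the `;` separator for repeated keys are all hypothetical and only illustrate the idea; the real file may need different handling.

```python
import re

# Hypothetical sample mimicking the ASSUMED record layout; not copied from the real file.
SAMPLE = """
(def-instance Example-U
   (state newyork)
   (control private)
   (sat verbal 500)
   (sat math 475)
   (academic-emphasis biology)
   (academic-emphasis business))
(def-instance Another-U
   (state ohio)
   (control state)
   (percent-admittance 70))
"""


def parse_records(text):
    """Turn '(def-instance NAME (key value...) ...)' blocks into a list of dicts."""
    rows = []
    # Split on the record opener; the chunk before the first record is discarded.
    for chunk in text.split("(def-instance")[1:]:
        row = {"University-name": chunk.split()[0].rstrip(")")}
        # Each attribute is a parenthesised group without nested parentheses.
        for group in re.findall(r"\(([^()]*)\)", chunk):
            tokens = group.split()
            if len(tokens) < 2:
                continue
            # 'sat verbal 500' has a two-word key; other groups use a one-word key.
            if tokens[0] == "sat" and len(tokens) == 3:
                key, value = "sat-" + tokens[1], tokens[2]
            else:
                key, value = tokens[0], " ".join(tokens[1:])
            # Repeated keys (several academic-emphasis lines) are joined with ';'.
            row[key] = row[key] + ";" + value if key in row else value
        rows.append(row)
    return rows


rows = parse_records(SAMPLE)
for row in rows:
    print(row)
```

Each dict is one observation and each key is one feature, so passing the list to `pd.DataFrame(rows)` gives one row per university and one column per key, with `NaN` wherever a record lacks that key. Values stay as strings here; numeric columns would still need converting.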
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
