Difficulty in forming a proper matrix for applying the SVD method on a pandas DataFrame

I have the following dataset.

|| sin || main_cat || also_view ||
==================================
|| B00520G7AU || 91 || [B07D6DDL1N, B07GXYLRRF, B008FR8UMU, B01N5XYE6M] ||
|| B003SK8V9G || 86 || [B01G7S0BX4, B000VXDHPG] ||

(The | and = characters are just formatting I added here to show it as a table; ignore them.)

I am having trouble converting these columns to strings and lists respectively. The main_cat column is an integer, while the others show up as object dtype even after trying functions like tolist(), tostring(), etc.

Because of this, I cannot reshape the dataset. I wish to apply SVD.

I want to generate a matrix with sin as both the index and the columns, and with the entries of also_view as the values in the matrix.

Could someone guide me on the syntax for moving forward from here?



Solution 1:[1]

One way is to turn your data into a space-separated (or tab- or pipe-separated) file with a manual find and replace in a text editor:

  • Find "||" and replace with "".

  • Find ", " and replace with "," so that each also_view list contains no spaces.

  • Find runs of two or more consecutive spaces and replace each with " " (a single blank space).

  • Import dataset using:

    pd.read_csv('file.txt', delimiter=' ', index_col='sin')
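As an alternative to the manual find and replace, here is a minimal sketch that parses the rows in Python and then builds the matrix. It rests on several assumptions that are not confirmed by the question: the `raw` sample string stands in for the real file, each row is split on the `||` markers, and the desired matrix is read as a 0/1 table with a 1 wherever an item in `also_view` is listed for a given sin. The names `raw`, `rows` and `square` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the rows shown in the question.
raw = """\
|| sin || main_cat || also_view ||
|| B00520G7AU || 91 || [B07D6DDL1N, B07GXYLRRF, B008FR8UMU, B01N5XYE6M] ||
|| B003SK8V9G || 86 || [B01G7S0BX4, B000VXDHPG] ||
"""

rows = []
for line in raw.splitlines():
    # Split on the "||" markers and drop the empty pieces at the edges.
    parts = [p.strip() for p in line.split("||") if p.strip()]
    # Skip the header, separator lines, and anything malformed.
    if len(parts) != 3 or not parts[1].isdigit():
        continue
    sin, main_cat, also_view = parts
    # "[A, B]" -> ["A", "B"] (a real Python list, not a string)
    items = [v.strip() for v in also_view.strip("[]").split(",") if v.strip()]
    rows.append({"sin": sin, "main_cat": int(main_cat), "also_view": items})

df = pd.DataFrame(rows)

# One row per (sin, viewed item) pair, then count the pairs.
pairs = df.explode("also_view").reset_index(drop=True)
matrix = pd.crosstab(pairs["sin"], pairs["also_view"])

# Assumption: use every ID seen anywhere as both row and column label,
# filling pairs that never occur with 0.
all_ids = matrix.index.union(matrix.columns)
square = matrix.reindex(index=all_ids, columns=all_ids, fill_value=0)

# SVD of the resulting numeric matrix.
U, s, Vt = np.linalg.svd(square.to_numpy(dtype=float), full_matrices=False)
print(square)
print(s)
```

SVD itself does not need a square matrix, so if only the listed sin values should appear as rows, `matrix` can be passed to `np.linalg.svd` in place of `square`.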

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
