'Separate data in multiple rows using query

I have a data-set that came from python in my PSQL database, and I should not modify it internally, so I think its better to extract it and organize it with PSQL the way I need it to graph it.

The current data-set looks like:

text                                      contained_word_id
'I eat pizza in the park'                 2, 5
'I eat sushi'                             1
'She eats sushi with pizza in the plaza'  1, 2, 4
'He eats'                                 NaN

I would like to transform it into the following when I download it using a PSQL query:

text                                      contained_word_id
'I eat pizza in the park'                 2
'I eat pizza in the park'                 5
'I eat sushi'                             1
'She eats sushi with pizza in the plaza'  1
'She eats sushi with pizza in the plaza'  2
'She eats sushi with pizza in the plaza'  4
'He eats'                                 NaN

What is the best approach to this issue?



Solution 1:[1]

You can convert the awful comma separated list a list of rows using regex_split_to_table based on the comma:

select t.text, i.contained_word_id
from the_table t
  cross join regexp_split_to_table(t.contained_word_id, '\s*,\s*') as i(contained_word_id);

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 a_horse_with_no_name