Count the number of different sections of text that match a regular expression

I am working in PostgreSQL. I have a table with an int column "DOC_ID" and a text column "CONTENT". Each row of the table contains information about one document.

The text contains sections that match the regular expression

'#[A-Za-z]+:[A-Za-z]+#'

The matching sections differ from one another.

For example: #Lot : version#, #Participant : name#, and others.

For each value that matches the regular expression, I want a row with the value and the ID of the document in which it occurs. Note that a value can occur several times in one document.

Example: My table:

DOC_ID       CONTENT
1            '#Participant : name# cat bear dog #Participant : name# mouse'
2            'cat #Participant : name# bear'
3            'cat #Lot : version# dog'
10           'mouse #Lot : version# cow'

Result table

TAG                       DOC_ID
#Participant : name#       1
#Participant : name#       1
#Participant : name#       2
#Lot : version#            3
#Lot : version#            10

Please tell me how I can do this.

sql postgresql

Solution 1:^[1]

Use the function regexp_matches():

select
    -- the 'g' flag returns every match as a text[]; [1] takes the matched text
    (regexp_matches(content, '#[A-Za-z0-9 ?%--,. ]+:[A-Za-z0-9 ?%--,. ]+#', 'g'))[1] as tag,
    doc_id
from my_table;
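
To try this on the data from the question, here is a minimal sketch. It assumes the table is named my_table, as in the query above, and it uses a simplified pattern that only allows letters and spaces around the colon (the examples in the question contain spaces, which the pattern in the question does not allow). The alias r(matched) is a name chosen for illustration. Calling regexp_matches() in the FROM clause is an alternative to calling it in the select list.

```sql
-- Sample data copied from the question.
create table my_table (doc_id int, content text);

insert into my_table (doc_id, content) values
    (1,  '#Participant : name# cat bear dog #Participant : name# mouse'),
    (2,  'cat #Participant : name# bear'),
    (3,  'cat #Lot : version# dog'),
    (10, 'mouse #Lot : version# cow');

-- With the 'g' flag, regexp_matches() returns one row per match, each a text[].
-- The pattern has no parenthesized group, so element [1] is the whole match.
select r.matched[1] as tag, t.doc_id
from my_table as t
cross join lateral
    regexp_matches(t.content, '#[A-Za-z ]+:[A-Za-z ]+#', 'g') as r(matched)
order by t.doc_id;
```

With this data the query should return five rows, two of them for document 1, as in the result table in the question. Documents without any match produce no rows.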

Read about the function and pattern matching in the documentation.

Test the query in db<>fiddle.
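
If the goal is to count the matches, as the title suggests, the same rows can be grouped by tag. The following is a sketch under the same assumptions as the answer: the sample rows are inlined in a CTE named my_table so the statement runs on its own, the pattern is a simplified one that only allows letters and spaces around the colon, and the names r, matched, occurrences and documents are chosen for illustration.

```sql
with my_table (doc_id, content) as (
    values
        (1,  '#Participant : name# cat bear dog #Participant : name# mouse'),
        (2,  'cat #Participant : name# bear'),
        (3,  'cat #Lot : version# dog'),
        (10, 'mouse #Lot : version# cow')
)
select
    r.matched[1]             as tag,
    count(*)                 as occurrences,  -- every match, including repeats within a document
    count(distinct t.doc_id) as documents     -- number of documents containing the tag
from my_table as t
cross join lateral
    regexp_matches(t.content, '#[A-Za-z ]+:[A-Za-z ]+#', 'g') as r(matched)
group by r.matched[1]
order by tag;
```

For the sample rows this should give 3 occurrences in 2 documents for #Participant : name# and 2 occurrences in 2 documents for #Lot : version#.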

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	klin
