'How to extract values from a text CLOB field in SQL

I have an SQL table column (BINARYTEXT) populated with a CLOB. Within this CLOB is a number of attributes e.g.

CE.EffDate="20140106";
CE.CCY="EUR";
CE.TransactionType="STANDARDEUROPEAN";
CE.CAL="LON";

I need to extract only the value of the CE.TransactionType attribute contained between the double quotes so 'STANDARDEUROPEAN'. Note that the CLOB does not contain XML and only contains attributes as above with no start or end tags.

I have worked out how to do this using the REGEXP_SUBSTR function when I specify the string in the command:

select REGEXP_SUBSTR('CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+') transtype 
from DUAL

which returns: ="STANDARDEUROPEAN

I am unable to manipulate this into using the CLOB as the string. This does not work:

select REGEXP_SUBSTR(BINARYTEXT,'CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+')  transtype 
from DUAL

Thanks in advance,

Steve



Solution 1:[1]

You can use the DBMS_LOB package to extract the string.

SELECT REGEXP_SUBSTR(DBMS_LOB.substr(BINARYTEXT,3000) ,'CE.TransactionType="STANDARDEUROPEAN"', '="[^"]+')  transtype 
FROM YOURTABLENAME

The only thing you might have problems with is the buffer size, which is restricted to 32767 bytes (its set to 3000 in my example). If your data is bigger than this, then you'd need to query it in chunks. See the documentation here

If you need to change the TransactionType you're searching for, you should probably create a function which takes that part of the string as a parameter and build the sql dynamically.

Solution 2:[2]

select BINARYTEXT from your_table 
where regexp_like(dat,'^(\CE.TransactionType=)?"[^"]+"$');




EDIT2:
select BINARYTEXT from your_table
where regexp_like(dat,'^(\CE.TransactionType=)?"[^"]+"?;$');

Solution 3:[3]

you can try the below SQL

select substr ( REGEXP_SUBSTR(to_clob ('CE.EffDate="20140106"; CE.CCY="EUR";CE.TransactionType="STANDARDEUROPEAN";CE.CAL="LON";'), '="[^"]+', 1,3), 3)  transtype from DUAL

Solution 4:[4]

If your db does not have REGEXP you can use DBMS_LOB package (as mentioned by StevieG) here is sql using just the DBMS_LOB ... I used the semicolon as a delimiter as it is in your CLOB string in the question

WITH
    table_test As
        (
            Select 
              TO_CLOB('... ... ...
                       Some other text within the CLOB field
                       ...
                       CE.EffDate="20140106";
                       CE.CCY="EUR";
                       CE.TransactionType="STANDARDEUROPEAN";
                       CE.CAL="LON";
                       and yet some more text;
                       and more ...
                       ... ... ...
                    ') "BINARYTEXT"  
            From Dual
        )
Select
    DBMS_LOB.SubStr(
                      DBMS_LOB.SubStr(BINARYTEXT, 100, InStr(BINARYTEXT, 'CE.TransactionType=') + Length('CE.TransactionType=')+1),
                      InStr(DBMS_LOB.SubStr(BINARYTEXT, 100, InStr(BINARYTEXT, 'CE.TransactionType=') + Length('CE.TransactionType=')+1), ';')-2,
                      1
                      
                    ) "TRANSACTION_TYPE"
From
    table_test
-- 
-- Result:
--
-- TRANSACTION_TYPE
-- STANDARDEUROPEAN

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Younes
Solution 4 d r