'Using Snowflake and XMLGET to parse an XML file

I am trying to parse out this xml code snippet. I am using the snowflake database trying to create a table. Using XMLGET functions, I get null returns in all fields. I am bit confused on how to get the attribute of a tag, I've done things before and have gotten the value of an object but nothing where the value is in the attribute

<Msg_file>
  <Game Number="2020000001">
    <Msg_game_info>
      <Game_info Game_id="2020000001" Game_date="11/01/2020" Game_time="06:00 PM" />
    </Msg_game_info>M
  </Game>
</Msg_file>


Solution 1:[1]

It works but indeed is not as straightforward as you might think. Here's an example:

create or replace table xmltest(x variant);
insert into xmltest SELECT PARSE_XML(column1) as X FROM VALUES ('<Msg_file><Game Number="2020000001"><Msg_game_info><Game_info Game_id="2020000001" Game_date="11/01/2020" Game_time="06:00 PM"></Game_info></Msg_game_info></Game></Msg_file>');

The following would work:

select XMLGET(X, 'Game'):"@Number" AS game FROM xmltest; --return 2020000001
select XMLGET(VALUE, 'Game_info'):"@Game_date" AS game FROM xmltest, LATERAL FLATTEN( INPUT => X:"$" ) LIMIT 1; --returns "11/01/2020"

To understand better how XML parsing is done read the following article

Solution 2:[2]

Making assumptions about what is arrays and what is not, thus using this data:

<Msg_file>
  <Game Number="2020000001">
    <Msg_game_info>
      <Game_info Game_id="2020000001" Game_date="11/01/2020" Game_time="06:00 PM" />
      <Game_info Game_id="2020000001" Game_date="11/01/2020" Game_time="07:00 PM" />
    </Msg_game_info>
  </Game>
  <Game Number="2020000002">
    <Msg_game_info>
      <Game_info Game_id="2020000002" Game_date="11/01/2021" Game_time="06:01 PM" />
      <Game_info Game_id="2020000002" Game_date="11/01/2021" Game_time="07:01 PM" />
    </Msg_game_info>
  </Game>
</Msg_file>'

with this SQL:

SELECT
    get(g.value, '@Number') as number
    ,get(m.value, '@Game_id') as Game_id
    ,to_date(get(m.value, '@Game_date')) as Game_date
    ,to_time(get(m.value, '@Game_time')) as Game_time
FROM xml_data as x
    ,lateral flatten(to_array(x.xml:"$")) as g
    ,lateral flatten(to_array(xmlget(g.value,'Msg_game_info'):"$")) as m
    ;

we get these results:

NUMBER GAME_ID GAME_DATE GAME_TIME
2020000001 2020000001 2020-11-01 18:00:00
2020000001 2020000001 2020-11-01 19:00:00
2020000002 2020000002 2021-11-01 18:01:00
2020000002 2020000002 2021-11-01 19:01:00

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sergiu
Solution 2 Simeon Pilgrim