Hive DDL Parser using Beeline or Spark's internal classes or methods
Is it possible to use Hive/Beeline/Spark's DDL parsing capabilities within our custom programs, preferably in Java or Scala? I have already looked at the project https://github.com/xnuinside/simple-ddl-parser, and it does exactly what I want. My concern with this project is that it does not use Hive's or Spark's own internal classes for the parsing; its authors came up with their own regex patterns to parse the given DDL statements.
I know Beeline and spark-shell accept CREATE TABLE statements and create the tables, so they must have internal classes that parse the statement before creating the table. If those classes or methods are public, can we not use them instead of reinventing the wheel? I do not know which internal classes or methods parse the DDL statements; please let me know if you know more about it. For my use case, I need to extract TableName, ColumnNames, DataTypes, PartitionKeys, SerDe, InputFormat, OutputFormat from a given CREATE TABLE statement; a sketch of the kind of hook I am imagining follows below.
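For example, poking around in spark-shell, something like the following appears to parse a statement without creating anything. Note the caveats: `SparkSession.sessionState` is annotated `@Unstable` rather than being a supported public API, the concrete `LogicalPlan` subclass returned for CREATE TABLE differs between Spark versions, and the `sales` DDL below is made up purely for illustration:

```scala
// Paste into spark-shell, where the `spark` SparkSession is predefined.
val ddl =
  """CREATE TABLE sales (id INT, amount DOUBLE)
    |PARTITIONED BY (dt STRING)
    |STORED AS ORC""".stripMargin

// parsePlan returns a LogicalPlan without touching the metastore;
// matching on its concrete class is version-specific.
val plan = spark.sessionState.sqlParser.parsePlan(ddl)

println(plan.getClass.getName) // the CreateTable-style plan node for this version
println(plan.treeString)       // shows table name, columns/types, partitioning
```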
Solution 1:[1]
One of my friends suggested using the Apache Hive library itself, specifically the class org.apache.hadoop.hive.ql.parse.HiveParser. Example programs can be found in link1 or link2.
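As a minimal sketch of that approach, assuming a Hive 2.x/3.x dependency (where ParseDriver.parse(String) returns the root ASTNode; newer releases may have changed the ParseDriver API, and the sales DDL is illustrative, not taken from the linked examples):

```scala
import org.apache.hadoop.hive.ql.parse.{ASTNode, HiveParser, ParseDriver}
import scala.collection.JavaConverters._

object HiveDdlAstDemo {
  // Hypothetical table, purely for illustration.
  val ddl: String =
    """CREATE TABLE sales (id INT, amount DOUBLE)
      |PARTITIONED BY (dt STRING)
      |STORED AS TEXTFILE""".stripMargin

  def main(args: Array[String]): Unit = {
    // ParseDriver drives the ANTLR-generated HiveParser and returns the AST root.
    val root: ASTNode = new ParseDriver().parse(ddl)
    // dump() pretty-prints the whole tree; useful for discovering token names.
    println(root.dump())
    walk(root, depth = 0)
  }

  // Walk the AST; the HiveParser.TOK_* constants (TOK_TABNAME, TOK_TABCOLLIST,
  // TOK_TABLEPARTCOLS, TOK_TABLESERIALIZER, TOK_TABLEFILEFORMAT) mark the
  // subtrees carrying the table name, columns, partition keys, SerDe and
  // input/output formats the question asks for.
  def walk(node: ASTNode, depth: Int): Unit = {
    val marker = if (node.getType == HiveParser.TOK_TABNAME) "  <-- table name" else ""
    println(("  " * depth) + node.getText + marker)
    Option(node.getChildren).foreach(_.asScala.foreach {
      case child: ASTNode => walk(child, depth + 1)
      case _              => ()
    })
  }
}
```

Matching on the other TOK_* children of TOK_CREATETABLE in the same way should yield the remaining fields.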
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Venkatesan Muniappan |
