Create External Data Source with HDInsight
I'm trying to create an external data source that points to my HDInsight cluster. To do so, I need to provide the Hadoop Name Node IP address and port number as the location.
Where can I find the Name Node IP address and port, and the Resource Manager IP address and port, on an HDInsight cluster?
I already looked through core-site.xml and yarn-site.xml and found nothing for HDInsight.
-- 3: Syntax for creating an external data source.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster WITH (
TYPE = HADOOP,
LOCATION ='hdfs://10.xxx.xx.xxx:xxxx',
RESOURCE_MANAGER_LOCATION = '10.xxx.xx.xxx:xxxx',
CREDENTIAL = HadoopUser1
);
-- LOCATION (Required) : Hadoop Name Node IP address and port.
-- RESOURCE_MANAGER_LOCATION (Optional): Hadoop Resource Manager location to enable pushdown computation.
-- CREDENTIAL (Optional): the database scoped credential, created above.
Thanks.
Solution 1:[1]
If I understand your question correctly, you already have an HDInsight cluster and are trying to get Azure SQL DW to talk to it via an external table. If you search the Syntax section of the CREATE EXTERNAL DATA SOURCE documentation for "Azure SQL Data Warehouse", you will see that PolyBase in Azure SQL DW currently works only with Azure Blob Storage and Azure Data Lake Store. (Keep an eye on that documentation page, as PolyBase in Azure SQL DW should become more flexible as it is enhanced.)
So for now, have HDInsight write to an external table defined in Hive, then point Azure SQL DW at the same folder in blob storage and declare its own external table there to read those blobs.
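To illustrate the Azure SQL DW side of that approach, here is a minimal sketch. Every object name, the container, the storage account, the folder path, and the column list are hypothetical placeholders, and it assumes the Hive table writes comma-delimited text files, so adjust the file format to match what Hive actually produces.

```sql
-- Sketch only: all names, paths, and columns below are hypothetical placeholders.

-- A master key must exist before a database scoped credential can be created.
CREATE MASTER KEY;

-- For blob storage, SECRET is the storage account key;
-- the IDENTITY string is not used for authentication.
CREATE DATABASE SCOPED CREDENTIAL BlobStorageCredential
WITH IDENTITY = 'user',
     SECRET = '<storage-account-key>';

-- Point at the blob container that backs the HDInsight cluster,
-- rather than at a Name Node address.
CREATE EXTERNAL DATA SOURCE MyBlobStorage
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://<container>@<storage-account>.blob.core.windows.net',
    CREDENTIAL = BlobStorageCredential
);

-- Assumes the Hive table stores comma-delimited text.
CREATE EXTERNAL FILE FORMAT CommaDelimitedText
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- LOCATION is the folder the Hive external table writes to,
-- relative to the container root.
CREATE EXTERNAL TABLE dbo.HiveOutput (
    id   INT,
    name NVARCHAR(100)
)
WITH (
    LOCATION = '/hive/output/',
    DATA_SOURCE = MyBlobStorage,
    FILE_FORMAT = CommaDelimitedText
);
```

With this in place, a plain SELECT against dbo.HiveOutput would read whatever files Hive has written to that folder, provided the column definitions match the Hive table's schema.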
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
