How to retrieve data from different AWS regions for my Glue job?
I have Glue databases (db1 and db2) with tables (tbl1 and tbl2) in different AWS regions (eu-west-1 and us-east-1, respectively).
My Glue job in eu-west-1 needs data from both tables, just a simple select * from db1.tbl1 and select * from db2.tbl2. The data is stored in AWS S3 as Parquet, and I am able to query it via Athena too.
How can I retrieve that data via Spark SQL in a Glue job? Can you help me out with an example? If not Spark SQL, can you please suggest a different approach?
Thanks very much!
Solution 1:[1]
Create a crawler in the EU region (eu-west-1) that reads the data from the S3 bucket in the US region. This creates a table in the EU database whose S3 location points to the US bucket. The data stays in the US region, but your Glue job in the EU can retrieve it as required.
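The crawler approach could be sketched roughly as follows, using boto3 to create the crawler and Spark SQL inside the job. The crawler name, IAM role ARN, and S3 path passed in are placeholders, not values from the question. The sketch also assumes that the crawler writes into db1 and names the new table tbl2 (crawlers derive table names from the S3 path, so the real name may differ), that the role can read the US bucket, and that the job runs with the `--enable-glue-datacatalog` job parameter so Spark SQL resolves tables through the Glue Data Catalog.

```python
from typing import Any, Dict, Tuple


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Build the keyword arguments for boto3's Glue create_crawler call.

    All four values are supplied by the caller; none come from the question.
    """
    return {
        "Name": name,
        "Role": role_arn,            # role must be able to read the US bucket
        "DatabaseName": database,    # EU-region catalog database, e.g. db1
        "Targets": {"S3Targets": [{"Path": s3_path}]},  # US-region S3 path
    }


def create_and_start_crawler(
    request: Dict[str, Any], region: str = "eu-west-1"
) -> None:
    """Create the crawler in the EU region and start one crawl."""
    import boto3  # imported lazily so the request builder works without boto3

    glue = boto3.client("glue", region_name=region)
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


def load_tables(spark) -> Tuple[Any, Any]:
    """Query both tables through the EU-region catalog with Spark SQL.

    Assumption: the crawler registered the US data as db1.tbl2.
    """
    df1 = spark.sql("SELECT * FROM db1.tbl1")
    df2 = spark.sql("SELECT * FROM db1.tbl2")
    return df1, df2
```

The crawler only needs to run once (or whenever the schema or partitions change). After that, the Glue job calls `load_tables(spark)` with its existing Spark session and treats both tables as if they were local to eu-west-1, while the Parquet files themselves remain in the US bucket.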
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
