Spark read HFile without

We have some tables in HBase (terabytes in size) that we have to migrate. However, the HBase cluster is fully utilized, and we cannot run Export because it puts too much pressure on HBase. Since HBase stores its data as HFiles, can I read the HFiles directly and export the data to a commonly used format (Parquet/ORC)?

I followed some blogs and Stack Overflow questions, such as "How to directly edit HBase HFile with Spark without HBase API" and https://programmer.group/hbase-operation-spark-read-hbase-snapshot-demo-share.html, but these use HBase to read snapshots.

Is there a way to read the HFiles directly?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
