'How can I access to HDFS file system in the latest Tensorflow 2.6.0?
I recently upgraded the tensorflow version used in my program to the recently released 2.6.0, but I ran into a trouble.
import tensorflow as tf
pattern = 'hdfs://mypath'
print(tf.io.gfile.glob(pattern))
The above API throws an exception in version 2.6:
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme'hdfs' not implemented (file:xxxxx)
Then I checked the relevant implementation code and found that the official recommendation is to use tensorflow/io to access hdfs, and the environment variable TF_USE_MODULAR_FILESYSTEM is provided to use legacy access support. Since my code is more complex and difficult to refactor in a short time, I tried to use this environment variable, but it still failed.
In general, my questions are:
- In the latest version of tensorflow, if "tfio" is not used, how can I still access the HDFS file?
- If "tfio" must be used, what is the equivalent code call to
tf.io.gfile.glob?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
