How can I access the HDFS file system in the latest TensorFlow 2.6.0?

I recently upgraded the TensorFlow version used in my program to the newly released 2.6.0, but I ran into a problem.

import tensorflow as tf

pattern = 'hdfs://mypath'
print(tf.io.gfile.glob(pattern))

The above API throws an exception in version 2.6:

tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 'hdfs' not implemented (file:xxxxx)

I then checked the relevant implementation code and found that the official recommendation is to access HDFS through tensorflow/io, and that the environment variable TF_USE_MODULAR_FILESYSTEM is provided to enable legacy access support. Since my code is fairly complex and hard to refactor quickly, I tried setting this environment variable, but the call still failed.

In short, my questions are:

  1. In the latest version of TensorFlow, if "tfio" is not used, how can I still access files on HDFS?
  2. If "tfio" must be used, what is the equivalent of the tf.io.gfile.glob call?
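
For question 2, one possibility is that tf.io.gfile.glob itself stays the call and only an extra import is needed. This has not been verified against a live HDFS cluster. The following is a minimal sketch, assuming the tensorflow-io package is installed (pip install tensorflow-io) in a version matching TensorFlow 2.6, and assuming that importing tensorflow_io registers the hdfs scheme as a side effect. The helper names needs_tfio and glob_any are hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse


def needs_tfio(pattern: str) -> bool:
    """Return True when the pattern uses the hdfs:// scheme.

    Assumption: in TF 2.6 this scheme is no longer handled by core
    TensorFlow and has to come from the tensorflow-io package.
    """
    return urlparse(pattern).scheme == "hdfs"


def glob_any(pattern: str) -> list:
    """Glob a local or HDFS pattern with tf.io.gfile.glob.

    Imports are done lazily so that tensorflow_io is only required
    when an hdfs:// path is actually used.
    """
    import tensorflow as tf

    if needs_tfio(pattern):
        # Assumption: this import registers the hdfs:// file system
        # plugin; the module itself is not used directly.
        import tensorflow_io  # noqa: F401

    return tf.io.gfile.glob(pattern)
```

With this in place, the original snippet would become print(glob_any('hdfs://mypath')). If the assumption holds, the rest of the code that uses tf.io.gfile would not need to be refactored; a single import tensorflow_io near the top of the program should have the same effect.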


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
