Potential bug in GCP regarding public access settings for a file

I was conversing with someone from GCS support, and they suggested that there may be a bug and that I post what's happening to the support group.

Situation

I'm trying to adapt this TensorFlow demo (https://www.tensorflow.org/hub/tutorials/tf2_arbitrary_image_stylization) to work with images stored in my GCP account, substituting one of my own images to run through the process.

I have the bucket set for allUsers to have public access, with a Role of Storage Object Viewer.

However, the demo still isn't accepting my files stored in GCS.

For example, this file is being rejected: https://storage.googleapis.com/01_bucket-02/Green_Sea_Turtle_grazing_seagrass.jpeg

That file came from the examples in the demo: I downloaded it, uploaded it to my GCS bucket, and used the resulting link in the demo, but it's not being accepted. I'm using the URL from the Copy URL link.
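As a sanity check on the URL itself, here is a small sketch of how the public URL for a GCS object is composed (the same form the Copy URL link produces). The function name is my own, for illustration:

```python
from urllib.parse import quote

def build_public_url(bucket_name, object_name):
    """Build the public HTTPS URL for an object in a GCS bucket:
    https://storage.googleapis.com/<bucket>/<object>
    """
    # Percent-encode the object name (spaces etc.), keeping "/" separators.
    return "https://storage.googleapis.com/{}/{}".format(
        bucket_name, quote(object_name, safe="/")
    )

print(build_public_url("01_bucket-02", "Green_Sea_Turtle_grazing_seagrass.jpeg"))
# https://storage.googleapis.com/01_bucket-02/Green_Sea_Turtle_grazing_seagrass.jpeg
```

If the object name contains spaces or other special characters, the Copy URL link percent-encodes them, which is worth checking when a hand-typed URL is rejected.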

Re: publicly accessible data

I've been following the instructions on making data publicly accessible. https://cloud.google.com/storage/docs/access-control/making-data-public#code-samples_1

I've performed all of those operations from the console, but the console still doesn't indicate public access for the bucket in question, so I'm not sure what's going on there.
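For reference, the console steps above can also be scripted. This is a sketch based on the Python sample in the linked "making data public" docs; it assumes the google-cloud-storage client library is installed and that your credentials can change the bucket's IAM policy:

```python
def public_read_binding():
    # IAM binding granting read access on all objects to everyone.
    return {"role": "roles/storage.objectViewer", "members": {"allUsers"}}

def set_bucket_public_iam(bucket_name):
    # Requires the google-cloud-storage package and credentials with
    # permission to modify the bucket's IAM policy.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket(bucket_name)
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(public_read_binding())
    bucket.set_iam_policy(policy)
```

If this binding is already in place (as your screenshot suggests), objects in the bucket should be readable by anyone via their public URLs.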

Please see the attached screenshot of my bucket permissions settings.

[screenshot: bucket permissions settings]

So I'm hoping you can clarify if those settings look good for those files being publicly accessible.

Re: Accessing the data from the demo

I'm also following this related article on 'Accessing public data' https://cloud.google.com/storage/docs/access-public-data#storage-download-public-object-python

There are 2 things I'm not clear on:

  1. If I've set public access the way I have, do I still need code as in the example on the 'Access public data' article just above?
  2. If I do need to add this to the code from the demo, can you tell me how I can find these 2 parts of the code:
     a. source_blob_name = "storage-object-name"
     b. destination_file_name = "local/path/to/file"

I know the path of the file above (01_bucket-02/Green_Sea_Turtle_grazing_seagrass.jpeg), but I don't understand whether that's the storage-object-name or the local/path/to/file.

And if it's either one of those, then how do I find the other value?
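To illustrate how the two values relate to a public URL: the part after the bucket name is the object name (what the docs call source_blob_name), while destination_file_name is simply any local path you choose for the downloaded copy. A small sketch (the helper name is my own):

```python
from urllib.parse import unquote, urlparse

def split_gcs_url(url):
    """Split a https://storage.googleapis.com/<bucket>/<object> URL
    into (bucket_name, source_blob_name)."""
    path = urlparse(url).path.lstrip("/")
    bucket_name, _, object_name = path.partition("/")
    return bucket_name, unquote(object_name)

url = "https://storage.googleapis.com/01_bucket-02/Green_Sea_Turtle_grazing_seagrass.jpeg"
bucket_name, source_blob_name = split_gcs_url(url)
print(bucket_name)       # 01_bucket-02
print(source_blob_name)  # Green_Sea_Turtle_grazing_seagrass.jpeg
```

So in the docs' download sample, bucket_name would be "01_bucket-02", source_blob_name would be "Green_Sea_Turtle_grazing_seagrass.jpeg", and destination_file_name could be anything like "turtle.jpeg" on your own machine.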

And furthermore, if the goal is to make a whole bucket public, why would I need to specify an individual file? That makes me think the code isn't necessary.

Thank you for clarifying any issues or helping to resolve my confusion.

Doug



Solution 1:[1]

If I've set public access the way I have, do I still need code as in the example on the 'Access public data' article just above?

No, you don't need to. I did some testing and was able to pull images from GCS whether or not they were set to public.

As discussed in this thread, what's happening in your project is that the image you're trying to pull from GCS has a .jpeg extension but is not actually a .jpeg. The actual image is a .jpg, which prevents TensorFlow from loading it properly.

Here is a test following the demo you mentioned, using the image from your bucket. Note that I used .jpg as the image's extension.

content_urls = dict(
  test_public='https://storage.cloud.google.com/01_bucket-02/Green_Sea_Turtle_grazing_seagrass.jpg'
  )


I also tested another image from your bucket, and it loaded successfully in TensorFlow.


Solution 2:[2]

Most likely the problem is that your turtle image ends in .jpeg while your libraries are looking for .jpg.

The errors you're seeing would make it much easier to figure out the problem.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Mabel A.
Solution 2: Josh Bloom