How to move a byte object in Databricks to e.g. S3

I am working in Databricks with Python. I have a byte object given by a parameter "byte_object", and I want to move it to an S3 location s3_path. For an object already saved in DBFS storage at path dbfs_path, it can be done like:

dbutils.fs.mv(dbfs_path, s3_path)

But my problem is that I want to move a byte object, given by my Python parameter "byte_object", to S3, not a file stored in DBFS. Any ideas how to do that?



Solution 1 [1]:

If your byte object can be represented as a string, then you can use dbutils.fs.put directly with it (see the dbutils documentation).
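For instance, a minimal sketch assuming byte_object holds UTF-8-encoded text (s3_path is the target path from the question):

# Minimal sketch, assuming byte_object holds UTF-8-encoded text.
# dbutils.fs.put writes a string to the given path; the third
# argument overwrites an existing file at s3_path.
dbutils.fs.put(s3_path, byte_object.decode("utf-8"), True)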

Otherwise, you can write the byte object to a local file and then use dbutils.fs.mv to move it to S3, something like this:

# Write the byte object to the driver's local filesystem first.
local_file = "/tmp/local_file"
with open(local_file, "wb") as f:
    f.write(byte_object)

# The file: prefix tells dbutils to read from the local filesystem.
dbutils.fs.mv(f"file:{local_file}", s3_path)

The main thing here is that you need to specify the local file as file:/path, not simply path; without that prefix, dbutils.fs.mv will look for the file on DBFS.
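As an alternative to the answer above (not part of the original answer): if the cluster has boto3 installed and AWS credentials configured, you can upload the bytes to S3 directly and skip the intermediate local file. A hedged sketch, assuming s3_path has the form s3://bucket/key:

import boto3

# Sketch of a boto3 alternative: upload the bytes in one request,
# assuming AWS credentials are available to the cluster and s3_path
# looks like s3://my-bucket/some/key.
bucket, key = s3_path.replace("s3://", "", 1).split("/", 1)
boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=byte_object)

Note that put_object sends the whole byte object in a single request, which is fine for small payloads; for large objects, boto3's upload_fileobj with an io.BytesIO wrapper would stream the upload instead.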

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: Alex Ott