How to move a byte object in Databricks to e.g. S3
I am working in Databricks/Python and I have a byte object given by a parameter "byte_object" that I want to move to an S3 location s3_path. For an object already saved in DBFS storage at path dbfs_path it can be done like:
dbutils.fs.mv(dbfs_path, s3_path)
But my problem is that I want to move a byte object, given by my Python parameter "byte_object", to S3, not a file stored in DBFS. Any ideas on how to do that?
Solution 1:[1]
If your byte object can be represented as a string, then you can use dbutils.fs.put directly with it (doc).
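As a minimal sketch of that first option, assuming byte_object holds UTF-8 encoded text (the encoding is an assumption; byte_object and s3_path come from the question):

text_content = byte_object.decode("utf-8")  # bytes -> str, assumes UTF-8
dbutils.fs.put(s3_path, text_content, overwrite=True)  # write the string directly to the S3 path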
Otherwise, you can write the byte object to a local file and then use dbutils.fs.mv to move it to S3, something like this:
local_file = "/tmp/local_file"

# Write the bytes to a file on the driver's local disk
with open(local_file, "wb") as f:
    f.write(byte_object)

# Move the local file to S3 (note the file: prefix for the local path)
dbutils.fs.mv(f"file:{local_file}", s3_path)
The main thing here is that you need to specify the local file with the file: prefix (file:/tmp/local_file), not simply /tmp/local_file; without that prefix dbutils.fs.mv will look for the path on DBFS.
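If you prefer not to hard-code the /tmp path, the same approach can be wrapped with a temporary file. This is just a sketch; the function name and the use of tempfile are my own additions, not part of the original answer:

import tempfile

def move_bytes_to_s3(byte_object, s3_path):
    # Write the bytes to a temporary file on the driver's local disk
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(byte_object)
        local_file = f.name
    # The file: prefix makes dbutils read from local disk; mv also removes the local copy
    dbutils.fs.mv(f"file:{local_file}", s3_path)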
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alex Ott |
