Appending to an existing Avro file in Python
I'm exploring the Avro file format and am currently struggling to append data: each run seems to overwrite the file. I found an existing thread here saying I should not pass in a schema in order to "append" to an existing file without overwriting it. Even my linter gives this clue: "If the schema is not present, presume we're appending." However, if I try to declare DataFileWriter as DataFileWriter(open("users.avro", "wb"), DatumWriter(), None), the code won't run.
Simply put, how do I append values to an existing Avro file without writing over the existing content?
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

schema = avro.schema.parse(open("user.avsc", "rb").read())
# Note: "wb" truncates users.avro every time the script runs
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
print("start appending")
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 12, "favorite_color": "blue"})
writer.close()
print("write successful!")

# Read data from an avro file
with open("users.avro", "rb") as f:
    reader = DataFileReader(f, DatumReader())
    users = [user for user in reader]
    reader.close()

print(f'Schema {schema}')
print(f'Users:\n {users}')
Solution 1:[1]
I'm not sure how to do it with the standard avro library, but if you use fastavro it can be done. See the example below:
from fastavro import parse_schema, writer, reader

schema = {
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]}
    ]
}
parsed_schema = parse_schema(schema)

records = [
    {"name": "Alyssa", "favorite_number": 256},
    {"name": "Ben", "favorite_number": 12, "favorite_color": "blue"},
]

# Write initial 2 records ("wb" creates or truncates the file)
with open("users.avro", "wb") as fp:
    writer(fp, parsed_schema, records)

# Append third record ("a+b" keeps the existing content)
with open("users.avro", "a+b") as fp:
    writer(fp, parsed_schema, [{"name": "Chris", "favorite_number": 1}])

# Read all records
with open("users.avro", "rb") as fp:
    for record in reader(fp):
        print(record)
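The part of the example above that makes the difference is the file mode: "wb" truncates the file every time it is opened, which is why each run in the question appears to overwrite, while "a+b" keeps the existing bytes and also allows reading (as far as I know, fastavro needs that to pick up the existing file header). Here is a minimal sketch of just that behaviour using only the standard library and plain bytes instead of Avro records. The helper name write_then_reopen and the temporary file are made up for illustration and are not part of fastavro or avro.

```python
import os
import tempfile


def write_then_reopen(mode: str) -> bytes:
    """Write b"first", reopen the file in `mode`, write b"second",
    and return the final file contents. Hypothetical helper for illustration."""
    path = os.path.join(tempfile.mkdtemp(), "demo.bin")

    # Initial write, comparable to the first run of the script
    with open(path, "wb") as fp:
        fp.write(b"first")

    # Second run: the mode decides whether the old content survives
    with open(path, mode) as fp:
        fp.write(b"second")

    with open(path, "rb") as fp:
        return fp.read()


truncated = write_then_reopen("wb")   # "wb" empties the file before writing
appended = write_then_reopen("a+b")   # "a+b" writes after the existing bytes

print(truncated)  # b'second'
print(appended)   # b'firstsecond'
```

So if a script always opens users.avro with "wb", earlier records are gone before the writer ever sees them, regardless of which Avro library is used.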
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Scott |
