Add document to Firestore from Beam with auto-generated ID
I would like to use Apache Beam Java with the recently published Firestore connector to add new documents to a Firestore collection. I thought this would be a relatively easy task, but the need to create com.google.firestore.v1.Document objects seems to make things more difficult. I was using this blog post on Using Firestore and Apache Beam for data processing as a starting point.
All I actually want to write is a simple transformation that maps MyClass objects to Firestore documents, which are then added to a Firestore collection.
What I have ended up with is a Beam SimpleFunction, which maps MyClass objects to Documents:
public static class Mapper extends SimpleFunction<MyClass, Document> {
    @Override
    public Document apply(final MyClass record) {
        final String project = "my-project";
        final String database = "(default)";
        final String collection = "my-collection";
        final String documentId = someUnecessaryIdComputation();
        return Document
            .newBuilder()
            .setName("projects/" + project + "/databases/" + database + "/documents/" + collection
                + "/" + documentId)
            .putFields("key",
                Value.newBuilder().setStringValue(record.getValue()).build())
            // ...
            .build();
    }
}
and a DoFn that wraps these Documents in Write objects configured as updates (this could probably also be simplified to a SimpleFunction, but it was copied from the blog post):
private static final class CreateUpdateOperation extends DoFn<Document, Write> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        final Write write = Write.newBuilder()
            .setUpdate(c.element())
            .build();
        c.output(write);
    }
}
I'm using these two functions in my pipeline as follows:
pipeline.apply(MapElements.via(new Mapper()))
    .apply(ParDo.of(new CreateUpdateOperation()))
    .apply(FirestoreIO.v1().write().batchWrite().build());
The major disadvantages here are:
- I have to specify a document ID and cannot use an auto-generated one as with the "plain" Java SDK.
- I have to specify the project ID and the database name, although they should already be available. At least with the Java SDK, I don't have to set them.
Is there any way to add documents using the Firestore connector without explicitly setting document ID, project ID and database?
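For illustration only, here is a minimal sketch of a workaround for the document ID part, not a feature of the connector. It assumes that a locally generated random ID is acceptable; as far as I know, the "plain" client SDKs also generate their auto IDs on the client, as 20 random alphanumeric characters. FirestoreNames, autoId and documentName are hypothetical names introduced here; they are not part of Beam or the Firestore libraries.

```java
import java.security.SecureRandom;

/** Hypothetical helper; not part of Beam or the Firestore connector. */
final class FirestoreNames {
    // Assumption: 20 alphanumeric characters, mirroring what the client SDKs appear to use.
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final int AUTO_ID_LENGTH = 20;
    private static final SecureRandom RANDOM = new SecureRandom();

    private FirestoreNames() {}

    /** Returns a random alphanumeric ID generated locally, without contacting Firestore. */
    static String autoId() {
        final StringBuilder sb = new StringBuilder(AUTO_ID_LENGTH);
        for (int i = 0; i < AUTO_ID_LENGTH; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /** Builds the full resource name in the same format the mapper passes to setName. */
    static String documentName(final String project, final String database,
                               final String collection, final String documentId) {
        return "projects/" + project + "/databases/" + database
            + "/documents/" + collection + "/" + documentId;
    }
}
```

In the mapper above, someUnecessaryIdComputation() could then be replaced with FirestoreNames.autoId(). One caveat: if Beam retries a bundle, the random ID is generated again, so the same record might end up written under two different document names. The project ID could presumably be read from the pipeline options (Beam's GcpOptions exposes getProject()) and handed to the mapper through its constructor instead of being hard-coded, but that still means passing it explicitly.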
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
