PutFile append file
New to NiFi!
I'm wondering if there is a way in NiFi to use a processor such as "PutFile" and have it write to one single file (appending data to this file, or overwriting the data in this file), rather than creating multiple different files. Is there another processor I need to use in order to accomplish this?
Solution 1:[1]
Currently there is no way to append data to a file, but you can overwrite the file using PutFile.
The PutFile processor writes a file to disk using the "filename" attribute on the FlowFile. So if you put an UpdateAttribute processor before PutFile that sets the same "filename" on all incoming FlowFiles, the PutFile processor will write them all to the same file name on disk.
To do this with PutFile, make sure you set the processor property "Conflict Resolution Strategy" to "Replace".
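The effect of the "Replace" strategy can be sketched in plain shell (the directory and file names below are invented for illustration):

```shell
#!/bin/sh
# Sketch of PutFile with Conflict Resolution Strategy = "Replace":
# every FlowFile is written under the same filename, so each write
# replaces the previous file and only the latest content survives.
mkdir -p /tmp/nifi-out                           # stand-in output directory
echo "first flowfile"  > /tmp/nifi-out/data.txt
echo "second flowfile" > /tmp/nifi-out/data.txt  # replaces the first write
cat /tmp/nifi-out/data.txt                       # only the second remains
```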
Solution 2:[2]
For those who don't want to overwrite the data in the file but want to append to it.
Appending to a single file using the ExecuteStreamCommand processor:
It isn't possible with the PutFile processor, but you can use the ExecuteStreamCommand processor to accomplish this.
In "Command Arguments", put the attributes you want to log, separated by a delimiter:
${attr1};${attr2};${attr3}
In "Command Path", put the absolute path of a bash script: /path/logger.sh
logger.sh:
#!/bin/bash
echo "$1|$2|$3" >> /path/attributes.log
attributes.log will accumulate the three attributes line by line. Make sure that the bash script is executable by the NiFi user, e.g.:
chmod +x logger.sh (the original suggestion, chmod 777 logger.sh, also works but grants broader permissions than needed)
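Outside NiFi, the append behaviour of the script looks like this (the /tmp paths are stand-ins for the /path/... locations above, and the two invocations simulate ExecuteStreamCommand running the script once per FlowFile):

```shell
#!/bin/sh
# Stand-in for /path/logger.sh; each run appends one line, so successive
# FlowFiles accumulate in the same log file instead of replacing it.
cat > /tmp/logger.sh <<'EOF'
#!/bin/bash
echo "$1|$2|$3" >> /tmp/attributes.log
EOF
chmod +x /tmp/logger.sh
rm -f /tmp/attributes.log
/tmp/logger.sh v1 v2 v3    # first FlowFile's ${attr1};${attr2};${attr3}
/tmp/logger.sh a b c       # second FlowFile appends a new line
cat /tmp/attributes.log
```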
Appending to a single file using the ExecuteScript processor:
Try this ECMAScript:
var flowFile = session.get();
var RandomAccessFile = Java.type("java.io.RandomAccessFile");
if (flowFile != null) {
    var filename = flowFile.getAttribute("filename");
    // Append the "filename" attribute as one line at the end of the log file
    var filePath = "/path/attributes.log";
    flowFile = session.putAttribute(flowFile, "filePath", filePath);
    var file = new RandomAccessFile(filePath, "rws");
    file.seek(file.length()); // move to end of file so the write appends
    file.write(filename.getBytes());
    file.write("\n".getBytes());
    file.close();
    // Finish by transferring the FlowFile to an output relationship
    session.transfer(flowFile, REL_SUCCESS);
}
Solution 3:[3]
The LogAttribute processor may be an option, because it offers the closest functionality to a file append. But it is not ideal, as it offers few options for directing the output.
There are two other options you can try if you are intent on using the out-of-the-box processor functionality rather than developing custom classes. Use an ExecuteScript processor to execute a Jython, Groovy, or JavaScript script that modifies the output flowFile to include only the attributes you need. Follow that with a PutMongo or a PutSQL processor to update a persisted database resource.
Another option, if you don't have a database resource at your disposal: use the ExecuteScript processor mentioned above, followed by a PutFile that writes each flowFile under a uniquely named "filename" attribute - say, $(unknown).${uuid}. You'll wind up with a large number of similarly formatted files - one log record per file - which you can then roll up into one file for analysis using Linux command-line tools, or alternatively employ a final ExecuteScript processor in your workflow that performs the roll-up every time a file is processed. This last one may not be a good idea, as it may introduce synchronization and write-contention problems if your stream of flowFiles is high-volume.
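The roll-up step for the one-record-per-file approach can be as simple as a single concatenation (directory and file names below are invented for illustration; the suffixes stand in for the unique ${uuid} part):

```shell
#!/bin/sh
# Roll many one-record-per-file outputs into a single file for analysis.
rm -rf /tmp/records && mkdir -p /tmp/records
printf 'rec-a\n' > /tmp/records/log.3f2a.csv   # stand-ins for the uniquely
printf 'rec-b\n' > /tmp/records/log.9c41.csv   # named per-FlowFile files
cat /tmp/records/*.csv > /tmp/rolledup.csv     # one file, one record per line
cat /tmp/rolledup.csv
```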
Solution 4:[4]
There are several ways to append data to a file. I prefer the ExecuteGroovyScript processor for file appending: you can access the file easily from the Groovy script, and if the file exists, append the data to it. I have implemented this method successfully.
Solution 5:[5]
The original question was to append to an existing file. Presumably the idea is to write one line for each flowfile.
You can achieve this use case with a different approach: instead of appending successive lines to a file, do the merging in NiFi and then write out the entire file in one go. This is done with the MergeContent processor. It takes the content of successive flowfiles and pastes them together into a single flowfile. It has various configuration options for how many flowfiles to accumulate before it spits out the merged result, which you can then write out using PutFile.
Here is a recipe to write out key attributes in CSV format to a hardcoded location. Use case: archive flowfile attributes before emptying a queue.
Suppose you want to write out the attributes id, name and size.
- Add an AttributesToCSV processor. Set "Attribute List" to id,name,size.
- Set "Destination" to "flowfile-content".
- Connect it downstream to a MergeContent.
- Set "Minimum/Maximum Number of Entries" so that your entire queue fits into one bin (e.g. 1 and some large number, respectively).
- Set "Delimiter Strategy" to "Text".
- Set "Demarcator" to a newline (press Shift+Enter in the property editor, or paste in a newline).
- In "Header", set an appropriate CSV header, e.g. "ID,Name,Size", and add a trailing newline.
- Set "Max Bin Age" to a few seconds: long enough that it has time to merge everything, but short enough that you are happy to wait.
- Connect it downstream to an UpdateAttribute that sets the "filename" attribute to whatever you like.
- Connect that downstream to a PutFile and set the "Directory" property.
Everything else can stay as defaults.
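To see what the recipe produces, here is the merge step mimicked in plain shell (attribute values are invented; AttributesToCSV emits one CSV line per flowfile, and MergeContent joins the entries with the newline demarcator underneath the header):

```shell
#!/bin/sh
# Two flowfiles after AttributesToCSV (each is a bare CSV line, no newline).
printf '1,alpha,100' > /tmp/ff1.csv
printf '2,beta,250'  > /tmp/ff2.csv
# MergeContent: Header "ID,Name,Size\n", then entries joined by the
# newline demarcator; PutFile then writes the merged result once.
{ printf 'ID,Name,Size\n'; cat /tmp/ff1.csv; printf '\n'; cat /tmp/ff2.csv; } \
  > /tmp/merged.csv
cat /tmp/merged.csv
```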
This recipe needs adapting for other use cases:
- If there is a constant flow that needs to be written out: split into bins by number of entries, and make the filename dynamic, e.g. UpdateAttribute can set "filename" to ${UUID()}.csv.
- If a format other than CSV is desired: replace the AttributesToCSV with some other processor.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | JDP10101 |
| Solution 2 | |
| Solution 3 | 56lt56 |
| Solution 4 | MUHAMMAD SIYAD EH |
| Solution 5 | |
