How to avoid extra blank lines in XML generation with Java?
Currently I'm trying to develop some code with Java 9 and javax.xml libraries (both mandatory for my task) that edits an XML file and I'm having some weird issues adding child nodes.
This is the XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>
and I want to edit it to build something like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>A name</name>
<last-name>Last Name</last-name>
<username>username</username>
</user>
</users>
Now, the first run of the code adds a single blank line before the <user> node. When it runs a second time, it adds more blank lines:
<users>
<user>
<name>name</name>
<last-name>lastname</last-name>
<username>username</username>
</user>
<user>
<name>name</name>
<last-name>lastname</last-name>
<username>username</username>
</user>
</users>
This is the XML generated after running the program twice. As you can see, it adds blank lines before the <user> nodes and between the other nodes: exactly n-1 blank lines between nodes, where n is the number of times the code has been executed.
To find out what those nodes contain before the file is updated, I wrote the following code:
NodeList children = root.getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
    Node aux = children.item(i);
    System.out.println("Node text content: ".concat(aux.getTextContent()));
}
1st execution:
Node text content:
Node text content: namelastnameusername
2nd execution:
Node text content:
Node text content:
name
lastname
username
Node text content:
Node text content: namelastnameusername
3rd execution:
Node text content:
Node text content:
name
lastname
username
Node text content:
Node text content:
name
lastname
username
Node text content:
Node text content: namelastnameusername
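The empty "Node text content:" lines suggest that some children of <users> are not elements at all. A minimal sketch, assuming an inline XML string in place of the real file (the class name `ChildNodeInspector` and the method `describeChildren` are made up for illustration), that reports each child's node type and whether a text node holds only whitespace:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildNodeInspector {

    // Returns one short description per direct child of the document element.
    static List<String> describeChildren(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> descriptions = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    // Indentation and line breaks between elements are parsed as text nodes.
                    boolean blank = child.getNodeValue().trim().isEmpty();
                    descriptions.add(blank ? "whitespace-only text" : "text");
                } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                    descriptions.add("element <" + child.getNodeName() + ">");
                } else {
                    descriptions.add("other node type " + child.getNodeType());
                }
            }
            return descriptions;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the database file.
        String xml = "<users>\n    <user><name>name</name></user>\n</users>";
        describeChildren(xml).forEach(System.out::println);
    }
}
```

If the whitespace-only entries grow with each run, the blank lines are most likely being read back from the file as text nodes and written out again.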
Finally, this is the Java code:
private static void saveUser(String firstName, String lastName, String username){
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File(databaseFile));
Element root = doc.getDocumentElement();
root.normalize();
// build user node
Element userNode = doc.createElement("user");
Element nameNode = doc.createElement("name");
Element lastNameNode = doc.createElement("last-name");
Element usernameNode = doc.createElement("username");
//build structure
nameNode.appendChild(doc.createTextNode(firstName));
lastNameNode.appendChild(doc.createTextNode(lastName));
usernameNode.appendChild(doc.createTextNode(username));
userNode.appendChild(nameNode);
userNode.appendChild(lastNameNode);
userNode.appendChild(usernameNode);
root.appendChild(userNode);
//write the updated document to file or console
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(databaseFile));
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(source, result);
}catch (SAXException | ParserConfigurationException | IOException | TransformerException e1) {
e1.printStackTrace();
}
}
The only solution I could find is to delete blank lines after XML generation, but I think it's not a proper solution and I would like to find some alternatives first.
Any suggestions on how to tackle this problem?
Solution 1:[1]
In short: in Java 9, the only option is to delete the blank lines, either after the XML has been generated or after it has been parsed from the file, like this:
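private void clearBlankLine(Element element) {
    NodeList childNodes = element.getChildNodes();
    // Iterate backwards: the NodeList is live, so removing a child shifts the
    // following nodes down and a forward loop would skip the next one.
    for (int index = childNodes.getLength() - 1; index >= 0; index--) {
        Node item = childNodes.item(index);
        if (item.getNodeType() != Node.ELEMENT_NODE
                && System.lineSeparator().equals(item.getNodeValue())) {
            // Non-element node whose value is exactly one line separator
            element.removeChild(item);
        } else if (item instanceof Element) {
            clearBlankLine((Element) item);
        }
    }
}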
Then invoke this with the root element.
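Removing nodes while walking a live NodeList forward can skip the node that slides into the removed slot. A sketch of a variant that iterates from the end and treats any whitespace-only text node as removable, which is a broader test than the exact line-separator comparison above (the class and method names here are illustrative, not part of the original answer):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WhitespaceCleaner {

    // Recursively removes whitespace-only text nodes below the given element.
    static void removeBlankText(Element element) {
        NodeList children = element.getChildNodes();
        // Walk backwards so a removal never shifts the nodes still to be visited.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                element.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeBlankText((Element) child);
            }
        }
    }

    // Parses the XML, cleans it, and returns how many direct children the root has left.
    static int countRootChildrenAfterCleaning(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            removeBlankText(doc.getDocumentElement());
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample with blank lines between the <user> elements.
        String xml = "<users>\n\n  <user><name>a</name></user>\n\n  <user><name>b</name></user>\n</users>";
        System.out.println(countRootChildrenAfterCleaning(xml));
    }
}
```

This assumes the document has no mixed content in which whitespace-only text is significant; if it does, the removal condition would need to be narrower.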
Details:
During XML generation, each element goes through three stages: startElement, parse, and endElement. The indent feature is implemented in the startElement stage, and indenting also writes a new line to the document.
The call stack differs between Java 8 and Java 9:
In Java 8: ToStream#startElement -> ToStream#indent(IfNecessary)
In Java 9: ToStream#startElement -> ToStream#flushCharactersBuffer(IfNecessary) -> ToStream#indent(IfNecessary)
flushCharactersBuffer also indents when the indent feature is enabled with transformer.setOutputProperty(OutputKeys.INDENT, "yes"), and the conditions for invoking flushCharactersBuffer and indent are almost the same.
That means that in Java 9 two new lines are added for each element that needs indenting, which is why the blank lines appear.
Solution 2:[2]
Both your solution and the suggestion below work fine for me. Please try this test case:
public static void main(String[] args) {
saveUser("test one", "test two", "test three");
}
private static void saveUser(String firstName, String lastName, String username){
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File("second.xml"));
Element root = doc.getDocumentElement();
root.normalize();
// build user node
Element userNode = doc.createElement("user");
Element nameNode = doc.createElement("name");
Element lastNameNode = doc.createElement("last-name");
Element usernameNode = doc.createElement("username");
userNode.appendChild(nameNode).setTextContent(firstName); //set the text content
userNode.appendChild(lastNameNode).setTextContent(lastName);
userNode.appendChild(usernameNode).setTextContent(username);
root.appendChild(userNode);
//write the updated document to file or console
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("second.xml"));
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(source, result);
}catch (Exception e) {
e.printStackTrace();
}
}
second.xml (before execution)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>
second.xml (first execution)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>
second.xml (second execution)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>
second.xml (third execution)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>
Imported classes:
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.parsers.DocumentBuilder; // missing import class
import org.w3c.dom.Document;
import org.w3c.dom.Element;
Solution 3:[3]
I found this solution using an XPath to be much cleaner than any of the others here (h/t to Isaac for his answer over at https://stackoverflow.com/a/12670194/1339923). It doesn't require a separate (i.e., XSLT) file, and doesn't require you to add 14 lines of Java to iterate over every node in the Document. Only 6 lines of code.
In the case of @pablo-r-grande's original question... right before this comment (i.e., just before loading the Document into the DOMSource):
//write the updated document to file or console
...I would add these lines:
// Generate list of all empty Nodes, then remove them
XPath xp = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xp.evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); ++i) { // note the position of the '++'
Node node = nl.item(i);
node.getParentNode().removeChild(node);
}
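To show where those lines sit in the overall flow, here is a self-contained sketch that parses XML from a string, strips the whitespace-only text nodes with the same XPath expression, and serializes with indentation enabled. The class name `BlankLineFreeWriter`, the method `stripAndSerialize`, and the in-memory input and output are assumptions made for the example; the original code reads from and writes to a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BlankLineFreeWriter {

    // Parses the XML, drops whitespace-only text nodes, and returns the indented output.
    static String stripAndSerialize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Select every text node that is empty after whitespace normalization.
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList blanks = (NodeList) xp.evaluate(
                    "//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }

            // Serialize with the same output properties as the question's code.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite the XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input that already contains accumulated blank lines.
        String xml = "<users>\n\n    <user><name>a</name></user>\n\n\n    <user><name>b</name></user>\n\n</users>";
        System.out.println(stripAndSerialize(xml));
    }
}
```

Because the whitespace nodes are removed on every run before the transformer indents the document, blank lines read back from a previous run should not accumulate.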
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | gavincook |
| Solution 2 | |
| Solution 3 | Lambart |
