How to avoid extra blank lines in XML generation with Java?

I'm developing some code with Java 9 and the javax.xml libraries (both mandatory for my task) that edits an XML file, and I'm having some weird issues when adding child nodes.

This is the XML file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>

and I want to edit it to build something like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
    <user>
        <name>A name</name>
        <last-name>Last Name</last-name>
        <username>username</username>
    </user>
</users>

Now, the first run of the code adds a single blank line before the <user> node. When it runs a second time, it fills the file with more blank lines:

<users>


    <user>

        <name>name</name>

        <last-name>lastname</last-name>

        <username>username</username>

    </user>

    <user>
        <name>name</name>
        <last-name>lastname</last-name>
        <username>username</username>
    </user>
</users>

This is the XML generated after running the program 2 times. As you can see, it adds blank lines before the <user> nodes and between the other nodes: exactly n-1 blank lines between nodes, where n is the number of times the code was executed.

Wondering what those nodes contain before the file is updated, I wrote the following code:

int i=0;
while (root.getChildNodes().item(i)!=null){
  Node aux = root.getChildNodes().item(i);
  System.out.println("Node text content: ".concat(aux.getTextContent()));
  i++;
}

1st execution:

Node text content: 

Node text content: namelastnameusername

2nd execution:

Node text content: 


Node text content: 
        name
        lastname
        username

Node text content: 

Node text content: namelastnameusername

3rd execution

Node text content: 



Node text content: 

        name

        lastname

        username


Node text content: 


Node text content: 
        name
        lastname
        username

Node text content: 

Node text content: namelastnameusername

Finally, this is the Java code:

private static void saveUser(String firstName, String lastName, String username){
  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    try {
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse(new File(databaseFile));
      Element root = doc.getDocumentElement();
      root.normalize();

      // build user node
      Element userNode = doc.createElement("user");
      Element nameNode =  doc.createElement("name");
      Element lastNameNode = doc.createElement("last-name");
      Element usernameNode = doc.createElement("username");

      //build structure
      nameNode.appendChild(doc.createTextNode(firstName));
      lastNameNode.appendChild(doc.createTextNode(lastName));
      usernameNode.appendChild(doc.createTextNode(username));

      userNode.appendChild(nameNode);
      userNode.appendChild(lastNameNode);
      userNode.appendChild(usernameNode);
      root.appendChild(userNode);

      //write the updated document to file or console
      TransformerFactory transformerFactory = TransformerFactory.newInstance();
      Transformer transformer = transformerFactory.newTransformer();
      DOMSource source = new DOMSource(doc);
      StreamResult result = new StreamResult(new File(databaseFile));
      transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
      transformer.setOutputProperty(OutputKeys.INDENT, "yes");
      transformer.transform(source, result);
    }catch (SAXException | ParserConfigurationException | IOException | TransformerException e1) {
      e1.printStackTrace();
    }
}

The only solution I could find is to delete blank lines after XML generation, but I think it's not a proper solution and I would like to find some alternatives first.

Any suggestions on how to tackle this problem?



Solution 1:[1]

In short: in Java 9, the only option is to delete the blank lines, either after the XML is generated or right after it is parsed from the file, like this:

private void clearBlankLine(Element element) {
    NodeList childNodes = element.getChildNodes();
    // Iterate backwards: the NodeList is live, so removing a child while
    // iterating forwards would skip the node that follows it.
    for (int index = childNodes.getLength() - 1; index >= 0; index--) {
        Node item = childNodes.item(index);
        if (item.getNodeType() == Node.TEXT_NODE
                && item.getNodeValue().trim().isEmpty()) {
            // Whitespace-only text node (line breaks plus any indentation)
            element.removeChild(item);
        } else if (item instanceof Element) {
            clearBlankLine((Element) item);
        }
    }
}

Then invoke this method with the root element.

Details:

During XML generation, each element goes through three phases: startElement, parse, and endElement. Indentation is implemented in the startElement phase, and indenting also adds a line break to the document.

The call stack differs between Java 8 and Java 9:

In Java 8: ToStream#startElement-> ToStream#indent(IfNecessary)

In Java 9: ToStream#startElement->ToStream#flushCharactersBuffer(IfNecessary)->ToStream#indent(IfNecessary)

flushCharactersBuffer also indents when indentation is enabled, for example with transformer.setOutputProperty(OutputKeys.INDENT, "yes");. In addition, the conditions for invoking flushCharactersBuffer and indent are almost the same.

That means that in Java 9 two line breaks are added for each element that needs indenting, which results in the blank lines.
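
Whatever the serializer does internally, the text that gets re-indented on every run lives in the DOM as whitespace-only text nodes, as the diagnostic output in the question suggests. The following is a minimal sketch that counts those nodes in a parsed document; the class name WhitespaceNodeCount and its method names are hypothetical and only serve this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; the name is not taken from the posts above.
final class WhitespaceNodeCount {

    // Parses the XML text and counts whitespace-only text nodes in the whole tree.
    static int countWhitespaceOnlyTextNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return count(doc.getDocumentElement());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Recursively counts text nodes that contain nothing but whitespace.
    private static int count(Node node) {
        int total = 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                total++;
            } else {
                total += count(child);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String xml = "<users>\n    <user>\n        <name>A name</name>\n    </user>\n</users>";
        // Two whitespace nodes around <user>, two around <name>
        System.out.println(countWhitespaceOnlyTextNodes(xml));
    }
}
```

If the count is non-zero right after parsing, those are the nodes a cleanup method such as clearBlankLine has to remove before the document is written again.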

Solution 2:[2]

Your solution and the suggestion below both work fine for me. Please try this test case:

public static void main(String[] args) {

    saveUser("test one", "test two", "test three");

}

private static void saveUser(String firstName, String lastName, String username){
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    try {
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new File("second.xml"));
        Element root = doc.getDocumentElement();
        root.normalize();

        // build user node
        Element userNode = doc.createElement("user");
        Element nameNode =  doc.createElement("name");
        Element lastNameNode = doc.createElement("last-name");
        Element usernameNode = doc.createElement("username");

        userNode.appendChild(nameNode).setTextContent(firstName); //set the text content
        userNode.appendChild(lastNameNode).setTextContent(lastName);
        userNode.appendChild(usernameNode).setTextContent(username);
        root.appendChild(userNode);

        //write the updated document to file or console
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new File("second.xml"));
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(source, result);

     }catch (Exception e) {
        e.printStackTrace();
     }
}

second.xml (before execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>

second.xml (first execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

second.xml (second execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

second.xml (third execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

Imported classes:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.parsers.DocumentBuilder; // missing import class

import org.w3c.dom.Document;
import org.w3c.dom.Element;

Solution 3:[3]

I found this solution using an XPath to be much cleaner than any of the others here (h/t to Isaac for his answer over at https://stackoverflow.com/a/12670194/1339923). It doesn't require a separate (i.e., XSLT) file, and doesn't require you to add 14 lines of Java to iterate over every node in the Document. Only 6 lines of code.

In the case of @pablo-r-grande's original question... right before this comment (i.e., just before loading the Document into the DOMSource):

//write the updated document to file or console

...I would add these lines:

// Generate a list of all whitespace-only text nodes, then remove them
XPath xp = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xp.evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); ++i) { // note the position of the '++'
    Node node = nl.item(i);
    node.getParentNode().removeChild(node);
}
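
To see the effect over repeated runs, here is a minimal end-to-end sketch under a few assumptions: it works on strings instead of the database file, it omits the XML declaration for brevity, and the class AppendUserDemo and method appendUser are hypothetical names. Each call parses the previous output, appends a user, strips whitespace-only text nodes with the XPath above, and serializes with INDENT enabled.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses strings rather than the file from the question.
final class AppendUserDemo {

    // Parses xml, appends one <user>, strips whitespace-only text nodes,
    // and returns the document serialized with indentation enabled.
    static String appendUser(String xml, String username) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Element user = doc.createElement("user");
            Element name = doc.createElement("username");
            name.setTextContent(username);
            user.appendChild(name);
            doc.getDocumentElement().appendChild(user);

            // Remove the indentation left over from the previous serialization
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList blanks = (NodeList) xp.evaluate(
                    "//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String first = appendUser("<users>\n</users>", "alice");
        String second = appendUser(first, "bob");
        System.out.println(second);
    }
}
```

Because the stripping happens on every call, the serializer always starts from a DOM without leftover indentation, so the output should not accumulate blank lines no matter how many times the round trip is repeated.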

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 gavincook
Solution 2
Solution 3 Lambart