XSLT output formatting: removing line breaks and blank output lines from removed elements while keeping indent

Here is my XML:

<doc xmlns="http://www.foo.org">
  <div>
    <title>Mr. Title</title>
    <paragraph>This is one paragraph.
    </paragraph>
    <paragraph>Another paragraph.
    </paragraph>
    <list>
      <orderedlist>
        <item>
          <paragraph>An item paragraph.</paragraph>
        </item>
        <item>
          <paragraph>Another item paragraph</paragraph>
        </item>
      </orderedlist>
    </list>
  </div>    
</doc>

Here is my XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foo="http://www.foo.org">

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="foo:doc">
  <xsl:element name="newdoc" namespace="http://www/w3.org/1999/xhtml">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:div">
  <segment title="{foo:title}">
   <xsl:apply-templates/>
  </segment>
 </xsl:template>

 <xsl:template match="foo:title">
  <xsl:element name="h2">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:paragraph">
  <xsl:element name="p">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:list">
  <xsl:apply-templates/>
 </xsl:template>

 <xsl:template match="foo:orderedlist">
  <xsl:element name="ol">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:item">
  <xsl:element name="li">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:item/foo:paragraph">
  <xsl:apply-templates/>
 </xsl:template>

</xsl:stylesheet>

And the output:

<newdoc xmlns="http://www/w3.org/1999/xhtml">
  <segment xmlns="" title="Mr. Title">
    <h2>Mr. Title</h2>
    <p>This is one paragraph.
    </p>
    <p>Another paragraph.
    </p>

      <ol>
        <li>
          An item paragraph.
        </li>

        <li>
          Another item paragraph
        </li>
      </ol>

  </segment>    
</newdoc>

I would like to change 3 things about this output:

  1. remove the line break from the "p" elements (originally paragraph)
  2. remove the line breaks from the "li" elements (produced when item/paragraph elements were removed)
  3. remove the extra blank lines created when the list items were removed

- I have tried <xsl:template match="foo:list/text()[normalize-space(.)='']" /> for #3, but this breaks the indentation.

- I have also tried <xsl:template match="foo:paragraph/text()[normalize-space(.)='']" /> for #1, but this has no effect on the line breaks.

- And I have tried <xsl:strip-space elements="*"/>, but this eliminates all indentation.

Thank you!!



Solution 1:[1]

Adding these templates to your stylesheet:

<xsl:template match="*/text()[normalize-space()]">
    <xsl:value-of select="normalize-space()"/>
</xsl:template>

<xsl:template match="*/text()[not(normalize-space())]" />

Produces this output:

<?xml version="1.0" encoding="UTF-8"?>
<newdoc xmlns="http://www/w3.org/1999/xhtml">
    <segment xmlns="" xmlns:foo="http://www.foo.org" title="Mr. Title">
        <h2>Mr. Title</h2>
        <p>This is one paragraph.</p>
        <p>Another paragraph.</p>
        <ol>
            <li>An item paragraph.</li>
            <li>Another item paragraph</li>
        </ol>
    </segment>
</newdoc>

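The template with match="*/text()[normalize-space()]" matches a text() node only when normalize-space() returns a non-empty string. Its body then outputs the normalized text, which strips leading and trailing whitespace (including the line break before the closing tag) and collapses internal runs of whitespace to single spaces. An all-whitespace text() node normalizes to the empty string, which evaluates to false(), so that template does not match it. The second template matches the opposite condition and, because it is empty, drops whitespace-only text() nodes from the output.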
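To see the two templates in context, here is a minimal, self-contained sketch that pairs them with the identity template from the question. It leaves out the foo: element-renaming templates for brevity; that reduction is an assumption made for illustration, not part of the original answer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Identity template: copies any node that no more specific
       template claims. Its default priority is -0.5. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text with visible content: emit it with leading and trailing
       whitespace removed and internal runs collapsed to one space. -->
  <xsl:template match="*/text()[normalize-space()]">
    <xsl:value-of select="normalize-space()"/>
  </xsl:template>

  <!-- Whitespace-only text: emit nothing, so the serializer's own
       indentation (indent="yes") is the only whitespace left. -->
  <xsl:template match="*/text()[not(normalize-space())]"/>

</xsl:stylesheet>
```

Both text() templates have a default priority of 0.5, so they win over node()|@* without an explicit priority attribute. One caveat: normalize-space() also trims spaces next to inline child elements, so in mixed content such as <p>Some <b>bold</b> text</p> the words would run together.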

Solution 2:[2]

At the very end of the stylesheet add these two templates:

<xsl:template match=
"text()[not(string-length(normalize-space()))]"/>

<xsl:template match=
"text()[string-length(normalize-space()) > 0]">
  <xsl:value-of select="translate(.,'&#xA;&#xD;', '  ')"/>
</xsl:template>

This produces the following result; the line breaks are gone, though the p elements keep trailing spaces:

<?xml version="1.0" encoding="UTF-8"?>
<newdoc xmlns="http://www/w3.org/1999/xhtml">
   <segment xmlns="" xmlns:foo="http://www.foo.org" title="Mr. Title">
      <h2>Mr. Title</h2>
      <p>This is one paragraph.         </p>
      <p>Another paragraph.         </p>
      <ol>
         <li>An item paragraph.</li>
         <li>Another item paragraph</li>
      </ol>
   </segment>
</newdoc>
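The trailing spaces inside the p elements above come from translate(), which swaps each line feed or carriage return for a space but leaves the indentation that followed it. The standalone sketch below contrasts that with normalize-space() on a hard-coded sample string (a made-up stand-in for the paragraph text, not taken from either answer). It can be run against any input document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample: text, a line feed, then four spaces of
         indentation, as found before a closing tag on its own line. -->
    <xsl:variable name="sample" select="'One paragraph.&#xA;    '"/>

    <!-- translate(): the line feed becomes a space and the
         indentation is kept. -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="translate($sample, '&#xA;&#xD;', '  ')"/>
    <xsl:text>]&#xA;</xsl:text>

    <!-- normalize-space(): leading and trailing whitespace is removed. -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="normalize-space($sample)"/>
    <xsl:text>]&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With a conforming XSLT 1.0 processor, the first bracketed line should keep five trailing spaces (one from the replaced line feed, four from the indentation) and the second none. If the trailing spaces matter, normalize-space() as in Solution 1 removes them, at the cost of also collapsing internal whitespace.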

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Dimitre Novatchev