Adams Bros Blog

22 Nov 2009

XSL Break or Wrap String on Word Boundary

I searched all over the Internet for this and couldn't find anything reasonable.  I did find an example of how to break a string at a fixed position, but it breaks there whether or not that position falls in the middle of a word.  That leaves two bad choices: re-compose the XML elements without a space and hope every system you interact with does the same thing, or re-compose them with a space and risk a word being split in two in the end result.

In the example below, I break a string on a word boundary and output each piece to an XML element called "NoteMessage" from the PESC standard.  This is dependent on Java, but you could use any language that has a useful lastIndexOf function.  Java's string indexes are zero-based while XSL's are one-based, so we compensate by adding one to the result of the lastIndexOf() call.
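To make the off-by-one concrete, here is a minimal Java sketch of the same arithmetic outside of XSL. The class and method names are made up for illustration; only lastIndexOf and substring are real String methods.

```java
final class IndexOffsetDemo {
    // Java's lastIndexOf is zero-based and returns -1 when nothing matches.
    // Adding 1 gives the one-based position XSL's substring() counts with,
    // or 0 when the string contains no space at all.
    static int lastSpacePosition(String s) {
        return s.lastIndexOf(" ") + 1;
    }

    public static void main(String[] args) {
        String chunk = "break a string";
        int position = lastSpacePosition(chunk);
        // XSL substring($chunk, 1, $position) matches Java chunk.substring(0, position)
        System.out.println("[" + chunk.substring(0, position) + "]");
        // XSL substring($chunk, $position + 1) matches Java chunk.substring(position)
        System.out.println("[" + chunk.substring(position) + "]");
    }
}
```

Running it prints `[break a ]` and `[string]`: the space stays at the end of the first piece, which is also what the stylesheet does.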

<!--
Example...
java -cp target/dependency/xml-util-0.1.3-SNAPSHOT.jar:/usr/share/xalan/lib/xalan.jar \
org.apache.xalan.xslt.Process -XSL target/classes/xsl/notemessage.xsl \
 -PARAM testString "This is a test to break a string into multiple note messages, \
automatically, without programming."
-->
<xsl:stylesheet version="1.0"
                xmlns:String="http://xml.apache.org/xalan/java/java.lang.String"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="${xml.indent}" method="xml" omit-xml-declaration="yes"/>
  <xsl:param name="testString"/>
  <xsl:variable name="break-at" select="'76'"/>

  <xsl:template match="/">
    <xsl:call-template name="note-message">
      <xsl:with-param name="string" select="$testString"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="note-message">
    <xsl:param name="string"/>
    <xsl:choose>
      <xsl:when test="string-length($string) &lt;= $break-at">
        <xsl:element name="NoteMessage">
          <xsl:value-of select="$string"/>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
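        <!-- Take the first $break-at characters as a java.lang.String, then
             find the last space in it. lastIndexOf is zero-based (and returns
             -1 if there is no space), so add 1 to get the one-based position
             that XSL's substring() expects. Note that with no space at all,
             lastSpaceIndex is 0 and the recursion below never advances. -->
        <xsl:variable name="truncString"
                      select="String:new(substring($string, 1, $break-at))"/>
        <xsl:variable name="lastSpaceIndex"
                      select="String:lastIndexOf($truncString, ' ') + 1"/>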
        <xsl:element name="NoteMessage" namespace="">
          <xsl:value-of select="substring(string($truncString), 1, $lastSpaceIndex)"/>
        </xsl:element>
        <xsl:call-template name="note-message">
          <xsl:with-param name="string"
                          select="substring($string, $lastSpaceIndex + 1)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
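
For comparison, here is roughly the same splitting logic as a plain Java sketch, written as a loop instead of a recursive template. The names NoteSplitter, split and breakAt are hypothetical. The hard break when a chunk contains no space is my assumption about what you would want there, not something the stylesheet does; without some fallback, a last-space position of 0 would mean no progress is ever made.

```java
import java.util.ArrayList;
import java.util.List;

final class NoteSplitter {
    // Splits text into pieces of at most breakAt characters, cutting just
    // after the last space in each piece (the space stays with the earlier
    // piece, as in the stylesheet).
    static List<String> split(String text, int breakAt) {
        if (breakAt < 1) {
            throw new IllegalArgumentException("breakAt must be at least 1");
        }
        List<String> notes = new ArrayList<>();
        String rest = text;
        while (rest.length() > breakAt) {
            String trunc = rest.substring(0, breakAt);
            int cut = trunc.lastIndexOf(' ') + 1; // 0 when trunc has no space
            if (cut == 0) {
                cut = breakAt; // assumed fallback: hard break mid-word
            }
            notes.add(rest.substring(0, cut));
            rest = rest.substring(cut);
        }
        notes.add(rest); // whatever is left fits in one piece
        return notes;
    }

    public static void main(String[] args) {
        for (String note : split("This is a test to break a string into multiple note messages, automatically, without programming.", 76)) {
            System.out.println("<NoteMessage>" + note + "</NoteMessage>");
        }
    }
}
```

For example, `NoteSplitter.split("aaa bbb ccc", 5)` gives the three pieces `"aaa "`, `"bbb "` and `"ccc"`, while `NoteSplitter.split("abcdefghij", 4)` falls back to hard breaks and gives `"abcd"`, `"efgh"` and `"ij"`.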
Comments (4)
  1. What is that xml-util-0.1.3-SNAPSHOT.jar? Is it used?

    • Sorry, that was a proprietary thing. I had copy/pasted an example from some real code, and massaged it.

      • I hate to be a bother about this, but I have a few more questions:
        1. why is the String:new necessary prior to the call to lastIndexOf?
        2. Isn’t lastIndexOf a method that operates on an instance of a string? It looks like it’s taking the string to search as the first parameter, and the space (the search term) as a second parameter. Can you explain why that works?

        I’ve been working with Stylus Studio here and using external functions, and I have a similar requirement due to some overflow of text using Apache FOP. Stylus Studio runs your example perfectly IF I change the processor to XALAN-J rather than their built-in one. Otherwise, it can’t seem to find these functions, even though I would think they’d be “registered” by default.

        The only change I had over your function was that there might not be a space in the input text, so it MUST wrap at 35 characters, regardless.

        • It’s been a while since I used this. And you’re not a bother! 😀

          1. I am not positive that the “new” is required. I thought it was at the time, but I could be wrong.
          2. Yes, the first parameter is a way of telling the XSL processor to call “lastIndexOf” on that string instance. So I suspect the “new” was used to create the instance to be used. But again, I’m not sure it’s required. You could try just passing the raw XSL “substring” and see what happens.

          Keep in mind that if you use XSL 2.0, there are a LOT of new features, and I’m betting you could do this easily without calling out to a specific language such as Java. Xalan doesn’t support 2.0, but Saxon does. The coolest thing about XSL 2.0 is that you can write native “functions”: you don’t have to apply a template, and calling one is just like calling any other XSL function, such as “substring”. I could work up an example if I remember when I’m at work. I’ve done some XSL 2.0 stuff at work.

