Page tree
Skip to end of metadata
Go to start of metadata

Purpose

This document is for anyone who uses the SQL Anywhere OPENXML function, working with string node elements that can potentially include special characters.

Overview

The XML parsing mechanism used in SQL Anywhere’s OPENXML function processes a number of special characters not as literal characters in a string node, but instead would split the node. These characters include, but are not limited to, &quot; (the quote character), &amp; (the ampersand), &lt; (the < character) and &gt; (the > character)

Example 1

SELECT * FROM OPENXML( '<products> 
           <prod_type id="301">Tee Shirt</prod_type>
           <prod_type id="401">Baseball Cap</prod_type>
           </products>',
           '/products/prod_type' )
WITH ( prod_name LONG VARCHAR 'text()', prod_id CHAR(3) '@id')

gives you the result set

RESULT SET 1

Prod_nameProd_id
Tee Shirt301
Baseball Cap401

If you wanted to add special characters like these ampersands, quotes, etc., this is shown below in EXAMPLE 2:

Example 2

SELECT * FROM OPENXML( '<products> 
           <prod_type id="301">Tee & Shirt</prod_type>
           <prod_type id="401">Baseball < Cap</prod_type>
           </products>',
           '/products/prod_type' )
WITH ( prod_name LONG VARCHAR 'text()’, prod_id CHAR(3) '@id')

The intention is to get a result set displayed as follows:

RESULT SET 2

Prod_nameProd_id
Tee & Shirt301
Baseball < Cap401

The problem is, the XML is parsed as an element node with three children. For example in the "301" row, the first child is a text node "Tee". The second is an entity reference node. The third is a text node "Shirt". The entity reference node has one text node '"' as a child. On the other hand, text() just returns the first node As a result instead of RESULT SET 2, you get an unexpected result set, with prod_name truncated at the end of the first node:

RESULT SET 3

Prod_nameProd_id
Tee301
Baseball401

If you want the full string, then you need to concatenate the three text nodes in the subtree. To achieve the displayed result set as shown in Example 2, you need to replace ‘text() ‘ with ‘.’

Example 3


SELECT * FROM OPENXML( '<products>
           <prod_type id="301">Tee & Shirt</prod_type>
           <prod_type id="401">Baseball < Cap</prod_type>
           </products>',
           '/products/prod_type' )
WITH ( prod_name LONG VARCHAR '.', prod_id CHAR(3) '@id')
  • No labels