Lutz,
When I use xml-parse to read XHTML (which is valid XML), any attribute that contains quotation marks, even escaped ones or ones inside a CDATA block, causes an error:
<a href="/some/path" title="something" onclick="alert("Hello world")">Looky here</a>
This should validate. The only other way to make it render correctly would be to use single quotes, which means translating every double quote directly into a single quote. That could corrupt JavaScript, which can use both kinds of quote in the same string. The following example should be valid XML and demonstrates the problem:
<a href="/some/path/" title="something" onclick="alert("Hello" + 'world')">Looky here</a>
I could get it to validate using the HTML entity, but then it would not function when rendered as a string. Escaping the quotes should be valid; I can't find anything forbidding this in the XHTML spec.
PS Excuse the visible entities - can't figure out any other way to display the contents of an html tag, since the BB seems to be stripping any attributes off.
When you use XML without DTD validation (xml-parse does no validation; it only checks that the XML is well-formed), the following characters are reserved by the XML spec and should not appear literally in text or attribute values:
greater-than (>)
less-than (<)
ampersand (&)
quote (")
apostrophe (')
Use entities to encode them.
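As a rough illustration of the entity encoding described above, here is a minimal sketch in Python rather than newLISP, using only Python's standard library as a stand-in well-formedness checker. The helper names `encode_attr` and `is_well_formed` are made up for this example, and the behaviour shown is that of Python's expat-based parser, which may differ in detail from xml-parse.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def encode_attr(value):
    """Replace the five reserved characters with their predefined entities."""
    # escape() handles &, < and > itself; the two quote characters
    # are supplied as extra mappings.
    return escape(value, {'"': "&quot;", "'": "&apos;"})


def is_well_formed(markup):
    """Return True if the markup parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# JavaScript that uses both kinds of quote in the same expression.
js = "alert(\"Hello\" + 'world')"
template = '<a href="/some/path/" title="something" onclick="%s">Looky here</a>'

raw = template % js                    # literal quotes inside the attribute
encoded = template % encode_attr(js)   # quotes replaced by entities

print(is_well_formed(raw))
print(is_well_formed(encoded))

# The parser decodes the entities again, so the script text survives intact.
print(ET.fromstring(encoded).get("onclick") == js)
```

The literal-quote version should be rejected and the entity-encoded version accepted, with the original script text recovered from the parsed attribute.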
Inside a CDATA section, newLISP will process these characters correctly:
> (xml-type-tags nil nil nil nil)
(nil nil nil nil)
> (xml-parse {<data><![CDATA[<>&"']]></data>} 15)
((data "<>&\"'"))
>
but XSLT will translate all special characters in CDATA into entities, so there is no safe way to use special characters in a CDATA block. The best approach is to base64-encode all CDATA strings.
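The base64 suggestion can be sketched as follows, again in Python as a stand-in rather than newLISP. The function names `pack` and `unpack` and the `data` element are assumptions made for illustration. The point is only that the base64 alphabet contains none of the five reserved characters, so the payload passes through any XML tool unchanged.

```python
import base64
import xml.etree.ElementTree as ET


def pack(text):
    """Wrap text in a <data> element, base64-encoded so no reserved characters remain."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "<data>%s</data>" % encoded


def unpack(markup):
    """Parse the element and decode its base64 content back to the original text."""
    root = ET.fromstring(markup)
    return base64.b64decode(root.text).decode("utf-8")


payload = "<>&\"'"          # the same five characters used in the CDATA test
document = pack(payload)

print(document)
print(unpack(document) == payload)
```

The trade-off is that the content is no longer human-readable in the XML source, and every consumer has to know to decode it.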
Lutz
ps: note that xml-parse is an XML parser, not an XHTML parser with HTML DTD validation.
I don't need DTD validation. I wasn't going that far with it. Checking for well-formed markup is all that I am after.