trim and utf-8 oddness

Started by cormullion, January 26, 2006, 02:28:39 AM


cormullion

I'm using newLISP UTF-8 on MacOS. Why does the \000 start appearing?


> (set 't "a hypothetical one-dimensional subatomic particle")
"a hypothetical one-dimensional subatomic particle"
> (trim t)
"a hypothetical one-dimensional subatomic particle"
> t
"a hypothetical one-dimensional subatomic particle"
> (trim t "e")
"a hypothetical one-dimensional subatomic particl"
> t
"a hypothetical one-dimensional subatomic particl\000"
> (trim t "a" "e")
" hypothetical one-dimensional subatomic particl"
> t
" hypothetical one-dimensional subatomic particl\000\000"
>

Lutz

#1
It has always been this way, and you see the \000 only when a string is returned on the command line. When you print it, it's OK:



> (set 's "ABC\000")
"ABC\000"
> (println s)
ABC
"ABC\000"
>


newLISP can work with binary content in strings, and for debugging purposes it is important to have a way to 'see' that binary content. Note that the string also appears quoted.
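
As a minimal sketch of the point about binary content (the symbol name s and the string are only illustrative), the zero byte can be made visible without relying on the console echo by asking for the byte length of the string:

```newlisp
; A string with one trailing zero byte, written with the \nnn decimal escape.
(set 's "ABC\000")

; length counts bytes, so the zero byte is counted as part of the string.
(println (length s))

; println writes the raw bytes; a terminal normally shows only ABC.
(println s)
```

The first println should report 4 rather than 3, which is the same information the quoted console echo gives by showing the escape.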



But there is another issue with characters > ASCII 127. Starting a few development versions back, these characters were also shown in \nnn format. This was not good for European users and for Windows users using the PC codepage 859, which carries special European characters, money symbols, other symbols and some graphical characters.



Starting with version 8.7.10, upper ASCII will only be shown as \nnn codes when the default "C" locale is specified. When any other locale is specified, upper ASCII will be displayed as characters, not codes.
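
A rough way to try this in the console is sketched below. It assumes a one-byte code page in which byte 228 is a printable character, and the symbol name s is only illustrative. On a UTF-8 build a lone byte above 127 is not a complete character, so the display may differ there.

```newlisp
; One byte above ASCII 127, written with the \nnn decimal escape.
(set 's "\228")

; Select the default "C" locale. Per the description above, the console
; then echoes s using a \nnn code.
(set-locale "C")
s

; An empty string switches to the platform's default locale. With a
; non-"C" locale the console is described as echoing the character itself.
(set-locale "")
s
```

Only the console echo of the return value is affected by the locale; the bytes stored in s stay the same.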



But remember, all this discussion is only about strings displayed in the interactive newLISP console as return values. When displaying upper ASCII with 'print', 'println', etc., the character will be displayed as a '?' question mark if it is not part of the current code page.



Lutz



ps: and there is an entirely different issue with 'trim', which seems to behave destructively in UTF-8, which it should not, and will be fixed in 8.9.10
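
A small check for the 'trim' issue could look like the sketch below. The symbol names original and trimmed are only illustrative, and the comparison assumes that 'trim' is meant to return a new string and leave its argument alone.

```newlisp
; Keep the text in one symbol and trim it into another.
(set 'original "a hypothetical one-dimensional subatomic particle")
(set 'trimmed (trim original "e"))

; On a build without the problem this prints true, because the
; original string is expected to be unchanged after the call.
(println (= original "a hypothetical one-dimensional subatomic particle"))

; The trimmed copy has lost its trailing "e".
(println trimmed)
```

On an affected UTF-8 build the comparison would presumably print nil, since the session at the top of the thread shows the original string picking up a zero byte where the trimmed character was.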

cormullion

#2
Thanks. As you say, it only looks odd!