Checking whether running a UTF-8 version?

Started by cormullion, December 10, 2008, 11:24:54 AM


cormullion

What's the best way to check whether the current newLISP session is running in UTF-8 mode? And is it then possible to switch between UTF-8 and non-UTF-8 functions such as utf8len <-> length?



I'm trying to make sure that a program runs OK on both types of newLISP, but I'm not sure how it can be done.



BTW: who isn't using UTF8 these days?

xytroxon

#1
Quote from: "cormullion"
BTW: who isn't using UTF8 these days?


ME ;)



It's easier to use non-UTF-8 with data generated by and for Windows legacy apps... (Fewer surprises!)



-- xytroxon
"Many computers can print only capital letters, so we shall not use lowercase letters."

-- Let's Talk Lisp (c) 1976

HPW

#2
Quote: BTW: who isn't using UTF8 these days?


Me too. Users of the DLL use it in an environment which also does not support UTF-8.
Hans-Peter

newdep

#3
I don't use UTF-8. Actually, you don't want to know my opinion on UTF-8 ;-)



If a user has 2 binaries, you could choose which one to execute in a pre-check script based on UTF-8

..



But I liked your check in the GS color gadget...





PS: Who uses UTF-8 these days?

Voice is the gadget people..:) (like it was 15 years ago ;-)
-- (define? (Cornflakes))

cormullion

#4
If this works:


(if-not unicode (begin (println "need UTF version") (exit)))

I'll use something similar.



Hard for me to test, of course... :)



My understanding had been that a Unicode version of newLISP can process all data generated by non-Unicode systems... but that non-Unicode versions of newLISP can't process data generated by Unicode applications. Perhaps I'd got it wrong.

newdep

#5
I think that is indeed correct: the UTF-8 version can handle non-UTF-8 code.
-- (define? (Cornflakes))

Lutz

#6
Here are some ways to test if the running newLISP is UTF-8 enabled:


(= (char (char 1000)) 1000) ; => true on UTF-8 versions

(primitive? utf8)    ; => true on UTF-8
(primitive? unicode) ; => true on UTF-8

; or simply check if the utf8 or unicode function is there

(if utf8 true nil)    ; => true on UTF-8
(if unicode true nil) ; => true on UTF-8
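
A check like these could also be used to choose between utf8len and length, as asked in the first post. The following is only a minimal sketch: the name char-count is hypothetical and not part of newLISP, and it assumes that on a non-UTF-8 build one byte equals one character.

```newlisp
; char-count is a hypothetical helper, not a newLISP built-in.
; On UTF-8 builds utf8len is a primitive; on other builds the symbol
; utf8len is unbound (nil), so the check fails and length is used.
(define (char-count str)
  (if (primitive? utf8len)
      (utf8len str)    ; counts UTF-8 characters
      (length str)))   ; counts bytes (one byte per character assumed)

(println (char-count "hello"))  ; prints 5 on either build
```

Doing the check inside the function keeps the rest of the program identical on both builds; only this one helper needs to know which version is running.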




The difference between the two versions lies in the behavior of several functions dealing with strings:



http://www.newlisp.org/downloads/newlisp_manual.html#utf8_capable



and the addition of the three functions: 'utf8', 'utf8len' and 'unicode'. The UTF-8 version can handle ASCII non-UTF-8 strings without a problem. A problem could occur when processing one-byte character sets, which encode characters in the range 128-255. Portions of such text could be mistaken for UTF-8 by the UTF-8 version of newLISP when using the functions listed at the link above. This could occur with the popular one-byte ISO-8859 character sets used on Windows, which use the bytes beyond 127.
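
To illustrate the one-byte problem, here is a small sketch using the word "café" written as raw bytes. The variable names are made up for this example, and the ISO-8859-1 byte value for é (233) is used as a representative case.

```newlisp
; Illustration only; as-latin1 and as-utf8 are hypothetical names.
; String escapes of the form \nnn are three-digit decimal byte values.
(set 'as-latin1 "caf\233")     ; ISO-8859-1: é is the single byte 233
(set 'as-utf8 "caf\195\169")   ; UTF-8: é is the byte pair 195 169

(println (length as-latin1))   ; byte count: 4
(println (length as-utf8))     ; byte count: 5

(if (primitive? utf8len)
    (begin
      (println (utf8len as-utf8))     ; character count: 4
      ; byte 233 looks like the start of a multi-byte UTF-8 sequence,
      ; so the result for the ISO-8859-1 string should not be relied on
      (println (utf8len as-latin1)))
    (println "non-UTF-8 build: only byte counts are available"))
```

Pure ASCII text avoids the issue entirely, since all of its bytes are below 128 and mean the same thing in both encodings.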