newLISP Fan Club

Forum => newLISP newS => Topic started by: cormullion on December 10, 2008, 11:24:54 AM

Title: Checking whether running a utf version?
Post by: cormullion on December 10, 2008, 11:24:54 AM
What's the best way to check whether the current newlisp session is running in UTF8 mode? And is it then possible to switch between UTF8 and non-UTF8 functions such as utf8len <-> length?



I'm trying to make sure that a program runs OK on both types of newLISP, but not sure how it can be done?



BTW: who isn't using UTF8 these days?
Title: Re: Checking whether running a utf version?
Post by: xytroxon on December 10, 2008, 12:03:32 PM
Quote from: "cormullion": BTW: who isn't using UTF8 these days?


ME ;)



It's easier to use NON-UTF-8 with data generated by and for Windows legacy apps... (Fewer surprises!)



-- xytroxon
Title:
Post by: HPW on December 10, 2008, 12:34:10 PM
Quote: BTW: who isn't using UTF8 these days?


Me too. As a user of the DLL, I use it in an environment which also does not support it.
Title:
Post by: newdep on December 10, 2008, 12:44:17 PM
I don't use utf-8, actually you don't want to know my statement on utf-8 ;-)



If a user has 2 binaries, you could decide which one to execute in a pre-check script based on utf-8

..



But i liked your check in the GS color gadget...





PS: Who uses utf-8 these days?

Voice is the gadget people..:) (like it was 15 years ago ;-)
Title:
Post by: cormullion on December 10, 2008, 01:29:35 PM
If this works:


(if-not unicode (begin (println "need UTF version") (exit)))

I'll use something similar.



Hard for me to test, of course... :)



My understanding had been that a Unicode version of newLISP can process all data generated by non-Unicode systems... but that non-Unicode versions of newLISP can't process data generated by Unicode applications. Perhaps I'd got it wrong.
Title:
Post by: newdep on December 10, 2008, 01:37:02 PM
I think that is correct indeed.. the utf-8 version can handle non-utf-8 code..
Title:
Post by: Lutz on December 10, 2008, 02:02:51 PM
Here are some ways to test if the running newLISP is UTF-8 enabled:


(= (char (char 1000)) 1000) => true on UTF-8 versions

(primitive? utf8) => true on UTF-8
(primitive? unicode) => true on UTF-8

; or simply check if the utf8 or unicode function is there

(if utf8 true nil) => true on UTF-8
(if unicode true nil) => true on UTF-8




The difference between the two versions is the working of several functions dealing with strings:



http://www.newlisp.org/downloads/newlisp_manual.html#utf8_capable



and the addition of the three functions: 'utf8', 'utf8len' and 'unicode'. The UTF-8 version can handle ASCII non-UTF-8 strings without a problem. A problem could occur when processing one-byte character sets, which encode characters in the range 128-255. Portions of such text could be mistaken for UTF-8 sequences by the UTF-8 version of newLISP when using the functions in the above link. This could occur with popular one-byte character sets on Windows, such as ISO-8859, which use the bytes beyond 127.
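
As for the original question about switching between utf8len and length, here is a minimal sketch of one way it might be done, building on the checks above. The names 'utf8-build?' and 'char-count' are made up for this illustration and are not part of newLISP. The sketch assumes that on a non-UTF-8 version the symbol utf8len simply evaluates to nil, so it is harmless as long as that branch is never taken.

```newlisp
; Sketch only: 'utf8-build?' and 'char-count' are hypothetical names.

; Decide once, at load time, whether this is a UTF-8 enabled version.
; On a non-UTF-8 version utf8len is an unbound symbol, so this gives nil.
(define utf8-build? (primitive? utf8len))

; Count characters with utf8len where it exists, else fall back to length.
(define (char-count str)
  (if utf8-build?
      (utf8len str)    ; counts characters on UTF-8 versions
      (length str)))   ; counts bytes, one byte per character in one-byte sets

(println (char-count "hello"))  ; prints 5 on either version
```

The rest of the program can then call 'char-count' everywhere instead of testing for the version at each call site. The caveat about one-byte character sets with bytes in the range 128-255 still applies when the UTF-8 branch is taken.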