Just posted the first compilable Unicode/UTF-8 development version. I tested with Cyrillic/Greek/Hebrew/Russian character sets, but could not test on a platform with keyboard support for those characters, or on platforms that rely heavily on multibyte characters, such as Chinese/Japanese/Indian/Arabic character sets, and also take them as keyboard input.
I believe JP (Jean Pierre) on this board is running Japanese Windows?
There could be (but shouldn't be) differences between running the Tcl/Tk frontend and running newlisp.exe or newlisp (the Linux binary) alone. The Tcl/Tk frontend switches fine on Linux, but I could not test it on Win32. It is not only about display but also about the correct working of the UTF-8 versions of specific string functions (see the CHANGES file and the manual chapter about UTF-8), like 'trim', 'nth', 'upper-case', etc.
Any feedback about this is appreciated.
Lutz
Running Turtle.lsp with the UTF-8 EXE gives an error ' Bad screen distance "302,1612092" '.
When I use a German umlaut in a string with the trim command, I get a strange result.
Quote
> (trim "Höhe;;" ";")
"Hö¨¥–"
Do I need to give the trim command a UTF-8 string where the umlaut is encoded in a compatible way?
I don't think you should use the UTF-8 version on Win32 in Germany, where Windows is localized with German as a one-byte-character language, probably with code page ISO-8859.
Windows in Germany and other European countries will display Unicode in notepad.exe and some other applications, but is otherwise not a Unicode-enabled OS.
Lutz
just found out:
(trim "Höhe;;" ";") => "Höhe"
works fine on newlisp-utf8.exe when in the command shell; it is together with the Tcl/Tk frontend that it gets confused. I wonder if on Win32 Tcl/Tk has to be compiled as a Unicode application with unicows.dll/lib etc. On Linux it switches on startup.
Lutz
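For anyone who wants to narrow this down on their own machine, here is a minimal sketch that compares the byte count of the string before and after trim. It assumes the script file itself is saved as UTF-8, and it guards the call to utf8len, which exists only in UTF-8 enabled builds. The byte counts in the comments describe what a correctly working trim should give, not what any particular build was observed to print.

```newlisp
;; Sketch only; assumes this file is saved as UTF-8.
(set 'str "Höhe;;")

;; For strings, length counts bytes. "ö" takes two bytes in UTF-8,
;; so a UTF-8 "Höhe;;" is 7 bytes long.
(println "bytes before trim: " (length str))

;; A correct trim removes only the two ";" bytes, leaving 5 bytes.
(set 'trimmed (trim str ";"))
(println "bytes after trim:  " (length trimmed))

;; utf8len is available only in UTF-8 enabled builds, so guard the call.
(if (primitive? utf8len)
    (println "characters after trim: " (utf8len trimmed))
    (println "utf8len not available in this build"))
```

If the byte count after trim is not 5, the string was damaged by trim itself rather than by the display.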
Thanks for the info.
In the utf8 documentation there is a typo:
'The utf8 function is used top convert from UCS-4 to UTF-8'
We all know that newLISP is top, but I think it should read 'to'.
Maybe the trim example in the documentation could be clearer:
(trim "00012340" "0") => "1234"
(trim "00012340" "0" "") => "12340"
(trim "01234000" "" "0") => "01234"
Lutz
Strangely enough, I tried your newlisp-utf8.exe and found that it breaks code when used with UTF-8 strings, while the regular newLISP does not.
For example, if you run the strings ...
(trim " 日本語が難しい ") ;; Japanish ist schwer (UTF8 in Japanese)
(trim " Er ist ein großer Schwätzer ") ;; Er ist ein grosser Schwaetzer (UTF8 in German)
The code will be broken in both cases by newlisp-utf8.exe but left intact by newLISP.
Jean-Pierre
Quote from: "Lutz"
I don't think you should use the UTF-8 version on Win32 in Germany, where Windows is localized with German as a one-byte-character language, probably with code page ISO-8859.
Lutz
Indeed under XP with the default code page chcp 437, under the command prompt
echo (trim "Höhe;;" ";") > test.txt
notepad test.txt will show that we have an ANSI coded file.
Jean-Pierre
It seems that Windows uses Unicode only internally and otherwise translates to one-byte-character code pages. But loading a UTF-8 file into notepad.exe works correctly. You can also read this file in newlisp-utf8.exe, upper-case the string and write it back, and it will be fine in notepad.exe. 'upper-case' in newLISP converts the UTF-8 string to 4-byte Unicode, calls the Borland/Windows or Linux library function towupper(), then converts back to UTF-8. notepad.exe also has a save-as option for UTF-8.
I wonder if all you need is a UTF-8 compiled cmd.exe, as is the case on Linux with Xterm, and I thought that perhaps Japanese Windows would be like this. Did you try your experiment on US WinXP or on a Japanese localized version?
Lutz
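The read, upper-case and write-back experiment described above could look roughly like this. It is only a sketch: "in.txt" and "out.txt" are placeholder file names, and the round trip through unicode and utf8 is guarded because those functions exist only in UTF-8 enabled builds.

```newlisp
;; Sketch only. "in.txt" and "out.txt" are placeholder file names.
(set 'text (read-file "in.txt"))

(if text
    (begin
      ;; upper-case the whole text and write it to a second file
      (write-file "out.txt" (upper-case text))
      ;; unicode and utf8 exist only in UTF-8 builds; for valid UTF-8 input,
      ;; a round trip through 4-byte Unicode should give back the original
      (if (primitive? unicode)
          (println "round trip ok: " (= text (utf8 (unicode text))))))
    (println "could not read in.txt"))
```

Opening out.txt in notepad.exe then shows whether the accented characters survived the conversion.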
The localization won't matter under Win2k or XP, since all the internal representations are in Unicode (UTF-16LE). Strictly speaking, UTF-8 is not Unicode itself but an encoding that converts readily to and from the other Unicode encodings. The disparity between newlisp-utf8.exe and its UNIX counterpart could come from the fact that under UNIX Unicode is big-endian rather than little-endian, while Windows requires little-endian code and will otherwise mess up the subsequent conversion to UTF-8.
Jean-Pierre
Quote from: "HPW"
Running Turtle.lsp with the UTF-8 EXE gives an error ' Bad screen distance "302,1612092" '.
Running an equivalent program, the UTF-8 EXE was able to carry out all its calculations and display Japanese without any problems.
Jean-Pierre
========= Kame.lsp
;; Kame.lsp - graphics
;; written by Jean-Pierre Berard
;;
;; 1 rad = 180/3.1415927 = 57.29578 deg
;; 1 deg = 0.017453292 rad
(set! color "blue")
(set! width 500)
(set! height 500)
(define (convert angle) (mul angle 0.017453292))
(define (adjacent-cos angle hypo) (mul hypo (cos (convert angle))))
(define (adjacent-tan angle opposite) (div opposite (tan (convert angle))))
(define (hypo-sin angle opposite) (div opposite (sin (convert angle))))
(define (hypo-cos angle adjacent) (div adjacent (cos (convert angle))))
(define (opposite-sin angle hypo) (mul hypo (sin (convert angle))))
(define (opposite-tan angle adjacent) (mul adjacent (tan (convert angle))))
(define (outer inner-angle) (sub 180 inner-angle))
(define (rectangular angle radius)
(set! x (adjacent-cos angle radius))
(set! y (opposite-sin angle radius))
(println "x=" x " y=" y)
true
)
(define (polar x y)
(set! angle (div (atan (div y x)) 0.017453292))
(set! radius (root (add (pow x 2) (pow y 2)) 2))
(println "angle=" angle " radius=" radius)
true
)
(define (triangulation side side-size)
(set! y (div side-size 2))
(set! angle (div 360 side 2))
(set! x (adjacent-tan angle y))
(set! radius (hypo-sin angle y))
(println "angle=" angle " radius=" radius)
(println "x=" x " y=" y)
(pen 'yellow)
(forward y)
(right 90)
(forward x)
(right (sub 180 angle))
(forward radius)
true
)
(define (pseudo-polygon side n)
(set! ratio (div 360 side))
(dotimes (x side)
(forward n)
(right ratio))
(left ratio)
)
(define (polygon side n)
(dotimes (x side)
(forward n)
(right (div 360 side))
))
(define (oval x y)
(set! Y (sub lastY (div y 2)))
(tk ".kw.canvas create oval "
(join (map string (list lastX Y (add lastX x) (add Y y))) " ")
" -outline " color)
(round (div direction 0.017453292))
)
(define (circle n)
(set! X 0)
(set! x (round lastX))
(set! y (round lastY))
(set 'direction -1.570796327)
(set! ratio (mul (div 57.29578 n) 2))
(until (and (= x X) (= y (round lastY)))
(set! X (round lastX))
(forward 1)
(right ratio))
)
(define (cercle n)
(set! x (round lastX))
(set! y (round lastY))
(set! lastX (+ x n))
(for (t 0 2 0.005) ;; from 0 to 2 rad
(set! newX (mul n (cos (mul pi t))))
(set! newY (mul n (sin (mul pi t))))
(set! newX (add newX x))
(set! newY (add newY y))
(tk ".kw.canvas create line "
(join (map string (list lastX lastY newX newY)) " ")
" -fill " color)
(set 'lastX newX)
(set 'lastY newY))
(set 'lastX x)
(set 'lastY y)
(round (div direction 0.017453292))
)
(define (rose clr)
(set 'color clr)
(dotimes (x 90)
(pseudo-polygon 4 60)
(right 2))
)
(define (square n)
(dotimes (x 4)
(forward n)
(right 90))
)
(define (squirl n)
(dotimes (x (/ n 3))
(forward n)
(right 90)
(set! n (- n 2)))
(round (div direction 0.017453292))
)
(define (dragon sign level)
(if (= 0 level)
(forward 4)
(begin
(dec 'level)
(right (sign 45))
(dragon - level)
(left (sign 90))
(dragon + level)
(right (sign 45))
)))
(define (dragon-curve n clr)
(set 'color clr)
(dragon + n)
)
(define (right d)
(set 'direction (add direction (mul d 0.017453292)))
(round (div direction 0.017453292)))
(define (left d)
(set 'direction (sub direction (mul d 0.017453292)))
(round (div direction 0.017453292)))
(define (forward d)
(set 'newX (add lastX (mul (cos direction) d)))
(set 'newY (add lastY (mul (sin direction) d)))
(tk ".kw.canvas create line "
(join (map string (list lastX lastY newX newY)) " ")
" -fill " color)
(tk "update idletasks")
(set 'lastX newX)
(set 'lastY newY)
(round (div direction 0.017453292))
)
(define (backward d)
(set! direction (mul -1 direction))
(forward d)
)
(define (pen clr) (set! color (string clr)))
(define (clear) ;; upper left and lower right
(tk ".kw.canvas create rectangle 0 0 "
(join (map string (list width height)) " ")
" -fill black -tag clear")
(center)
)
(define (center)
(set 'lastX (/ width 2))
(set 'lastY (/ height 2))
(set 'direction -1.570796327))
(define (start x y)
(set 'lastX x)
(set 'lastY y)
(set 'direction -1.570796327))
(define (goto x y)
(set 'lastX x)
(set 'lastY y)
(round (div direction 0.017453292))
)
(begin
(set! today (parse (date (apply date-value (now)))))
(println (car today) " " (cadr today) " " (caddr today))
(set! nihongo {\u4e80\u3000\u4f5c\u56f3})
(tk "if {[winfo exists .kw] == 1} {destroy .kw}")
(tk "toplevel .kw")
(tk "canvas .kw.canvas -width " width " -height " height " -bg black")
(tk "pack .kw.canvas")
(tk "wm geometry .kw +290+25")
(tk "wm title .kw { Kame.lsp}")
(tk "bind .kw exit")
(start 50 450)
(squirl 400)
(rose "red")
(tk ".kw.canvas create text 130 380 "
"-fill white -font {Times 22 normal} -text " nihongo)
)
(define (help)
(println "outer inner-angle")
(println "adjacent-cos angle hypo")
(println "adjacent-tan angle opposite")
(println "hypo-sin angle opposite")
(println "hypo-cos angle adjacent")
(println "opposite-sin angle hypo")
(println "opposite-tan angle adjacent")
(println "triangulation side side-size")
(println "rectangular angle radius")
(println "polar x y")
true
)
Sorry, the second statement after (begin
(println (car today) " " (cadr today) " " (caddr today))
has to be replaced with ...
(println (nth 0 today) " " (nth 1 today) " " (nth 3 today))
One also has to add the function ...
(define (round n) (floor (add n 0.5)))
to make it a standard newLISP script.
Jean-Pierre
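As a rough sketch, the helper definitions could be collected at the top of Kame.lsp as shown here. The round definition is the one given above. The set! stand-in is purely an assumption: the script calls set! throughout, and if your build reports it as undefined, a hypothetical macro like this one would make it behave like (set 'sym value). Both definitions are skipped when the name is already defined.

```newlisp
;; Sketch only: helper definitions for running Kame.lsp under plain newLISP.

;; round as given above; skipped if the build already has a built-in round
(if (not (primitive? round))
    (define (round n) (floor (add n 0.5))))

;; Assumption: set! is not built in and came from the author's own setup.
;; This hypothetical stand-in assigns like (set 'sym value). The parameter
;; names start with an underscore to avoid clashing with script variables.
(if (not set!)
    (define-macro (set! _sym _val) (set _sym (eval _val))))

;; quick check of both helpers
(set! width 500)
(println "width = " width " rounded 2.6 = " (round 2.6))
```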
Thanks Jean-Pierre and Hans-Peter for all the input about the UTF-8 version on Win32. It seems that things are working ok, except for the UTF-8 version of 'trim'.
I found the problem with 'trim' and fixed it for the next development version, 8.0.9, which will probably be out tomorrow. I still want to retest all the other UTF-8 enabled functions on Windows. This is a bit tedious, because I have to write all strings to a file before and after manipulation and then view them with notepad.exe, to see whether the functions work correctly on character rather than byte boundaries and do not change things they shouldn't.
Lutz
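For anyone who wants to help with this kind of retesting, a small helper along these lines could write the before and after strings to a file for inspection in notepad.exe. This is only a sketch: log-case and "utf8-check.txt" are made-up names, and the script file is assumed to be saved as UTF-8.

```newlisp
;; Sketch only. log-case and "utf8-check.txt" are hypothetical names.
;; Appends one labelled before/after pair to the check file; the brackets
;; make leading and trailing spaces visible in the editor.
(define (log-case label before after)
  (append-file "utf8-check.txt"
    (string label "\r\n  before: [" before "]\r\n  after:  [" after "]\r\n")))

(log-case "trim" " Höhe;; " (trim " Höhe;; "))
(log-case "upper-case" "höhe" (upper-case "höhe"))
```

Each call adds one block to the file, so a whole series of functions can be checked in a single pass through notepad.exe.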
Lutz,
Well done, everything seems to be fixed except for the best feature of all: the ability of newLISP to communicate directly with the clipboard. newlisp-utf8.exe seems to disable the clipboard completely on both Unicode-based and non-Unicode-based Windows (Win98/ME).
Also, newlisp-utf8.dll and, alas, newlisp.dll as well cannot run processes with the exec function; only shelling out with the ! function works.
Jean-Pierre