println decimal string

Started by Thorstein, May 26, 2015, 04:46:53 PM

Thorstein

Running on Windows,



(exec)'ing Google Translate returns the following str inside a JSON container:



(a) "Nous allons habiller pour la randonnée, selon la météo."



However, somewhere in the process of a (string str "") or (replace x str y) the str begins to (println) as this:



(b) "Nous allons habiller pour la randonn\195\169e, selon la m\195\169t\195\169o. \n\n</td></tr>"



How can I convert (b) to (a) so I can (println) (a) to a static HTML file?



Do I have to make a unicode build?

Lutz

#1
The "\195\169" portion of the string is just the way newLISP encodes a string when output is directed to a device or terminal not able to display UTF8 characters. The byte sequence 195 169 is the UTF8 encoding of the character é (Unicode 233).
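As a minimal sketch of the point above, the following checks that "\195\169" really is two bytes and that those two bytes decode to code point 233. The symbol names e-acute and code-point are made up for illustration, and the arithmetic assumes the standard two-byte UTF8 layout (110xxxxx 10xxxxxx).

```newlisp
; "\195\169" uses newLISP's decimal escapes: each \ddd group is one byte
(set 'e-acute "\195\169")

; length counts bytes, so the result does not depend on a UTF8 build
(println (length e-acute))        ; 2

; unpack with "bb" reads two unsigned bytes from the string
(println (unpack "bb" e-acute))   ; (195 169)

; two-byte UTF8 layout: 5 payload bits in the first byte, 6 in the second
(set 'code-point (+ (* (& 195 0x1F) 64) (& 169 0x3F)))
(println code-point)              ; 233
```

The string itself is already correct UTF8; only the way a non-UTF8 terminal shows it differs.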



If you did a (println "\195\169") in a UTF8-capable terminal - e.g. on OSX - you would see é.



The following code would create a usable page.html which would correctly show the accented é in a web browser:


(write-file "page.html" (string
   {<htmL><head> <META http-equiv="Content-Type" content="text/html; charset=utf-8" /> </head><body>}
   "\nNous allons habiller pour la randonn\195\169e, selon la m\195\169t\195\169o.\n"
   "</body></html>"
))


This is the page generated by the above program. The two \n characters are translated into linefeeds, which makes the HTML code easier to read.



<htmL><head> <META http-equiv="Content-Type" content="text/html; charset=utf-8" /> </head><body>
Nous allons habiller pour la randonnée, selon la météo.
</body></html>


Note that the first string argument is delimited with curly braces {,}. Doing it this way lets me include unescaped quotes " in the string. Normally, when using "..." to delimit a string in newLISP, you would have to escape special characters with a backslash, like this: \". For strings longer than 2048 characters you can also use [text]...[/text] tags as delimiters. All this is explained in the manual.
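A small sketch of the three delimiter styles, for comparison. The HTML snippet and the symbol names quoted, braced and tagged are invented for this example and do not come from the code above.

```newlisp
; Three ways to delimit the same string (the HTML snippet is made up)
(set 'quoted "<a href=\"page.html\">link</a>")            ; quotes: \" must be escaped
(set 'braced {<a href="page.html">link</a>})              ; braces: no escaping needed
(set 'tagged [text]<a href="page.html">link</a>[/text])   ; text tags: no escaping either

(println (= quoted braced tagged))   ; true

; Escape sequences are only processed inside "...":
(println (length "\n"))   ; 1, a single linefeed byte
(println (length {\n}))   ; 2, a backslash followed by the letter n
```

This is also why the accented sentence in the page.html example sits in a "..." string: inside braces, \195\169 would be written out as literal backslashes and digits.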



So the sequence \195\169 has nothing to do with println or string. It is just the special way to encode UTF8 characters in newLISP.



The above code to write page.html also works with non-UTF8 versions of newLISP, but if you do a lot of web work in non-English languages, I recommend using the UTF8 version of newLISP. That way, many of the string-manipulating functions are UTF8-aware.

Thorstein

#2
Thanks, Lutz!  I found the latest UTF8 build.  That is doing the trick.  (That and RTFM! :-/ ).



And many thanks for this great Lisp!  (And for the great documentation.)