newLISP Fan Club

Forum => newLISP in the real world => Topic started by: Kirill on October 19, 2011, 03:51:27 PM

Title: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 03:51:27 PM
I'm using cgi.lsp to read the following POST-ed parameters:



text=тест
save-note=Save


Here's what cgi.lsp gives me (web.lsp behaves accordingly):



(("save-note" "Save") ("text" "\209\130\208\181\209\129\209\130"))
Ñ,есÑ,


The output above is produced by this code:



(println "Content-type: text/plain\n\n")
(println CGI:params)
(println (CGI:get "text"))
(exit)


newLISP version is this:



newLISP v.10.3.4 64-bit on Linux IPv4/6 UTF-8, execute 'newlisp -h' for more info.


Maybe someone here could give me some pointers on how to get the data as UTF-8 strings?



Regards,

Kirill
Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 10:36:57 PM
It seems neither cgi.lsp nor web.lsp can handle multibyte characters when processing URL-encoded data. URL decoding (and encoding) needs a small overhaul.
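
To illustrate the kind of overhaul meant here, below is a minimal sketch of byte-wise URL decoding, where every %XX escape becomes exactly one raw byte so that multibyte UTF-8 sequences are reassembled unchanged. The name url-decode-bytes is made up for this example; this is not the code from cgi.lsp, web.lsp or Dragonfly, only an assumption about what a UTF-8-safe decoder could look like.

```newlisp
; Hypothetical helper for illustration only; not part of cgi.lsp or web.lsp.
(define (url-decode-bytes str)
  ; In form-encoded data "+" stands for a space. Do this first so that
  ; an escaped plus sign (%2B) still decodes to a literal "+".
  (replace "+" str " ")
  ; Turn each %XX escape into exactly one raw byte with pack.
  ; Regex option 1 makes the match case-insensitive.
  (replace "%([0-9a-f][0-9a-f])" str (pack "b" (int $1 0 16)) 1))

(println (url-decode-bytes "text=%D1%82%D0%B5%D1%81%D1%82"))
```

The eight escapes in the sample input decode to the bytes 209 130 208 181 209 129 209 130, which is the UTF-8 encoding of тест.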
Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 10:44:02 PM
But Dragonfly seems to provide utf8-urlencode and utf8-urldecode. Great!
Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 11:20:46 PM
Confirming that replacing Web:url-encode with Dragonfly's utf8-urlencode solves the issue. It's just cut and paste for now, so there is no clean patch to submit.



For cgi.lsp the solution would be similar.



Br,

Kirill