Title: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 03:51:27 PM

I'm using cgi.lsp to read the following POST-ed parameters:

Code:

text=тест
save-note=Save

Here's what cgi.lsp gives me (web.lsp behaves the same way):

Code:

(("save-note" "Save") ("text" "\209\130\208\181\209\129\209\130"))
Ñ,ÐµÑÑ,

The output above is produced by this code:

Code:

(println "Content-type: text/plain\n\n")
(println CGI:params)
(println (CGI:get "text"))
(exit)

newLISP version is this:

Code:

newLISP v.10.3.4 64-bit on Linux IPv4/6 UTF-8, execute 'newlisp -h' for more info.

Maybe someone here could give me some pointers on how to get the data as UTF-8 strings?

Regards,

Kirill

Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 10:36:57 PM

It seems that neither cgi.lsp nor web.lsp can deal with multibyte characters when processing URL-encoded data. URL decoding (and encoding) needs a small overhaul.
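
To illustrate the kind of overhaul meant here, below is a minimal sketch of byte-wise URL decoding. It assumes the trouble comes from turning each %XX escape into a character instead of a single raw byte. The name url-decode-bytes is hypothetical and is not taken from cgi.lsp, web.lsp or Dragonfly; pack with the "b" format is used because it emits exactly one byte per value.

```newlisp
; Hypothetical helper, not part of cgi.lsp, web.lsp or Dragonfly.
; Decodes URL-encoded form data one byte at a time, so that a
; multibyte UTF-8 sequence such as %D1%82 ends up as two raw bytes.
(define (url-decode-bytes str)
  ; in form data, "+" stands for a space
  (replace "+" str " ")
  ; turn each %XX escape into a single raw byte
  ; (regex option 1 makes the match case-insensitive)
  (replace "%([0-9a-f][0-9a-f])" str (pack "b" (int $1 0 16)) 1))

; "%D1%82%D0%B5%D1%81%D1%82" is the URL-encoded UTF-8 form of "тест"
(println (url-decode-bytes "%D1%82%D0%B5%D1%81%D1%82"))
```

The decoded string should hold the eight UTF-8 bytes of the original word, so it should display correctly wherever the output is interpreted as UTF-8.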

Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 10:44:02 PM

But Dragonfly seems to provide utf8-urlencode and utf8-urldecode. Great!

Title: Re: Character mess when reading CGI-params on UTF-8 newLISP
Post by: Kirill on October 19, 2011, 11:20:46 PM

Confirming that replacing Web:url-encode with Dragonfly's utf8-urlencode solves the issue. It's just cut and paste for now, so there is no clean patch to submit.

For cgi.lsp, the solution would be similar.

Br,

Kirill
