newLISP Fan Club

Forum => newLISP in the real world => Topic started by: ale870 on April 06, 2010, 06:01:14 AM

Title: URL Encode / decode
Post by: ale870 on April 06, 2010, 06:01:14 AM
Hello,

I need to apply URL encoding to some text (see: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm).

I know I can use several replace calls to do the job, but is there a function that makes it faster and needs less code (maybe using map or something similar)?



Thank you!
Title: Re: URL Encode / decode
Post by: ale870 on April 06, 2010, 06:51:01 AM
Ok, after some investigation, I found this:


(replace " |\r|\n" "This is my name\nThis is my address" (case $0 (" " "%20") ("\r" "%0D") ("\n" "%0A") ) 1)


It works. But is there a better way?
Title: Re: URL Encode / decode
Post by: ale870 on April 06, 2010, 07:01:37 AM
New version!

In this case you don't need to put replaced symbols in two different places:


(replace " |\r|\n" "This is\nan example\r" (string "%" (format "%02X" (char $0))) 1)
Title: Re: URL Encode / decode
Post by: Lutz on April 06, 2010, 07:44:37 AM
Look for the title "URL encode and decode" on this page:



http://www.newlisp.org/index.cgi?page=Code_Snippets



You also have to encode special characters and spaces.



The "Code Snippets" page is also accessible from here:



http://www.newlisp.org/index.cgi?Tips_and_Tricks
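
As a rough sketch of what encoding "special characters and spaces" in one pass could look like: this is not the Code Snippets version. The name url-encode-sketch and the set of characters left untouched (A-Z a-z 0-9 . _ ~ -) are assumptions made for illustration.

```newlisp
; Hypothetical helper, not taken from the Code Snippets page.
; Every byte outside the assumed "safe" set becomes %XX (two hex digits).
; The regex runs without the UTF-8 option, so it matches one byte at a time,
; and unpack "b" reads that byte as an unsigned number.
(define (url-encode-sketch str)
  (replace {[^A-Za-z0-9._~-]} str
           (format "%%%02X" (first (unpack "b" $0))) 0))

(println (url-encode-sketch "This is my name\nThis is my address"))
; prints This%20is%20my%20name%0AThis%20is%20my%20address
```

Because the match is byte-wise, multi-byte UTF-8 characters should come out as one %XX group per byte.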
Title: Re: URL Encode / decode
Post by: cormullion on April 06, 2010, 10:11:58 AM
http://static.artfulcode.net/newlisp/web.lsp.html ?
Title: Re: URL Encode / decode
Post by: Sammo on April 06, 2010, 11:20:04 AM
In the Code Snippets example:
(define (url-decode str)
  (replace "+" str " ") ; optional
  (replace "%([0-9A-F][0-9A-F])" s (char (int $1 0 16)) 1))

should be
(define (url-decode str)
  (replace "+" str " ") ; optional
  (replace "%([0-9A-F][0-9A-F])" str (char (int $1 0 16)) 1))

in which 's' in the second replace is changed to 'str'.



-- Sam
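
A quick way to check the corrected definition, as a minimal sketch (the sample input string is made up for illustration):

```newlisp
; Corrected url-decode from the post above, repeated so the snippet runs on its own.
(define (url-decode str)
  (replace "+" str " ") ; optional
  (replace "%([0-9A-F][0-9A-F])" str (char (int $1 0 16)) 1))

(println (url-decode "This+is%20my%20name%3F"))
; prints This is my name?
```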
Title: Re: URL Encode / decode
Post by: itistoday on April 06, 2010, 11:52:58 AM
I spent quite a while on the UTF-8 URL encoding and decoding code in Dragonfly, because all of the newLISP code I had seen previously was "doing it wrong." If I recall correctly, it didn't properly convert characters that take 3 or more bytes (or was it 4?) to represent in UTF-8, and because of that it failed to properly encode and decode many (if not all) Asian characters.



I think I came up with the fastest solution possible without resorting to native code, and unlike the examples above, it should handle the entire Unicode range of characters. If you can think of any improvements let me know and I'll incorporate them into Dragonfly!



Taken from dragonfly-framework/lib/request.lsp:


(constant 'REGEX_HTTP_SPECIAL_STR (regex-comp {([^.0-9a-z]+)} 1))
(constant 'REGEX_HEX_ENCODED_CHAR (regex-comp {%([0-9A-F][0-9A-F])} 1))

(define (hex-encode-str str , cnvrt)
	(setf cnvrt (dup "%%%X" (length str)))
	(eval (append '(format cnvrt) (unpack (dup "b" (length str)) str)))
)

;; @syntax (utf8-urlencode <str> [<bool-everything>])
;; @param str the string to encode
;; @param bool-everything whether to escape the entire string or just most of the "non-ascii friendly" parts.
;; <p>Use this function to safely encode data that might have foreign characters in it, or simply
;; characters that should be placed into URLs:</p>
;; <b>example:</b>
;; <pre> (utf8-urlencode "What time is it?")  => "What%20time%20is%20it%3F"</pre>
(define (utf8-urlencode str everything)
	(if everything
		(hex-encode-str str)
		(replace REGEX_HTTP_SPECIAL_STR str (hex-encode-str $1) 0x10000)
	)
)

;; @syntax (utf8-urldecode <str>)
;; <p>Decodes a utf8-urlencoded string. Converts '+'&apos;s to spaces.</p>
(define (utf8-urldecode str)
	(replace "+" str " ")
	(replace REGEX_HEX_ENCODED_CHAR str (pack "b" (int $1 nil 16)) 0x10000)
)
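
To see the byte-wise behaviour in action, here is a small round-trip sketch. The definitions are repeated from the post so the snippet runs on its own. The sample string (the two UTF-8 bytes of "é", written with decimal escapes so the result does not depend on the file encoding) is an assumption chosen for illustration.

```newlisp
(constant 'REGEX_HTTP_SPECIAL_STR (regex-comp {([^.0-9a-z]+)} 1))
(constant 'REGEX_HEX_ENCODED_CHAR (regex-comp {%([0-9A-F][0-9A-F])} 1))

(define (hex-encode-str str , cnvrt)
  (setf cnvrt (dup "%%%X" (length str)))
  (eval (append '(format cnvrt) (unpack (dup "b" (length str)) str))))

(define (utf8-urlencode str everything)
  (if everything
    (hex-encode-str str)
    (replace REGEX_HTTP_SPECIAL_STR str (hex-encode-str $1) 0x10000)))

(define (utf8-urldecode str)
  (replace "+" str " ")
  (replace REGEX_HEX_ENCODED_CHAR str (pack "b" (int $1 nil 16)) 0x10000))

; "\195\169" is the UTF-8 byte sequence C3 A9
(set 'encoded (utf8-urlencode "caf\195\169 au lait"))
(println encoded)                                            ; prints caf%C3%A9%20au%20lait
(println (= (utf8-urldecode encoded) "caf\195\169 au lait")) ; prints true
```

Each byte is encoded and decoded individually with unpack and pack, so the code never has to know how many bytes a character occupies.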
Title: Re: URL Encode / decode
Post by: Lutz on April 07, 2010, 02:29:50 AM
'format' can take the values to format in a list, so you can simplify the first function:


(define (hex-encode-str str , cnvrt)
   (setf cnvrt   (dup "%%%X" (length str)))
   (format cnvrt (unpack (dup "b" (length str)) str))
)

> (hex-encode-str "newLISP")
"%6E%65%77%4C%49%53%50"
>
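
One detail that may be worth checking: "%X" does not pad, so a byte below 0x10 (a newline, for example) comes out as a single hex digit, which a decoder expecting two digits will not match. A possible adjustment, sketched under a hypothetical name that is not part of Dragonfly:

```newlisp
; Hypothetical variant (the name is made up here): pad every byte to two hex digits.
(define (hex-encode-str-padded str)
  (format (dup "%%%02X" (length str))
          (unpack (dup "b" (length str)) str)))

(println (hex-encode-str-padded "a\nb"))   ; prints %61%0A%62
```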
Title: Re: URL Encode / decode
Post by: itistoday on April 07, 2010, 09:44:56 AM
Quote from: "Lutz"
'format' can take the values to format in a list, so you can simplify the first function:

Thanks Lutz, didn't know that. I've committed that change to the Dragonfly repo.