newLISP Fan Club

Forum => newLISP in the real world => Topic started by: pjot on March 01, 2004, 01:54:38 PM

Title: Base64 encoder
Post by: pjot on March 01, 2004, 01:54:38 PM
Hi all,



After all contributions of newdep I couldn't stay behind... ;-)   Below my base64 encoder written in newLISP. Have phun with it.



Cheers

Peter.

--------------------------------------------------------------------------



#!/usr/bin/newlisp

;;

;; Base64 converter using newLISP. Tested on Slackware 9.1 with newLISP 7.5.4.

;;

;; Proxy-servers require a base64 encoded "username:password" to pass through.

;;

;; With this encoder you can hack your way out :-)

;;

;;----------------------------------------------------------------------------

 

;; Setup base64 encode string

(set 'BASE64 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")



;; Get input from user

(print "Enter a string to convert: ")

(set 'dat (read-line))



;; Initialize result variable to string

(set 'enc "")



;; Mainloop

(while (> (length dat) 0) (begin

   

   ;; Find ASCII values

   (if (= (length dat) 1)

      (begin

         (set 'byte1 (char dat))

         (set 'byte2 0)

         (set 'byte3 0)))

   (if (= (length dat) 2)

      (begin

         (set 'byte1 (char dat))

         (set 'byte2 (char dat 1))

         (set 'byte3 0)))

   (if (> (length dat) 2)

      (begin

         (set 'byte1 (char dat))

         (set 'byte2 (char dat 1))

         (set 'byte3 (char dat 2))))



   ;; Now create BASE64 values

   (set 'base1 (/ byte1 4))

   (set 'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16)))

   (set 'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64)))

   (set 'base4 (& byte3 63))



   ;; Find BASE64 characters

   (if (= (length dat) 1)

      (begin

         (set 'enc (append enc (nth base1 BASE64)))

         (set 'enc (append enc (nth base2 BASE64)))

         (set 'enc (append enc "=="))

         ;; Put 'dat' to empty list

         (set 'dat "")))

   (if (= (length dat) 2)

      (begin

         (set 'enc (append enc (nth base1 BASE64)))

         (set 'enc (append enc (nth base2 BASE64)))

         (set 'enc (append enc (nth base3 BASE64)))

         (set 'enc (append enc "="))

         ;; Put 'dat' to empty list

         (set 'dat "")))

   (if (> (length dat) 2)

      (begin

         (set 'enc (append enc (nth base1 BASE64)))

         (set 'enc (append enc (nth base2 BASE64)))

         (set 'enc (append enc (nth base3 BASE64)))

         (set 'enc (append enc (nth base4 BASE64)))

         ;; Decrease 'dat' with 3 characters

         (set 'dat (slice dat 3))))

))



;; Print resulting string

(println enc)



;; Exit

(exit)
Title:
Post by: newdep on March 01, 2004, 01:56:52 PM
Hello Pjot,



Great invention ;-) Nice work... I'll plug it in as a module...



Norman.
Title:
Post by: Lutz on March 01, 2004, 02:20:00 PM
Thanks Pjot,



a small improvement:



>>>> instead of <<<<

(set 'enc (append enc (nth base1 BASE64)))

(set 'enc (append enc (nth base2 BASE64)))

(set 'enc (append enc "=="))





>>>> you can do this <<<<

(set 'enc (append enc (nth base1 BASE64) (nth base2 BASE64) "=="))



In newLISP you can specify more than one argument in all places where it makes sense. That makes the code shorter and faster.
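
A minimal sketch of the tip, assuming a short `alphabet` variable and two made-up index values (19 and 16, which together encode the single byte "M"). Both forms should build the same string; note that the accumulated string has to stay the first argument of `append`.

```newlisp
;; Sketch only: 'alphabet', 'enc1' and 'enc2' are names made up for this example.
(set 'alphabet "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Step by step, one append per piece
(set 'enc1 "")
(set 'enc1 (append enc1 (nth 19 alphabet)))
(set 'enc1 (append enc1 (nth 16 alphabet)))
(set 'enc1 (append enc1 "=="))

;; One append with several arguments; the old value comes first
(set 'enc2 "")
(set 'enc2 (append enc2 (nth 19 alphabet) (nth 16 alphabet) "=="))

(println (= enc1 enc2))
```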



Lutz
Title:
Post by: pjot on March 01, 2004, 02:31:51 PM
Hi Lutz,



Thanks for the tip! I did not know that, I will change it here in my code.  (I am sorry for the layout of the program, but the tabs do not seem to appear....)



A decoder will follow.



Regards



Peter.
Title:
Post by: Lutz on March 01, 2004, 02:35:23 PM
thanks Peter, our pops3.lsp users will be delighted, I will put your code in a future base64.lsp module in the newLISP distribution, if this is Ok with you.



Lutz
Title:
Post by: pjot on March 02, 2004, 12:44:04 PM
Hi Lutz,



Of course, it's ok.



Peter.
Title:
Post by: pjot on March 02, 2004, 01:50:17 PM
As promised, the Base64 decoder below. I've applied your tip to this code but it does not improve readability...



Regards



Peter



-----------------------------------------------------------



#!/usr/bin/newlisp

;;

;; Base64 decoder

;;

;; Due to the nature of newLISP, this is the smallest

;; BASE64 decoder I've ever written.

;;

;;-------------------------------------------------------



;; Setup base64 string

(set 'BASE64 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")



;; Get input from user

(print "Enter a string to convert: ")

(set 'dat (read-line))



;; Initialize result variable to string

(set 'res "")



;; Mainloop

(while (> (length dat) 0) (begin



   ;; Find the indexnumber in the BASE64 definition

   (set 'byte1 (find (nth 0 dat) BASE64))

   (if (= byte1 nil)(set 'byte1 0))

   (set 'byte2 (find (nth 1 dat) BASE64))

   (if (= byte2 nil)(set 'byte2 0))

   (set 'byte3 (find (nth 2 dat) BASE64))

   (if (= byte3 nil)(set 'byte3 0))

   (set 'byte4 (find (nth 3 dat) BASE64))

   (if (= byte4 nil)(set 'byte4 0))



   ;; Recalculate to ASCII value

   (set 'res (append res (char (+ (* (& byte1 63) 4) (/ (& byte2 48) 16))) (char (+ (* (& byte2 15) 16) (/ (& byte3 60) 4))) (char (+ (* (& byte3 3) 64) byte4))))



   ;; Decrease string with 4

   (set 'dat (slice dat 4))))



;; Print resulting string

(println res)



;; Exit

(exit)
Title:
Post by: newdep on March 02, 2004, 02:42:42 PM
Hello pjot,



You're going too fast :)



Regards, Norman.
Title:
Post by: Lutz on March 02, 2004, 04:18:43 PM
thanks Peter, getting shorter and faster, I wonder if anybody can top this ;-) ?



Lutz
Title:
Post by: HPW on March 02, 2004, 11:19:51 PM
Testing on WIN with this sample code:

(PS: BASE64 is a protected symbol!)



;;
;; Base64 converter using newLISP. Tested on Slackware 9.1 with newLISP 7.5.4.
;;
;; Proxy-servers require a base64 encoded "username:password" to pass through.
;;
;; With this encoder you can hack your way out :-)
;;
;;----------------------------------------------------------------------------

;; Setup base64 encode string

(context 'BASE64)



(define (encode dat)

(set 'base64charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Get input from user
;(print "Enter a string to convert: ")
;(set 'dat (read-line))

;; Initialize result variable to string
(set 'enc "")

;; Mainloop
(while (> (length dat) 0)
(begin
;; Find ASCII values
(if (= (length dat) 1)
(begin
(set 'byte1 (char dat))
(set 'byte2 0)
(set 'byte3 0)))
(if (= (length dat) 2)
(begin
(set 'byte1 (char dat))
(set 'byte2 (char dat 1))
(set 'byte3 0)))
(if (> (length dat) 2)
(begin
(set 'byte1 (char dat))
(set 'byte2 (char dat 1))
(set 'byte3 (char dat 2))))

;; Now create BASE64 values
(set 'base1 (/ byte1 4))
(set 'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16)))
(set 'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64)))
(set 'base4 (& byte3 63))

;; Find BASE64 characters
(if (= (length dat) 1)
(begin
(set 'enc (append enc(nth base1 base64charset)(nth base2 base64charset)"=="))

;; Put 'dat' to empty list
(set 'dat "")))
(if (= (length dat) 2)
(begin
(set 'enc (append enc (nth base1 base64charset)(nth base2 base64charset)(nth base3 base64charset)"="))

;; Put 'dat' to empty list
(set 'dat "")))
(if (> (length dat) 2)
(begin
(set 'enc (append enc (nth base1 base64charset)(nth base2 base64charset)(nth base3 base64charset)(nth base4 base64charset)))
;; Decrease 'dat' with 3 characters
(set 'dat (slice dat 3))))
))

;; Return resulting string
enc)








;;
;; Base64 decoder
;;
;; Due to the nature of newLISP, this is the smallest
;; BASE64 decoder I've ever written.
;;
;;-------------------------------------------------------

(define (decode dat)

;; Setup base64 string
(set 'base64charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Get input from user
;(print "Enter a string to convert: ")
;(set 'dat (read-line))

;; Initialize result variable to string
(set 'res "")

;; Mainloop
(while (> (length dat) 0)
(begin
;; Find the indexnumber in the base64charset definition
(set 'byte1 (find (nth 0 dat) base64charset))
(if (= byte1 nil)(set 'byte1 0))
(set 'byte2 (find (nth 1 dat) base64charset))
(if (= byte2 nil)(set 'byte2 0))
(set 'byte3 (find (nth 2 dat) base64charset))
(if (= byte3 nil)(set 'byte3 0))
(set 'byte4 (find (nth 3 dat) base64charset))
(if (= byte4 nil)(set 'byte4 0))

;; Recalculate to ASCII value
(set 'res (append res
(char (+ (* (& byte1 63) 4) (/ (& byte2 48) 16)))
(char (+ (* (& byte2 15) 16) (/ (& byte3 60) 4)))
(char (+ (* (& byte3 3) 64) byte4))))

;; Decrease string with 4
(set 'dat (slice dat 4))))

;; Return resulting string, trailing zero bytes stripped
(trim res "\000")
)

(context 'MAIN)




Bugs corrected!
Title:
Post by: newdep on March 03, 2004, 07:32:32 AM
Hello HPW,



"\000", odd actually... Looks like Windows displays the <EOF> or <EOL>,

which normally is a NULL byte, but I'm not sure about this...



Norman.
Title:
Post by: Lutz on March 03, 2004, 08:08:45 AM
newLISP will only display the null character if it was somehow specified, internally there is one more character behind it, i.e:



(set 'var "\000\000\000") => "\000\000\000"



(length var) => 3



But internally newLISP allocates one more byte, which is always \000. This way strings are terminated when using 'print', 'format' etc., e.g.



(set 'var "A\000\000") => "A\000\000"



but



(println var)

A

"A\000\000"



The first A is the printed output and the other thing is the return value of the print statement. Internally the var string is allocated with 4 bytes.



Most of newLISP's functions can work on binary content, although this is not documented specifically, because strings are always stored in a buffer with a zero appended and the buffer length is stored in a different field of the LISP cell.
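
A small sketch of that behaviour, assuming the `\nnn` decimal escape for writing a zero byte in a string literal. `length` reports the stored buffer length, while printing stops at the first zero byte as described above.

```newlisp
;; Sketch only: a string with two embedded zero bytes.
(set 'var "A\000\000")

(println (length var))   ; counts every byte in the buffer, zeros included
(println var)            ; printed output ends at the first zero byte
```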



Lutz
Title:
Post by: Lutz on March 03, 2004, 10:10:36 AM
Perhaps Peters BASE64 code is just fine, from:



http://www.securecode.net/Base64Convert+main.html



>>>At the end of the encoding process we might run into a problem. If the size of the original data in bytes is a multiple of three, everything works fine. If it is not, we might end up with one or two 8-bit bytes. For proper encoding, we need three bytes, however.



The solution is to append enough bytes with a value of '0' to create a 3-byte group. Two such values are appended if we have one extra byte of data, one is appended for two extra bytes.

>>>



So I think you can just strip off the trailing \000 by doing a: (trim str)



Lutz
Title:
Post by: pjot on March 03, 2004, 11:58:59 AM
Hi all,



Indeed it is the nature of Base64 which is bothering us, I think. For the ENCODING part, we are going from 3 bytes to 4 'bytes' (3x8 to 4x6). The decoding part is the reverse (4x6 to 3x8).
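
The 3x8 to 4x6 regrouping can be sketched as follows. This version uses bit shifts instead of the multiply and divide steps in the posted encoder, and the function name `sextets` is made up for the example.

```newlisp
;; Sketch only: pack three bytes into one 24-bit number,
;; then cut it into four 6-bit values (each 0..63).
(define (sextets str)
  (let (n (+ (<< (char str 0) 16) (<< (char str 1) 8) (char str 2)))
    (list (& (>> n 18) 63)
          (& (>> n 12) 63)
          (& (>> n 6) 63)
          (& n 63))))

(println (sextets "Man"))   ; => (19 22 5 46), i.e. "TWFu" in the base64 alphabet
```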



The padding sign '=' is used to fill up the empty places when the length of the original input string is not a multiple of 3. This '=' sign however is not part of the BASE64 string, therefore the decoder produces a '0' for it in order to perform a successful binary calculation backwards. This might result in a string with "\000" at the end.



Although it is not shown at my prompt here, decoding "UGV0ZXI=" (the encoded "Peter") delivers a string of length 6, which is "Peter\000", since the base64 string ends with the padding symbol '='.
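
As an alternative to trimming, the real payload length can be worked out from the padding. This is only a sketch: the helper names `pad-count` and `decoded-length` are made up, and the input is assumed to be well-formed base64 without line breaks.

```newlisp
;; Sketch only: every 4 encoded characters stand for 3 bytes,
;; minus one byte for each trailing '=' sign.
(define (pad-count b64)
  (cond ((ends-with b64 "==") 2)
        ((ends-with b64 "=") 1)
        (true 0)))

(define (decoded-length b64)
  (- (* 3 (/ (length b64) 4)) (pad-count b64)))

(println (decoded-length "UGV0ZXI="))   ; => 5, the length of "Peter"
```

Cutting the decoded string to this length would also keep zero bytes that really belong to binary data, which a plain trim would remove.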



Indeed the best workaround for all this is to TRIM your resulting string in the BASE64 decoder function towards a regular one. (How is this with other languages? Gnu AWK and Scriptbasic happen to do this automatically.)



So the last line in the decoder must be:



;; Return resulting string, trailing zero bytes stripped

(trim res "\000"))





Thanks for the tip.



Peter.
Title:
Post by: Lutz on March 03, 2004, 12:05:41 PM
GNU Awk and Script Basic only handle 'C' strings, which always end with a zero. These languages implicitly cut off anything after a zero; as a drawback they cannot handle binary data the way newLISP can.



Lutz
Title:
Post by: HPW on March 03, 2004, 12:09:37 PM
It seems that my test with my 10 KB string failed, because it does not work with '\r\n'.



This works:



> (setq a(BASE64:encode "Test\nTest"))

"VGVzdApUZXN0"

> (BASE64:decode a)

"Test\nTest"

> (setq a(BASE64:encode "Test\rTest"))

"VGVzdA1UZXN0"

> (BASE64:decode a)

"Test\rTest"



But:



> (setq a(BASE64:encode "Test\r\nTest"))

"dA=="
Title:
Post by: pjot on March 03, 2004, 12:15:11 PM
Hi Lutz,



Well, I mentioned the other languages because I did not experience this problem there... I am sorry if it appeared as a criticism, that really was not my intention... Indeed newLISP behaves more consistently, since it returns everything the program has asked for.



HPW: I'll look into that problem now, just a minute...



P.
Title:
Post by: pjot on March 03, 2004, 12:20:30 PM
Hi HPW,



In my Linux environment this converts without any problem. My self-defined function delivers this result for "Test\r\nTest":



VGVzdA0KVGVzdA==



I will literally cut and paste your program now.
Title:
Post by: Lutz on March 03, 2004, 12:20:40 PM
Don't worry, I didn't take it as criticism. I just take every opportunity to explain things about newLISP, knowing that many people read this board.



Lutz
Title:
Post by: pjot on March 03, 2004, 12:34:15 PM
Hi HPW,



I just found the problem. There is an error in your corrected encoding part. Look at this part in your BASE64:encode:



;; Find BASE64 characters

(if (= (length dat) 1)

(begin

(set 'enc (append(nth base1 base64charset)(nth base2 base64charset)"=="))



This should be:



;; Find BASE64 characters

(if (= (length dat) 1)

(begin

(set 'enc (append enc (nth base1 base64charset)(nth base2 base64charset)"=="))





Then it works.

Btw: nice to see how contexts are working, I did not experiment with them yet!



Regards



Peter.
Title:
Post by: HPW on March 03, 2004, 12:34:27 PM
Oops! Typo.



When applying Lutz's advice I changed the line to



(set 'enc (append (nth base1 base64charset)(nth base2 base64charset)"=="))



instead of



(set 'enc (append enc (nth base1 base64charset)(nth base2 base64charset)"=="))



Sorry for that.
Title:
Post by: HPW on March 03, 2004, 12:35:50 PM
Thanks, you were faster! :-)
Title:
Post by: pjot on March 03, 2004, 12:41:18 PM
;-) ok ok... no problem! Thanks for your input!
Title:
Post by: HPW on March 03, 2004, 12:46:08 PM
A little benchmark test with 50KB string:



> (time(setq a (BASE64:encode bigtxt)))

2391

> (time(setq b(get-string(hpwMimeEncodeString bigtxt))))

0

> (length a)

66696

> (length b)

68450

> (length bigtxt)

50022

>
Title:
Post by: pjot on March 03, 2004, 12:50:43 PM
hmmm... so your mime encode is faster, but also appears to produce a LONGER result? Shouldn't the result of the conversion be of the same length?
Title:
Post by: HPW on March 03, 2004, 12:55:37 PM
Yes, I am also wondering.



The speed difference is no surprise, because it is a compiled Delphi DLL with a MIME encoder from Project JEDI.

The size difference might be related to an option that generates a formatted BASE64 stream. I have to check.
Title:
Post by: pjot on March 03, 2004, 12:59:59 PM
OK I am curious, because this might lead to a bug - either in my code or in the Delphi DLL...



Regards



Peter.
Title:
Post by: HPW on March 03, 2004, 01:01:32 PM
After a look into the source,  my first thought was right.



There is an option for inserting line breaks into the

mime-stream. Maybe I should make an optional

encode command without this option.
Title:
Post by: pjot on March 03, 2004, 01:07:32 PM
Well, it might be nice to check if our BASE64 encoders lead to the same result. If so we must assume the encoders are bug-free... ;-)
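
One more cross-check along those lines, as a sketch: it assumes a later newLISP release that has the built-in `base64-enc` and `base64-dec` functions, and compares them against the encoded strings quoted earlier in this thread.

```newlisp
;; Sketch only: assumes a newLISP release with built-in base64-enc / base64-dec.
(println (= (base64-enc "Test\r\nTest") "VGVzdA0KVGVzdA=="))
(println (= (base64-enc "Peter") "UGV0ZXI="))
(println (= (base64-dec "UGV0ZXI=") "Peter"))
```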
Title:
Post by: Lutz on March 03, 2004, 01:08:48 PM
Changing (while (> (length dat) 0)  to (while (> (length dat) 2)

and putting all the (if ... length ....) stuff outside the while loop would make things a lot faster.



Also, in the spec at: http://www.securecode.net/Base64Convert+main.html



it says something about inserted linefeeds and max line length; the \r\n can just be filtered out, as they are not legal BASE64 characters.
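
Filtering before decoding could look like this sketch. It uses `replace` with a regular expression (option 0) to drop every character outside the base64 alphabet and the '=' padding sign; the variable name `wrapped` is made up, and `replace` changes the string in place.

```newlisp
;; Sketch only: an encoded string with line breaks inserted.
(set 'wrapped "VGVzdA0K\r\nVGVzdA==\r\n")

;; Remove everything that is not a base64 character or '='
(replace "[^A-Za-z0-9+/=]" wrapped "" 0)

(println wrapped)   ; => VGVzdA0KVGVzdA==
```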



Lutz
Title:
Post by: HPW on March 03, 2004, 01:15:27 PM
Here now with 1.02 of the DLL going online this evening:



> (import "hpwNLUtility.dll" "hpwMimeEncodeStringNoCRLF")

hpwMimeEncodeStringNoCRLF <ED373C>

> (import "hpwNLUtility.dll" "hpwMimeDecodeString")

hpwMimeDecodeString <ED37A0>

> (time (setq a(BASE64:encode bigtxt)))

2703

> (time(setq b(get-string(hpwMimeEncodeStringNoCRLF bigtxt))))

0

> (length a)

66696

> (length b)

66696
Title:
Post by: Lutz on March 03, 2004, 01:23:54 PM
Actually the line-feeds should stay in to protect mailservers with max line-length < 76. As long as BASE64:decode filters those out there is no problem.
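
Inserting the line breaks on the encoding side can be sketched with `explode` and `join`. The function name `wrap76` and the choice of "\r\n" as separator are assumptions for this example; 76 is the line length mentioned above.

```newlisp
;; Sketch only: cut the encoded string into pieces of at most
;; 76 characters and glue them together with CR/LF.
(define (wrap76 b64)
  (join (explode b64 76) "\r\n"))

(set 'long-line (dup "A" 80))
(println (wrap76 long-line))
(println (length (wrap76 long-line)))   ; 80 characters plus one "\r\n"
```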



Lutz
Title:
Post by: pjot on March 03, 2004, 01:24:50 PM
HPW: All right! That is good news, so it seems we end up with the same result. Thank you for testing!



So I can try to increase the conversion speed then...



Lutz: I am not sure what you mean by placing the (if... length...) stuff outside the loop since that would result in a non-working algorithm...? Can you show a small example of what you mean?



Peter.
Title:
Post by: Lutz on March 03, 2004, 01:54:57 PM
Let's say the data string is 100 chars long; then on every pass through the loop you check whether the length is > 2, = 2 or = 1.



If the while loop only runs as (while (> (length dat) 2) ...), then the small rest of one or two characters can be processed after the loop, and inside the loop you can always assume length > 2.



Lutz
Title:
Post by: pjot on March 03, 2004, 03:13:47 PM
OK I understand. Tomorrow I will investigate improvements on the Base64 encoder. Thanks.
Title:
Post by: pjot on March 04, 2004, 02:52:28 PM
The base64 routine is smaller and faster now. But I still think I can perform more optimizations. The base64 value calculation can be unified I suppose. In the meantime I have advanced to this code:



------------------------------------------------------

#!/usr/bin/newlisp

;;

;; Base64 converter using newLISP. Tested on Slackware 9.1 with newLISP 7.5.6.

;;

;; Proxy-servers require a base64 encoded <username:password> to pass through.

;;

;; With this encoder you can hack your way out :-)

;;

;; Improved after hints and tips from Lutz.

;;

;;----------------------------------------------------------------------------



;; Setup base64 encode string

(set 'BASE64 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")



;; Get input from user

(print "Enter a string to convert: ")

(set 'dat (read-line))



;; Initialize result variable to string

(set 'enc "")



;; Mainloop

(while (> (length dat) 2) (begin

   

   ;; Find ASCII values

   (set 'byte1 (char dat))

   (set 'byte2 (char dat 1))

   (set 'byte3 (char dat 2))



   ;; Now create BASE64 values

   (set 'base1 (/ byte1 4))

   (set 'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16)))

   (set 'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64)))

   (set 'base4 (& byte3 63))



   ;; Find BASE64 characters

   (set 'enc (append enc (nth base1 BASE64) (nth base2 BASE64) (nth base3 BASE64) (nth base4 BASE64) ))

   ;; Decrease 'dat' with 3 characters

   (set 'dat (slice dat 3))

))



;; From here determine last characters and/or padding

(set 'byte1 (char dat))

(set 'byte3 0)

(if (= (length dat) 2) (set 'byte2 (char dat 1)) (set 'byte2 0))



;; Create BASE64 values

(set 'base1 (/ byte1 4))

(set 'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16)))

(set 'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64)))

(set 'base4 (& byte3 63))



;; Find last BASE64 characters

(if (= (length dat) 1) (set 'enc (append enc (nth base1 BASE64) (nth base2 BASE64) "==")))

(if (= (length dat) 2) (set 'enc (append enc (nth base1 BASE64) (nth base2 BASE64) (nth base3 BASE64) "=")))



;; Print resulting string

(println enc)



;; Exit

(exit)
Title:
Post by: HPW on March 04, 2004, 11:25:48 PM
A little more optimized, by merging multiple set calls into one.





;;
;; Base64 converter using newLISP. Tested on Slackware 9.1 with newLISP 7.5.4.
;;
;; Proxy-servers require a base64 encoded "username:password" to pass through.
;;
;; With this encoder you can hack your way out :-)
;;
;;----------------------------------------------------------------------------

;; Setup base64 encode string

(context 'base64)


(define (encode dat)

(set 'base64charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Get input from user
;(print "Enter a string to convert: ")
;(set 'dat (read-line))

;; Initialize result variable to string
(set 'enc "")

;; Mainloop
(while (> (length dat) 2) (begin

;; Find ASCII values
(set 'byte1 (char dat)
'byte2 (char dat 1)
'byte3 (char dat 2)

;; Now create BASE64 values
'base1 (/ byte1 4)
'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16))
'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64))
'base4 (& byte3 63)

;; Find BASE64 characters
'enc (append enc (nth base1 base64charset) (nth base2 base64charset) (nth base3 base64charset) (nth base4 base64charset) )
;; Decrease 'dat' with 3 characters
'dat (slice dat 3)
)))

(if (> (length dat) 0)
(begin
;; From here determine last characters and/or padding
(set 'byte1 (char dat)
'byte3 0)
(if (= (length dat) 2) (set 'byte2 (char dat 1)) (set 'byte2 0))

;; Create BASE64 values
(set 'base1 (/ byte1 4)
'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16))
'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64))
'base4 (& byte3 63))

;; Find last BASE64 characters
(if (= (length dat) 1) (set 'enc (append enc (nth base1 base64charset) (nth base2 base64charset) "==")))
(if (= (length dat) 2) (set 'enc (append enc (nth base1 base64charset) (nth base2 base64charset) (nth base3 base64charset) "=")))
))


;; Return resulting string
enc)





;;
;; Base64 decoder
;;
;; Due to the nature of newLISP, this is the smallest
;; BASE64 decoder I've ever written.
;;
;;-------------------------------------------------------

(define (decode dat)

;; Setup base64 string
(set 'base64charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Get input from user
;(print "Enter a string to convert: ")
;(set 'dat (read-line))

;; Initialize result variable to string
(set 'res "")

;; Mainloop
(while (> (length dat) 0)
(begin
;; Find the indexnumber in the base64charset definition
(set 'byte1 (find (nth 0 dat) base64charset))
(if (= byte1 nil)(set 'byte1 0))
(set 'byte2 (find (nth 1 dat) base64charset))
(if (= byte2 nil)(set 'byte2 0))
(set 'byte3 (find (nth 2 dat) base64charset))
(if (= byte3 nil)(set 'byte3 0))
(set 'byte4 (find (nth 3 dat) base64charset))
(if (= byte4 nil)(set 'byte4 0))

;; Recalculate to ASCII value
(set 'res (append res
(char (+ (* (& byte1 63) 4) (/ (& byte2 48) 16)))
(char (+ (* (& byte2 15) 16) (/ (& byte3 60) 4)))
(char (+ (* (& byte3 3) 64) byte4))))

;; Decrease string with 4
(set 'dat (slice dat 4))))

;; Return resulting string, trailing zero bytes stripped
(trim res "\000")
)

(context 'MAIN)

Title:
Post by: Lutz on March 05, 2004, 12:00:45 PM
Another optimization: on bigger files repeated string appending gets very expensive, because the string keeps growing. The following change pushes all strings onto a list and then does a 'join'. On files smaller than a few Kbyte this is slightly slower, but on a 100 Kbyte file speed increases tenfold! Encoding newlisp.c takes 4 seconds instead of 40 seconds. On bigger files the difference will be even greater.



I have used this technique of string appending in many programs with sometimes dramatic performance differences.
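
The two approaches can be compared with a sketch like this one. The names `by-append` and `by-join`, the chunk "TWFu" and the count of 20000 are all made up for the example, and the timings will depend on machine and newLISP version.

```newlisp
;; Sketch only: build the same string twice, once by repeated append
;; and once by pushing onto a list and joining at the end.
(set 'chunk "TWFu")
(set 'n 20000)

(define (by-append)
  (let (res "")
    (dotimes (i n) (set 'res (append res chunk)))
    res))

(define (by-join)
  (let (parts '())
    (dotimes (i n) (push chunk parts))   ; push adds to the front
    (join (reverse parts))))

(println (= (by-append) (by-join)))
(println "append: " (time (by-append)) " ms")
(println "join:   " (time (by-join)) " ms")
```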





;;
;; Base64 converter using newLISP. Tested on Slackware 9.1 with newLISP 7.5.4.
;;
;; Proxy-servers require a base64 encoded "username:password" to pass through.
;;
;; With this encoder you can hack your way out :-)
;;
;;----------------------------------------------------------------------------

;; Setup base64 encode string

(context 'base64)


(define (encode dat)

(set 'base64charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

(define (to64)
  (set 'base1 (/ byte1 4)
       'base2 (+ (* (& byte1 3) 16) (/ (& byte2 240) 16))
       'base3 (+ (* (& byte2 15) 4) (/ (& byte3 192) 64))
       'base4 (& byte3 63)))

;; Initialize result variable to an empty list
(set 'enc '())

;; Mainloop
(while (> (length dat) 2) (begin

   ;; Find ASCII values
   (set 'byte1 (char dat)
        'byte2 (char dat 1)
        'byte3 (char dat 2))
   
   ;; Now create BASE64 values
   (to64)
 
   ;; Find BASE64 characters
   (push (append (nth base1 base64charset) (nth base2 base64charset)
                 (nth base3 base64charset) (nth base4 base64charset)) enc)

   ;; Decrease 'dat' with 3 characters
   (set 'dat (slice dat 3))
   ))

(if (> (length dat) 0)
(begin

;; From here determine last characters and/or padding
(set   'byte1 (char dat)
   'byte3 0)
(if (= (length dat) 2) (set 'byte2 (char dat 1)) (set 'byte2 0))

;; Create BASE64 values
(to64)

;; Find last BASE64 characters
(if (= (length dat) 1) (push (append (nth base1 base64charset) (nth base2 base64charset) "==") enc))
(if (= (length dat) 2) (push (append (nth base1 base64charset) (nth base2 base64charset)
                             (nth base3 base64charset) "=") enc))
))


;; Return resulting string
(trim (join (reverse enc))))


Lutz
Title:
Post by: Lutz on March 05, 2004, 12:22:31 PM
oops, should have moved the define of 'to64' out of the (define (encode ....) but it doesn't really matter.



Lutz
Title:
Post by: pjot on March 07, 2004, 02:33:55 AM
Phew, newLISP indeed is very powerful! I am learning all the time here. Thank you HPW and Lutz for your examples and explanations!



Peter.