Topics - Thorstein

#1
This behavior was a little unexpected.



newLISP v.10.7.5 64-bit on Linux IPv4/6 UTF-8 libffi, options: newlisp -h



> (read-expr "08")
0
> (read-expr "07")
7
> (read-expr "06")
6
> (read-expr "09")
0
> (eval-string "08")
8
> (eval-string "09")
9
> (read-expr "08")
0
> $count
1
> (read-expr "07")
7
> $count
2
>
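
A note on the transcript above: the results are consistent with the reader treating a leading 0 as an octal prefix, so that parsing stops at the digits 8 and 9. That reading is an assumption, not something the session itself proves. If the strings are meant to be decimal, one option is to pass an explicit base to int, as in this minimal sketch:

```newlisp
; Sketch: convert digit strings with leading zeros as base 10.
; The second argument to int is the default returned when the
; string cannot be converted; the third argument is the number base.
(int "08" 0 10)
(int "09" 0 10)
```

This only covers plain integer strings; it is not a replacement for (read-expr) on general expressions.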
#2
[See solution in thread below.]



I'm trying to implement several versions of the Lempel-Ziv-x and Snappy compression algorithms. Ordinarily, I like to get my logic straight in Lisp, and then, if I need the speed, I'll port the tight loops to a C library. In this case, however, newLISP has been unusually difficult to debug. I wonder if there are some simple code patterns I'm overlooking.



It would, of course, be simpler to use a non-UTF-8 build of newLISP, but I want to compress UTF-8 strings that I'm processing within newLISP.



So given a UTF-8 string us, I understand that (slice us i 1) will give me an 8-bit "char". I also found that defining


(define (byte s (i 0))
  (char s i true))

helped in some situations. But then I ran into problems trying to split a code like 32765 into two bytes. I thought I could get the low byte, 253, like this:

> (mod 32765 256)
253

;; but
> (byte (mod 32765 256)) 
ý

;; and
>(byte (byte (mod 32765 256)))
195


And while, as mentioned above, the following use of (char) looks ok

>(char (char (mod 32765 256)))
253

>(char (mod 32765 256))
"ý"

>(length "ý")
2

the UTF-8 char length messes with the byte discipline of the compression algorithms.



At last, I found that (pack) can work:

>(pack "b" (& 32765 0xff))
"�"

;; and
> (byte (pack "b" (& 32765 0xff)))
253

;; (and for the high byte):
>(byte (pack "b" (/ 32765 256)))
127


But, a little confusingly, there were still some gotchas. For example, (pack) gives the wrong byte when it is fed the result of (mod):

> (byte (pack "b"  (mod 32765 256)))
16

So, long story short, I've got these manipulations more or less working, but I wonder if there's a more direct way to manipulate such bytes and 8-bit chars?
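
One more direct route, offered as a sketch rather than as the solution from the thread: keep the values as integers using & and >>, and only build a byte string with (pack) at the end. The names code, lo, hi and two are made up for illustration. Note also that mod is newLISP's floating-point modulo while % is the integer one; that may be why (pack) misbehaved with (mod) above, but that explanation is an assumption.

```newlisp
; Sketch: split a 16-bit code into two bytes using integer operators only.
(set 'code 32765)
(set 'lo (& code 0xFF))          ; low byte as an integer (253 for 32765)
(set 'hi (& (>> code 8) 0xFF))   ; high byte as an integer (127 for 32765)

; Integer modulo is %, so this stays an integer, unlike (mod ...).
(% code 256)

; Build a 2-byte binary string; "b" packs one unsigned 8-bit value
; per argument, regardless of the UTF-8 build.
(set 'two (pack "bb" hi lo))

; Read the bytes back as a list of integers.
(unpack "bb" two)

; length counts bytes, so it follows the byte discipline the
; compression code needs.
(length two)
```

Working on integers and lists of integers for the algorithm itself, and converting with (pack) and (unpack) only at the boundaries, avoids going through (char) and its UTF-8 encoding altogether.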
#3
Running on Windows, (exec)'ing Google Translate returns the following str inside a JSON container:



(a) "Nous allons habiller pour la randonnée, selon la météo."



However, somewhere in the process of a (string str "") or a (replace x str y), the str begins to (println) like this:



(b) "Nous allons habiller pour la randonn195169e, selon la m195169t195169o. nn</td></tr>"



How can I convert (b) to (a) so I can (println) (a) to a static HTML file?



Do I have to make a Unicode build?
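
A possible reading of (b), stated as a guess: the digits 195169 match the byte values 195 and 169, which together are the UTF-8 encoding of é, and the stray nn looks like \n\n with the backslashes removed. So the string may have passed through its escaped form and lost its backslashes along the way. Under that assumption, a minimal repair for this one character could look like the sketch below. The variable name b and the file name out.html are only for illustration, and other accented characters would need their own byte pairs.

```newlisp
; Sketch: restore the two bytes 195 and 169 (UTF-8 for é) where the
; digit run "195169" appears. This assumes the digits never occur
; as genuine text in the string.
(set 'b "Nous allons habiller pour la randonn195169e, selon la m195169t195169o.")

; replace modifies the string held in b; pack builds the raw 2-byte sequence.
(replace "195169" b (pack "bb" 195 169))

; Write the repaired bytes to a static file (hypothetical file name).
(write-file "out.html" b)
```

Since the repair works on raw bytes, it should not depend on whether the build is UTF-8 enabled, though how the result displays in a Windows console is a separate matter.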