bizarre string bug [nevermind]

Started by itistoday, January 10, 2009, 09:40:18 PM


itistoday

never mind, my stupid mistake, see posts below.



Basically, newLISP claims that two identical strings are not equal.



Given this input file named "test.csv":

Date, Type, Net,
"11/22/2008","Web Accept Payment Received","15.00",

Here is the output of the program:


'Web Accept Payment Received' vs 'Web Accept Payment Received'
data-type=Web Accept Payment Received, filter-type=Web Accept Payment Received
(!= date-type filter-type)=true
0 27


The string "Web Accept Payment Received" is stored in two variables, filter-type and data-type. It is defined in the script for filter-type, and it's read from the file and stored in data-type.



When printing the contents of those variables, they appear identical.  I even iterated over each character and verified they were the same ASCII value.  However, when asking for the length, newLISP claims data-type is 0 characters in length, and when checking equality (as shown above), it fails.



Here is the script so far (it's not complete); see the part where it says "BUG HERE":
#!/usr/bin/newlisp

; =============
; = configure =
; =============

(set 'csv-delimiter ",")
(set 'header-date "Date")
(set 'header-amount "Net")
(set 'header-type "Type")

; filters
(set 'filter-type "Web Accept Payment Received")
(set 'filter-min-amount 15)
(set 'filter-max-amount 20)

; ============
; = end conf =
; ============

; --------------
(context 'DataStore)

(context MAIN)
; --------------

(define (fail) (apply println (args)) (exit 1))
(define-macro (fail-on-nil) (doargs (arg) (if (nil? (eval arg)) (fail arg " is nil"))))
(define-macro (paras)
(join (map (lambda (x)
(string x "=" (eval x))
) (args)) ", ")
)

(set 'csv-input-file (main-args 2))

(if-not csv-input-file (fail "usage: ./" (main-args 1) " <paypal>"))
(if-not (file? csv-input-file) (fail "no such file: " csv-input-file))
(if-not (regex "(.*).csv" csv-input-file) (fail "not a csv file: " csv-input-file))

(set 'csv-output-file (append $1 "-out.csv"))

(set 'fin (open csv-input-file "r"))
(set 'fout (open csv-output-file "w"))

(set 'header-list (map trim (parse (read-line fin) csv-delimiter)))

; set the indexes
(set 'index-date (find header-date header-list))
(set 'index-amount (find header-amount header-list))
(set 'index-type (find header-type header-list))
(fail-on-nil index-type index-amount index-date)

; write the header
(write-line fout (append "Date" csv-delimiter " Copies Sold" csv-delimiter " "))

(while (read-line fin)
(set 'data-list (map (fn (x) (trim x "\"")) (parse (current-line) csv-delimiter)))
(set 'data-date (data-list index-date))
(set 'data-type (data-list index-type))
(set 'data-amount (float (data-list index-amount)))
(fail-on-nil data-date data-amount data-type)


;; BUG HERE
(println "'" data-type "' vs '" filter-type "'")
(println (paras data-type filter-type))
(println (paras (!= date-type filter-type)))
(println (length date-type) " " (length filter-type))

; (dostring (c data-type) (println c " - " (char c)))
; (println)
; (dostring (c filter-type) (println c " - " (char c)))

(exit)
;; END BUG

(if (and (= (length date-type) (length filter-type)) (>= data-amount filter-min-amount) (<= data-amount filter-max-amount))
(begin
(println data-date " " data-amount)
(DataStore data-date (+ (if $it $it 0) 1))
)
(println "skipping " data-date " $" data-amount)
)
)


(println "result: " (DataStore))

(close fin)
(close fout)
(exit)
Get your Objective newLISP groove on.

cormullion

#1
I have no idea what's happening here... :) I'm even getting confused between your two variables "datE-type" and "datA-type":


(println (paras data-type filter-type))
   (println (paras (!= date-type filter-type)))
   (println (length date-type) " " (length filter-type))


Are these supposed to be different?

DrDave

#2
Quote from: "cormullion"
I have no idea what's happening here... :) I'm even getting confused between your two variables "datE-type" and "datA-type":


(println (paras data-type filter-type))
   (println (paras (!= date-type filter-type)))
   (println (length date-type) " " (length filter-type))


Are these supposed to be different?

I think date-type is a typo, and occurs more than once.
...it is better to first strive for clarity and correctness and to make programs efficient only if really needed.

"Getting Started with Erlang" version 5.6.2

itistoday

#3
Quote from: "cormullion"
I have no idea what's happening here... :) I'm even getting confused between your two variables "datE-type" and "datA-type":


(println (paras data-type filter-type))
   (println (paras (!= date-type filter-type)))
   (println (length date-type) " " (length filter-type))


Are these supposed to be different?


Don't feel bad, it appears I was confused between those two variables as well! Thanks! It appears you've fixed the "bug"! (I feel rather silly now :-p).

Lutz

#4
In your case it was just a typo, but actually there is a case where two strings look alike but are different; when strings contain binary zeros as a result of working with imported C functions:



> (set 'A "abc" 'B "abc\000\000") ; B has trailing two 0's

> (println A " " B) ; println strips the 0's
abc abc

> (= A B)
nil

> (println (length A) ":" (length B))
3:5

> (= A (get-string B))
true
>


Here 'get-string' is used to strip trailing zeros.



For those of you who are C programmers:


> (import "libc.dylib" "strcat")

> (set 's (dup "\000" 20))  ; create a buffer with 20 zeros

> (strcat s "abc")

> (strcat s "def")

> (= s "abcdef")
nil

> (= (get-string s) "abcdef")
true
>
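
For the C programmers, here is a rough sketch of the same effect seen from the C side. It is only an illustration under assumptions, not how newLISP is implemented: it assumes that a newLISP string carries its own length and that "=" compares the full stored length, which is what the transcripts above suggest. The names equal_counted, equal_cstr, fill_buffer, demo and BUF_SIZE are all made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 20  /* assumed: same size as the zeroed buffer in the transcript */

/* Hypothetical helper: length-aware comparison. Two byte runs are equal
   only if their lengths match and every byte matches, zeros included. */
bool equal_counted(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Hypothetical helper: C-style comparison that stops at the first zero byte. */
bool equal_cstr(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Rebuild the buffer from the transcript: all zeros, then two strcat calls. */
void fill_buffer(char buf[BUF_SIZE])
{
    memset(buf, 0, BUF_SIZE);
    strcat(buf, "abc");
    strcat(buf, "def");
}

/* Run the three comparisons; returns 0 if they behave as described. */
int demo(void)
{
    char s[BUF_SIZE];
    fill_buffer(s);

    /* From C's point of view the string is six characters long. */
    assert(strlen(s) == 6);

    /* Whole 20-byte buffer vs. the 6-byte literal: the lengths differ. */
    bool whole = equal_counted(s, BUF_SIZE, "abcdef", 6);

    /* Cut at the first zero byte, roughly what get-string does. */
    bool cut = equal_counted(s, strlen(s), "abcdef", 6);

    /* strcmp never looks past the terminator, so it sees no difference. */
    bool cstr = equal_cstr(s, "abcdef");

    printf("whole=%d cut=%d strcmp=%d\n", whole, cut, cstr);
    return (!whole && cut && cstr) ? 0 : 1;
}
```

Calling demo() from a main function prints whole=0 cut=1 strcmp=1: the buffer only matches "abcdef" once it is cut at the first zero byte, which is the job get-string does on the newLISP side.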

cormullion

#5
you were unlucky - two commonly typed words differing only by the last letter, and both "e" and "a" similar enough at small point sizes. My newLISP coding is now done using 18 point type.. :)



btw - hope the spying business is doing well!

itistoday

#6
Quote from: "Lutz"
In your case it was just a typo, but actually there is a case where two strings look alike but are different; when strings contain binary zeros as a result of working with imported C functions:


Thanks for the heads up Lutz!


Quote from: "cormullion"
you were unlucky - two commonly typed words differing only by the last letter, and both "e" and "a" similar enough at small point sizes. My newLISP coding is now done using 18 point type.. :)


Yeah... Monaco 12pt. here, which could be the culprit, but I just can't stand using large fonts for coding; I like to see as much code as possible at once without having to scroll around.


Quote
btw - hope the spying business is doing well!


Thanks! :-p



I'm planning on covering newLISP on the site's blog as well. One of the things I'd like to share is my TextMate bundle for newLISP; here's a screenshot of some of the highlighting:



http://www.kinostudios.com/images/newlisphighlight.png