parsing numbers (error?)

Started by Fanda, February 08, 2007, 05:45:06 AM

Fanda

It seems that there is an error in 'parse'. It behaves differently on different numbers:


> (parse "Feb 07,2007")
("Feb" "07" "," "2007")

> (parse "Feb 08,2007")
("Feb" "0" "8" "," "2007")


> (dotimes (i 20) (println (parse (format "Feb %d,2007" i))))
("Feb" "0" "," "2007")
("Feb" "1" "," "2007")
("Feb" "2" "," "2007")
("Feb" "3" "," "2007")
("Feb" "4" "," "2007")
("Feb" "5" "," "2007")
("Feb" "6" "," "2007")
("Feb" "7" "," "2007")
("Feb" "8" "," "2007")
("Feb" "9" "," "2007")
("Feb" "10" "," "2007")
("Feb" "11" "," "2007")
("Feb" "12" "," "2007")
("Feb" "13" "," "2007")
("Feb" "14" "," "2007")
("Feb" "15" "," "2007")
("Feb" "16" "," "2007")
("Feb" "17" "," "2007")
("Feb" "18" "," "2007")
("Feb" "19" "," "2007")
("Feb" "19" "," "2007")

> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i))))
("Feb" "00" "," "2007")
("Feb" "01" "," "2007")
("Feb" "02" "," "2007")
("Feb" "03" "," "2007")
("Feb" "04" "," "2007")
("Feb" "05" "," "2007")
("Feb" "06" "," "2007")
("Feb" "07" "," "2007")
("Feb" "0" "8" "," "2007")
("Feb" "0" "9" "," "2007")
("Feb" "010" "," "2007")
("Feb" "011" "," "2007")
("Feb" "012" "," "2007")
("Feb" "013" "," "2007")
("Feb" "014" "," "2007")
("Feb" "015" "," "2007")
("Feb" "016" "," "2007")
("Feb" "017" "," "2007")
("Feb" "01" "8" "," "2007")
("Feb" "01" "9" "," "2007")
("Feb" "01" "9" "," "2007")


Fanda

nigelbrown

#1
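It looks like 'parse' interprets a number with a leading 0 as octal. 07 is a valid octal number, but 08 is not (8 is not an octal digit), so the token ends after the 0 and the input is split into "0" and "8".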



The manual says:

parse tokenizes according to newLISP's internal parsing rules.



and the section on numbers says:

Octals start with an optional + (plus) or - (minus) sign and a 0 (zero), followed by any combination of the octal digits: 01234567. Any other character ends the octal number.
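
To make the quoted rule concrete, here is a minimal Python sketch that mimics it. This is not newLISP's actual tokenizer: the function name `split_digit_run` is made up for illustration, and it assumes the input is an unsigned run of digits. A run starting with 0 is read as octal and ends at the first non-octal digit; whatever is left starts a new token.

```python
def split_digit_run(digits):
    """Split a run of digits the way the quoted octal rule suggests.

    Hypothetical helper for illustration only; assumes `digits` holds
    nothing but the characters 0-9 and no sign.
    """
    tokens = []
    pos = 0
    while pos < len(digits):
        end = pos + 1
        if digits[pos] == "0":
            # Leading zero: octal number, only the digits 0-7 may follow.
            while end < len(digits) and digits[end] in "01234567":
                end += 1
        else:
            # No leading zero: plain decimal number, any digit may follow.
            while end < len(digits) and digits[end].isdigit():
                end += 1
        tokens.append(digits[pos:end])
        pos = end
    return tokens


print(split_digit_run("07"))   # stays one token
print(split_digit_run("08"))   # the 8 ends the octal number
print(split_digit_run("018"))  # octal "01", then decimal "8"
```

Under these assumptions the sketch reproduces the splits reported above for 07, 08, 010 and 018.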



Nigel

Lutz

#2
Yes, exactly as Nigel explains. But if you specify the break string in 'parse', the string is split only at that pattern and the day number stays in one piece:


> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i) "\s|," 0)))
("Feb" "00" "2007")
("Feb" "01" "2007")
("Feb" "02" "2007")
("Feb" "03" "2007")
("Feb" "04" "2007")
("Feb" "05" "2007")
("Feb" "06" "2007")
("Feb" "07" "2007")
("Feb" "08" "2007")
("Feb" "09" "2007")
("Feb" "010" "2007")
("Feb" "011" "2007")
("Feb" "012" "2007")
("Feb" "013" "2007")
("Feb" "014" "2007")
("Feb" "015" "2007")
("Feb" "016" "2007")
("Feb" "017" "2007")
("Feb" "018" "2007")
("Feb" "019" "2007")
("Feb" "019" "2007")
>
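
For comparison, the same idea (split only on an explicit delimiter pattern, so no number rules are applied) can be sketched in Python with `re.split`. The helper name `split_date` is hypothetical, and the pattern is simply the one from the 'parse' call above.

```python
import re


def split_date(text):
    # Split on a single whitespace character or a comma, as in "\s|,".
    # The pieces are kept as plain strings; nothing is read as a number.
    return re.split(r"\s|,", text)


print(split_date("Feb 08,2007"))
print(split_date("Feb 018,2007"))
```

Because the pieces are never interpreted as numbers, "08" and "018" come back unchanged.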


See also the new 'parse-date'.



Lutz



PS: "internal parsing rules" means: the same rules as for newLISP source code.

cormullion

#3
it's a bit like the 'bug' I had to track down. My program worked perfectly for 9 months and then went wrong... :-)



http://newlisper.blogspot.com/2006/09/my-mistake.html

Fanda

#4
Octal numbers seem to cause confusion. Could we change their format to something similar to hex numbers?



"x12" -> "o22"

0x12 -> 0o22   [zero - small "o"]



or maybe use "c":

"x12" -> "c22"

0x12 -> 0c22   [zero - small "c"]
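
As a quick check of the arithmetic in these examples: hex 12 and octal 22 are the same value, 18. The short Python sketch below only verifies that equivalence. It relies on the `0o` prefix that current Python versions accept; it says nothing about newLISP's syntax, and the variable names are illustrative.

```python
# Hex 12 and octal 22 both denote decimal 18.
hex_value = int("12", 16)
octal_value = int("22", 8)

print(hex_value, octal_value)

# Python 3 literals with explicit prefixes give the same value.
same_literal = (0x12 == 0o22)
print(same_literal)
```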



Fanda

Lutz

#5
It is better to stay with standard conventions in this case. C, Perl and Python all do the same thing here.



Lutz