9.3 docs - there's no ref-set-all because...

Started by cormullion, December 31, 2007, 07:38:38 AM


cormullion

I'm updating my tutorial document with the new list searching functions. My current understanding is that there's no ref-set-all to go with set-ref-all, although there's a ref-set to go with set-ref and a set-nth to go with nth-set, because you can't return only the changed elements of a list...



For example:


(set 'l (list (list 'aaa 100) (list 'aaa 200)))
;-> ((aaa 100) (aaa 200))
(set-ref-all (l 'aaa) 'a1)
;-> ((a1 100) (a1 200))


changes every occurrence of 'aaa to 'a1 in the list, but returns the whole list. So for big lists, you'd ideally want the equivalent ref-set-all that returns only the changed elements, but you can't do that.



Am I close?



Also, is this a new form of element referencing - putting a string or symbol or number after a list, in parentheses? It's analogous to implicit indexing I suppose...

Lutz

#1
Quote: Am I close?




Yes, although one could return a list of all changed elements, this would only make sense when using regex, match or unify to find multiple originals, which then could potentially look different. If this is what somebody needs, you can always work around it by pushing $0 onto a list as a side task in your replacement expression.
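
The workaround described above might look roughly like the sketch below. It assumes the 9.3-era argument order and the $0 system variable used elsewhere in this thread; the list l, the symbol changed and the replacement value a1 are made up for illustration, and newer newLISP releases may expose the found expression under a different name (the current manual appears to use $it).

```newlisp
; Sample data: two entries that match the pattern and one that does not.
(set 'l '((aaa 100) (aaa 200) (bbb 300)))

; Hypothetical accumulator for the originals that get replaced.
(set 'changed '())

; For every sublist matching (aaa ?), remember the original in 'changed'
; as a side task, then return the replacement expression.
; $0 holds the found expression (assumption: 9.3-era behaviour).
(set-ref-all '(aaa ?) l
  (begin
    (push $0 changed -1)        ; append the original to the side list
    (list 'a1 (last $0)))       ; the actual replacement
  match)

; 'changed' now holds only the elements that were replaced,
; while 'l' holds the whole modified list.
(println changed)
(println l)
```

This keeps set-ref-all returning the full list while still giving access to just the affected elements.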


Quote: Also, is this a new form of element referencing - putting a string or symbol or number after a list, in parentheses? It's analogous to implicit indexing I suppose...


yes, this is correct and on purpose.



Also, in the coming development version (hopefully the last before the 9.3 release), overflowing indices in nth, set-nth, nth-set, push and pop will cause a "list index out of bound" error message. The different behavior of lists and arrays in previous versions has been criticized frequently over the last years, so after the last discussion I finally decided to take the step of changing to a more consistent and commonly expected behavior.



Lutz

cormullion

#2
OK! The new form looks very powerful.



I wonder whether the new error for out of range indices will break any code. I don't mind myself - it will be interesting to find out...



Out of curiosity - did you decide that returning nil wasn't a good idea?



And Happy New(lisp) Year Lutz!

Lutz

#3
Quote: I wonder whether the new error for out of range indices will break any code. I don't mind myself - it will be interesting to find out...


Few, if any, have used overflowing indices as a means to reach the last or first element in a list on purpose. A quick check of most of my own code revealed no problems.



As somebody pointed out, a lot of sloppy code might turn up now throwing error messages, but that would be a good thing.


Quote: Out of curiosity - did you decide that returning nil wasn't a good idea?


I thought returning 'nil' was a bad idea because it creates ambiguous situations when 'nil' itself is part of the list and is returned, so it is nothing you can test on unless you are sure you do not have 'nil's in your list.
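
To illustrate the ambiguity, here is a minimal sketch of a bounds-checked accessor. The helper name safe-nth and its fallback argument are hypothetical and not part of newLISP; the sketch only handles non-negative indices and ignores the negative (count-from-the-end) indices that nth also accepts.

```newlisp
; Hypothetical helper: return the element at idx, or 'fallback'
; when idx lies outside the list, without relying on nil.
(define (safe-nth idx lst fallback)
  (if (and (>= idx 0) (< idx (length lst)))
      (nth idx lst)
      fallback))

(set 'data '(a nil c))

(set 'r1 (safe-nth 1 data 'missing))  ; nil, a genuine element of the list
(set 'r2 (safe-nth 7 data 'missing))  ; missing, index is out of range
```

With an explicit bounds check, a stored nil and an out-of-range index can be told apart.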



There was another odd situation in the old way of working: (pop '() 0) returned 'nil'. With 9.2.12 (pop '() 0) will throw an error but (pop '()) will still return 'nil'; we have to see how this is received by most users. Nobody has commented on it in the past.



In any case, when popping lists iteratively or recursively one should test with 'empty?' or test on the boolean value as in: (while myList (pop-it)), never rely on a popped 'nil' to be an indicator that the list is empty.
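
A small sketch of the point above, using a made-up list that contains nil as a regular element. The names myList and seen are illustrative only.

```newlisp
(set 'myList '(a nil b))

; Fragile version (left commented out): it would stop at the nil
; element and never reach b.
; (while (set 'item (pop myList)) (println item))

; Robust version: test the list itself, not the popped value.
(set 'seen '())
(until (empty? myList)
  (push (pop myList) seen -1))   ; collect every element in order

(println seen)   ; prints (a nil b)
```

The loop ends only when the list is really empty, so the nil in the middle is processed like any other element.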



Lutz

cormullion

#4
Thanks... I've already spotted some code of mine that relies on out-of-range indices - although it's knowingly doing it, so no problem...!



Back to set-ref-all... I'm trying to update a short script in the Intro that modifies the BOILING_POINT field in an sxml list:


((PERIODIC_TABLE
   (ATOM (NAME "Actinium")
   (ATOMIC_WEIGHT "227")
   (ATOMIC_NUMBER "89")
   (OXIDATION_STATES "3")
   (BOILING_POINT ((UNITS "Kelvin")) "3470")
   (SYMBOL "Ac")
   (DENSITY ((UNITS "grams/cubic centimeter")) "\n      10.07\n    ")
   (ELECTRON_CONFIGURATION "[Rn] 6d1 7s2 ")
   (ELECTRONEGATIVITY "1.1")
...


It used to be done with ref-all. But all my attempts to use set-ref-all with match to access any of these sublists lead to "call stack overflow : match <13933>". I think I should be able to do this with match but it seems harder than it should be?

Lutz

#5
Here are some examples:


(set 'P

'(PERIODIC_TABLE
   (ATOM (NAME "Actinium")
   (ATOMIC_WEIGHT "227")
   (ATOMIC_NUMBER "89")
   (OXIDATION_STATES "3")
   (BOILING_POINT ((UNITS "Kelvin")) "3470")
   (SYMBOL "Ac")
   (DENSITY ((UNITS "grams/cubic centimeter")) "\n      10.07\n    ")))

)

; look for Actinium
(set-ref-all '(ATOM (NAME "Actinium") *) P (println $0) match)

; look for all OXIDATION_STATES "3"
(set-ref-all '(ATOM * (OXIDATION_STATES "3") *) P (println $0) match)

; like the previous one, but assumes a constant number and order of attributes
; for atoms; would be faster because it is more explicit
(set-ref-all '(ATOM ? ? ? (OXIDATION_STATES "3") ? ? ?) P (println $0) match)


Note that in these examples nothing gets replaced, because the 'println' statement simply returns the found expression, which in all cases is an ATOM element. There is also only one ATOM in the table.
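
For the BOILING_POINT case asked about earlier, a replacement that actually changes something might be sketched as follows. It assumes the same argument order and $0 variable as the examples above (newer releases may use $it instead), uses a cut-down table, and the new value "3471" is invented purely to show the mechanics.

```newlisp
; Cut-down table with a single ATOM, for illustration only.
(set 'P
  '(PERIODIC_TABLE
     (ATOM (NAME "Actinium")
           (BOILING_POINT ((UNITS "Kelvin")) "3470"))))

; Rebuild every BOILING_POINT element: keep the attribute list
; (element 1 of the match) and substitute a new reading.
(set-ref-all '(BOILING_POINT *) P
  (list 'BOILING_POINT (nth 1 $0) "3471")
  match)

(println P)
```

Because the replacement expression is evaluated for each match, the original units attribute can be carried over from $0 rather than hard-coded.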



Lutz



ps: perhaps you can make up a simpler periodic table with just a few entries to show how the stack overflow in 'match' is caused.

cormullion

#6
Great, thanks Lutz. Those examples are very illuminating. 'match' is as cool as a mountain stream and there's a little struggle involved to master it...



The problem appears to be the size of the SXML list. With about 26 elements, all your examples work. With about 28 elements, they produce call stack errors.



I think I got my XML periodic table from http://www.w3.org/XML/Binary/2005/03/test-data/Over100K/periodic.xml.

Lutz

#7
The default stack size in newLISP is only 2048; just bump it up, e.g.:


newlisp -s 10000

... and try again, or try to rephrase your search pattern.



Lutz



ps: I am still busy with family festivities today, but will try the XML link later this week.

Lutz

#8
The stack overflow in the new 'set-ref-all' was caused by a bug, not by the nesting of the data list or match expression. A stack overflow in this function should only occur with extremely deep nesting of the data.



This is fixed in development version 9.2.12, due later this week.



Lutz

cormullion

#9
That's good news. Thanks!