How can I list all field names in a parsed S-XML list?

Started by Jeremy Reimer, May 13, 2010, 10:47:39 AM


Jeremy Reimer

Hello!  I'm just getting started with newLISP, and I really like it a lot.  Forgive the newbie questions.



I have a test XML file called dingy.xml that looks like this:



<document>
<element name="First" date="4/23/2010">
  <data>This is the first element</data>
</element>
<element name="Second" date="4/23/2010">
  <data>This is the second element</data>
</element>
</document>


I am parsing with xml-parse using this code:



(define (parse-xml-data str)
  (xml-type-tags nil nil nil nil)
  (let ((xml (xml-parse str 15)))
       (or xml (throw-error (xml-error)))))

(setq parsedlist (parse-xml-data (read-file "./dingy.xml")))


This returns an s-xml list as follows:



((document (element ((name "First") (date "4/23/2010")) (data "This is the first element")) (element ((name "Second") (date "4/23/2010")) (data "This is the second element"))))



Now, I am able to search and retrieve, say, all items with the type "data" using this code:



(dolist (el (ref-all 'data parsedlist))
         (println (rest (parsedlist (chop el)))))


This gives me back the following result:



("This is the first element")
("This is the second element")



This is great!  But now my question is this: how can I find all the "field names" (I don't know how better to describe them) without knowing them beforehand?  In this XML example, I have an element type called "data", which I pass to ref-all as the quoted symbol 'data.  But what if I just want a list of every name I could search for?  In other words, how could I get back a list that looks like this:



((document) (name) (date) (element) (data))



which contains all the different element and attribute names (field names?) in my XML?  My goal is to be able to read in any XML file and get back a list of searchable "field names".



In the newLISP introductory guide, it says (pg 156):


Quote
You add them up to get the options code number – so 15 (+ 1 2 4 8) uses the first four of these options: suppress unwanted stuff, and translate string tags to symbols.  As a result of this, new symbols have been added to newLISP's symbol table:



(channel description docs item lastBuildDate link managingEditor rss sxml title version webMaster xml)



These correspond to the string tags in the XML file, and they'll be useful almost immediately.


I guess what I'm looking for is some way to bring back a list of all the symbols that have been added to the symbol table when I use xml-parse.  I wish I could have explained it better.  :)



Thanks so much for any help you can provide!

cormullion

#1
Hi Jeremy!



I can't think of anything super elegant, but what about:


(unique (filter symbol? (flat x)))

I don't really like it - but I haven't persuaded match to do it either... I'm sure you should be able to look for patterns of element types, but...
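
Applied to the list from the first post, that one-liner might look like the sketch below. The name field-names is just an illustrative choice, and parsedlist is written out as a quoted literal (copied from the xml-parse output above) so the snippet runs without dingy.xml:

```newlisp
; Assumption: parsedlist holds the S-XML list shown in the question,
; written here as a quoted literal so no XML file is needed.
(set 'parsedlist
  '((document
      (element ((name "First") (date "4/23/2010"))
               (data "This is the first element"))
      (element ((name "Second") (date "4/23/2010"))
               (data "This is the second element")))))

; Flatten all nesting, keep only the symbols, then drop duplicates.
; field-names is a hypothetical variable name used for illustration.
(set 'field-names (unique (filter symbol? (flat parsedlist))))

(println field-names)
;-> (document element name date data)
```

Because flat discards the nesting, this picks up every symbol anywhere in the list, which is fine here since the only symbols in an S-XML list are tag and attribute names.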

Jeremy Reimer

#2
This totally worked!  Thanks so much for your quick reply!

cormullion

#3
Cool. A recursive solution too:


(define (heads-up l)
   (cond
     ((list? l)
         (heads-up (first l))
         (dolist (s (rest l)) (heads-up s)))
     ((symbol? l)
          (unless (find l acc) (push l acc)))))

; acc is a global accumulator; start from an empty list on each run
(set 'acc '())
(heads-up parsedlist)
acc
;-> (data date name element document)


Lutz - is there a way to use match with other functions to get the first element of every list? I couldn't find one.

Lutz

#4
(set 'aList '((a 1 (b 2) (c 3)) (d 4) (e 4 (f  5 6 7))))

(map first (ref-all '(? *) aList match true))    

;=> (a b c d e f)

The 'true' option collects the matching elements instead of their index vectors. This can make the list returned by 'ref-all' very big if the original list has a lot of nesting. In that case, work without the 'true' option and loop through the list of index vectors:

(set 'aList '((a 1 (b 2) (c 3)) (d 4) (e 4 (f  5 6 7))))

(dolist (idx (ref-all '(? *) aList match))
  (println (first (aList idx))))

a
b
c
d
e
f

I would try the first method first, as it is probably much faster and definitely more elegant.
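
One thing to watch when using the first method on the S-XML list from the original question: the attribute lists, such as ((name "First") (date "4/23/2010")), also match '(? *), and their first element is itself a list rather than a symbol. A minimal sketch that filters those out is below; heads and field-names are illustrative names, and parsedlist is again written as a quoted literal so the snippet runs on its own:

```newlisp
; Assumption: parsedlist holds the S-XML list shown in the question.
(set 'parsedlist
  '((document
      (element ((name "First") (date "4/23/2010"))
               (data "This is the first element"))
      (element ((name "Second") (date "4/23/2010"))
               (data "This is the second element")))))

; Collect the first element of every non-empty sublist.
; Some of these are lists, e.g. (name "First"), because the
; attribute list of each element also matches '(? *).
(set 'heads (map first (ref-all '(? *) parsedlist match true)))

; Keep only the symbols and remove duplicates.
(set 'field-names (unique (filter symbol? heads)))

(println field-names)
;-> (document element name date data)
```

Unlike flattening the whole list, this only looks at the head position of each sublist, so a symbol that appeared as ordinary content would not be reported as a field name.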

cormullion

#5
Yes, I knew some wild card magic would do it! Thanks.