dolist or not dolist

joejoe · November 16, 2010, 08:07:29 AM

Hi -

I still know next to nothing about nL, but I have patched this together with help:

(set 'result (unique (sort
    (find-all {[a-zA-Z]+}
        (replace "<[^>]+.*+>" (get-url "http://mysite.com/") "" 0) )
)))
(println result)
(exit)

I am now trying to pull from several of my sites, so I want to get-url mysite.com, mysite0.com and mysite1.com as well.

I thought this should be done with dolist. I still want to unique and sort the compilation of urls that I retrieve.

So I tried this:

(set 'result (unique (sort (dolist 123
    (find-all {[a-zA-Z]+}
        (replace "<[^>]+.*+>" (get-url "http://mysite.com") "" 0) )
    (find-all {[a-zA-Z]+}
        (replace "<[^>]+.*+>" (get-url "http://site.com/") "" 0) )
    (find-all {[a-zA-Z]+}
        (replace "<[^>]+.*+>" (get-url "http://newsite.com/") "" 0) )
))))

(println result)

(exit)

But it says that a list is expected in dolist.

How would I make dolist hold off on unique and sort until the 'result list has been fully compiled?

Thanks!

cormullion · November 16, 2010, 02:31:48 PM

Without knowing what the find-all stuff is doing, I'm guessing that this is one way of writing the sort of thing you want:

(dolist (url '("http://mysite.com" "http://mysite.com" "http://mysite.com"))
    (push (find-all {[a-zA-Z]+} (replace "<[^>]+.*+>" (get-url url) "" 0)) result))

(println (unique (sort result)))
(exit)

You need a list to iterate through. The results are accumulated in the result list.
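As a minimal offline sketch of the same pattern (the strings below are made-up stand-ins for fetched pages, not real data), dolist binds a loop variable to each element of a list in turn, and push collects one value per element:

```newlisp
; hypothetical stand-in data, so no network access is needed
(set 'pages '("one two" "three"))
(set 'result '())

; dolist takes (variable list) first, then the body to run per element
(dolist (page pages)
    (push (length page) result -1)) ; index -1 appends at the end

(println result) ; prints (7 5)
```

The first argument to dolist has to be a (symbol list) pair, which is why a bare 123 in that position gives a "list expected" error.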

hth

joejoe · November 17, 2010, 04:35:41 PM

Quote from: "cormullion"

Without knowing what the find-all stuff is doing, I'm guessing that this is one way of writing the sort of thing you want:

(dolist (url '("http://mysite.com" "http://mysite.com" "http://mysite.com"))
    (push (find-all {[a-zA-Z]+} (replace "<[^>]+.*+>" (get-url url) "" 0)) result))

(println (unique (sort result)))
(exit)

You need a list to iterate through. The results are accumulated in the result list.

hth

That is a work of beauty. I can't get over how effective and to the point nL is.

You all are angels. Thanks big cormullion!

joejoe · November 17, 2010, 06:40:12 PM

I can't figure out why it is not applying unique and sort to the result list:

#!/usr/bin/newlisp

(dolist (url '("http://www.newlisp.org" "http://newlisp.nfshost.com/wiki/"))
    (push (find-all {[a-zA-Z]+} (replace "<[^>]+.*+>" (get-url url) "" 0)) result))

(println (unique (sort result)))
(exit)

I am seeing duplicate words, and they are unsorted. Must be something simple I am missing?

[...] "script" "body" "body" "html"))

I feel like a child who knows what he wants to say but can't get the words out. :)

I promise I am learning and would love to help once I have learned. ;)

Thanks again for any tip or suggestion!

cormullion · November 18, 2010, 09:09:25 AM

Keep at it! Sorry - the code I supplied wasn't really a solution, rather it was a suggestion of an approach you could try.

Perhaps what you're forgetting is that lists can be simple or structured/nested/hierarchical:

(3 2 1) ; simple
((0 3) (2 1) (14)) ; structured

A list element can be a string or symbol or number, but it could also be a list. You need to be aware of the nature of the lists you're storing your data in.

In your code, you're using push to add the results of a find-all to an existing list. But find-all returns a list. So your result is a list of lists where each 'sub-list' contains the words found in a single HTML page (I think). Both unique and sort will work on structured lists, but they won't flatten or merge the lists for you - they'll maintain the list structure. So unique on your list looks like it will effectively do nothing (since the elements are almost certain to be different), and sort will probably reorganize the result list so that the shorter lists appear earlier than longer ones.
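To make the nesting visible without fetching anything, here is a small sketch that runs the same push/find-all combination over two made-up strings (the strings are assumptions standing in for page text):

```newlisp
; hypothetical stand-in text instead of (get-url url)
(set 'result '())
(dolist (text '("body html body" "script body"))
    ; find-all returns a list, so each push adds one whole sub-list
    (push (find-all {[a-zA-Z]+} text) result))

(println result)
; prints (("script" "body") ("body" "html" "body"))

; unique compares whole sub-lists, so both survive
; and the repeated "body" inside them is untouched
(println (length (unique result))) ; prints 2
```

The outer list has only two elements, one per input string, which is why the duplicates inside the sub-lists never get compared with each other.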

There are two obvious solutions. Either flatten the result list first, to remove the structure (non-destructively):

(unique (sort (flat result)))
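A quick sketch of what that does, using a hand-written nested list as an assumed stand-in for the real result:

```newlisp
; hypothetical nested list, shaped like the one push builds
(set 'nested '(("script" "body") ("body" "html" "body")))

; flat returns a new one-level list
(println (flat nested))
; prints ("script" "body" "body" "html" "body")

(println (unique (sort (flat nested))))
; prints ("body" "html" "script")

; the original keeps its structure, because flat is non-destructive
(println nested)
; prints (("script" "body") ("body" "html" "body"))
```

Since sort works in place on whatever list it is given, here it only reorders the temporary flattened copy and nested itself stays as it was.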

Or, more instructively, change the way you build the results list:

(set 'result '())
(dolist (url '("http://www.newlisp.org" "http://newlisp.nfshost.com/wiki/"))
    (extend result (find-all {[a-zA-Z]+} (replace "<[^>]+.*+>" (get-url url) "" 0))))
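The same idea can be tried offline with made-up strings in place of the fetched pages (the strings are assumptions, purely for illustration):

```newlisp
; hypothetical stand-in text instead of (get-url url)
(set 'result '())
(dolist (text '("body html body" "script body"))
    ; extend appends the elements of the found list, not the list itself
    (extend result (find-all {[a-zA-Z]+} text)))

(println (length result)) ; prints 5, one entry per word found

(println (unique (sort result)))
; prints ("body" "html" "script")
```

Because result is already a single-level list of strings here, unique and sort see every word directly and no flat step is needed.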

joejoe · November 18, 2010, 11:05:35 AM

Quote from: "cormullion"

Perhaps what you're forgetting is that lists can be simple or structured/nested/hierarchical

That is precisely what I was missing.

I went back to your Introduction

http://en.wikibooks.org/wiki/Introduction_to_newLISP

and re-read the section on flat.

http://en.wikibooks.org/wiki/Introduction_to_newLISP/Lists#flat

Quote from: "cormullion"

There are two obvious solutions. Either flatten the result list first, to remove the structure (non-destructively):

(unique (sort (flat result)))

Or, more instructively, change the way you build the results list:

(set 'result '())
(dolist (url '("http://www.newlisp.org" "http://newlisp.nfshost.com/wiki/"))
    (extend result (find-all {[a-zA-Z]+} (replace "<[^>]+.*+>" (get-url url) "" 0))))

Wow, incredible. I'm fascinated again and again with nL and its community.

Thank you again on this, cormullion. Works beautifully and I get it.

cormullion · November 19, 2010, 09:00:18 AM

Cool! If you're using the wikibooks' Intro to newLISP and find any errors or obscurities, please make a note or something. You can add comments to the 'discussion' page of each page if you like.

The wikibooks version will be updated for newLISP 10.3 soon. Especially if Lutz promises not to break any existing code till 10.3.1... :)

Sadly, syntax highlighting in wikibooks is still unavailable until they update their version of GeSHi to the latest.

newLISP Fan Club
