Has anyone written a web crawler in newLISP?
I have looked, but cannot find such a beast.
Any pointers would be greatly appreciated.
I'd start looking here (//http)...
Quote from "kanen": Has anyone written a web crawler in newLISP?
Back in 2009 I wrote a crawler of sorts to gather information from one big government site.
Pretty simple thing, just several hundred lines of code: "cgi.lsp" + regular expressions + a lot of cookie wrangling. If you are going to build a crawler without cookies, I think a simple one can be developed in one evening.
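If it helps, the core of that approach fits in a few lines of newLISP. A minimal sketch (the URL is a placeholder, and this ignores cookies, robots.txt, and error handling):

```newlisp
; fetch a page (HTTP only; get-url has no TLS support)
(setq page (get-url "http://example.com/"))

; pull out all href targets with a regular expression;
; $1 is the first capture group, option 0 = default PCRE behaviour
(setq links (find-all {href="([^"]+)"} page $1 0))

; a real crawler would push these onto a work queue and loop
(dolist (lnk links)
  (println lnk))
```

A full crawler would also need to resolve relative links and remember which URLs it has already visited.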
I had forgotten, after using Ruby and Python (and, of course, C) for a few years, just how fetching awesome newLISP is.
I did indeed write a simple crawler in one evening, and it turned out to be quite fast.
Quote from "Fritz": Quote from "kanen": Has anyone written a web crawler in newLISP?
Back in 2009 I wrote a crawler of sorts to gather information from one big government site.
Pretty simple thing, just several hundred lines of code: "cgi.lsp" + regular expressions + a lot of cookie wrangling. If you are going to build a crawler without cookies, I think a simple one can be developed in one evening.
Hi Kanen
I wrote some simple tools for analysing websites in newLISP and used curl for fetching URLs, because (get-url) doesn't support "https". You can simply invoke curl via the exec function.
Parsing the returned HTML with SXML would then be the easiest approach.
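For reference, shelling out to curl might look like this (the URL is a placeholder; exec returns the command's standard output as a list of lines, and curl is assumed to be on the PATH):

```newlisp
; get-url can't do TLS, so shell out to curl for https pages
(define (fetch-https url)
  ; exec returns stdout split into a list of lines; re-join them
  (join (exec (string "curl -s " url)) "\n"))

(setq page (fetch-https "https://example.com/"))
(println (slice page 0 80))
```

For anything beyond a quick script you would want to quote or sanitise the URL before handing it to the shell.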
Cheers
Hilti