Extension request on get-url

Started by newdep, March 11, 2006, 11:16:31 AM


newdep

Hi Lutz,



I would like to make a progress indicator for the download done by 'get-url
(or a realtime display/scan/grab of the data as it arrives),
without rebuilding my own get-url function with proxy support.

It's not possible currently, because there is no signaling back about the buffer
of 'get-url. If, for example, an extra boolean option were enabled in 'get-url, I could get
buffer-size indicators back from the C buffer. Then, when I put 'get-url into a (fork)
process, I could grab the downloaded data in the main program.



This is what I'm thinking about ->

(get-url str-url [str-option] [int-timeout [str-header]] [boolean])



The boolean parameter would make get-url interactive; that is, during the
GET action in C, get-url would directly return data from the buffer used in C,
or it could simply report the total bytes back during the read.



A simple (println (get-url "http://bigfiles.com")) would then, for example, either ->



(1) print the complete data directly to the screen (when the option is enabled in get-url),
      not afterwards but in realtime, or

(2) print the blocks of the C buffer to the screen in realtime.



Well, any other way to make 'get-url non-blocking with data feedback

would be welcome too... ;-)



Btw, if it's non-blocking in any form, I don't need data feedback; then

I could simply read the data from the variable it is set to, like:

(println (setq data (get-url "somefiles" true)))





Regards, Norman.
-- (define? (Cornflakes))

Lutz

#1
It shouldn't be hard to write this in newLISP and then customize it to your taste. Before there was 'get-url', I had a 'get-url' written in newLISP in fewer than 10 lines of code that handled most pages (no redirects or chunked pages).



The HTTP protocol is text based. I remember the first version was written just by consulting the O'Reilly "Webmaster in a Nutshell" book, which contains a description of HTTP protocol basics in fewer than 20 pages.



You connect to the server with net-connect, then do a one-line net-send with the GET request. All lines must be terminated with \r\n. Then you can use multiple net-receive calls to get the page. The following interactive code session shows what I mean:



newLISP v.8.8.0 on OSX UTF-8, execute 'newlisp -h' for more info.

> (net-connect "newlisp.org" 80)
3
> (net-send 3 "GET /\r\n")
7
> (net-receive 3 'buff 10240)
4849
> buff
[text]<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>
...
...
</body>
</html>
[/text]
>


Instead of 10240, you choose a smaller buffer size and then report progress after each piece. A GET request in its most simple form, as used above, makes the server give you back the page right away, without any header info etc. By making a more complex, structured GET request you can get header info with content size, special formatting, etc. See the HTTP docs for more details.
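Put together, that idea might look like the sketch below: the same bare GET request as the session above, but read in small chunks so progress can be printed per chunk. Since no headers are requested, the total size is unknown and only the bytes received so far are shown; it also assumes net-receive returns nil once the server closes the connection.

```newlisp
;; minimal sketch of a progress-reporting fetch (not a drop-in get-url
;; replacement): bare "GET path" request, small receive buffer, one
;; progress line per chunk
(define (get-page-progress host path)
  (let (socket (net-connect host 80) page "" buff "" bytes 0)
    (net-send socket (string "GET " path "\r\n"))
    (while (net-receive socket 'buff 1024)      ; small buffer -> many chunks
      (setq page (append page buff))            ; append works on strings
      (setq bytes (+ bytes (length buff)))
      (print "\rreceived " bytes " bytes"))     ; \r overwrites the line
    (net-close socket)
    (println)
    page))

;; usage:
;; (get-page-progress "newlisp.org" "/")
```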



Lutz

newdep

#2
Pops into my mind just now... (the quarter drops in the jukebox...)

that I actually don't need this function through rewriting get-url

(I did that and was not very impressed after all).





It's far simpler ->



Fork a process that checks the file size and writes those counted bytes to the screen

as a percentage or whatever. (It works nicely ;-)
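For completeness, a minimal sketch of that fork-based idea, assuming the download is written to a known file and the total size is known in advance (the file name, URL, and size here are all hypothetical):

```newlisp
;; hypothetical example: monitor a growing file from a forked process
;; while the blocking get-url download runs in the main program
(define (show-progress path total)
  (while (< (or (file-info path 0) 0) total)    ; file-info index 0 = size
    (print (format "\r%3d%%"
                   (/ (* 100 (or (file-info path 0) 0)) total)))
    (sleep 500))                                ; re-check twice a second
  (println "\r100%"))

;; fork the monitor, then do the blocking download in the main process
(fork (show-progress "bigfile.dat" 1000000))
(write-file "bigfile.dat" (get-url "http://bigfiles.com/bigfile.dat"))
```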
-- (define? (Cornflakes))