lithper had me thinking... why go to all the trouble? Why not serve directly from newLISP? Here is a forking web server that serves only one page: "Hello world." Obviously, it's just proof of concept, but the results of httperf (a server benchmark program) are pretty promising. I set max-responders to one fewer than the number of processors I have.
(set 'content "Hello world.")
(set 'response (format {HTTP/1.0 200 OK
Date: Wed, 13 Mar 2008 23:59:59 GMT
Content-Type: text/html
Content-Length: %d

%s} (length content) content))
(define (sig-err n msg)
  (println "\nZowie!")
  (println "Caught signal " n)
  (if msg (println msg))
  (print "Killing responders: ")
  (map destroy procs)
  (println "done!\n")
  (exit))
(constant 'SIGINT 2)
(constant 'SIGKILL 9)
(signal 2 'sig-err)
(signal 9 'sig-err)
(constant 'max-request-size 1024)
(constant 'max-responders 4)
(define (responder socket)
  (let ((conn (net-accept socket)) request)
    (while (not (net-error))
      (net-receive conn 'request max-request-size)
      ;; in a real environment, get the request content here
      (net-send conn response))
    (close conn)
    (exit)))
;; open port socket
(println "Server is starting.")
(set 'listener (net-listen 8080))
(unless listener (throw-error (net-error)))
;; main loop
(set 'procs '())
(while (not (net-error))
  ;; block until a connection attempt
  (while (not (net-select listener "read" 1000)) (sleep 50))
  ;; fork a responder
  (if (<= (length procs) max-responders)
      (push (fork (responder listener)) procs -1)
      (begin
        (wait-pid (pop procs))
        (push (fork (responder listener)) procs -1))))
;; check for errors
(if (net-error) (println (net-error)))
;; clean up and quit
(map destroy procs)
(exit 0)
And the results:
Quote
httperf --timeout=5 --client=0/1 --server=localhost --port=8080 --uri=/whatever --rate=200 --send-buffer=4096 --recv-buffer=16384 --num-conns=5000 --num-calls=10
Maximum connect burst length: 1
Total: connections 5000 requests 50000 replies 50000 test-duration 25.020 s
Connection rate: 199.8 conn/s (5.0 ms/conn, <=13 concurrent connections)
Connection time [ms]: min 0.8 avg 15.1 max 64.7 median 4.5 stddev 16.7
Connection time [ms]: connect 0.1
Connection length [replies/conn]: 10.000
Request rate: 1998.4 req/s (0.5 ms/req)
Request size : 68.0
Reply rate [replies/s]: min 1984.1 avg 1995.1 max 2000.1 stddev 7.3 (5 samples)
Reply time [ms]: response 1.5 transfer 0.0
Reply size : header 96.0 content 12.0 footer 0.0 (total 108.0)
Reply status: 1xx=0 2xx=50000 3xx=0 4xx=0 5xx=0
CPU time : user 4.61 system 16.93 (user 18.4% system 67.7% total 86.1%)
Net I/O: 343.5 KB/s (2.8*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
I don't quite believe your numbers. Are you sure you are not measuring the speed of a very short system error message? ;)))))))))
This would be one of the first rakes to step on, in my experience. That's why I try to use longer test pages and check the reported sizes in the output.
When I tried to load a page from the program, the session did not close: in a browser I have to hit escape, and "ab" hangs, while httperf seems to step over the problem and measure some generated error.
What is really great is the fact that newLisp gives one the ability to quickly create a forking server using its high-level operators, as you did, or take another recent example from Dmitri. And that we toss little ideas and code to each other to test them. I'll play with the script to see how it behaves.
As far as http serving is concerned, the problem is the need to comply with all the bloody web standards. Links like those from mod_lisp in the backend can be informal and easy to script exactly because the web server in front isolates us from the hostile external world, full of ugly users, treacherous redirects, cut connections, and attacks with malformed packets.
But I'd play with your script to see if any adjusting is needed and would save it as a crib.
It closed sessions out for me. The fork exits after the connection is finished. At this point there is absolutely no checking of request headers or anything; it just replies with the message. It loads fine for me in Firefox, but again - it's a program written in lipstick on a bar napkin. I just wanted to see how fast it would go.
Serving static pages, apache is much faster. But that also relies on a little-known secret: don't fork many more processes than the number of processors (or cores, whatever) on the box.
The test response is also very small. It doesn't do nearly so well with a 512 KB response (but it can't chunk yet, either).
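For what it's worth, chunking wouldn't take much. Here is a minimal, untested sketch of HTTP/1.1 chunked transfer encoding (the send-chunked name and the 8 KB chunk size are just placeholders, and the response would also need an HTTP/1.1 status line plus a Transfer-Encoding: chunked header):

;; sketch: send a string body as HTTP/1.1 chunks
;; each chunk is "<hex size>\r\n<data>\r\n"; a zero-size chunk ends the body
(define (send-chunked conn data)
  (let ((pos 0) (len (length data)) (chunk-size 8192))
    (while (< pos len)
      (let ((chunk (slice data pos chunk-size)))
        (net-send conn (format "%x\r\n" (length chunk)))
        (net-send conn chunk)
        (net-send conn "\r\n")
        (set 'pos (+ pos chunk-size))))
    (net-send conn "0\r\n\r\n")))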
Seems to begin working properly if the responder is changed to:
; -----responder---------
(define (responder socket)
  (let ((conn (net-accept socket)) request)
    ;; the while was eliminated: it caused the problem of
    ;; waiting for more when the conversation is over
    (if (not (net-error))
        (begin
          (net-receive conn 'request max-request-size)
          ;(println request)
          ;; in a real environment, get the request content here
          (net-send conn response))
        (begin
          (net-close conn)
          (exit)))))
; /*end responder*/
..yes, and on this old 500MHz Pentium a 9.5kB page is delivered at 410/sec.
Each cycle supposedly reads it from the filesystem (while in fact the OS will cache it and we're pushing memory to memory, most probably).
So your result, if it's not an error page, gives you a rough multiplier for all my tests for your box.
I don't think the while is the problem. You have to allow for persistent connections for HTTP/1.1. Try with this header in the response:
Connection: close\r\n
The reason it's not closing the connection is probably that the client is attempting to maintain it. Once the client closes the connection, the connection is terminated. You can watch the number of processes in top. It maintains a low number; if the processes weren't exiting when the client closed the connection, the loop would not terminate, and the wait-pid call would block indefinitely.
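Something like this, for instance - the same response string as before, just with the extra header (still an HTTP/1.0 status line, as in the original):

(set 'response (format {HTTP/1.0 200 OK
Date: Wed, 13 Mar 2008 23:59:59 GMT
Content-Type: text/html
Content-Length: %d
Connection: close

%s} (length content) content))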
Actually, Connection: close wouldn't apply - it's returning an HTTP/1.0 header (I forgot). At any rate, I couldn't get yours to return any data. It is closing the connection prematurely. You need to wait for net-error - for the client to close the connection. Use wget to get a better picture of what an individual connection is doing.
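For example, something like this shows the server's response headers and throws away the body:

wget -S -O /dev/null http://localhost:8080/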
I just ran the test with a 40kb jpeg. It was a bit slower and there were a few errors where it missed responses, but the stats were still pretty good.
Here is a modified version of the responder that closes the connection after sending:
(define (responder socket)
  (let ((conn (net-accept socket)) request)
    (while (not (net-error))
      (net-receive conn 'request max-request-size)
      ;; in a real environment, get the request content here
      (let ((content (get-test-content)))
        (net-send conn content))
      ;; close after each response so the client sees the connection drop
      (close conn))
    (close conn)
    (exit)))
It's odd. ab works with that. Without it, it does, indeed, report that connections are not being dropped. But with the server closing connections, httperf is reporting that the server is resetting the connection. Here are my results with 32 total threads against a 40 kb jpeg:
ab:
Quote
Finished 1243 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 38867 bytes
Concurrency Level: 100
Time taken for tests: 3.004 seconds
Complete requests: 1243
Failed requests: 0
Broken pipe errors: 0
Total transferred: 48505744 bytes
HTML transferred: 48374914 bytes
Requests per second: 413.78 [#/sec] (mean)
Time per request: 241.67 [ms] (mean)
Time per request: 2.42 [ms] (mean, across all concurrent requests)
Transfer rate: 16147.05 [Kbytes/sec] received
Connnection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.7 0 10
Processing: 26 231 31.0 237 282
Waiting: 23 231 31.0 236 282
Total: 26 232 30.7 237 283
Percentage of the requests served within a certain time (ms)
50% 237
66% 243
75% 246
80% 248
90% 253
95% 256
98% 262
99% 264
100% 283 (last request)
There's one more serious problem though.
If you look at what's happening on your system when well-written network software is waiting for connections, you won't see much activity.
This is what an apache server instance (httpd) looks like:
root# strace -f -p 22969
Process 22969 attached - interrupt to quit
accept(3,
while the dispatching httpd that runs as root issues waitpid calls once a second. The stream looks like:
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
waitpid(-1, 0xbffff2a8, WNOHANG|WUNTRACED) = 0
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
waitpid(-1, 0xbffff2a8, WNOHANG|WUNTRACED) = 0
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
That crooked splinter of a script from my last posting was also waiting on "accept":
root# strace -f -p 23594
Process 23594 attached - interrupt to quit
accept(5,
If one looks at the server you wrote, however, there is a flood of activity, a stream running on the screen at high speed:
select(1024, [4], NULL, NULL, {0, 1000}) = 0 (Timeout)
nanosleep({0, 50000000}, NULL) = 0
select(1024, [4], NULL, NULL, {0, 1000}) = 0 (Timeout)
nanosleep({0, 50000000}, NULL) = 0
select(1024, [4], NULL, NULL, {0, 1000}) = 0 (Timeout)
nanosleep({0, 50000000}, NULL) = 0
... (the same two lines repeat continuously)
I.e., just to wait, the system has to run at full speed, like a mad queen in Carroll's Alice.
I think network apps should not wait with select and sleep. One should wait on accept.
2. Yes, your last change gets rid of the errors on my machine.
It's not running at full speed. I'm seeing 0.00% cpu usage. There are plenty of system calls, but that can be easily eliminated by changing the sleep timer to a higher value. If you want it to be one second, make it 1000.
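For example, a one-line change to the main loop of the first script (sleep takes milliseconds):

(while (not (net-select listener "read" 1000)) (sleep 1000))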
If you change it to wait until receiving, that blocks on each connection so that you cannot accept multiple concurrent connections.
I suppose the solution would be to initially start the forks and have each responder block and then loop on net-receive.
Ok... here is one that blocks on net-accept. At the moment the responder ignores errors and restarts itself (which will need to change in a real version):
(define (num-cpus)
  (int (first (exec "sysctl -n hw.ncpu"))))

(set 'test-content-file "/Users/jober/Desktop/src/server/bear.jpg")

(define (sig-err n msg)
  (println "\nWhat a world, what a world!")
  (println "Caught signal " n)
  (if msg (println msg))
  (print "Killing responders: ")
  (map destroy procs)
  (println "done!\n")
  (exit))

(constant 'SIGINT 2)
(constant 'SIGKILL 9)
(signal 2 'sig-err)
(signal 9 'sig-err)

(constant 'max-request-size 1024)
(constant 'max-responders-per-cpu 4)
(constant 'max-responders (* (num-cpus) max-responders-per-cpu))

(define (get-test-content)
  (letn ((content (read-file test-content-file))
         (len (length content))
         (response ""))
    (write-buffer response "HTTP/1.0 200 OK\r\n")
    (write-buffer response "Date: Wed, 13 Mar 2008 23:59:59 GMT\r\n")
    (write-buffer response "Content-Type: image/jpeg\r\n")
    (write-buffer response (format "Content-Length: %d\r\n" len))
    (write-buffer response "\r\n")
    (write-buffer response content)
    response))

(define (responder socket)
  (let (conn request)
    (while true
      (set 'conn (net-accept socket))
      (net-receive conn 'request max-request-size)
      (net-send conn (get-test-content))
      (net-close conn)))
  (exit))

;; open port socket
(println "Server is starting at " (date (date-value)) " with " max-responders " max responders.")
(set 'listener (net-listen 8080))
(unless listener (throw-error (net-error)))

;; start responders
(set 'procs '())
(dotimes (i max-responders) (push (fork (responder listener)) procs))
Now... watch this!
Quote
ab -n 5000 -c 200 http://localhost:8080/
This is ApacheBench, Version 1.3d <Revision> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Finished 5000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 38867 bytes
Concurrency Level: 200
Time taken for tests: 2.423 seconds
Complete requests: 5000
Failed requests: 0
Broken pipe errors: 0
Total transferred: 198332860 bytes
HTML transferred: 197792110 bytes
Requests per second: 2063.56 [#/sec] (mean)
Time per request: 96.92 [ms] (mean)
Time per request: 0.48 [ms] (mean, across all concurrent requests)
Transfer rate: 81854.26 [Kbytes/sec] received
Connnection Times (ms)
min mean[+/-sd] median max
Connect: 0 11 6.2 10 38
Processing: 15 81 20.6 78 221
Waiting: 7 80 20.7 78 220
Total: 15 92 19.3 89 225
Percentage of the requests served within a certain time (ms)
50% 89
66% 94
75% 97
80% 101
90% 107
95% 120
98% 161
99% 171
100% 225 (last request)
One other thing - don't use (exit) to stop that last version; hit control-c instead. Otherwise you will leave stale procs.
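If you do want a programmatic shutdown, a rough sketch (shutdown is just a hypothetical helper, not part of the script above) would be something like:

;; kill and reap every forked responder before quitting
(define (shutdown)
  (map destroy procs)    ; signal each responder process
  (map wait-pid procs)   ; reap them so no zombies are left behind
  (exit))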
Yep, this one looks interesting! ;)))
Even on my machine it produced almost 800/sec on the very same file (from "ab", which tries to "hog" the connection).
Quote
Document Path: /index.html
Document Length: 9444 bytes
Concurrency Level: 300
Time taken for tests: 12.573616 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 95716725 bytes
HTML transferred: 94902756 bytes
Requests per second: 795.32 [#/sec] (mean)
Time per request: 377.208 [ms] (mean)
Time per request: 1.257 [ms] (mean, across all concurrent requests)
Transfer rate: 7434.06 [Kbytes/sec] received
Traces both from strace (system calls) and ltrace (app library calls) look quite clean and follow a short, clear pattern; no extra activity is generated on the system.
The first proc (the one that spawned the others) did not accept connections; the others all worked, serving the load.
I believe it might be a good skeleton to grow a generic forking server from, what do others have to say about it?
By the way, you do not have to calculate and report Content-Length.
Web browsers will tolerate its absence.
It was necessary for mod_lisp because of the quirks and demands of their protocol.
Why do I mention it? Because in some cases you might wish to avoid slurping the file into memory, or to be able to precalculate it in other ways, etc.
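For instance, a rough sketch of serving a file without a Content-Length header and without slurping it whole (send-file, the path argument, and the 8 KB buffer are assumptions, and the Content-Type just matches the jpeg test):

;; stream a file to the socket in 8 KB pieces; with no Content-Length
;; the connection has to be closed to mark the end of the body
(define (send-file conn path)
  (let ((fd (open path "read")) (buf ""))
    (if fd
        (begin
          (net-send conn "HTTP/1.0 200 OK\r\n")
          (net-send conn "Content-Type: image/jpeg\r\n")
          (net-send conn "Connection: close\r\n\r\n")
          (while (read-buffer fd 'buf 8192)
            (net-send conn buf))
          (close fd)))
    (net-close conn)))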
P.S. ..and it's worth mentioning that the web server embedded into newLisp ( newlisp -http -d 8082 -w `pwd` ) did roughly 180-200 hits/sec, and at one point it choked on the input data and refused to take more requests.
In function definition of responder:
(let (conn request) ...)
should be:
(local (conn request) ...)
'local' is there for exactly this case, where you want uninitialized locals.
... more lint ;-)
(constant 'SIGINT 2)
(constant 'SIGKILL 9)
(signal SIGINT 'sig-err)
(signal SIGKILL 'sig-err)
... more suggestions:
(define (num-cpus)
(int (first (exec "sysctl -n hw.ncpu")) 1))
the '1' as a default value makes sure the thing runs on OSs which don't have 'sysctl -n hw.ncpu', where 'int' would otherwise fail.
I know this is just an experiment, but I think the basic structure of it is nice and worth developing into some kind of standard server.
Heh, sorry about that. I had already changed it but hadn't posted that:
(define (num-cpus)
"Returns the number of cpus as known by sysctl, or 1 if the exec call fails."
(or (int (first (exec "sysctl -n hw.ncpu"))) 1))
Is local faster?
Not sure if it's faster, but that was not the point of my comment ;-)
When I was reading:
(let (conn request) ...)
I saw: he is setting the local variable 'conn' to the contents of 'request', and was looking for 'request' elsewhere, finding out that it was never used except inside the 'let' expression.
Basically you were initializing a local 'conn' to the contents of an unbound global variable 'request', never used before.
Of course your intention was to have two variables 'conn' and 'request' which are local to the function, which is better expressed using
(local (conn request) ...)
ps: remember, there is a syntax of 'let' where you can leave out the parenthesis around the variable-value pair;
(let (x 1 y 2) (list x y)) => (1 2)
perhaps this is where the confusion comes from.
See also: http://www.newlisp.org/downloads/newlisp_manual.html#let
Gotcha. I'm working on http header parsing at the moment. There are so many places it needs to be sanity-checked. Lighttpd's sources are helpful.
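Just to give an idea of the shape of it, a bare sketch that only splits the request line from the header fields, with none of the sanity checking yet (parse-request is a placeholder name):

;; split a raw request into (request-line headers-assoc), e.g.
;; (("GET" "/index.html" "HTTP/1.1") (("host" "localhost:8080") ...))
;; naive: a request body, if present, would also be scanned here
(define (parse-request raw)
  (let ((lines (parse raw "\r\n")) (headers '()))
    (dolist (l (rest lines))
      (let ((idx (find ":" l)))
        (if idx
            (push (list (lower-case (0 idx l))
                        (trim ((+ idx 1) l))) headers -1))))
    (list (parse (first lines) " ") headers)))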