I have been playing with some of the shootout functions and trying to get decent results with newlisp. Here is the fastest implementation I have so far for the sieve of eratosthenes:
(define (sieve size , flags total)
(set 'flags (array (+ size 1))
'total 0)
(for (i 2 (- size 1))
(when (not (flags i))
(for (k (+ i i) size i (> k size))
(nth-set (flags k) 1))
(inc 'total)))
total)
(define (m n)
(* (pow 2 n) 10000))
(for (i 9 7)
(let (n (m i))
(println (format "Primes up to %8d %8d" n (sieve n)))))
It's roughly the same as the incrementing algorithm in the nsieve programs on the shootout site. I'm able to run it in around 20 seconds on my machine (which is pretty fast). The PHP implementation runs quite a bit faster than that. That shouldn't be right... PHP isn't really optimized for this kind of work. Any ideas on how to speed this up?[/code]
14 seconds here...
(iMac 2GHz Core Duo)
for comparison, the Ruby version (//http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsieve&lang=ruby&id=1) took 48 seconds.
I don't see any obvious ways to improve it...
Lutz used character vectors and cpymem for his... //http://www.newlisp.org/syntax.cgi?benchmarks/nsieve.newlisp.txt
To really optimise code , you have to use "small iron" cpus to see the problems ;)
I tried nsieve on an old Win98 laptop (48mb ram 120 mhz) machine with PHP 5.2.2, LUA 5.1, and Python 2.5...
-------
->"C:Program Filesphp5php.exe" test.php
Primes up to 10000 1229
Primes up to 0 0
Primes up to 0 0
------
->"C:Lualua5.1.exe" test.lua
Primes up to 40000 4203
Primes up to 20000 2262
Primes up to 10000 1229
------
->C:Python25PYTHON.EXE test.py
Traceback (most recent call last):
File "test.py", line 18, in <module>
nsieve((1 << (int(sys.argv[1]) - k)) * 10000)
IndexError: list index out of range
------
PHP and Lua both loaded and ran in a few seconds... Until hitting memory limits and returning partial results...
Python crashes, I think for not being able to solce the problem due to lack of memory space... Required indentation makes copy and paste Python programming at best a time consuming pain and at worst, a dangerous practice!
Your newLISP code Jeff, "runs forever" at 100% CPU usage, I killed it after 10 minutes... It continually accesses the hard drive to use the hard drive virtual memory... (The functional language/VN disk thrash record on this machine is held by PLT Scheme's Dr. Scheme IDE, which takes almost 50 minutes to load and run!)
So you might be having memory "juggling" issues that don't show up on large memory machines, but requires more time than expected for your code to execute...
Swans and Lisp(s) look gracefully smooth on the surface, while paddling like heck under the surface!
-- xytroxon
Well, look at the numbers. It's making arrays of 512 * 10000 elements. Run it with smaller sizes and multiply the results. The array is used for constant access speeds.
The same with the other languages.
Xytroxon - it ran consistently at 20-21 seconds on my 2.5 ghz quad-core ppc with 4 gigs of ram.
Quote from: "Jeff"
Well, look at the numbers. It's making arrays of 512 * 10000 elements. Run it with smaller sizes and multiply the results. The array is used for constant access speeds.
The same with the other languages.
Xytroxon - it ran consistently at 20-21 seconds on my 2.5 ghz quad-core ppc with 4 gigs of ram.
Backed it down to 1000
D:MinGW - 18:14:04.58 Tue 09-02-2008
->"c:program filesphp5php.exe" test.php
Primes up to 1000 168
Primes up to 0 0
Primes up to 0 0
D:MinGW - 18:14:30.78 Tue 09-02-2008
->nl test.nl
Primes up to 512000 42445
Primes up to 256000 22525
Primes up to 128000 11987
D:MinGW - 18:16:20.52 Tue 09-02-2008
->lua test.lua
"C:Lualua5.1.exe" test.lua
Primes up to 4000 550
Primes up to 2000 303
Primes up to 1000 168
----
The spaces all look right, but Python is still MIA...
-----
While newLISP takes a few minutes to run... It looks like it is doing the calculations on this machine! Amazing!!!
Is thw PHP code the problem? Or is the Zend engine kicking in to help the speed???
This was fun... Timw for supper!!!
(setq sieve (lambda(size)
(let ((flags (array (+ size 1)))
(sqrtsize (sqrt size))
(total 0))
(for (i 2 sqrtsize)
(when (not (flags i))
(for (k (* i i) size i)
(nth-set (flags k) 0))
(inc 'total)))
(for (i (+ sqrtsize 1) size)
(unless (flags i)
(inc 'total)))
total)))
This one is ~ 1.5 times faster, but trick is mathematical, not programming.
I can do it in 11 seconds (as opposed to 14) if I get spawn on the case.
Kasimir's function and spawn get me down to 6 seconds... :)
.. what is it I am doing wrong?
running your snippet on a 1.8GHz single-core Intel compaq/HP laptop with 1GB of memory takes 13 seconds under Linux
Looks like an unsurprising time to me. My 2Ghz Core Duo runs Jeff's code in 14 seconds (presumably without engaging both CPUs). Allowing for the way I'm running scripts (not from the command line but from an editor) that seems to match yours. Jeff's using PPCs rather than Intels - are they a bit slower?
For math and graphics, PPCs should be faster.
Interesting - and puzzling too, that a "2.5 ghz quad-core ppc with 4 gigs of ram" manages 20 seconds whereas a "2 ghz intel core duo with 1 gig of ram" manages 14... Still, the results are in the same general area - not 0.3 seconds, for example.