Benchmarking

Started by pjot, July 30, 2009, 02:02:46 AM

Previous topic - Next topic

pjot

newLisp gurus,



What would be the best code, if possible one-liner, to benchmark the performance of newLisp?



Greetings

Peter

newdep

#1
I would guess a manipulation of a big list...

sort, lookup, replace, setf



Btw.. benchmarking against what, actually?



A one-liner I leave for Lutz ;-)
-- (define? (Cornflakes))

pjot

#2
OK my question was not specific enough. :-)



So let me rephrase: what would be the best portable code, if possible a one-liner, to benchmark the performance of newLisp?



The idea is to compare the performance of newLisp with other languages.

newdep

#3
(dotimes (x 1000000) (push x buffer -1))



or a for loop



(for (x 1 100000 2) (push x buffer))



If the other language doesn't have lists, then you could

concatenate a string.

Output to screen is not a real performance test,

..so make that silent in newLisp ;-)
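
As a rough sketch of how one might time such expressions (my own addition, not from the thread): newLISP's built-in time function evaluates an expression, optionally many times, and returns the elapsed milliseconds, so no screen output is needed during the measured run:

```lisp
; time one million pushes to the end of a list (numbers are machine-dependent)
(set 'buffer '())
(println (time (dotimes (x 1000000) (push x buffer -1))) " ms")

; time sorting 100,000 shuffled integers
(println (time (sort (randomize (sequence 1 100000)))) " ms")
```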
-- (define? (Cornflakes))

Lutz

#4
Quote
if possible one-liner, to benchmark the performance of newLisp?

short answer

============



sorry, there is no such thing.



long answer

===========



For any one-liner, you will get a language ranking which doesn't say anything about the language. And if you change the hardware it runs on, or even just the OS, the results are turned on their head.



Even benchmark collections like this:



http://www.newlisp.org/benchmarks/



can bring completely different results when changing the platform or OS.



In the source distribution you find a file qa-bench, which measures performance for most of the built-in functions and consolidates the results into one performance index. This index is calibrated as 1.0 on Mac OS X on a Mac Mini, 1.83 GHz, with 1 GB of memory.



You run it like this:


~> newlisp qa-bench
2363 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)
~>


or like this:


~> newlisp qa-bench report
!=               9 ms
$                9 ms
%                9 ms
&                9 ms
*                9 ms
+                9 ms
-                9 ms
/                9 ms
<                9 ms
<<               9 ms
<=              10 ms
>=               9 ms
>>               9 ms
NaN?             9 ms
^                9 ms
abs              8 ms
acos             8 ms
acosh            9 ms
add              9 ms
address          9 ms
amb              9 ms
and             10 ms
append           9 ms
apply           10 ms
args             9 ms
array           10 ms
array-list      10 ms
array?          10 ms
asin             9 ms
asinh            9 ms
assoc            9 ms
atan             9 ms
atan2           10 ms
atanh            8 ms
atom?            9 ms
base64-dec      10 ms
base64-enc      10 ms
bayes-query      8 ms
bayes-train      9 ms
begin            9 ms
beta            10 ms
betai            9 ms
bind            10 ms
binomial         9 ms
bits            10 ms
case             9 ms
catch           10 ms
ceil             8 ms
char            10 ms
chop            10 ms
clean           10 ms
cond             9 ms
cons             9 ms
constant         9 ms
context          9 ms
context?         8 ms
copy             9 ms
cos              8 ms
cosh             8 ms
count           10 ms
cpymem           9 ms
crc32           11 ms
crit-chi2       10 ms
crit-z          11 ms
curry            9 ms
date            12 ms
date-value      10 ms
debug           10 ms
dec             10 ms
def-new         11 ms
default         10 ms
define           9 ms
define-macro     9 ms
delete          11 ms
det              9 ms
difference      10 ms
div              9 ms
do-until         9 ms
do-while        10 ms
doargs           9 ms
dolist          10 ms
dostring        11 ms
dotimes         10 ms
dotree          11 ms
dump             9 ms
dup             10 ms
empty?          10 ms
encrypt          9 ms
ends-with       11 ms
env             12 ms
erf              9 ms
error-event      9 ms
eval            10 ms
eval-string     11 ms
exists          10 ms
exp              8 ms
expand           9 ms
explode         10 ms
factor          10 ms
fft              9 ms
filter          10 ms
find            11 ms
find-all        10 ms
first           10 ms
flat             9 ms
float            9 ms
float?           8 ms
floor           15 ms
flt              9 ms
for             10 ms
for-all         10 ms
format          11 ms
fv               9 ms
gammai           9 ms
gammaln          8 ms
gcd              9 ms
get-char         9 ms
get-float        9 ms
get-int         10 ms
get-long         9 ms
get-string       9 ms
global           8 ms
global?          9 ms
if              11 ms
if-not           8 ms
ifft             9 ms
import           8 ms
inc             11 ms
index            9 ms
int              9 ms
integer?         8 ms
intersect       10 ms
invert          10 ms
irr             10 ms
join             9 ms
lambda?          8 ms
last            11 ms
last-error       9 ms
legal?          10 ms
length          13 ms
let             10 ms
letex           10 ms
letn             9 ms
list            12 ms
list?            9 ms
local            9 ms
log              8 ms
lookup          10 ms
lower-case      10 ms
macro?           8 ms
main-args        9 ms
map             11 ms
mat             10 ms
match            9 ms
max             10 ms
member          10 ms
min             10 ms
mod              9 ms
mul              9 ms
multiply        15 ms
name             9 ms
new             12 ms
nil?            10 ms
normal          15 ms
not             10 ms
now             10 ms
nper             9 ms
npv             10 ms
nth              9 ms
null?           10 ms
number?          9 ms
or              10 ms
pack            10 ms
parse           10 ms
pmt              9 ms
pop             10 ms
pop-assoc       10 ms
pow              9 ms
pretty-print     9 ms
primitive?       8 ms
prob-chi2        9 ms
prob-z           8 ms
protected?       9 ms
push             9 ms
pv               9 ms
quote            8 ms
quote?           8 ms
rand             9 ms
random           9 ms
randomize        9 ms
read-expr       10 ms
ref              9 ms
ref-all          9 ms
regex           11 ms
regex-comp       9 ms
replace         10 ms
rest            10 ms
reverse         10 ms
rotate          10 ms
round           10 ms
seed             9 ms
select          11 ms
sequence        10 ms
series          11 ms
set              9 ms
set-locale      11 ms
set-ref         10 ms
set-ref-all     10 ms
setf            10 ms
setq            11 ms
sgn              9 ms
sin              8 ms
sinh             9 ms
slice           11 ms
sort            10 ms
source          11 ms
sqrt             8 ms
starts-with     10 ms
string          11 ms
string?          9 ms
sub              9 ms
swap            11 ms
sym              9 ms
symbol?          9 ms
symbols         13 ms
sys-error       11 ms
sys-info        10 ms
tan              8 ms
tanh             8 ms
throw           11 ms
throw-error     10 ms
time             8 ms
time-of-day      9 ms
title-case      11 ms
transpose       10 ms
trim             9 ms
true?            9 ms
unify           10 ms
unique          10 ms
unless           9 ms
unpack          10 ms
until            9 ms
upper-case       9 ms
uuid            10 ms
when             9 ms
while           10 ms
write-buffer    10 ms
write-line       9 ms
xml-parse       10 ms
xml-type-tags    9 ms
zero?           11 ms
|                9 ms
~               10 ms
2443 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)


You can also specify the repetition count for more precise results:


~> newlisp qa-bench report 10
!=              98 ms
$               91 ms
...
...
xml-parse      100 ms
xml-type-tags   93 ms
zero?           94 ms
|               93 ms
~               98 ms
24792 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)


Although originally calibrated for equal time in each function, you already see differences when repeating the benchmark, because the environment on a time-sharing OS changes constantly.



Running this under Linux on the same CPU completely changes the picture: some functions suddenly run twice as fast or twice as slow.

newdep

#5
Btw.. one benchmark is for sure: I think newLisp is the fastest Lisp on the planet, so we can skip that part of the ranking ;-)
-- (define? (Cornflakes))

DrDave

#6
I'm wondering why floor runs about 2x slower than ceil.



DrDave
...it is better to first strive for clarity and correctness and to make programs efficient only if really needed.

"Getting Started with Erlang" version 5.6.2

pjot

#7
Quote
And if you change the hardware it runs on, or even just the OS, the results are turned on their head.

I am running all benchmarks on the same system in the same OS.
Quote
Running this under Linux on the same CPU completely changes the picture: some functions suddenly run twice as fast or twice as slow.

Good remark. This means that different benchmarks should run for a longer time, like 15 or 30 minutes.



So maybe we have to look at it the other way around: instead of running a program and seeing how long it takes to complete, run a program for a fixed amount of time, and then see how many actions were performed.

newdep

#8
Quote
So maybe we have to look at it the other way around: instead of running a program and seeing how long it takes to complete, run a program for a fixed amount of time, and then see how many actions were performed.


...Uuuhhhh...



If I know that a text file of 2 GB contains 100000 vowels,

...both programs come to the same result finally...



So where is the advantage of doing this without measuring against a competitive target... like time?



PS: The number of actions does not always reflect a faster or more efficient result.
-- (define? (Cornflakes))

pjot

#9
Go tease some sheep, you compleat fan! ;-)



But the idea is not so difficult, is it? Suppose we test the (add) function: let's run a newLisp program that continuously adds 0.1 starting from 0, and let it run for 5 minutes.



Now, let's do the same thing in another language.



After those 5 minutes, we can see which value was reached, right?



Suppose newLisp reached 1000 and some other language X reached 500; may we then safely conclude that newLisp is faster when it comes to adding floats? If we take newLisp as the reference, it means language X is 50% slower?
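
A minimal sketch of this fixed-time idea in newLISP (my own illustration, shortened to 5 seconds; it assumes time-of-day, which returns milliseconds since midnight, so the run should not cross midnight):

```lisp
; add 0.1 repeatedly for 5 seconds, then report the value reached
(set 't 0.0)
(set 'end (+ (time-of-day) 5000))
(while (< (time-of-day) end)
    (set 't (add t 0.1)))
(println "Reached: " t)
```

The value printed at the end is the "how many actions" measure: a faster implementation reaches a higher number in the same wall-clock time.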

pjot

#10
So let me give an example. This compiled BASIC program runs for 10 seconds, adding 0.0001 to a variable.



DECLARE t TYPE double

t = 0

start = SECOND(NOW)
end = start + 10

WHILE SECOND(NOW) NE end DO
    t = t + 0.0001
WEND

PRINT "Result is: ", t

END


Now, the equivalent of this BASIC program in newLisp looks like this (correct me if it can be implemented more efficiently):



(set 't 0.0)

(set 'start (apply date-value (now)))

(set 'end (+ start 10))

(while (not (= (apply date-value (now)) end))
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)


When I run the compiled BASIC program, the result is:


Quote
peter@solarstriker:~/programming$ ./benchmark

Result is: 574.7542999


When I run the newLisp program, the result is:


Quote
peter@solarstriker:~/programming$ newlisp benchmark.lsp

Result is: 373.0229


Both programs run on the same machine in the same operating system, and to me the results indicate that the BASIC compiler is faster. Again, maybe the newLisp program can be optimized? What do you folks say about it?



Peter

Lutz

#11
If this is compiled Basic (it seems to be, judging from the type declarations), then it looks pretty good for newLISP.



But still comparing compiled vs dynamic languages is comparing apples and oranges.



It is also not clear what this example really measures. Probably not floating point addition but rather internal time functions, or both.



By just changing the way time is measured, newLISP becomes twice as fast, doing more than double the floating-point additions of before, and beats the compiled BASIC:


(set 't 0.0)
(set 'start (time-of-day))
(set 'end (+ start 10000))

(while (not (= (time-of-day) end))
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)

Result is: 925.3656998 ; versus 412.2727 using 'date-value' on Mac Mini 1.83 GHz


But my main point is that languages should not be compared by testing just one or two things, in this case floating-point addition and retrieval of the system time.



The net is full of this type of toy comparison doing just some little thing. They make for lots of hits on a blog post but really don't say anything about the programming languages involved.



The best way to benchmark is either to measure lots of well-defined specific operations (similar to what qa-bench does) or to measure well-defined real-world tasks, big enough to exercise a broader part of the language's function repertoire.
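
A toy illustration of the first approach (my own sketch; qa-bench itself is far more thorough): time several built-in functions with a fixed repetition count and print a small table, using newLISP's time function:

```lisp
; evaluate each expression 1000 times and report the elapsed milliseconds
(dolist (entry '(("sort"    (sort (sequence 1 1000)))
                 ("reverse" (reverse (sequence 1 1000)))
                 ("explode" (explode (dup "x" 1000)))))
    (println (first entry) "\t" (time (eval (last entry)) 1000) " ms"))
```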

pjot

#12
Quote
If this is compiled Basic, then it looks pretty good for newLISP.

It is compiled BASIC all right and indeed, newLisp runs very well!!


Quote
But still comparing compiled vs dynamic languages is comparing apples and oranges.

In this case I am particularly interested in newLisp versus any compiled language: I want to see how well newLisp performs compared with a compiled binary. One of the traditional objections against interpreted languages is that they are slow. I have already observed very good performance with newLisp programs, but how well does newLisp really perform?


Quote
It is also not clear what this example really measures. Probably not floating point addition but rather internal time functions, or both.

Fully agreed. This will always be a problem with benchmarks. Maybe we should say: a similar program with the exact same functionality.


Quote
But my main point is that languages should not be compared by testing just one or two things, in this case floating-point addition and retrieval of the system time.

Obviously not! This was just an example; I was already thinking of multiple tests.

In the end one will never get the exact performance. Nevertheless, some sort of global indication is sufficient for me.

Your code indeed improves the performance tremendously. If I also improve the BASIC code in a similar way, with compiler optimizations (-fnative), then these are the results:



DECLARE t TYPE double

t = 0

end = NOW + 10

WHILE NOW < end DO
    t = t + 0.0001
WEND

PRINT "Result is: ", t


Result is: 10543.5752



#!/bin/newlisp

(set 't 0.0)
(set 'end (+ (time-of-day) 10000))

(while (< (time-of-day) end)
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)


Result is: 563.1762999



So newLisp runs 94.66% slower compared to the compiled BASIC binary with the same functionality.



Again, admittedly the actual test is blurry, so I will run more tests to see the difference. The performance on lists, for example, will be much better than similar functionality in BASIC (arrays?). Probably there are more typically Lisp-like areas where even a BASIC compiler will be beaten.



Peter

Lutz

#13
A good native compiler will always be at least 50 times faster than an interpreted language. Here are some interesting comparisons for Fibonacci:



http://dada.perl.it/shootout/fibo.html



and here for other algorithms showing different rankings:



http://dada.perl.it/shootout



It is interesting to see how well JIT (Just In Time) compilation is doing for Java on number crunching tasks.

cormullion

#14
I thought I'd start adding to the list you started, Lutz:


2282 ms         ; pr: 0.9 ; Mac OS X 2.0 GHz Intel Core 2 Duo
1658 ms         ; pr: 0.7 ; FreeBSD at NFSHOST (no idea what CPU - ?)


I'll add more results for my motley collection of computers when I get a Roundtoit (http://www.kidsturncentral.com/roundtoit.htm). I expect to hit the 5-second mark later... :)