Benchmarking

Started by pjot, July 30, 2009, 02:02:46 AM

Previous topic - Next topic

pjot

newLisp gurus,



What would be the best code, if possible one-liner, to benchmark the performance of newLisp?



Greetings

Peter

newdep

#1
I would guess a manipulation of a big list...

sort, lookup, replace, setf



Btw.. benchmarking against what, actually?



A one-liner I leave for Lutz ;-)
-- (define? (Cornflakes))

pjot

#2
OK my question was not specific enough. :-)



So let me rephrase: what would be the best portable code, if possible a one-liner, to benchmark the performance of newLisp?



The idea is to compare the performance of newLisp with other languages.

newdep

#3
(dotimes (x 1000000) (push x buffer -1))



or a for loop



(for (x 1 100000 2) (push x buffer))



If the other language doesn't have lists, then you could

concatenate a string.

Output to screen is not a real performance test,

..so make that silent in newLisp ;-)
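
As a rough sketch of how one might time such expressions (my own addition, not from the thread): newLISP's built-in time function evaluates an expression, optionally many times, and returns the elapsed milliseconds, so no screen output is needed during the measured run:

```lisp
; time one million pushes to the end of a list (numbers are machine-dependent)
(set 'buffer '())
(println (time (dotimes (x 1000000) (push x buffer -1))) " ms")

; time sorting 100,000 shuffled integers
(println (time (sort (randomize (sequence 1 100000)))) " ms")
```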
-- (define? (Cornflakes))

Lutz

#4
Quote
if possible one-liner, to benchmark the performance of newLisp?

short answer

============



sorry, there is no such thing.



long answer

===========



For any one-liner, you will get a language ranking which doesn't say anything about the language. And if you change the hardware it runs on, or even just the OS, the results are turned on their head.



Even benchmark collections like this:



http://www.newlisp.org/benchmarks/



can bring completely different results when changing the platform or OS.



In the source distribution you find a file qa-bench, which measures performance for most of the built-in functions and consolidates the results into one performance index. This index is calibrated as 1.0 on Mac OS X on a Mac Mini, 1.83 GHz, with 1 GB of memory.



You run it like this:


~> newlisp qa-bench
2363 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)
~>


or like this:


~> newlisp qa-bench report
!=               9 ms
$                9 ms
%                9 ms
&                9 ms
*                9 ms
+                9 ms
-                9 ms
/                9 ms
<                9 ms
<<               9 ms
<=              10 ms
>=               9 ms
>>               9 ms
NaN?             9 ms
^                9 ms
abs              8 ms
acos             8 ms
acosh            9 ms
add              9 ms
address          9 ms
amb              9 ms
and             10 ms
append           9 ms
apply           10 ms
args             9 ms
array           10 ms
array-list      10 ms
array?          10 ms
asin             9 ms
asinh            9 ms
assoc            9 ms
atan             9 ms
atan2           10 ms
atanh            8 ms
atom?            9 ms
base64-dec      10 ms
base64-enc      10 ms
bayes-query      8 ms
bayes-train      9 ms
begin            9 ms
beta            10 ms
betai            9 ms
bind            10 ms
binomial         9 ms
bits            10 ms
case             9 ms
catch           10 ms
ceil             8 ms
char            10 ms
chop            10 ms
clean           10 ms
cond             9 ms
cons             9 ms
constant         9 ms
context          9 ms
context?         8 ms
copy             9 ms
cos              8 ms
cosh             8 ms
count           10 ms
cpymem           9 ms
crc32           11 ms
crit-chi2       10 ms
crit-z          11 ms
curry            9 ms
date            12 ms
date-value      10 ms
debug           10 ms
dec             10 ms
def-new         11 ms
default         10 ms
define           9 ms
define-macro     9 ms
delete          11 ms
det              9 ms
difference      10 ms
div              9 ms
do-until         9 ms
do-while        10 ms
doargs           9 ms
dolist          10 ms
dostring        11 ms
dotimes         10 ms
dotree          11 ms
dump             9 ms
dup             10 ms
empty?          10 ms
encrypt          9 ms
ends-with       11 ms
env             12 ms
erf              9 ms
error-event      9 ms
eval            10 ms
eval-string     11 ms
exists          10 ms
exp              8 ms
expand           9 ms
explode         10 ms
factor          10 ms
fft              9 ms
filter          10 ms
find            11 ms
find-all        10 ms
first           10 ms
flat             9 ms
float            9 ms
float?           8 ms
floor           15 ms
flt              9 ms
for             10 ms
for-all         10 ms
format          11 ms
fv               9 ms
gammai           9 ms
gammaln          8 ms
gcd              9 ms
get-char         9 ms
get-float        9 ms
get-int         10 ms
get-long         9 ms
get-string       9 ms
global           8 ms
global?          9 ms
if              11 ms
if-not           8 ms
ifft             9 ms
import           8 ms
inc             11 ms
index            9 ms
int              9 ms
integer?         8 ms
intersect       10 ms
invert          10 ms
irr             10 ms
join             9 ms
lambda?          8 ms
last            11 ms
last-error       9 ms
legal?          10 ms
length          13 ms
let             10 ms
letex           10 ms
letn             9 ms
list            12 ms
list?            9 ms
local            9 ms
log              8 ms
lookup          10 ms
lower-case      10 ms
macro?           8 ms
main-args        9 ms
map             11 ms
mat             10 ms
match            9 ms
max             10 ms
member          10 ms
min             10 ms
mod              9 ms
mul              9 ms
multiply        15 ms
name             9 ms
new             12 ms
nil?            10 ms
normal          15 ms
not             10 ms
now             10 ms
nper             9 ms
npv             10 ms
nth              9 ms
null?           10 ms
number?          9 ms
or              10 ms
pack            10 ms
parse           10 ms
pmt              9 ms
pop             10 ms
pop-assoc       10 ms
pow              9 ms
pretty-print     9 ms
primitive?       8 ms
prob-chi2        9 ms
prob-z           8 ms
protected?       9 ms
push             9 ms
pv               9 ms
quote            8 ms
quote?           8 ms
rand             9 ms
random           9 ms
randomize        9 ms
read-expr       10 ms
ref              9 ms
ref-all          9 ms
regex           11 ms
regex-comp       9 ms
replace         10 ms
rest            10 ms
reverse         10 ms
rotate          10 ms
round           10 ms
seed             9 ms
select          11 ms
sequence        10 ms
series          11 ms
set              9 ms
set-locale      11 ms
set-ref         10 ms
set-ref-all     10 ms
setf            10 ms
setq            11 ms
sgn              9 ms
sin              8 ms
sinh             9 ms
slice           11 ms
sort            10 ms
source          11 ms
sqrt             8 ms
starts-with     10 ms
string          11 ms
string?          9 ms
sub              9 ms
swap            11 ms
sym              9 ms
symbol?          9 ms
symbols         13 ms
sys-error       11 ms
sys-info        10 ms
tan              8 ms
tanh             8 ms
throw           11 ms
throw-error     10 ms
time             8 ms
time-of-day      9 ms
title-case      11 ms
transpose       10 ms
trim             9 ms
true?            9 ms
unify           10 ms
unique          10 ms
unless           9 ms
unpack          10 ms
until            9 ms
upper-case       9 ms
uuid            10 ms
when             9 ms
while           10 ms
write-buffer    10 ms
write-line       9 ms
xml-parse       10 ms
xml-type-tags    9 ms
zero?           11 ms
|                9 ms
~               10 ms
2443 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)


You can also specify the repetition count for more precise results:


~> newlisp qa-bench report 10
!=              98 ms
$               91 ms
...
...
xml-parse      100 ms
xml-type-tags   93 ms
zero?           94 ms
|               93 ms
~               98 ms
24792 ms
performance ratio: 1.00 (1.0 on Mac OS X, 1.83 GHz Intel Core 2 Duo)


Although originally calibrated for equal time in each function, you already see differences when repeating the benchmark, because the environment on a time-sharing OS changes constantly.



Running this under Linux on the same CPU completely changes the picture: some functions suddenly run twice as fast or twice as slow.

newdep

#5
Btw.. one benchmark is for sure: I think newLisp is the fastest Lisp on the planet, so we can skip that part of the ranking ;-)
-- (define? (Cornflakes))

DrDave

#6
I'm wondering why floor runs about 2x slower than ceil.



DrDave
...it is better to first strive for clarity and correctness and to make programs efficient only if really needed.

"Getting Started with Erlang" version 5.6.2

pjot

#7
Quote
And if you change the hardware it runs on, or even just the OS, the results are turned on their head.

I am running all benchmarks on the same system in the same OS.
Quote
Running this under Linux on the same CPU completely changes the picture: some functions suddenly run twice as fast or twice as slow.

Good remark. This means that different benchmarks should run for a longer time, like 15 or 30 minutes.



So maybe we have to look at it the other way around: instead of running a program and seeing how long it takes to complete, run a program for a fixed amount of time, and then see how many actions were performed.

newdep

#8
Quote
So maybe we have to look at it the other way around: instead of running a program and seeing how long it takes to complete, run a program for a fixed amount of time, and then see how many actions were performed.


...Uuuhhhh...



If I know that a text file of 2 GB contains 100000 vowels,

...both programs come to the same result finally...



So where is the advantage of doing this without measuring against a competitive target... like time?



PS: The number of actions does not always reflect a faster or more efficient result.
-- (define? (Cornflakes))

pjot

#9
Go tease some sheep, you compleat fan! ;-)



But the idea is not so difficult, is it? Suppose we test the (add) function: let's run a newLisp program that continuously adds 0.1 starting from 0, and let it run for 5 minutes.



Now, let's do the same thing in another language.



After those 5 minutes, we can see which value was reached, right?



Suppose newLisp reached 1000 and some other language X reached 500; may we then safely conclude that newLisp is faster when it comes to adding floats? If we take newLisp as the reference, it means language X is 50% slower?
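
A minimal sketch of this fixed-time idea in newLISP (my own illustration, shortened to 5 seconds; it assumes time-of-day, which returns milliseconds since midnight, so the run should not cross midnight):

```lisp
; add 0.1 repeatedly for 5 seconds, then report the value reached
(set 't 0.0)
(set 'end (+ (time-of-day) 5000))
(while (< (time-of-day) end)
    (set 't (add t 0.1)))
(println "Reached: " t)
```

The value printed at the end is the "how many actions" measure: a faster implementation reaches a higher number in the same wall-clock time.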

pjot

#10
So let me give an example. This compiled BASIC program runs for 10 seconds, adding 0.0001 to a variable.



DECLARE t TYPE double

t = 0

start = SECOND(NOW)
end = start + 10

WHILE SECOND(NOW) NE end DO
    t = t + 0.0001
WEND

PRINT "Result is: ", t

END


Now, the equivalent of this BASIC program in newLisp looks like this (correct me if it can be implemented more efficiently):



(set 't 0.0)

(set 'start (apply date-value (now)))

(set 'end (+ start 10))

(while (not (= (apply date-value (now)) end))
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)


When I run the compiled BASIC program, the result is:


Quote
peter@solarstriker:~/programming$ ./benchmark

Result is: 574.7542999


When I run the newLisp program, the result is:


Quote
peter@solarstriker:~/programming$ newlisp benchmark.lsp

Result is: 373.0229


Both programs run on the same machine in the same operating system, and to me the results indicate that the BASIC compiler is faster. Again, maybe the newLisp program can be optimized? What do you folks say about it?



Peter

Lutz

#11
If this is compiled Basic (it seems to be, judging from the type declarations), then it looks pretty good for newLISP.



But still comparing compiled vs dynamic languages is comparing apples and oranges.



It is also not clear what this example really measures. Probably not floating point addition but rather internal time functions, or both.



By just changing the way time is measured, newLISP becomes twice as fast, doing more than double the floating-point additions of before, and beats the compiled BASIC:


(set 't 0.0)
(set 'start (time-of-day))
(set 'end (+ start 10000))

(while (not (= (time-of-day) end))
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)

Result is: 925.3656998 ; versus 412.2727 using 'date-value' on Mac Mini 1.83 GHz


But my main point is that languages should not be compared by testing just one or two things, in this case floating-point addition and retrieval of the system time.



The net is full of this type of toy comparison doing just some little thing. They make for lots of hits on a blog post but really don't say anything about the programming languages involved.



The best way to benchmark is either to measure lots of well-defined specific operations (similar to what qa-bench does) or to measure well-defined real-world tasks, big enough to exercise a broader part of the language's function repertoire.
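
A toy illustration of the first approach (my own sketch; qa-bench itself is far more thorough): time several built-in functions with a fixed repetition count and print a small table, using newLISP's time function:

```lisp
; evaluate each expression 1000 times and report the elapsed milliseconds
(dolist (entry '(("sort"    (sort (sequence 1 1000)))
                 ("reverse" (reverse (sequence 1 1000)))
                 ("explode" (explode (dup "x" 1000)))))
    (println (first entry) "\t" (time (eval (last entry)) 1000) " ms"))
```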

pjot

#12
Quote
If this is compiled Basic, then it looks pretty good for newLISP.

It is compiled BASIC all right and indeed, newLisp runs very well!!


Quote
But still comparing compiled vs dynamic languages is comparing apples and oranges.

In this case I am particularly interested in newLisp versus any compiled language: I want to see how well newLisp performs compared with a compiled binary. One of the traditional objections against interpreted languages is that they are slow. I have already observed very good performance with newLisp programs, but how well does newLisp really perform?


Quote
It is also not clear what this example really measures. Probably not floating point addition but rather internal time functions, or both.

Fully agreed. This will always be a problem with benchmarks. Maybe we should say: a similar program with the exact same functionality.


Quote
But my main point is that languages should not be compared by testing just one or two things, in this case floating-point addition and retrieval of the system time.

Obviously not! This was just an example; I was already thinking of multiple tests.

In the end one will never get the exact performance. Nevertheless, some sort of global indication is sufficient for me.

Your code indeed improves the performance tremendously. If I also improve the BASIC code in a similar way, with compiler optimizations (-fnative), then these are the results:



DECLARE t TYPE double

t = 0

end = NOW + 10

WHILE NOW < end DO
    t = t + 0.0001
WEND

PRINT "Result is: ", t


Result is: 10543.5752



#!/bin/newlisp

(set 't 0.0)
(set 'end (+ (time-of-day) 10000))

(while (< (time-of-day) end)
    (set 't (add t 0.0001))
)

(println "Result is: " t)
(exit)


Result is: 563.1762999



So newLisp runs 94.66% slower compared to the compiled BASIC binary with the same functionality.



Again, admittedly the actual test is blurry, so I will run more tests to see the difference. The performance on lists, for example, will be much better than similar functionality in BASIC (arrays?). Probably there are more typically Lisp-like areas where even a BASIC compiler will be beaten.



Peter

Lutz

#13
A good native compiler will always be at least 50 times faster than an interpreted language. Here are some interesting comparisons for Fibonacci:



http://dada.perl.it/shootout/fibo.html



and here for other algorithms showing different rankings:



http://dada.perl.it/shootout



It is interesting to see how well JIT (Just In Time) compilation is doing for Java on number crunching tasks.

cormullion

#14
I thought I'd start adding to the list you started, Lutz:


2282 ms         ; pr: 0.9 ; Mac OS X 2.0 GHz Intel Core 2 Duo
1658 ms         ; pr: 0.7 ; FreeBSD at NFSHOST (no idea what CPU - ?)


I'll add more results for my motley collection of computers when I get a Roundtoit (http://www.kidsturncentral.com/roundtoit.htm). I expect to hit the 5-second mark later... :)