Strange HikUp in 'difference

newdep · June 01, 2007, 06:51:10 AM

Hi Lutz,

I have something strange in newlisp...

If I use 1 big file which contains 3 lists -> 'one 'two 'three

and I do a (load "my-big-list")

List 'one has random data and is >200.000 entries

List 'two has unique data and is >200.000 entries

List 'three has nested data, random and is >800.000 entries

then I do a (set 'one (difference two one true))

that is normal and is fast...

But! if I do

(load "my-big-list-one") ; list 'one

(load "my-big-list-two") ; list 'two

(load "my-big-list-three") ; list ' three

then I do a (set 'one (difference two one true))

Then that action always takes longer then the example above???

( 1 or 2 seconds longer )

I cant emagine that the way of loading data has effect, this is very odd...

Norman.

newdep · June 01, 2007, 07:09:31 AM

I think its something else...

I run newlisp with "newlisp -m 256" this is fine..

But I also do , just befor i do 'difference, a (unqiue (flat three))

I exclude the fact that 'load has anything to do with this..

. so... below the 800.000 i had a good speed..now above the 800.000 I have a

delayed speed...

Is there a speed issue with 'flat or 'unique regarding the length of lists?

(i dont think so...but im trying to eliminate the problem..)

PS: its all done during execution of a script not from the commandline.

Norman.

Jeff · June 01, 2007, 07:32:52 AM

Well, recursion is typically used in the design of functions like flat, due to the fact that there may be any number of nested levels in the list. With 800,000 entries, it would very much depend on the contents of the list for speed. If it stores string data, it will take a lot longer than symbols, for example. Also, if the list is hugely deep, that can take a long time due to recursion's overhead.

newdep · June 01, 2007, 07:49:34 AM

Yes its a string list but only 1 level deep.. The odd thing is though that the

difference between 200.000 and 700.000 is nihil.

The list contains of 24.000 indicators with all 1 nested list.

The nested content vary's per list but is in total >800.000.

I think the problem started around 800.000 list entry's..

I have enought memory left , newlisp eat currently 136 MB.

..its odd that a difference of 100.000 lists does decrease 2 seconds...

(above +/- 750.000)

Norman.

Lutz · June 01, 2007, 07:50:23 AM

Not sure if the 'unique' before brings any advantage. If you use 'difference' without the flag, it will give you a unique result anyway.

Internally newLISP will sort both lists involved and then do the 'unique' (for the mode without the flag) on-the-go.

It is hard to pin down these kind of differences and it may go into the opposite direction on a different OS, I have seen this in many benchmarks, which look completely different on an Intel architecture versus the PPC architecture. Both of them favouring different patterns or memory allocation/deallocation etc.

Lutz

newdep · June 01, 2007, 07:54:13 AM

Yes you have a point there..

I was monitoring the memory when i started to hear the FAN blow...

Its somehow difficult to see what happens but it looks like its a kernel-memory

allocation issue indeed...

Currently the FAN is quiet again...sometimes the FAN blows for 1/2 second..

Mmmmm funny... ;-)

newdep · June 01, 2007, 11:06:11 AM

yes..jeff, the 'flat is a real bugger currently.. Ill remove it..

any convertion at lists is actualy a delay...

Ill move towards pure list functions now...'ref-all will speed it up..

newdep · June 01, 2007, 11:15:16 AM

My Mounth just hit the ground !!! (Doingggggg)....

Ref-all is at least 1000 times faster than 'flat or 'string on a nested list... ;-)

newLISP Fan Club

News:

Strange HikUp in 'difference

newdep

newdep

Jeff

newdep

Lutz

newdep

newdep

newdep