Strange HikUp in 'difference

Started by newdep, June 01, 2007, 06:51:10 AM

Previous topic - Next topic

newdep

Hi Lutz,



I have something strange in newlisp...





If I use 1 big file which contains 3 lists -> 'one 'two 'three

and I do a (load "my-big-list")



List 'one has random data and is >200.000 entries

List 'two has unique data and is >200.000 entries

List 'three has nested data, random and is >800.000 entries



then I do a (set 'one (difference two one true))

that is normal and is fast...







But! if I do



(load "my-big-list-one") ; list 'one

(load "my-big-list-two") ; list 'two

(load "my-big-list-three") ; list ' three



then I do a (set 'one (difference two one true))



Then that action always takes longer then the example above???

( 1 or 2 seconds longer )





I cant emagine that the way of loading data has effect, this is very odd...



Norman.
-- (define? (Cornflakes))

newdep

#1
I think its something else...



I run newlisp with "newlisp -m 256" this is fine..



But I also do , just befor i do 'difference, a (unqiue (flat three))



I exclude the fact that 'load has anything to do with this..



. so... below the 800.000 i had a good speed..now above the 800.000 I have a

delayed speed...



Is there a speed issue with 'flat or 'unique regarding the length of lists?

(i dont think so...but im trying to eliminate the problem..)





PS: its all done during execution of a script not from the commandline.





Norman.
-- (define? (Cornflakes))

Jeff

#2
Well, recursion is typically used in the design of functions like flat, due to the fact that there may be any number of nested levels in the list.  With 800,000 entries, it would very much depend on the contents of the list for speed.  If it stores string data, it will take a lot longer than symbols, for example.  Also, if the list is hugely deep, that can take a long time due to recursion's overhead.
Jeff

=====

Old programmers don\'t die. They just parse on...



http://artfulcode.net\">Artful code

newdep

#3
Yes its a string list but only 1 level deep.. The odd thing is though that the

difference between 200.000 and 700.000 is nihil.



The list contains of 24.000 indicators with all 1 nested list.

The nested content vary's per list but is in total >800.000.



I think the problem started around 800.000 list entry's..



I have enought memory left , newlisp eat currently 136 MB.



..its odd that a difference of 100.000 lists does decrease 2 seconds...

(above +/- 750.000)



Norman.
-- (define? (Cornflakes))

Lutz

#4
Not sure if the 'unique' before brings any advantage. If you use 'difference' without the flag, it will give you a unique result anyway.



Internally newLISP will sort both lists involved and then do the 'unique' (for the mode without the flag) on-the-go.



It is hard to pin down these kind of differences and it may go into the opposite direction on a different OS, I have seen this in many benchmarks, which look completely different on an Intel architecture versus the PPC architecture. Both of them favouring different patterns or memory allocation/deallocation etc.

Lutz

newdep

#5
Yes you have a point there..



I was monitoring the memory when i started to hear the FAN blow...



Its somehow difficult to see what happens but it looks like its a kernel-memory

allocation issue indeed...



Currently the FAN is quiet again...sometimes the FAN blows for 1/2 second..



Mmmmm funny... ;-)
-- (define? (Cornflakes))

newdep

#6
yes..jeff, the 'flat is a real bugger currently.. Ill remove it..



any convertion at lists is actualy a delay...



Ill move towards pure list functions now...'ref-all will speed it up..
-- (define? (Cornflakes))

newdep

#7
My Mounth just hit the ground !!! (Doingggggg)....



Ref-all is at least 1000 times faster than 'flat or 'string on a nested list... ;-)
-- (define? (Cornflakes))