(length) function speed

Started by Dmi, July 19, 2005, 02:56:45 PM

Previous topic - Next topic

Dmi

Does "length" - function for a list pre-knows it's size, or it iterates through all list elements?

In another words - is length function quite fast?
WBR, Dmi

Sammo

#1
On my WIN2K SP4 1.4GHz laptop, (length) seems to execute in a fixed amount of time (about 54 microseconds) regardless of list size. I executed



;long list

(silent (set 'a (sequence 0 1000000)))

(time (dotimes (x 10000000)) (length a))



;short list

(silent (set 'a (sequence 0 100)))

(time (dotimes (x 10000000)) (length a))



and got the same answer (540) from both.



-- Sam

Dmi

#2
Thanks!
WBR, Dmi

Sammo

#3
Oops! Since (time) returns milliseconds, the time computed for each invocation of (length) is actually 540 nanoseconds instead of 540 microseconds. I apologize for the error.

nigelbrown

#4
The code I think is for for length in nl-liststr.c  (newlisp-8.6.0 source)

seems to suggest list lengths are counted (but fast!)

--snip

   case CELL_EXPRESSION:

   case CELL_LAMBDA:

   case CELL_MACRO:

      list = (CELL *)params->contents;

      while(list != nilCell)

         {

         ++length;

         list = list->next;

         }

----snip



function is:



CELL * p_length(CELL * params)

{

UINT length;

SYMBOL * symbol;

CELL * list;



params = evaluateExpression(params);

length = 0;

switch(params->type)

   {

   case CELL_NUMBER:

      length = sizeof(UINT); break;

   case CELL_FLOAT:

      length = sizeof(double); break;

   case CELL_STRING:

      length = params->aux - 1; break;

   case CELL_CONTEXT:

   case CELL_SYMBOL:

      symbol = (SYMBOL *)params->contents;

      length = strlen(symbol->name);

      break;

   case CELL_DYN_SYMBOL:

      length = strlen((char *)params->contents);

      break;

   case CELL_EXPRESSION:

   case CELL_LAMBDA:

   case CELL_MACRO:

      list = (CELL *)params->contents;

      while(list != nilCell)

         {

         ++length;

         list = list->next;

         }

      break;

   case CELL_ARRAY:

      length = (params->aux - 1) / sizeof(UINT);

   default:

      break;

   }

return(stuffInteger(length));

}



Nigel



PS my timings for two lists is



> (time (dotimes (x 1000000) (length b)))

3225

> (time (dotimes (x 1000000) (length a)))

210

> (length a)

3

> (length b)

1000

>

Dmi

#5
Hm... strange. I see, that it really counts the length.

But (time (dotimes (x 10000000)) (length a)) returns the same value for any length of a.

I ran the same tests as Sammo, for length of 3, 100 and 1M.

I got 593 from any of tests. (Cel. P4 2400, Linux 2.6, newLisp 8.6)



I think, that means ONE "dotimes" iteraion and ONE  function call eats MUCH MUCH more than ONE counting of list of any length.



I think, for interpreter that could be normal. Also, possible C-code for list->next iteration have a very good optimisation on P4.



Also, checking code internals, I still don't understand the sense of returning last element of list instead of nil in indexing functions when the index is beyond the list length. The code HAVE nil-like element at the end of list and that element is ALWAYS checked in iterations.

So internal realisation has nothing against that.

Yes, nil is not synonym of "empty list", but it's still synonym for "empty symbol" and "not existing symbol".

So '() CAN safely be equivalent to '(nil), or '(nil,nil,nil,...)



And that have a real sense for programming: if I got list with undefined length (say, from broken stream), and I have to check n-th element - I always will got IT'S value or nil, but not the value of any of previous elements.



But can anybody give me the real example of usefulness of current behavior of list indexing while beyond the list length?
WBR, Dmi

HPW

#6
You and Sam used the wrong time-function.



(time (dotimes (x 10000000)) (length a))



Nigel used the right one:



(time (dotimes (x 1000000) (length a)))



Note the position of the paranthesis.

In your function you run an empty loop.



The function time does not complain about a second/more parameter.

They are also not evaluated.
Hans-Peter

Dmi

#7
:-)))

But how beautiful was theory constructed :-

Thanks!



As result:

(time (dotimes (x 1000000) (length a)))

576455 for 100000

323 for 100



so, size matters :-)
WBR, Dmi

HPW

#8
>so, size matters :-)



As mostly in life!

;-)

But newLISP is so incredible small compared to its power!

;-)))



PS: I like the quote from JSF (kozoru)
Quote
ANSI C and Lisp got maried and had a superbaby.
Hans-Peter