Need faster find (with a semi-complex example)

Started by kanen, April 16, 2011, 12:01:05 AM

Previous topic - Next topic

kanen

I am processing tens of thousands of bits of data per minute (and, sometimes, per second).



I have information coming in as a list in the form '(0 30 55 200 0 1) and I have a reference in a similar format.



I need to match the contents of my known lists with the new list which has just arrived. And, it needs to be very fast.



Thanks to Lutz (and others), I have come up with a good solution, but I need it to be faster.



Starting with the reference lists

(dolist (s (sequence 1 100))
  (set 'n (rand 255 (rand 100)))
  (push n e -1)
)


Now we have a list of 100 reference lists, which we need to match to a new list containing one of the referenced lists.



(set 'f (rand 255 100))  ; create a new list
(push (e (rand (length e))) f) ; grab from 'e above and push into our list


Now, if we convert everything to hex and do a check...



(set 'eh (map (fn (l) (pack (dup "b" (length l)) l)) e))  ; convert to hex (much faster!)
;
(set 'hex (pack (dup "b" (length f)) f))
(clean nil? (map (fn (s) (find s hex)) eh))
;> (n) ; where n is the location found
;
(time (clean nil? (map (fn (s) (find s hex)) eh)))
;> ~ 1ms  ; this is too slow


I only care about that last bit, where I am using a (clean) and (map) to find the hex. Right now, it tops out at ~1 ms (on my laptop) and about ~4ms on my reference system. This is painfully slow for the amount of traffic I am processing.



Any thoughts would be appreciated.
. Kanen Flowers http://kanen.me[/url] .

newdep

#1
'map is very elegant but i think a 'for loop is perhpas quicker.

And using 'array's here could speed up too.

Im not sure if 'find has a 'regex layer underwater but if it hasn't perhpas regex might speed it up too.



Just my Saturday's 2 cents ;-)
-- (define? (Cornflakes))

cormullion

#2
Hi Kanen.



But in general, my understanding of newLISP is that mapping solutions aren't as fast as plain iterative solutions, and that mapping anonymous functions is slightly even less fast than mapping defined functions.



I don't know at what point you have to get closer to the machine than an interpreted scripting language lets you. Perhaps you're near that point now.



Edit: I read the post wrong...