Need faster find (with a semi-complex example)

kanen · April 16, 2011, 12:01:05 AM

I am processing tens of thousands of bits of data per minute (and, sometimes, per second).

I have information coming in as a list in the form '(0 30 55 200 0 1) and I have a reference in a similar format.

I need to match the contents of my known lists with the new list which has just arrived. And, it needs to be very fast.

Thanks to Lutz (and others), I have come up with a good solution, but I need it to be faster.

Starting with the reference lists

Code Select Expand

(dolist (s (sequence 1 100))
  (set 'n (rand 255 (rand 100)))
  (push n e -1)
)

Now we have a list of 100 reference lists, which we need to match to a new list containing one of the referenced lists.

Code Select Expand

(set 'f (rand 255 100))  ; create a new list
(push (e (rand (length e))) f) ; grab from 'e above and push into our list

Now, if we convert everything to hex and do a check...

Code Select Expand

(set 'eh (map (fn (l) (pack (dup "b" (length l)) l)) e))  ; convert to hex (much faster!)
;
(set 'hex (pack (dup "b" (length f)) f))
(clean nil? (map (fn (s) (find s hex)) eh))
;> (n) ; where n is the location found
;
(time (clean nil? (map (fn (s) (find s hex)) eh)))
;> ~ 1ms  ; this is too slow

I only care about that last bit, where I am using a (clean) and (map) to find the hex. Right now, it tops out at ~1 ms (on my laptop) and about ~4ms on my reference system. This is painfully slow for the amount of traffic I am processing.

Any thoughts would be appreciated.

newdep · April 16, 2011, 02:06:14 AM

'map is very elegant but i think a 'for loop is perhpas quicker.

And using 'array's here could speed up too.

Im not sure if 'find has a 'regex layer underwater but if it hasn't perhpas regex might speed it up too.

Just my Saturday's 2 cents ;-)

cormullion · April 16, 2011, 02:55:12 AM

Hi Kanen.

But in general, my understanding of newLISP is that mapping solutions aren't as fast as plain iterative solutions, and that mapping anonymous functions is slightly even less fast than mapping defined functions.

I don't know at what point you have to get closer to the machine than an interpreted scripting language lets you. Perhaps you're near that point now.

Edit: I read the post wrong...

newLISP Fan Club

News:

Need faster find (with a semi-complex example)

kanen

newdep

cormullion