Functional programming - permutations

Started by cormullion, November 14, 2007, 08:26:23 AM


TedWalther

#15
If you just want to make combinations (not permutations), here is a function I came up with:



;; n is the number of elements to choose
;; k is the set of elements to choose from
(define (combinations n k (r '()))
  (if (= (length r) n)
    (list (sort r <))
    ;; for each element, recurse on the rest of the set after it,
    ;; accumulating the chosen elements in r
    (apply append (map (fn (x) (combinations n ((+ 1 $idx) k) (cons x r))) k))))


Careful, it can blow up quickly and eat all your memory.  For better memory efficiency, I'd use dolist and an accumulator variable.  For my purposes, creating all possible poker hands, it worked quickly and well.

TedWalther

#16
Here is a sample of how to call the "combinations" function:



> (combinations 2 '(a b c))
((a b) (a c) (b c))
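
For the poker-hands case mentioned above, assuming cards are represented as the numbers 1 to 52, the call would look like the following. This is the memory-hungry case, since all C(52,5) = 2,598,960 hands are built up at once:

> (length (combinations 5 (sequence 1 52)))
2598960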

TedWalther

#17
I have updated the function; it no longer blows up the stack and keeps memory usage within very small bounds:



;; n is the set of elements
;; k is the number of elements to choose
(define (combinations n k (r '()))
  (if (= (length r) k)
    (list r)
    (let (rlst '())
      (dolist (x n)
        ;; recurse on the rest of the set after x, collecting results in rlst
        (setq rlst (append rlst
                     (combinations ((+ 1 $idx) n) k (append r (list x))))))
      rlst)))


This version also corrects a problem with the arguments: n is supposed to represent the set, and k the number of elements to "choose".  The previous version had these reversed; the order now matches the standard convention.
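
With the corrected argument order, the earlier sample call becomes (set first, then count):

> (combinations '(a b c) 2)
((a b) (a c) (b c))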



However, be warned: this new version is about six times slower than the version that uses up all your RAM.  Any ideas on how to speed it up?

TedWalther

#18
As an example, the memory-efficient version took 77 seconds to calculate all possible poker hands in a deck of cards.  The fast version using map took 11 seconds.  So, a 7x speed difference.
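
For reference, a timing call like the following, assuming cards as the numbers 1 to 52 and the corresponding definition loaded, reproduces the measurement:

(time (combinations (sequence 1 52) 5))   ; memory-efficient version: 77 seconds as reported above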



Now if only there were a way to speed things up.  I don't like a 7x speed hit just to stay within memory bounds.


rrq

#20
You can gain an order of magnitude in speed by taking more care to avoid copying, as in the following:
(define (permutations items)
  (if (empty? items) '()
    (1 items)   ; more than one element left?
    (let ((e (cons (first items))) (n (length items)))
      ;; insert the first element at every position of each
      ;; permutation of the remaining elements
      (flat (map (fn (p (i -1)) (collect (append (0 (inc i) p) e (i p)) n))
                 (permutations (rest items)))
            1))
    (list items)))


Similar care applied to combinations led me to the following variant, which is slim and fast:
(define (combinations items k)
  (if (<= (length items) k) (list items)   ; k or fewer items left: take them all
    (= k 1) (map list items)               ; choose one: each item on its own
    ;; otherwise, either skip the first item, or keep it and choose k-1 from the rest
    (append (combinations (rest items) k)
            (map (curry cons (first items))
                 (combinations (rest items) (dec k))))))
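
A quick sanity check at the REPL, assuming the definitions behave as in the timing test below (output order as these definitions produce it):

> (combinations '(a b c) 2)
((b c) (a b) (a c))
> (length (permutations (sequence 1 5)))
120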

rrq

#21
I've been time-testing the permutations function above, using the expression:
(time (permutations (sequence 1 10)))
The funny thing is that when repeating this a number of times, the time goes up by some 100 ms on each run.

For example, for the first 20 repetitions (clipped to ms), I got: 2854, 2498, 2523, 2589, 2600, 2671, 2752, 2908, 2968, 3132, 3191, 3516, 3752, 4162, 4256, 4556, 4644, 4895, 5005, 5180 ms, in that order.

Then, for the next 20, I got: 5340, 5513, 5640, 5880, 6057, 6294, 6412, 6601, 6736, 6910, 7051, 7302, 7485, 7659, 7849, 8111, 8291, 8480, 8730, 8929 ms, in that order.

And so on.



Doing (reset) makes it start again from the beginning.



It looks like it could be some memory management bug, but I haven't been able to find out where; I'm still scouting.



This concerns newLISP 10.6.4 as of 2015-09-18 5:10pm +10:00, with the version line:

newLISP v.10.6.4 32-bit on Linux IPv4/6 UTF-8 libffi.

Lutz

#22
newLISP immediately frees most allocated memory back to the OS. Only unused cell structures are kept in a free-cell list for reuse. This normally leads to faster performance, since it avoids calling the OS memory allocation and deallocation routines for every lisp cell.



A slowdown on repeated execution of a function is extremely rare; normally, repeated execution of a program with a mixture of different functions will never slow down. In most cases, the cell recycle list speeds up program execution.



In your specific case, you could avoid the slowdown by inserting a (reset nil) into your function:



(define (permutations items)
  (if (empty? items) '()
    (begin (reset nil) (1 items))   ; release the cell recycling pool on each call
    (let ((e (cons (first items))) (n (length items)))
      (flat (map (fn (p (i -1)) (collect (append (0 (inc i) p) e (i p)) n))
                 (permutations (rest items)))
            1))
    (list items)))


This destroys the cell recycling pool. This usage of reset is not documented, and in my own work I have never found it necessary and have never used it, except for testing the memory system.

rrq

#23
Right. As far as I can work it out, it really appears to be due to something deteriorating when the distance between consecutive free cells grows.



I simplified the testing code down to the following (though I'm sure there are better ways):
(and (setf A (sequence 1 20000000)) true)
(dotimes (j 100) (println (time (dotimes (i 10000) (0 (* 10 i) A)))))

Then I instrumented the code so that each timing run also reports the number of cells allocated via copyCell and the sum of the consecutive address distances of the allocated cells. From that, I could see that the average distance between memory cells starts off at 3, then grows steadily to around 34000, while the computation time grows from 5 s to 15 s.



Basically, it suggests that when the address distance between consecutive freed cells is large, allocation takes longer; and allocating cells is exactly what the implicit indexing does. I would guess this is due to some page caching (kernel or processor) becoming ineffective, and there's not much a newLISP user can do about it. Adding (reset nil) fixes the distances, but it takes an awfully long time in itself.