I have almost gotten this on my own, but am stumped.
I am after a list of the "good" words that occur two or more times in 'title-words, sorted by frequency (high to low).
Here is my attempt:
#!/usr/local/bin/newlisp
; my list of words:
(set 'title-words '("one" "two" "two" "three" "three" "three" "four" "four" "four" "four" "five" "five" "five" "five" "five" "six" "six"))
; words to remove from my list:
(set 'bad-title-words '("two" "four"))
; the "good" words i want:
(set 'good (difference title-words bad-title-words))
; an index count of the good words frequencies:
(set 'title-words-index (count good title-words))
; good word frequencies that occur more than once in my list
(set 'big-title-words-index (ref-all '1 title-words-index < true))
(println title-words) ; initial list of words
;-> ("one" "two" "two" "three" "three" "three" "four" "four" "four" "four" "five" "five" "five" "five" "five" "six" "six")
(println bad-title-words) ; words to remove
;-> ("two" "four")
(println good) ; words i want to keep
;-> ("one" "three" "five" "six")
(println title-words-index) ; a count of good word frequency
;-> (1 3 5 2)
(println big-title-words-index) ; somehow related to the words i want to get and sort by frequency
;-> (3 5 2)
(exit)
Now I don't know how to get the words back again. I think I got lost and may even have gone down the wrong road.
I would appreciate any directions back to my path! :0)
I'm not 100% sure what you wanted, since your description and the example seem to contradict each other, so let me check: you want all duplicates in a list, sorted by how often they occur?
;; keep only the value from each (value, count) pair
(println (map
  (fn (x) (first x))
  ;; filter out the elements whose count is 1
  (filter
    (fn (x) (not (= (last x) 1)))
    (sort
      ;; make a list of (value, count) items
      (map
        (fn (c x) (list x c))
        (count (unique title-words) title-words)
        (unique title-words))
      ;; comparison function: highest count first
      (fn (a b) (> (last a) (last b)))))))
Here's what I could come up with. First I take the distinct words in the list and build a list of two-element lists: the value (e.g. "one" or "two"), then how often it occurs. Then I sort that list with the custom comparison function (fn (a b) (> (last a) (last b))), i.e. by how often each word occurs. Next I filter out the words that occur only once and, last but not least, I strip the counts, so that the result no longer contains (value, count) pairs but only the values, like you wanted.
Seems this slightly shorter version also works:
;; keep only the value from each (count, value) pair
(println (map
  (fn (x) (last x))
  ;; filter out the elements whose count is 1
  (filter
    (fn (x) (!= (first x) 1))
    (sort
      ;; make a list of (count, value) items
      (map
        list
        (count (unique title-words) title-words)
        (unique title-words))
      >))))
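Neither version above removes the "bad" words from the original question. A minimal sketch of one way to do that is to start from difference instead of unique; the name counted is just a hypothetical helper for illustration, and good is borrowed from the question:

```newlisp
(set 'title-words '("one" "two" "two" "three" "three" "three" "four" "four" "four" "four" "five" "five" "five" "five" "five" "six" "six"))
(set 'bad-title-words '("two" "four"))

;; distinct words with the bad ones removed
(set 'good (difference title-words bad-title-words))

;; (count, value) pairs for the good words only
;; (counted is a hypothetical name, not from the posts above)
(set 'counted (map list (count good title-words) good))

;; keep words seen at least twice, sort by count (high to low), drop the counts
(println (map last (sort (filter (fn (x) (> (first x) 1)) counted) >)))
;-> ("five" "three" "six")
```

Because the count comes first in each pair, the plain > comparison sorts by frequency without a custom function.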
Another approach is to use contexts...
(set 'title-words '("one" "two" "two" "three" "three" "three" "four" "four" "four" "four" "five" "five" "five" "five" "five" "six" "six"))
(set 'bad-title-words '("two" "four"))
(define C:C)
(dolist (word title-words)
  (if (set 'tally (C word))
      (C word (inc tally))
      (C word 1)))
And a function to check words:
(define (good? word)
  (not (find word bad-title-words)))
A list of all words sorted by their frequency:
(sort (C) (fn (x y) (> (last x) (last y))))
;-> (("five" 5) ("four" 4) ("three" 3) ("two" 2) ("six" 2) ("one" 1))
Just the good words, and their frequency:
(filter (fn (x) (good? (first x))) (C))
;-> (("five" 5) ("one" 1) ("six" 2) ("three" 3))
So, all the good words and their frequencies, sorted by frequency:
(filter (fn (x) (good? (first x)))
  (sort (C)
    (fn (x y) (> (last x) (last y)))))
;-> (("five" 5) ("three" 3) ("six" 2) ("one" 1))
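To get all the way to what the question asked for (good words only, occurring two or more times, words without their counts), the same pieces can probably be combined as in the sketch below. It repeats the definitions so it runs on its own, and it assumes a fresh newLISP session, since running the tally loop a second time on an already filled C would double the counts:

```newlisp
(set 'title-words '("one" "two" "two" "three" "three" "three" "four" "four" "four" "four" "five" "five" "five" "five" "five" "six" "six"))
(set 'bad-title-words '("two" "four"))

;; tally every word in the context C (assumes C is empty at this point)
(define C:C)
(dolist (word title-words)
  (if (set 'tally (C word))
      (C word (inc tally))
      (C word 1)))

(define (good? word)
  (not (find word bad-title-words)))

;; sort by count, keep good words seen at least twice, then drop the counts
(println
  (map first
    (filter (fn (x) (and (good? (first x)) (> (last x) 1)))
      (sort (C) (fn (x y) (> (last x) (last y)))))))
;-> ("five" "three" "six")
```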
Kickin!
Thanks Patrick and cormullion! I'm on the move again!
Both are superb examples and really teach a lot! Much appreciated again! :0)