Speed test: Ruby & hash tables vs. newLISP & symbols

Started by William James, June 29, 2006, 07:47:19 PM


William James

Most people who adopt newLISP after using awk, Ruby, Perl, or Python are concerned about the lack of associative arrays (hash tables).  So I decided to test how well newLISP's symbols in contexts compare to Ruby's hash tables.  The task is to parse a file of 3,622,143 bytes that contains letters P--Q of an unabridged dictionary.  The first pair of programs simply counts the number of unique words and the number of letters in those words.  The second pair of programs counts the number of times each word occurs.



newLISP:

(set 'start-time (time-of-day))

(while (read-line)
  (dolist (word (parse (current-line) {[^A-Za-z]+} 0))
    (if (not (empty? word))
      (sym word 'Words))))

(set 'middle-time (time-of-day))

(set 'char-count 0)
(dolist (word-sym  (symbols Words))
  (inc 'char-count  (length word-sym)))

(set 'end-time (time-of-day))

(set 'fmt "%-34s%5d\n" )
(print (format fmt "Milliseconds to parse file: "
  (- middle-time start-time)))
(print (format fmt "Milliseconds to count characters: "
  (- end-time middle-time)))
(print (format fmt "Total milliseconds: "
  (- end-time start-time)))
(println (length (symbols Words)) " words; "
  char-count " characters")

Ruby:

def mil( f ); (f * 1000).to_int; end


start_time = Time.now

words = {}
while line = gets
  line.split( /[^A-Za-z]+/ ).each{ |word|

    words[ word ] = true   if not word.empty?
  }
end

middle_time = Time.now

char_count = 0
words.each_key{ |word|  char_count += word.size }

end_time = Time.now

fmt = "%-34s%5d\n"
puts fmt % [ "Milliseconds to parse file:",
  mil(middle_time - start_time) ]
puts fmt % [ "Milliseconds to count characters:",
  mil(end_time - middle_time) ]
puts fmt % [ "Total milliseconds: ",
  mil(end_time - start_time) ]
puts "#{ words.size } words; #{ char_count } characters"



Ruby:
Milliseconds to parse file:       11646
Milliseconds to count characters:   121
Total milliseconds:               11767
40821 words; 304947 characters

newLISP:
Milliseconds to parse file:        5538
Milliseconds to count characters:    70
Total milliseconds:                5608
40821 words; 304947 characters
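
A side note on why both programs test for empty words: splitting on runs of non-letters yields an empty first field whenever a line starts with a non-letter. A minimal Ruby sketch of that behaviour, using a made-up sample line (the variable names `line` and `tokens` are only for illustration):

```ruby
# Hypothetical sample line; the real input is the dictionary file.
line = ", hello world!"

# Same pattern as in the programs above: split on runs of non-letters.
tokens = line.split(/[^A-Za-z]+/)

# The leading delimiter produces an empty first field; Ruby drops
# trailing empty fields, so the final "!" adds nothing.
p tokens

# Filtering out the empty strings leaves only real words.
words = tokens.reject { |w| w.empty? }
p words
```

Without the empty-string check, the empty string would be stored as one more "word" in the hash.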

newLISP:

(set 'start-time (time-of-day))

(while (read-line)
  (dolist (word (parse (current-line) {[^A-Za-z]+} 0))
    (if (not (empty? word))
      (if (sym word 'Words nil)
        (inc (sym word 'Words))
        (context 'Words word 1)))))

(set 'middle-time (time-of-day))

(set 'word-count 0)
(dolist (word-sym  (symbols 'Words))
  (inc 'word-count (eval word-sym)))

(set 'end-time (time-of-day))

(set 'fmt "%-34s%5d\n" )
(print (format fmt "Milliseconds to parse file: "
  (- middle-time start-time)))
(print (format fmt "Milliseconds to count words: "
  (- end-time middle-time)))
(print (format fmt "Total milliseconds: "
  (- end-time start-time)))
(println word-count " words; " (length (symbols 'Words))
  " unique words")

Ruby:

def mil( f ); (f * 1000).to_int; end

start_time = Time.now

words = Hash.new( 0 )
while line = gets
  line.split( /[^A-Za-z]+/ ).each{ |word|
    words[ word ] += 1   if not word.empty?
  }
end

middle_time = Time.now

word_count = 0
words.each_value{ |cnt|  word_count += cnt }

end_time = Time.now

fmt = "%-34s%5d\n"
puts fmt % [ "Milliseconds to parse file:",
  mil(middle_time - start_time) ]
puts fmt % [ "Milliseconds to count words:",
  mil(end_time - middle_time) ]
puts fmt % [ "Total milliseconds: ",
  mil(end_time - start_time) ]
puts "#{ word_count } words; #{ words.size } unique words"


Ruby:
Milliseconds to parse file:       11827
Milliseconds to count words:         80
Total milliseconds:               11907
662846 words; 40821 unique words

newLISP:
Milliseconds to parse file:        6930
Milliseconds to count words:         60
Total milliseconds:                6990
662846 words; 40821 unique words
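
To put the two sets of totals side by side, here is a small Ruby sketch that works out how many times longer Ruby took than newLISP in each test. The numbers are the "Total milliseconds" figures posted above; the hash layout and names (`timings`, `ratios`) are just an illustration:

```ruby
# Total milliseconds copied from the results posted above.
timings = {
  "unique words" => { ruby: 11767, newlisp: 5608 },
  "word counts"  => { ruby: 11907, newlisp: 6990 },
}

# Ratio of Ruby's total time to newLISP's, rounded to two decimals.
ratios = timings.map { |task, ms|
  [ task, (ms[:ruby].to_f / ms[:newlisp]).round(2) ]
}.to_h

ratios.each { |task, r|
  puts "#{task}: Ruby took #{r}x as long as newLISP"
}
```

This is a single run on one machine, so the ratios are only a rough indication.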

Keep in mind that Ruby is slower than Python and Perl.

cormullion

#1
Nice work! You may be amused at my fumblings in a similar vein:



http://www.alh.net/newlisp/phpbb/viewtopic.php?p=5766&highlight=ruby#5766



I'm not an obsessive speed freak, since I've now learnt about the trade-offs between elegant code, fast code, and easy-to-write code.

William James

#2
I didn't realize that I was duplicating some of your work.  It's always good to see someone who is familiar with Ruby here.  Look at this:


creator of newLISP: Lutz
creator of Ruby:    matz

Both names are 4 letters long and end with "tz"!

Furthermore,
Quote:
Ruby is a language designed in the following steps:



  * take a simple lisp language (like one prior to CL).

  * remove macros, s-expression.

  * add simple object system (much simpler than CLOS).

  * add blocks, inspired by higher order functions.

  * add methods found in Smalltalk.

  * add functionality found in Perl (in OO way).



So, Ruby was a Lisp originally, in theory.

Let's call it MatzLisp from now on. ;-)



                                                        matz.

MatzLisp and LutzLisp!

cormullion

#3
:-) In fact the script is from Matt Neuburg's "AppleScript: The Definitive Guide", which was where I first heard about Ruby and perhaps when I started looking for alternatives to AppleScript for some tasks. (A quest which continues until somebody writes an AppleEvent interface for newLISP :-))



I think newLISP, Ruby, and Python do well because one person is at the steering wheel, rather than a steering committee...


newLISP: also a Lisp, not unlike Ruby ;-)