While we are discussing new features...

Started by Jeff, March 20, 2008, 05:27:49 PM


Jeff

Can we have the ability to force caching of a compiled regular expression?  If I have five that I iterate over regularly in an application, it makes no sense to only cache one.  A function:



(re-compile {some regex string}) ; => int pointer to expr.



Functions that use regular expressions would need to be modified to accept that pointer as well, but it would save us a lot of repeated compilation.
Jeff

=====

Old programmers don't die. They just parse on...

Artful code: http://artfulcode.net

Lutz

#1
This is included in version 9.3.5 (this week)


; with precompilation

(set 'p1 (regex-comp "a\d" 0))
(set 'p2 (regex-comp "z\d" 0))

(time (begin (find p1 "ab1ca9z6dh:" 0)
             (find p2 "ab1ca9z6h:" 0)) 1000000)

=> 508 ms

; without precompilation

(time (begin (find "a\d" "ab1ca9z6dh:" 0)
             (find "z\d" "ab1ca9z6h:" 0)) 1000000)

=> 2236 ms


Precompiling makes these searches roughly 4.4 times faster (2236 ms vs. 508 ms); mileage varies with patterns and target strings.



Note that precompiling is only worth it when several patterns alternate and are repeated. When a single pattern is repeated back to back, newLISP compiles and caches that pattern by itself.
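
For the case raised in the first post (a handful of patterns applied over and over), a minimal sketch might look like the following. The pattern strings, sample lines, and the names patterns and lines are made up for illustration. The option value 0x10000 is what later newLISP manuals document for searching with a precompiled pattern, whereas the 9.3.5 example above passes 0, so check the manual for the version you run.

```newlisp
; Sketch only: patterns and sample lines are hypothetical.
; Braces keep the backslashes intact for the regex engine.
(set 'patterns (map regex-comp '({a\d} {z\d} {h:$})))

(set 'lines '("ab1ca9z6dh:" "no digits here"))

; Assumption: 0x10000 marks the pattern as precompiled
; (documented for later releases; the 9.3.5 example in this thread uses 0).
(dolist (line lines)
  (dolist (p patterns)
    ; find returns the match offset, or nil when nothing matches
    (println line " -> " (find p line 0x10000))))
```

Each pattern is compiled once up front, so alternating between them inside the loop does not trigger recompilation the way switching between plain pattern strings would.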

Jeff

#2
Thanks!  Pattern matching is one of Lisp's strengths.  CL users often dismiss regular expressions as a method of making Perl harder to read, but when dealing with strings they are the equivalent of match and unify.  Having optimized regular expressions will give many programs significant speedups.



Also, do you have any documentation for the C types and their helper functions in newLISP?  When embedding newLISP or accessing it via an API, I would like to have finer-grained access than newlispEvalStr.

Lutz

#3
Quote: do you have any documentation for the C types and their helper functions in newLISP?


newLISP uses the function 'import' and the following helper functions: get-char, get-float, get-int, get-long, get-string, pack, unpack, flt, cpymem and callback. Using these functions requires good C knowledge and an understanding of C data types and calling conventions. The result is a much tighter and faster binding to external C libraries than is possible with other types of interfaces.



More about interfacing to C data types when wrapping external C-libraries is found here:



http://www.newlisp.org/CodePatterns.html#extending



The following modules, installed in /usr/share/newlisp/modules or in C:\Program Files\newlisp\modules, are built using this method:



crypto.lsp, gmp.lsp, mysql.lsp, mysql5.lsp, odbc.lsp, sqlite3.lsp, unix.lsp and zlib.lsp



ps: see also the file newlisp-x.x.x/examples/opengl-demo.lsp which shows how to implement callbacks

Jeff

#4
Actually, I was looking at it the other way around: documentation on the implementation of newLISP.  Python has an interesting module, ctypes, which I am using to embed newLISP in my Python modules.  Being able to work at a lower level than the blanket newlispEvalStr would be nice.