Using a regular expression causes newlisp.exe to terminate

Started by xmftlg, March 23, 2013, 07:57:37 AM


xmftlg

The attached files are test.lsp, b.txt and c.txt.



test.lsp:

(set 's (read-file "c.txt"))
(println (find-all {(?s)target=_blank>(?:(?!target=_blank>).)*?在线观看_百度视频}  s ) )

(exit)


b.txt and c.txt both contain HTML source code.



E:\newlisp>newlisp

newLISP v.10.4.7 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for options.



> (load "test.lsp")



At this point newLISP terminates abnormally.

 

After changing the first line of test.lsp to read b.txt instead:
(set 's (read-file "b.txt"))

E:\newlisp>newlisp

newLISP v.10.4.7 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for options.





> (load "test.lsp")

("target=_blank>銆?em>鍟﹀暒鍟﹀痉鐜涜タ浜?/em>銆嬪姩婕紙2瀛e叏锛夐珮娓呭湪绾

胯鐪媉鐧惧害瑙嗛")



This time the matching string is returned correctly.



In a UTF-8 environment the string is:

   ("target=_blank>《啦啦啦德玛西亚》动漫(2季全)高清在线观看_百度视频")  



Testing with v10.4.5 gives the same result.



D:\newlisp>newlisp

newLISP v.10.4.5 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.



> (load "test.lsp") ;;read c.txt



D:\newlisp>newlisp

newLISP v.10.4.5 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.



> (load "test.lsp") ;;read b.txt

("target=_blank>銆?em>鍟﹀暒鍟﹀痉鐜涜タ浜?/em>銆嬪姩婕紙2瀛e叏锛夐珮娓呭湪绾

胯鐪媉鐧惧害瑙嗛")  



Can anyone help?

xmftlg

I also tried increasing the newLISP stack size, like this:



E:\newlisp>newlisp -s 100000 test.lsp



E:\newlisp>newlisp -s 1000000 test.lsp



but that does not seem to change anything.

Lutz

#2
This is a problem in the PCRE library routines. See also here:

http://stackoverflow.com/questions/3613121/regular-expression-crashes-apache-due-to-pcre-limitations-need-some-help-optimis



and here:

http://newlispfanclub.alh.net/forum/viewtopic.php?f=16&t=3724&p=18722&hilit=regex+crash#p18722



On OSX this causes a crash, which occurs in pcre_exec(). It seems to have to do with nesting of HTML blocks.
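
One possible way to sidestep pcre_exec() for this particular search is to do the extraction with plain string functions instead of a regular expression. This was not suggested in the thread; it is only a minimal sketch. find-between is a hypothetical helper name, not a newLISP built-in, and it assumes that splitting on the literal prefix gives the same matches as the original pattern, which has not been checked against c.txt.

```newlisp
;; Sketch only: extract the same substrings without calling the regex engine.
;; find-between is a hypothetical helper, not part of newLISP.
(define (find-between prefix suffix text)
  (let (result '())
    ;; parse with a literal break string splits text at every prefix;
    ;; the first piece precedes any prefix, so it is skipped with rest
    (dolist (chunk (rest (parse text prefix)))
      (let (pos (find suffix chunk))
        ;; find returns the byte offset of suffix inside chunk, or nil
        (when pos
          (push (append prefix (slice chunk 0 (+ pos (length suffix))))
                result -1))))
    result))

(set 's (read-file "c.txt"))
(println (find-between "target=_blank>" "在线观看_百度视频" s))
(exit)
```

Because each chunk lies between two occurrences of the prefix, it cannot contain another "target=_blank>", which is the condition the negative lookahead in the original pattern was enforcing one character at a time.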

xmftlg

#3
Thanks Lutz.