newLISP Fan Club

Forum => newLISP in the real world => Topic started by: newdep on May 26, 2007, 04:05:08 PM

Title: Bug? dots in starts/ends-with
Post by: newdep on May 26, 2007, 04:05:08 PM
Hi Lutz,



We have a "DOT" bugging around in starts-with and ends-with.. ;-)



Something really funny is going on here...



> (starts-with "ohno!" "oh" 1)
true

> (starts-with "ohno!" "oh." 1)
true

> (starts-with "ohno!" "Yes|oh." 1)
true

> (starts-with ".ohno!" "." )
true

> (starts-with ".ohno!" "." 1)
true

> (ends-with "ohno." "." 1)
nil

> (ends-with "ohno." ".")
true

Norman.
Title:
Post by: Lutz on May 26, 2007, 04:44:31 PM
The dot in a regular expression means 'any character'. Except for the second-to-last one, where I am not sure, they are all correct.



Lutz
Title:
Post by: Lutz on May 26, 2007, 04:50:54 PM
If you are checking for a dot as the last character with a regular expression you would do:


(ends-with "ohno." "\\." 0) => true

Lutz
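
For anyone following along without a newLISP prompt, here is a minimal sketch of the same dot behaviour using Python's `re` module rather than newLISP. The helper names `starts_with` and `ends_with` are made up for illustration, and the case-insensitive default is only meant to mirror option 1 (PCRE_CASELESS) from the examples above; it is not a description of how newLISP implements these functions.

```python
import re

def starts_with(text, pattern, flags=re.IGNORECASE):
    """Hypothetical helper: True if the regex matches at the start of text."""
    # re.match only ever tries the pattern at position 0.
    return re.match(pattern, text, flags) is not None

def ends_with(text, pattern, flags=re.IGNORECASE):
    """Hypothetical helper: True if the regex matches right at the end of text."""
    # Wrap the pattern in a group and anchor it to the very end of the string.
    return re.search("(?:" + pattern + r")\Z", text, flags) is not None

# An unescaped dot is a wildcard, so "oh." matches "ohn".
print(starts_with("ohno!", "oh."))      # True
# "|" is regex alternation: "Yes" fails, "oh." succeeds.
print(starts_with("ohno!", "Yes|oh."))  # True
# An unescaped dot accepts any last character, not only a literal dot.
print(ends_with("ohno", "."))           # True
# An escaped dot only accepts a literal "." as the last character.
print(ends_with("ohno", r"\."))         # False
print(ends_with("ohno.", r"\."))        # True
```

In Python the raw string `r"\."` plays the role that the doubled backslash plays inside a quoted newLISP string.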
Title:
Post by: newdep on May 26, 2007, 04:52:08 PM
Aha, okay... Then I think the documentation for the string part of starts-with and ends-with should be adjusted a little,

because I was under the impression that regex was only possible for lists,

and that the "....|....." was a hardcoded "or"..

But indeed this one bothers me, and that's the one I've been fighting all day now..



> (ends-with "ohno." "." 1)

nil





Norman.
Title:
Post by: newdep on May 26, 2007, 04:59:14 PM
Aggg those regex kill me... ;-)

Double \ man... I was hitting . all day...

It's time for some logic inside regex ;-) (for the non-regex-manual-reading kind of programmer)
Title:
Post by: Lutz on May 26, 2007, 05:14:19 PM
... but there is indeed a problem, which will be fixed in 9.1.7.



As a workaround, when using a regex in 'ends-with', always anchor the regular expression to the end:


(ends-with "ohno." ".$" 1) => true

Now it works correctly.



Lutz
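
To see why anchoring can matter, here is a small Python sketch (again using `re`, not newLISP). It shows one plausible way an unanchored pattern can miss the end of the string: the search returns the first match it finds, and for a lone dot that is the first character. This is an illustration only and makes no claim about what newLISP does internally; `match_reaches_end` is a hypothetical helper.

```python
import re

def match_reaches_end(pattern, text):
    """Hypothetical check: does the FIRST match of pattern end at the end of text?"""
    m = re.search(pattern, text)
    return m is not None and m.end() == len(text)

text = "ohno."

# Unanchored: the first match of "." is the leading "o", which ends at index 1.
print(re.search(".", text).span())     # (0, 1)
print(match_reaches_end(".", text))    # False

# Anchored with "$": the only possible match is the final character.
print(re.search(".$", text).span())    # (4, 5)
print(match_reaches_end(".$", text))   # True
```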
Title:
Post by: newdep on May 27, 2007, 02:28:43 AM
Mmmm, it's not the solution.... the ".$" removes everything from my lists ;-)

I'll have to do it differently for now...



Thanks...



Norman.
Title:
Post by: Lutz on May 27, 2007, 03:48:54 AM
If you want to detect an ending dot you really should use:

(ends-with xyz "\\.$" 0)
and not
(ends-with xyz ".$" 0)
which would fire on any string in xyz.



The anchoring bug is fixed in 9.1.7.tgz, but I don't want to release it until the GUI stuff is done in a few days. If this is an urgent problem I can release the 9.1.7 version earlier. But including '$' at the end of the regex string really should take care of your problem.



What regex pattern are you looking for? Perhaps we can help you there?



Lutz
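
The difference between an anchored wildcard dot and an anchored escaped dot is easy to check on a small list. The sketch below uses Python's `re` module as a stand-in for newLISP, and the sample entries are invented for illustration.

```python
import re

# Hypothetical sample data; only two entries really end in a dot.
entries = ["example.com.", "example.org", "ohno.", "ohno!"]

# ".$" means "any character, then end of string": every non-empty entry matches.
any_last_char = [e for e in entries if re.search(r".$", e)]

# "\.$" means "a literal dot, then end of string".
dot_last_char = [e for e in entries if re.search(r"\.$", e)]

print(any_last_char)  # ['example.com.', 'example.org', 'ohno.', 'ohno!']
print(dot_last_char)  # ['example.com.', 'ohno.']
```

A cleanup rule built on the wildcard version would therefore remove every entry, which matches the "removes everything from my lists" symptom reported earlier in the thread.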
Title:
Post by: newdep on May 27, 2007, 03:56:27 AM
You are awake early ;-)



I can live with the "." for now. I do some manual cleaning on the list

every now and then... I'm doing some web statistics ;-) Currently the

list is between 150.000 and 80.000 entries, and the regex I'm using is

cleaning data.. It's okay for now.. I hope to fine-tune and release this tool

(yes, it's becoming a tool ;-) in GUI format... ;-)



Norman.