newLISP Fan Club

Forum => Anything else we might add? => Topic started by: gregben on October 23, 2004, 06:20:24 PM

Title: (replace) doesn't seem to work with "^".
Post by: gregben on October 23, 2004, 06:20:24 PM
I'm trying to replace the initial character of a string

(actually, I'm trying to do something more complex, but this is the gist of it).

I tried the following under newLisp 8.2.0 (Sparc Solaris 8) and got the following:



> (replace "a" "aaa" (upper-case $0) 0)

"AAA"

> (replace "^a" "aaa" (upper-case $0) 0)

"AAA"

> (replace "a$" "aaa" (upper-case $0) 0)

"aaA"

> (replace "^a" "aaa" (upper-case $0) 0)

"AAA"

> (replace "xa" "aaa" (upper-case $0) 0)

"aaa"



Note that I expected:

(replace "^a" "aaa" (upper-case $0) 0) to yield "Aaa".

See that "a$" works as expected.

I put in "^a" just for grins, but I expected "aaa" from it.



I suspect this is my fault, but perhaps it is a bug in pcre.c

or newLisp?
Title:
Post by: Lutz on October 23, 2004, 06:43:04 PM
The newLISP function 'replace' replaces _all_ occurrences of the pattern found. To find just the first match, use:



(regex "a" "aaa")



which will give you just the first occurrence:



("a" 0 1) ; offset 0 length 1



Lutz
Title:
Post by: Lutz on October 23, 2004, 06:58:18 PM
(replace "^a" "aaa" (upper-case $0) 0)



On this pattern, 'replace' handles the first "a", then takes the rest "aa" and matches again, then takes "a" and matches again.



So 'replace' replaces, then takes the rest of the string and repeats the whole procedure. Most of the time this behaviour is fine, as in this case:



(replace "^a" "axaa" (upper-case $0) 0)



"Axaa"



But it fails when there are multiple matches one after the other. Unless newLISP started parsing the pattern itself, the only workaround I see at the moment is to use 'regex' or 'find' to catch the first position/length and then do only one replace.
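
A minimal sketch of that workaround, assuming 'regex' returns the matched text, its offset and its length in that order, as in the ("a" 0 1) result shown earlier in the thread. The helper name upper-first-match is hypothetical and not part of newLISP.

```newlisp
; Hypothetical helper: upper-case only the first match of pattern in str.
(define (upper-first-match pattern str)
  (let ((m (regex pattern str)))          ; m is (match offset length) or nil
    (if m
      (append (slice str 0 (nth 1 m))     ; text before the match
              (upper-case (nth 0 m))      ; the match itself, upper-cased
              (slice str (+ (nth 1 m) (nth 2 m)))) ; text after the match
      str)))                              ; no match: return str unchanged

(println (upper-first-match "^a" "aaa"))  ; prints Aaa
```

Because the string is rebuilt from the single match that 'regex' reports, the remainder is never scanned again, so "^" cannot match a second time.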



Perhaps we need an additional bit in the option flag to force 'only one replace'.



Lutz
Title:
Post by: Lutz on October 23, 2004, 07:31:17 PM
and this would also work:



(replace "(a)(.*)" "aaa" (append (upper-case $1) $2) 0)



=> "Aaa"



because the search pattern is 'greedy' and 'eats' the whole string.
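
A small illustration of the same idea on a different input, offered as a sketch rather than a general recipe. It relies on the string being a single line, since "." does not match a newline by default in PCRE.

```newlisp
; $1 holds the first "a", $2 holds everything after it.
; Because (.*) consumes the rest of the line, no text is left for a second match.
(println (replace "(a)(.*)" "abc abc" (append (upper-case $1) $2) 0))
; prints Abc abc
```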



Lutz
Title:
Post by: gregben on October 23, 2004, 08:19:59 PM
Try this in perl:



#!/usr/bin/perl



$a = "aaa";

$a =~ s/^\w/x/;

print "\$a=$a\n";



You will get $a="xaa". This is correct in Perl,

and is what I want newLisp to do.



To anyone who's interested, the pattern

/^\w/ means match any single "word" character

at the beginning of the string. This is the correct

function of the "^" character.
Title:
Post by: Lutz on October 23, 2004, 08:38:59 PM
Yes, I understand. newLISP tries to do multiple replaces: it does the first "a", then tries another with the remaining string "aa", then with "a", as outlined in previous posts. To do the replace only once, use the workarounds described.
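
The rescanning described here can be modelled with a short sketch. This is only an illustration of the behaviour reported in this thread, not newLISP's actual implementation, and the function name rescan-upper is hypothetical.

```newlisp
; Hypothetical model: upper-case a match, then search the remaining text
; again as if it were a fresh string, so "^" can match repeatedly.
(define (rescan-upper pattern str)
  (let ((result "") (m (regex pattern str)))
    ; stop on no match, or on an empty match, which would never advance
    (while (and m (> (nth 2 m) 0))
      (set 'result (append result
                           (slice str 0 (nth 1 m))    ; text before the match
                           (upper-case (nth 0 m))))   ; replaced match
      (set 'str (slice str (+ (nth 1 m) (nth 2 m))))  ; keep only the remainder
      (set 'm (regex pattern str)))
    (append result str)))

(println (rescan-upper "^a" "aaa"))   ; prints AAA
(println (rescan-upper "a$" "aaa"))   ; prints aaA
(println (rescan-upper "^a" "axaa"))  ; prints Axaa
```

The three results match the outputs reported earlier in the thread: the anchor matches again whenever the remainder itself starts with "a".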



I can also offer to add a flag in the 'option' field in the next version to tell 'replace' to do only one replace. There are still several bits left in the options flag (see the manual for regex and replace).



Lutz
Title:
Post by: gregben on October 23, 2004, 08:42:51 PM
Thanks, Lutz for all your feedback. :-)



Here's a little function I wrote that is sort of useful

without using regex or replace.



#!/usr/bin/newlisp



(define (initial-capital str)

  (nth-set 0 str (upper-case (nth 0 str)))

  str

)





(println (initial-capital "abc"))

(exit)
Title:
Post by: Lutz on October 24, 2004, 05:31:35 AM
Hi Greg,



I just finished the additional option for 'replace':



(replace "a" "aaa" (upper-case $0) 0x8000) => "Aaa"



(replace "^\\w" "aaa" (upper-case $0) 0x8000) => "Aaa"



(replace {^\w} "aaa" (upper-case $0) 0x8000) => "Aaa"



of course it can be combined with any other option for regex
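
As a sketch of such a combination, assuming a newLISP version that includes the 0x8000 option announced above: the flags are OR-ed together, here with option 1 (case-insensitive matching).

```newlisp
; 0x8000 = replace only once, 1 = case-insensitive match
(println (replace "^A" "aaa" (upper-case $0) (| 0x8000 1)))
; prints Aaa
```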



Lutz