parse internal tokenizer and double quotes

Dmi · October 22, 2005, 03:54:23 PM

I found that 'parse', when called with its internal tokenizer, takes double-quote matching into account (cool!). But it doesn't include the quotes in the returned tokens:

newLISP v.8.7.0 on linux, execute 'newlisp -h' for more info.

> (parse "\"")

string token too long in function parse : "" 0708x2070801"

> (parse "\"abc\"")
("abc") 
>

Is there a way to get ("\"abc\"") instead of ("abc")?

I'm trying to write a function that reads an unevaluated s-expression.

If parse's internal tokenizer included the double quotes wherever they occur in tokens, a quite good read function could be as small as 13 lines (in my version).

Also, imho, it's better when things like 'parse' have universal rather than special-purpose behavior. I.e., in the case of 'parse', it's easy to strip the quotes when they are unwanted, but practically impossible to restore them once they have been stripped.
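As a small illustration of the "easy to strip" half of that argument, here is a minimal sketch, assuming the tokens still carry their double quotes. The token list is made up for the example; 'trim' with a character argument removes that character from both ends of a string.

```newlisp
; hypothetical token list in which the double quotes were preserved
(set 'tokens '("this" "\"is\"" "a" "sentence"))

; strip a leading and trailing double quote from every token
(map (fn (tok) (trim tok "\"")) tokens)
; → ("this" "is" "a" "sentence")
```

The reverse is not possible: once the quotes are gone, nothing in the resulting list says which token was originally a quoted string.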

Lutz · October 22, 2005, 07:21:12 PM

Use 'parse' with a break string, and put a regular expression in that break string. The break string is the specification that tells 'parse' where to split the input.

If you use a regular expression in the break string, you have to specify an options number: 0 (zero) in the simplest case, 1 for case-insensitive matching, etc.

For example:

> (parse {this "is" a    sentence} {\s+} 0)
("this" "\"is\"" "a" "sentence") 
>

The curly braces are used as string delimiters so I can freely use quotes " inside the string. The regular expression pattern {\s+} tells 'parse' to break up the string at one or more white space characters (spaces, linefeeds, tabs, etc.).
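One caveat when splitting at white space: a quoted string that itself contains spaces is cut into several tokens. A possible workaround is to match the tokens rather than the separators. The sketch below assumes a newLISP version that provides 'find-all' (it may not exist in 8.7.0), and the pattern is only an illustration: it does not handle escaped quotes inside a string.

```newlisp
; a token is either a double-quoted run without inner quotes,
; or a run of non-white-space characters
(set 'pattern {"[^"]*"|\S+})

; collect every match of the pattern, quotes included
(find-all pattern {this "is a" sentence})
; → ("this" "\"is a\"" "sentence")
```

Because the quoted alternative comes first in the pattern, it wins whenever a token starts with a double quote, so the quotes stay in the returned token.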

The 0 (zero) tells 'parse' that {\s+} should be taken as a regular expression and not as a literal string. There are numbers other than 0 that you would use in other situations. See 'regex' for the other option numbers and their meaning.
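To make the effect of the options number concrete, here is a short sketch that contrasts 0 with 1 (case-insensitive matching). The input string is made up for the example.

```newlisp
; option 0: the break string is a regular expression, matched case-sensitively
(parse "oneXtwoxthree" {x} 0)
; → ("oneXtwo" "three")

; option 1: same pattern, matched case-insensitively
(parse "oneXtwoxthree" {x} 1)
; → ("one" "two" "three")
```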

Lutz

newLISP Fan Club