Making [text][/text] more convenient

Started by lwix, November 09, 2004, 06:19:01 PM

lwix

Hi Lutz,

  This is an rfc - a request for convenience ;-)

Should the parser be made to ignore the first hard newline after [text]? This would make it easier to write and read boxed-quoted text in scripts.



For example: here is how we currently write 2 indented paragraphs:

[text]___words words words words

words words words words etc

words words words words etc  

words words words words etc

___words words words words

words words words words etc

words words words words etc

words words words words etc

[/text]



If the first hard newline is ignored, we can instead do this

[text]

___words words words words

words words words words etc

words words words words etc  

words words words words etc

___words words words words

words words words words etc

words words words words etc

words words words words etc

[/text]



I'm guessing the first real-world difference will be felt on the board, when we use the style for 'code'.

Question: Should this be extended to the hard newline before [/text], i.e. at the end of the quoted text? I don't know.



Lucas
small's beautiful

Lutz

#1
So basically the [text], [/text] tags would always be on a line by themselves. I am not sure ... I have to think about this a little bit longer. It is practical for the situation you describe (and I have run into the same situation many times); on the other hand, I have used [text], [/text] many times in a flowing line, and on purpose too. Remember that the [text],[/text] tags are not only for the situation where you have buffers > 2048 characters, but are also the only delimiters which completely suppress any other processing of escape characters, quotes etc. I also wonder what others think?



Lutz

Sammo

#2
Please leave [text][/text] as they are. I, too, use them 'inline' when passing long text strings to newLisp.dll via Hans-Peter's NeoBook interface.

eddier

#3
I wish [text][/text] were the {}.

Instead of "", [text][/text], and {}, just have "" and {}.



Just my two cents.



Eddie

lwix

#4
Quote from: "Lutz"
So basically the [text], [/text] tags would always be on a line by themselves. I am not sure ... I have to think about this a little bit longer...



Lutz

Not at all.



I shall be more precise: the parser would ignore the first newline only if the newline exists.



No current code will be affected except code that already has an explicit newline right after [text]. In that case, another newline will have to be added.



So the proposed change makes the following equivalent:



[text]hello world

hello world

[/text]



will parse the same as



[text]

hello world

hello world

[/text]



In the second example, because the parser finds a single newline as the first character after [text], it skips it.  It does not skip any arbitrary character, only if it is a newline.  So we can continue to use [text][/text] either way without change.  Hence, the proposal does not break the style of coding where we place [text]hello there[/text] on the same line.
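The rule described above can be tried out on ordinary strings without touching the parser. This is only a sketch: `skip-first-newline` is a made-up helper name, and it assumes the source file uses plain linefeed line endings.

```newlisp
;; Hypothetical helper (not part of newLISP): drop one leading
;; newline if there is one, otherwise return the string unchanged.
(define (skip-first-newline str)
  (if (starts-with str "\n")
      (slice str 1)   ; everything after the first character
      str))

;; The two literals from the post. As parsed today, b carries one
;; extra leading newline compared to a.
(set 'a [text]hello world
hello world
[/text])

(set 'b [text]
hello world
hello world
[/text])

(println (= a b))                       ; compares the literals as parsed today
(println (= a (skip-first-newline b)))  ; compares them under the proposed rule
(println (skip-first-newline "hello there")) ; same-line style is left alone
```

Applied this way, only a string that starts with a newline is changed, which matches the claim that same-line [text]hello there[/text] usage is not affected.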



My reference to changes to this board referred to the fact that it will be easier for users to write their normal comments interspersed with blocks of code.  The board code would not need changing. (if it does use [text][/text] that is.)



Lucas
small's beautiful

newdep

#5
Hello All...



While [text] is very handy, I must agree with Eddie that { } would be a great alternative. Since newLISP has the "newlispEvalStr" function, [text] [/text] is much needed there.



Tcl also uses { } for eval, where " " is not recommended. For newLISP this would purely be a shorter notation.



It could be an option, though, to have the choice via a "define/constant/set" of a user-defined "comment" delimiter, like this (it would have to be included at the beginning of every source file, otherwise it won't work) -->>



(setq comment '("[text]" "[/text]"))



(setq comment '("{" "}"))



(setq comment '("[text]\n" "[/text]\n"))



Norman.
-- (define? (Cornflakes))

lwix

#6
Quote from: "newdep" ...

It could be an option, though, to have the choice via a "define/constant/set" of a user-defined "comment" delimiter, like this (it would have to be included at the beginning of every source file, otherwise it won't work) -->>



(setq comment '("[text]" "[/text]"))



(setq comment '("{" "}"))



(setq comment '("[text]\n" "[/text]\n"))



Norman.


Nice idea but there would be a large performance hit on the parser.  You could write yourself a preprocessor that does the necessary substitutions prior to running the main script.  But that would be too much work and messy ...



However, my suggestion could easily be implemented within the function or macro that is doing the processing. e.g.
(define-macro (dostuff! _str)
  ;; check for special case
  (if (= (first (eval _str)) "\n")
    (nth-set 0 (eval _str) ""))

  ;; now we do stuff to the string

  nil ;; dummy return
)
As I said, it's just a convenience thing :-)

Lucas.
small's beautiful

eddier

#7
I see your point, lwix. Also check for white space between the [text] and the \n.

So that (I'm using "{}" in place of "[text][/text]"):



(print { \t\t\t ; white space
Content-type: text/html

<html>
.
.
.
</html>})

equals

(print {Content-type: text/html

<html>
.
.
.
</html>})


I like it.



Eddie

Lutz

#8
- All the text-delimiting tags enclose data, so it would not be wise to change that data when reading and parsing the tags. Imagine you have the sequence [text]LFLFLFLF ... [/text], so the data starts with multiple linefeeds. If you strip away the first linefeed while reading and then serialize the data again to a file or memory, you would lose a character from your data every time you parse the [text],[/text] tags. The point is that text-delimiting tags may reformat for display, but should always serialize the data in a way that can be parsed back without change.



- We cannot really collapse {,} and [text],[/text] into only using {,}, because as text buffers grow very big, the probability of having unbalanced {,} inside is very high. This is easy to spot in smaller strings, but hard to watch for if you deal with a 50 KByte web page, which perhaps isn't even yours but read from somewhere else. The [text],[/text] tags are multi-character on purpose; it's a pain to type, but the probability of having a [/text] as part of the text is very low. This is why, for example, XML uses <![CDATA[ and ]]> as text data tags: they are hard to type, but pretty safe not to be part of normal text.



- {,} are convenient when using small portions of text with double quotes inside, as happens frequently in HTML portions or Tcl/Tk text. In Tcl/Tk {,} are fine because if they occur inside the string, they are always balanced for correct Tcl syntax.



- The possibility of defining your own delimiter tags doesn't work for newLISP, because in newLISP any object can be serialized to a file and should then be readable back in a standard way. If you let users define custom tags, you lose that ability. Serializing data objects to a file with (save 'symbol) is very important in newLISP as a convenient, quick way to save/reload data.
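The data-loss argument in the first point can be simulated in a few lines. This is a sketch only: `skip-first-newline` is a hypothetical stand-in for the proposed reader rule, and each loop pass stands for one parse/serialize cycle rather than a real (save ...) and (load ...) round trip.

```newlisp
;; Hypothetical stand-in for the proposed reader rule:
;; drop one leading newline, if present.
(define (skip-first-newline str)
  (if (starts-with str "\n")
      (slice str 1)
      str))

;; Data that legitimately starts with several linefeeds.
(set 'data "\n\n\npayload")
(println "before: " (length data) " characters")

;; Each pass stands for one "read the tags back in" step.
(dotimes (i 3)
  (set 'data (skip-first-newline data))
  (println "after cycle " (+ i 1) ": " (length data) " characters"))
```

Every cycle shortens the data by one character until no leading linefeed is left, which is the kind of silent change a serialization format should not make.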



To make a long story short, I think we leave it like it is.



Lutz

eddier

#9
Okidoki!



What if {} were made to handle text of indefinite length (just so it isn't limited to 2048 chars)? Then one could use {} for inline and [text][/text] for multiline. The only use I can see for the {} buggers is for pattern matching, since I can't use \n, \t, etc. within them. For example I cannot use

(print
  (append
    {<table>\n<tr>\n<td>\n}
    (format {<a href="%s">%s</a>\n} toGo clickable)
    (format { ...


But then again, I now use the convention



(define (<page> %w)
  (replace "`(.+?)`" %w (string (eval-string $1)) 0))

(print (<page> [text]Content-type: text/html

<html>
...
</html>[/text]))
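As a usage sketch of this convention: anything between backticks is evaluated and its result is spliced into the page. The helper is repeated here so the snippet runs on its own; `title` and the markup are made-up illustration values.

```newlisp
;; Same helper as in the post: replace each `expr` with the
;; string form of its evaluated result.
(define (<page> %w)
  (replace "`(.+?)`" %w (string (eval-string $1)) 0))

;; Hypothetical value to splice into the page.
(set 'title "Hello")

(set 'result (<page> [text]<h1>`title`</h1>
<p>2 + 3 = `(+ 2 3)`</p>[/text]))

(println result)
```

Because the template sits inside [text][/text], double quotes and braces in the markup need no escaping; only the backtick keeps a special meaning.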


I don't know, so maybe I'm beating a dead snake here?



Eddie

Lutz

#10
Making {,} handle unlimited length would be a performance hit for {,} parsing. The [text],[/text] tags call a different function for reading buffers of unlimited length, adjusting (reallocating) memory on the go, as required.



If I did this in {,} parsing, together with the balance checking of inner {,}, it would be a performance hit for {,}. At the moment the buffer for "," and {,} delimited tokens is allocated on the stack with a size of 2048. I could increase that number to, let's say, 16384, but then I would run out of stack much sooner when the 'C' getToken() is called recursively inside newLISP routines, which may happen quite often.
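To see what balance checking means for arbitrary text, here is a rough user-level sketch. It is not how the C tokenizer works; `braces-balanced?` is a made-up name, and the check simply walks the string one character at a time.

```newlisp
;; Hypothetical helper: true when every { has a matching } and no }
;; appears before its opening brace.
(define (braces-balanced? str)
  (let ((depth 0) (ok true))
    (dolist (ch (explode str))
      (if (= ch "{") (set 'depth (+ depth 1)))
      (if (= ch "}") (set 'depth (- depth 1)))
      ;; a closing brace before its opener can never be repaired later
      (if (< depth 0) (set 'ok nil)))
    (and ok (= depth 0))))

(println (braces-balanced? "body { color: red; }"))            ; balanced
(println (braces-balanced? [text]if (x) { return; [/text]))    ; missing }
(println (braces-balanced? "} {"))                             ; wrong order
```

Text that fails this check could not be written between {,} at all, which is where the [text],[/text] tags remain necessary.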



Lutz

eddier

#11
Tradeoffs Tradeoffs Tradeoffs ... We like small size and efficiency so we can live with it.



Thanks Lutz!



Eddie

lwix

#12
Quite true :-)

Thank you.
small's beautiful