Safe macros in newLISP - define-smacro

Started by itistoday, April 05, 2009, 07:54:29 PM


itistoday

Something that spawned from this thread (http://newlispfanclub.alh.net/forum/viewtopic.php?t=2712) was a discussion of the proper way of writing safe macros in newLISP.



As mentioned in the API documentation for the define-macro function, it's very easy to write a macro that suffers from variable capture.  Simply naming the arguments that the macro takes is the easiest way to shoot yourself in the foot, but even if you avoid that through the use of the 'args' function, hard-to-fix dangers (http://newlispfanclub.alh.net/forum/viewtopic.php?t=2712) still lurk.
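To make the capture problem concrete, here is a minimal sketch. The macro name naive-or is made up for this illustration; the behavior follows from newLISP's dynamic scoping, where the macro's local 'temp' shadows the caller's 'temp' while the macro body runs.

```newlisp
; naive-or is a hypothetical name used only for this sketch.
; Its local 'temp' is visible to any expression the caller passes in.
(define-macro (naive-or x y)
    (let (temp (eval x))
        (if temp temp (eval y))))

(set 'temp 45)

(println (naive-or nil 7))     ; no name clash, prints 7
(println (naive-or nil temp))  ; caller's temp is shadowed by the macro's temp, prints nil
```

The second call evaluates the caller's symbol 'temp' while the macro's own 'temp' is bound to nil, which is exactly the kind of capture the techniques in this thread try to avoid.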



From that discussion I've learned that the best way to write macros in newLISP is to:


  • Use the default function to encapsulate them in their own namespace
  • Use C-style naming conventions to avoid name conflicts with other contexts


    Lutz pointed out this great function to easily generate these safe macros:


    (define (def-static s body)
        (def-new 'body (sym s s)))

    (def-static 'my-or
        (fn-macro (x y)
            (let (temp (eval x)) (if temp temp (eval y)))))

    (setq temp 5)
    (my-or nil temp) => 5


    However, I think it would be a lot easier, and more in line with the existing define and define-macro functions, to have a define-smacro function that lets you use the same syntax to define safe macros.  I finally found a way to write this on my own, but it's neither pretty nor efficient:


    (define-macro (define-smacro _params)
        (let (_newCtx (_params 0))
            (eval-string (string (cons 'define-macro (cons (cons (sym _newCtx _newCtx) (rest _params)) $args))) _newCtx)))

    ; usage:
    (define-smacro (my-or x y)
        (let (temp (eval x))
            (if temp temp (eval y))))


    You can write your macros as if you were writing normal functions, using named arguments etc., and you don't ever have to worry about something breaking from variable capture.



    The only problem so far is that in my tests using define-smacro was 11 times slower than the def-static function (repeated 10000 times).  This makes sense because you have to build an expression, turn it into a string, and then evaluate that string.  You can't use newLISP's eval function to do this because it won't place the arguments in the correct context.



    Still, maybe this function might be useful to someone.  It would be *really* nice if it could be included natively in the standard library as then it would be very efficient... (*nudge* *wink*)  After all, why write unsafe macros when you could just as easily write safe ones? This would also give newLISP great bragging rights methinks. :-)
    Get your Objective newLISP groove on.

    Kazimir Majorinc

    #1
    I think it is premature to call it safe, because it doesn't solve the kind of problem Lutz and I described: a function calling itself and passing a local variable as a free variable.


    (set 'done nil)
    (define-macro (hard-example2 f)
          (for(i 1 3)
             (unless done          ; avoiding infinite
                 (set 'done true)  ; recursion
                 (hard-example2 (lambda(x)i)))  ; danger
     
             (println i " =>" (f i)))) ; which i will be printed?
                                       ; i=1 2 3 means inner i is overshadowed

    (hard-example2 (lambda(x)x))


    If you want that, you need some kind of gensym, like in my protect2 function.




    (println "\n\n\nNAIVE VERSION\n\n\n")





    (println "Test1")
    (define-macro (my-or)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))
           
    (set 'temp 45)
    (println "(my-or temp nil) = " (my-or temp nil) " (should be 45)")
    (println "(my-or nil temp) = " (my-or nil temp) " (should be 45)")
    (println "-----------")
    (set 'value (my-or (begin (println "first arg") 1) (begin (println "second arg") 2)))
    (println "should be 1: " value)
    (println "-----------")
    (set 'value (my-or (begin (println "first arg") nil) (begin (println "second arg") 2)))
    (println "should be 2: " value)



    (println "Test 2: should be 1 1 1 1 2 3")
    (set 'done nil)
    (define-macro (hard-example f)
          (for(i 1 3)
             (unless done          ; avoiding infinite
                 (set 'done true)  ; recursion
                 (hard-example (lambda(x)i)))  ; danger
     
             (println i " =>" (f i)))) ; which i will be printed?
                                        ; i=1 2 3 means inner is overridden

    (hard-example (lambda(x)x))





    (println "\n\n\nSMACRO VERSION\n\n\n")

    (define-macro (define-smacro _params)
       (let (_newCtx (_params 0))
          (eval-string (string (cons 'define-macro (cons (cons (sym _newCtx _newCtx) (rest _params)) $args))) _newCtx)
       )
    )

    (println "\n\n\nTest1")
    (define-smacro (my-or)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))
           
    (set 'temp 45)
    (println "(my-or temp nil) = " (my-or temp nil) " (should be 45)")
    (println "(my-or nil temp) = " (my-or nil temp) " (should be 45)")
    (println "-----------")
    (set 'value (my-or (begin (println "first arg") 1) (begin (println "second arg") 2)))
    (println "should be 1: " value)
    (println "-----------")
    (set 'value (my-or (begin (println "first arg") nil) (begin (println "second arg") 2)))
    (println "should be 2: " value)


    (println "\n\n\nTest 2: should be 1 1 1 1 2 3")
    (set 'done nil)
    (define-smacro (hard-example f)
          (for(i 1 3)
             (unless done          ; avoiding infinite
                 (set 'done true)  ; recursion
                 (hard-example (lambda(x)i)))  ; danger
     
             (println i " =>" (f i)))) ; which i will be printed?
                                        ; i=1 2 3 means inner is overridden

    (hard-example (lambda(x)x))






    (load "http://www.instprog.com/Instprog.default-library.lsp")


    (println "\n\n\nPROTECT1 VERSION\n\n\n")

    (println "\n\n\nTest1")
    (define-macro (my-or2)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))
           
    (set 'my-or2 (protect1 'my-or2 'my-or2 '(temp)))

    (set 'temp 45)
    (println "(my-or2 temp nil) = " (my-or2 temp nil) " (should be 45)")
    (println "(my-or2 nil temp) = " (my-or2 nil temp) " (should be 45)")
    (println "-----------")
    (set 'value (my-or2 (begin (println "first arg") 1) (begin (println "second arg") 2)))
    (println "should be 1: " value)
    (println "-----------")
    (set 'value (my-or2 (begin (println "first arg") nil) (begin (println "second arg") 2)))
    (println "should be 2: " value)


    (println "\n\n\nTest 2: should be 1 1 1 1 2 3")
    (set 'done nil)
    (define-macro (hard-example2 f)
          (for(i 1 3)
             (unless done          ; avoiding infinite
                 (set 'done true)  ; recursion
                 (hard-example2 (lambda(x)i)))  ; danger
     
             (println i " =>" (f i)))) ; which i will be printed?
                                        ; i=1 2 3 means inner is overridden


    (set 'hard-example2 (protect1 'hard-example2 'hard-example2 '(f i x)))
    (hard-example2 (lambda(x)x))





    (println "\n\n\nPROTECT2 VERSION\n\n\n")

    (println "\n\n\nTest1")
    (define-macro (my-or2)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))
           
    (set 'my-or2 (protect2 'my-or2 'my-or2 '(temp)))
           
    (set 'temp 45)
    (println "(my-or2 temp nil) = " (my-or2 temp nil) " (should be 45)")
    (println "(my-or2 nil temp) = " (my-or2 nil temp) " (should be 45)")
    (println "-----------")
    (set 'value (my-or2 (begin (println "first arg") 1) (begin (println "second arg") 2)))
    (println "should be 1: " value)
    (println "-----------")
    (set 'value (my-or2 (begin (println "first arg") nil) (begin (println "second arg") 2)))
    (println "should be 2: " value)

    (set 'done nil)
    (define-macro (hard-example2 f)
          (for(i 1 3)
             (unless done          ; avoiding infinite
                 (set 'done true)  ; recursion
                 (hard-example2 (lambda(x)i)))  ; danger
     
             (println i " =>" (f i)))) ; which i will be printed?
                                        ; i=1 2 3 means inner is overridden

    (println "\n\n\nTest 2: should be 1 1 1 1 2 3")
    (set 'hard-example2 (protect2 'hard-example2 'hard-example2 '(f i x)))
    (hard-example2 (lambda(x)x))




    (exit)


    I believe you already have more than enough tools and techniques to start a pretty large project and to solve problems as they come, and that it is impossible for the project to fail because of safety issues of that kind.
    WWW site: http://kazimirmajorinc.com/; blog: http://kazimirmajorinc.blogspot.com

    itistoday

    #2
    Quote from: "Kazimir Majorinc": I think it is premature to call it safe, because it doesn't solve the kind of problem Lutz and I described: a function calling itself and passing a local variable as a free variable.


    That's correct, I'm not claiming it is completely safe, but it is *much* safer than the current define-macro, and avoids many variable capture possibilities.



    While your protect macro is great, I wouldn't want to use it for all my macros because of how slow it is. I would just make sure, like Lutz said, to never have free variables lying around.


    Quote: I believe you already have more than enough tools and techniques to start a pretty large project and to solve problems as they come, and that it is impossible for the project to fail because of safety issues of that kind.


    I agree, but that doesn't mean we don't need a define-smacro. Again, why write unsafe macros when you could write more readable, faster, safer ones?

    Kazimir Majorinc

    #3
    I have protect1 as well; it provides about the same level of protection as define-smacro, and the result is about the same speed. protect2 provides more protection, and the result is slower.


    (load "http://www.instprog.com//Instprog.default-library.lsp")

    (define-macro (define-smacro _params)
       (let (_newCtx (_params 0))
          (eval-string (string (cons 'define-macro (cons (cons (sym _newCtx _newCtx) (rest _params)) $args))) _newCtx)
       )
    )

    (define-smacro (my-or1)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))

    (define-macro (my-or2)
       (let ((temp (eval (args 0))))
            (if temp temp (eval (args 1)))))

    (set 'my-or2 (protect1 'my-or2 'my-or2 '(temp)))

    (println (time (my-or1 3 4) 1000000)) ; => 614
    (println (time (my-or2 3 5) 1000000)) ; => 565


    Why write unsafe macros? Because they are simpler. You might prefer safety, and that is OK. But why in the core language? Smacro can, just like protect1 and protect2, live happily in a library where it can be checked and rechecked, improved, or abandoned without much harm. By contrast, a wrong feature in the core language can do a lot of harm. Of course, the core is faster, but that is not enough.

    itistoday

    #4
    Quote from: "Kazimir Majorinc": Why write unsafe macros? Because they are simpler.


    How are they simpler? There is no difference between the way you use define-macro and define-smacro.  The fact that define-smacro creates a namespace is insignificant, it's not even really a bad thing. There's nothing but advantages with define-smacro (as far as I can tell).



    Writing safe macros would be just as easy as writing unsafe ones with define-smacro.


    Quote: You might prefer safety, and that is OK. But why in the core language? Smacro can, just like protect1 and protect2, live happily in a library where it can be checked and rechecked, improved, or abandoned without much harm.


    define-smacro doesn't need to be checked or improved... at least there's not much room for improvement (currently).  It's just a shorthand for placing a macro in its own namespace.



    Why should the core language be safe? Isn't that a ridiculous question? I might agree if there was some severe problem with define-smacro, but there isn't, and I'm not even suggesting that it replace define-macro.



    Edit: The advantage of having it in the core language is that it would encourage more people to adopt its use, as well as providing an efficient method to create these safe macros.


    Quote: By contrast, a wrong feature in the core language can do a lot of harm. Of course, the core is faster, but that is not enough.


    What is wrong about define-smacro?

    Kazimir Majorinc

    #5
    Quote from: "itistoday": How are they simpler?


    The semantics are simpler, and semantics are important for a language like newLISP.


    Quote: Why should the core language be safe? Isn't that a ridiculous question? I might agree if there was some severe problem with define-smacro, but there isn't, and I'm not even suggesting that it replace define-macro.


    Quote: Edit: The advantage of having it in the core language is that it would encourage more people to adopt its use, as well as providing an efficient method to create these safe macros.


    I think that, as safe macros, these are not the best that can be achieved with newLISP. My criticism is that they have more complicated semantics than necessary, they do not provide full safety, and they do not protect anonymous macros and functions. Have you checked my protect1 and protect2? I think they do all of that. If they do not, something like that can be built.



    Of course, you can say that safe macros as you define them do no harm, because they are only a shorter name for things already done by contexts. That is true, but contexts are a more general concept that cannot be reduced to safe macros. Using contexts for safe macros is one thing; defining safe macros through contexts in the core language is another.



    But again, as a library, they fit quite well. What's wrong with a library?

    Lutz

    #6
    Here is a short definition of 'define-smacro', as asked for by itistoday; a similar one was last published in the 9.2 version of the manual:


    ; for macros/fexprs (you could also do this for normal functions)
    (define-macro (define-smacro)
        (let (temp (append (fn-macro) (list (1 (args 0)) (args 1))))
            (def-new 'temp (sym (args 0 0) (args 0 0)))))

    ; usage
    (define-smacro (foo x y z) (.....))


    As earlier in this thread, 'def-new' is used to copy an s-expression with all its variables into a namespace.



    The effect of using 'define-smacro' is the same as doing:


    (context 'foo)
    (define-macro (foo x y z) (...))
    (context MAIN)


    ... putting 'x','y' and 'z' into a namespace 'foo' and the function into 'foo:foo'.



    A minor disadvantage of using 'define-smacro' is that the symbols 'x', 'y' and 'z' will also occur as symbols in the MAIN symbol space. The explicit method of writing the context frame avoids this and may be preferable when writing many macros. Normally macros (really fexprs) are not used that much in newLISP, so the additional symbols created by using 'define-smacro' will not have much impact on the memory used.



    Earlier in this thread, Kazimir observed a small (about 10%) speed difference between a normal macro and a macro contained in a default functor.



    The default functor takes a little bit longer, because it gets resolved during runtime. If instead of calling (foo x y z), you would call (foo:foo x y z), there would be no speed difference. In most cases this little difference can be neglected, and the bigger the macro the less the speed difference will be felt.
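A rough way to check this on your own machine is sketched below. It assumes a my-or macro wrapped in a context of the same name, as discussed earlier in the thread; the iteration count is arbitrary, and the timings will differ between systems and newLISP versions, so no numbers are claimed here.

```newlisp
; my-or lives in its own context and is reachable as a default functor
(context 'my-or)

(define-macro (my-or:my-or x y)
    (let (temp (eval x))
        (if temp temp (eval y))))

(context MAIN)

; call through the default functor, which is resolved at runtime
(println (time (my-or 3 4) 1000000))

; call the fully qualified symbol directly
(println (time (my-or:my-or 3 4) 1000000))
```

Both lines print elapsed milliseconds for one million calls, so the two figures can be compared directly.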



    But for the utmost performance, here is a technique to encapsulate several 'define-macro's in one namespace without using default functors, while still using a simple function name without the colon ':':


    (context 'util)

    (define-macro (foo x y z) (...))

    (define-macro (bar x y z) (...))

    (context MAIN)

    (constant (global 'foo) util:foo)
    (constant (global 'bar) util:bar)

    ; call the macros

    (foo x y z)
    (bar a b c)


    The 'define-macro's can now be called with their simple global alias, without the 10% speed drop when using default functors and additionally they are protected.



    This is a useful method to include a ton of macros in just one namespace.



    I would use this only for core-utility functions, that is, functions we want to see as part of the core language. In other cases the prefix:function syntax with the visible ':' colon is an advantage, because it visibly structures the code into semantic functional areas.
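A filled-in version of the alias technique might look like the sketch below. The macro name either is hypothetical and stands in for the elided foo and bar bodies; it is only meant to show that the caller's 'temp' in MAIN does not collide with the macro's own 'temp', which lives in util.

```newlisp
(context 'util)

; 'either' is a hypothetical macro name used only for this sketch;
; x, y and temp are created as util:x, util:y and util:temp
(define-macro (either x y)
    (let (temp (eval x))
        (if temp temp (eval y))))

(context MAIN)

; global alias in MAIN that points at the macro stored in util
(constant (global 'either) util:either)

(set 'temp 45)
(println (either nil temp))  ; MAIN:temp is not shadowed by util:temp, prints 45
```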

    itistoday

    #7
    Quote from: "Lutz": Here is a short definition of 'define-smacro', as asked for by itistoday; a similar one was last published in the 9.2 version of the manual:


    ; for macros/fexprs (you could also do this for normal functions)
    (define-macro (define-smacro)
        (let (temp (append (fn-macro) (list (1 (args 0)) (args 1))))
            (def-new 'temp (sym (args 0 0) (args 0 0)))))

    ; usage
    (define-smacro (foo x y z) (.....))


    Thanks Lutz, I'm surprised I didn't see that possibility, I guess it's because I wasn't very familiar with the 'def-new' function.  Would you consider adding a native version to the core at some point in the future? I would write it myself but it will take me a while to learn the innards of newLISP and at least right now I'm strapped for time.. (also, no offense, but your indentation style throws me off a bit, I've never seen it before..)


    Quote: The effect of using 'define-smacro' is the same as doing:


    (context 'foo)
    (define-macro (foo x y z) (...))
    (context MAIN)


    ... putting 'x','y' and 'z' into a namespace 'foo' and the function into 'foo:foo'.


    While playing around with it, I made the define-smacro symbol global and attempted to use it while in another context and got some weird results, was wondering if you could explain what's going on here (newLISP 10.0.3):


    > (define-smacro (mtest x y) (println "x=" (eval x) " y=" (eval y)))
    mtest:mtest
    > (context 'FOO)
    FOO
    FOO> mtest:mtest
    (lambda-macro (mtest:x mtest:y) (println "x=" (eval mtest:x) " y=" (eval mtest:y)))
    FOO> (define-smacro (test x y) (println "x=" (eval x) " y=" (eval y)))
    test:test
    FOO> test:test

    ERR: context expected : FOO:test
    > test:test
    (lambda-macro (FOO:x FOO:y) (println "x=" (eval FOO:x) " y=" (eval FOO:y)))
    > FOO:test
    nil
    > (context 'FOO)
    FOO
    FOO> (test 1 2)

    ERR: invalid function : (FOO:test 1 2)
    > (context 'FOO)
    FOO
    FOO> (test:test 1 2)

    ERR: context expected : FOO:test


    Quote from: "Lutz": A minor disadvantage of using 'define-smacro' is that the symbols 'x', 'y' and 'z' will also occur as symbols in the MAIN symbol space. The explicit method of writing the context frame avoids this and may be preferable when writing many macros. Normally macros (really fexprs) are not used that much in newLISP, so the additional symbols created by using 'define-smacro' will not have much impact on the memory used.


    I'm not sure I follow you here. When I used define-smacro as shown above, all of the parameters 'x', 'y', etc. seemed to be in the proper mtest context; it was only when I tried to define a safe macro while in a context other than MAIN that weird things happened.

    Lutz

    #8
    The current 'define-smacro' should only be used from the MAIN context.



    When the following code is parsed/translated:


    (define-smacro (foo x y z) ..)

    The symbols 'foo', 'x', 'y' and 'z' are first created in the current context. When the expression is evaluated, 'foo:foo', 'foo:x', 'foo:y' and 'foo:z' are created. 'x', 'y' and 'z' now exist in both namespaces, MAIN and 'foo'.
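One way to see the duplicated symbols is to inspect the symbol tables after a definition, as in the sketch below. The macro name pair-up and the parameter names alpha and beta are made up for this illustration; the define-smacro definition is the one posted earlier in this thread. The parameter names are looked up as strings so that the check itself does not create the symbols in MAIN.

```newlisp
; define-smacro as posted earlier in this thread
(define-macro (define-smacro)
    (let (temp (append (fn-macro) (list (1 (args 0)) (args 1))))
        (def-new 'temp (sym (args 0 0) (args 0 0)))))

; pair-up, alpha and beta are hypothetical names used only for this sketch
(define-smacro (pair-up alpha beta)
    (list (eval alpha) (eval beta)))

; look for the copy the reader left behind in MAIN (an index, or nil if absent)
(println (find "alpha" (map string (symbols MAIN))))

; list the symbols that def-new created in the new namespace
(println (symbols pair-up))
```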



    In your experiments inside of the 'mtest' namespace 'define-smacro' is trying to make a context out of 'mtest:test' which rightfully causes the error message, because 'test' already exists locally in 'mtest' after parsing the 'define-smacro' expression.



    A native implementation would easily avoid duplicates and create the global static macro from wherever you want, but this is not desirable from a code-hygiene point of view in bigger projects and programming teams. A global function/macro created using 'define-smacro' feels like an extension to the core language because it's global.



    A programmer working in his dedicated assigned namespaces shouldn't be allowed to define new global symbols in the MAIN space. The MAIN space is maintained by somebody else, who is probably the lead programmer in the team, knowing about the entire architecture of the program. It is asking for name-clashes in the MAIN space if several programmers write global macros from inside their namespaces.



    ps: see also here: http://www.newlisp.org/newlisp_manual.html#context_rules



    ps2: we could change the current definition to:


    (define-macro (define-smacro)
        (if (!= MAIN (context)) (throw-error "Not in MAIN"))
        (let (temp (append (fn-macro) (list (1 (args 0)) (args 1))))
            (def-new 'temp (sym (args 0 0) (args 0 0)))))

    itistoday

    #9
    Quote from: "Lutz": The current 'define-smacro' should only be used from the MAIN context.



    When the following code is parsed/translated:


    (define-smacro (foo x y z) ..)

    The symbols 'foo', 'x', 'y' and 'z' are first created in the current context. When the expression is evaluated, 'foo:foo', 'foo:x', 'foo:y' and 'foo:z' are created. 'x', 'y' and 'z' now exist in both namespaces, MAIN and 'foo'.


    OK, after looking at the output of (symbols) I see that now, but yeah, that really doesn't seem like much of an issue (to me at least), as it doesn't affect anything other than a tiny amount of memory.


    Quote from: "Lutz": In your experiments inside of the 'mtest' namespace 'define-smacro' is trying to make a context out of 'mtest:test' which rightfully causes the error message, because 'test' already exists locally in 'mtest' after parsing the 'define-smacro' expression.


    I think you may have misread, mtest is the macro being created, FOO is the context that I attempted to create an unrelated 'test' macro in, and that seemed to fail for reasons that I still don't understand. Is it just that newLISP doesn't like creating contexts while not in the MAIN namespace..?  Attempting to evaluate 'test' in the FOO context generated an odd error.


    Quote from: "Lutz": A native implementation would easily avoid duplicates and create the global static macro from wherever you want, but this is not desirable from a code-hygiene point of view in bigger projects and programming teams. A global function/macro created using 'define-smacro' feels like an extension to the core language because it's global.


    I did not suggest that define-smacro make the macros it creates global, it would in fact be no different from what you have defined, except optimized in C.



    My request for a native define-smacro is simply based on two things: it would encourage the use of it (which is a good thing for the reasons previously stated), and it would speed it up a bit.

    Lutz

    #10
    Sorry, I misread some of it. The first error is caused by the pre-existence of 'FOO:test' from parsing, as explained. newLISP then assumes that the symbol to the left of the colon in 'test:test' is a variable 'FOO:test' holding a context:


    (set 'BAR:test 123)
    (context 'FOO)
    (set 'test BAR)

    test:test => 123 ; refers to BAR:test


    The next phenomenon, that 'x' and 'y' are from context 'FOO' rather than 'test:x' and 'test:y', has to do with the rules for 'def-new'. It only moves variables of subexpressions in the source to the new context in the target if they come from the source context. But 'temp' was defined in MAIN, so MAIN is the source context, and 'FOO:x' and 'FOO:y' are external to it. 'def-new' must behave this way so that it will not move references that are external to the source context. E.g.:


    (context 'FOO)
    (set 'foo '(x y MAIN:z))

    (def-new 'foo 'BAR:bar)
    (context 'MAIN)

    (context BAR)
    bar ;=> (x y MAIN:z)


    'x' and 'y' are moved from source FOO to target BAR, but MAIN:z was external to the source and is not moved.





    To make a long story short, the current 'define-smacro' can only be used from MAIN.

    itistoday

    #11
    OK gotcha, thanks for the explanation!