Messages - hartrock

#91
Thanks: it works like a charm :-)

Could be a speed record from making a proposal to getting it done... :-)
#92
[update] -> 10.5.6.



While writing some code to ease navigation in lists, I stumbled upon something. Let the code speak for itself.

Assume there is a definition of ref-parent and an example list:



;; nil for non-existing parent

(define (ref-parent r)
  (if (null? r) nil (0 -1 r)))
;;
(setq l_nested '("1" "2" "3"
                  ("4" "5" "6"
                   ("7" "8" "9"))
                 "10" "11" "12"
                  ("13" "14" "15"
                   ("16" "17" "18")))
      )

Now it's possible to eval:

> (l_nested (ref-parent (ref "14" l_nested)))
("13" "14" "15" ("16" "17" "18"))

; but it is not possible to eval:

> (l_nested (ref-parent (ref "11" l_nested)))

ERR: missing argument

The reason is that:

> (ref-parent (ref "14" l_nested))
(7)

; but:

> (ref-parent (ref "11" l_nested))
()

: here '() is not accepted as a deref argument.

What about allowing '() as a deref argument, which dereferences to the very list it is applied to?

Then the following would work:

> (l_nested (ref-parent (ref "11" l_nested)))
("1" "2" "3" ("4" "5" "6" ("7" "8" "9")) "10" "11" "12" ("13" "14" "15" ("16" "17" "18")))


This seems to be a neat unification regarding dereferencing lists by refs.
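To make the proposed semantics concrete, here is a minimal sketch in C rather than newLISP. The Node type and the deref function are hypothetical illustrations, not interpreter code: a ref is treated as a path of indices, and the empty path, which corresponds to '(), simply yields the list it is applied to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: a leaf carries text, a list carries children. */
typedef struct Node {
    const char *text;              /* non-NULL for leaves */
    const struct Node *children;   /* non-NULL for lists */
    size_t count;                  /* number of children */
} Node;

/* Follow an index path starting at root.
   An empty path (len == 0) yields root itself, like the proposed '() deref.
   Returns NULL if an index is out of range or the path steps into a leaf. */
const Node *deref(const Node *root, const size_t *path, size_t len)
{
    const Node *cur = root;

    assert(root != NULL);
    for (size_t i = 0; i < len; i++) {
        if (cur->children == NULL || path[i] >= cur->count)
            return NULL;
        cur = &cur->children[path[i]];
    }
    return cur;
}
```

With this convention the parent of a top-level element is reached by the same call as any other parent: drop the last index of the path and dereference, even if nothing is left of the path.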



Because ref'ing a non-existing element gives:

> (ref "not there" l_nested)
nil

; '() seems to be free for this purpose.

If there is:

> (ref-all "not there" l_nested)
()

, we get a '(). But this is no problem, because for dereferencing refs obtained this way we have to iterate through this list, which would do nothing here.



Are there any problems with this proposal, which I don't see?



What do you think?



Feedback on this proposal is appreciated (as usual).
#93
Quote from: "Lutz"
Looking at the pointer values passed and assigned and the resultStackIdx, I see that the sequence of assignments is correct. First: args = arrayList(args, FALSE) second: *(++resultStackIdx) = args



But incrementing resultStackIdx only works the first time, the second time entering the apply statement, resultStackIdx doesn't increment and is stuck at its old value forever, no matter how often I enter the apply statement. Now the old pointer is overwritten again and again and has never a chance to get processed by popResult() higher up in evaluateExpression() for freeing cells and memory.



So it seems not to be a sequence point problem, but a problem with the ++ operator in gcc but only under certain circumstances.

It's not a gcc problem.

The macro:
newlisp.h:#define pushResult(A) (*(++resultStackIdx) = (UINT)(A))

, called as:
pushResult(args = arrayList(args, FALSE));
, which will be expanded to:
(*(++resultStackIdx) = (unsigned long)(args = arrayList(args, 0)));
, has side effects! As far as I know, C does not specify the order in which the two operands of an assignment are evaluated, so the compiler may freely perform ++resultStackIdx before calling arrayList().

The problem is that resultStackIdx is changed both by ++resultStackIdx and by calling arrayList(). The call chain is arrayList() -> copyCell() -> popResult() (the last one changes resultStackIdx).

So what happens depends on the order in which resultStackIdx is changed: moving the arrayList() call into a separate statement before ++resultStackIdx enforces the correct order (as already done).

As a rule, it could be stated: it is safest to pushResult() only a pointer that has already been computed (it is not visible which call could have side effects on the global resultStackIdx, and the called function may change later). A single developer may break this rule if he knows that there are and will be no side effects from a call ;-)
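
The hazard and the rule can be reproduced with a toy result stack. This is a sketch under assumptions, not newLISP's actual code: result_stack, PUSH_RESULT and make_list are made-up stand-ins for the result stack, pushResult() and arrayList(), and the stack uses an index instead of a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy result stack; all names are hypothetical stand-ins. */
static long result_stack[8];
static size_t result_top;            /* index of the topmost used slot */

#define PUSH_RESULT(A) (result_stack[++result_top] = (long)(A))

/* Stand-in for arrayList(): somewhere down its call chain one entry is
   popped, so the call changes result_top as a side effect. */
static long make_list(void)
{
    assert(result_top > 0);
    --result_top;
    return 42;
}

/* Start every experiment from the same state: one old entry (7) on the stack. */
void reset_stack(void)
{
    result_top = 0;
    PUSH_RESULT(7);
}

/* Fragile: ++result_top and make_list() are operands of one assignment,
   and C leaves their relative evaluation order unspecified. */
long push_in_one_expression(void)
{
    PUSH_RESULT(make_list());
    return result_stack[result_top];
}

/* Robust: the call has finished before the macro touches result_top. */
long push_precomputed(void)
{
    long value = make_list();
    PUSH_RESULT(value);
    return result_stack[result_top];
}
```

Starting from reset_stack(), push_precomputed() always returns 42. As far as I understand the C standard, push_in_one_expression() may return 42 or may leave the old entry 7 on top (with 42 stored in a slot above the top, where it is never processed), depending on which operand the compiler evaluates first.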
#94
newLISP newS / not a gcc bug
November 22, 2013, 09:55:53 PM
[update2] gcc is not guilty; see reply below. Wrong code example has led me to the problem...

[update] Wrong code example removed.



Note:

It seems to be difficult to create an example with minimal code that triggers this bug (I have tried for a while), so I don't know how to report this bug to the gcc maintainers (any idea?).



[update2] No wonder this is difficult, if there is no bug...
#95
newLISP newS / Re: newLISP Development Release v.10.5.5
November 22, 2013, 11:40:08 AM
Quote from: "Lutz"
Thanks, yes it is definitely a sequence point issue. The following works on all platforms and is the way it is coded now:



if(args->type == CELL_ARRAY)
    {
    args = arrayList(args, FALSE);
    pushResult(args);
    }

This is the way I had it coded first and tested everywhere. Then changed it to the short form which still worked on my OSX 10.9 Clang based development system. But would break on gcc based distribution compiles and was not tested again :(

Where do you see the undefined behavior regarding sequence points? I've tried to find it in
Quote from: "hartrock"
(*(++resultStackIdx) = (unsigned long)(args = arrayList(args, 0)));

with no success (so far).

From what I've read, it seems that the return from a function (or at least a point near it) is a sequence point: this is why I think the assignments should work from right to left after calling the function (and they must not use the old value of args). My understanding may be wrong here, though (it's not the easiest topic).

This topic is of practical interest for C programmers in the sense of 'how to code and what to avoid' rules.

If there is no undefined behavior, then gcc has a bug here, which would be of interest, too.


Quote from: "Lutz"
To your question about memory management of the stuff pointed to by args pointer:

...

Thanks for your explanations: this helps in understanding the workings of the interpreter.
#96
newLISP newS / [fix] apply mem leak
November 21, 2013, 04:49:48 PM
After some trial and error I've found a solution for apply, which works for me.



Try this at the beginning of p_apply():
CELL * p_apply(CELL * params)
{
CELL * expr;
CELL * args;
CELL * cell;
CELL * result;
CELL * func;
ssize_t count, cnt;
UINT * resultIdxSave;
CELL * tmp;

func = evaluateExpression(params);

cell = copyCell(func);
expr = makeCell(CELL_EXPRESSION, (UINT)cell);
params = getEvalDefault(params->next, &args);

 if(args->type == CELL_ARRAY) {
   /* pushResult(args = arrayList(args, FALSE)); */
   /* pushResult((args = (arrayList(args, FALSE)))); */ /* does not work either */
   /* but this is OK */
   tmp = arrayList(args, FALSE);
   pushResult(args = tmp);
 }
/* ... */

After finding the macro:
newlisp.h:#define pushResult(A) (*(++resultStackIdx) = (UINT)(A))

, which will be expanded to:
(*(++resultStackIdx) = (unsigned long)(args = arrayList(args, 0)));
and reading some discussions about C sequence points related to assignments, I nevertheless think that this should work. But possibly the compilers are pushing the old value of args onto the stack.



Another topic:

What remains unclear to me: when and how will the args pointer, which is overwritten by the list generated by arrayList, be freed?
#97
newLISP newS / Re: newLISP Development Release v.10.5.5
November 20, 2013, 06:55:03 PM
Quote from: "Lutz"
There is no cell leak or memory leak :)



array-list creates a new list but it gets popped off from the result stack in evaluateExpression() after the function returns. The cleanupResults() happens only for the results from the individual apply operations when it had the reduce parameter during the execution of apply. This is necessary to avoid result stack overflow on long lists or arrays. All other minor result stack cleanup is done after function return.

Thanks for the explanation: this is good for understanding interpreter implementation.



But there is a leak somewhere...
#98
newLISP newS / Re: newLISP Development Release v.10.5.5
November 20, 2013, 06:41:35 PM
More condensed (triggering the same problem):
(set 'a (array 10000 (sequence 0 10000)))
(time (apply + (0 1000 a)) 10000)

Note: 10000 iterations may be over too quickly.
#99
newLISP newS / Re: newLISP Development Release v.10.5.5
November 20, 2013, 06:20:30 PM
Try this:

(set 'a (array 10000 (sequence 0 10000)))
;;
(define (til-stats cont divisor)
  (let ((tilSize (/ (length cont) divisor))
        (tils (array divisor)))
    (dotimes
     (chunkIx divisor)
     (let (off (* chunkIx tilSize))
       (++ (tils chunkIx) (apply + (off tilSize cont)))))
    tils))
(define (centil-stats cont)
  (til-stats cont 10))
;;
(time (centil-stats a) 10000)

This triggers a memory leak that grows very fast.
#100
newLISP newS / Re: newLISP Development Release v.10.5.5
November 20, 2013, 05:52:31 PM
I can confirm that running your example gives no memory leak.

On the other hand, I trust the 'top' command when running two variants of my code, one with and one without array-list (no other changes):

 PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND          
21814 sr        20   0 22260 3308 1144 R  99.4  0.2   2:46.76 newlisp          
21815 sr        20   0 26612 7796 1144 R  99.1  0.4   2:45.19 newlisp          

The code is too long to post as an example: I'll try to reproduce the effect with minimal code (stay tuned).
#101
There seems to be a memory leak in your apply implementation for arrays when array-list is not used: after removing array-list, apply leads to continuously increasing memory usage in some code of mine.

From diffing the code against 10.5.4, I think the culprit may be that the additional list is not popped if there is an array arg (see ';;->' markers):

CELL * p_apply(CELL * params)
{
CELL * expr;
CELL * args;
CELL * cell;
CELL * result;
CELL * func;
ssize_t count, cnt;
UINT * resultIdxSave;

func = evaluateExpression(params);

cell = copyCell(func);
expr = makeCell(CELL_EXPRESSION, (UINT)cell);
params = getEvalDefault(params->next, &args);

if(args->type == CELL_ARRAY)
        pushResult(args = arrayList(args, FALSE));
;;-> here a list will be generated, but ..
if(args->type != CELL_EXPRESSION)
    {
    if(isNil(args))
        {
        pushResult(expr);
        return(copyCell(evaluateExpression(expr)));
        }
    else
        return(errorProcExt(ERR_LIST_EXPECTED, args));
    }

if(params != nilCell)
    getInteger(params, (UINT *)&count);
else count = -1;
if(count < 2) count = MAX_LONG;

resultIdxSave = resultStackIdx + 2;
;;-> .. is this taken into account here?
...

Just an idea on seeing this code for the first time...
#102
Hello Ted,



I've sent you an email with the code via the User Panel. Because it's my first mail here and it stays in the Outbox (perhaps it disappears from there once you have seen it?), I think it cannot hurt to give you a hint.



Best regards,

Stephan
#103
Quote from: "TedWalther"
Can you explain your issue a bit better hartrock?


My original motivation for starting this has been twofold:

  • Being able to enter the interpreter instead of exit:
    → possible by reset or throw (don't know the differences so far) together with error handler (and starting with v10.5.5 reset even without error handler).
  • Having a separation between CL args for interpreter and script:
    → possible by correctly using exit or the former mechanism for entering the interpreter during development.

So this thread has changed to [HowTo].

Next is to clean it up a bit to avoid confusion: most importantly, removing the stupid idea of using the special opt '--' for separating CL args between interpreter and script (which came to mind without knowing the first mechanism above and without knowing the special semantics of '--').

[4. update]: Cleanup done.

BTW: I have extended/patched getopts.lsp (injected from the outside after loading it) to have shortlongopt: this leads to a grouping of corresponding short and long opts in usage:
sr@free:~/Ideas/Spiel$ ./gol -h
Usage: gol [options]
  -c, --cycles INT                  num of cycles
  -r, --rounds INT                  num of rounds per cycle
  -a, --players-at-all INT          players at all
  -g, --players-per-game INT        players per game
  -s, --start-balance INT           start balance for each player
  -l, --lower-risk-limit FLOAT      [opt] 0.0 <= FLOAT <= 1.0 (default: 0.0)
  -u, --upper-risk-limit FLOAT      [opt] 0.0 <= FLOAT <= 1.0 (default: 1.0)
  -d, --distribution-per-cycle FLOAT  [opt] 0.0 <= FLOAT <= 1.0 (default: 0.0) fraction of balance to be distributed
  -h, --help                        Print this help message

(printing could be improved further). Maybe you are interested.
#104
Ted, you are right!

I didn't know about this '--' behavior.

From 'man getopts':
Quote
Each parameter not starting with a `-', and not a required argument of a previous option, is a non-option parameter. Each parameter after a `--' parameter is always interpreted as a non-option parameter. If the environment variable POSIXLY_CORRECT is set, or if the short option string started with a `+', all remaining parameters are interpreted as non-option parameters as soon as the first non-option parameter is found.

This is exactly what getopts does:
#!/usr/bin/newlisp

(module "getopts.lsp")

(shortopt "a" (println "'a' opt") nil "works as expected")
(shortopt "h" (println (getopts:usage)) nil "usage")

;; Do *not* try this (action won't be triggered):
;; (shortopt "-" (println "'-' opt") nil "stops parsing CLI opts") ; Do *not* try this!

(getopts (2 (main-args)))

(println "After calling getopts.")
(exit)


sr@free:~/NewLisp$ newlisp getopts_bug.lsp -a --
'a' opt
After calling getopts.
sr@free:~/NewLisp$ newlisp getopts_bug.lsp -- -a
After calling getopts.
sr@free:~/NewLisp$

In the first case '-a' triggers its action; in the second case it is ignored.
#105
Anything else we might add? / Re: apply for arrays?
November 17, 2013, 07:56:27 PM
Quote from: "Lutz"
Both map and apply also take arrays in the next version, but will do array -> list conversion internally to keep code changes / additions to a minimum leveraging existing code.

This sounds good!

Some functions have different performance characteristics for lists and arrays: e.g. an assoc for arrays should be slower for large arrays compared to lists. In such cases there is a trade-off, because there are two possibilities for making such a function applicable to the other data type:

  • Use the same name: comfortable to use (unification), but may create performance surprises if a list arg gets an array or vice versa;
  • use another name: less comfortable to use, but it is explicit that a list or an array is expected, and the other data type is excluded.

My point has been, under the assumption that arrays exist just for performance reasons, that this kind of trade-off needn't apply to apply, if it is implemented accordingly. So far I had seen array-list as a result of the intent to make computation effort visible (you explicitly have to convert), and not as something that has to be there.