newlisp -x does not play well with -m or -s

Started by CaveGuy, January 08, 2015, 03:39:47 PM


CaveGuy

Problem: programs linked with newlisp -x can eat all available memory, blow their stack and DoS a system unless they are constrained. Because -m and -s do not play nicely with -x, that constraint is hard to achieve.



test.lsp
(println "maxheap " (sys-info 1) " maxstack " (sys-info 5))
(exit)


newlisp -x test.lsp test

chmod 755 ... and run ./test

maxheap 576460752303423488 maxstack 2048

max heap is huge and stack is assumed small for this example.



newlisp -m256 -s4096 test.lsp

maxheap 8388608 maxstack 4096


maxheap is more reasonable and maxstack is larger



newlisp -m256 -s4096 -x test.lsp test2

returns with no error and never makes test2



newlisp -x test.lsp test2 -m 256 -s 4096

also returns without an error yet never makes test2



This problem became apparent to me when one of my linked programs went into an endless loop.

It ate a lot of memory, DOSing the server before it was discovered and killed off.

What got me started was the migration from newLISP v.10.1.0 on Win32 IPv4, using link.lsp, to newLISP v.10.6.2 64-bit on Linux IPv4/6 UTF-8 libffi [Linux 3.13.0-43-generic on x86_64], using -x.



I also tested  newLISP v.10.6.0 32-bit on Linux IPv4/6 libffi, options: newlisp -h on [Linux 3.13.0-39-generic on i686] -x worked the same [no surprise].
Bob the Caveguy aka Lord High Fixer.

rrq

#1
Notably, in the embedded executable the source is loaded before the other command line options, whereas otherwise it is loaded after the options (well, in the order it occurs on the command line).

E.g., if you make the "main body" be in a prompt-event (ending with an exit) rather than performed as part of source loading, you can provide -m and -s to the embedded executable before this "main body" is performed. But, I couldn't then make it avoid the NewLISP blurb.
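
A minimal sketch of the prompt-event approach described above, assuming the linked source is loaded first and the prompt handler only fires after -m and -s have been processed. The name main-body is made up for illustration; prompt-event, sys-info and exit are the built-in functions.

```newlisp
; deferred main body: nothing below runs at load time
(define (main-body)
  (println "maxheap " (sys-info 1) " maxstack " (sys-info 5))
  (exit))

; the handler receives the current context as its argument;
; it never returns a prompt string here because main-body exits
(prompt-event (fn (ctx) (main-body)))
```

Linked with newlisp -x and started as ./test -m256 -s4096, this should report the constrained values, with the start-up banner still printed first as noted above.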

In any case, it would be nice if -m and -s options also could be provided in the source; e.g. as some variant of reset.

Lutz

#2
Yes, as Ralph says, the linked code gets loaded before the command line parameters are read. The linked source is treated like init.lsp.



But you could do the following trick:

; linked-in test.lsp
(define (run)
    (println (sys-info))
    (exit)
)

now this works:

~> newlisp -x test.lsp test
~> chmod 755 test
~> ./test -s10240 -m10 -e'(run)'
(505 327680 409 3 0 10240 0 57739 10600 1411)
~>

CaveGuy

#3
I like Ralph's idea: "In any case, it would be nice if -m and -s options also could be provided in the source; e.g. as some variant of reset."



The ability to provide a "modified" (main-args) to the (reset) function just might do the trick.

either as a modified main-args list (reset '("-s10240" "-m10")) or as a command string (reset  "-s10240 -m10")



While the test example works using your command example when there are no main-args to be processed, the simple uppercase.lsp example is ugly.



If one could save off any desired input args then (reset new-args) to adjust the environment that would be great !
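
Saving off the input args could look like the sketch below. split-args is a hypothetical helper, not part of newLISP, and the (reset new-args) form is still only a proposal, so the sketch stops at separating interpreter options from user arguments.

```newlisp
; split a main-args style list into (options plain-arguments)
; assumption: anything starting with "-" is an interpreter option,
; so a user string that starts with "-" would be misclassified
(define (split-args args)
  (let (opts '() plain '())
    (dolist (a (rest args))          ; skip the program name
      (if (starts-with a "-")
          (push a opts -1)           ; append to keep the original order
          (push a plain -1)))
    (list opts plain)))

(println (split-args '("./uppercase" "-s10240" "-m10" "string to convert")))
```

The second element of the result is what a program like uppercase would keep and process after any restart.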



My desire is to keep the command line clean: ~# uppercase  "string to convert"

and not ~# ./uppercase -s10240 -m10 -e'(run "string to convert")'

uppercase.lsp
(define (run instr)
   (println (upper-case instr))
   (println "main-args " (main-args))
   (println "heap " (sys-info 1) " stack " (sys-info 5))
   (exit))

produced when linked:

STRING TO CONVERT

main-args ("./uppercase" "-s10240" "-m10" "-e(run "string to convert")")

heap 327680 stack 10240



Which although it works, provides a mod_rewrite challenge to make it user friendly.

Lutz

#4
In upcoming release version 10.6.2, the reset function can be used to change the maximum cell memory used by changing the max cell count during program run.



see here: http://www.newlisp.org/downloads/development/inprogress/newlisp_manual.html#reset



This can be done at any point in a program without restarting the system. During program run sys-info could be used to watch the current cell count.
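
Watching the cell count from inside a program could look like this sketch. check-cells and its soft-limit parameter are made-up names; (sys-info 0) is the current cell count and (sys-info 1) the maximum, as in the listings in this thread.

```newlisp
; raise an error once more than soft-limit cells are in use,
; otherwise return the current cell count
(define (check-cells soft-limit)
  (let (used (sys-info 0))
    (if (> used soft-limit)
        (throw-error (string "cell budget exceeded: " used " cells in use"))
        used)))

(println "cells in use " (check-cells 1000000000) " of max " (sys-info 1))
```

Calling (check-cells some-limit) inside a long-running loop gives a soft limit that can be caught with catch, independent of the hard limit set by -m or reset.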



I did not implement resetting the maximum stack usage, which would require a partial system restart and is much more involved. The current reset addition was very small.

CaveGuy

#5
thanks for the quick response, close but no cigar :)


test.lsp
(println "sys-info " (setq a (sys-info)))
(reset 512)
(for (x 1 1000) (setq a (append (list x) a)))
(println "sys-info " (sys-info))

(exit)


:~# newlisp

newLISP v.10.6.2 64-bit on Linux IPv4/6 UTF-8 libffi, options: newlisp -h

> (load "test.lsp")

sys-info (448 576460752303423488 410 4 0 2048 0 6131 10602 1409)

sys-info (1456 512 411 3 0 2048 0 6131 10602 1409)



oops 1456 used with a max of 512 and no error, not even a blink ....


test2.lsp
(println "sys-info " (sys-info))
(setq a '())
(for (x 1 10000) (setq a (append (list x 1 2) a)))
(println "sys-info " (sys-info))
(exit)


~# newlisp -m1

newLISP v.10.6.2 64-bit on Linux IPv4/6 UTF-8 libffi, options: newlisp -h

> (load "test2.lsp")

sys-info (445 32768 409 3 0 2048 0 6421 10602 1409)

ERR: not enough memory



error as expected, when -m set from command line.

Lutz

#6
Try linking with test2.lsp. There is a preallocated minimum (somewhere below 32768 cells), which a limit of 512 cells never reaches. Use 32768 in test2.lsp and try again. Memory gets allocated in bigger chunks, and a new allocation will not happen until the current chunk is used up.
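
The listings in this thread suggest how -m megabytes map to cell counts on the 64-bit build: -m1 shows 32768, -m10 shows 327680 and -m256 shows 8388608. A small sketch of that arithmetic follows; mb-to-cells and cells-per-mb are made-up names, and the ratio is an assumption taken from those numbers that may differ on other builds.

```newlisp
; assumption: 32768 cells per megabyte, as observed in the 64-bit listings above
(constant 'cells-per-mb 32768)

(define (mb-to-cells mb)
  (* mb cells-per-mb))

(println (mb-to-cells 10))   ; prints 327680, matching the -m10 listing
```

With this, (reset (mb-to-cells 1)) would ask for the same limit as starting with -m1.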

CaveGuy

#7
Thanks ..


test3.lsp
(println "sys-info " (sys-info))
(setq a '())
(reset 32768)
(println "sys-info " (sys-info))
(for (x 1 10000) (setq a (append (list x 1 2) a)))
(println "sys-info " (sys-info))
(exit)

~# newlisp -x test3.lsp test3

~# ./test3

sys-info (489 576460752303423488 409 2 0 2048 0 6871 10602 1409)

sys-info (490 32768 410 2 0 2048 0 6871 10602 1409)

ERR: not enough memory



Great it works! my-bad the error was in my testing.



Any thoughts on passing in a new/modified arg-list and restarting with (reset arg-list)?

It would be like (reset true), only with a new arg-list substituted.



BTW: any previous lazyman's use of (reset 1) as a short cut for (reset true)

now should be something more cryptic like (reset (not nil)) so the heap size is not set to 1 :)

protozen

#8
Cool, nice that we'll be able to set the cell memory, but having access to the stack size would open up other options for dynamic constraints.

rrq

#9
A different take on this problem is to use the execve function to "morph" into a different executable, basically like exec-ing a different command line but without forking. A morph pre-amble is not overly complicated, e.g.:
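; unix.lsp supplies unix:library, used below to locate execve
(module "unix.lsp")
(import unix:library "execve")

; replace the running process with the command in cmdlist;
; envlist holds the environment strings and may be nil
(define (morph cmdlist envlist)
  ; both arrays handed to execve must end with a 0 entry
  (let ((argv (append cmdlist '(0))) (envp (append (or envlist '()) '(0))))
    (execve (first argv)
            ; pack each array as consecutive "lu" fields, one per entry
            (pack (dup "lu " (length argv)) argv)
            (pack (dup "lu " (length envp)) envp))))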

Note that it's crucial to pack the string arrays for execve in a way that ensures their persistence, which the above seems to do.



The following is a (subsequent) test of morphing by re-invoking this interpreter with new command line arguments, and a final "-e(X)" for the main activity
(when (!= (main-args -1) "-e(X)")
  (println "Before: " (sys-info))
  (let (p (or (real-path (main-args 0) true) (main-args 0)))
    (morph (extend (list p "-s8000" "-m20000") (1 (main-args)) '("-e(X)"))  '()))
  (println "comes to here only if morphing fails")
  (exit 0))

(define (X)
  (println "After: " (sys-info))
  (exit 0))

You may put those two code blocks into the one file, say morph.lsp, and also create an embedded executable, say morph, and run them:
newlisp -x morph.lsp morph
chmod a+x morph
newlisp morph.lsp
./morph


The morphing in this way re-executes the embedded interpreter with new command line arguments, like adding -s and -m arguments, etc., but requires that the main script activity is postponed to be invoked via a final -e, which also is used as flag for morphing or not. If you don't flag it, you may get an interesting morphing loop.



Maybe there could be a morph primitive (perhaps under a better name), although I have no idea whether it's portable enough for that.