I'm using newLISP to filter a large number of huge ASCII text files (one line equals one record). The filter makes simple changes to each input line (record) and prints it to a tmp file; when finished, it replaces the original file (after closing it, of course).

After running for an hour or so it finally crashes with a message to the effect that it is out of memory in such-and-such a function. I'm not using any recursive algorithm, so I don't think it's the stack running out of space. I am running a huge number of huge files through the filter (using "for" loops to read the archive directory files). The filter is making changes to these archive files, which is why it can run for hours.
I didn't save the exact error message, but I can start the job again and get it verbatim if you need it.
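For reference, the basic workflow described above (read each line, transform it, write to a tmp file, then swap the tmp file into place after closing) would look roughly like this -- a minimal sketch with made-up names in newLISP 10 syntax, not the actual flatreplace.lsp code:

```newlisp
;; minimal sketch of the filter loop (hypothetical names, newLISP 10 syntax)
(define (filter-file fname xform)
  (let (in-file  (open fname "read")
        out-file (open (string fname ".tmp") "write"))
    (while (read-line in-file)                    ; returns nil at end of file
      (write-line out-file (xform (current-line))))
    (close in-file)
    (close out-file)
    ;; replace the original only after both handles are closed
    (rename-file (string fname ".tmp") fname)))
```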
I'd be happy to run something similar on my machine to see if it does the same. Need some similar code, though...!
Do you have an email address I can send the code to?
It is basically a generic filter that anyone can use. The configuration file customizes it for handling your particular flat files. You simply specify the match-column regular expressions and the replace-column REs with their replacement text. So this could be a useful utility if I can eliminate the out-of-memory crash.
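I don't have the actual config format in front of me, but from that description it presumably boils down to a list of per-column rules -- this is purely a guess at the shape:

```newlisp
;; hypothetical configuration: one rule per column to touch
;; (column-number match-regex replacement)
(set 'rules
  '((2 "^\\s+" "")        ; trim leading whitespace in column 2
    (5 "N/A"   "0")))     ; normalize missing values in column 5
```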
John, Send it to me too -- you know my email. --Ricky
Yes, I think there is something going on. After about 10 minutes, virtual memory usage has gone from 50MB to 300MB, increasing by about 10MB every two seconds. If you Ctrl-C and enter the debugger while this is running, you get:
(sys-info)
(12191358 268435456 383 16 28 2048 9100 131)
and then a minute later:
(sys-info)
(12653445 268435456 383 16 28 2048 9100 131)
I'm no boffin, but I can see the first number increasing inexorably. After some time presumably this would exceed some limit?
How, though, do you know where all these extra bytes are going? Ah, there I need some advice...
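For what it's worth, that first number in the sys-info list is the number of Lisp cells currently in use (the second is the maximum allowed), so you can watch it directly around the suspect code -- something like this, with process-files standing in for the real workload:

```newlisp
;; snapshot the cell count around the workload (process-files is hypothetical)
(let (before (sys-info 0))          ; (sys-info 0) = Lisp cells in use
  (process-files)                   ; the real filter job would go here
  (println "cells grew by " (- (sys-info 0) before)))
```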
I'm not sure I understand what you are asking. Unless I'm unwittingly calling something recursively and growing the stack, what else could be causing this memory leak? Is it my code or the newLISP garbage collector?
BTW - thanks for taking the time to run it.
I think your line-xform is 'growing the stack' (no idea what that means, but it sounds good). The number of cells is larger at the end of the function than at the start...
Edit: No, actually my previous post shows that it's not the stack that's growing. It's the number of cells that's growing inexorably...
What data should I use to test this with? Made-up data or yours? Yours would be better, but if I need to make up any data, please give me some specs that you think relevant. Thanks! --Rick
John,
In the for loop in routine line-xform, you use n as the index; then inside that loop you re-assign n (in numerous places), which is screwing you to high hell. Use lets, not setqs, and you will be fine -- probably won't blow out memory anymore.
--Ricky
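To illustrate the point (with made-up names, since I don't have line-xform in front of me): because newLISP is dynamically scoped, a setq inside the loop body writes straight into the same symbol n that the for loop is using, so every later use of n in that iteration sees the wrong value. A let introduces a fresh local binding instead:

```newlisp
;; buggy shape: setq clobbers the loop index via dynamic scoping
(define (line-xform line)
  (for (n 0 9)
    (setq n (compute-width line))     ; hypothetical; rebinds the for index itself
    (process-field line n)))          ; n is no longer the column number

;; fixed shape: a let gives the temporary its own local name
(define (line-xform line)
  (for (n 0 9)
    (let (w (compute-width line))     ; w is local; n is left alone
      (process-field line w))))
```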
I forgot about dynamic scoping. I wrote this late at night when I was half asleep. Thanks!
Glad you identified the problem...
It's cool stuff. Do you write much about newLISP these days? (I'm assuming you're John "newLISP in 21 minutes" Small...)
cheers
Funny you should ask that cormullion. I also asked that about a month ago here (//http).
The answer is YES, he is the John "newLISP in 21 minutes" Small. :-)
Yes, I thought he might be - and he has told me he is as well!
I've just been trying to get him to write more stuff for us - 21 minutes is OK, but I'm ready for another half hour now! :-)
Quote from: "cormullion"
I've just been trying to get him to write more stuff for us - 21 minutes is OK, but I'm ready for another half hour now! :-)
I agree. John writes very well (as you do also cormullion), so another half-hour would be welcome.
I think I should warm up on newLISP again, especially on contexts and def-new and the new unify, etc. I'll then try to write about these additional subjects:

1. lexically scoped programming in newLISP, codified with examples (and perhaps macros) to make it as easy as rolling off a log.
2. a closer look at macros (newLISP macros have real potential for exploitation, I think, and need very careful study to see exactly how potent they can be).
3. explore unify and see if we can fashion continuations through CPS (continuation-passing style), perhaps using macros, to facilitate non-deterministic programming. This is where we have to have lexical scoping or it will be hopeless. If we can do something useful in the way of non-determinism (easily), then newLISP starts to become compelling. Otherwise unify is just a nice way to get around "let" and maybe to crack and map structures (which is worth discussing anyway). If we can get non-determinism working, then folding this back onto macros could become extremely interesting as we build non-deterministic JIT macros.
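On topic 1, the usual way to approximate lexical scoping in newLISP is a context, which gives symbols their own namespace instead of the shared dynamic one -- a toy sketch (all names made up):

```newlisp
;; toy example: state hidden inside a context namespace
(context 'Acc)
(set 'total 0)
(define (add x)
  (set 'total (+ total x)))   ; this total is Acc:total, not a global
(context MAIN)

(Acc:add 5)
(Acc:add 7)
(println Acc:total)           ; -> 12
```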
I also want to rewrite flatreplace.lsp, or read what Rick Hanson sent me, as soon as I can get a free minute -- which might be two weeks from now. I want to test this memory problem again. Even though my code was buggy, the memory problem should not have occurred, since I was not calling anything recursively. I suspect the problem lies in Windows memory allocation, since a port of flatreplace to PHP also died several hours into the batch job.