newLISP core development expectations

Started by noah, July 22, 2006, 02:32:51 PM

Previous topic - Next topic

noah

Hello, Lutz.



What do you think about QA testing basic newLISP functionality? For example, a set of tests of newLISP functions called under normal, boundary, and error conditions. I do not have lots of time, but I can offer some time toward that effort.



What do you think about module pre-installation tests? newLISP's module installation process could verify that a target system meets the module requirements before newLISP installs the module. For example, a useful test of the ps module is a check that Ghostscript is available on the target system.



If you wanted, you could standardize how developers create pre-installation module tests. One way is to specify fields in the newLISPdoc format that another program uses automatically to create a pre-installation test for a documented module (for example, a test of the module's dependencies on other modules).



-Noah

Lutz

#1
You touched on a topic which is very important for newLISP. Having a better QA process for the newLISP executable and also for the modules would be great.



Currently we only have the files qa-dot and qa-comma in the distribution to test for  'dot' and 'comma' countries (in decimals).



These files only test 'normal' conditions, and those only barely. We definitely need more boundary and error-conditions testing. Many of the bugs reported are the cause of using wrong parameter types, missing parameters or are caused by some other kind of  boundary condition. The recent (rand n -1) error was a typical example and there could be more of those, but I just don't get the time to write a comprehensive QA suite.



If you want to help on this, that would be great!



Lutz

noah

#2
OK.



Do you think its a good idea to include dependency information in the newLISPdoc fields? Should the newLISPdoc format include sufficient information from which to generate test suites, or is that outside the purpose of newLISPdoc?



-Noah

Lutz

#3
QuoteShould the newLISPdoc format include sufficient information from which to generate test suites ...


Initially newlispdoc will not go that far, type specs for @param tags will not be enforced, but that could be added later.



In the beginning I was thinking of just enhancing what is already there in qa-dot and qa-comma, adding more test cases in each test-xxx function.



Lutz

noah

#4
OK, that's where I'll concentrate my efforts.



-Noah

Lutz

#5
Another document which needs some editing improvement is:

http://newlisp.org/MemoryManagement.html">http://newlisp.org/MemoryManagement.html

It is referenced more often on the net recently. When I was re-reading it this morning, I realized how clumsy many of the sentences are, which makes this page harder to understand than it really is. I guess I have learned something lately from the edits of Michael and Noah, so I can at least see that improvement is possible ;)



Michael just sent me letter 'D' edits for the reference section of the manual - thanks. I am travelling today, but they will get online tonight or tomorrow morning as rev 4.



Lutz

cormullion

#6
I have an idea. Why not put this section onto Wikipedia, as a separate section in the newLISP entry. That way, we (or rather everyone who's interested) can edit it as much as they want, whenever they want, simultaneously (or nearly), finding typos or adding profound thoughts. You can always revert ill-considered edits that you disagree with. :-) You get a free audit trail too. Then when it's settled down, extract it back into its own web page...



Or perhaps - much better idea, but I can't provide server space for it - get a similar version running for newLISP: http://www.mediawiki.org/wiki/MediaWiki">//http://www.mediawiki.org/wiki/MediaWiki

Lutz

#7
I would like to keep it private! Definitely not on the public Wikipedia site, only accessible and editable by a selected group of people from this forum, which would be at this moment the three of you: Cormullion, Michael and Noah.



I also can offer to host it in a small newlisp wiki at newisp.org, as it is only a small thing.



Lutz



ps: I think I can have something online tomorrow. Unfortunately I cannot login to my site from the Airport WiFi.

tom

#8
want to use this?



http://perpetualnewbie.info/doku/">http://perpetualnewbie.info/doku/



It's just been sitting there until now.

noah

#9
Hello.



Lutz, I'm half done with a first suggested edit to the Memory Management document. I started this morning, after reading your post mentioning it.



-Noah

noah

#10
Hi, Lutz.



Here's my first suggested edit to the Memory Management page. The edit contains two alternative paragraphs that cover the discussion of the non-use of one-bit (non)sticky counts. Let me know what's what about that; I really didn't know how to interpret the source paragraph.



I attempted a grammar/syntax edit here, but it crept over to a deep edit. Let me know if any mistakes, I'll happily rework what I've posted here.



Michael, Cormullion, if you have suggestions to improve or correct this suggested edit, feel free to post them to this thread or message them to me.



-Noah




Automatic Memory Management in newLISP
Lutz Mueller, 2005-09-26 rev 12

newLISP and any other interactive language system will constantly generate new memory objects during expression evaluation. The new memory objects are intermediate evaluation results, reassigned memory objects, or memory objects whose content was changed. If newLISP did not delete these objects, it would eventually run out of available memory.

In order to understand newLISP's automatic memory management, it is necessary to first review the traditional methods employed by other languages.

Traditional automatic memory management (Garbage Collection)

In most programming languages, a process registers allocated memory, and another process finds and recycles the unused parts of the allocated memory pool. The recycling process can be triggered by some memory allocation limit or can be scheduled to happen between evaluation steps. This form of automatic memory management is called Garbage Collection.

Traditional garbage collection schemes employ one of two algorithms:

(1) The mark and sweep algorithm registers each allocated memory object. A mark phase periodically flags each object in the allocated memory pool. A named object (a variable) directly or indirectly references each memory object in the system. The sweep phase frees the memory of the marked objects when they are no longer in use.

(2) A reference-counting scheme registers each allocated memory object together with a count of references to the object. This reference count gets incremented or decremented during expression evaluation. Whenever an object's reference count reaches zero, the object's allocated memory is freed.

Over time, many elaborate garbage collection schemes have been attempted using these algorithms. The first garbage collection algorithms appeared in LISP. The inventors of the Smalltalk language used more elaborate garbage collection schemes. The history of Smalltalk-80 is an exciting account of the challenges of implementing memory management in an interactive programming language; see [Glenn Krasner, 1983 Smalltalk-80, Bits of History, Words of Advice]. A more recent overview of garbage collection methods can be found in [Richard Jones, Rafael Lins, 1996 Garbage Collection, Algorithms for Automatic Dynamic Memory Management].

One reference only (ORO) memory management

Memory management in newLISP does not rely on a garbage collection algorithm. Memory is not marked or reference-counted. Instead, a decision whether to delete a newly created memory object is made right after the memory object is created.

{Empirical studies of LISP have shown that most LISP cells are not shared and so can be reclaimed during the evaluation process. Aside from some optimizations for primitives like set, define, and eval, newLISP deletes memory objects containing intermediate evaluation results once it reaches a higher evaluation level. newLISP does this by pushing a reference to each created memory object onto a result stack. When newLISP reaches a higher evaluation level, it removes the last evaluation result's reference from the result stack and deletes the evaluation result's memory object. This should not be confused with one-bit reference counting. ORO memory management does not set bits to mark objects as sticky. }

- OR -

{ Empirical studies of LISP have shown that most LISP cells are not shared and so can be reclaimed during the evaluation process. newLISP deletes memory objects containing intermediate evaluation results once it reaches a higher evaluation level. newLISP does this by pushing a reference to each created memory object onto a result stack. When newLISP reaches a higher evaluation level, it removes the last evaluation result's reference from the result stack and deletes the evaluation result's memory object. This should not be confused with one-bit reference counting. newLISP's memory management does not set bits to mark objects as sticky, aside from its optimizations for primitives like set, define, and eval. }

newLISP follows a one reference only (ORO) rule. Every memory object not referenced by a symbol or context reference is obsolete once newLISP reaches a higher evaluation level during expression evaluation. Objects in newLISP (excluding symbols and contexts) are passed by value to other functions. As a result, each newLISP object only requires one reference.
 
newLISP's ORO rule has advantages. It simplifies not only memory management but also other aspects of the newLISP language. For example, while users of traditional LISP have to distinguish between equality of copied memory objects and equality of references to memory objects, newLISP users do not.

newLISP's ORO rule also has disadvantages. It forces newLISP to constantly allocate and then free LISP cells. newLISP optimizes this process by allocating large chunks of cell memory from the host operating system. newLISP will request LISP cells from a free cell list and then recycle those cells back into that list. As a result, only a few CPU instructions (pointer assignments) are needed to unlink a free cell or to re-insert a deleted cell.

The overall effect of ORO memory management is a faster evaluation time and a smaller memory and disk footprint than traditional LISP's can offer. The lack of garbage collection in newLISP more than compensates for its high frequency of cell creation/deletion. Note that under error conditions, newLISP will employ a mark and sweep algorithm to free un-referenced cells.

Performance considerations with value passing

Passing parameters by value (memory copying) instead of by reference poses a potential disadvantage when dealing with large lists. For practical purposes, however, the overhead needed to copy a large list is negligible compared to the processing done on the list. Nevertheless, to achieve maximum performance, newLISP offers a group of destructive functions that can efficiently create and modify large lists. While cons and set-nth return a new memory object of the changed list, push, pop and nth-set change the existing list and only return a copy of the list elements that they added or removed. In order for any function to operate destructively on a large list, the large list must be passed by reference. If a list is assigned to a context (a namespace) in newLISP, then newLISP can pass the list by reference.  newLISP contexts are the best choice when passing big lists or string buffers by reference.

In general, the speed of ORO memory management more than compensates for the overhead required to pass parameters by value.

Memory and data types in newLISP

The memory objects of newLISP strings are allocated from and freed to the host's OS whenever newLISP recycles the cells from its allocated chunks of cell memory. This means that newLISP handles cell memory more efficiently than string memory. As a result, it is often better to use symbols than strings for efficient text processing. For example, when handling natural language it is more efficient to handle natural language words as individual symbols in a separated name-space, rather than as a single string. The 'spam-filter' program in the newLISP source distribution uses this method. newLISP can handle millions of symbols without degrading performance.

Programmers coming from other programming languages frequently overlook that symbols in LISP can act as more than just variables or object references. The symbol is a useful data type in itself, which in many cases can replace the string data type.

Integer numbers and double floating-point numbers are stored directly in newLISP's LISP cells and do not need a separate memory allocation cycle.

For efficiency during matrix operations like matrix multiplication or inversion, newLISP allocates non-cell memory objects for matrices, converts the results to LISP cells, and then frees the matrix memory objects.
 
References
- Glenn Krasner, 1983. Smalltalk-80, Bits of History, Words of Advice
Addison-Wesley Publishing Company

- Richard Jones, Rafael Lins, 1996 Garbage Collection, Algorithms for Automatic Dynamic Memory Management
John Wiley & Sons



cormullion

#11
Looks like an excellent edit, Noah. Can't see how it could be improved. I was interested in the bit about symbols being more than just variables or object references (I hadn't remembered that). An expansion would be beneficial to me, although possibly belonging elsewhere than the Memory management section.



By the way, I liked Tom's  DocuWiki. That might be the way forward eventually...?

Lutz

#12
Thanks to Noah and Corumullion, I will go thru this later today and then post it.



I also just uploaded revision 4 of the User's Manual and Reference. Thanks to Michael for all the letter 'D' edits.



Last night I tried to install dokuwiki, but had problems it recognizing the data directory. All permissions, ownerships where correct but I couldn't get it to work. I liked the fact that it had a revision mechanism and probably a way to see differences in edits too.



But I also like the way things are working now, with more clearly defined ownership of docs, and you using the tools you are used to.



Lutz

Lutz

#13
I also want to add this paragraph in the last section:


QuoteArrays in newLISP are LISP cells allocated in memory in a linear fashion for faster random access when using indices. Only a subset of the list functions can be used on arrays. Automatic memory management in newLISP handles arrays similar to lists.


and wonder if this can be phrased better?



Lutz

Lutz

#14
Just uploaded rev 13 of: http://newlisp.org/MemoryManagement.html">http://newlisp.org/MemoryManagement.html .  I also added a summary/abstract paragraph at the top. Any suggestions for better wording are appreciated.



This page has been frequently referenced and cited during the last few days. Thanks Noah for your editing help, it will give this page more impact.



Lutz