Hi:
Lutz and I have been having a conversation off the forum, and we've
decided that it is worth sharing.
FYI: I have written an application in elisp to create a newlisp syntax mode
for emacs. Part of this project was to provide on-demand documentation
from a keyword context in three different forms:
1) Verbose documentation in a popup window.
2) Verbose documentation in a temporary buffer.
3) One-liner description of the interface in the echo area.
To do this I had to make some modifications to newlisp_manual.html.
Here is what I wrote to Lutz:
"""
Lutz said:
> Regarding the doc strings: it should not be very difficult to write a script
> for extracting them from newlisp_manual.html, which strictly uses a limited
> HTML subset. You also could just use 'lynx -dump newlisp_manual.html' to
> generate a pure text file, which then is even easier to parse.
I said:
Actually, I've used that method. But I couldn't really find a
pattern to use as a delimiter, so I manually inserted some
markup - specifically a pseudo-tag <break>.
Do you have any ideas for a search pattern?
I think that, given some time (:-) and guidance), I could come up
with a method that generates a file that could be used as an
intermediary for any number of editors. Certainly emacs and vim,
and for emacs a text file parsed by emacs itself would get around
some of the problems with escaping.
Alternatively here's an approach that would use the website and
the documentation directly:
Rebol has a feature called load/markup that auto-parses a document or
a string into an array of alternating tag and string datatypes - all
it takes is:
load/markup http://www.newlisp.org/downloads/newlisp_manual.html
from the command line.
Anything like that in newlisp?
Maybe you would like to take this discussion to the forum?
:-) I've got the time.
"""
Lutz replied again:
"""
Finding some kind of standard way to parse the newlisp_manual.html into
pieces sounds like a wonderful idea. This idea should definitely be brought
to the discussion group.
Some short newLISP script using regular expressions should do the thing. As
you mentioned, perhaps some additions/changes to the manual will facilitate
it further.
Mention this idea on the discussion group. There are several people
experimenting with newLISP development environments based on emacs, vi, gtk,
etc. They all could benefit from a method to quickly extract relevant help
from the manual.
"""
I'll add a couple of other thoughts:
1) There is one situation where keyword documentation is combined: the arithmetic operators.
2) One should be thinking about ways to include documentation for
user libraries, third-party contributions etc.
I'm sure there are many other ideas.
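On the load/markup question above, here is a rough sketch of that behaviour for anyone who hasn't used Rebol. It is Python rather than newLISP or Rebol, and the name load_markup and the sample string are made up for illustration. Unlike the real load/markup it works on a string you have already read, it does not fetch a URL, and it returns plain strings rather than separate tag and string datatypes.

```python
import re

# A tag is anything from "<" to the next ">". The capturing group makes
# re.split keep the tags in its result instead of discarding them.
TAG_RE = re.compile(r"(<[^>]*>)")

def load_markup(text):
    """Split markup into a flat list of tags and the text between them."""
    parts = TAG_RE.split(text)
    # Adjacent tags leave empty strings between them; drop those.
    return [p for p in parts if p != ""]

# Hypothetical fragment in the style of the manual.
sample = '<h2><span class="function">atan2</span></h2><p>Some text.</p>'
for piece in load_markup(sample):
    kind = "tag" if piece.startswith("<") else "str"
    print(kind, repr(piece))
```

Once the document is in this shape, picking out a function's entry is a matter of walking the list until the heading text matches.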
thanks
tim
Sounds good. I've had some skirmishes with this myself. For the TextMate bundle I tinkered with, I wrote something like this:
[code]
; load the whole manual. Gulp.
(set 'file (read-file "/usr/share/newlisp/doc/newlisp_manual.html"))
; we're looking for the selected text
(set 'func-name "atan2")
; find the matching bit with regex (option 4 lets . match newlines too)
(set 'doc-section (find (string {(<h2><span>)(} func-name {)(</span></h2>)(.*?)(<h2><span>)} ) file 4))
; found it, output it to Show as HTML
(if doc-section
    (println $1 $2 $3 $4)
    (println "couldn't find it"))
(exit)
[/code]
TextMate has a nice HTML window available for online documents, so no need to strip out the markup.
It's obviously a sledgehammer approach. I'm looking forward to seeing the scalpel version.
<GRIN> That easy huh?
I don't grok it all, but you have a pattern to parse on right?
It was looking for five stretches of text:
<h2><span>
atan2
</span></h2>
.*?
<h2><span>
and giving you back the first four. The fifth pattern was the start of the next function so unwanted.
In fact, looking at the manual again, the source text is different now: there's a span class="function" to cater for.
I dunno, I'm no regex wiz...
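For what it's worth, here is one way the pattern could be loosened so that attributes on the span no longer matter. It's a Python sketch of the same idea, not a newLISP fix, and the function name find_doc_section and the sample HTML are invented for the example. The attribute values in the sample are guesses, since the exact markup isn't shown in this thread.

```python
import re

def find_doc_section(html, func_name):
    """Return the HTML from a function's <h2> heading up to the next one, or None."""
    pattern = re.compile(
        # [^>]* skips any attributes on the span, e.g. class="function".
        r"<h2><span[^>]*>" + re.escape(func_name) + r"</span></h2>"
        # Lazily take everything up to the next heading or the end of the text.
        r".*?(?=<h2><span|\Z)",
        re.DOTALL,  # let . match newlines
    )
    match = pattern.search(html)
    return match.group(0) if match else None

# Assumed layout: an anchor, then the heading, then the description.
sample = (
    '<a name="atan"></a>\n'
    '<h2><span class="function">atan</span></h2>\n'
    '<p>atan text</p>\n'
    '<a name="atan2"></a>\n'
    '<h2><span class="function">atan2</span></h2>\n'
    '<p>atan2 text</p>\n'
)
print(find_doc_section(sample, "atan2"))
```

re.escape matters for names such as + or *, which would otherwise be read as regex operators. The result for one function still ends with the anchor tag of the next, because the match only stops at the next heading.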
Quote from: "cormullion"
In fact, looking at the manual again, the source text is different now: there's a span class="function" to cater for.
I'm seeing this pattern consistently:
<a></a>
<h2><span>*function-name-here*</span></h2>
NOTE: This forum is obfuscating the anchor name and span class attributes, but I see a usable pattern emerging.
Quote
I dunno, I'm no regex wiz...
:-) Me neither, but count yer blessings, regexes are more of a headache
in elisp
See http://www.johnsons-web.com/demo/newlisp/parse-nl-docs.r.txt
The following labels=>
char-entities: ;; data structure
clean: ;; subroutine
parse-all: ;; subroutine
are the operational components. It's kind of quick-and-dirty, but
I tried to write it in a way that a newlisper could easily follow,
and provided some explanatory comments.
If one can follow the logic:
1) I'd appreciate Lutz evaluating the accuracy of the logic.
2) It should be easy to write a newlisp script to accomplish the same.
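For readers who don't want to open the Rebol file, the three labels suggest a shape roughly like the following. This is a guess written in Python, not a translation of the linked script: the entity table is deliberately tiny, the sample HTML and its attribute values are assumed, and the combined "+, -, *, /" heading is only an illustration of how a shared entry for the arithmetic operators might look.

```python
import re

# Hypothetical, minimal entity table; a real one would be longer.
# "&amp;" comes last so that "&amp;lt;" decodes to "&lt;" and not "<".
CHAR_ENTITIES = {"&lt;": "<", "&gt;": ">", "&quot;": '"', "&nbsp;": " ", "&amp;": "&"}

HEADING_RE = re.compile(r"<h2><span[^>]*>(.*?)</span></h2>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]*>")

def clean(fragment):
    """Strip tags, decode the entities in CHAR_ENTITIES, collapse whitespace."""
    text = TAG_RE.sub("", fragment)  # tags first, so a decoded "<" survives
    for entity, char in CHAR_ENTITIES.items():
        text = text.replace(entity, char)
    return " ".join(text.split())

def parse_all(html):
    """Map each heading's text to the cleaned text that follows it."""
    docs = {}
    headings = list(HEADING_RE.finditer(html))
    for i, match in enumerate(headings):
        # A section runs to the start of the next heading, or to the end.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(html)
        docs[clean(match.group(1))] = clean(html[match.end():end])
    return docs

sample = (
    '<a name="atan"></a>\n'
    '<h2><span class="function">atan</span></h2>\n'
    '<p>syntax: (atan <em>num</em>)</p>\n'
    '<a name="arithmetic"></a>\n'
    '<h2><span class="function">+, -, *, /</span></h2>\n'
    '<p>Arithmetic   on &quot;int&quot; values.</p>\n'
)
docs = parse_all(sample)
for name, text in docs.items():
    print(name, "=>", text)
```

A dictionary like this could then be written out in whatever delimited text form an editor finds easiest to read back.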
See http://www.johnsons-web.com/demo/newlisp/newlisp-docs.txt
for the output.