Parsing C Header Files

Started by konrad, September 16, 2007, 06:36:34 PM


konrad

Is anyone trying to parse C header files to auto-import modules? What the built-in (import ...) function does is a good start, but it is not generally sufficient: in addition to functions, most C libraries also expose a large number of constants and structure definitions, in some cases using preprocessor macros as well.



For completely smooth use of C libraries, it would be handy to parse this stuff out and auto-create it inside a Lisp context. Removing annoying prefixes would also be handy : ).  Note that one common pattern which is currently awkward from newLISP is where the calling code is expected to reserve memory for a complex structure and pass its address into the function.



I have to admit that my C-fu is rather weak, but this is a project I'm interested in, and I would rather not reinvent the wheel if someone is already working on it.



regards



Konrad.

Jeff

#1
It's not too difficult to use the built-in functionality.  Simply scanning for function prototypes should be sufficient.  It's not like FFIs in CL, where you have to map each type to a newLISP type.
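
A rough sketch of what such a prototype scan might look like. The helper names prototype-names and make-imports, the library file name, and the two sample declarations are assumptions for illustration. The regular expression only catches simple prototypes (an identifier followed by an argument list without nested parentheses), so function-pointer declarations and macros would need more work:

```newlisp
; Sketch only: collect names that look like C function prototypes
; and build (import ...) expressions for them.
; prototype-names, make-imports and the library name are hypothetical.

(define (prototype-names header-text)
  ; An identifier, then an argument list with no nested parentheses,
  ; then a semicolon. $1 holds the captured identifier.
  (find-all {([A-Za-z_][A-Za-z0-9_]*)\s*\([^()]*\)\s*;} header-text $1 0))

(define (make-imports lib header-text)
  ; One unevaluated (import lib name) expression per prototype found.
  (map (fn (name) (list 'import lib name))
       (prototype-names header-text)))

; Two declarations in the style of SDL_image.h, used as sample input.
(set 'sample (append "extern SDL_Surface *IMG_Load(const char *file);\n"
                     "extern int IMG_isPNG(SDL_RWops *src);\n"))

(println (make-imports "libSDL_image.so" sample))
```

The result is a list of expressions rather than live imports, so it can be inspected or written to a file before anything is evaluated.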



Typically, if you need regular access to data in a composite C type, you would define a function that accepts a pointer to the struct and use unpack or get-foo functions to pull the data out that you need.
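
As a minimal sketch of that approach, assuming the SDL 1.2 layout of SDL_Rect (two signed 16-bit fields followed by two unsigned 16-bit fields): rect-fields is a hypothetical helper name, and the buffer is built with pack here only to stand in for memory that a C function would have filled in.

```newlisp
; Sketch, assuming this C layout:
;   typedef struct { Sint16 x, y; Uint16 w, h; } SDL_Rect;
; rect-fields is a hypothetical helper name.

(define (rect-fields buf)
  ; "d" = 16-bit signed, "u" = 16-bit unsigned
  (unpack "d d u u" buf))

; Stand-in for a struct that a C function has written.
(set 'rect (pack "d d u u" -5 10 640 480))

(println (rect-fields rect))   ; prints (-5 10 640 480)
```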
Jeff

=====

Old programmers don't die. They just parse on...



Artful code (http://artfulcode.net)

oofoe

#2
I'm interested in automatic header interpretation myself. The SDL libraries aren't terribly complicated, but it's a pain to sift through all that manually, especially the event return structure...



Don't people write M4 macro processor replacements before breakfast in Lispland? ;-) Maybe I should try that, but after lunch.
Testing can show the presence of bugs, but not their absence.

oofoe

#3
I'm not so fast and awesome as to provide a replacement for m4, but here's a start at a tokenizer. Doesn't handle quotes, but that's not far away:



[code]
; ch2.lsp
; jrlf 2007-09-17
; Convert C .h header files to NewLisp import statements.


(context 'importer)

(setq whitespace " \t\r\n"  ; Whitespace to ignore.
      break '("#" ";" "(" ")") ; Characters to tokenize on.
)


(define (nocomment text)
  "( text -- cleaned) Removes comments and newlines. A bit of a cheat."

  (replace "/\\*.+?\\*/" (replace "[\n\r]+" text " " 0) "" 0)
)


(define (tokenize text)
  "( text -- (token...)) Break text up into tokens."
  ; XXX Doesn't deal with quotes properly yet.

  (letn ((found '())
         (hold "")
         (quoting nil)
         (cleaned (nocomment text))
         )

        (dotimes (x (length cleaned))
          (letn ((c (cleaned x)))
                (if (<= 0 (find c whitespace))
                      (begin
                       (if (not (null? hold))
                           (setq found (append found (list hold))
                                 hold "")))
                      (<= 0 (find c break))
                      (begin
                       (if (not (null? hold))
                           (setq found (append found (list hold))
                                 hold ""))
                       (setq found (append found (list c)))
                       )
                      (setq hold (string hold c))
                  )
            )
          )
        found
        )
)


(define (read filename namespace)
  "( filename namespace -- code) Create library import code."


  (tokenize (read-file filename))
)

(context MAIN)


(println (importer:read "SDL_image.h" 'IMG))
[/code]

konrad

#4
Quote from: "Jeff"
It's not too difficult to use the built in functionality.  Simply scanning for function prototypes should be sufficient.  It's not like FFIs in CL where you have to map each type to a newLISP type.



Typically, if you need regular access to data in a composite C type, you would define a function that accepts a pointer to the struct and use unpack or get-foo functions to pull the data out that you need.


The biggest issue is when you need to allocate memory for the data structure ahead of time; there are a lot of functions which expect the caller to do the memory allocation and pass in addresses to be written to.
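
One way this pattern can be handled by hand is to reserve a string of the right size and pass its address. The sketch below uses the built-in cpymem as a stand-in for an imported C function that writes through the pointer it is given. The 8-byte layout and the name some-c-function are assumptions for illustration only.

```newlisp
; Sketch: reserve memory on the newLISP side and let the callee fill it.
; Assumed layout: two 16-bit signed fields, two 16-bit unsigned fields.

(set 'buf (dup "\000" 8))      ; caller-side allocation: 8 zero bytes

; A real call would look like (some-c-function (address buf)),
; where some-c-function is a hypothetical imported function.
; cpymem plays the part of the callee writing into our buffer.
(cpymem (pack "d d u u" 1 2 3 4) (address buf) 8)

(println (unpack "d d u u" buf))   ; prints (1 2 3 4)
```

Getting the size and field order right still depends on reading the struct definition in the header, which is the part an automatic header parser would have to take over.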