Finding the attributes of Chinese filenames (Win32)

Started by axtens, June 21, 2010, 07:32:56 PM


axtens

The following NewLISP code shows me the file attributes of files under Win32. However, some of the filenames retrieved have Chinese characters in the name. When the GetFileAttributesA function encounters them, it gives me a -1 for the attribute. I looked at GetFileAttributesW but don't know how to make the contents of the fname available to the function in a form it recognises.



How does one handle this situation?


(define (get-archive-flag file-name)
    (if (not GetFileAttributesA)
        (begin
        (import "kernel32.DLL" "GetFileAttributesA")
        )
    )
    (setq fname file-name file-attrib (GetFileAttributesA (address fname)))  
    (append fname " " ( string file-attrib))    
)

; walks a disk directory and prints all path-file names
;
(define (show-tree dir)
    (if (directory dir)
        (dolist (nde (directory dir))
            (if (and (directory? (append dir "/" nde))
                (!= nde ".") (!= nde ".."))
                (show-tree (append dir "/" nde))
                (println (get-archive-flag (append dir "/" nde)))
            )
        )
    )
)

(show-tree "z:\\working files\\Cathy")

m35

#1
Is there a reason you can't use the built in file-info function?



If you have to use the GetFileAttributes function with unicode file names, you can use the UTF-8 version of newLISP, along with this function to convert the UTF-8 paths to UTF-16 which can then be passed to GetFileAttributesW.



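(constant 'SIZEOF_WCHAR 2) ; assumption
(constant 'CP_UTF8 65001)  ; Win32 code page identifier for UTF-8
(import "kernel32.DLL" "MultiByteToWideChar")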

(define (utf8->16 lpMultiByteStr , cchWideChar lpWideCharStr ret)

    ; calculate the size of buffer (in WCHAR's)
    (setq cchWideChar (MultiByteToWideChar
        CP_UTF8 ; from UTF-8
        0       ; no flags necessary
        lpMultiByteStr
        -1      ; convert until NULL is encountered
        0
        0
    ))
   
    ; allocate the buffer
    (setq lpWideCharStr (dup " " (* cchWideChar SIZEOF_WCHAR)))
   
    ; convert
    (setq ret (MultiByteToWideChar
        CP_UTF8 ; from UTF-8
        0       ; no flags necessary
        lpMultiByteStr
        -1      ; convert until NULL is encountered
        lpWideCharStr
        cchWideChar
    ))
    (if (> ret 0) lpWideCharStr nil)
)

axtens

#2
Wow, cool code!


Quote: Is there a reason you can't use the built in file-info function?


Perhaps I'm not seeing something that's right in front of me, but the manual doesn't say anything about returning the 'archive bit' status when using file-info.
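
For reference, here is a minimal sketch of what file-info returns. As far as the manual describes it, the result is a list of size, mode, device mode, user ID, group ID, and the access, modification and status-change times, with no Win32 attribute word among them. The scratch file name is made up for the example.

```newlisp
; Sketch only: "file-info-demo.txt" is a hypothetical scratch file.
(write-file "file-info-demo.txt" "hello")

(setq info (file-info "file-info-demo.txt"))

; fields, in the order the manual lists them:
; 0 size, 1 mode, 2 device mode, 3 user ID, 4 group ID,
; 5 access time, 6 modification time, 7 status change time
(println "size:     " (info 0))
(println "mode:     " (info 1))
(println "modified: " (date (info 6)))

(delete-file "file-info-demo.txt")
```

The mode field holds permission and file-type bits, so the archive flag still has to come from GetFileAttributes.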

m35

#3
Quote from: "axtens"
Wow, cool code!

Quote: Is there a reason you can't use the built in file-info function?

Perhaps I'm not seeing something that's right in front of me, but the manual doesn't say anything about returning the 'archive bit' status when using file-info.

Oh ok, then you're right to use the Win32 api to get that platform specific attribute.



Playing around with Unicode on Windows can be tricky. I wish I could direct you to a good comprehensive source of info about how to deal with it, but I've never seen one (I had to figure it out on my own). Maybe this thread I wrote years ago (http://newlispfanclub.alh.net/forum/viewtopic.php?f=9&t=1694) might also help a bit. The functionality described there has since been integrated directly into newLISP, which makes that code obsolete, but it's a nice reference.

axtens

#4
@m35, your help is very much appreciated.



It seems a little weird to me doing the slice on the reverse of the bits but I couldn't find any bit_and functionality anywhere (quickly).



Thanks,

Bruce.


;code from m35
(constant 'SIZEOF_WCHAR 2) ; assumption
(constant 'CP_UTF8 65001)

(define (utf8->16 lpMultiByteStr , cchWideChar lpWideCharStr ret)
    (if (not MultiByteToWideChar)
        (begin
            (import "kernel32.DLL" "MultiByteToWideChar")
        )
    )
    ; calculate the size of buffer (in WCHAR's)
    (setq cchWideChar
        (MultiByteToWideChar
            CP_UTF8 ; from UTF-8
            0       ; no flags necessary
            lpMultiByteStr
            -1      ; convert until NULL is encountered
            0
            0
        )
    )

    ; allocate the buffer
    (setq lpWideCharStr (dup " " (* cchWideChar SIZEOF_WCHAR)))

    ; convert
    (setq ret
        (MultiByteToWideChar
            CP_UTF8 ; from UTF-8
            0       ; no flags necessary
            lpMultiByteStr
            -1      ; convert until NULL is encountered
            lpWideCharStr
            cchWideChar
        )
    )
    (if (> ret 0) lpWideCharStr nil)
)

; returns the Win32 attributes of a file (the archive flag is bit 5, 0x20)
; By CaveGuy 2009

(define (get-archive-flag file-name)
    (if (not GetFileAttributesW)
        (begin
            (import "kernel32.DLL" "GetFileAttributesW")
        )
    )
    (setq fname file-name
          file-attrib (GetFileAttributesW (utf8->16 fname))
    )
    file-attrib
)

; walks a disk directory and prints the path of every file whose archive flag is set
;
(define (show-tree dir)
    (if (directory dir)
        (dolist (nde (directory dir))
            (if (and (directory? (append dir "/" nde)) (!= nde ".") (!= nde ".."))
                (show-tree (append dir "/" nde))
                (begin
                    (setq fname (append dir "/" nde))
                    (setq fflag (get-archive-flag fname))
                    (setq fbits (bits fflag))
                    (if (= (slice (reverse fbits) 5 1) "1") (println fname))
                )
            )
        )
    )
)

(show-tree "//iibt-spare/temp/Scans")

m35

#5
Quote from: "axtens"
It seems a little weird to me doing the slice on the reverse of the bits but I couldn't find any bit_and functionality anywhere (quickly).

In case you haven't stumbled across them yet, here are the bit operators: http://www.newlisp.org/downloads/newlisp_manual.html#bit_operators
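
As a rough sketch of how the archive test might look with those operators instead of the reverse-and-slice approach: the helper name archive-set? and both constant names are made up for illustration. The values 0x20 (FILE_ATTRIBUTE_ARCHIVE) and 0xFFFFFFFF (INVALID_FILE_ATTRIBUTES, which shows up as -1 when read as a signed 32-bit value) are the ones from the Win32 headers.

```newlisp
; Sketch only: archive-set? is a hypothetical helper, not code from the thread.
(constant 'FILE_ATTRIBUTE_ARCHIVE 0x20)         ; bit 5 of the attribute word
(constant 'INVALID_FILE_ATTRIBUTES 0xFFFFFFFF)  ; returned when the call fails

; true when attrib is a valid attribute word with the archive bit set
(define (archive-set? attrib)
    (and (!= (& attrib 0xFFFFFFFF) INVALID_FILE_ATTRIBUTES)
         (!= 0 (& attrib FILE_ATTRIBUTE_ARCHIVE))))

(println (archive-set? 0x20))  ; archive only
(println (archive-set? 0x21))  ; archive + read-only
(println (archive-set? 0x01))  ; read-only, archive bit clear
(println (archive-set? -1))    ; failed GetFileAttributes call
```

Masking with 0xFFFFFFFF before the comparison means a failed call is rejected whether the import hands the result back as -1 or as 4294967295.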