Read-line and Mac text files?

Started by cormullion, December 12, 2005, 12:56:09 AM


cormullion

Can I use read-line to read old-style Mac text files? These use ASCII 13 as the line separator, rather than 10 or 13/10. The manual implies that I can't. These files usually have to be run through (UNIX) tr before they can be piped to other UNIX commands:



  tr "\r" "\n" < temp.mif > temp1.mif
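As a minimal sketch of what that tr step does, the snippet below builds a small CR-delimited sample with printf and converts it. The function name cr_to_lf is made up for this illustration and is not part of any tool mentioned in the thread.

```shell
# Convert classic Mac line endings (CR, ASCII 13) to Unix ones (LF, ASCII 10).
# Reads stdin, writes stdout. cr_to_lf is a hypothetical name for this sketch.
cr_to_lf() {
  tr '\r' '\n'
}

# Build a CR-delimited sample and convert it; each record ends up on its own line.
printf 'abcde\rwxyz\rqwert\r' | cr_to_lf
```

Because the function reads stdin, a file is converted with cr_to_lf < temp.mif > temp1.mif, the same redirection the tr command above uses.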

Lutz

#1
You cannot use 'read-line' for old-style Mac files with carriage returns (ASCII 13) as the only line separator, but there is a workaround:

(parse (read-file "oldmactextfile") "\r")


This will read the file and break it up into a list of lines with the line terminator stripped:



> (write-file "test" "abcde\013wxyz\013qwert\013")
17
> (parse (read-file "test") "\r")
("abcde" "wxyz" "qwert" "")
>


Lutz

Lutz

#2
... and I forgot: If the file is very large, then reading it in one chunk might not be the appropriate thing to do. In this case there is a feature in 'read-buffer' that lets you wait for a string:

> (open "test" "read")
3
> (read-buffer 3 'line 256 "\r")
6
> line
"abcde\r"
> (read-buffer 3 'line 256 "\r")
5
> line
"wxyz\r"
> (read-buffer 3 'line 256 "\r")
6
> line
"qwert\r"
> (chop line)
"qwert"
>

The last line shows how to 'chop' off the last character of the line. See the manual for details.



Lutz

cormullion

#3
Thanks - ideally read-line would detect it automatically :-) , but when I'm certain that a file is ASCII 13-delimited I'll be able to read it...

cormullion

#4
How would I make a "pipe" that reads large files where the lines are delimited by "\r" (13)?



I haven't managed to make read-buffer work following your suggestion... I want to read STDIN I think...

Lutz

#5
This works:



> (write-file "junk" "abc\013def\013xyz\013")
12
> (exit)
~> cat prog
(while (read-buffer 0 'buff 256 "\r") (println "=>" buff))

~> cat junk | newlisp prog
=>abc
=>def
=>xyz
~>


The first line makes a file with ASCII 13-delimited lines. The newlisp program 'prog' contains the line:



(while (read-buffer 0 'buff 256 "\r") (println "=>" buff))



I use 0 for the stdin file handle, then I pipe the file 'junk' through it doing:



cat junk | newlisp prog



If the file is not too long, you could also just read it in one chunk and then split it using (parse buff {\r} 0):



> (parse (read-file "junk") {\r} 0)
("abc" "def" "xyz" "")
>
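
For comparison, roughly the same pipe can be sketched in plain shell by translating CR to LF first and then reading line by line. This is only an illustration under that assumption, not the newLISP approach shown above, and the function name prefix_lines is invented here.

```shell
# Prefix each CR-delimited record on stdin with "=>", similar to what 'prog' prints.
# prefix_lines is a hypothetical name used only in this sketch.
prefix_lines() {
  # Translate CR to LF, then read one line at a time without mangling
  # backslashes (-r) or trimming leading/trailing whitespace (IFS=).
  tr '\r' '\n' | while IFS= read -r line; do
    printf '=>%s\n' "$line"
  done
}

# Same sample data as the 'junk' file: three records, each terminated by ASCII 13.
printf 'abc\rdef\rxyz\r' | prefix_lines
```

Since tr and read both stream their input, this variant also avoids holding the whole file in memory, which was the concern with large files.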


Lutz






cormullion

#6
Thanks - forgot that 0 meant STDIN! :-)