Parsing through a CSV file

Started by nix, May 03, 2013, 06:53:37 AM


nix

Hello,



   I am trying to parse a CSV file using http://static.artfulcode.net/newlisp/csv.lsp.html#CSV_parse-file.



I am very new to newLISP, so please bear with me. Below is what I have started with.
(load '/path/csv.lsp')
(define (csv_path ("/path/file.csv")))
(csv_path)
(CSV:parse-string("LastSeen")


I can get newLISP to parse the whole file, and it's really fast, much faster than Python.



What I am trying to do is parse the CSV file, then filter on the columns (the way Excel does), grab certain strings, and print them to stdout.



Any help would be appreciated. Also, could someone point me to a mailing list or Usenet group I can lean on until I get used to programming in newLISP?



Thanks in advance

cormullion

#1
It looks like you need to work through the documentation a bit... :)  But here's something to get started:


(load "csv.lsp")
(define csv_path "file.csv")
(set 'l (CSV:parse-file csv_path))


Now that you have stored the results of the parse-file function in a list, you're free to process the list. A simple way to do this is to go through line by line and look for what you want:


(dolist (e l)
  (if (find "godot" e)
      (println e)))
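
To filter on one particular column, the way the original question asks, something along these lines might work. This is only a sketch: the sample rows, the column index, and the helper name dept-is? are made up for illustration, and the rows stand in for whatever CSV:parse-file returns.

```newlisp
;; Hypothetical sample rows, standing in for (CSV:parse-file csv_path)
(set 'rows '(("Bob" "Accting Dept" "20")
             ("Sue" "Personnel" "132")
             ("Ted" "IT" "42")))

;; True when the second column (index 1) is exactly "Personnel".
;; The length check guards against rows that are too short.
(define (dept-is? row)
  (and (> (length row) 1) (= (row 1) "Personnel")))

;; Print each matching row as a comma-joined line on stdout
(dolist (row (filter dept-is? rows))
  (println (join row ",")))
```

With the sample rows above this should print Sue,Personnel,132. Swapping the test inside dept-is? changes which column and value are matched.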


There are plenty of other things you can try, but keep it simple for now!



This forum is the best place for newLISP help!

rickyboy

#2
Looks like you can use the module this way.  Suppose you have a comma-separated values (CSV) file called file.csv with these contents.


1,2,3,4
Bob,"Accting Dept",20
"Sue","Personnel",132
"Ted","IT", 42,"More stuff for some odd reason"

Then in newLISP, after you load the CSV module (csv.lsp), just say


> (CSV:parse-file "file.csv")
(("1" "2" "3" "4")
 ("Bob" "Accting Dept" "20")
 ("Sue" "Personnel" "132")
 ("Ted" "IT" " 42" "More stuff for some odd reason"))

Notice that the contents of your csv file are now converted to lists for easier processing in newLISP (LISP meaning "LISt Processing" :).



Now, it is your job to manipulate these newLISP lists to get the info you want.  I hope that makes sense.
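
As one possible illustration of that kind of list manipulation, here is a small sketch that pulls single columns out of the parsed rows. The variable names are hypothetical, and the rows are typed in by hand so the snippet runs without csv.lsp or file.csv.

```newlisp
;; Rows as shown in the parse-file output above, entered by hand
(set 'rows '(("1" "2" "3" "4")
             ("Bob" "Accting Dept" "20")
             ("Sue" "Personnel" "132")
             ("Ted" "IT" " 42" "More stuff for some odd reason")))

;; First column of every row
(set 'first-col (map first rows))

;; Third column (index 2) as integers: trim stray spaces such as " 42",
;; and force base 10 so a leading zero is not read as octal
(set 'third-col (map (fn (row) (int (trim (row 2)) 0 10)) rows))

(println first-col)
(println third-col)
```

Every row here has at least three fields, so (row 2) is safe; with ragged input a length check would be needed first.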



Here's a caveat, though.  I just found a bug while testing the module csv.lsp: it does not handle the corner case where a line of your input CSV has no value in some column.  As soon as the CSV parser encounters the first such empty field, it stops processing that line.  Compare the following evaluations at the REPL to see the problem.


> (CSV:parse-string "1,2,3,4")
(("1" "2" "3" "4"))
> (CSV:parse-string ",2,3,4")
(())
> (CSV:parse-string "1,,2,,,6,,,,")
(("1"))
(λx. x x) (λx. x x)

rickyboy

#3
cormullion is right ... and fast on the trigger!

cormullion

#4
you were busy doing some proper testing... :)

rickyboy

#5
Empty field bug fix



The bug demo/fix needs a test csv file.


$ cat > test-for-empty-fields.csv
1,2,3,four,"five",6,,,,
,2,,4,5,,,9,10
,,3,4,5,,,,,
,,,4,,,,9,

Fire up newLISP and recreate the bug with the test file:


$ newlisp csv.lsp
newLISP v.10.4.5 on OSX IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.

> (CSV:parse-file "test-for-empty-fields.csv")
(("1" "2" "3" "four" "five" "6" "" "" "")
 ()
 ("")
 ("" ""))

Fix it.  The session below is a hot (online) fix, so it only lasts for the running session.  For a permanent fix, you will need to replace the definition of regex-token-empty in csv.lsp with the following definition:


> (context 'CSV)
CSV> (define (regex-token-empty delimiter) (format "^%s" delimiter))
CSV> (context 'MAIN)
> ;; Now let's try the same test.
> (CSV:parse-file "test-for-empty-fields.csv")
(("1" "2" "3" "four" "five" "6" "" "" "")
 ("" "2" "" "4" "5" "" "" "9" "10")
 ("" "" "3" "4" "5" "" "" "" "")
 ("" "" "" "4" "" "" "" "9"))
> ;; Looks much better. :)

Also, my first test input file file.csv works the same as it did before the fix.  Yes!  (Whew! :)


> ;; Regression test:
> (CSV:parse-file "file.csv")
(("1" "2" "3" "4")
 ("Bob" "Accting Dept" "20")
 ("Sue" "Personnel" "132")
 ("Ted" "IT" " 42" "More stuff for some odd reason"))
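
The same regression check could also be scripted rather than compared by eye. A minimal sketch, assuming csv.lsp and file.csv sit in the current directory; the name expected is made up for illustration:

```newlisp
(load "csv.lsp")

;; The parse result shown above, used as the expected value
(set 'expected '(("1" "2" "3" "4")
                 ("Bob" "Accting Dept" "20")
                 ("Sue" "Personnel" "132")
                 ("Ted" "IT" " 42" "More stuff for some odd reason")))

;; = compares nested lists element by element
(if (= (CSV:parse-file "file.csv") expected)
    (println "file.csv: OK")
    (println "file.csv: output changed"))
```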

cormullion

#6
How's your Git?



https://github.com/kanendosei/artful-newlisp/blob/master/csv.lsp



:)

rickyboy

#7
Git outta here! :-)

<rimshot/>

rickyboy

#8
Now, I'm truly a hipster coder.  Gee thanks, cormullion.  :/

https://github.com/kanendosei/artful-newlisp/pull/1

cormullion

#9
Impressed! That'll wake Kanen up... :)