pure newLISP database

Started by Lutz, January 11, 2009, 05:38:14 PM


Lutz

http://unbalanced-parentheses.nfshost.com/index.cgi



A great specification for a newLISP database API. It could be implemented transparently with several back-ends: a newLISP native, SQLite, MySQL etc. Code written to this API would be portable. Something like:



(nldb:back-end "native")

or:

(nldb:back-end "SQLite")



... etc., would be used to select the appropriate database implementation. Because your API is relational, it would  be easy to translate it to SQL in the back-end module.



"native" would be the first implementation, as it exactly defines how the API should behave. It would be completely in-memory, with some nldb:save and nldb:load commands to store it to disk and restore it again. It would be fast even with tens of thousands of records.

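A rough sketch of what such a selectable, in-memory back-end with save and restore might look like. Everything here is an assumption made for illustration: the context name demo-db and the functions back-end, save-to and load-from are made up and are not the real nldb API.

```newlisp
;; Hypothetical sketch only: demo-db, back-end, save-to and load-from
;; are made-up names, not the real nldb API.
(context 'demo-db)

(set 'current-back-end "native")
(set 'store '())                       ; the whole in-memory database

;; Select an implementation by name; only the names are checked here.
(define (back-end back-end-name)
  (if (member back-end-name '("native" "SQLite"))
      (set 'current-back-end back-end-name)
      (throw-error (string "unknown back-end: " back-end-name))))

;; Write the in-memory data to disk as loadable newLISP source.
(define (save-to path)
  (save path 'store))

;; Restore the data by evaluating the saved file.
(define (load-from path)
  (load path))

(context MAIN)

(demo-db:back-end "native")
(push '("Hydrogen" "H" 1) demo-db:store -1)
(demo-db:save-to "demo.nldb")
(set 'demo-db:store '())               ; simulate a fresh session
(demo-db:load-from "demo.nldb")
(println demo-db:store)
```

A real SQLite or MySQL back-end would have to translate each call to SQL behind the same function names; that part is not sketched here.
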
cormullion

#1
Thanks! :)



You can find the code here:



http://unbalanced-parentheses.nfshost.com/downloads/nldb.nl



and there's the simple periodic table database here:



http://unbalanced-parentheses.nfshost.com/downloads/elements.nldb



I haven't finished the newlispdoc yet... :)



As for speed - well, preliminary results show that it's not as quick as an sqlite version (which is compiled C, after all), but not too bad considering the inefficient coding... :)

newdep

#2
Aaa nice one !



Regarding speed... it should be as fast as newLISP itself is; it's nothing more than a big list, after all ;-)



I remember seeing another db in this forum from last year...

I can't find it, but that was a nice one too...
-- (define? (Cornflakes))

cormullion

#3
Quote from: "newdep"
I remember seeing another db in this forum from last year...

I can't find it, but that was a nice one too...


Yes, it was my starting point: http://www.alh.net/newlisp/phpbb/viewtopic.php?p=10216. I was hoping that he would develop it like he said he intended, but then he went off to Common Lisp-land or somewhere...



I ended up simplifying his model a bit, too. I like simplifying stuff... :)

unixtechie

#4
1. This is an exciting topic. The fact is THERE IS A SIMPLE DB TECHNIQUE/architecture that is roughly an order of magnitude faster than an SQL database in its typical applications (e.g. as a backend to a web application).



We can do a different and efficient newlisp-native DB if we code operations on columns only, rather than on tables of rows of data as it is done in SQL.



In fact, there exists currently a very interesting research project, a smallish database (think of MySQL before it achieved its corporate status), which implements column-based operations internally, but presents the user with SQL as a query language.

This database is open-source and available for download. My test run of TPC-H confirmed that this DB is roughly 10 times faster than MySQL, and it also eats up less memory: one won't be able to run the 1GB-scaled version of the benchmark on a machine with 1 GB of memory for MySQL, but it would run easily for the column-oriented DB.



It's easy to understand and prototype column operations even with shell utilities (and it's possible to do them in NewLisp, of course).



But this is only the first part of the story.





2. There exists a simpler, more logical language to describe operations on data than SQL. Advocated by a university professor from the Low Countries, this approach seems to be dying now, after the demise of its author.

The language today lives only in an open-source tool that allows one to create databases and operations on them in this simpler language, and then translates them into SQL (for any of a wide number of dialects, e.g. Postgres, MySQL, Oracle, and even SQLite).



The fact is the database product that the author of this concept created was also column-oriented, and therefore this simpler language is perfectly matched for a column-oriented approach.





So, this is a very interesting task, and I'd do this native scripted DB for NewLisp as a number of column-oriented operations, which might be presented to a user either or both as raw column operations, or as a set of SQL statements.
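
To make the idea of operating on columns rather than rows a little more concrete, here is a minimal sketch in newLISP. It is only an assumption about how such a store could be laid out: the three columns, the sample values and the helper names where-idx and values-at are invented for illustration and do not come from any existing database.

```newlisp
;; Hypothetical column store: each column is its own list, and a "row"
;; is just the same position in every column. Sample data for illustration.
(set 'col-name   '("Hydrogen" "Helium" "Lithium"))
(set 'col-number '(1 2 3))
(set 'col-group  '(1 18 1))

;; Scan a single column and return the positions where pred holds.
(define (where-idx column pred)
  (let (hits '())
    (dolist (v column)
      (if (pred v) (push $idx hits -1)))
    hits))

;; Fetch the values at the given positions from any other column.
(define (values-at column positions)
  (select column positions))

;; "Names of all elements in group 1": only two columns are touched.
(set 'hits (where-idx col-group (fn (g) (= g 1))))
(println (values-at col-name hits))
```

The point of the layout is that a query reads only the columns it mentions; the col-number column is never visited in the example.
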

cormullion

#5
Interesting - but I'm not sure I know how to proceed down that path at the moment. More research required.



At the moment, unfortunately, performance is less than sqlite on the same dataset - although not too much less, considering... There are some less than ideal things going on behind the scenes..



Lutz - how are association lists stored in memory? For example, is storing a hundred of these:


((No 1)
  (AtomicWeight 1.0079)
  (Name "Hydrogen")
  (Symbol "H")
  (MP -259)
  (BP -253)
  (Density 0.09)
  (EarthCrust 0.14)
  (DiscoveryYear 1776)
  (Group 1)
  (IonizationEnergy 13.5984))


a lot worse than storing a hundred of these:


(1 1.0079 "Hydrogen" "H" -259 -253 0.09 0.14 1776 1 13.5984)

or is it not as inefficient as one would expect? It's all those repetitions of the keys that I suspect would be inefficient... With a few thousand rows, that's a lot of identical keys! The design at the moment assumes that this is something to be avoided...
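
For comparison, a small sketch of the design that avoids repeating the keys: the column names are kept once, and an association list is built from a flat row only when it is needed. The names columns, row and record are illustrative and not taken from nldb, and only the first four fields of the Hydrogen row are used.

```newlisp
;; Keys stored once, values stored as a flat row.
(set 'columns '(No AtomicWeight Name Symbol))
(set 'row     '(1 1.0079 "Hydrogen" "H"))

;; Pair each key with its value on demand:
;; ((No 1) (AtomicWeight 1.0079) (Name "Hydrogen") (Symbol "H"))
(set 'record (map list columns row))

(println (lookup 'Name record))        ; via the association list
(println (row (find 'Name columns)))   ; via the column position
```

Both lookups return the same value; the second one never builds the association list at all.
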

didi

#6
After trying several other approaches, I see that nldb is the best solution for my application. I hope I can show it to you soon.



Problem :


(define (save-db filename)
  ;; Save the database in the named file.
  (let ((save-list tables))
    (push 'tables save-list)
    (apply save (cons filename save-list))
    (println "saved the database in " filename)))


This doesn't seem to work; only an empty list is ever saved:

(set 'nldb:tables '())

even though I can see the table with (show).

I don't think this is specific to Windows XP.

cormullion

#7
Puzzling. It looks OK here. Try this, to see what's different on your system:


newLISP v.10.0.0 on OSX IPv4 UTF-8, execute 'newlisp -h' for more info.

> (load "nldb.nl")
MAIN
> (context nldb)
nldb
nldb> (create-table 'test1 '(one two three))
((one two three))
nldb> (add-row 'test1 '("a" "b" "c"))
((one two three) ("a" "b" "c"))
nldb> (create-table 'test2 '(one two three))
((one two three))
nldb> (add-row 'test2 '("d" "e" "f"))
((one two three) ("d" "e" "f"))
nldb> tables
(test1 test2)
nldb> (save-db "test.nldb")
saved the database in test.nldb
"test.nldb"
nldb> (! "cat test.nldb")

(set 'nldb:tables '(nldb:test1 nldb:test2))

(set 'nldb:test1 '(
  (nldb:one nldb:two nldb:three)
  ("a" "b" "c")))

(set 'nldb:test2 '(
  (nldb:one nldb:two nldb:three)
  ("d" "e" "f")))

0
nldb>

didi

#8
Thanks cormullion - this seems to work :
Quote
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\ShortNotizer>newlisp
newLISP v.10.0.0 on Win32 IPv4, execute 'newlisp -h' for more info.

> ( load "nldb.lsp" )
(lambda () (println (dup "_" 60) "\n" "Contents of database " (context) "\n" " "
  (length nldb:tables) " table"
  (if (> (length nldb:tables) 1)
   "s" "") ": " nldb:tables "\n")
 (dolist (nldb:table nldb:tables)
  (println " Table:  " nldb:table)
  (println " Columns " (first (eval nldb:table)))
  (println " Rows:   " (length (rest (eval nldb:table))))
  (dolist (nldb:row (rest (eval nldb:table)))
   (println nldb:row))
  (println))
 (println (dup "_" 60))
 (sym (date)))
>
> ( context nldb )
nldb
nldb>
nldb> ( create-table 'test1 '( one two three ))
((one two three))
nldb> ( add-row 'test1 '("a" "b" "c" ))
((one two three) ("a" "b" "c"))
nldb> ( create-table 'test2 '( one two three ))
((one two three))
nldb> ( add-row 'test2 '("d" "e" "f" ))
((one two three) ("d" "e" "f"))
nldb> tables
(test1 test2)
nldb> ( save-db "test.nldb" )
saved the database in test.nldb
"test.nldb"
nldb> ( ! "type test.nldb" )
(set 'nldb:tables '(nldb:test1 nldb:test2))

(set 'nldb:test1 '(
  (nldb:one nldb:two nldb:three)
  ("a" "b" "c")))

(set 'nldb:test2 '(
  (nldb:one nldb:two nldb:three)
  ("d" "e" "f")))

0
nldb>


OK, so it should work. Now I'll try this in my program...

didi

#9
OK, it's clear: I mixed up the MAIN and nldb contexts. Now it works!

cormullion

#10
I do that too.... :) I might be doing things the wrong way, but I find I can either work in the context all the time and drop the prefixes or work in other contexts and use the prefixes everywhere.



I think I've added a (context MAIN) at the end since uploading it, which I think is 'the correct thing' to do.
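
The two styles can be sketched with a miniature stand-in for the module. This is an assumption-laden illustration: the context demo and its create-table are simplified inventions, not the real nldb code, although the real functions are used the same way in the transcripts above.

```newlisp
;; Hypothetical miniature of the situation; demo is NOT the real nldb.
(context 'demo)
(set 'tables '())
(define (create-table table-sym column-names)
  (set table-sym (list column-names))
  (push table-sym tables -1)
  (eval table-sym))
(context MAIN)   ; switch back so that later code is read in MAIN

;; Style 1: stay in MAIN and prefix everything that lives in demo.
(demo:create-table 'demo:test1 '(one two three))
(println demo:tables)   ; the list the module actually maintains
(println tables)        ; MAIN:tables, a different symbol that was never set

;; Style 2: switch into the context and drop the prefixes.
(context demo)
(create-table 'test2 '(one two three))
(println tables)
(context MAIN)
```

Forgetting a prefix in MAIN does not raise an error; it silently refers to a separate MAIN symbol, which is how an apparently empty tables list can turn up.
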