CGI and State

Started by protozen, July 09, 2015, 06:11:25 PM


protozen

What is the best way to manage state when using CGI for web apps? Say I want to store program and session data: not cookies, but the concept of session data found in other web frameworks. I need to keep persistent data in memory between requests.



Creating a newLISP server using newlisp -c is supposed to create a stateful server, right?



"After each transaction, when a connection closes, newLISP will go through a reset process, reinitialize stack and signals and go to the MAIN context. Only the contents of program and variable symbols will be preserved."



But with CGI processing, the reset happens as soon as the connection is closed. Which approach should I look into to solve my problem? Thanks a bunch.

rrq

#1
The CGI handling spawns CGI scripts as separate processes, each of which terminates at the end of its request, so retaining session state across a series of requests requires a separate session server.



I suppose the "traditional" setup would be a "back end database server" capable of holding session state, with the transitional business logic in CGI scripts. Also, if your notion of a session relates to the chaining of links, then response pages need to carry the session id on all in-session links; if it relates to user login instead, the user id may be used as the session key.
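To make the "back end holds the state, CGI scripts hold the logic" arrangement concrete, here is a minimal sketch in Python (chosen only for illustration; the same shape would apply to a newLISP CGI script). Everything named here is an assumption made for the example: the `sid` query parameter, the `SESSION_DIR` location, and the use of JSON files in place of a real database server.

```python
import json
import os
import tempfile
import uuid
from urllib.parse import parse_qs

# Hypothetical location for session files; a database table would play the same role.
SESSION_DIR = os.path.join(tempfile.gettempdir(), "cgi-sessions")

def load_session(query_string):
    """Return (session_id, state) for this request, starting a new session if needed."""
    sid = parse_qs(query_string).get("sid", [""])[0]
    # Accept only ids shaped like the ones we issue (32 hex chars), never a raw path.
    if len(sid) != 32 or any(c not in "0123456789abcdef" for c in sid):
        sid = uuid.uuid4().hex
    path = os.path.join(SESSION_DIR, sid + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return sid, json.load(f)
    return sid, {}

def save_session(sid, state):
    """Persist the state so the next, separate CGI process can pick it up."""
    os.makedirs(SESSION_DIR, exist_ok=True)
    with open(os.path.join(SESSION_DIR, sid + ".json"), "w") as f:
        json.dump(state, f)

def handle_request(query_string):
    """One CGI invocation: prime with stored state, update it, persist, respond."""
    sid, state = load_session(query_string)
    state["hits"] = state.get("hits", 0) + 1
    save_session(sid, state)
    # Every in-session link carries the session id.
    return sid, '<a href="?sid=%s">visit %d</a>' % (sid, state["hits"])

if __name__ == "__main__":
    sid, body = handle_request(os.environ.get("QUERY_STRING", ""))
    print("Content-Type: text/html\n")
    print(body)
```

Each request is a fresh process, so nothing survives in memory; the session id on the link is the only thing tying one request to the next.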



If you have only a few concurrent sessions, and complex or sizeable session state, you might consider the alternative architecture of spawning separate session servers, with the CGI scripts merely mediating the communication.

protozen

#2
Thanks that is what I figured. I think I'll end up with a session server of some sort.

TedWalther

#3
Are you thinking of using FastCGI, protozen?  I still haven't figured out a way to do shared state, although perhaps using newlisp itself as the web server, it can spawn processes as needed with state already initialized.
Cavemen in bearskins invaded the ivory towers of Artificial Intelligence.  Nine months later, they left with a baby named newLISP.  The women of the ivory towers wept and wailed.  "Abomination!" they cried.

rrq

#4
The problem of course is two-way: the handling of a request needs to be primed with the then current state, and at the end, the new current state needs to be captured and preserved until the next request in the same session. With a session server, the main difficulty is really to dispatch requests to the right session server.



I suppose you can (re-)write the HTTP handling in newlisp so as to spawn request handlers (rather than using "popen"), and then use the data sharing capabilities of newlisp to let the main process hold session state between request handlings. The spawned child automatically gains a copy of the server data, and it would send its session update to the server in addition to providing the HTTP response (before terminating).
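The process-control part of that idea can be sketched as follows. This is an illustration in Python rather than newLISP, for Unix-like systems only, and it swaps newLISP's own data sharing for a plain pipe carrying JSON; the function name `handle_in_child` and the shape of the state are made up for the example.

```python
import json
import os

def handle_in_child(sessions, sid, request):
    """Fork a handler that starts with a copy of `sessions` and reports its update."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its private copy of the parent's data.
        os.close(read_fd)
        state = dict(sessions.get(sid, {}))
        state["last"] = request
        state["count"] = state.get("count", 0) + 1
        with os.fdopen(write_fd, "w") as out:
            json.dump({"sid": sid, "state": state}, out)
        os._exit(0)  # terminate without running the parent's cleanup handlers
    # Parent: collect the update; changes made inside the child are otherwise lost.
    os.close(write_fd)
    with os.fdopen(read_fd) as inp:
        update = json.load(inp)
    os.waitpid(pid, 0)
    sessions[update["sid"]] = update["state"]
    return update["state"]

sessions = {}
handle_in_child(sessions, "abc", "/first")
handle_in_child(sessions, "abc", "/second")
print(sessions["abc"])
```

Here the parent blocks until the child has finished; a real front-end would instead watch many such pipes alongside its client sockets, which is where most of the remaining work lies.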



Whilst the process control side of this borders on the trivial, there's unfortunately a sizeable amount of code needed to provide the HTTP handling in all its variations, or even a useful subset. Still, it might be an option for a specific web application, if not for a framework.

protozen

#5
The concept of an application server for newLISP is really appealing for small web applications that can be embedded into a single executable or package. I've looked at existing frameworks to see if any of them fit, and the Artfulcode web handling modules are the closest I found. Since I don't need a full-blown server to handle a small number of requests, I think this should work out fine. I've looked at FastCGI and other interfaces to scale up and found some old attempts, but it's really not the direction I'm looking in. newLISP is wonderfully handy, and having a small native web server / framework would be nice. I may consider working on such a beast, or porting something like WEBrick from Ruby, if it becomes necessary. I'm only looking at this out of interest, for my own personal proof-of-concept programs.



Thanks guys

TedWalther

#6
I'll probably be implementing a FastCGI frontend for newLISP soon.  Why?  Because it allows newLISP to be easily dropped into an existing setup like hostmonster, and allows the advantages of shared state.  Apache, Nginx, and the new OpenBSD http server all support FastCGI.  In fact, the OpenBSD http daemon ONLY supports FastCGI.  So having the FastCGI interface has benefits: it lets newLISP run on the base OpenBSD system, and brings the speedup of FastCGI to the frameworks.  I'll be looking more closely at Dragonfly and newLISP on Rockets, and may integrate them.



I've been going through the code of "slowcgi"; it is fairly small and tight, and may be suitable to repurpose for newLISP.



I got the LibreSSL library working in newLISP by FFI; integrating it natively is a bit farther away.  It would be nice though, not only for things like get-url, but because we could then do secure email too, with the STARTTLS standard.  I'm guessing, but I think IMAP also implements something like STARTTLS, not just SMTP.



The benefit of using slowcgi as a base is that all the existing CGI code will work, and any new FastCGI code will be the same as CGI code, but maybe with one or two more calls added in.

rrq

#7
So, I of course couldn't resist the temptation of jotting together a small newlisp web application runner of my own, and, if you want to pucker your mind, it's available at http://www.realthing.com.au/files/newlisp/ranwar.tgz.



On linux, you extract only the Makefile, then "make run", which will first extract service/lsptar.lsp for you, and then use this to run the tgz with newlisp. It'll service port 10001.



Or, you untar it all, and just point newlisp to service/front-end.lsp.

TedWalther

#8
Ralph, interesting code, clean and pleasurable to read.  I didn't have time to get too deep into it.  In future, could you tar it up inside its own directory, so that it untars into its own directory instead of the current directory?



Does it use newLISP's -http mode?  Is it a backend for mod_lisp or mod_fastcgi?

rrq

#9
Good point. I'll fix the tar. Thanks.



No, ranwar doesn't use -http mode, but rather implements the front-end itself. This is in order to allow multiple concurrent control connections as well as multiple concurrent client connections. Its built-in (configurable) limit says at most 10 concurrent request handlers, where the real limitation is that each connection needs at least 2 file descriptors.



In fact, the front-end doesn't reach the REPL at all. One could drop into REPL and have connection handling logic as a prompt-event or command-event handler, but it'd be moderately convoluted. Alternatively one could include stdin as a special control connection, and fake a REPL loop that way.



All in all, it's a complete HTTP service with all request handling in newlisp (and not conforming to any CGI spec version). However, as the main connection handling is in no way protected against malicious clients, I'd advise against using it as a public web server without improving on that. Also, the mime type discovery should probably utilize e.g. the "mimetype" program rather than the DIY approach, although it'd then be a less "pure newlisp code" solution.

rrq

#10
Just a note, that I've updated the distribution tgz to unpack into a directory, which you cd into and type "make run". That run directory holds the Makefile, lsptar.lsp, and the application runner, which is still packed into a tgz, and it unpacks like before into the several sub directories.



In addition to "run", the Makefile has the "pack" target for re-packing the application runner, and the "dist" target for re-packing a distribution into the parent directory, named by the run directory (plus .tgz).