newLISP Fan Club

Forum => Anything else we might add? => Topic started by: Jeff on March 27, 2008, 10:43:35 AM

Title: Threading question
Post by: Jeff on March 27, 2008, 10:43:35 AM
I saw a really interesting project (http://supertech.csail.mit.edu/cilk/) and had the idea to write something similar for newLISP.  The idea is that there is a spawn keyword that automatically handles the creation of a thread and returns the result of the forked evaluation by setting a variable.  Here is my implementation:


#!/usr/bin/env newlisp

(context 'spawn)

; list of (shared-address target-symbol pid) entries for outstanding forks
(set '_procs '())
; semaphore used as a shared counter of running forks
(set '_counter (semaphore))

(define-macro (spawn:spawn)
  (println (semaphore _counter))
  ; intended to cap the number of concurrent forks at 4
  (if (> 4 (semaphore _counter)) (spawn:sync))
  (semaphore _counter (+ 1 (semaphore _counter)))
  (let ((target-symbol (args 0))
        (expr (args 1))
        (shared (share)))
    (push
      (list shared target-symbol
        (letex ((body expr))
          (fork
            (begin
              (set 'spawn:_procs '())
              (share shared body)
              (exit)))))
      _procs -1)))

(define (spawn:sync)
  (let ((proc (pop _procs)))
    (when proc
      (letex ((shared (proc 0))
              (target (proc 1))
              (pid (proc 2)))
        (wait-pid pid)
        (semaphore _counter (- (semaphore _counter) 1))
        (set target (share shared))
        (share nil shared)))))

(context MAIN)

(define (sync)
  (while (spawn:sync)))

(constant (global 'sync))

(define (fib n , x y)
  (if (< n 2) 2
      (begin
        (spawn 'x (fib (- n 1)))
        (spawn 'y (fib (- n 2)))
        (sync)
        (+ x y))))

(define (fib2 n)
  (if (< n 2) 2
      (+ (fib2 (- n 1)) (fib2 (- n 2)))))

(println (fib 10))
(println (fib2 10))


You can see in spawn:spawn that it attempts to limit the number of forks, but it is not respecting the value and it quickly gets out of control.  I can't think what is wrong.  fib and fib2 (a regular version) both come up with correct results, so the syncing is working.  If I don't have a control for the number of processes, OS X refuses to start more procs.  Can anyone see what I'm missing here?
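The limit-and-reap pattern Jeff is aiming for can be sketched in Python (a rough analogue only; the squaring workload and all names are hypothetical, and multiprocessing's BoundedSemaphore plays the role of the `_counter` semaphore):

```python
import multiprocessing as mp

ctx = mp.get_context("fork")   # children inherit state, like newLISP's fork

MAX_PROCS = 4
# counting semaphore initialized to the cap: acquire before forking,
# release after reaping, so at most MAX_PROCS children run at once
slots = ctx.BoundedSemaphore(MAX_PROCS)

def worker(q, n):
    q.put(n * n)               # stand-in for the forked computation

def spawn(q, n):
    slots.acquire()            # blocks while MAX_PROCS workers are live
    p = ctx.Process(target=worker, args=(q, n))
    p.start()
    return p

def sync(procs):
    for p in procs:
        p.join()               # reap the child...
        slots.release()        # ...then free its slot

q = ctx.Queue()
procs = [spawn(q, n) for n in range(4)]
sync(procs)
results = sorted(q.get() for _ in procs)
print(results)  # [0, 1, 4, 9]
```

The key difference from the macro above is that the acquire/release pair brackets each child's whole lifetime, so the counter can never drift.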
Title:
Post by: Jeff on March 27, 2008, 04:25:12 PM
Never mind, I see what's going on.  I wasn't thinking :).
Title:
Post by: cormullion on March 28, 2008, 12:48:19 AM
You'll have to tell us someday - I spent 5 minutes deleting all the newLISP processes it produced. :)
Title:
Post by: newdep on March 28, 2008, 07:38:31 AM
hahahaha..
Title:
Post by: Jeff on March 28, 2008, 07:59:44 AM
Proc a spawns proc b.   Proc b must spawn proc c to complete.  If the max proc limit is hit, proc b can't spawn proc c, but proc a is still waiting on proc b before it can complete.
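The deadlock Jeff describes can be reproduced with any bounded pool whose tasks wait on subtasks. A minimal Python illustration (hypothetical; a timeout stands in for the hang, since a real deadlock would block forever):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

pool = ThreadPoolExecutor(max_workers=1)   # "max proc limit" of 1

def b():
    # b needs c to finish, but the pool's only slot is held by b itself,
    # so c can never be scheduled while b waits on it
    c_future = pool.submit(lambda: 42)
    return c_future.result(timeout=1)

a_future = pool.submit(b)
try:
    a_future.result()
    deadlocked = False
except TimeoutError:
    deadlocked = True
pool.shutdown(wait=False)
print(deadlocked)  # True
```

With an unbounded pool (or a limit that counts only *runnable* children, not blocked parents) the same call chain completes normally.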
Title:
Post by: Lutz on March 28, 2008, 10:34:28 AM
Very interesting experiments! I wonder if it would be easier to use Unix local domain sockets instead of shared memory and semaphores to exchange data between parent and child processes.



You also wouldn't be limited to the segment size of shared memory (usually 4k), and you would avoid the deadlock conditions that can come with semaphores.



Using shared memory is faster, but considering the other overhead incurred by forking, and considering that only result data gets passed once per forked process, the slower speed of local domain sockets is of no concern.
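The socket-based exchange Lutz suggests looks like this in Python (a minimal sketch; `socketpair` gives a pre-connected Unix-domain pair, avoiding both the 4k segment limit and any semaphore):

```python
import os
import socket

parent_sock, child_sock = socket.socketpair()  # connected Unix-domain pair

pid = os.fork()
if pid == 0:
    # child: compute, send the result back over the socket, and exit
    parent_sock.close()
    child_sock.sendall(str(21 * 2).encode())
    child_sock.close()
    os._exit(0)

# parent: reap the child, then read its one result from the socket
child_sock.close()
os.waitpid(pid, 0)
result = int(parent_sock.recv(64))
parent_sock.close()
print(result)  # 42
```

Since each child writes exactly once and the parent reads exactly once, no synchronization beyond the socket itself is needed.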
Title:
Post by: Jeff on March 28, 2008, 10:44:45 AM
I am working on a scheduler that will use piped worker threads to handle calculations and determine dynamically where to send work. The problem there is that subprocs don't inherit environment.
Title:
Post by: Lutz on March 28, 2008, 12:08:24 PM
Quote: The problem there is that subprocs don't inherit environment.


Actually, a 'fork'ed process inherits all of it:


> (env "AAAAA" "99999")
true
> (env)
("MANPATH=/opt/local/man:/usr/share/man:/usr/local/share/man:/usr/X11/man" "TERM_PROGRAM=Apple_Terminal"
 "TERM=xterm-color" "SHELL=/bin/bash" "TMPDIR=/var/folders/da/dagS+gNgHr0tmLJJxS+VpE+++TI/-Tmp-/"
..... some info deleted for security reasons
"NEWLISPDIR=/usr/share/newlisp"
 "AAAAA=99999")
> (fork (println (env)))
592
> ("MANPATH=/opt/local/man:/usr/share/man:/usr/local/share/man:/usr/X11/man" "TERM_PROGRAM=Apple_Terminal"
 "TERM=xterm-color" "SHELL=/bin/bash" "TMPDIR=/var/folders/da/dagS+gNgHr0tmLJJxS+VpE+++TI/-Tmp-/"
..... some info deleted for security reasons
"NEWLISPDIR=/usr/share/newlisp"
 "AAAAA=99999")
>
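The same inheritance can be checked in Python (a small sketch using `os.fork` and a pipe; the `AAAAA` variable mirrors Lutz's session above):

```python
import os

os.environ["AAAAA"] = "99999"          # set in the parent before forking

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # child: report the inherited value back through the pipe and exit
    os.close(r)
    os.write(w, os.environ.get("AAAAA", "").encode())
    os._exit(0)

os.close(w)
os.waitpid(pid, 0)
inherited = os.read(r, 64).decode()
os.close(r)
print(inherited)  # 99999
```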
Title:
Post by: Jeff on March 29, 2008, 02:48:52 PM
I know.  But I need to have the entire env with each fork; spawn can't know what functions will be called in subsequent forks.
Title:
Post by: Lutz on March 30, 2008, 06:41:37 AM
As I understand the 'spawn' in the Cilk interface, each spawned process only needs the variable and stack environment of the parent, which it inherits via the 'fork' call. The parent only needs to communicate with the spawned child once, to receive the forked function's result via shared memory.



I don't think the parent needs to know about any variable changes the child is making. Of course, spawned code should be purely functional and should not change variables which are free (non-local) in the called function.



It should also be possible to implement Cilk without semaphores, because the danger of parent and spawned child both accessing the shared memory at the same time should never occur.



This is how I would implement it:



The parent gets a shared memory segment for each child it spawns, passes it to the child, and keeps a list of shares and pids for all spawned children. Whenever a child returns, the parent knows from the pid which shared memory segment to access. As spawned children may spawn other children in turn recursively (as is the case in the fibonacci function), each spawned child should first purge the spawned-child list it inherits from the parent, because it will create its own spawned-child list.



A shared memory synchronization problem never occurs: before a child returns, it writes to the shared memory segment and then exits. The parent then reads the shared memory segment and releases it.



Use the 1 option, as in (wait-pid pid 1), to make it non-blocking (WNOHANG). This way you can scan all spawned children continuously in a loop. On OS X, 'wait-pid' returns a hex 65000 when a child has finished and a 2 if nothing has happened. For the pid where (wait-pid pid 1) returns the exit status, you look up the shared memory address, read it, then release it.
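That non-blocking scan can be sketched in Python (hypothetical workload; pipes stand in for the shared-memory segments, and `os.WNOHANG` corresponds to the 1 option of 'wait-pid'):

```python
import os

def spawn(n):
    """Fork a child that squares n; a stand-in for the real spawned work."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(r)
        os.write(w, str(n * n).encode())
        os._exit(0)
    os.close(w)
    return pid, r

# the parent keeps a pid -> result-channel table for every spawned child
pending = dict(spawn(n) for n in (2, 3, 4))
results = []
while pending:
    for pid in list(pending):
        done, _status = os.waitpid(pid, os.WNOHANG)  # like (wait-pid pid 1)
        if done == pid:                              # this child has exited
            r = pending.pop(pid)
            results.append(int(os.read(r, 64)))      # collect its result...
            os.close(r)                              # ...and release the channel
print(sorted(results))  # [4, 9, 16]
```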



When testing with fib, use a small number, because (fib 10) already branches into 177 processes and (fib 5) into 8.



ps: in reality nobody would use Cilk to run this fibonacci algorithm, but it makes a good test case for checking the integrity of the implementation with recursive spawns.
Title:
Post by: Jeff on March 30, 2008, 07:38:20 AM
I assumed semaphores were synchronized. Anyway, yes: the fork only needs the parent's env for symbol lookup; a fork may call any function in the parent's scope. I used semaphores because all forks need to know the total number of procs running. Each process needs to be able to decide whether to spawn a new proc or just run the code in the current proc, and a semaphore seemed the easiest way.
Title:
Post by: Lutz on March 30, 2008, 09:13:11 AM
Quote: Each process needs to be able to decide whether to spawn a new proc or just run the code in the current proc


Depending on the task to spawn, even if you are out of processor cores, it may still be an advantage to spawn/fork, e.g. for retrieving web pages, serving TCP/IP requests, etc., where much of the task is just waiting; running several processes on the same core still gives you an advantage.



But if you must know the number of processes running, I wonder if there is some sort of libc system call to inquire about this number?



The BSD man page for ps mentions a function pstat, but I couldn't find out anything about it.



ps: perhaps some sort of sysctl() call, see: man 3 sysctl
Title:
Post by: Lutz on March 30, 2008, 09:28:13 AM
for the number of max processes you can run on Mac OS X:


(int (last (parse (first (exec "sysctl kern.maxproc"))))) => 532

but it still doesn't give us the number of procs running :-(



ps: or use kern.maxprocperuid for max per user
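One crude way to approximate the count Lutz is after is to count lines of `ps` output from the parent (an assumption-laden sketch: it relies on a Unix-style `ps` and counts all processes, not just our forks):

```python
import subprocess

# count the lines of `ps ax` output, minus the header line,
# as a rough answer to "how many processes are running?"
out = subprocess.run(["ps", "ax"], capture_output=True, text=True, check=True)
nprocs = len(out.stdout.strip().splitlines()) - 1
print(nprocs)
```

For the spawn scheduler itself, keeping a counter of one's own outstanding forks (as the semaphore attempts above) is cheaper and not fooled by unrelated processes.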
Title:
Post by: Jeff on March 30, 2008, 03:29:59 PM
The optimal number of processes to maintain is going to be around the number of CPUs.  If there is a constant scheduler thread, then one less than that.



I really need a mutex lock or a synchronized, shared block of memory: something that can be modified without knowledge of what other processes might also be accessing that block.
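What Jeff is asking for exists in Python's multiprocessing as a lock-guarded shared value; a minimal sketch (the counter and increment workload are illustrative only):

```python
import multiprocessing as mp

ctx = mp.get_context("fork")   # children inherit the shared value, as with fork

# a shared integer guarded by a mutex: any process may modify it without
# knowing which other processes are accessing the same block
counter = ctx.Value("i", 0)

def bump(times):
    for _ in range(times):
        with counter.get_lock():        # mutex around the read-modify-write
            counter.value += 1

procs = [ctx.Process(target=bump, args=(1000,)) for _ in range(4)]
for p in procs:
    p.start()
for p in procs:
    p.join()
print(counter.value)  # 4000
```

Without the lock, the four writers' read-modify-write cycles would interleave and the final count would come up short.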