Segmentation Fault running NewLisp "Daemon mode"

Started by newdep, July 17, 2004, 03:53:36 AM

newdep

Hi Lutz,



Running Newlisp 8009.



A segmentation fault occurs when connecting to Newlisp while it is running in daemon mode.



Example #1:



bash-2.05b$ newlisp -L -l -d 50000 &

[1] 705



--- Now I telnet to "localhost 50000"

--- Newlisp prompt

--- >(exit)



[1]+  Segmentation fault      newlisp -L -l -d 50000

bash-2.05b$







Example #2:



bash-2.05b$ newlisp -L -l -d 50001 &

[1] 713



--- Now I telnet to "localhost 50001"

--- Newlisp prompt

--- >(exit)



bash-2.05b$ fg

newlisp -L -l -d 50001

Segmentation fault

bash-2.05b$







Hope you can catch the bugger...



Norman...
-- (define? (Cornflakes))

newdep

#1
Hello Lutz,





The segmentation fault also occurs on the daemon side in version 8.0.10.



When Newlisp is running in -d mode (not in -p mode) and the remote client connects and types ONLY (exit) at the first prompt, Newlisp crashes with a segmentation fault.



When the client first presses ENTER and THEN types (exit), it is OK.



(A minor issue, I think.)





Norman.
-- (define? (Cornflakes))

Lutz

#2
This bug has been there for a long time and does not occur on Win32 and BSD. It is not limited to doing an (exit) right away, but can also occur in other circumstances.



If you have any idea how to fix this, help would be appreciated ;)



Lutz



PS: use only one of the -L or -l options; if you use both, the last one wins.

eddier

#3
Works fine using the Linux 2.6.x kernel. Cannot remember the last digit. What kernel are you using?



Eddie

Lutz

#4
I am not sure, have to check, whatever Mandrake 9.2 uses. Sometimes you have to exit and reconnect from the client several times to provoke the error, as if it is some timing problem.



Lutz

eddier

#5
OK, I see. After two tries I got the segmentation fault.



Mandrake's latest kernel is probably 2.4.x. I think all stable distributions except maybe Turbo Linux use this kernel. On install, you can choose the 2.2.x or the 2.4.x kernel; it defaults to 2.2.x.



I'm using Debian testing. For a client this is ok. For the server side I would talk to someone who deals with security. However, I've noticed everything is much faster with the 2.6 kernel. EVERYTHING!



I've run 2.2.x (Mandrake), 2.4.x (Debian), FreeBSD, NetBSD and 2.6.x (Debian) on this same machine (AMD 2400+ with 512M memory). I noticed that the 2.6.x kernel has a much snappier response than 2.4.x and even FreeBSD. I wonder if that holds as well on Intel machines?



I like FreeBSD for a server and 2.6.x as a client.



Eddie

newdep

#6
I'm running Slackware here, with kernels 2.4.20 / 2.4.26...



Let's see if I can find anything related to this issue in the code...



Regards, Norman.
-- (define? (Cornflakes))

newdep

#7
OK, I did some tracing on my Linux machine. Here is the output: the first run is OK and the second one crashes.



It stops after the file control call (fcntl64) the second time. This is one daemon session, by the way.



It looks like file control is being performed on a variable which isn't there.



Perhaps you see it quicker ;-)







### FIRST ###



accept(3, {sin_family=AF_INET, sin_port=htons(34398), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 1

getpeername(1, {sin_family=AF_INET, sin_port=htons(34398), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 0

time(NULL)                              = 1090333867

open("/etc/localtime", O_RDONLY)        = 4

fstat64(4, {st_mode=S_IFREG|0644, st_size=1074, ...}) = 0

old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000

read(4, "TZifrr"..., 4096) = 1074

close(4)                                = 0

munmap(0x40015000, 4096)                = 0

fcntl64(1, F_GETFL)                     = 0x2 (flags O_RDWR)

fstat64(1, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0

old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000

_llseek(1, 0, 0xbffff570, SEEK_CUR)     = -1 ESPIPE (Illegal seek)

munmap(0x40015000, 4096)                = 0

write(1, "newLISP v8.0.10 Copyright (c) 20"..., 70) = 70

ioctl(1, SNDCTL_TMR_TIMEBASE, 0xbffff700) = -1 EINVAL (Invalid argument)

write(1, "\n> ", 3)                     = 3

read(1, "(", 1)                         = 1

read(1, "e", 1)                         = 1

read(1, "x", 1)                         = 1

read(1, "i", 1)                         = 1

read(1, "t", 1)                         = 1

read(1, ")", 1)                         = 1

read(1, "\r", 1)                        = 1

read(1, "\n", 1)                        = 1

close(1)                                = 0

old_mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000

write(-1, "\n", 1)                      = -1 EBADF (Bad file descriptor)

close(1)                                = -1 EBADF (Bad file descriptor)





### SECOND ###

accept(3, {sin_family=AF_INET, sin_port=htons(34399), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 1

getpeername(1, {sin_family=AF_INET, sin_port=htons(34399), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 0

time(NULL)                              = 1090333877

fcntl64(1, F_GETFL)                     = 0x2 (flags O_RDWR)

fstat64(1, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0

old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000

_llseek(1, 0, 0xbffff570, SEEK_CUR)     = -1 ESPIPE (Illegal seek)

munmap(0x40017000, 4096)                = 0

write(1, "\n> ", 3)                     = 3

read(1, "(", 1)                         = 1

read(1, "e", 1)                         = 1

read(1, "x", 1)                         = 1

read(1, "i", 1)                         = 1

read(1, "t", 1)                         = 1

read(1, ")", 1)                         = 1

read(1, "\r", 1)                        = 1

read(1, "\n", 1)                        = 1

close(1)                                = 0

old_mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000

write(-1, "\n> ", 3)                    = -1 EBADF (Bad file descriptor)

close(1)                                = -1 EBADF (Bad file descriptor)

accept(3, {sin_family=AF_INET, sin_port=htons(34400), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 1

getpeername(1, {sin_family=AF_INET, sin_port=htons(34400), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 0

time(NULL)                              = 1090333887

fcntl64(1, F_GETFL)                     = 0x2 (flags O_RDWR)

--- SIGSEGV (Segmentation fault) ---

+++ killed by SIGSEGV +++
-- (define? (Cornflakes))
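
One plausible reading of the trace above (an assumption; the thread never states the actual fix) is that the session stream is used and closed again after its descriptor is already gone: `close(1)` succeeds once, a write then goes to descriptor -1, and a second `close(1)` fails with EBADF. A minimal C sketch of a close-once pattern follows. The names `open_session` and `close_session` are hypothetical and are not taken from nl-sock.c.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from nl-sock.c): wrap a connected descriptor
 * in a stdio stream. If fdopen() fails, the descriptor is closed here so
 * the caller does not leak it. */
FILE *open_session(int fd)
{
    FILE *stream = fdopen(fd, "r+");
    if (stream == NULL)
        close(fd);
    return stream;
}

/* Hypothetical helper: close the session exactly once. fclose() also
 * closes the underlying descriptor, so no separate close(fd) is issued.
 * The caller's pointer is reset to NULL, which turns any later call into
 * a harmless no-op instead of a second fclose() on a freed stream. */
int close_session(FILE **stream)
{
    int rc = 0;

    assert(stream != NULL);
    if (*stream != NULL) {
        rc = fclose(*stream);
        *stream = NULL;
    }
    return rc;
}
```

With this shape, code that runs after the session ends can test the pointer for NULL before writing a prompt, rather than writing to a stream whose descriptor has already been released.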

Lutz

#8
Thanks for the trace, Norman. I think I found the problem.



Lutz

Lutz

#9
Version 8.0.11 in http://newlisp.org/downloads/development/ solves this problem.



Lutz

newdep

#10
Lutz, thanks for the quick fix... but now in release 8.0.11 the -d option no longer behaves as a daemon; it drops out after one connection exits...



It now looks like, when the client does an (exit), the daemon re-binds to the port too quickly (because it has closed it), fails, and exits...



But you're right, the segmentation fault is gone ;-)



PS: perhaps one hint for an enhancement: when one client is connected you can close the "listener" so no more clients can connect and you keep your current session... that way you don't "pressure" newlisp on the sockets... and simply re-open the listener when the client has exited...
-- (define? (Cornflakes))
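
If the early exit really is a failed re-bind of a just-closed port (that is only the guess made above, not something the thread confirms), a common remedy on Unix sockets is to set `SO_REUSEADDR` before `bind()`. The C sketch below shows that idea in isolation. `open_listener` and `listener_port` are hypothetical names for illustration and do not come from nl-sock.c.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: open a TCP listener on `port` (0 lets the kernel
 * pick a free port). Returns the listening descriptor, or -1 on error. */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;

    /* Let bind() succeed even while connections from a previous listener
     * on the same port are still in TIME_WAIT. */
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(sock);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(sock, 1) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: report the port a listener is bound to, or -1. */
int listener_port(int sock)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(sock, (struct sockaddr *) &addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

A daemon that closes its listener while a client is connected could call `open_listener` again with the same port once the client has exited. Whether that matches what 8.0.11 does internally is not known from this thread.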

Lutz

#11
This is crazy: on my system, Mandrake Linux 9.2 with kernel 2.4.22, it is working OK.



I then added



deleteInetSession(sock);

close(sock);



after:



connection = accept(sock, (struct sockaddr *) &dest_sin, &dest_sin_len);



in the function FILE * serverFD(int port, int reconnect) in file nl-sock.c, which closes the listen socket after accepting a connection. Now it also exits right away on my side, and it also breaks on BSD, which all doesn't make much sense :(



Lutz

Lutz

#12
Norman, I wonder if this http://newlisp.org/downloads/development/Norman/ makes any difference in the -d mode on your Linux system?



Lutz

newdep

#13
Hi Lutz,



Did a quick test but no changes... In -d mode, after the first connect and disconnect from the client, the daemon also quits (nicely), but it no longer runs as a daemon...



Norman.
-- (define? (Cornflakes))

eddier

#14
Doesn't break on Debian 2.6.x



Eddie