qa-float crash

newdep · September 26, 2009, 12:56:53 PM

Aaah, fixed it!

Lutz, I sent you a PM about this. I'll post the solution in here once you have checked it.

Look Mam... no hands!

> (/ 0 0)

ERR: division by zero in function /

> (div 0)

ERR: division by zero in function div

> (div 0 0)

nan

> (log 0)

-inf

> (sqrt -1)

nan

> (div 0)

inf

>

Lutz · September 26, 2009, 01:24:22 PM

yes, seems to be solved. This will avoid the compiler warning:

#ifdef OS2
    case SIGFPE:
        errorProc(ERR_MATH);
        break;
#endif

So it runs qa-float (the one in 10.1.6 that checks signed inf) well?

I will either make a development release for 10.1.6, or perhaps wait until the next release update and post just the affected files in the development directory.

newdep · September 26, 2009, 01:39:14 PM

It seems that the very first time a NaN or Inf occurs, it returns the "ERR: division by zero" message. The next time you run the same function, it returns the NaN or Inf.

So somewhere the ERR: is still in the way...

qa-float now stops at the ERR:

* fresh startup *

> (sqrt -1)

ERR: division by zero in function sqrt

> (sqrt -1)
nan


* fresh startup *

> (div 0)

ERR: division by zero in function div
> (div 0)
inf
>

Lutz · September 26, 2009, 02:17:26 PM

Do you have this at line 343 in function setupAllSignals()?

#ifdef OS2
setupSignalHandler(SIGFPE, signal_handler);
#endif

Lutz · September 26, 2009, 02:19:52 PM

... perhaps you just take "errorProc(...)" out and let the handler catch the signal and do nothing:

#ifdef OS2
    case SIGFPE:
        break;
#endif

in line 412

newdep · September 26, 2009, 02:31:11 PM

No, that doesn't work. I read somewhere that to actually catch SIGFPE you need a longjmp or a separate function... I think that is what now happens with the errorProc action.

Calling PrintErrorMessage(...) directly on its own doesn't work, and leaving the case empty with just a break causes the real SIGFPE again. So I need to re-route the signal and clear the ERR before it displays the NaN...

* added *

What does the return(nilCell); do in errorProcAll? I think I need that in the SIGFPE case. Or a signal reset?

newdep · September 26, 2009, 02:56:10 PM

Just wanted to see what actually happens with the signals.

The first time newLISP sees a division by zero after startup, it reports the ERR: and I get trap number 8 (which is SIGFPE). The second time there is NO signal, just the "inf" directly. Is this just dumb luck? Or is there really something in between? The second time it is not coming from SIGFPE, otherwise I would have seen the Signal message again... Mmmm

newLisp v 10.1.6 ........

> (div 0)

Signal = 8

ERR: division by zero in function div

> (div 0)

inf

>

newdep · September 26, 2009, 03:13:54 PM

I'll try a different signal handler tomorrow.

The GNU C Library manual says this about SIGFPE:

Quote:

Macro: int SIGFPE

The SIGFPE signal reports a fatal arithmetic error. Although the name is derived from "floating-point exception", this signal actually covers all arithmetic errors, including division by zero and overflow. If a program stores integer data in a location which is then used in a floating-point operation, this often causes an "invalid operation" exception, because the processor cannot recognize the data as a floating-point number. Actual floating-point exceptions are a complicated subject because there are many types of exceptions with subtly different meanings, and the SIGFPE signal doesn't distinguish between them. The IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985 and ANSI/IEEE Std 854-1987) defines various floating-point exceptions and requires conforming computer systems to report their occurrences. However, this standard does not specify how the exceptions are reported, or what kinds of handling and control the operating system can offer to the programmer.

BSD systems provide the SIGFPE handler with an extra argument that distinguishes various causes of the exception. In order to access this argument, you must define the handler to accept two arguments, which means you must cast it to a one-argument function type in order to establish the handler. The GNU library does provide this extra argument, but the value is meaningful only on operating systems that provide the information (BSD systems and GNU systems).

FPE_INTOVF_TRAP

Integer overflow (impossible in a C program unless you enable overflow trapping in a hardware-specific fashion).

FPE_INTDIV_TRAP

Integer division by zero.

FPE_SUBRNG_TRAP

Subscript-range (something that C programs never check for).

FPE_FLTOVF_TRAP

Floating overflow trap.

FPE_FLTDIV_TRAP

Floating/decimal division by zero.

FPE_FLTUND_TRAP

Floating underflow trap. (Trapping on floating underflow is not normally enabled.)

FPE_DECOVF_TRAP

Decimal overflow trap. (Only a few machines have decimal arithmetic and C never uses it.)
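
On POSIX systems the cause code described in the quote reaches the handler through siginfo_t.si_code when the handler is installed with sigaction() and SA_SIGINFO. The following is only a sketch under stated assumptions: it targets a POSIX platform (not the OS2 build discussed in this thread), it uses the POSIX names FPE_INTDIV, FPE_FLTDIV and so on instead of the BSD *_TRAP names quoted above, and fpe_reason() and catch_raised_fpe() are hypothetical helper names. To stay portable it triggers the signal with raise(), so the recorded code is not an arithmetic one.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler; only async-signal-safe stores happen in there. */
static volatile sig_atomic_t handler_ran = 0;
static volatile sig_atomic_t last_fpe_code = 0;

/* Map a SIGFPE si_code (POSIX names) to a short description. */
const char *fpe_reason(int code)
{
    switch (code) {
    case FPE_INTDIV: return "integer division by zero";
    case FPE_INTOVF: return "integer overflow";
    case FPE_FLTDIV: return "floating-point division by zero";
    case FPE_FLTOVF: return "floating-point overflow";
    case FPE_FLTUND: return "floating-point underflow";
    case FPE_FLTRES: return "floating-point inexact result";
    case FPE_FLTINV: return "invalid floating-point operation";
    case FPE_FLTSUB: return "subscript out of range";
    default:         return "not an arithmetic fault";
    }
}

/* Three-argument handler form, selected by SA_SIGINFO. */
static void on_fpe(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    last_fpe_code = info->si_code;
    handler_ran = 1;
}

/* Install the handler, send SIGFPE to ourselves, report whether it ran.
   Returns 1 if the handler ran, 0 if not, -1 if sigaction() failed. */
int catch_raised_fpe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fpe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;

    handler_ran = 0;
    raise(SIGFPE);   /* software-generated, so returning from the handler is safe */
    return handler_ran ? 1 : 0;
}
```

After catch_raised_fpe() returns, fpe_reason(last_fpe_code) gives the description. For a signal produced by a real hardware fault the handler must not simply return, which matches the observation earlier in the thread that an empty case re-triggers the SIGFPE.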

newdep · September 27, 2009, 12:25:57 PM

Hi Lutz,

The test program below works out of the box on my OS2 machine, no strange things at all.

In newLISP the very first time SIGFPE occurs I still get the ERR:... and from the second time on the nan or inf.

Perhaps you know where the "ERR:" mix-up could be in newLISP?

Because I can't find it... ;-)

#include <stdio.h>
#include <float.h>
#include <signal.h>
#include <math.h>
#include <setjmp.h>

/* testing NaN and Inf return */


/* store stack */
jmp_buf errorJump;
int errorReg = 0;

void signal_handler(int sig)
{
switch(sig)
	{
	case SIGFPE: 
/*	signal(SIGFPE,SIG_DFL); */
	printf("%s", "SIGFPE!\n"); 
	longjmp(errorJump,errorReg);	
	break;	
	default: return;	
	}

}

int main ()
{

	/* save stack */
	setjmp(errorJump);

	/* nan-inf go through sigfpe */
	signal(SIGFPE, signal_handler); 

	double nfloat;

	nfloat = (sqrt (-1));
	printf("sqrt=%f\n", nfloat );

	nfloat = (log (0));
	printf("log=%f\n", nfloat );

	nfloat /= 0;
	printf("div=%f\n",  nfloat ); 

}


[E:PROGNLnewlisp-10.1.6]f
SIGFPE!
sqrt=nan
log=-inf
div=-inf

Here I removed the errorProc function and used only the longjmp. The first time that returns nothing, and after that the nan. That is also not like my example; it seems there is still some code inside newLISP interfering with the results? But I can't find it.

> (sqrt -1)
> (sqrt -1)
nan
>

newdep · September 28, 2009, 02:07:21 AM

I found the double entry that causes the problem. It is inside the errorReg check...

Mmm, actually it is a setjmp/longjmp issue, where the int variable is 0 or 1 depending on the number of jumps now in use.

if((errorReg = setjmp(errorJump)) != 0) 
    {
    printf("ErrorReg2=%d\n", errorReg);

    if(errorReg && (errorEvent != nilSymbol) ) 
        executeSymbol(errorEvent, NULL, NULL);
    else  exit(-1);

    goto AFTER_ERROR_ENTRY;
    }

At first I thought there might be a difference in gcc or OS2 regarding the setjmp behaviour, so I tested with this -> http://www-personal.umich.edu/~williams/archive/computation/setjmp-fpmode.html

But that behaves identically on both my Linux and OS2 machines.

A closer look shows this flow inside newLISP ->

First setjmp returns 0 (= errorReg) in the code above, then the (div 0) is entered.

The SIGFPE signal_handler sees errorReg = 0 (the initial value).

Because there is a longjmp, the setjmp it lands on returns 1 (from the longjmp).

So the errorReg check does "goto AFTER_ERROR_ENTRY" with a new setjmp, but that one of course returns 1 (due to the last longjmp).

At this point the signal_handler and the jmp_buf both hold 1 and the stack is the same.

OK, I'm now looking through the code for a fix, because these "saved stacks" need to be in sync ;-)

newdep · September 28, 2009, 04:47:09 AM

I could cheat by putting a signal trigger like sqrt(-1); inside the C code. But that is not what I would like to see, and I'm also not sure the stacks are in sync...

newdep · September 28, 2009, 05:11:58 AM

I don't see any workable way currently without cheating on the SIGFPE... perhaps you have an extra clue?

This is how the SIGFPE adjustment to 10.1.6 now looks ->

This is inside setupAllSignals ->

#ifdef OS2
setupSignalHandler(SIGFPE, signal_handler);
/* force a SIGFPE trigger when newlisp starts */
/* this is to activate the NaN Inf returns!   */
(sqrt (-1));
/**********************************************/
#endif

this is inside the signal_handler

#ifdef OS2
	/* SIGFPE must be forced for a NaN Inf */
	/* the longjmp returns 1 to setjmp when set */
	case SIGFPE:
		longjmp(errorJump,errorReg);
		break;
#endif

the output of qa-float is this ->

operation on NaN result in NaN                 
-----------------------------------------------
                     (NaN? (mul 1 aNan)) => true
                     (NaN? (div 1 aNan)) => true
                     (NaN? (add 1 aNan)) => true
                     (NaN? (sub 1 aNan)) => true
                       (NaN? (sin aNan)) => true
                       (NaN? (cos aNan)) => true
                       (NaN? (tan aNan)) => true
                      (NaN? (atan aNan)) => true

comparison with NaN is always nil              
-----------------------------------------------
                        (not (< 1 aNan)) => true
                        (not (> 1 aNan)) => true
                       (not (>= 1 aNan)) => true
                       (not (<= 1 aNan)) => true
                     (not (= aNan aNan)) => true

NaN is not equal to itself                     
-----------------------------------------------
                     (not (= aNan aNan)) => true

integer operations assume NaN as 0             
-----------------------------------------------
                        (= (- 1 aNan) 1) => true
                        (= (+ 1 aNan) 1) => true
                        (= (* 1 aNan) 0) => true
         (not (catch (/ 1 aNan) 'error)) => true
                         (= (>> aNan) 0) => true
                         (= (<< aNan) 0) => true

integer operations assume inf as max-int       
-----------------------------------------------
      (= (* 1 aInf) 9223372036854775807) => true
      (= (- aInf 1) 9223372036854775806) => true
     (= (+ aInf 1) -9223372036854775808) => true

FP division by inf results in 0                
-----------------------------------------------
                        (= (/ 1 aInf) 0) => true
                      (= (div 1 aInf) 0) => true

inf specials                                   
-----------------------------------------------
                           (= aInf aInf) => true
                  (NaN? (sub aInf aInf)) => true

retain sign of -0.0                            
-----------------------------------------------
        (= (set 'tiny (div -1 aInf)) -0) => true
                      (= (sqrt tiny) -0) => true

inf is signed too                              
-----------------------------------------------
                  (= aNegInf (div -1 0)) => true
                  (!= aNegInf (div 1 0)) => true

mod with 0 divisor is NaN                      
-----------------------------------------------
                       (NaN? (mod 10 0)) => true

% with 0 divisor throws error                  
-----------------------------------------------
           (not (catch (% 10 0) 'error)) => true

support of subnormals: (0 4.940656458e-324) => (0 4.940656458e-324)
machine epsilon: 1.110223025e-16 => 1.110223025e-16
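
Several of the checks in this output can be mirrored directly in C. This is a minimal sketch, assuming IEEE 754 doubles and a platform where floating-point exceptions are masked by default, so the divisions produce values instead of raising SIGFPE (the situation the OS2 build is trying to reach). The function names are hypothetical and are not part of qa-float.

```c
#include <math.h>

/* volatile stops the compiler from folding the divisions at compile time */
static volatile double zero = 0.0;

/* "NaN is not equal to itself" */
int nan_differs_from_itself(void)
{
    double n = zero / zero;          /* 0/0 gives NaN */
    return n != n;
}

/* "inf is signed too": 1/0 and -1/0 are both infinite but differ in sign */
int inf_is_signed(void)
{
    double pos = 1.0 / zero;
    double neg = -1.0 / zero;
    return isinf(pos) && isinf(neg) && pos > 0.0 && neg < 0.0 && pos != neg;
}

/* "retain sign of -0.0": -1/inf compares equal to 0 but keeps its sign bit */
int negative_zero_keeps_sign(void)
{
    double inf = 1.0 / zero;
    double tiny = -1.0 / inf;
    return tiny == 0.0 && signbit(tiny) != 0;
}
```

Each function returns 1 when the property holds. On a system where the FPU traps on division by zero or invalid operations, these same lines would raise SIGFPE instead of returning.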

newdep · September 28, 2009, 05:58:06 AM

I forgot to mention the extra setjmp: I also added the extra errorReg = setjmp(errorJump); call.

if((errorReg = setjmp(errorJump)) != 0) 
    {
    if(errorReg && (errorEvent != nilSymbol) ) 
        executeSymbol(errorEvent, NULL, NULL);
    else exit(-1);
    goto AFTER_ERROR_ENTRY;
    }

errorReg = setjmp(errorJump);
setupAllSignals();


Lutz · September 28, 2009, 06:48:34 AM

Quote:

#ifdef OS2
	/* SIGFPE must be forced for a NaN Inf */
	/* the longjmp returns 1 to setjmp when set */
	case SIGFPE:
		longjmp(errorJump,errorReg);
		break;
#endif

setjmp() will return 1 only if errorReg was 1, but at program start and after a reset errorReg is set to 0, and I think 0 is what setjmp() returns when doing the longjmp(). If it made setjmp() return 1, we would see "Not enough memory" reported as the error, which is defined as 1.

Can you try this?

#ifdef OS2
   case SIGFPE:
      longjmp(errorJump,0);
      break;
#endif

I believe it also will work.

newdep · September 28, 2009, 07:16:18 AM

No, that results in the same "double" effect.

Changing it to 1 is of no use either; there is always a mismatch in the jmp_buf content.
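
One likely reason that 0 and 1 behave the same here: the C standard says that when longjmp() is called with a value of 0, setjmp() returns 1 instead. The small probe below shows this. It is a sketch only; setjmp_result_for() is a hypothetical helper, and it assigns the result of setjmp() to a variable in the same way the newLISP snippet in this thread does.

```c
#include <setjmp.h>

static jmp_buf probe;

/* Jump back to our own setjmp() with `value` and report what setjmp()
   returned on that second pass. */
int setjmp_result_for(int value)
{
    int r = setjmp(probe);

    if (r == 0) {
        /* first pass: direct return from setjmp(), now jump back */
        longjmp(probe, value);
    }
    /* second pass: r is the value setjmp() produced after the longjmp() */
    return r;
}
```

Calling setjmp_result_for(0) and setjmp_result_for(1) both give 1, while setjmp_result_for(7) gives 7. So longjmp(errorJump,0) cannot be told apart from longjmp(errorJump,1) at the setjmp() site.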

How about sigsetjmp and siglongjmp and sigjmp_buf?

This is how I see the flow in newLISP now with SIGFPE involved. Correct me here if I'm wrong ;-) it only helps finding the itch...



main()
   |
errorReg = 0
setjmp(errorJump) 
   |
setupAllSignals init (NOT SIGFPE, because its only triggered on exception)
   |
(sqrt -1)  (on the newlisp console)
   |
SIGFPE trigger with errorReg = 0 (from the first fresh init)
longjmp(errorJump,errorReg)  (initial stack with errorReg = 0)
   |
errorReg = 1 (is always 1 when returns from longjmp!)
setjmp(errorJump) != 0
(no return on console (sqrt -1) because errorReg is now 1 which is a NEW stack)
   |
errorReg = setjmp(errorJump)  (is now 1 because of longjmp)
setupAllSignals (no trigger for SIGFPE)
   |
(sqrt -1)
   |
SIGFPE trigger with errorReg = 1 (new errorReg value from previous setjmp)
  |
(return "nan") (because the jmp_buf stack is now in sync)
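
The sigsetjmp/siglongjmp idea raised in this post can be sketched as follows. Assumptions, stated loudly: this is for a POSIX system, not a tested OS2 fix; it is not newLISP's code; recover_from_fpe() and the other names are hypothetical; and the signal is sent with raise() rather than by a real arithmetic fault, so that the example behaves the same on every machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recovery_point;
static volatile sig_atomic_t handler_runs = 0;

/* Leave the handler by jumping back to the saved recovery point. */
static void on_fpe(int sig)
{
    (void)sig;
    handler_runs++;
    siglongjmp(recovery_point, 1);
}

/* Returns how many times the handler ran (1 is expected),
   -1 if sigaction() failed, -2 if the handler never jumped. */
int recover_from_fpe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;

    handler_runs = 0;
    /* savemask = 1: siglongjmp() also restores the signal mask, so SIGFPE
       is not left blocked after jumping out of the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        raise(SIGFPE);      /* stand-in for the faulting operation */
        return -2;          /* reached only if the handler did not jump */
    }
    /* second pass: we arrived here from the handler */
    return handler_runs;
}
```

The difference from plain setjmp/longjmp is the saved signal mask: after a plain longjmp() out of a handler, whether the signal stays blocked is left to the platform.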
