Student's t-distribution (inverse) in newLISP

Started by rickyboy, March 21, 2012, 12:51:20 PM

Previous topic - Next topic

rickyboy

Hello everyone,



I'm in need of a handy function to help me compute confidence intervals based on the Student's t-distribution.  It is easy to do in Excel, for instance, since they provide a Student t inverse; they call it TINV.  See https://courses.washington.edu/dphs568/course/excel-t.htm">//https://courses.washington.edu/dphs568/course/excel-t.htm for more info.



I'd love to have something in newLISP like Excel's TINV or a way to compute such confidence intervals straightaway, because newLISP will be way better than Excel at generating the parametrized computations of the CIs that I have in mind.



Has anyone done this in newLISP already?  I hope so.  Thanks!
(λx. x x) (λx. x x)

Lutz

#1
One-tailed probabilities of the t-Distribution for degrees of freedom df:


; one-tailed probablity of Student's t
(define (prob-t t df)
    (let (bta (betai (div (add 1 (div (pow t) df))) (div df 2) 0.5))
        (if  
            (> t 0) (sub 1 (mul 0.5 bta))
            (< t 0) (mul 0.5 bta)
            0.5)
))

rickyboy

#2
Thanks, Lutz.  Your function prob-t returns the area under the t-distro from -infinity to t (at df degrees of freedom, course).  This is very useful!



Here's Excel's TDIST expressed in newLISP, which uses your function prob-t:
(define (TDIST t df tails) (mul tails (sub 1 (prob-t t df))))

> (TDIST 2.29 83 1) # =TDIST(2.29,83,1) in Excel
0.01227952285
> (TDIST 2.29 83 2) # =TDIST(2.29,83,2) in Excel
0.02455904571

¡Grácias, hermano!
(λx. x x) (λx. x x)

rickyboy

#3
Addendum: I still needed to build those confidence intervals, so I really needed something like Excel's TINV (as I mentioned in my initial post).  TINV is a form of the inverse of the prob-t function Lutz was so kind to provide for us.  Here is the blurb about TINV from Excel's help file (assuming, quite naturally, that Fair Use applies here):


QuoteTINV returns that value t, such that P(|X| > t) = probability where X is a random variable that follows the t-distribution and P(|X| > t) = P(X < -t or X > t).

<confession>Since I am not an expert at these kind of functions, I did what any good professional should do: shamelessly steal somebody's code who freely licenses it.  :)  In this case, the "victim" is http://compbio.ucdenver.edu/Hunter_lab/Hunter/">Dr. Lawrence Hunter.  He wrote this very nice stats package for Common Lisp called http://compbio.ucdenver.edu/Hunter_lab/Hunter/cl-statistics.lisp">cl-statistics.lisp.</confession>



Prof. Hunter coded a general function to compute critical values from a cdf, he called it find-critical-value.  I just ported it to newLISP and made a couple of slight modifications.  Here is a newLISP version of it.


(define (find-critical-value p-function p-value
                             (x-tolerance 0.000001) (y-tolerance 0.000001))
  "Only works if `p-function' is monotonically decreasing over x >= 0."
  (letn (x-low 0.0 fx-low (p-function x-low)
         x-high 1.0 fx-high (p-function x-high)
         x-diff 1.0 y-diff 1.0)
    (until (>= fx-low p-value fx-high)
      (setq x-low x-high
            fx-low fx-high
            x-high (mul 2.0 x-high)
            fx-high (p-function x-high)))
    (catch
     (while true
       (letn (x-mid (div (add x-low x-high) 2.0)
              fx-mid (p-function x-mid)
              y-diff (abs (sub fx-mid p-value))
              x-diff (sub x-high x-low))
         (if (or (< x-diff x-tolerance)
                 (< y-diff y-tolerance))
             (throw x-mid))
         (if (< p-value fx-mid)
             (setq x-low x-mid fx-low fx-mid)
             (setq x-high x-mid fx-high fx-mid)))))))

This function takes as its first argument a function p-function for which it needs to find a critical value at p-value, and returns a value x which satisfies the equation x = (p-function p-value).  It finds such an x by starting from 0 and searching forward (to the right of 0 on the x-axis) until it has determined an interval in which x exists.  (Not getting into too much detail, but it is based on the idea of the Intermediate Value Theorem.)  Then it searches this interval closing in on x by binary search (again relying on the idea of the IVT).  The caveat in the usage of this function is that it assumes the function p-function is continuous and monotonically decreasing.  The IVT requires continuity but not monotonicity.  The fact that the cdfs this stat package deals with are piecewise monotonic is a happy coincidence, since that makes the code of find-critical-value all the more simpler (and the inverses are actually functions too, not just relations in general).



Armed with find-critical-value, we can write a "crit" function which complements Lutz's prob-t, which I call crit-t taking after Lutz's naming convention (cf. newLISP's crit-chi2 and crit-z).


(define (crit-t p df)
  (letn (p* (if (< p 0.5) (sub 1 p) p)
         cv (find-critical-value (fn (x) (sub 1 (prob-t x df))) (sub 1 p*)))
    (if (< p 0.5) (sub cv) cv)))

The function prob-t is *not* monotonically decreasing, so we don't feed *it* to find-critical-value without transforming it.  It turns out that the function (fn (x) (sub 1 (prob-t x df))) on the interval [0,infinity] *is* monotonically decreasing.  We only need to ask find-critical-value to give us the critical value x for 1-p instead of p, under (fn (x) (sub 1 (prob-t x df))), to get the x we want.  [Why? because if p = (prob-t x df), then 1 - p = 1 - (prob-t x df).]  The rest of the code is there to push out the negative critical values, since x < 0 when p < 1/2.



Finally we can write our TINV function.


(define (TINV alpha df)
  (if (<= 0 alpha 1)
      (crit-t (sub 1 (mul 0.5 alpha)) df)
      (throw-error "TINV: First argument (alpha) must be between 0 and 1.")))

Its interface is exactly like Excel's TINV and even throws an error, as Excel does, if the caller uses an alpha (first argument) outside of [0, 1].  TINV is like crit-t, except that TINV takes alpha as an argument which corresponds to the percentile, whereas crit-t just takes the percentile (i.e. 1 - alpha/2).
(λx. x x) (λx. x x)

Lutz

#4
This weekend I added prob-t, crit-t and prob-f, crit-f as built-ins. A they are all related, it was natural to do them all and they share the Newton approximation algorithm with crit-chi2. See here:



http://www.newlisp.org/downloads/development/inprogress/">http://www.newlisp.org/downloads/develo ... nprogress/">http://www.newlisp.org/downloads/development/inprogress/



There will be a development release of this in a week or two.

rickyboy

#5
Wow.  Thanks, man!
(λx. x x) (λx. x x)