Hello everyone,
I'm in need of a handy function to help me compute confidence intervals based on the Student's t-distribution. It is easy to do in Excel, for instance, since they provide a Student t inverse; they call it TINV. See //https://courses.washington.edu/dphs568/course/excel-t.htm for more info.
I'd love to have something in newLISP like Excel's TINV or a way to compute such confidence intervals straightaway, because newLISP will be way better than Excel at generating the parametrized computations of the CIs that I have in mind.
Has anyone done this in newLISP already? I hope so. Thanks!
One-tailed probabilities of the t-Distribution for degrees of freedom df:
; one-tailed probablity of Student's t
(define (prob-t t df)
(let (bta (betai (div (add 1 (div (pow t) df))) (div df 2) 0.5))
(if
(> t 0) (sub 1 (mul 0.5 bta))
(< t 0) (mul 0.5 bta)
0.5)
))
Thanks, Lutz. Your function prob-t returns the area under the t-distro from -infinity to t (at df degrees of freedom, course). This is very useful!
Here's Excel's TDIST expressed in newLISP, which uses your function prob-t:
(define (TDIST t df tails) (mul tails (sub 1 (prob-t t df))))
> (TDIST 2.29 83 1) # =TDIST(2.29,83,1) in Excel
0.01227952285
> (TDIST 2.29 83 2) # =TDIST(2.29,83,2) in Excel
0.02455904571
¡Grácias, hermano!
Addendum: I still needed to build those confidence intervals, so I really needed something like Excel's TINV (as I mentioned in my initial post). TINV is a form of the inverse of the prob-t function Lutz was so kind to provide for us. Here is the blurb about TINV from Excel's help file (assuming, quite naturally, that Fair Use applies here):
Quote
TINV returns that value t, such that P(|X| > t) = probability where X is a random variable that follows the t-distribution and P(|X| > t) = P(X < -t or X > t).
<confession>Since I am not an expert at these kind of functions, I did what any good professional should do: shamelessly steal somebody's code who freely licenses it. :) In this case, the "victim" is Dr. Lawrence Hunter (//http). He wrote this very nice stats package for Common Lisp called cl-statistics.lisp (//http).</confession>
Prof. Hunter coded a general function to compute critical values from a cdf, he called it find-critical-value. I just ported it to newLISP and made a couple of slight modifications. Here is a newLISP version of it.
(define (find-critical-value p-function p-value
(x-tolerance 0.000001) (y-tolerance 0.000001))
"Only works if `p-function' is monotonically decreasing over x >= 0."
(letn (x-low 0.0 fx-low (p-function x-low)
x-high 1.0 fx-high (p-function x-high)
x-diff 1.0 y-diff 1.0)
(until (>= fx-low p-value fx-high)
(setq x-low x-high
fx-low fx-high
x-high (mul 2.0 x-high)
fx-high (p-function x-high)))
(catch
(while true
(letn (x-mid (div (add x-low x-high) 2.0)
fx-mid (p-function x-mid)
y-diff (abs (sub fx-mid p-value))
x-diff (sub x-high x-low))
(if (or (< x-diff x-tolerance)
(< y-diff y-tolerance))
(throw x-mid))
(if (< p-value fx-mid)
(setq x-low x-mid fx-low fx-mid)
(setq x-high x-mid fx-high fx-mid)))))))
This function takes as its first argument a function p-function for which it needs to find a critical value at p-value, and returns a value x which satisfies the equation x = (p-function p-value). It finds such an x by starting from 0 and searching forward (to the right of 0 on the x-axis) until it has determined an interval in which x exists. (Not getting into too much detail, but it is based on the idea of the Intermediate Value Theorem.) Then it searches this interval closing in on x by binary search (again relying on the idea of the IVT). The caveat in the usage of this function is that it assumes the function p-function is continuous and monotonically decreasing. The IVT requires continuity but not monotonicity. The fact that the cdfs this stat package deals with are piecewise monotonic is a happy coincidence, since that makes the code of find-critical-value all the more simpler (and the inverses are actually functions too, not just relations in general).
Armed with find-critical-value, we can write a "crit" function which complements Lutz's prob-t, which I call crit-t taking after Lutz's naming convention (cf. newLISP's crit-chi2 and crit-z).
(define (crit-t p df)
(letn (p* (if (< p 0.5) (sub 1 p) p)
cv (find-critical-value (fn (x) (sub 1 (prob-t x df))) (sub 1 p*)))
(if (< p 0.5) (sub cv) cv)))
The function prob-t is *not* monotonically decreasing, so we don't feed *it* to find-critical-value without transforming it. It turns out that the function (fn (x) (sub 1 (prob-t x df))) on the interval [0,infinity] *is* monotonically decreasing. We only need to ask find-critical-value to give us the critical value x for 1-p instead of p, under (fn (x) (sub 1 (prob-t x df))), to get the x we want. [Why? because if p = (prob-t x df), then 1 - p = 1 - (prob-t x df).] The rest of the code is there to push out the negative critical values, since x < 0 when p < 1/2.
Finally we can write our TINV function.
(define (TINV alpha df)
(if (<= 0 alpha 1)
(crit-t (sub 1 (mul 0.5 alpha)) df)
(throw-error "TINV: First argument (alpha) must be between 0 and 1.")))
Its interface is exactly like Excel's TINV and even throws an error, as Excel does, if the caller uses an alpha (first argument) outside of [0, 1]. TINV is like crit-t, except that TINV takes alpha as an argument which corresponds to the percentile, whereas crit-t just takes the percentile (i.e. 1 - alpha/2).
This weekend I added prob-t, crit-t and prob-f, crit-f as built-ins. A they are all related, it was natural to do them all and they share the Newton approximation algorithm with crit-chi2. See here:
http://www.newlisp.org/downloads/development/inprogress/
There will be a development release of this in a week or two.
Wow. Thanks, man!