| ldTweedie {mgcv} | R Documentation |
A function to evaluate the log of the Tweedie density for variance powers between 1 and 2, inclusive.
Also evaluates first and second derivatives of log density w.r.t. its scale parameter, phi, and p,
or w.r.t. rho=log(phi) and theta where p = (a+b*exp(theta))/(1+exp(theta)).
ldTweedie(y,mu=y,p=1.5,phi=1,rho=NA,theta=NA,a=1.001,b=1.999,all.derivs=FALSE)
y |
values at which to evaluate density. |
mu |
corresponding means (either of same length as |
p |
the variance of |
phi |
The scale parameter. Variance of |
rho |
optional log scale parameter. Over-rides |
theta |
parameter such that p = (a+b*exp(theta))/(1+exp(theta)) |
a |
lower limit parameter (>1) used in definition of p from theta |
b |
upper limit parameter (<2) used in definition of p from theta |
all.derivs |
if |
A Tweedie random variable with 1<p<2 is a sum of N gamma random variables
where N has a Poisson distribution. The p=1 case is a generalization of a Poisson distribution and is a discrete
distribution supported on integer multiples of the scale parameter. For 1<p<2 the distribution is supported on the
positive reals with a point mass at zero. p=2 is a gamma distribution. As p gets very close to 1 the continuous
distribution begins to converge on the discretely supported limit at p=1.
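The compound Poisson-gamma construction described above can be sketched in base R. This is an illustration only, not part of mgcv: the function name rtweedie_sketch is hypothetical, and the Poisson rate and gamma shape and scale are the standard compound Poisson-gamma values, chosen so that the mean is mu and the variance is phi*mu^p.

```r
## Sketch: simulate Tweedie variates for 1 < p < 2 as Poisson sums of gammas.
## rtweedie_sketch is a hypothetical helper, not an mgcv function.
rtweedie_sketch <- function(n, mu, p, phi) {
  lambda <- mu^(2 - p) / (phi * (2 - p))   # Poisson rate for N
  shape  <- (2 - p) / (p - 1)              # gamma shape of each summand
  scale  <- phi * (p - 1) * mu^(p - 1)     # gamma scale of each summand
  N <- rpois(n, lambda)
  y <- numeric(n)                          # N = 0 gives an exact zero
  pos <- N > 0
  ## a sum of N gammas with a common scale is gamma with shape N*shape
  y[pos] <- rgamma(sum(pos), shape = N[pos] * shape, scale = scale)
  y
}

set.seed(1)
mu <- 2; p <- 1.5; phi <- 0.5
y_sim <- rtweedie_sketch(1e5, mu = mu, p = p, phi = phi)
p_zero <- exp(-mu^(2 - p) / (phi * (2 - p)))  # theoretical P(Y = 0)
c(mean = mean(y_sim), var = var(y_sim), zero = mean(y_sim == 0), p_zero = p_zero)
```

The sample mean and variance should be close to mu and phi*mu^p, and the proportion of exact zeros close to the point mass at zero.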
ldTweedie is based on the series evaluation method of Dunn and Smyth (2005). Without
the restriction of p to the interval from 1 to 2, the calculation of Tweedie densities is less straightforward. If you really need
that case then the tweedie package is the place to start.
The rho, theta parameterization is useful for optimization of p and phi, in order to keep p
bounded well away from 1 and 2, and phi positive. The derivatives near p=1 tend to infinity.
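The mapping between theta and p can be written out directly from the formula p = (a+b*exp(theta))/(1+exp(theta)). The following is a minimal base R sketch; the helper names theta_to_p and p_to_theta are hypothetical and are not exported by mgcv.

```r
## Sketch: map an unconstrained theta to p in (a, b), and back again.
## Both helpers are hypothetical, written from the formula given above.
theta_to_p <- function(theta, a = 1.001, b = 1.999) {
  (a + b * exp(theta)) / (1 + exp(theta))
}
p_to_theta <- function(p, a = 1.001, b = 1.999) {
  log((p - a) / (b - p))   # inverse of theta_to_p, valid for a < p < b
}

theta <- c(-10, -1, 0, 1, 10)
p_vals <- theta_to_p(theta)   # every value lies strictly between a and b
rho <- log(0.5)               # rho = log(phi), so phi = exp(rho) is always positive
phi_back <- exp(rho)
```

An optimizer can then work on theta and rho without constraints, while p stays inside (a, b) and phi stays positive.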
Note that if p and phi (or theta and rho) both contain only a single unique value, then the underlying
code is able to use buffering to avoid repeated calls to expensive log gamma, digamma and trigamma functions (mu can still be a vector of different values). This is much faster than is possible when these parameters are vectors with different values.
A matrix with 6 columns, or 10 if all.derivs=TRUE. The first is the log density of y (log probability if p=1).
The second and third are the first and second derivatives of the log density w.r.t. phi. The 4th and 5th
columns are the first and second derivatives w.r.t. p, and the 6th column is the mixed second derivative w.r.t. phi and p.
If rho and theta were supplied then derivatives are w.r.t. these. In this case, and if all.derivs=TRUE, the 7th column is the derivative w.r.t. mu, the 8th is the 2nd derivative w.r.t. mu, the 9th is the mixed derivative w.r.t. theta and mu and the 10th is the mixed derivative w.r.t. rho and mu.
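The column layout described above can be checked numerically. The sketch below assumes mgcv is installed and uses the phi, p parameterization. It compares the second column with a central finite difference of the first column w.r.t. phi. The step size eps and the test values of y are arbitrary illustrative choices.

```r
library(mgcv)

## Sketch: check the returned first derivative w.r.t. phi by finite differences
y <- c(0.5, 1, 3)
ld <- ldTweedie(y, mu = 2, p = 1.5, phi = 0.5)
dim(ld)   # one row per element of y, 6 columns

eps <- 1e-4   # finite difference step (arbitrary choice)
fd <- (ldTweedie(y, mu = 2, p = 1.5, phi = 0.5 + eps)[, 1] -
       ldTweedie(y, mu = 2, p = 1.5, phi = 0.5 - eps)[, 1]) / (2 * eps)
max_err <- max(abs(fd - ld[, 2]))   # should be small
```

The same kind of check can be applied to the 4th column by perturbing p instead of phi.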
Simon N. Wood simon.wood@r-project.org
Dunn, P.K. and G.K. Smyth (2005) Series evaluation of Tweedie exponential dispersion model densities. Statistics and Computing 15:267-280
Tweedie, M. C. K. (1984). An index which distinguishes between some important exponential families. Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference (Eds. J. K. Ghosh and J. Roy), pp. 579-604. Calcutta: Indian Statistical Institute.
library(mgcv)
## convergence to Poisson illustrated
## notice how p>1.1 is OK
y <- seq(1e-10,10,length.out=1000)
p <- c(1.0001,1.001,1.01,1.1,1.2,1.5,1.8,2)
phi <- .5
## density for the first value of p sets up the plot
fy <- exp(ldTweedie(y,mu=2,p=p[1],phi=phi)[,1])
plot(y,fy,type="l",ylim=c(0,3),main="Tweedie density as p changes")
## overlay the densities for the remaining values of p
for (i in 2:length(p)) {
  fy <- exp(ldTweedie(y,mu=2,p=p[i],phi=phi)[,1])
  lines(y,fy,col=i)
}