The logarithmic series distribution, as the name suggests, is based on the standard power series expansion of the natural logarithm function. It is also sometimes known more simply as the logarithmic distribution.
The logarithmic series distribution with shape parameter \( p \in (0, 1) \) is a discrete distribution on \( \N_+ \) with probability density function \( f \) given by \[ f(n) = \frac{1}{-\ln(1 - p)} \frac{p^n}{n}, \quad n \in \N_+ \]
Recall that the standard power series for \( -\ln(1 - p) \), obtained by integrating the geometric series \( \sum_{n=0}^\infty p^n = 1 \big/ (1 - p) \), is \[ -\ln(1 - p) = \sum_{n=1}^\infty \frac{p^n}{n}, \quad p \in (0, 1) \] so \( f \) is a valid probability density function. For the shape of \( f \), consider the function \( x \mapsto p^x \big/ x \) on \( [1, \infty) \). The first derivative is \[ \frac{p^x [x \ln(p) - 1]}{x^2} \] which is negative since \( \ln(p) \lt 0 \), so \( f \) is decreasing with mode \( n = 1 \). The second derivative is \[ \frac{p^x \left[x^2 \ln^2(p) - 2 x \ln(p) + 2\right]}{x^3} \] which is positive, so \( f \) is concave upward.
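The normalization and the shape claims can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function name `logseries_pmf`, the choice \( p = 0.7 \), and the truncation point are illustrative assumptions.

```python
import math

def logseries_pmf(n: int, p: float) -> float:
    """Density f(n) = p^n / (-n ln(1 - p)) for n = 1, 2, ... (hypothetical helper)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if n < 1:
        return 0.0
    # log1p(-p) computes ln(1 - p) accurately, even for small p
    return p ** n / (-n * math.log1p(-p))

p = 0.7  # illustrative parameter value
values = [logseries_pmf(n, p) for n in range(1, 400)]

# Truncated sum; the tail beyond n = 399 is negligible for p = 0.7
total = sum(values)

# f(n) > f(n + 1) for each n in the truncated range
decreasing = all(a > b for a, b in zip(values, values[1:]))

# Positive second differences: the discrete analogue of "concave upward"
convex = all(a - 2 * b + c > 0 for a, b, c in zip(values, values[1:], values[2:]))

print(total, decreasing, convex)
```

For values of \( p \) close to 1 the tail decays slowly, so the truncation point would need to be larger.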
Open the Special Distribution Simulator and select the logarithmic series distribution. Vary the parameter and note the shape of the probability density function. For selected values of the parameter, run the simulation 1000 times and compare the empirical density function to the probability density function.
The distribution function and the quantile function do not have simple, closed forms in terms of the standard elementary functions.
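Although there are no closed forms, both functions are easy to compute by accumulating the density term by term. The sketch below shows one way to do this; the names `logseries_cdf` and `logseries_quantile` are hypothetical, and the quantile is taken to be the smallest \( n \) with \( F(n) \ge q \).

```python
import math

def logseries_cdf(n: int, p: float) -> float:
    """F(n) = sum_{k=1}^{n} p^k / (-k ln(1 - p)), accumulated term by term."""
    norm = -math.log1p(-p)
    return sum(p ** k / k for k in range(1, n + 1)) / norm

def logseries_quantile(q: float, p: float) -> int:
    """Smallest n in {1, 2, ...} with F(n) >= q, for q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie in (0, 1)")
    norm = -math.log1p(-p)
    n, term = 1, p  # term holds p^n
    cumulative = term / norm
    # The second condition stops the loop if p^n underflows to zero
    # before rounding lets the running sum reach q
    while cumulative < q and term > 0.0:
        n += 1
        term *= p
        cumulative += term / (n * norm)
    return n

p = 0.9  # illustrative parameter value
quartiles = [logseries_quantile(q, p) for q in (0.25, 0.5, 0.75)]
print(quartiles)
```

The linear search is adequate here because the density decays geometrically, so the loop terminates quickly unless \( q \) is extremely close to 1.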
Open the quantile app and select the logarithmic series distribution. Vary the parameter and note the shape of the distribution and probability density functions. For selected values of the parameter, compute the median and the first and third quartiles.
Suppose again that random variable \( N \) has the logarithmic series distribution with shape parameter \( p \in (0, 1) \). Recall that the permutation formula is \( n^{(k)} = n (n - 1) \cdots (n - k + 1) \) for \( n \in \R \) and \( k \in \N \). The factorial moments of \( N \) are \( \E\left(N^{(k)}\right) \) for \( k \in \N \).
The factorial moments of \( N \) are given by \[ \E\left(N^{(k)}\right) = \frac{(k - 1)!}{-\ln(1 - p)} \left(\frac{p}{1 - p}\right)^k, \quad k \in \N_+\]
Recall that a power series can be differentiated term by term within the open interval of convergence. Hence \begin{align} \E\left(N^{(k)}\right) & = \sum_{n=1}^\infty n^{(k)} \frac{1}{-\ln(1 - p)} \frac{p^n}{n} = \frac{p^k}{-\ln(1 - p)} \sum_{n=k}^\infty n^{(k)} \frac{p^{n-k}}{n} \\ & = \frac{p^k}{-\ln(1 - p)} \sum_{n=k}^\infty \frac{d^k}{dp^k} \frac{p^n}{n} = \frac{p^k}{-\ln(1 - p)} \frac{d^k}{dp^k} \sum_{n=1}^\infty \frac{p^n}{n} \\ & = \frac{p^k}{-\ln(1 - p)} \frac{d^k}{dp^k} [-\ln(1 - p)] = \frac{p^k}{-\ln(1 - p)} (k - 1)! (1 - p)^{-k} \end{align}
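As a sanity check on the factorial moment formula, the closed form can be compared with a direct truncated sum of \( n^{(k)} f(n) \). This is only a sketch: the function names, the value \( p = 0.4 \), and the 500-term truncation are assumptions chosen for illustration.

```python
import math

def falling_factorial(n: int, k: int) -> int:
    """n^(k) = n (n - 1) ... (n - k + 1), with n^(0) = 1."""
    result = 1
    for j in range(k):
        result *= n - j
    return result

def factorial_moment_formula(k: int, p: float) -> float:
    """Closed form (k - 1)! / (-ln(1 - p)) * (p / (1 - p))^k for k >= 1."""
    return math.factorial(k - 1) / (-math.log1p(-p)) * (p / (1 - p)) ** k

def factorial_moment_sum(k: int, p: float, terms: int = 500) -> float:
    """Truncated version of sum_{n >= 1} n^(k) f(n)."""
    norm = -math.log1p(-p)
    return sum(
        falling_factorial(n, k) * p ** n / (n * norm)
        for n in range(1, terms + 1)
    )

p = 0.4  # illustrative parameter value
for k in range(1, 5):
    print(k, factorial_moment_formula(k, p), factorial_moment_sum(k, p))
```

The two columns should agree to many digits, since the neglected tail of the sum is geometrically small for this value of \( p \).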
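The mean and variance of \( N \) are \[ \E(N) = \frac{1}{-\ln(1 - p)} \frac{p}{1 - p}, \quad \text{var}(N) = \frac{1}{-\ln(1 - p)} \frac{p}{(1 - p)^2} \left[1 - \frac{p}{-\ln(1 - p)}\right] \] These follow from the factorial moments with \( k = 1 \) and \( k = 2 \), since \( \text{var}(N) = \E\left(N^{(2)}\right) + \E(N) - [\E(N)]^2 \).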
Open the special distribution simulator and select the logarithmic series distribution. Vary the parameter and note the shape of the mean \( \pm \) standard deviation bar. For selected values of the parameter, run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.
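For readers without access to the simulator, a similar experiment can be sketched in a few lines. The code below is an illustration under stated assumptions, not the simulator's own method: it samples by inverse transform (walking the distribution function until it passes a uniform draw), uses a fixed seed and \( p = 0.6 \) chosen arbitrarily, and computes the distribution mean and standard deviation from the first two factorial moments.

```python
import math
import random

def sample_logseries(p: float, rng: random.Random) -> int:
    """Inverse transform sampling: smallest n whose CDF value reaches a uniform draw."""
    norm = -math.log1p(-p)
    u = rng.random()
    n, term = 1, p  # term holds p^n
    cumulative = term / norm
    while cumulative < u and term > 0.0:
        n += 1
        term *= p
        cumulative += term / (n * norm)
    return n

def mean_and_sd(p: float) -> tuple:
    """Mean and standard deviation from the first two factorial moments."""
    norm = -math.log1p(-p)
    odds = p / (1 - p)
    m1 = odds / norm       # E[N]
    m2 = odds ** 2 / norm  # E[N (N - 1)]
    variance = m2 + m1 - m1 ** 2
    return m1, math.sqrt(variance)

p = 0.6                    # illustrative parameter value
rng = random.Random(2024)  # fixed seed so the run is reproducible
draws = [sample_logseries(p, rng) for _ in range(1000)]

emp_mean = sum(draws) / len(draws)
emp_sd = math.sqrt(sum((x - emp_mean) ** 2 for x in draws) / len(draws))
mean, sd = mean_and_sd(p)
print(f"empirical: {emp_mean:.3f} +/- {emp_sd:.3f}, distribution: {mean:.3f} +/- {sd:.3f}")
```

With 1000 runs the empirical values typically land within a few hundredths of the distribution values, and the agreement improves as the number of runs grows.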
The probability generating function \( P \) of \( N \) is given by \[ P(t) = \E\left(t^N\right) = \frac{\ln(1 - p t)}{\ln(1 - p)}, \quad \left|t\right| \lt \frac{1}{p} \]
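The closed form follows from applying the logarithmic power series to \( p t \) in place of \( p \). A quick numerical comparison, sketched below with hypothetical function names and an arbitrary choice \( p = 0.5 \), confirms the agreement at several points of the interval \( \left|t\right| \lt 1 / p \).

```python
import math

def pgf_closed_form(t: float, p: float) -> float:
    """P(t) = ln(1 - p t) / ln(1 - p), valid for |t| < 1 / p."""
    return math.log(1 - p * t) / math.log(1 - p)

def pgf_series(t: float, p: float, terms: int = 500) -> float:
    """Truncated version of sum_{n >= 1} f(n) t^n."""
    norm = -math.log1p(-p)
    return sum((p * t) ** n / (n * norm) for n in range(1, terms + 1))

p = 0.5  # illustrative parameter value, so the series converges for |t| < 2
for t in (-1.5, 0.0, 0.5, 1.0, 1.5):
    print(t, pgf_closed_form(t, p), pgf_series(t, p))
```

Note that \( P(1) = 1 \) and \( P(0) = 0 \), as expected for a distribution supported on \( \N_+ \).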
The factorial moments can also be obtained from the probability generating function, since \( P^{(k)}(1) = \E\left(N^{(k)}\right) \) for \( k \in \N_+ \).
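As a brief sketch of that computation, differentiating \( P(t) = \ln(1 - p t) \big/ \ln(1 - p) \) repeatedly with respect to \( t \) gives \[ P^{(k)}(t) = \frac{(k - 1)!}{-\ln(1 - p)} \frac{p^k}{(1 - p t)^k}, \quad \left|t\right| \lt \frac{1}{p}, \; k \in \N_+ \] and setting \( t = 1 \) recovers \( \E\left(N^{(k)}\right) = \frac{(k - 1)!}{-\ln(1 - p)} \left(\frac{p}{1 - p}\right)^k \).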
Naturally, the limits of the logarithmic series distribution with respect to the parameter \( p \) are of interest.
The logarithmic series distribution with shape parameter \( p \in (0, 1) \) converges to point mass at 1 as \( p \downarrow 0 \).
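The reason is that \( f(1) = p \big/ [-\ln(1 - p)] \) and \( -\ln(1 - p) = p + p^2 / 2 + \cdots \), so \( f(1) \to 1 \) as \( p \downarrow 0 \). The following small sketch illustrates the convergence numerically; the helper name `mass_at_one` and the grid of parameter values are chosen only for illustration.

```python
import math

def mass_at_one(p: float) -> float:
    """f(1) = p / (-ln(1 - p)), the probability that N = 1."""
    return p / (-math.log1p(-p))

# Illustrative grid of parameter values decreasing to 0
grid = [0.1, 0.01, 0.001, 1e-6]
masses = [mass_at_one(p) for p in grid]
for p, m in zip(grid, masses):
    print(p, m)
```

The printed probabilities increase toward 1 as \( p \) decreases, with \( 1 - f(1) \) roughly \( p / 2 \) for small \( p \).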
The logarithmic series distribution is a power series distribution associated with the function \( g(p) = -\ln(1 - p) \) for \( p \in [0, 1) \).
The moment results above actually follow from general results for power series distributions. The compound Poisson distribution based on the logarithmic series distribution gives a negative binomial distribution.
Suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of independent random variables, each with the logarithmic series distribution with parameter \( p \in (0, 1) \). Suppose also that \( N \) is independent of \( \bs{X} \) and has the Poisson distribution with rate parameter \( r \in (0, \infty) \). Then \( Y = \sum_{i = 1}^N X_i \) has the negative binomial distribution on \( \N \) with parameters \( 1 - p \) and \( -r \big/\ln(1 - p) \).
The PGF of \( Y \) is \( Q \circ P \), where \( P \) is the PGF of the logarithmic series distribution given above, and where \( Q \) is the PGF of the Poisson distribution, so that \( Q(s) = e^{r(s - 1)} \) for \( s \in \R \). Thus we have \[ (Q \circ P)(t) = \exp \left(r \left[\frac{\ln(1 - p t)}{\ln(1 - p)} - 1\right]\right), \quad \left|t\right| \lt \frac{1}{p} \] With a little algebra, this can be written in the form \[ (Q \circ P)(t) = \left(\frac{1 - p}{1 - p t}\right)^{-r / \ln(1 - p)}, \quad \left|t\right| \lt \frac{1}{p} \] which is the PGF of the negative binomial distribution with parameters \( 1 - p \) and \( -r \big/ \ln(1 - p) \).
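The algebra step can be checked numerically by evaluating both forms at several points. In the sketch below, the function names and the values \( p = 0.3 \), \( r = 2.5 \) are illustrative assumptions, and the negative binomial PGF on \( \N \) with success parameter \( s \) and stopping parameter \( k \) is taken to be \( \left[s \big/ (1 - (1 - s) t)\right]^k \).

```python
import math

def logseries_pgf(t: float, p: float) -> float:
    """P(t) = ln(1 - p t) / ln(1 - p) for |t| < 1 / p."""
    return math.log(1 - p * t) / math.log(1 - p)

def poisson_pgf(s: float, r: float) -> float:
    """Q(s) = exp(r (s - 1))."""
    return math.exp(r * (s - 1))

def negbin_pgf(t: float, success: float, k: float) -> float:
    """PGF of the negative binomial distribution on {0, 1, 2, ...}."""
    return (success / (1 - (1 - success) * t)) ** k

p, r = 0.3, 2.5              # illustrative parameter values
k = -r / math.log(1 - p)     # stopping parameter of the negative binomial

for t in (-2.0, 0.0, 0.5, 1.0, 2.0, 3.0):
    composed = poisson_pgf(logseries_pgf(t, p), r)
    print(t, composed, negbin_pgf(t, 1 - p, k))
```

At \( t = 0 \) both expressions reduce to \( e^{-r} \), the probability that the Poisson variable \( N \) is zero and hence that \( Y = 0 \).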