# Tutorial on Introduction to Biostatistics

Maximum Likelihood Estimation and the Likelihood Ratio Test Revisited

1.    Introduction

Maximum Likelihood Estimation is an important part of the frequentist approach and was introduced by R. A. Fisher. The method finds an estimator for an unknown population parameter. Other estimation methods are available, such as Least Squares Estimation and Bayesian Estimation, but Maximum Likelihood Estimation is the most widely used. This paper provides an overview of the method, with an example showing how to calculate a maximum likelihood estimate from a sample data set.

2.    Maximum Likelihood Estimation Method

Let X be a random variable with probability mass function P(X; θ), where θ is the parameter of the distribution, and let X1, X2, …, Xn be the observations in the given sample. The joint probability, or likelihood function, is defined as

P(X1, …, Xn; θ) = P(X1; θ) × P(X2; θ) × … × P(Xn; θ) ……(1)

Equation (1) is the likelihood function and can also be written as

L(θ) = ∏_{i=1}^{n} P(Xi; θ) = P(x1; θ) · P(x2; θ) ⋯ P(xn; θ) ……(2)

The maximum likelihood estimator θ̂ is defined as the value of the parameter θ that maximizes the likelihood function. It is usually easier to maximize the logarithm of the likelihood function rather than the likelihood function directly:

log L(θ) = ∑_{i=1}^{n} log P(Xi; θ) ……(3)
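Equations (2) and (3) can be evaluated numerically for any model. A minimal Python sketch, assuming a Bernoulli model and a made-up sample (both illustrative choices, not from the text):

```python
import math

def bernoulli_pmf(x, theta):
    """P(X = x; theta) for a Bernoulli variable (illustrative model choice)."""
    return theta if x == 1 else 1.0 - theta

def log_likelihood(sample, theta):
    """Equation (3): sum of log P(X_i; theta) over the sample."""
    return sum(math.log(bernoulli_pmf(x, theta)) for x in sample)

sample = [1, 0, 1, 1, 0]             # hypothetical observations X_1..X_n
print(log_likelihood(sample, 0.6))   # log L(0.6) ≈ -3.365
```

Working on the log scale turns the product in (2) into the sum in (3), which avoids numerical underflow for large samples.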

Let us take the case of a binomial variable X with a single observation x1 and one unknown parameter p. With n = 1 observation, equations (1) to (3) reduce to

P(x1, …, xn; θ) = P(x1; θ) ……(4)

L(θ) = ∏_{i=1}^{1} P(xi; θ) = P(x1; θ) ……(5)

log L(θ) = ∑_{i=1}^{1} log P(xi; θ) = log P(x1; θ) ……(6)

Here each trial results in success or failure (coded 1 or 0), and the binomial variable X counts the number of successes in n trials. For example, if there are n = 10 trials and we get 3 successes, the probability of observing 3 successes in 10 trials is given by

P(X = 3; p) = C(10, 3) p³(1 − p)⁷ ……(7)

where C(10, 3) is the binomial coefficient 10!/(3!·7!). We need to find the value of p that maximizes equation (7), i.e. the likelihood function L(p; 3):

L(p; 3) = C(10, 3) p³(1 − p)⁷ ……(8)

log L(p; 3) = log(C(10, 3) p³(1 − p)⁷) ……(9)

= 3 log p + 7 log(1 − p) + log C(10, 3) ……(10)
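As a quick check of the derivation, a grid search over p confirms that the likelihood in equation (8) peaks at p = 0.3 (the grid step of 0.01 is an arbitrary choice; `math.comb` supplies the binomial coefficient):

```python
import math

n, x = 10, 3                  # trials and observed successes from the example
comb = math.comb(n, x)        # C(10, 3) = 120

def likelihood(p):
    """Equation (8): C(10, 3) * p^3 * (1 - p)^7."""
    return comb * p**x * (1 - p)**(n - x)

# Grid search over p; the maximum lands at p = x/n = 0.3.
grid = [i / 100 for i in range(1, 100)]
p_hat = max(grid, key=likelihood)
print(p_hat, round(likelihood(p_hat), 3))  # 0.3 0.267
```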

The value of p that maximizes equation (8), or equivalently (10), is the maximum likelihood estimate of the parameter p. Setting the derivative of equation (10) to zero, 3/p − 7/(1 − p) = 0, gives p = 3/10 = 0.3. Table-1 below gives the maximized likelihood for each value of X = 1, 2, …, 10 with n = 10. For X = 3, n = 10 and p = 0.3, the likelihood in equation (8) attains its maximum value of 0.267.

Hence p = 0.3 is the maximum likelihood estimate of p for this sample, and p̂ = X/n is the maximum likelihood estimator of p.

Table-1: Maximized likelihood values for the binomial variable x = 1, 2, …, 10 with n = 10 trials, evaluated at p̂ = x/n

| x (successes in n = 10 trials) | p̂ = x/n | n!/(x!(n − x)!) | L(p̂) |
|---|---|---|---|
| 1 | 0.1 | 10 | 0.387 |
| 2 | 0.2 | 45 | 0.302 |
| 3 | 0.3 | 120 | 0.267 |
| 4 | 0.4 | 210 | 0.251 |
| 5 | 0.5 | 252 | 0.246 |
| 6 | 0.6 | 210 | 0.251 |
| 7 | 0.7 | 120 | 0.267 |
| 8 | 0.8 | 45 | 0.302 |
| 9 | 0.9 | 10 | 0.387 |
| 10 | 1.0 | 1 | 1.000 |

From Table-1 we can also see that the maximized likelihood reaches its minimum at p = 0.5 (n = 10, x = 5): the peak of the likelihood curve is lowest when successes and failures are equally split.
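The entries of Table-1 can be reproduced directly by evaluating the likelihood at its maximizer p̂ = x/n for each x:

```python
import math

n = 10
rows = []
for x in range(1, n + 1):
    p_hat = x / n                                        # MLE for this x
    L = math.comb(n, x) * p_hat**x * (1 - p_hat)**(n - x)
    rows.append((x, p_hat, round(L, 3)))
    print(x, p_hat, round(L, 3))
```

The printed values peak at 0.267 for x = 3 and are smallest (0.246) at x = 5, matching the discussion in the text.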

3.    Asymptotic (Large-Sample) Properties of Maximum Likelihood Estimators

1.       Sufficiency

A maximum likelihood estimator (MLE) depends on the data only through a sufficient statistic, so it retains all of the information in the sample about the unknown population parameter.

2.       Consistency

When the sample size is sufficiently large, p̂ converges in probability to p: the probability that p̂ lies arbitrarily close to p tends to 1 as n tends to infinity.

3.       Asymptotic normality

As n tends to infinity, the sampling distribution of the MLE approaches a normal distribution centered on the true parameter value.

4.       Efficiency

The MLE asymptotically attains the Cramér-Rao lower bound: because it is consistent and asymptotically normal, its variance approaches the smallest variance achievable by any unbiased estimator.
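Consistency and the shrinking spread of p̂ can be illustrated by simulation. A sketch assuming a hypothetical true value p = 0.3 (the sample sizes and replication count are arbitrary choices):

```python
import random
import statistics

random.seed(1)                      # for reproducibility
p_true = 0.3                        # hypothetical true success probability

def mle(n):
    """p_hat = X/n from one simulated sample of n Bernoulli(p_true) trials."""
    return sum(random.random() < p_true for _ in range(n)) / n

# Consistency: the estimates concentrate around p_true as n grows,
# and their spread (standard deviation) shrinks.
for n in (10, 100, 10_000):
    estimates = [mle(n) for _ in range(200)]
    print(n, round(statistics.mean(estimates), 3),
          round(statistics.stdev(estimates), 4))
```

The standard deviation column shrinks roughly like 1/√n, which is the asymptotic-normality behaviour described above.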

4.    Likelihood Ratio Test

If we would like to test the hypothesis that x follows a distribution with parameter θ1 against the alternative parameter θ2, the likelihood ratio helps us test whether θ1 and θ2 are similar:

Λ = L(θ1) / L(θ2)

where Λ is the likelihood ratio and takes values between 0 and 1.

To carry out the test we compute the statistic χ² = −2 log Λ and compare it with the theoretical χ² value. If the theoretical value of χ² is greater than the calculated value (i.e. p > 0.05), then we fail to reject the null hypothesis that θ1 and θ2 are similar.
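As a worked sketch of the test, we can compare a null value p = 0.5 against the MLE p̂ = 0.3 for the binomial example above, using the common form Λ = L(θ0)/L(θ̂). The null value 0.5 and the 5% critical value 3.841 (χ² with one degree of freedom) are illustrative assumptions, not from the text:

```python
import math

n, x = 10, 3                       # the binomial example from the text

def log_lik(p):
    """log L(p; 3) from equation (10), including the constant term."""
    return math.log(math.comb(n, x)) + x * math.log(p) + (n - x) * math.log(1 - p)

p_null, p_mle = 0.5, x / n         # H0: p = 0.5 versus the MLE p_hat = 0.3
chi2 = -2 * (log_lik(p_null) - log_lik(p_mle))   # -2 log(Lambda), Lambda in (0, 1]

# Compare with the chi-square critical value for 1 df at the 5% level (3.841).
print(round(chi2, 3), chi2 > 3.841)  # 1.646 False -> fail to reject H0
```

Here the statistic (1.646) is below the critical value, so with only 3 successes in 10 trials there is not enough evidence to reject p = 0.5.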

Conclusion

This paper revisited maximum likelihood estimation with a worked example and also described the likelihood ratio test.

References

1. Fisher, R. A. (1925). Theory of statistical estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 22(5), 700-725.

2. Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 47(1), 90-100.

3. Self, S. G., & Liang, K. Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association, 82(398), 605-610.

4. Kiefer, J., & Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. The Annals of Mathematical Statistics, 887-906.

5. Woolf, B. (1957). The log likelihood ratio test (the G-test). Annals of Human Genetics, 21(4), 397-409.