# Score test

Rao's score test, also known as the score test or the Lagrange multiplier test (LM test) in econometrics,[1][2] is a statistical test of a simple null hypothesis that a parameter of interest ${\displaystyle \theta }$ is equal to some particular value ${\displaystyle \theta _{0}}$. It is the most powerful test when the true value of ${\displaystyle \theta }$ is close to ${\displaystyle \theta _{0}}$. The main advantage of the score test is that it does not require an estimate of the information under the alternative hypothesis or unconstrained maximum likelihood. This constitutes a potential advantage in comparison to other tests, such as the Wald test and the generalized likelihood ratio test (GLRT). This makes testing feasible when the unconstrained maximum likelihood estimate is a boundary point in the parameter space.

## Single parameter test

### The statistic

Let ${\displaystyle L}$ be the likelihood function, which depends on a univariate parameter ${\displaystyle \theta }$, and let ${\displaystyle x}$ be the data. The score ${\displaystyle U(\theta )}$ is defined as

${\displaystyle U(\theta )={\frac {\partial \log L(\theta \mid x)}{\partial \theta }},}$

and the Fisher information is

${\displaystyle I(\theta )=-\operatorname {E} \left[\left.{\frac {\partial ^{2}}{\partial \theta ^{2}}}\log L(\theta \mid x)\right|\theta \right]\,.}$

The statistic to test ${\displaystyle {\mathcal {H}}_{0}:\theta =\theta _{0}}$ is

${\displaystyle S(\theta _{0})={\frac {U(\theta _{0})^{2}}{I(\theta _{0})}},}$

which has an asymptotic ${\displaystyle \chi _{1}^{2}}$ distribution when ${\displaystyle {\mathcal {H}}_{0}}$ is true.
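As a concrete sketch, consider the single-parameter statistic for a binomial proportion, testing ${\displaystyle H_{0}:p=p_{0}}$ with ${\displaystyle x}$ successes in ${\displaystyle n}$ trials (the binomial setting and the variable names are illustrative assumptions, not from the text above):

```python
def binomial_score_test(x, n, p0):
    """Score test of H0: p = p0 for x successes in n Bernoulli trials.

    Log-likelihood (up to a constant): x*log(p) + (n-x)*log(1-p)
    Score:              U(p) = x/p - (n-x)/(1-p)
    Fisher information: I(p) = n / (p*(1-p))
    """
    u = x / p0 - (n - x) / (1 - p0)   # score evaluated at the null value
    info = n / (p0 * (1 - p0))        # Fisher information at p0
    return u * u / info               # S(p0), ~ chi2(1) under H0

# Example: 60 successes in 100 trials, testing p0 = 0.5.
# S simplifies to (x - n*p0)^2 / (n*p0*(1-p0)) = (60-50)^2 / 25 = 4.0
s = binomial_score_test(60, 100, 0.5)
```

Note that the statistic is evaluated entirely at ${\displaystyle p_{0}}$; the unrestricted maximum likelihood estimate ${\displaystyle x/n}$ is never needed, which is the advantage described in the introduction.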

#### Note on notation

Note that some texts use an alternative notation, in which the statistic ${\displaystyle S^{*}(\theta )={\sqrt {S(\theta )}}}$ is tested against the standard normal distribution. This approach is equivalent and gives identical results.

### As most powerful test for small deviations

The score test rejects ${\displaystyle H_{0}}$ when

${\displaystyle \left({\frac {\partial \log L(\theta \mid x)}{\partial \theta }}\right)_{\theta =\theta _{0}}\geq C,}$

where ${\displaystyle L}$ is the likelihood function, ${\displaystyle \theta _{0}}$ is the value of the parameter of interest under the null hypothesis, and ${\displaystyle C}$ is a constant chosen according to the desired size of the test (i.e., the probability of rejecting ${\displaystyle H_{0}}$ when ${\displaystyle H_{0}}$ is true; see Type I error).

The score test is the most powerful test for small deviations from ${\displaystyle H_{0}}$. To see this, consider testing ${\displaystyle \theta =\theta _{0}}$ versus ${\displaystyle \theta =\theta _{0}+h}$. By the Neyman–Pearson lemma, the most powerful test has the form

${\displaystyle {\frac {L(\theta _{0}+h\mid x)}{L(\theta _{0}\mid x)}}\geq K;}$

Taking the log of both sides yields

${\displaystyle \log L(\theta _{0}+h\mid x)-\log L(\theta _{0}\mid x)\geq \log K.}$

The score test follows by making the substitution (from a first-order Taylor series expansion)

${\displaystyle \log L(\theta _{0}+h\mid x)\approx \log L(\theta _{0}\mid x)+h\times \left({\frac {\partial \log L(\theta \mid x)}{\partial \theta }}\right)_{\theta =\theta _{0}}}$

and identifying the ${\displaystyle C}$ above with ${\displaystyle \log(K)}$.
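The first-order approximation underlying this argument is easy to check numerically. As a minimal sketch, assume a Normal(${\displaystyle \theta }$, 1) likelihood and some made-up data points (both are illustrative assumptions); the exact log-likelihood difference then matches ${\displaystyle h\,U(\theta _{0})}$ up to a term of order ${\displaystyle h^{2}}$:

```python
def loglik(theta, data):
    # Normal(theta, 1) log-likelihood, up to an additive constant.
    return -0.5 * sum((x - theta) ** 2 for x in data)

def score(theta, data):
    # Derivative of the log-likelihood above with respect to theta.
    return sum(x - theta for x in data)

data = [0.3, -0.1, 0.8, 0.5, 0.2]   # hypothetical sample
theta0, h = 0.0, 1e-4               # null value and a small deviation

exact = loglik(theta0 + h, data) - loglik(theta0, data)
linear = h * score(theta0, data)    # Taylor substitution used in the text
# For this quadratic log-likelihood the error is exactly -0.5*len(data)*h**2.
```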

### Relationship with other hypothesis tests

The likelihood ratio test, the Wald test, and the score test are asymptotically equivalent tests of hypotheses.[4][5] When testing nested models, the statistics for each test converge to a chi-squared distribution with degrees of freedom equal to the difference in degrees of freedom between the two models.
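The asymptotic equivalence can be illustrated numerically. A minimal sketch, again assuming binomial data (an illustrative choice): the three statistics for ${\displaystyle H_{0}:p=p_{0}}$ differ only in where the curvature of the log-likelihood is evaluated, and they nearly coincide when ${\displaystyle n}$ is large and ${\displaystyle {\hat {p}}}$ is close to ${\displaystyle p_{0}}$.

```python
import math

def binomial_tests(x, n, p0):
    """LR, Wald, and score statistics for H0: p = p0, Binomial(n, p)."""
    phat = x / n  # unrestricted maximum likelihood estimate

    def ll(p):
        return x * math.log(p) + (n - x) * math.log(1 - p)

    lr = 2 * (ll(phat) - ll(p0))                        # likelihood ratio
    wald = (phat - p0) ** 2 * n / (phat * (1 - phat))   # information at the MLE
    score = (x - n * p0) ** 2 / (n * p0 * (1 - p0))     # information at the null
    return lr, wald, score

# 530 successes in 1000 trials, testing p0 = 0.5: all three are close to 3.6.
lr, wald, score = binomial_tests(530, 1000, 0.5)
```

All three statistics are referred to the same ${\displaystyle \chi _{1}^{2}}$ distribution, so in large samples they lead to the same conclusion.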

## Multiple parameters

A more general score test can be derived when there is more than one parameter. Suppose that ${\displaystyle {\hat {\theta }}_{0}}$ is the maximum likelihood estimate of ${\displaystyle \theta }$ under the null hypothesis ${\displaystyle H_{0}}$, while ${\displaystyle U}$ and ${\displaystyle I}$ are, respectively, the score vector and the Fisher information matrix. Then

${\displaystyle U^{T}({\hat {\theta }}_{0})I^{-1}({\hat {\theta }}_{0})U({\hat {\theta }}_{0})\sim \chi _{k}^{2}}$

asymptotically under ${\displaystyle H_{0}}$, where ${\displaystyle k}$ is the number of constraints imposed by the null hypothesis and

${\displaystyle U({\hat {\theta }}_{0})={\frac {\partial \log L({\hat {\theta }}_{0}\mid x)}{\partial \theta }}}$

and

${\displaystyle I({\hat {\theta }}_{0})=-\operatorname {E} \left({\frac {\partial ^{2}\log L({\hat {\theta }}_{0}\mid x)}{\partial \theta \,\partial \theta '}}\right).}$

This can be used to test ${\displaystyle H_{0}}$.
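As a sketch of the multivariate form, consider a multinomial with three categories and a null hypothesis that fully specifies the probabilities (an illustrative setup). Parameterizing by ${\displaystyle (p_{1},p_{2})}$ with ${\displaystyle p_{3}=1-p_{1}-p_{2}}$ gives ${\displaystyle k=2}$ constraints, a 2-vector score, and a 2×2 Fisher information matrix:

```python
def multinomial_score_test(counts, probs0):
    """Score test of H0: p = probs0 for three multinomial counts.

    Free parameters under H0: (p1, p2), with p3 = 1 - p1 - p2,
    so the statistic U^T I^{-1} U is asymptotically chi2 with k = 2 df.
    """
    x1, x2, x3 = counts
    p1, p2, p3 = probs0
    n = x1 + x2 + x3
    # Score vector at the null value: U_j = x_j/p_j - x_3/p_3.
    u1 = x1 / p1 - x3 / p3
    u2 = x2 / p2 - x3 / p3
    # Fisher information: I_jk = n * (delta_jk / p_j + 1 / p3).
    i11 = n * (1 / p1 + 1 / p3)
    i22 = n * (1 / p2 + 1 / p3)
    i12 = n / p3
    det = i11 * i22 - i12 * i12
    # Quadratic form u^T I^{-1} u via the explicit 2x2 inverse.
    return (i22 * u1 ** 2 - 2 * i12 * u1 * u2 + i11 * u2 ** 2) / det

# Hypothetical counts and null probabilities; the result equals the
# Pearson chi-squared statistic sum((x - n*p)^2 / (n*p)).
s = multinomial_score_test([25, 40, 35], [0.3, 0.3, 0.4])
```

The algebra works out so that this quadratic form reduces exactly to Pearson's chi-squared statistic, anticipating the binary-data special case below.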

## Special cases

In many situations, the score statistic reduces to another commonly used statistic.[6]

When the data follows a normal distribution, the score statistic is the same as the t statistic.

When the data consists of binary observations, the score statistic is the same as the chi-squared statistic in the Pearson's chi-squared test.
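For a single binary sample the correspondence is exact, not merely asymptotic: the binomial score statistic equals the Pearson chi-squared statistic computed over the two cells (success, failure). A minimal check, using illustrative numbers:

```python
def pearson_chi2_two_cells(x, n, p0):
    # Pearson chi-squared over the two cells: observed (x, n-x),
    # expected (n*p0, n*(1-p0)).
    exp_s, exp_f = n * p0, n * (1 - p0)
    return (x - exp_s) ** 2 / exp_s + ((n - x) - exp_f) ** 2 / exp_f

def score_stat(x, n, p0):
    # Binomial score statistic: U(p0)^2 / I(p0).
    u = x / p0 - (n - x) / (1 - p0)
    info = n / (p0 * (1 - p0))
    return u * u / info

# The two statistics agree exactly for any x, n, p0:
a = pearson_chi2_two_cells(60, 100, 0.5)
b = score_stat(60, 100, 0.5)
```

Algebraically, both expressions simplify to ${\displaystyle (x-np_{0})^{2}/(np_{0}(1-p_{0}))}$.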

When the data consists of failure time data in two groups, the score statistic for the Cox partial likelihood is the same as the log-rank statistic in the log-rank test. Hence the log-rank test for difference in survival between two groups is most powerful when the proportional hazards assumption holds.