Notes on U-Statistics

Notes for "A Class of Statistics with Asymptotically Normal Distribution" by Wassily Hoeffding.

You can view the lecture notes on U-statistics.

Basic Concepts and Motivating Examples

Functional, Kernel and Order

Suppose that $\Phi(x_1,\dots,x_m)$ is a function of $m$ random variables and that $X_1,\dots,X_n$ are i.i.d. observations from a c.d.f. $F(x)$, with $n\ge m$.

We want to make inference about
$$\theta=\theta(F)=E_F[\Phi(X_1,\dots,X_m)]$$

Note: $\theta$ is a functional of the distribution $F$. A functional is a map defined on a space of functions: it takes a function (here the d.f. $F$) to a real (or complex) number.

To use all the samples, we can form the U-statistic
$$U=\binom{n}{m}^{-1}\sum_{1\le i_1<\cdots<i_m\le n}\Phi(X_{i_1},\dots,X_{i_m})$$
where the sum runs over all $\binom{n}{m}$ combinations of $m$ integers $i_1,\dots,i_m$ with $1\le i_1<\cdots<i_m\le n$.

Here, $U$ is a function of the sample $X_1,\dots,X_n$ called a U-statistic. The function $\Phi$ is called the kernel, and $m$ is its order.
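As a quick sanity check, the definition can be computed by brute force: average the kernel over all $\binom{n}{m}$ index subsets. A minimal sketch (the function and variable names are mine, not from the paper):

```python
from itertools import combinations
from math import comb

def u_statistic(sample, kernel, m):
    """Average a symmetric kernel of order m over all C(n, m) index subsets."""
    n = len(sample)
    return sum(kernel(*(sample[i] for i in idx))
               for idx in combinations(range(n), m)) / comb(n, m)

# Order-1 kernel Phi(x) = x recovers the sample mean; the order-2 kernel
# (x1 - x2)^2 / 2 recovers the unbiased sample variance (see the examples below).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(u_statistic(data, lambda x: x, 1))                    # 5.0
print(u_statistic(data, lambda a, b: (a - b) ** 2 / 2, 2))  # 32/7 ~= 4.5714
```

This enumerates all subsets, so it is only practical for small $n$ and $m$; it is meant to make the definition concrete, not to be efficient.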

Symmetry of Kernel

For convenience, we assume that $\Phi$ is symmetric in its arguments, which means
$$\Phi(\dots,x_i,\dots,x_j,\dots)=\Phi(\dots,x_j,\dots,x_i,\dots)$$

This assumption is not restrictive, because we can always convert a non-symmetric kernel into a symmetric one.

If $\Phi(x_1,\dots,x_m)$ is not symmetric, there exists a symmetric kernel $\Phi_0(x_1,\dots,x_m)$ defining the same U-statistic, namely
$$\Phi_0(x_1,\dots,x_m)=\frac{1}{m!}\sum_{(\alpha_1,\dots,\alpha_m)}\Phi(x_{\alpha_1},\dots,x_{\alpha_m})$$
where the sum is over all $m!$ permutations $(\alpha_1,\dots,\alpha_m)$ of the integers $1,\dots,m$.

Motivating Examples

Example 1: Let $m=1$ and $\Phi(x)=x$. Then $U=\frac{1}{n}\sum_{i=1}^n X_i$ is the sample mean.

Example 2: Let $m=2$ and $\Phi(x_1,x_2)=x_1^2-x_1x_2$. A symmetric version of this kernel is $\Phi_0(x_1,x_2)=(x_1-x_2)^2/2$, and the resulting U-statistic
$$U=\binom{n}{2}^{-1}\sum_{1\le i<j\le n}\frac{(X_i-X_j)^2}{2}=\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar X)^2$$
is the (unbiased) sample variance.

Example 3: Let $m=2$ and $\Phi(x_1,x_2)=I(x_1+x_2>0)$. Then
$$U=\binom{n}{2}^{-1}\sum_{1\le i<j\le n}I(X_i+X_j>0)$$
is related to the one-sample Wilcoxon signed-rank statistic.
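The claim in Example 2, that a non-symmetric kernel and its symmetrization define the same U-statistic, can be checked numerically by averaging over all ordered tuples of distinct indices (a sketch; the function name and data are mine):

```python
from itertools import permutations

def u_from_ordered(sample, kernel, m):
    """U-statistic via averaging the kernel over all ordered m-tuples of
    distinct indices; this works for non-symmetric kernels as well."""
    n = len(sample)
    idx_tuples = list(permutations(range(n), m))
    return sum(kernel(*(sample[i] for i in idx))
               for idx in idx_tuples) / len(idx_tuples)

data = [1.5, -0.3, 2.2, 0.7, -1.1]

u_nonsym = u_from_ordered(data, lambda a, b: a * a - a * b, 2)  # Example 2 kernel
u_sym = u_from_ordered(data, lambda a, b: (a - b) ** 2 / 2, 2)  # its symmetrization
xbar = sum(data) / len(data)
s2 = sum((x - xbar) ** 2 for x in data) / (len(data) - 1)       # sample variance
print(u_nonsym, u_sym, s2)  # all three coincide (1.77 for this data)
```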

Asymptotic Normality of U-statistic: A Simple Version

Let $\Phi_k(x_1,\dots,x_k)=E\{\Phi(x_1,\dots,x_k,X_{k+1},\dots,X_m)\}$ and define $\zeta_k=\mathrm{Var}(\Phi_k(X_1,\dots,X_k))$, $k=1,\dots,m$. Suppose that the kernel satisfies $E\,\Phi^2(X_1,\dots,X_m)<\infty$ and that $0<\zeta_1<\infty$. Then
$$\frac{U-\theta}{\sqrt{\mathrm{Var}(U)}}\xrightarrow{d}N(0,1)$$
where $\mathrm{Var}(U)=\frac{m^2}{n}\zeta_1+O(n^{-2})$.
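A small Monte Carlo experiment illustrates the leading variance term $m^2\zeta_1/n$. The kernel $\Phi(x_1,x_2)=x_1x_2$ with $X\sim N(\mu,1)$ is my choice of example, not one from the paper; for it, $\theta=\mu^2$, $\Phi_1(x)=\mu x$, and $\zeta_1=\mu^2$:

```python
import random
import statistics

random.seed(0)

# Kernel Phi(x1, x2) = x1 * x2 (m = 2) with X ~ N(mu, 1): theta = mu^2,
# Phi_1(x) = E[Phi(x, X2)] = mu * x, and zeta_1 = Var(mu * X) = mu^2.
mu, m, n, reps = 1.0, 2, 200, 2000

def u_pairs(xs):
    n = len(xs)
    s = sum(xs)
    # sum_{i<j} x_i x_j = (s^2 - sum x_i^2) / 2, then divide by C(n, 2)
    return (s * s - sum(x * x for x in xs)) / (n * (n - 1))

us = [u_pairs([random.gauss(mu, 1) for _ in range(n)]) for _ in range(reps)]
emp_var = statistics.variance(us)
asym_var = m ** 2 * mu ** 2 / n  # leading term m^2 * zeta_1 / n = 0.02
print(emp_var, asym_var)         # close for this n
```

The empirical variance slightly exceeds the leading term, consistent with the $O(n^{-2})$ remainder (and with Theorem 5.2 below).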

U-statistic

Notations

  • $X_1,\dots,X_n$: $n$ independent random vectors with the same d.f. $F(x)=F(x^{(1)},\dots,x^{(r)})$.
  • $X_\nu=(X_\nu^{(1)},\dots,X_\nu^{(r)})$: an $r$-dimensional random vector.
  • $x_1,\dots,x_n$: a sample of $n$ $r$-dimensional vectors.
  • $\Phi(x_1,\dots,x_m)$: a symmetric function of $m\ (\le n)$ vector arguments.
  • $\theta=\theta(F)$: a functional of $F$.

Definition: U-statistic

Consider the function of the sample,

$$U(x_1,\dots,x_n)=\binom{n}{m}^{-1}\sum\Phi(x_{\alpha_1},\dots,x_{\alpha_m})\tag{4.4}$$

where the kernel $\Phi$ is symmetric in its $m$ vector arguments and the sum $\sum$ is extended over all subscripts $\alpha$ such that

$$1\le\alpha_1<\alpha_2<\cdots<\alpha_m\le n$$

Eq. (4.4) can be derived from

$$U=U(x_1,\dots,x_n)=\frac{1}{n(n-1)\cdots(n-m+1)}{\sum}'\,\Phi_0(x_{\alpha_1},\dots,x_{\alpha_m}),\tag{4.1}$$

where $\Phi_0$ is the symmetrized kernel (equal to $\Phi$ when $\Phi$ is already symmetric) and ${\sum}'$ stands for summation over all permutations $(\alpha_1,\dots,\alpha_m)$ of $m$ integers such that

$$1\le\alpha_i\le n,\qquad\alpha_i\ne\alpha_j\ \text{ if }\ i\ne j,\quad(i,j=1,\dots,m)\tag{4.2}$$

Asymptotic Normality of U-statistic

The Variance of a U-statistic

The Unbiased Estimator and Its Variance

Since the $X_i$ are i.i.d. with d.f. $F$ and $\theta=\theta(F)$, the U-statistic is an unbiased estimator:
$$E\{U\}=E\{\Phi(X_1,\dots,X_m)\}=\theta$$

Let

$$\Phi_c(x_1,\dots,x_c)=E\{\Phi(x_1,\dots,x_c,X_{c+1},\dots,X_m)\},\quad(c=1,\dots,m),\tag{5.2}$$

where $x_1,\dots,x_c$ are arbitrary fixed vectors and the expected value is taken with respect to the random vectors $X_{c+1},\dots,X_m$. Then

$$\Phi_{c-1}(x_1,\dots,x_{c-1})=E\{\Phi_c(x_1,\dots,x_{c-1},X_c)\}\tag{5.3}$$

and

$$E\{\Phi_c(X_1,\dots,X_c)\}=\theta,\quad(c=1,\dots,m).\tag{5.4}$$

Define

$$\Psi(x_1,\dots,x_m)=\Phi(x_1,\dots,x_m)-\theta\tag{5.5}$$
$$\Psi_c(x_1,\dots,x_c)=\Phi_c(x_1,\dots,x_c)-\theta,\quad(c=1,\dots,m).\tag{5.6}$$

We have

$$\Psi_{c-1}(x_1,\dots,x_{c-1})=E\{\Psi_c(x_1,\dots,x_{c-1},X_c)\}\tag{5.7}$$
$$E\{\Psi_c(X_1,\dots,X_c)\}=E\{\Psi(X_1,\dots,X_m)\}=0,\quad(c=1,\dots,m).\tag{5.8}$$

Suppose that the variance of $\Psi_c(X_1,\dots,X_c)$ exists, and let

$$\zeta_0=0,\qquad\zeta_c=E\{\Psi_c^2(X_1,\dots,X_c)\},\quad(c=1,\dots,m)\tag{5.9}$$

We have

$$\zeta_c=E\{\Phi_c^2(X_1,\dots,X_c)\}-\theta^2\tag{5.10}$$

Stationary Order of a Functional

If, for some parent distribution $F=F_0$ and some integer $d$, we have $\zeta_d(F_0)=0$, this means that $\Psi_d(X_1,\dots,X_d)=0$ with probability 1. By (5.7) and (5.9), $\zeta_d=0$ implies $\zeta_1=\cdots=\zeta_{d-1}=0$.

If $\zeta_1(F_0)=0$, we shall say that the regular functional $\theta(F)$ is stationary for $F=F_0$. If

$$\zeta_1(F_0)=\cdots=\zeta_d(F_0)=0,\qquad\zeta_{d+1}(F_0)>0,\quad(1\le d\le m)\tag{5.11}$$

$\theta(F)$ will be called stationary of order $d$ for $F=F_0$.

The Variance of a U-statistic: i.i.d. Case

If $(\alpha_1,\dots,\alpha_m)$ and $(\beta_1,\dots,\beta_m)$ are two sets of $m$ different integers, $1\le\alpha_i,\beta_i\le n$, and $c$ is the number of integers common to the two sets, we have, by the symmetry of $\Psi$,

$$E\{\Psi(X_{\alpha_1},\dots,X_{\alpha_m})\,\Psi(X_{\beta_1},\dots,X_{\beta_m})\}=\zeta_c\tag{5.12}$$

If the variance of $U$ exists, it is equal to

$$\sigma^2(U)=\binom{n}{m}^{-2}E\Big\{\sum\Psi(X_{\alpha_1},\dots,X_{\alpha_m})\Big\}^2=\binom{n}{m}^{-2}\sum_{c=0}^m{\sum}^{(c)}E\{\Psi(X_{\alpha_1},\dots,X_{\alpha_m})\,\Psi(X_{\beta_1},\dots,X_{\beta_m})\}$$

where ${\sum}^{(c)}$ stands for summation over all subscripts such that

$$1\le\alpha_1<\alpha_2<\cdots<\alpha_m\le n,\qquad 1\le\beta_1<\beta_2<\cdots<\beta_m\le n$$

and exactly $c$ equations

$$\alpha_i=\beta_j$$

are satisfied. By (5.12), each term in ${\sum}^{(c)}$ is equal to $\zeta_c$. The number of terms in ${\sum}^{(c)}$ is easily seen to be

$$\frac{n(n-1)\cdots(n-2m+c+1)}{c!\,(m-c)!\,(m-c)!}=\binom{m}{c}\binom{n-m}{m-c}\binom{n}{m}$$

and hence, since $\zeta_0=0$,

$$\sigma^2(U)=\binom{n}{m}^{-1}\sum_{c=1}^m\binom{m}{c}\binom{n-m}{m-c}\zeta_c\tag{5.13}$$
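Formula (5.13) can be verified end-to-end on a toy distribution, where the exact variance of $U$ is computable by enumeration. The Bernoulli example below is my choice, not one from the paper:

```python
from itertools import combinations, product
from math import comb

def sigma2_u(n, m, zetas):
    """Hoeffding's formula (5.13) for the variance of a U-statistic."""
    return sum(comb(m, c) * comb(n - m, m - c) * zetas[c]
               for c in range(1, m + 1)) / comb(n, m)

# Check against brute force: X ~ Bernoulli(1/2), kernel Phi(x1, x2) = x1 * x2,
# n = 5.  Here theta = 1/4, Phi_1(x) = x/2, so zeta_1 = Var(X)/4 = 1/16, and
# zeta_2 = E[(X1*X2)^2] - theta^2 = 1/4 - 1/16 = 3/16.
formula = sigma2_u(5, 2, {1: 1 / 16, 2: 3 / 16})

us = []
for xs in product([0, 1], repeat=5):  # all 2^5 equally likely samples
    us.append(sum(xs[i] * xs[j] for i, j in combinations(range(5), 2)) / comb(5, 2))
mean_u = sum(us) / len(us)
brute = sum((u - mean_u) ** 2 for u in us) / len(us)
print(formula, brute)  # both equal 0.05625
```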

The Variance of a U-statistic: General Case

When the distributions of $X_1,\dots,X_n$ are different, $F_\nu(x)$ being the d.f. of $X_\nu$, let

$$\theta_{\alpha_1,\dots,\alpha_m}=E\{\Phi(X_{\alpha_1},\dots,X_{\alpha_m})\}\tag{5.14}$$

$$\Psi_c^{(\alpha_1,\dots,\alpha_c)\,\beta_1,\dots,\beta_{m-c}}(x_1,\dots,x_c)=E\{\Phi(x_1,\dots,x_c,X_{\beta_1},\dots,X_{\beta_{m-c}})\}-\theta_{\alpha_1,\dots,\alpha_c,\beta_1,\dots,\beta_{m-c}},\quad(c=1,\dots,m),\tag{5.15}$$

$$\zeta_c^{(\alpha_1,\dots,\alpha_c)\,\beta_1,\dots,\beta_{m-c};\,\gamma_1,\dots,\gamma_{m-c}}=E\{\Psi_c^{(\alpha_1,\dots,\alpha_c)\,\beta_1,\dots,\beta_{m-c}}(X_{\alpha_1},\dots,X_{\alpha_c})\,\Psi_c^{(\alpha_1,\dots,\alpha_c)\,\gamma_1,\dots,\gamma_{m-c}}(X_{\alpha_1},\dots,X_{\alpha_c})\}\tag{5.16}$$

$$\zeta_{c,n}=\frac{c!\,(m-c)!\,(m-c)!}{n(n-1)\cdots(n-2m+c+1)}\sum\zeta_c^{(\alpha_1,\dots,\alpha_c)\,\beta_1,\dots,\beta_{m-c};\,\gamma_1,\dots,\gamma_{m-c}}\tag{5.17}$$

where the sum is extended over all subscripts $\alpha,\beta,\gamma$ such that

$$1\le\alpha_1<\cdots<\alpha_c\le n,\quad 1\le\beta_1<\cdots<\beta_{m-c}\le n,\quad 1\le\gamma_1<\cdots<\gamma_{m-c}\le n,$$
$$\alpha_i\ne\beta_j,\qquad\alpha_i\ne\gamma_j,\qquad\beta_i\ne\gamma_j$$

Then the variance of $U$ is equal to

$$\sigma^2(U)=\binom{n}{m}^{-1}\sum_{c=1}^m\binom{m}{c}\binom{n-m}{m-c}\zeta_{c,n}\tag{5.18}$$

Properties of the Moments and the Variance

Returning to the case of identically distributed $X$'s, we shall now prove some inequalities satisfied by $\zeta_1,\dots,\zeta_m$ and $\sigma^2(U)$, which are contained in the following theorems:

Theorem 5.1. The quantities $\zeta_1,\dots,\zeta_m$ as defined by (5.9) satisfy the inequalities

$$0\le\frac{\zeta_c}{c}\le\frac{\zeta_d}{d}\quad\text{if}\quad 1\le c<d\le m\tag{5.19}$$

Theorem 5.2. The variance $\sigma^2(U_n)$ of a U-statistic $U_n=U(X_1,\dots,X_n)$, where $X_1,\dots,X_n$ are independent and identically distributed, satisfies the inequalities

$$\frac{m^2}{n}\,\zeta_1\le\sigma^2(U_n)\le\frac{m}{n}\,\zeta_m\tag{5.20}$$

$n\,\sigma^2(U_n)$ is a decreasing function of $n$,

$$(n+1)\,\sigma^2(U_{n+1})\le n\,\sigma^2(U_n),\tag{5.21}$$

which takes on its upper bound $m\zeta_m$ for $n=m$ and tends to its lower bound $m^2\zeta_1$ as $n$ increases:

$$\sigma^2(U_m)=\zeta_m\tag{5.22}$$
$$\lim_{n\to\infty}n\,\sigma^2(U_n)=m^2\zeta_1\tag{5.23}$$

If $E\{U_n\}=\theta(F)$ is stationary of order $d-1$ for the d.f. of $X_\alpha$, (5.20) may be replaced by

$$\frac{m}{d}\,K_n(m,d)\,\zeta_d\le\sigma^2(U_n)\le K_n(m,d)\,\zeta_m\tag{5.24}$$

where

$$K_n(m,d)=\binom{n}{m}^{-1}\sum_{c=d}^m\binom{m-1}{c-1}\binom{n-m}{m-c}\tag{5.25}$$

A Necessary and Sufficient Condition for the Existence of the Variance

(5.13) and (5.19) imply that a necessary and sufficient condition for the existence of $\sigma^2(U)$ is the existence of

$$\zeta_m=E\{\Phi^2(X_1,\dots,X_m)\}-\theta^2\tag{5.26}$$

or, equivalently, that of $E\{\Phi^2(X_1,\dots,X_m)\}$.

If $\zeta_1>0$, $\sigma^2(U)$ is of order $n^{-1}$.

If $\theta(F)$ is stationary of order $d$ for $F=F_0$, that is, if (5.11) is satisfied, $\sigma^2(U)$ is of order $n^{-d-1}$. Only if, for some $F=F_0$, $\theta(F)$ is stationary of order $m$, where $m$ is the degree of $\theta(F)$, do we have $\sigma^2(U)=0$, in which case $U$ equals a constant with probability 1.
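The bounds (5.20) and the monotonicity (5.21) can be checked numerically for a concrete set of $\zeta_c$ values. A sketch; the values below come from the Bernoulli(1/2) product-kernel example (my choice) and satisfy (5.19):

```python
from math import comb

def sigma2_u(n, m, zetas):
    # formula (5.13)
    return sum(comb(m, c) * comb(n - m, m - c) * zetas[c]
               for c in range(1, m + 1)) / comb(n, m)

# zeta_1 = 1/16, zeta_2 = 3/16 for kernel x1*x2 under Bernoulli(1/2);
# note zeta_1 / 1 <= zeta_2 / 2, as (5.19) requires.
m, zetas = 2, {1: 1 / 16, 2: 3 / 16}
prev = None
for n in range(m, 30):
    s2 = sigma2_u(n, m, zetas)
    assert m ** 2 / n * zetas[1] <= s2 + 1e-12  # lower bound of (5.20)
    assert s2 <= m / n * zetas[m] + 1e-12       # upper bound of (5.20)
    if prev is not None:
        assert n * s2 <= prev + 1e-12           # monotonicity (5.21)
    prev = n * s2
print("(5.20) and (5.21) hold in this example for n = 2..29")
```

At $n=m$ the upper bound is attained with equality, matching (5.22).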

Lemma 5.1

For proving Theorem 5.1 we shall require the following:

Lemma 5.1. If

$$\delta_d=\zeta_d-\binom{d}{1}\zeta_{d-1}+\binom{d}{2}\zeta_{d-2}-\cdots+(-1)^{d-1}\binom{d}{d-1}\zeta_1\tag{5.27}$$

we have

$$\delta_d\ge 0,\quad(d=1,\dots,m)\tag{5.28}$$

and

$$\zeta_d=\delta_d+\binom{d}{1}\delta_{d-1}+\cdots+\binom{d}{d-1}\delta_1\tag{5.29}$$

Proof. (5.29) follows from (5.27) by induction.

For proving (5.28) let

$$\eta_0=\theta^2,\qquad\eta_c=E\{\Phi_c^2(X_1,\dots,X_c)\},\quad(c=1,\dots,m)$$

Then, by (5.10),

$$\zeta_c=\eta_c-\eta_0$$

and on substituting this in (5.27) we have

$$\delta_d=\sum_{c=0}^d(-1)^{d-c}\binom{d}{c}\eta_c$$

From (5.9) it is seen that (5.28) is true for d=1. Suppose that (5.28) holds for 1,,d1. Then (5.28) will be shown to hold for d.

Let

$$\Phi'_0(x_1)=\Phi_1(x_1)-\theta,$$
$$\Phi'_c(x_1,x_2,\dots,x_{c+1})=\Phi_{c+1}(x_1,\dots,x_{c+1})-\Phi_c(x_2,\dots,x_{c+1}),\quad(c=1,\dots,d-1),$$

where the primes distinguish these differences from the conditional expectations $\Phi_c$ of (5.2). For an arbitrary fixed $x_1$, let

$$\eta'_c(x_1)=E\{\Phi_c'^2(x_1,X_2,\dots,X_{c+1})\},\quad(c=0,\dots,d-1)$$

Then, by the induction hypothesis,

$$\delta'_{d-1}(x_1)=\sum_{c=0}^{d-1}(-1)^{d-1-c}\binom{d-1}{c}\eta'_c(x_1)\ge 0$$

for any fixed $x_1$.

for any fixed x1.

Now,

$$E\{\eta'_c(X_1)\}=\eta_{c+1}-\eta_c$$

and hence

$$E\{\delta'_{d-1}(X_1)\}=\sum_{c=0}^{d-1}(-1)^{d-1-c}\binom{d-1}{c}(\eta_{c+1}-\eta_c)=\sum_{c=0}^d(-1)^{d-c}\binom{d}{c}\eta_c=\delta_d$$

Since $\delta'_{d-1}(x_1)\ge 0$ for every fixed $x_1$, it follows that $\delta_d\ge 0$.

The proof of Lemma 5.1 is complete.
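The pair (5.27)/(5.29) is a binomial inversion, which can be checked numerically. The test values below are arbitrary numbers, not the variances of any particular kernel (the inversion itself holds for any sequence):

```python
import random
from math import comb

random.seed(1)
m = 6
zeta = {a: random.random() for a in range(1, m + 1)}  # arbitrary test values

# (5.27): delta_d = sum_{a=1}^{d} (-1)^(d-a) * C(d, a) * zeta_a
delta = {d: sum((-1) ** (d - a) * comb(d, a) * zeta[a] for a in range(1, d + 1))
         for d in range(1, m + 1)}

# (5.29): zeta_d = sum_{a=1}^{d} C(d, a) * delta_a  -- the inverse transform
for d in range(1, m + 1):
    recovered = sum(comb(d, a) * delta[a] for a in range(1, d + 1))
    assert abs(recovered - zeta[d]) < 1e-9
print("(5.27) and (5.29) are inverse transforms")
```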

Proof of Theorem 5.1

By (5.29) we have for $c<d$

$$c\,\zeta_d-d\,\zeta_c=c\sum_{a=1}^d\binom{d}{a}\delta_a-d\sum_{a=1}^c\binom{c}{a}\delta_a=\sum_{a=1}^c\left[c\binom{d}{a}-d\binom{c}{a}\right]\delta_a+c\sum_{a=c+1}^d\binom{d}{a}\delta_a\tag{5.30}$$

From (5.28), and since $c\binom{d}{a}-d\binom{c}{a}\ge 0$ if $1\le a\le c<d$, it follows that each term in the two sums of (5.30) is non-negative. This, in connection with (5.9), proves Theorem 5.1.

Proof of Theorem 5.2.

From (5.19) we have

$$c\,\zeta_1\le\zeta_c\le\frac{c}{m}\,\zeta_m,\quad(c=1,\dots,m)$$

Applying these inequalities to each term in (5.13) and using the identity

$$\binom{n}{m}^{-1}\sum_{c=1}^m c\binom{m}{c}\binom{n-m}{m-c}=\frac{m^2}{n}\tag{5.31}$$

we obtain (5.20).
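The combinatorial identity (5.31) is a Vandermonde-type identity and can be verified exactly in integer arithmetic:

```python
from math import comb

# (5.31), cleared of denominators:
#   sum_{c=1}^{m} c * C(m, c) * C(n-m, m-c) * n == m^2 * C(n, m)
for n in range(1, 40):
    for m in range(1, n + 1):
        lhs = sum(c * comb(m, c) * comb(n - m, m - c) for c in range(1, m + 1))
        assert lhs * n == m * m * comb(n, m)  # exact integer check
print("identity (5.31) verified for all 1 <= m <= n < 40")
```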

(5.22) and (5.23) follow immediately from (5.13).

For (5.21) we may write

$$D_n\ge 0\tag{5.32}$$

where

$$D_n=n\,\sigma^2(U_n)-(n+1)\,\sigma^2(U_{n+1})$$

Let

$$D_n=\sum_{c=1}^m d_{n,c}\,\zeta_c$$

Then we have from (5.13)

$$d_{n,c}=n\binom{m}{c}\binom{n-m}{m-c}\binom{n}{m}^{-1}-(n+1)\binom{m}{c}\binom{n+1-m}{m-c}\binom{n+1}{m}^{-1},\tag{5.33}$$

or

$$d_{n,c}=\binom{m}{c}\binom{n-m+1}{m-c}(n-m+1)^{-1}\binom{n}{m}^{-1}\left\{(c-1)n-(m-1)^2\right\},\quad(1\le c\le m\le n).$$

Putting

$$c_0=1+\left[\frac{(m-1)^2}{n}\right]$$

where $[u]$ denotes the largest integer $\le u$, we have

$$d_{n,c}\le 0\ \text{ if }\ c\le c_0,\qquad d_{n,c}>0\ \text{ if }\ c>c_0.$$

Hence, by (5.19),

$$d_{n,c}\,\zeta_c\ge\frac{\zeta_{c_0}}{c_0}\,c\,d_{n,c},\quad(c=1,\dots,m)$$

and

$$D_n\ge\frac{\zeta_{c_0}}{c_0}\sum_{c=1}^m c\,d_{n,c}$$

By (5.33) and (5.31), the latter sum vanishes. This proves (5.32).

For the stationary case $\zeta_1=\cdots=\zeta_{d-1}=0$, (5.24) is a direct consequence of (5.13) and (5.19). The proof of Theorem 5.2 is complete.

Properties of the Covariance

Let us now consider the covariance of two U-statistics. Consider a set of $g$ U-statistics,

$$U^{(\gamma)}=\binom{n}{m(\gamma)}^{-1}\sum\Phi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}}),\quad(\gamma=1,\dots,g)$$

each $U^{(\gamma)}$ being a function of the same $n$ independent, identically distributed random vectors $X_1,\dots,X_n$. The function $\Phi^{(\gamma)}$ is assumed to be symmetric in its $m(\gamma)$ arguments, $(\gamma=1,\dots,g)$.

Let

$$E\{U^{(\gamma)}\}=E\{\Phi^{(\gamma)}(X_1,\dots,X_{m(\gamma)})\}=\theta^{(\gamma)},\quad(\gamma=1,\dots,g);$$
$$\Psi^{(\gamma)}(x_1,\dots,x_{m(\gamma)})=\Phi^{(\gamma)}(x_1,\dots,x_{m(\gamma)})-\theta^{(\gamma)},\quad(\gamma=1,\dots,g);\tag{6.1}$$
$$\Psi_c^{(\gamma)}(x_1,\dots,x_c)=E\{\Psi^{(\gamma)}(x_1,\dots,x_c,X_{c+1},\dots,X_{m(\gamma)})\},\quad(c=1,\dots,m(\gamma);\ \gamma=1,\dots,g);\tag{6.2}$$
$$\zeta_c(\gamma,\delta)=E\{\Psi_c^{(\gamma)}(X_1,\dots,X_c)\,\Psi_c^{(\delta)}(X_1,\dots,X_c)\},\quad(\gamma,\delta=1,\dots,g)\tag{6.3}$$

If, in particular, $\gamma=\delta$, we shall write

$$\zeta_c(\gamma)=\zeta_c(\gamma,\gamma)=E\{[\Psi_c^{(\gamma)}(X_1,\dots,X_c)]^2\}\tag{6.4}$$

Let

$$\sigma(U^{(\gamma)},U^{(\delta)})=E\{(U^{(\gamma)}-\theta^{(\gamma)})(U^{(\delta)}-\theta^{(\delta)})\}$$

be the covariance of $U^{(\gamma)}$ and $U^{(\delta)}$.

In a similar way as for the variance, we find, if $m(\gamma)\le m(\delta)$,

$$\sigma(U^{(\gamma)},U^{(\delta)})=\binom{n}{m(\gamma)}^{-1}\sum_{c=1}^{m(\gamma)}\binom{m(\delta)}{c}\binom{n-m(\delta)}{m(\gamma)-c}\zeta_c(\gamma,\delta).\tag{6.5}$$

The right hand side is easily seen to be symmetric in γ,δ.

For $\gamma=\delta$, (6.5) reduces to the variance (5.13) of $U^{(\gamma)}$.

We have from (5.23) and (6.5)

$$\lim_{n\to\infty}n\,\sigma^2(U^{(\gamma)})=m^2(\gamma)\,\zeta_1(\gamma),\qquad\lim_{n\to\infty}n\,\sigma(U^{(\gamma)},U^{(\delta)})=m(\gamma)\,m(\delta)\,\zeta_1(\gamma,\delta).$$

Hence, if $\zeta_1(\gamma)\ne 0$ and $\zeta_1(\delta)\ne 0$, the product moment correlation $\rho(U^{(\gamma)},U^{(\delta)})$ between $U^{(\gamma)}$ and $U^{(\delta)}$ tends to the limit

$$\lim_{n\to\infty}\rho(U^{(\gamma)},U^{(\delta)})=\frac{\zeta_1(\gamma,\delta)}{\sqrt{\zeta_1(\gamma)\,\zeta_1(\delta)}}\tag{6.6}$$

The Limit Theorems: i.i.d. Case

In this section the vectors Xα will be assumed to be identically distributed.

Notes:
Convergence of the distribution function:
A sequence of d.f.'s $F_1(x),F_2(x),\dots$ converges to a d.f. $F(x)$ if $\lim F_n(x)=F(x)$ at every point at which the one-dimensional marginal limiting d.f.'s are continuous.

Singularity of the Distribution:
A $g$-variate normal distribution is called non-singular if the rank $r$ of its covariance matrix is equal to $g$, and singular if $r<g$.

Lemma 7.1. Let $V_1,V_2,\dots$ be an infinite sequence of random vectors $V_n=(V_n^{(1)},\dots,V_n^{(g)})$, and suppose that the d.f. $F_n(v)$ of $V_n$ tends to a d.f. $F(v)$ as $n\to\infty$. Let $V_n'^{(\gamma)}=V_n^{(\gamma)}+d_n^{(\gamma)}$, where

$$\lim_{n\to\infty}E\{d_n^{(\gamma)}\}^2=0,\quad(\gamma=1,\dots,g)\tag{7.1}$$

Then the d.f. of $V_n'=(V_n'^{(1)},\dots,V_n'^{(g)})$ tends to $F(v)$.

The Limit Theorems 7.1 and 7.2

Theorem 7.1. Let $X_1,\dots,X_n$ be $n$ independent, identically distributed random vectors,

$$X_\alpha=(X_\alpha^{(1)},\dots,X_\alpha^{(r)}),\quad(\alpha=1,\dots,n)$$

Let

$$\Phi^{(\gamma)}(x_1,\dots,x_{m(\gamma)}),\quad(\gamma=1,\dots,g),$$

be $g$ real-valued functions not involving $n$, $\Phi^{(\gamma)}$ being symmetric in its $m(\gamma)\ (\le n)$ vector arguments $x_\alpha=(x_\alpha^{(1)},\dots,x_\alpha^{(r)})$, $(\alpha=1,\dots,m(\gamma);\ \gamma=1,\dots,g)$. Define

$$U^{(\gamma)}=\binom{n}{m(\gamma)}^{-1}\sum\Phi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}}),\quad(\gamma=1,\dots,g)\tag{7.2}$$

where the summation is over all subscripts such that $1\le\alpha_1<\cdots<\alpha_{m(\gamma)}\le n$. Then, if the expected values

$$\theta^{(\gamma)}=E\{\Phi^{(\gamma)}(X_1,\dots,X_{m(\gamma)})\},\quad(\gamma=1,\dots,g)\tag{7.3}$$

and

$$E\{[\Phi^{(\gamma)}(X_1,\dots,X_{m(\gamma)})]^2\},\quad(\gamma=1,\dots,g)\tag{7.4}$$

exist, the joint d.f. of

$$\sqrt n\,(U^{(1)}-\theta^{(1)}),\dots,\sqrt n\,(U^{(g)}-\theta^{(g)})$$

tends, as $n\to\infty$, to the $g$-variate normal d.f. with zero means and covariance matrix $(m(\gamma)\,m(\delta)\,\zeta_1(\gamma,\delta))$, where $\zeta_1(\gamma,\delta)$ is defined by (6.3). The limiting distribution is non-singular if the determinant $|\zeta_1(\gamma,\delta)|$ is positive.

According to Theorem 5.2, $\sigma^2(U)$ exceeds its asymptotic value $m^2\zeta_1/n$ for any finite $n$. Hence, when $n$ is large but finite, the limiting variance in Theorem 7.1 underestimates the true variance of $U$. For such cases the following theorem, which is an immediate consequence of Theorem 7.1, will be more useful.

Theorem 7.2. Under the conditions of Theorem 7.1, and if

$$\zeta_1(\gamma)>0,\quad(\gamma=1,\dots,g)$$

the joint d.f. of

$$\frac{U^{(1)}-\theta^{(1)}}{\sigma(U^{(1)})},\dots,\frac{U^{(g)}-\theta^{(g)}}{\sigma(U^{(g)})}$$

tends, as $n\to\infty$, to the $g$-variate normal d.f. with zero means and covariance matrix $(\rho(\gamma,\delta))$, where

$$\rho(\gamma,\delta)=\lim_{n\to\infty}\frac{\sigma(U^{(\gamma)},U^{(\delta)})}{\sigma(U^{(\gamma)})\,\sigma(U^{(\delta)})}=\frac{\zeta_1(\gamma,\delta)}{\sqrt{\zeta_1(\gamma)\,\zeta_1(\delta)}},\quad(\gamma,\delta=1,\dots,g).$$

Proof of Theorem 7.1. The existence of (7.4) entails that of

$$\zeta_{m(\gamma)}(\gamma)=E\{[\Phi^{(\gamma)}(X_1,\dots,X_{m(\gamma)})]^2\}-(\theta^{(\gamma)})^2$$

which, by (5.19), (5.20) and (6.6), is sufficient for the existence of

$$\zeta_1(\gamma),\dots,\zeta_{m(\gamma)-1}(\gamma),\ \text{of }\ \sigma^2(U^{(\gamma)}),\ \text{and of }\ \zeta_1(\gamma,\delta)\le\sqrt{\zeta_1(\gamma)\,\zeta_1(\delta)}$$

Now, consider the $g$ quantities

$$Y^{(\gamma)}=\frac{m(\gamma)}{\sqrt n}\sum_{\alpha=1}^n\Psi_1^{(\gamma)}(X_\alpha),\quad(\gamma=1,\dots,g)$$

where $\Psi_1^{(\gamma)}(x)$ is defined by (6.2). $Y^{(1)},\dots,Y^{(g)}$ are sums of $n$ independent random variables with zero means, whose covariance matrix, by virtue of (6.3), is

$$\{\sigma(Y^{(\gamma)},Y^{(\delta)})\}=\{m(\gamma)\,m(\delta)\,\zeta_1(\gamma,\delta)\}\tag{7.5}$$

By the Central Limit Theorem for vectors (cf. Cramér [1, p. 112]), the joint d.f. of $(Y^{(1)},\dots,Y^{(g)})$ tends to the normal $g$-variate d.f. with the same means and covariances.

Theorem 7.1 will be proved by showing that the $g$ random variables

$$Z^{(\gamma)}=\sqrt n\,(U^{(\gamma)}-\theta^{(\gamma)}),\quad(\gamma=1,\dots,g),\tag{7.6}$$

have the same joint limiting distribution as $Y^{(1)},\dots,Y^{(g)}$.

According to Lemma 7.1 it is sufficient to show that

$$\lim_{n\to\infty}E(Z^{(\gamma)}-Y^{(\gamma)})^2=0,\quad(\gamma=1,\dots,g)\tag{7.7}$$

For proving (7.7), write

$$E\{Z^{(\gamma)}-Y^{(\gamma)}\}^2=E\{Z^{(\gamma)}\}^2+E\{Y^{(\gamma)}\}^2-2E\{Z^{(\gamma)}Y^{(\gamma)}\}\tag{7.8}$$

By (5.13) we have

$$E\{Z^{(\gamma)}\}^2=n\,\sigma^2(U^{(\gamma)})=m^2(\gamma)\,\zeta_1(\gamma)+O(n^{-1})\tag{7.9}$$

and from (7.5),

$$E\{Y^{(\gamma)}\}^2=m^2(\gamma)\,\zeta_1(\gamma)\tag{7.10}$$

By (7.2) and (6.1) we may write for (7.6)

$$Z^{(\gamma)}=\sqrt n\,\binom{n}{m(\gamma)}^{-1}\sum\Psi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}})$$

and hence

$$E\{Z^{(\gamma)}Y^{(\gamma)}\}=m(\gamma)\binom{n}{m(\gamma)}^{-1}\sum_{\alpha=1}^n\sum_{1\le\alpha_1<\cdots<\alpha_{m(\gamma)}\le n}E\{\Psi_1^{(\gamma)}(X_\alpha)\,\Psi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}})\}.$$

The term

$$E\{\Psi_1^{(\gamma)}(X_\alpha)\,\Psi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}})\}$$

is equal to $\zeta_1(\gamma)$ if

$$\alpha_1=\alpha\ \text{ or }\ \alpha_2=\alpha\ \text{ or }\ \cdots\ \text{ or }\ \alpha_{m(\gamma)}=\alpha\tag{7.11}$$

and $0$ otherwise. For a fixed $\alpha$, the number of sets $\{\alpha_1,\dots,\alpha_{m(\gamma)}\}$ such that $1\le\alpha_1<\cdots<\alpha_{m(\gamma)}\le n$ and (7.11) is satisfied is $\binom{n-1}{m(\gamma)-1}$. Thus,

$$E\{Z^{(\gamma)}Y^{(\gamma)}\}=m(\gamma)\binom{n}{m(\gamma)}^{-1}n\binom{n-1}{m(\gamma)-1}\zeta_1(\gamma)=m^2(\gamma)\,\zeta_1(\gamma)\tag{7.12}$$

On inserting (7.9), (7.10), and (7.12) in (7.8), we see that (7.7) is true.

The proof of Theorem 7.1 is complete.
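The heart of the proof, $E(Z^{(\gamma)}-Y^{(\gamma)})^2\to 0$, can be illustrated by simulation for a single U-statistic. The kernel $\Phi(x_1,x_2)=x_1x_2$ with $X\sim N(1,1)$ is my choice of example, for which $\theta=\mu^2$ and $\Psi_1(x)=\mu(x-\mu)$:

```python
import random

random.seed(2)
mu, n, reps = 1.0, 400, 1000

# Kernel Phi(x1, x2) = x1 * x2: theta = mu^2, Psi_1(x) = mu * (x - mu).
mse = 0.0
for _ in range(reps):
    xs = [random.gauss(mu, 1) for _ in range(n)]
    s = sum(xs)
    u = (s * s - sum(x * x for x in xs)) / (n * (n - 1))  # the U-statistic
    z = n ** 0.5 * (u - mu * mu)                          # Z of (7.6)
    y = (2 / n ** 0.5) * sum(mu * (x - mu) for x in xs)   # projection Y, m = 2
    mse += (z - y) ** 2 / reps
print(mse)  # estimates E(Z - Y)^2 = n*sigma^2(U) - m^2*zeta_1 = O(1/n): tiny
```

The sample mean-squared gap between $Z$ and its linear projection $Y$ is of order $1/n$, exactly as (7.8)-(7.12) predict.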

The Limit Theorem 7.3: Extension of Theorem 7.1 to a Larger Class of Statistics

The application of Lemma 7.1 leads immediately to the following extension of Theorem 7.1 to a larger class of statistics.

Theorem 7.3. Let

$$U'^{(\gamma)}=U^{(\gamma)}+\frac{b_n^{(\gamma)}}{\sqrt n},\quad(\gamma=1,\dots,g)\tag{7.13}$$

where $U^{(\gamma)}$ is defined by (7.2) and $b_n^{(\gamma)}$ is a random variable. If the conditions of Theorem 7.1 are satisfied, and $\lim E\{b_n^{(\gamma)}\}^2=0$, $(\gamma=1,\dots,g)$, then the joint distribution of

$$\sqrt n\,(U'^{(1)}-\theta^{(1)}),\dots,\sqrt n\,(U'^{(g)}-\theta^{(g)})$$

tends to the normal distribution with zero means and covariance matrix

$$\{m(\gamma)\,m(\delta)\,\zeta_1(\gamma,\delta)\}$$

The Limit Theorem 7.4: Application to Sample Functionals

Theorem 7.3 applies, in particular, to the regular functionals $\theta(S)$ of the sample d.f. $S$,

$$\theta(S)=\frac{1}{n^m}\sum_{\alpha_1=1}^n\cdots\sum_{\alpha_m=1}^n\Phi(X_{\alpha_1},\dots,X_{\alpha_m})$$

in the case that the variance of $\theta(S)$ exists. For we may write

$$n^m\,\theta(S)=n(n-1)\cdots(n-m+1)\,U+{\sum}'\,\Phi(X_{\alpha_1},\dots,X_{\alpha_m})$$

where the sum ${\sum}'$ is extended over all $m$-tuples $(\alpha_1,\dots,\alpha_m)$ in which at least one equality $\alpha_i=\alpha_j\ (i\ne j)$ holds. The number of terms in ${\sum}'$ is of order $n^{m-1}$. Hence

$$\theta(S)-U=\frac{1}{n}\,D_n$$

where the expected value $E\{D_n^2\}$, whose existence follows from that of $\sigma^2(\theta(S))$, is bounded as $n\to\infty$. Writing $\theta(S)=U+n^{-1/2}b_n$ with $b_n=n^{-1/2}D_n$, we have $E\{b_n^2\}\to 0$, so if we put $U'^{(\gamma)}=\theta^{(\gamma)}(S)$, the conditions of Theorem 7.3 are fulfilled. We may summarize this result as follows:

Theorem 7.4. Let $X_1,\dots,X_n$ be a random sample from an $r$-variate population with d.f. $F(x)=F(x^{(1)},\dots,x^{(r)})$, and let

$$\theta^{(\gamma)}(F)=\int\cdots\int\Phi^{(\gamma)}(x_1,\dots,x_{m(\gamma)})\,dF(x_1)\cdots dF(x_{m(\gamma)}),\quad(\gamma=1,\dots,g),$$

be $g$ regular functionals of $F$, where $\Phi^{(\gamma)}(x_1,\dots,x_{m(\gamma)})$ is symmetric in the vectors $x_1,\dots,x_{m(\gamma)}$ and does not involve $n$. If $S(x)$ is the d.f. of the random sample, and if the variance of

$$\theta^{(\gamma)}(S)=\frac{1}{n^{m(\gamma)}}\sum_{\alpha_1=1}^n\cdots\sum_{\alpha_{m(\gamma)}=1}^n\Phi^{(\gamma)}(X_{\alpha_1},\dots,X_{\alpha_{m(\gamma)}})$$

exists, the joint d.f. of

$$\sqrt n\,\{\theta^{(1)}(S)-\theta^{(1)}(F)\},\dots,\sqrt n\,\{\theta^{(g)}(S)-\theta^{(g)}(F)\}$$

tends to the $g$-variate normal d.f. with zero means and covariance matrix

$$\{m(\gamma)\,m(\delta)\,\zeta_1(\gamma,\delta)\}$$
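The gap between the sample functional $\theta(S)$ (a V-statistic) and the U-statistic $U$, which the argument above shows is $O(1/n)$, is easy to observe numerically (a sketch; the function names and data are mine):

```python
from itertools import combinations, product
from math import comb

def v_stat(xs, kernel, m):
    """theta(S): the kernel averaged over all n^m index tuples (repeats allowed)."""
    n = len(xs)
    return sum(kernel(*(xs[i] for i in idx))
               for idx in product(range(n), repeat=m)) / n ** m

def u_stat(xs, kernel, m):
    n = len(xs)
    return sum(kernel(*(xs[i] for i in idx))
               for idx in combinations(range(n), m)) / comb(n, m)

ker = lambda a, b: (a - b) ** 2 / 2  # kernel of the sample variance

# For this kernel the diagonal terms vanish, so theta(S) = (1 - 1/n) * U exactly,
# and the gap theta(S) - U = -U/n is O(1/n).
for n in (5, 10, 20, 40):
    xs = [((i * 7919) % 101) / 101 for i in range(n)]  # arbitrary fixed data
    v, u = v_stat(xs, ker, 2), u_stat(xs, ker, 2)
    print(n, v - u)  # shrinks roughly like 1/n
```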

The Limit Theorem 7.5: Application to Functions of Statistics

The following theorem is concerned with the asymptotic distribution of a function of statistics of the form $U$ or $U'$.

Theorem 7.5. Let $U'=(U'^{(1)},\dots,U'^{(g)})$ be a random vector, where $U'^{(\gamma)}$ is defined by (7.13), and suppose that the conditions of Theorem 7.3 are satisfied. If the function $h(y)=h(y^{(1)},\dots,y^{(g)})$ does not involve $n$ and is continuous together with its second-order partial derivatives in some neighborhood of the point $(y)=(\theta)=(\theta^{(1)},\dots,\theta^{(g)})$, then the distribution of the random variable $\sqrt n\,\{h(U')-h(\theta)\}$ tends to the normal distribution with mean zero and variance

$$\sum_{\gamma=1}^g\sum_{\delta=1}^g m(\gamma)\,m(\delta)\left(\frac{\partial h(y)}{\partial y^{(\gamma)}}\,\frac{\partial h(y)}{\partial y^{(\delta)}}\right)_{y=\theta}\zeta_1(\gamma,\delta)$$
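Theorem 7.5 is a delta-method statement. A quick simulation with $h(y)=y^2$ applied to the sample mean (my choice of example: $g=1$, $m=1$, $\Phi(x)=x$, so $\theta=\mu$ and $\zeta_1=\sigma^2$) matches the predicted variance $(h'(\theta))^2\zeta_1=4\mu^2\sigma^2$:

```python
import random
import statistics

random.seed(3)
mu, sigma, n, reps = 2.0, 1.0, 300, 2000

# h(y) = y^2 applied to the order-1 U-statistic U = sample mean.
# Theorem 7.5 predicts Var[sqrt(n)(h(U) - h(theta))] -> (2*mu)^2 * sigma^2.
vals = []
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    vals.append(n ** 0.5 * (xbar ** 2 - mu ** 2))
emp = statistics.variance(vals)
pred = (2 * mu) ** 2 * sigma ** 2  # = 16
print(emp, pred)  # close
```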

Applications to particular statistics

  • Moments and functions of moments
  • Mean difference and coefficient of concentration
  • Functions of ranks and of the signs of variate differences
  • Difference sign correlation
  • Rank correlation and grade correlation
  • Non-parametric tests of independence
  • Mann’s test against trend
  • The coefficient of partial difference sign correlation

References

[1] W. Hoeffding, ‘A Class of Statistics with Asymptotically Normal Distribution’, Ann. Math. Statist., vol. 19, no. 3, pp. 293–325, Sep. 1948, doi: 10.1214/aoms/1177730196.

[2] J. Shao, Mathematical Statistics. Springer Science & Business Media, 2003.
