IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-30, NO. 1, JANUARY 1984    93
and $Y$, with values in $\{1,\ldots,M\}$. The local performance of $\psi_n$ is measured by $L_n(x) = P\{\psi_n(X) \neq Y \mid V_n,\, X = x\}$, while the global performance by $L_n = E(L_n(X) \mid V_n)$. $L^*(x)$ and $L^*$ denote the local Bayes probability of error at $x \in R^d$ and the Bayes probability of error, respectively.
Let $p_i(x) = E(I_{\{Y=i\}} \mid X = x)$ and let $p_{in}(x)$ be its estimate from the data $(X_1, I_{\{Y_1=i\}}),\ldots,(X_n, I_{\{Y_n=i\}})$ derived from $m_n$. The recursive computation of $p_{in}$ may be carried out by
\[
p_{i0}(x) = l_0(x) = 0,
\]
\[
l_n(x) = l_{n-1}(x) + K\!\left(\frac{x - X_n}{h(n)}\right),
\]
\[
p_{in}(x) = p_{i,n-1}(x) + l_n^{-1}(x)\left(I_{\{Y_n=i\}} - p_{i,n-1}(x)\right) K\!\left(\frac{x - X_n}{h(n)}\right).
\]
Using $p_{in}$, we obtain a classification rule which classifies every $X$ as coming from any class which maximizes $p_{in}(X)$. Reasoning similarly as in [7], we conclude from Theorems 1 and 2 the next two theorems concerning the asymptotic optimality of the rule.
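To make the recursion concrete, here is a minimal Python sketch of the estimator and the resulting plug-in rule. The Gaussian kernel, the bandwidth sequence $h(n)$, and the sample data below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def p_hat(x, X, Y, M, K, h):
    """Recursive kernel estimate of p_i(x) = P(Y = i | X = x), i = 0..M-1.

    Implements the recursion
        p_{i0}(x) = l_0(x) = 0,
        l_n(x)    = l_{n-1}(x) + K((x - X_n)/h(n)),
        p_{in}(x) = p_{i,n-1}(x)
                    + (I{Y_n = i} - p_{i,n-1}(x)) * K((x - X_n)/h(n)) / l_n(x),
    i.e. a running kernel-weighted average of the class indicators.
    """
    l = 0.0
    p = np.zeros(M)
    for n, (Xn, Yn) in enumerate(zip(X, Y), start=1):
        k = K((x - Xn) / h(n))
        l += k
        if l > 0.0:                 # p stays at 0 until some kernel mass arrives
            ind = np.zeros(M)
            ind[Yn] = 1.0           # indicator I{Y_n = i}
            p += (ind - p) * (k / l)
    return p

def classify(x, X, Y, M, K, h):
    """Plug-in rule: pick a class maximizing p_{in}(x)."""
    return int(np.argmax(p_hat(x, X, Y, M, K, h)))
```

With, e.g., $K(u) = \exp(-u^2/2)$ and $h(n) = n^{-1/5}$, the rule assigns a query point to the class with the largest estimated posterior; the one-pass update means no pass over earlier samples is ever repeated.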
Theorem 3: If (1), (2), (3) are satisfied, then
\[
L_n(x) \to L^*(x) \quad \text{in probability as } n \to \infty
\]
for almost all $x(\mu)$. If, in addition, (4) and (5) hold, then
\[
L_n(x) \to L^*(x) \quad \text{almost surely as } n \to \infty
\]
for almost all $x(\mu)$.
Theorem 4: Under the same assumptions as in Theorem 3 we obtain, respectively,
\[
L_n \to L^* \quad \text{in probability}
\]
and
\[
L_n \to L^* \quad \text{almost surely as } n \to \infty.
\]
APPENDIX
Proof of Lemma 2: Let us consider the quotient
\[
n^{-1} \sum_{i=1}^{n} \frac{h^d(i)}{\mu\!\left(S_{x,\,rh(i)}\right)}.
\]
The lemma follows from (2) and the fact that $h^d/\mu(S_{x,rh})$ possesses a finite limit as $h \to 0$ for almost all $x(\mu)$ (see Devroye [3, Lemma 2.2]).
□
Proof of Lemma 3: Let us transform expression (7) as follows:
\[
c^2 H(0) \sum_{n=1}^{\infty} \gamma_n \left( h^d(n) \Big/ \Big( \sum_{i=1}^{n} h^d(i) \Big)^{2} \right), \tag{10}
\]
where
\[
\gamma_n = h^{-d}(n)\, E\, K\!\left(\frac{x - X}{h(n)}\right).
\]
If $\mu$ is absolutely continuous with density $g$, then $\gamma_n \to g(x)$ as $n \to \infty$ for almost all $x(\mu)$ (see Wheeden and Zygmund [9, Thm. 9.13]). By Kronecker's lemma and by (9), the series in (10) is convergent for almost all $x(\mu)$. Thus Theorem 2 follows.
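For completeness, the form of Kronecker's lemma being invoked here (a standard statement, reproduced as a reading aid; condition (9) itself is not restated in this excerpt) is:

```latex
% Kronecker's lemma (standard form).
% If (b_n) is positive and increases to infinity, and the series
% \sum a_n / b_n converges, then the weighted partial averages vanish:
\[
  b_n \uparrow \infty, \qquad \sum_{n=1}^{\infty} \frac{a_n}{b_n} < \infty
  \;\Longrightarrow\;
  \frac{1}{b_n} \sum_{i=1}^{n} a_i \longrightarrow 0 .
\]
```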
Next, reasoning similarly as in the proof of Theorem 2, (7) may
be transformed for almost all $x(\mu)$ to the form
where c(x) is a positive