Discriminative Learning can Succeed where Generative Learning Fails ⋆
Philip M. Long a, Rocco A. Servedio b,∗,1, Hans Ulrich Simon c
aGoogle, Mountain View, CA, USA
bColumbia University, New York, New York, USA
cRuhr-Universität Bochum, Bochum, Germany
Generative algorithms for learning classifiers use training data to separately estimate a probability model for each class. New items are classified by comparing their probabilities under these models. In contrast, discriminative learning algorithms try to find classifiers that perform well on all the training data.
We show that there is a learning problem that can be solved by a discriminative learning algorithm, but not by any generative learning algorithm. This statement is formalized using a framework inspired by previous work of Goldberg.
Key words: algorithms, computational learning theory, discriminative learning, generative learning, machine learning
If objects and their classifications are generated randomly from a joint probability distribution, then the optimal way to predict the class y of an item x is to maximize Pr[y|x]. Applying Bayes' rule, this is equivalent to maximizing Pr[x|y] Pr[y], since the denominator Pr[x] does not depend on y. This motivates what has become known as the generative approach to learning a classifier, in which the training data is used to learn
⋆ Reference  is a preliminary version of this paper but contains a flaw; see the more detailed note at the end of Section 1.
∗ Corresponding author.
Email address: email@example.com (Rocco A. Servedio).
1 Supported in part by NSF CAREER award CCF-0347282, by NSF award CCF-0523664, and by a Sloan Foundation Fellowship.
Preprint submitted to Elsevier Science
26 February 2007
Pr[·|y] and Pr[y] for the different classes y, and the results are used to approximate the behavior of the optimal predictor for the source (see [2,6]).
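As a concrete illustration (not taken from the paper), the generative prediction rule argmax_y Pr[x|y] Pr[y] can be sketched as follows. This is a minimal sketch under the hypothetical assumption that each class-conditional density Pr[x|y] is modeled as a one-dimensional Gaussian fit to the training points of that class; the function names are ours, not the paper's.

```python
import math

def fit_generative(xs, ys):
    """Estimate a prior Pr[y] and a Gaussian model of Pr[x|y] per class."""
    models = {}
    for label in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(pts) / len(pts)
        var = sum((p - mean) ** 2 for p in pts) / len(pts) or 1e-9
        models[label] = (len(pts) / len(xs), mean, var)  # (prior, mean, variance)
    return models

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) at x, used as the estimate of Pr[x|y]."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def predict(models, x):
    """Classify x by maximizing Pr[x|y] * Pr[y] over the classes y."""
    return max(models, key=lambda y: models[y][0] * gaussian_pdf(x, *models[y][1:]))
```

For example, after fitting on points clustered near 0 (class 0) and near 3 (class 1), `predict` assigns a new point to whichever class gives it the larger value of prior times density.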
In the discriminative approach, the learning algorithm simply tries to find a
classifier that performs well on the training data [12,6,10,7]. Discriminative al-