Using Segmentation to Verify Object Hypotheses
Toyota Technological Institute at Chicago
Chicago, IL 60637
We present an approach for object recognition that combines detection and segmentation within an efficient hypothesize/test framework. Scanning-window template classifiers
are the current state-of-the-art for many object classes such
as faces, cars, and pedestrians. Such approaches, though
quite successful, can be hindered by their lack of explicit
encoding of object shape/structure – one might, for exam-
ple, find faces in trees.
We adopt the following strategy: we first use these systems as attention mechanisms, generating many possible object locations by tuning them for a low missed-detection rate at the cost of a high false-positive rate. At each hypothesized detection, we
compute a local figure-ground segmentation using a win-
dow of slightly larger extent than that used by the classifier.
This segmentation task is guided by top-down knowledge.
We learn offline from training data those segmentations that
are consistent with true positives. We then prune away those
hypotheses with bad segmentations. We show this strategy leads to significant improvements (10-20%) over established approaches such as Viola-Jones and Dalal-Triggs on a variety of benchmark datasets, including the PASCAL challenge, LabelMe, and the INRIA Person dataset.
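The hypothesize-and-test strategy above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the detector, the segmentation scorer, and the threshold are all stand-ins for the components the paper describes (a permissively tuned scanning-window classifier, and a learned score for how consistent a local figure-ground segmentation is with those of true positives).

```python
def hypothesize_and_test(image, detect, segment_score, seg_threshold):
    """Generate many candidate detections, then keep only those whose
    local figure-ground segmentation is consistent with a true positive.

    `detect` and `segment_score` are illustrative stand-ins for the
    detector and learned segmentation-consistency score in the paper.
    """
    # Step 1: run the scanning-window detector tuned for a low
    # missed-detection rate; many false positives are expected here.
    hypotheses = detect(image)

    verified = []
    for box in hypotheses:
        # Step 2: segment a window of slightly larger extent than the
        # classifier's, and score how consistent the segmentation is
        # with segmentations of true positives learned from training data.
        score = segment_score(image, box)
        # Step 3: prune hypotheses with bad segmentations.
        if score >= seg_threshold:
            verified.append(box)
    return verified
```

A detector that over-generates is acceptable here precisely because the segmentation-based verification stage discards hypotheses whose local figure-ground layout does not look like a true positive.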
One of the open questions in object recognition is the role
of segmentation. Several issues remain unclear. Can one
quantitatively demonstrate that segmentation improves de-
tection performance? If so, how does one computationally
detect/segment in an efficient manner?
We address these issues with a simple but surprisingly
effective hypothesize-and-test framework. We leverage the
successful work on sliding-window pattern-recognition de-
tectors. We use these as attention mechanisms that propose
many hundreds of object hypotheses per image. By com-
puting a local figure-ground segmentation at hypothesized
detections, we show one can prune a