Proc. ICDAR, to appear, Aug. 2003
Handwriting Recognition Using Position Sensitive Letter N-Gram Matching
Adnan El-Nasan, Sriharsha Veeramachaneni, George Nagy
DocLab, Rensselaer Polytechnic Institute, Troy, NY 12180
elnasan@rpi.edu
Abstract
We propose a further improvement to a handwriting
recognition method that avoids segmentation while
remaining able to recognize words that were never seen before in
handwritten form. This method is based on the fact that
few pairs of English words share exactly the same set of
letter bigrams and even fewer share longer n-grams. The
lexical n-gram matches between every word in a lexicon
and a set of reference words can be precomputed. A
position-based match function then detects the matches
between the handwritten signal of a query word and each
reference word. We show that, with a reasonable set of
reference words, the recognition rate for lexicon words exceeds
90%.
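The lexical precomputation mentioned in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function names (bigrams, lexical_matches, precompute_matches) and the toy word lists in the usage note are assumptions introduced here, and only the text labels are compared, not the handwritten signal.

```python
def bigrams(word):
    """Return (position, bigram) pairs for every letter bigram in a word label."""
    return [(i, word[i:i + 2]) for i in range(len(word) - 1)]


def lexical_matches(lexicon_word, reference_word):
    """List shared bigrams as (position in lexicon word, position in reference word, bigram)."""
    ref_bigrams = bigrams(reference_word)
    return [
        (i, j, bg)
        for i, bg in bigrams(lexicon_word)
        for j, ref_bg in ref_bigrams
        if bg == ref_bg
    ]


def precompute_matches(lexicon, reference_set):
    """Build a table: lexicon word -> reference word -> position-tagged bigram matches."""
    return {
        word: {ref: lexical_matches(word, ref) for ref in reference_set}
        for word in lexicon
    }
```

For a hypothetical lexicon ["there", "other"] and reference set ["the", "her"], the table records that "there" shares "th" and "he" with "the" at positions 0 and 1, while "other" shares the same two bigrams at positions 1 and 2. The positions, not just the shared bigrams, are what distinguish the two candidates.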
1. Introduction
We propose a single-user, unconstrained
handwriting recognition system that utilizes partial word
matching to detect letter-bigram or longer segments from
a feature-based representation of word patterns. The
system has a lexicon and a reference set. The lexicon is
the set of all plausible word labels. Words in the reference
set are words from the lexicon for which we have
handwritten samples. The proposed system consists of
three stages: lexical processing, signal matching and
classification. The lexical processing stage precomputes
the bigram match properties for each word in the lexicon
by matching the label of a lexicon word against the label
of each reference word. The signal matching stage reports
the length of the longest matching segment between the
feature representation of the unknown and the feature
representation of each reference word. In contrast
to our earlier work [4], these matches are limited to the
positions where each lexical candidate matches the label
of a reference word. The classification stage then finds a
label from the