CHAPTER 3
LITERATURE REVIEW
3.1 Introduction
The attractive features of speech and the steadily growing technical capability in this field have
led engineers and speech scientists to consider the use of speech in many new
spheres of daily life. Consequently, the field of “automatic speech recognition” (ASR) has evolved,
which aims to construct a machine that can emulate human performance by recognizing
and synthesizing human speech (Podder, 1997).
In recent years, automatic speech recognition has reached very high levels of performance,
with word-error rates dropping by a factor of five in the past five years. This level of
performance is largely due to improvements in the algorithms and techniques used in this
field. As a result, the accuracy of ASR systems has improved, especially when several
algorithms and techniques are combined.
This chapter highlights key research, algorithms and techniques that are
relevant to this work, including various feature extraction, classification and matching
techniques. It also provides an overview of current
research on ASR and telephony speech recognition systems that apply
the most common feature extraction, classification and matching algorithms
and techniques. Finally, it provides a critical comparison of speech recognition
systems that use the algorithms explained in this literature review.
3.2 Feature Extraction Techniques
Feature extraction in ASR is the computation of a sequence of feature vectors that
provides a compact representation of the given speech signal. It is usually performed in three
main stages. The first stage, called speech analysis or the acoustic front-end,
performs spectro-temporal analysis of the speech signal and generates raw features describing
the envelope of the power spectrum of short speech intervals. The second stage compiles an
e