CHAPTER 1

REVIEW ON CHANNEL EQUALIZATION

1.1 INTRODUCTION

Communication systems comprise three fundamental elements: transmitter, channel and

receiver. When signals are transmitted through a communications system, they are obstructed

by some distortions which are mainly intersymbol interference (ISI) and noise. The

transmitted signal is distorted by ISI which is caused by multipath effect in band limited

(frequency selective) time dispersive channels and is the cause of bit errors on the receiver

side. ISI is considered the main factor negatively affecting fast transmission of data over

wireless channels. In order to eliminate or minimize these distortions, equalizers are

employed in these systems. Equalization is the method of compensating for, eliminating or

reducing the amplitude and phase distortion introduced by the transmission medium in

communications systems. In a general meaning, the term equalization refers to any signal

processing operation which minimizes ISI. An equalizing filter overcomes the ISI caused by

individual received symbols of a transmitted data stream, as well as crosstalk, which results,

for example, from the capacitive coupling of a transmitted pulse on an outgoing pair that

interferes with the received pulse on an incoming pair. The task of equalizers is to provide

efficient and error-free communications by ensuring that signals transmitted through the

channel are recovered in their original form at the receiver.

Distortions may be linear or nonlinear depending on the characteristics of the channel.

When transmitting information through a physical channel, various mechanisms distort the

transmitted signal significantly, causing degradation or even failure in the communications.

These mechanisms can be classified as additive thermal noise, man-made noise and

atmospheric noise. In practice, many of the physical channels are characterized by various

channel models. The most frequently encountered channel of communications is that with

additive noise. An additive random noise process is involved in this channel model. The

additive noise arises from amplifiers and electronic components on the receiver side of the

communications system, or from interference encountered in transmission, as in radio signal

transmission, for example. Thermal noise is the category of noise that electronic components

and amplifiers cause. Statistically, this sort of noise is characterized as a Gaussian random

process, and the corresponding mathematical model of the channel is called the additive

Gaussian noise channel. The mathematical model becomes an additive white Gaussian noise

(AWGN) channel in the case of the random process being a white-noise process. The random

process is a white-noise process when the power spectral density (PSD) is flat (constant) over

all frequencies [1,2].
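
As an illustration of the AWGN model described above, the following minimal sketch adds zero-mean Gaussian noise to a binary antipodal signal at a chosen SNR and then measures the SNR that results. The function name `add_awgn`, the antipodal test signal and the 10 dB target are assumptions made for this example only; they are not taken from the thesis.

```python
import math
import random

def add_awgn(signal, snr_db, rng):
    """Return the signal plus zero-mean white Gaussian noise at the requested SNR (dB)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    # Independent samples give a flat power spectral density, i.e. white noise.
    return [s + rng.gauss(0.0, sigma) for s in signal]

rng = random.Random(0)  # fixed seed so the run is repeatable
tx = [rng.choice((-1.0, 1.0)) for _ in range(20000)]  # assumed test signal
rx = add_awgn(tx, snr_db=10.0, rng=rng)

# Estimate the SNR actually obtained from the noise samples.
noise = [r - t for r, t in zip(rx, tx)]
measured_noise_power = sum(n * n for n in noise) / len(noise)
measured_snr_db = 10 * math.log10(1.0 / measured_noise_power)
print(round(measured_snr_db, 1))
```

The measured value should lie close to the requested SNR, with the small deviation shrinking as the number of samples grows.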

Compared with AWGN channels, the impairments of the mobile radio channel cause the signal at

the receiver to be greatly distorted or to fade significantly. This fading is classified as a

non-additive signal disturbance and appears as time variation in the signal amplitude. Several

techniques are utilized to compensate for fading channel impairments. The main techniques

can be classified as equalization, channel coding and diversity, which are employed to

compensate for the signal distortions and improve the received signal quality [3]. This

thesis concentrates on the equalization technique.

Equalization techniques can be categorized into linear or nonlinear techniques depending on

the way the output of an adaptive equalizer is used for subsequent control of the equalizer.

The decision making device of the receiver processes the equalizer’s output and determines

the value of the digital data bit being received by applying a slicing or thresholding

operation (a nonlinear operation), which yields the reconstructed message data. If

this reconstructed data is not used in the feedback path to adapt the equalizer, the

equalization is linear; if, on the other hand, the decision making device feeds the reconstructed data

back in order to alter the equalizer’s subsequent outputs, the equalization is nonlinear [3]. If

the channels used are nonlinear, linear equalizers cannot reconstruct the transmitted signal.

There are various equalizer structures, among which the linear transversal equalizer (LTE) is the

most common. The simplest LTE, whose transfer function is a polynomial, uses only feed

forward taps and has many zeros but poles only at z = 0. This filter is called a finite impulse

response (FIR) filter or simply a transversal filter. In this type of equalizer, the filter

coefficients linearly weight the received signal’s current and past values before summing

them to produce the output of the equalizer.
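
The weighted sum performed by a transversal equalizer can be sketched as follows. The function name `transversal_filter`, the tap values and the two-tap channel response used as input are hypothetical and chosen only so that the arithmetic is easy to follow; the taps are the first three terms of the inverse of the assumed channel 1 + 0.5 z^-1.

```python
def transversal_filter(received, taps):
    """y[n] = sum_k taps[k] * received[n - k]; samples before n = 0 are taken as zero."""
    output = []
    for n in range(len(received)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * received[n - k]
        output.append(acc)
    return output

# Assumed example: a single pulse after a channel with impulse response [1.0, 0.5].
received = [1.0, 0.5, 0.0, 0.0]
# Truncated inverse of the assumed channel (feed-forward taps only).
taps = [1.0, -0.5, 0.25]

equalized = transversal_filter(received, taps)
print(equalized)
```

With only three taps the pulse is restored at n = 0 and the interference is pushed out to a small residual at n = 3, which a longer filter would reduce further.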

Moreover, some applications employ nonlinear equalizers since linear equalizers cannot deal

with a high amount of channel distortion. Linear equalizers do not perform well on channels

with deep spectral nulls in the passband: in attempting to compensate for the distortion,

they place too much gain at the frequencies of the nulls and thereby enhance the noise

present there. For these reasons, nonlinear equalizers are superior in performance to linear

equalizers on such channels. Three quite effective nonlinear methods which offer

improvements over linear equalization methods and that are used in 2G and 3G systems are:

1. Decision Feedback Equalization (DFE)

2. Maximum Likelihood Symbol Detection (MLSD)

3. Maximum Likelihood Sequence Estimation (MLSE) [4]

A large number of studies have been aimed at channel equalization using various

methods, techniques and algorithms. Recently, neural network based fuzzy technology has

been widely used as a powerful tool in channel equalization of various types of

signals. In this type of equalizer, experts determine the fuzzy rules by utilizing the

channel’s input-output data pairs. As part of this thesis, adaptive channel equalization

based on neural networks and employing a multilayer perceptron (MLP) has been developed,

which enables the equalization of Quadrature Amplitude Modulation (QAM) signals of various

levels. This has been achieved for both linear and nonlinear channels using a Nonlinear

Neuro-Fuzzy Equalizer (NNFE), with a relatively high adaptation speed and accurate equalizer

outputs, which has proven to be quite effective and practical.

The changeable fuzzy IF-THEN rules which configure the fuzzy adaptive filter are formed by

either human experts or the input-output pairs that are matched throughout a procedure of

adaptation. In this study, neural networks and fuzzy technology are used for the development

of a neuro-fuzzy equalizer for channel distortion of Quadrature Amplitude Modulation

(QAM) signals. Even though the QAM signal has a complex form which is composed of real

(in-phase) and imaginary (quadrature) parts, the complex signal is not directly applied to the

channel and equalizer, since the neuro-fuzzy filter used is based on real values and best suits

the signal processing that takes place in real multidimensional space. The modulation and

demodulation of M-ary QAM (where M = 4 and M = 16) are accomplished by splitting the stream

of data bits into the in-phase (I) and quadrature (Q) components. Gray coding is employed to

map the I and Q components together. The significant feature of this thesis is the

application of a ‘normalization’ method by which the modulated in-phase and quadrature QAM

signal is normalized to a maximum of one. Each component of the complex

signal is mapped to values between 0 and 1 by first shifting the values such that the minimum value

is zero and then scaling them such that the maximum value is 1. Each component is then input

to the channel and equalizer separately and denormalized separately at the equalizer’s output,

where the components are recombined to form the final desired complex QAM signal. The

normalization method is stable and therefore provides better BER and convergence performance,

together with more accurate equalizer outputs and a relatively small number of iterations

before the minimum error is attained.
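
The shift-and-scale normalization described above can be sketched for one 16-QAM symbol as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the axis levels -3, -1, +1, +3, the particular Gray code table, the assignment of the first two bits to I and the last two to Q, and all names (`GRAY_TO_LEVEL`, `map_16qam`, `normalize`, `denormalize`) are hypothetical.

```python
# Assumed Gray code for one axis of 16-QAM: neighbouring levels differ in one bit.
GRAY_TO_LEVEL = {(0, 0): -3.0, (0, 1): -1.0, (1, 1): 1.0, (1, 0): 3.0}
LEVEL_MIN, LEVEL_MAX = -3.0, 3.0

def map_16qam(bits):
    """Split a 4-bit group into an in-phase and a quadrature level."""
    i_level = GRAY_TO_LEVEL[(bits[0], bits[1])]
    q_level = GRAY_TO_LEVEL[(bits[2], bits[3])]
    return i_level, q_level

def normalize(x):
    """Shift so the minimum level is 0, then scale so the maximum level is 1."""
    return (x - LEVEL_MIN) / (LEVEL_MAX - LEVEL_MIN)

def denormalize(u):
    """Undo the scaling and the shift at the equalizer output."""
    return u * (LEVEL_MAX - LEVEL_MIN) + LEVEL_MIN

i_level, q_level = map_16qam((1, 0, 0, 1))
i_norm, q_norm = normalize(i_level), normalize(q_level)

# Each real component would pass through the channel and equalizer separately here.
recovered = complex(denormalize(i_norm), denormalize(q_norm))
print(i_norm, q_norm, recovered)
```

In this sketch the two normalized components are denormalized directly; in the system described in the text, each would first pass through the channel and the equalizer before being recombined.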

This thesis consists of five chapters:

Chapter 1 presents an overview of channel equalization. The state of application of neuro-fuzzy

systems and fuzzy logic as well as their properties and features are explained.

Chapter 2 explains channel equalization and the distortions and noise in the channel.

Mathematical models and formulas representing the channels and the nonlinear neuro-fuzzy

equalizer used in the thesis, together with its characteristics, are described.

Chapter 3 outlines the architecture and operation principles of the nonlinear neuro-fuzzy

network (NNFN). The learning algorithm used, the linguistic data about the target system and

numerical input-output relationships of NNFN are explained in detail. Fuzzy rule-based fuzzy

sets, the parameters and error calculations are analyzed.

Chapter 4 describes in detail the quadrature amplitude modulation (QAM) and its properties.

The application of QAM on NNFN and the features of the thesis design are explained. The

specific technique of normalization used in equalizing QAM signals and its mathematical

implementation are described.

Chapter 5 illustrates the simulation results of the equalization system demonstrating

graphically and statistically the performance of the equalization system. Bit error rate (BER)

versus signal-to-noise ratio (SNR) analysis is made in tabulated and graphical forms proving

the accuracy of the system. Comparisons between the channels and between the two

constellations of QAM are made to illustrate the performance of the equalizer, as well.

Conclusions are discussed at the end.

1.2 Overview

In order to accurately transmit the input signals from the transmitter to the receiver,

minimization and thus equalization of distortions in the channel is critical. This can be

successfully done by employing efficient equalization algorithms and techniques during the

transmission of the signals from the transmitter to the receiver. This chapter considers

methods used in channel equalization. Neural networks, fuzzy and neuro-fuzzy technologies

which form the basis of the adaptive channel equalization are analyzed and discussed.

1.3 The State of Application of Channel Equalization

Linear and nonlinear distortions are the main obstacles in transmitting the input signals to the

receiver of a communications system in their original state. These distortions, namely ISI and

noise, are caused in the channel and channel equalization is needed in order to transmit the

signals as accurately as possible. Even though both linear and nonlinear equalizers can be

used for this purpose, nonlinear equalizers are preferred because they are capable

of compensating for both linear and nonlinear channel distortions effectively.

Two types of equalization are used: sequence estimation and symbol detection. In

this thesis, the symbol detection technique is used to realize the adaptive channel equalization.

This technique maps the input baseband signal onto a feature space determined by a learnt

representation of a property of the transmitted signal. The symbols are

separated by decision regions, which serve to classify the distorted signal.

The ISI problem, which affects all digital communication systems, is mainly caused by

restricted bandwidth. When rectangular multilevel pulses are filtered improperly as they

pass through a communication system, they spread in time and are smeared into adjacent

time slots, causing ISI [2]. This ISI in turn causes errors

when transmitting data over the channel. Additionally, channel characteristics have a

significant role in causing distortions, and the response of the channel is time-variant, meaning that

channel characteristics are not known in advance. The time-variant channel response and the

unknown channel characteristics oblige equalizers to be designed so that they adjust themselves

to the channel response and adapt to its variations over time, so as to compensate

for the variations in the channel characteristics. Such equalizers are

called adaptive equalizers and they have been receiving great attention because of their

superior features. In practice, as an example, there are situations when the channel consists of

dial-up telephone lines and the channel transfer function changes from call to call. In such a

case, the equalizer should be an adaptive filter.

Adaptive equalizers are categorized as supervised and unsupervised equalizers. When it is

necessary to use a training sequence because of the unpredictable channel characteristics in a

communications system, supervised equalizers are employed. The training sequence allows the

equalizer output to be compared with the known transmitted input so that the parameters of the

equalizer can be updated. On the other hand, some communications systems do not allow a training

sequence to be transmitted. This is when unsupervised equalization is employed. This

equalization, which involves a self-recovery method, is also referred to as blind equalization [5].
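
For the supervised case, the use of a training sequence to update the equalizer parameters can be sketched with the least mean square (LMS) algorithm. This is only an illustration under stated assumptions: the two-tap channel, the number of taps, the step size and the name `lms_train` are hypothetical, and the thesis itself uses a neuro-fuzzy equalizer rather than this linear one.

```python
import random

def lms_train(received, training, num_taps, step):
    """Adapt FIR equalizer taps so that the output tracks the known training symbols."""
    taps = [0.0] * num_taps
    squared_errors = []
    for n in range(len(received)):
        # Current and past received samples; samples before n = 0 are zero.
        window = [received[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(c * x for c, x in zip(taps, window))
        e = training[n] - y  # error against the known transmitted symbol
        taps = [c + step * e * x for c, x in zip(taps, window)]
        squared_errors.append(e * e)
    return taps, squared_errors

rng = random.Random(1)
training = [rng.choice((-1.0, 1.0)) for _ in range(2000)]
channel = [1.0, 0.5]  # assumed noise-free dispersive channel
received = [sum(h * training[n - k] for k, h in enumerate(channel) if n - k >= 0)
            for n in range(len(training))]

taps, squared_errors = lms_train(received, training, num_taps=5, step=0.02)
early_mse = sum(squared_errors[:100]) / 100
late_mse = sum(squared_errors[-100:]) / 100
print(early_mse, late_mse)
```

The squared error averaged over the last iterations should be much smaller than over the first ones, which is the convergence behavior that the training sequence is meant to produce.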

Supervised equalization can be carried out by either sequence estimation or symbol

detection. Instead of decoding each received symbol on its own, a sequence estimator tests the

possible data sequences and selects the sequence that is most likely to have been

transmitted [4]. This sequence estimator is also referred to as the

maximum likelihood sequence estimator (MLSE).

Unsupervised or blind equalization is used when the signal has no memory, i.e. the signals

transmitted in successive symbol intervals are independent. In this case, each transmitted

symbol is detected separately. The constant modulus algorithm (CMA), introduced by Godard

[6] and Treichler [7], is a highly significant algorithm for blind equalization. Its

robustness and capability of converging before phase recovery made this algorithm very

successful [5]. Another algorithm called the multimodulus algorithm (MMA) [8,9] has

improved performance over CMA since it provides low steady-state mean-squared error

(MSE) in addition to removing the need for phase recovery in steady-state operation [9].

Additionally, hybrid blind equalization algorithms are different types of blind equalization

algorithms known for combining or augmenting existing cost functions to attain improved

performance [5].

Nonlinear equalizers are considered significant among signal processing techniques due to

both their superior performance and improved features compared with linear equalizers, in

addition to the wide variety they offer. One of those features is the ability to form nonlinear

decision boundaries, with the Bayesian equalizer determining the performance of these

equalizers. Decision Feedback Equalizers (DFEs) are one class of nonlinear equalizers with

relatively improved performance. Estimating and cancelling the ISI that an information

symbol induces on future symbols after it has been detected and decided upon forms the basis

of decision feedback equalization [4]. The DFE can possess two structures which are either

direct transversal or lattice structures. The direct form is made up of a feed forward filter

(FFF) and a feedback filter (FBF). A detector located between the FFF and the

FBF produces the decisions that are input to the FBF, and the coefficients of the FBF are

adjusted to eliminate the ISI on the current symbol caused by past detected symbols.

The remarkable feature of the DFE is its superiority over the linear transversal equalizer (LTE),

which is the most common equalizer structure. This superiority is due to its smaller minimum

mean squared error (MMSE) compared with that of the LTE. The difference arises when the

channel is severely distorted or exhibits nulls in its spectrum: the performance of an

LTE then degrades, and the MMSE, which is the basic

performance criterion of the DFE, is considerably better than that of the LTE.
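
The direct-form DFE described above, in which decisions from the detector drive the feedback filter, can be sketched as follows. The taps are fixed rather than adapted, and the two-tap channel, the tap values and the name `dfe_detect` are assumptions made for this illustration only.

```python
def dfe_detect(received, ff_taps, fb_taps):
    """Detect +/-1 symbols; ISI from past decisions is estimated and subtracted."""
    decisions = []
    for n in range(len(received)):
        # Feed forward filter acts on the received samples.
        ff = sum(c * received[n - k] for k, c in enumerate(ff_taps) if n - k >= 0)
        # Feedback filter acts on previously detected symbols only.
        fb = sum(b * decisions[n - 1 - j] for j, b in enumerate(fb_taps)
                 if n - 1 - j >= 0)
        z = ff - fb
        decisions.append(1.0 if z >= 0 else -1.0)  # detector (slicer)
    return decisions

symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0]
channel = [1.0, 0.9]  # assumed channel with strong ISI from the previous symbol
received = [sum(h * symbols[n - k] for k, h in enumerate(channel) if n - k >= 0)
            for n in range(len(symbols))]

# Feedback tap matched to the assumed channel's second coefficient.
decisions = dfe_detect(received, ff_taps=[1.0], fb_taps=[0.9])
print(decisions)
```

Because the feedback filter subtracts the ISI of already detected symbols without amplifying the noise, no large feed forward gain is needed here; in an adaptive DFE the taps would be updated from the decision error instead of being set by hand.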

The goal in designing a communications system is to transmit information to the receiver with

as little deterioration as possible and at the same time to satisfy the design constraints of allowed

signal bandwidth, transmitted energy and cost. In digital communications systems, the

probability of bit error (Pe), also called the bit error rate (BER), is generally taken to be the

measure of degradation and performance. In analog communications systems, the

signal-to-noise ratio (SNR) at the receiver output is generally the performance

criterion. In nonlinear channel equalization, it is important to attain a low mean square error

(MSE) and a high convergence rate in addition to a low BER. Training sequences are also an important

factor that determines the efficiency of a communications system. They are intended to be as

short as possible, which requires the adaptation process to end in as few iterations as possible.
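
The two performance measures named above can be computed as in the following minimal sketch. The function names and the short example sequences are hypothetical; real BER estimates require far longer sequences than the eight bits shown here.

```python
def bit_error_rate(sent_bits, detected_bits):
    """Fraction of detected bits that differ from the transmitted bits."""
    errors = sum(1 for s, d in zip(sent_bits, detected_bits) if s != d)
    return errors / len(sent_bits)

def mean_square_error(desired, actual):
    """Average squared difference between desired and actual equalizer outputs."""
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)

sent = [0, 1, 1, 0, 1, 0, 0, 1]
detected = [0, 1, 0, 0, 1, 0, 1, 1]  # two bits in error
ber = bit_error_rate(sent, detected)

desired = [1.0, -1.0, 1.0, -1.0]
actual = [0.5, -1.0, 1.0, -0.5]
mse = mean_square_error(desired, actual)
print(ber, mse)
```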

The application of linear equalizers to nonlinear channels does not yield the desired BER

performance since they are based on linear system theory and are used for equalization of

linear channels. Recently, neural networks and fuzzy technology have evolved into a powerful

tool in the equalization of nonlinear channel distortions.

1.4 State of Application of Neural Networks and Fuzzy Technologies for Channel Equalization

1.4.1 Design of neural network based equalizers

Nonlinear equalizers are capable of compensating for both nonlinear and linear channel

distortion. Adaptive nonlinear equalizers implementing neural network models have been used

extensively, primarily for noise cancellation, in various applications. A multilayer perceptron

(MLP) is one of the neural network structures used in neural network based

equalizers. MLP networks are feedforward neural networks having one or more layers

of neurons, known as hidden neurons, between the input and output neurons.
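
The feedforward structure of an MLP with one hidden layer can be sketched as below. The tanh activation, the linear output neuron, the weights and the name `mlp_forward` are assumptions made for this illustration; they do not describe the network used in the thesis.

```python
import math

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer with tanh neurons followed by a single linear output neuron."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Assumed example: two received samples as inputs and two hidden neurons.
inputs = [1.0, -1.0]
hidden_weights = [[1.0, 1.0], [1.0, -1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, 1.0]
output_bias = 0.0

output = mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias)
print(output)
```

In an equalizer, the inputs would be the current and past received samples and the output would be passed to a decision device; training would adjust the weights and biases.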

Filtering is the process of changing the relative amplitudes of the frequency components in a

signal or eliminating some frequency components completely in a variety of applications [10].

Assigning k information bits to the M = 2^k possible signal amplitudes, which can be carried

out in a number of ways, is called mapping or transformation. Generally, nonlinear

equalization includes a channel estimator since the channel information is not available at the

receiver end [12]. Filtering comprises two estimation procedures, one of them being the

mapping from the available samples and the other one being the estimation of the output of

the filter from the input by the realization of this mapping [11]. The mapping is more difficult

for a nonlinear filter than for a linear filter, but research continues on realizing the

mapping of nonlinear filters effectively.

1.4.2 Channel equalization by using fuzzy logic

Adaptive equalizers for nonlinear channels can be developed in a variety of effective ways.

Bayes’ probability theory [13] yields the optimal solution for a symbol-decision

equalizer, which is referred to as the Bayesian equalizer. Symbol-decision equalizers are

simpler and computationally less complex than the MLSE.

A channel estimate is not always necessary for them. They function as inverse filters [14], and

algorithms such as recursive least squares (RLS) or least mean squares (LMS) are employed to

adapt the filter. The adaptive filter finds the channel inverse and, in the presence of noise,

provides a linear decision boundary. In general, an optimal equalizer requires a decision

function that is inherently nonlinear. From this perspective, equalization is usually regarded

as a nonlinear classification problem, and for this reason the performance of linear

equalizers falls short of the optimum. This has motivated the search for nonlinear

equalizers providing a nonlinear decision function. Nonlinear equalizers

employing artificial neural networks (ANNs) [15], [16], [17] and radial basis function (RBF)

networks [15], [18], [19] were successfully developed. Nonlinear equalizers using ANN and

RBF networks were shown to provide superior performance to linear equalizers for channels

corrupted with ISI and AWGN [20]. However, the ANN equalizers suffered from poor

convergence, and although RBF equalizers provided the localized functional behavior required

by the optimal equalizer, their centers were difficult to train. These drawbacks prompted the

investigation of alternative nonlinear equalization techniques. As a result, a fuzzy

equalizer based on a fuzzy adaptive filter was suggested in [21], and a fuzzy system

related equalizer was offered in [22]. These equalizers were found to perform well,

but the Bayesian equalizer decision function could not be found, and the fuzzy adaptive

filter based equalizer additionally demanded high computational complexity.

Fuzzy logic is based on fuzzy rules that use input-output data pairs of the channel. This

type of adaptive equalizer operates by processing numerical data and linguistic information.

A fuzzy equalizer depends on fuzzy IF-THEN rules which are determined by human experts.

These rules use the channel’s input-output data pairs and carry out the construction of the

filter for the nonlinear channel. The bit error rate (BER) and adaptation speed can be improved by

using both the linguistic and the numerical information.

Digital communications involving quadrature amplitude modulation (QAM) can apply the

fuzzy filter with both linear and nonlinear channel characteristics as has been achieved in this

thesis. The present study proposes a complex fuzzy adaptive filter with changeable fuzzy IF-

THEN rules, which is an extension of the real fuzzy filter. The filter inputs and outputs are all

complex valued. However, the inputs of the channels are real reciprocals of the modulated

complex transmitted inputs and the equalizer outputs are real reciprocal estimates of the

reciprocal channel input signals. Afterwards, the reciprocal normalized equalizer outputs are

denormalized to form the final complex-valued, equalized estimate outputs of the receiver.

This technique which is primarily based on normalization and directly applied on the

transmitter, on the whole presents a new method to successfully equalize complex-valued

QAM signals which are severely distorted in both linear and hostile time-varying nonlinear

channel environments, by using real-valued reciprocals of the signals in question. In addition

to the methodology, the membership functions derived from the training data set and the

gradient-descent learning algorithm which trains the data set, represent a significant element

of the nonlinear neuro-fuzzy equalizer that is capable of this adaptive channel equalization. Its

superiority relies not only on its high equalization performance but also on its capability of

minimizing or eliminating the non-linear channel distortions that in general, linear equalizers

are not capable of doing. In turn, the fuzzy logic based neuro-fuzzy equalization is proven to

be an efficient equalizer on a complex scheme such as QAM with high approximation ability

in nonlinear problems in addition to the linear ones.

A fuzzy adaptive filter is based on a set of fuzzy IF-THEN rules whose function is to change

adaptively in order to minimize some criterion function as new information is available [35].

A recursive least squares (RLS) adaptation algorithm is used by a fuzzy adaptive filter.

The construction of RLS fuzzy adaptive filter is accomplished by the following four steps:

1) Defining fuzzy sets in the filter input space U ⊂ R^n whose membership functions

cover U;

2) Constructing a set of fuzzy IF-THEN rules that either human experts determine or the

adaptation procedure determines by matching input-output data pairs;

3) Constructing a filter that is based on the set of rules; and,

4) Updating the filter’s free parameters by utilizing the RLS algorithm.
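
The four steps above can be sketched for a scalar input as follows. This is a minimal illustration under stated assumptions rather than the filter of [35] or of this thesis: Gaussian membership functions, one rule per fuzzy set with a single adjustable consequent, a weighted-average output, the initial matrix P = 100 I, the stand-in target mapping x^2 and all function names are hypothetical choices.

```python
import math
import random

def fuzzy_basis(x, centers, width):
    """Steps 1 to 3: normalized firing strengths of the rules for scalar input x."""
    degrees = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(degrees)
    return [d / total for d in degrees]

def rls_step(theta, P, p, desired, forgetting=1.0):
    """Step 4: one RLS update of the rule consequents theta for the output theta . p."""
    size = len(p)
    Pp = [sum(P[i][j] * p[j] for j in range(size)) for i in range(size)]
    denom = forgetting + sum(p[i] * Pp[i] for i in range(size))
    gain = [v / denom for v in Pp]
    error = desired - sum(t * b for t, b in zip(theta, p))
    theta = [t + g * error for t, g in zip(theta, gain)]
    # P is symmetric, so the row vector p^T P equals Pp transposed.
    P = [[(P[i][j] - gain[i] * Pp[j]) / forgetting for j in range(size)]
         for i in range(size)]
    return theta, P

centers = [-1.0, -0.5, 0.0, 0.5, 1.0]  # fuzzy sets covering U = [-1, 1]
width = 0.5
theta = [0.0] * len(centers)
P = [[100.0 if i == j else 0.0 for j in range(len(centers))]
     for i in range(len(centers))]

def target(x):
    return x * x  # stand-in nonlinear input-output relation

rng = random.Random(2)
samples = [rng.uniform(-1.0, 1.0) for _ in range(300)]
for x in samples:
    theta, P = rls_step(theta, P, fuzzy_basis(x, centers, width), target(x))

def filter_output(x):
    return sum(t * b for t, b in zip(theta, fuzzy_basis(x, centers, width)))

initial_mse = sum(target(x) ** 2 for x in samples) / len(samples)  # all-zero filter
final_mse = sum((target(x) - filter_output(x)) ** 2 for x in samples) / len(samples)
print(initial_mse, final_mse)
```

Linguistic information could be incorporated in such a sketch by initializing `theta` from expert rules instead of zeros, so that RLS starts closer to the desired mapping.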

The fuzzy adaptive filter’s main advantage is the possibility of integrating linguistic

information (in the shape of fuzzy IF-THEN rules) and numerical information (in the shape of

input-output pairs) into the filter uniformly. When the fuzzy

adaptive filter is applied to equalization problems for nonlinear communication channels, the

following conclusions concerning the RLS and LMS algorithms are reached:

1) The RLS algorithm converges faster than the LMS algorithm.

2) Incorporating some linguistic description of the channel, in fuzzy terms,

into the fuzzy adaptive filter greatly enhances the adaptation speed of RLS.

3) The fuzzy equalizer’s bit error rate is quite close to the bit error rate of the

optimal equalizer.

4) The excess mean-square error of the RLS algorithm tends towards zero as the

number of iterations approaches infinity.

Development of neuro-fuzzy system in order to equalize channel distortion includes the

following steps:

-First, the methodologies utilized to equalize channel distortions are analyzed and state of

application problems of neural and fuzzy technologies for the development of an equalizer is

considered.

-Second, the data transmission structure is explained and the operation structure of adaptive

channel equalization utilizing neuro-fuzzy network is presented.

-Third, the mathematical model of the neuro-fuzzy network for the development of

equalization system for channel distortion is presented. The learning algorithm of neuro-

fuzzy system is considered.

-Fourth, the development of the neuro-fuzzy equalizer for channel distortion is presented.

-Fifth, the QAM signaling is explained and its application on nonlinear neuro-fuzzy network

is presented. The simulation results of the equalizer using QAM signals and analytical tables

demonstrating the performance of the equalizer are presented. Additionally, tables comparing

the different QAM constellations are presented.

1.5 Summary

In this chapter, the application of channel equalization is explained. The types of distortions in

channels and the types of equalizers used to minimize them are explained with their

classifications and properties. Performance criteria of equalizers, namely bit error rate (BER),

signal-to-noise ratio (SNR) and convergence rate with their ideal indications are stated.

Neural networks and fuzzy logic are particularly discussed and explained with their structures

and features. The methods of equalization using neural networks, specifically filtering is

described. Different types of algorithms, networks and equalizers used especially for difficult

nonlinear channels are defined.

Fuzzy IF-THEN rules which constitute the basis of fuzzy logic are described to point out their

significance in channel equalization. The steps of constructing a fuzzy adaptive filter using

these rules are defined. The methods used in equalizing QAM signals applied on neuro-fuzzy

network and the gradient-descent learning algorithm as part of the equalization system are

described as well.

CHAPTER 2

STRUCTURE OF CHANNEL EQUALIZATION

2.1 Overview

All communications systems are composed of three fundamental subsystems which are

transmitter, channel and receiver (Fig. 2.1). A transmitter’s task is to transmit information

signal through physical channel or transmission medium after converting it into a form which

is convenient for transmission. The receiver’s task, on the other hand, is to produce an

accurate replica of the transmitted symbol sequence by recovering the message signal that the

received signal contains. The communications channel acts as a connector between the

transmitter and the receiver sending the electrical signal from the transmitter to the receiver.

The unknown channel characteristics cause distortions to the transmitted signal before it

reaches the receiver.

Figure 2.1 Basic components of a communications system

Digital communications systems are preferred over analog ones due to

increasing demand for data communication and because digital transmission provides data

processing options and flexibilities that analog transmission cannot offer. The distinguishing

feature of a digital communications system is that it sends a waveform from a finite set of

possible waveforms during a finite interval of time as opposed to an analog communication

system that transmits a waveform from unlimited number of various waveforms which have

theoretically infinite resolution. The message from the source which is represented by an

information waveform is encoded before transmission so that transmission error can be

detected and corrected by the receiver. At the receiver end, the message signal must be

decoded before being used. The distortions preventing the correct transmission of signals are

mainly intersymbol interference (ISI) and noise. Noise refers to unwanted electrical

signals which exist in electrical systems. The equalization of the channel is an efficient technique

employed to reduce or eliminate the obscuring effect of distortion caused in the channel. This

chapter outlines the structure of data transmission system and the functions of its main

components as well as the equalization of channel distortion.

2.2 Architecture of Data Transmission Systems

A communications channel is an electrical medium which connects the transmitter and the

receiver, providing the data transmission from a source which generates the information to

one or more destinations. In the analysis and design of communication systems, the

characteristics of the physical channels through which the information is transmitted, are of

particular importance. Wire lines or free space may be used in the communications path from

the transmitter to the receiver. Examples of wire lines are coaxial cables, wire pairs and

optical fibers. These are widely used in terrestrial telephone networks, although infrared

and optical free space links, as in video, remote controls for TV and hi-fi equipment as well

as some security systems, may also be used in other situations. The

transmission medium is where most of the attenuation and noise is observed [23].

The receiver functions to reverse the signal processing steps performed by the transmitter

recovering the original message signal by compensating for any signal deteriorations caused

by the channel. This involves amplification, filtering, demodulation and decoding and in

general is a more complex task than the transmitting process.

There are many reasons why digital communication systems are preferred over analog

systems, even though digital communication systems (DCSs) represent an increase in complexity over the

equivalent analog systems. The principal advantages of DCSs, and the reasons they are the

preferred option instead of analog communication systems, can be listed as:

1. The ease with which digital signals, compared with analog signals, are regenerated.

2. Digital systems are not as prone to distortion and interference as analog systems.

3. Increased demand for data transmission.

4. Increased scale of integration, sophistication and reliability of digital electronics for

signal processing, combined with decreased cost.

5. Facility to source code for data compression.

6. Possibility of channel coding (line and error control coding) to minimize the effects of

noise and interference.

7. Ease with which bandwidth, power and time can be traded off in order to optimize the

use of these limited resources.

8. Standardization of signals, irrespective of their type, origin or the services they support, leading to an integrated services digital network (ISDN).

9. Digital hardware can be implemented more flexibly than analog hardware.

10. Various types of digital signals such as data, telephone, TV and telegraph can be

considered identical signals in transmission and switching [24].

Modulation, which is part of the transmission process, involves encoding information from a message source in a form that is convenient for transmission. It is accomplished by translating a baseband message signal to a bandpass signal at frequencies which are very high compared with the baseband frequency. It is also referred to as the mapping of the baseband input information waveform into the bandpass signal. The bandpass signal is referred to as the modulated signal and the baseband message signal is referred to as the modulating signal. Modulation can be accomplished by varying the amplitude, phase or frequency of a high frequency carrier in accordance with the amplitude of the message signal. Demodulation, on the other hand, is the process of extracting the baseband message from the carrier so that the intended receiver (also known as the sink) can process and interpret it. In digital wireless communication systems, the modulating signal can be represented as a time sequence of pulses or symbols, where each symbol has m finite states. Each symbol therefore represents n bits of information, where n = log2 m bits/symbol [4].
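The relation n = log2 m can be illustrated with a short Python sketch. The function names and the symbol rate used below are illustrative assumptions and do not come from the thesis.

```python
def bits_per_symbol(m: int) -> int:
    """Return n = log2(m) for a symbol with m states (m must be a power of two)."""
    if m < 2 or m & (m - 1):
        raise ValueError("m must be a power of two, at least 2")
    return m.bit_length() - 1


def bit_rate(symbol_rate: float, m: int) -> float:
    """Bit rate in bits/s for a given symbol rate (symbols/s) and m states."""
    return symbol_rate * bits_per_symbol(m)


# Illustrative constellation sizes (e.g. 4-QAM, 16-QAM, 64-QAM).
for m in (2, 4, 16, 64):
    print(m, "states ->", bits_per_symbol(m), "bits/symbol")
```

For a fixed symbol rate, doubling the number of bits per symbol doubles the bit rate, which is why larger constellations are more bandwidth efficient.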

A communications system can be described by the block diagram illustrated in Fig. 2.2. The data source is the signal generator that produces the information to be transmitted and modulated. This information is in the form of a message symbol that can consist of a single bit or a group of bits.

In order to make the transmission more efficient in terms of the time it takes and/or the bandwidth it requires, an encoder is employed as a signal processor that converts the digital source information into binary form, i.e. each symbol is encoded as a binary word. Encoding is also performed so that the signal processor in the receiver can detect and correct errors, which minimizes or eliminates the bit errors caused by noise in the channel.

The procedure used for detecting and correcting errors is called coding. Coding involves adding redundant (extra) bits to the stream of data. The redundant bits, such as parity bits, are used by the decoder to correct errors at the receiver output, although a high degree of redundancy may increase the bandwidth of the encoded signal. Codes can be classified into two broad categories: block codes and convolutional codes. The main difference is that a block coder is a memoryless device whereas a convolutional coder has memory. Hamming codes, Golay codes, Hadamard codes, cyclic codes, BCH (Bose-Chaudhuri-Hocquenghem) codes and Reed-Solomon codes are some examples of block codes. In addition to block codes and convolutional codes, a newer family of codes called turbo codes has come into use and is being incorporated in 3G wireless standards. Turbo codes combine the capabilities of convolutional codes with channel estimation theory and can be thought of as nested or parallel convolutional codes. When implemented properly, turbo codes allow coding gains which are far superior to those of all previous error correcting codes and permit a wireless communications link to come surprisingly near to realizing the Shannon capacity bound [4].
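As an illustration of how redundant parity bits allow the decoder to correct errors, the following Python sketch implements a Hamming (7,4) block code, one of the block codes named above. The bit ordering (parity bits at positions 1, 2 and 4) is one common convention and is an assumption here; it is not taken from the thesis.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one bit error and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
received = hamming74_encode(data)
received[4] ^= 1  # a single bit error introduced by the channel
print(hamming74_decode(received) == data)
```

Three redundant bits are added to every four data bits, so the encoded stream needs 7/4 of the original bit rate. This is the bandwidth cost of redundancy mentioned above.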

Each digital word has n binary digits, so there are M = 2^n unique code words, each of which corresponds to a certain amplitude level. However, each sample value of the analog signal can be any one of an infinite number of levels, so the digital word that represents the amplitude closest to the actual sampled value is used. This is known as quantizing [2]. In this thesis study, Gray coding was used to map bits along the in-phase and quadrature axes of the QAM constellation. The Gray code was selected because only one bit changes for each step change in the quantized level. Multisymbol signaling can be thought of as a coding or bit mapping process in which n binary symbols (bits) are mapped into a single M-ary symbol. A detection error in a single symbol can therefore translate into several errors in the corresponding decoded bit sequence. The bit error rate (BER) therefore depends not only on the probability of symbol error and the symbol entropy but also on the code or bit mapping used and the types of error which occur. The most probable symbol errors are those between adjacent states; if a Gray code is used to map binary symbols to phasor states, this type of error results in only a single decoded bit error [23]. Consequently, single errors in the receiver will cause minimal errors in the recovered level.

[Block diagram. TRANSMITTER: data source, encoder, filter, modulator. Physical channel with AWGN. RECEIVER: demodulator, filter, equalizer, decision device, decoder.]

Figure 2.2 Architecture of a digital communications system [39]
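The single-bit-change property of the Gray code can be checked with a minimal Python sketch. The choice of four levels per axis (as on one axis of a 16-QAM constellation) and the function names are assumptions made for illustration.

```python
def binary_to_gray(b: int) -> int:
    """Convert a natural binary index to its Gray code label."""
    return b ^ (b >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def bit_errors(a: int, b: int) -> int:
    """Number of bit positions in which two labels differ."""
    return bin(a ^ b).count("1")


levels = 4  # assumed: amplitude levels along one axis of a 16-QAM constellation
gray_labels = [binary_to_gray(i) for i in range(levels)]

for i in range(levels - 1):
    print(
        f"level {i} -> {i + 1}: "
        f"natural binary {bit_errors(i, i + 1)} bit(s), "
        f"Gray {bit_errors(gray_labels[i], gray_labels[i + 1])} bit(s)"
    )
```

With natural binary labels, mistaking level 1 for level 2 flips two bits (01 to 10), whereas with Gray labels every adjacent-level error flips exactly one bit.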

Many criteria are used to evaluate the performance of a communications system. For digital systems, the optimum system is the one that minimizes the bit error rate (BER) at the receiver output subject to constraints on channel bandwidth and transmitted energy. This raises the question of whether a system can be devised with no bit errors at the output even when there is noise in the channel. Shannon demonstrated in 1948 that a channel capacity C (bits/s) can be calculated such that, if the rate of information is less than C, the probability of error can approach zero.

In this case, the maximum possible bandwidth efficiency η_Bmax, which is defined as the capability of a modulation scheme to accommodate data within a limited bandwidth, is restricted by the channel noise and is stated by the channel capacity formula in Eq. 2.1. Shannon’s channel capacity formula is applicable to AWGN and is given by

$$\eta_{B\max} = \frac{C}{B} = \log_2\left(1 + \frac{S}{N}\right) \quad \text{or} \quad C = B \log_2\left(1 + \frac{S}{N}\right) \tag{2.1}$$

in which C is the channel capacity (bits per second), B is the transmission bandwidth, S is the average signal power and N is the average power of the white Gaussian noise within the bandwidth B. S/N is called the signal-to-noise ratio. Shannon also showed that the errors induced by a noisy channel can be reduced to any desired level by encoding the information properly, without sacrificing the rate of information transfer.
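Eq. 2.1 can be evaluated numerically with the following Python sketch. The bandwidth and signal-to-noise ratio in the example are illustrative values chosen here, not figures from the thesis.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert a power ratio from decibels to a linear ratio."""
    return 10 ** (value_db / 10)


def channel_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon capacity C = B * log2(1 + S/N) in bits/s (Eq. 2.1)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def max_bandwidth_efficiency(snr_linear: float) -> float:
    """Maximum bandwidth efficiency C/B = log2(1 + S/N) in bits/s/Hz."""
    return math.log2(1 + snr_linear)


# Assumed example: a 3 kHz channel with a 30 dB signal-to-noise ratio.
bandwidth = 3000.0
snr = db_to_linear(30.0)
print(f"C = {channel_capacity(bandwidth, snr):.0f} bits/s")
print(f"C/B = {max_bandwidth_efficiency(snr):.2f} bits/s/Hz")
```

Because the capacity grows only logarithmically with S/N but linearly with B, increasing the bandwidth is the more effective way to raise capacity when it is available.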

The physical medium, or channel, through which the message signal is transmitted introduces distortions such as intersymbol interference (ISI) and noise. The receiver, on the other hand, is responsible for separating the source information from the received modulated signal, which is corrupted by noise that is usually modeled as random additive white Gaussian noise (AWGN). The receiver takes the corrupted signal at the output of the channel and converts it to a baseband signal that the baseband processor can handle. The baseband processor removes or minimizes the corruption in this signal and delivers an estimate of the source information to the output of the communications system [2]. Demodulation is applied to the received signal in order to recover the transmitted signal in its baseband form and make it ready to be processed by the receiver filter. Finally, the decision device reconstructs the encoded message signal from the output of the equalizer, and the decoder reconstructs the sequence of transmitted signals by performing the reverse operation of the encoder.

2.3 Channel Characteristics

Each channel has a frequency band that is appropriate for its transmission medium. The transmitter circuit converts the processed baseband signal into this frequency band. If the channel is a fiber-optic cable, for example, the carrier circuits convert the baseband input to light frequencies and the transmitted signal is light.

Channels are classified as wire and wireless channels. Examples of wire channels are coaxial cables, fiber-optic cables, twisted-pair telephone lines and waveguides, whereas air, vacuum and seawater are examples of wireless channels.

Channels may introduce constraints that favor a particular type of signaling. In general, the channel attenuates the signal so that the noise of the channel, or the noise produced by an imperfect receiver, causes the delivered information to deteriorate from that of the source [2]. Noise has various sources, which may be natural electrical disturbances such as lightning or artificial sources such as ignition systems in cars, switching circuits in a digital computer or high voltage transmission lines. The channel may contain amplifying devices, such as satellite transponders in space communication systems or repeaters in telephone systems, that keep the signal above the noise level. In addition to noise, multiple paths may arise between the input and output of the channel, each with its own attenuation characteristics and time delay. The attenuation characteristics may vary with time, which makes the signal fade at the channel output. Fading of this type can be observed while listening to distant shortwave stations.

Another significant characteristic of channels is bandwidth. In general terms, bandwidth is defined to be the width of a positive frequency band of waveforms whose magnitude spectra are even about the origin f = 0. The bandwidth of a channel must be large enough to accommodate the signal while rejecting the noise. High bandwidth allows more users to be assigned as well as more information to be transmitted. Telephone channels and digital microwave radio channels are examples of band limited channels. When the channel is band limited to W Hz, any frequency components above W will not be passed by the channel. In turn, the bandwidth of the transmitted signal will be limited to W Hz as well. When the channel is not ideal for |f| ≤ W, signal transmission at a symbol rate equal to or exceeding W results in intersymbol interference (ISI) among a number of adjacent symbols. In addition to telephone channels, there are other physical channels which exhibit some form of time dispersion and thus introduce ISI. Radio channels such as shortwave ionospheric propagation (HF) and tropospheric scatter are two examples of time-dispersive channels. In these channels, time dispersion, and hence ISI, is the consequence of multiple propagation paths that have different path delays [1]. In addition to noise, multipath propagation and ISI, channels suffer from other impairments, specifically nonlinear distortion, frequency offset and phase jitter. Channel impairments affect the transmission rate over the channel and the modulation technique to be used. Depending on the rate, bandwidth efficient modulation techniques are employed together with some form of equalization.

2.4 Channel Distortions

Channels which are used to transmit data distort signals in both amplitude and phase. In addition to the nature of the channel itself, linear distortion, nonlinear distortion and frequency offset contribute significantly to this degradation.

Linear distortion occurs in linear time-invariant systems in which channels are characterized as band-limited linear filters. Channels of this kind, such as telephone channels, are part of digital communications systems where distortionless transmission is highly desired. A linear time-invariant system can produce two types of linear distortion: amplitude distortion and phase distortion. For distortionless transmission through a linear time-invariant system, the transfer function of the channel must be given by

$$H(f) = \frac{Y(f)}{X(f)} = A e^{-j 2\pi f T_d} \tag{2.2}$$

which means that in order to have no distortion at the system output, the following

requirements have to be met:

1. Flat amplitude response. That is,

$$|H(f)| = \text{constant} = A \tag{2.3a}$$

2. A phase response that is a linear function of frequency. That is,

$$\theta(f) = \angle H(f) = -2\pi f T_d \tag{2.3b}$$

When the first condition is satisfied, no amplitude distortion exists and when the second condition is satisfied, no phase distortion exists. The second requirement is related to the time delay of the system, which is defined as

$$T_d(f) = -\frac{1}{2\pi f}\,\theta(f) = -\frac{1}{2\pi f}\,\angle H(f) \tag{2.4}$$

and it is required that

$$T_d(f) = \text{constant} \tag{2.5}$$

for distortionless transmission. If T_d(f) is not constant, there is phase distortion since the phase response, θ(f), is not a linear function of frequency.
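The two conditions for distortionless transmission can be checked numerically. The Python sketch below compares a channel of the form given in Eq. 2.2 with a hypothetical channel that has an extra quadratic phase term; the gain, delay, phase coefficient and test frequencies are all assumed values chosen so that the phase stays within (-π, π] and no phase unwrapping is needed.

```python
import cmath
import math

GAIN = 0.5      # assumed flat gain A
DELAY = 0.5e-3  # assumed delay T_d in seconds


def distortionless_channel(f: float) -> complex:
    """H(f) = A * exp(-j*2*pi*f*T_d), the form required by Eq. 2.2."""
    return GAIN * cmath.exp(-2j * math.pi * f * DELAY)


def dispersive_channel(f: float, b: float = 5e-6) -> complex:
    """Hypothetical channel whose phase also contains a quadratic term b*f^2."""
    return GAIN * cmath.exp(-1j * (2 * math.pi * f * DELAY + b * f * f))


def time_delay(h: complex, f: float) -> float:
    """T_d(f) = -theta(f) / (2*pi*f), as in Eq. 2.4 (f must be non-zero)."""
    return -cmath.phase(h) / (2 * math.pi * f)


freqs = [50.0, 100.0, 200.0, 400.0]
flat_delays = [time_delay(distortionless_channel(f), f) for f in freqs]
disp_delays = [time_delay(dispersive_channel(f), f) for f in freqs]

print("delay spread, linear phase:   ", max(flat_delays) - min(flat_delays))
print("delay spread, nonlinear phase:", max(disp_delays) - min(disp_delays))
```

Both channels have a flat amplitude response, so only the second one exhibits distortion, and that distortion is purely phase distortion.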

Nonlinear distortion in telephone channels arises from nonlinearities in amplifiers and compandors used in the telephone system. This type of distortion is usually small and it is very difficult to correct [1]. There will be nonlinear distortion on the output signal if the voltage gain coefficients of second and higher order are not zero. Three types of nonlinear distortion are associated with amplifiers: harmonic distortion, intermodulation distortion (IMD) and cross-modulation distortion. Harmonic distortion appears at the amplifier output as components at integer multiples of the input frequency, which are generated by the second and higher order terms of the amplifier input-output equation. Intermodulation distortion is produced by the cross-product terms of the amplifier input-output equation, whereas cross-modulation distortion is caused by the third order distortion products of the amplifier output.
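As a short worked example, assume the amplifier input-output equation is written as a power series with voltage gain coefficients K_0, K_1, K_2, ... (this notation is an assumption and is not defined in the thesis):

$$v_{out} = K_0 + K_1 v_{in} + K_2 v_{in}^2 + K_3 v_{in}^3 + \cdots$$

For a single-tone input $v_{in} = A\cos\omega_0 t$, the second order term gives

$$K_2 v_{in}^2 = \frac{K_2 A^2}{2}\left(1 + \cos 2\omega_0 t\right)$$

which contains a component at twice the input frequency, i.e. second harmonic distortion. For a two-tone input $v_{in} = A_1\cos\omega_1 t + A_2\cos\omega_2 t$, the same term also produces the cross-product

$$2 K_2 A_1 A_2 \cos\omega_1 t \cos\omega_2 t = K_2 A_1 A_2\left[\cos(\omega_1 - \omega_2)t + \cos(\omega_1 + \omega_2)t\right]$$

whose sum and difference frequencies are intermodulation products.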

In addition to linear and nonlinear distortions, signals transmitted through telephone channels are subject to the impairment of frequency offset. A small frequency offset, usually less than 5 Hz, results from the use of carrier equipment in the telephone channel. High-speed digital transmission systems that use synchronous phase-coherent demodulation cannot tolerate this type of offset, so it is compensated for by the carrier recovery loop in the demodulator.

Phase jitter is basically a low-index frequency modulation of the transmitted signal with the low frequency harmonics of the power line frequency. Phase jitter poses a serious problem in high-rate digital transmission. Yet it can be tracked and compensated for, to some extent, at the demodulator.

Distortion can occur within the transmitter, the receiver and the channel. As opposed to noise and interference, distortion disappears when the signal is turned off.

2.4.1 Multipath propagation

Multipath fading occurs to varying extents in many different radio applications. It arises whenever radio energy reaches the receiver by more than one path. Multiple paths may occur due to ground reflections, reflections from stable tropospheric layers and refraction by tropospheric layers with extreme refractive index gradients [23]. Scattering obstacles also cause multipath propagation in other systems such as urban cellular radio systems.

Multipath propagation has two principal effects on systems, and their relative severity depends essentially on the bandwidth of the resulting channel relative to that of the signal being transmitted. For fixed point systems such as the microwave radio relay network, the fading process is governed by changes in atmospheric conditions. The path delay spread is often short enough for the channel frequency response to be essentially constant over its operating bandwidth. In that case, the fading is considered flat because all signal frequency components are subject to the same fade at any given instant. When the path delay spread is longer, the channel frequency response is likely to change rapidly on a frequency scale comparable with the signal bandwidth. In that case, the fading is considered frequency selective and the received signal is subject to severe amplitude and phase distortion. Adaptive equalizers may then be required to flatten and linearize the overall channel characteristics. The effects of flat fading can be combated by increasing the transmitter power whereas the effects of frequency selective fading cannot. For microwave links which are subject to flat fading, a fade margin is usually designed into the link budget to offset the expected multipath fades. The magnitude of this margin depends on the required availability of the link.
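The difference between flat and frequency selective fading can be illustrated with a two-ray channel, which is a simplification assumed here and not a model taken from the thesis. The Python sketch computes how much the channel gain varies across the signal bandwidth for a short and a long path delay difference; the bandwidth, delays and echo amplitude are illustrative values.

```python
import cmath
import math


def two_ray_response(f: float, delay: float, echo: float = 0.5) -> complex:
    """H(f) = 1 + echo * exp(-j*2*pi*f*delay): direct path plus one delayed echo.

    f is the frequency offset from the carrier in Hz (baseband-equivalent view).
    """
    return 1 + echo * cmath.exp(-2j * math.pi * f * delay)


def gain_variation_db(bandwidth: float, delay: float, points: int = 201) -> float:
    """Difference between the largest and smallest channel gain in the band (dB)."""
    gains = []
    for i in range(points):
        f = -bandwidth / 2 + bandwidth * i / (points - 1)
        gains.append(20 * math.log10(abs(two_ray_response(f, delay))))
    return max(gains) - min(gains)


signal_bandwidth = 1e6  # assumed 1 MHz signal
print("short delay (10 ns):", round(gain_variation_db(signal_bandwidth, 10e-9), 3), "dB")
print("long delay (2 us):  ", round(gain_variation_db(signal_bandwidth, 2e-6), 3), "dB")
```

With the short delay the gain is practically constant over the band (flat fading), whereas with the long delay the band contains both peaks and notches of the response (frequency selective fading), which extra transmitter power alone cannot remove.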

In time-dispersive channels, multiple propagation paths that have different path delays cause time dispersion and ISI. These channels are called time-variant multipath channels because the number of paths and the relative time delays among the paths vary with time. The time-variant multipath conditions give rise to a wide variety of frequency response characteristics, so the frequency response characterization that is used for telephone channels is inappropriate for time-variant multipath channels. Instead, these radio channels are characterized statistically by the scattering function. The scattering function is a two-dimensional representation of the average received signal power as a function of relative time delay and Doppler frequency.

2.4.2 Intersymbol interference

Rectangular pulse signaling, in principle, has a spectral efficiency of 0 bits/s/Hz since each

rectangular pulse has infinite absolute bandwidth. In practice, of course, rectangular pulses

can be transmitted over channels with finite bandwidth if a degree of distortion can be

tolerated.

In digital communications, it might appear that distortion is unimportant since a receiver must only distinguish between pulses which have been distorted in the same way. However, if the pulses are filtered improperly as they pass through a communications system, i.e. if the distortion is severe enough, they will spread in time. With a restricted bandwidth, the pulses have rounded tops instead of flat ones. The decision instant voltage might then arise not only from the current symbol but also from one or more preceding pulses. Intersymbol interference (ISI) occurs when the pulse for each symbol is smeared into adjacent time slots. What matters about ISI is the decision instant, which is the sampling instant (or sampling point) within each time slot at which the received waveform is sampled to decide the symbol. At this instant, the smearing causes unwanted contributions from the adjacent pulses that are likely to degrade bit error rate (BER) performance. This highlights an important point: the performance of digital communications systems depends only on the ISI at the decision instants. ISI that occurs at times other than the decision instants does not matter [23].

If the signal pulses could be persuaded to pass through zero (i.e. to cross the time axis) at every decision instant except one, then ISI would no longer be a problem. This suggests a definition for an ISI-free signal: a signal that passes through zero at all but one of the sampling instants is an ISI-free signal [23].
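The definition of an ISI-free signal can be illustrated with a minimal Python sketch. A sinc pulse is a standard example of a pulse that is zero at all sampling instants except its own; the Gaussian-shaped pulse used for comparison is a hypothetical smeared pulse introduced here only for contrast.

```python
import math


def sinc_pulse(t: float, T: float = 1.0) -> float:
    """sin(pi*t/T) / (pi*t/T): equals 1 at t = 0 and 0 at every other multiple of T."""
    x = t / T
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def smeared_pulse(t: float, T: float = 1.0) -> float:
    """Hypothetical pulse that has spread into the neighbouring time slots."""
    return math.exp(-((t / T) ** 2))


def isi_samples(pulse, T: float = 1.0, span: int = 5):
    """Pulse values at the decision instants of the neighbouring symbols."""
    return [pulse(n * T, T) for n in range(-span, span + 1) if n != 0]


print("worst ISI sample, sinc pulse:   ", max(abs(v) for v in isi_samples(sinc_pulse)))
print("worst ISI sample, smeared pulse:", max(abs(v) for v in isi_samples(smeared_pulse)))
```

The sinc pulse is far from zero between the sampling instants, but this does not matter because only the values at the decision instants affect the decisions.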

While transmitting information with pulses over an analog channel, the original signal is a discrete time sequence (or an acceptable approximation) whereas the received signal is a continuous time signal. The channel can be considered a low-pass analog filter which smears or spreads the impulse train into a continuous signal whose peaks are related to the amplitudes of the original pulses. Mathematically, the operation can be described as the convolution of the pulse sequence with the continuous time channel response. The starting point is the convolution integral:

$$x(k) = h(k) * s(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau = \int_{-\infty}^{\infty} s(\tau)\, h(k-\tau)\, d\tau \tag{2.6}$$

where x(k) denotes the received signal, h(k) denotes the channel impulse response and s(k) denotes the input signal. The second integral on the right-hand side of the above equation illustrates the commutativity property of the convolution operation.

The input s(k) is a pulse train comprised of periodically transmitted impulses of varying amplitudes, so that

$$s(k) = \begin{cases} 0, & k \neq nT \\ s_n, & k = nT \end{cases} \tag{2.7}$$

where T is the symbol period. This means that the only significant values of the variable of integration in Eq. 2.6 are those for which τ = nT. Any other value of τ amounts to multiplication by 0, and for that reason x(k) can be stated as

$$x(k) = \sum_{n} s_n\, h(k - nT) \tag{2.8}$$

Although Eq. 2.8 resembles the convolution sum, it nevertheless describes a continuous time system. It shows that the received signal is the sum of a large number of shifted and scaled impulse responses of the continuous time system. The impulse responses are scaled by the amplitudes of the transmitted pulses s(k).

When Eq. 2.8 is evaluated at the decision instant of the Nth symbol, the term with n = N is the component of x(k) due to the Nth symbol, and it is multiplied by the centre tap of the channel impulse response. The other product terms in the summation are ISI terms: the input pulses in the neighborhood of the Nth symbol are scaled by the corresponding samples in the tails of the channel impulse response.
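The split of a received sample into the desired term and the ISI terms can be sketched in Python by evaluating Eq. 2.8 at the decision instant k = NT. The symbol sequence and the channel samples h(mT) below are hypothetical values chosen for illustration.

```python
def received_sample(symbols, taps, N):
    """Evaluate x(NT) = sum_n s_n * h((N - n)T) and split it into two parts.

    symbols: list of transmitted amplitudes s_n
    taps:    dict mapping the offset m (in symbol periods) to h(mT); taps[0] is
             the centre tap, the other entries are the tails of the response
    Returns (desired, isi).
    """
    desired = symbols[N] * taps[0]
    isi = sum(
        symbols[n] * taps[N - n]
        for n in range(len(symbols))
        if n != N and (N - n) in taps
    )
    return desired, isi


# Hypothetical binary symbols and a channel with one precursor and one postcursor.
symbols = [1, -1, 1, 1, -1]
taps = {-1: 0.2, 0: 1.0, 1: 0.3}

for N in range(len(symbols)):
    desired, isi = received_sample(symbols, taps, N)
    print(f"N={N}: desired={desired:+.1f}  ISI={isi:+.1f}  x={desired + isi:+.1f}")
```

Here the ISI never exceeds the centre-tap contribution, so the sign of each sample is still correct; with larger tail samples the ISI alone could cause decision errors even without noise.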

2.4.3 Noise

In communications systems, the received waveform is usually divided into the desired part, which contains the information, and the extraneous or unwanted part. The desired part is the signal and the unwanted part is the noise. Noise limits our ability to communicate and increases the power needed to transmit information. The effects of noise can be reduced by increasing the power of the transmitted signal. Yet equipment and various other practical limitations restrict the power level of the transmitted signal.

The most frequently encountered problem in the transmission of signals through any channel is additive noise. It is generally generated internally at the receiver end by components such as resistors and solid-state devices employed in the implementation of the communications system, and is sometimes referred to as thermal noise. Thermal noise is produced by the random motion of free charge carriers (usually electrons) in a resistive medium. As in the case of a radio or telephone communication system, the readback signal of a storage system is usually corrupted by additive noise generated by the electronic components. When such noise occupies the same frequency band as the desired signal, suitable design of the transmitted signal and of its demodulator at the receiver can minimize its effect [23].

Another problem in transmission is non-thermal noise, also known as shot noise. Although the time averaged current flowing in a device may be constant, statistical fluctuations will be present if individual charge carriers have to pass through a potential barrier. The potential barrier may, for example, be the junction of a PN junction diode, the cathode of a vacuum tube or the emitter-base junction of a bipolar transistor. Such statistical fluctuations constitute shot noise.

Noise that arises from external sources can be coupled into a communication system by the receiving antenna. Antenna noise originates from several different sources. Below 30 MHz it is dominated by the broadband radiation produced in lightning discharges associated with thunderstorms. This radiation is trapped by the ionosphere and propagates worldwide. Such noise is sometimes referred to as atmospheric noise.

Noise can be classified into categories as:

a. White noise: A stochastic process which has a flat power spectral density over the entire frequency range. Because of its wideband character, this type of noise cannot be expressed in terms of quadrature components. For problems concerned with the demodulation of narrowband signals in noise, however, it is mathematically convenient to model the additive noise process as white and to represent the noise in terms of quadrature components. This can be accomplished by assuming that the signals and noise at the receiver have passed through an ideal bandpass filter whose passband includes the spectrum of the signals but is much wider. The noise that results from passing the white noise process through a spectrally flat bandpass filter is referred to as bandpass white noise.

b. Electromagnetic Noise: Usually found in electrical devices such as television and radio transmitters and receivers. It can be present at all frequencies.

c. Impulse Noise: An additive disturbance which arises primarily from the switching

equipment in the telephone system. It is made up of short-duration pulses having

random duration and amplitude.

d. Acoustic Noise: Present in almost all conversations; it limits telecommunications environments such as telephone circuits and hands-free telephones. It may be unnoticeable or distinct, depending on the time delay involved. If the delay between the speech and its echo (noise) is short, the echo is not heard as a separate sound but is perceived as a form of spectral distortion referred to as reverberation. If, however, the delay exceeds a few tens of milliseconds, the noise is distinctly noticeable [25]. Background noise generated in a car cabin, by air conditioners and by computer fans are some types of acoustic noise.

e. Processing Noise: Modeled as a zero-mean, white-noise process in data communication systems. It results from the digital or analog processing of signals, e.g. lost data packets in digital data communications systems or quantization noise in the digital coding of image or speech.

f. Colored Noise: Non-white noise, i.e. wideband noise whose spectrum is not constant. Autoregressive noise and brown noise are some examples of colored noise.
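The difference between white and colored noise in the list above can be made concrete with a short Python sketch. White Gaussian samples are uncorrelated from one sample to the next, whereas autoregressive noise, generated here by passing the white samples through a first order recursion, is strongly correlated. The recursion coefficient, the sample count and the random seed are arbitrary assumptions.

```python
import random


def white_gaussian(n: int, sigma: float = 1.0, seed: int = 0):
    """n samples of zero-mean white Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def autoregressive_noise(white, rho: float = 0.9):
    """First order autoregressive (colored) noise: y[k] = rho * y[k-1] + w[k]."""
    out, prev = [], 0.0
    for w in white:
        prev = rho * prev + w
        out.append(prev)
    return out


def lag1_correlation(x):
    """Normalized correlation between neighbouring samples."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


w = white_gaussian(20000)
c = autoregressive_noise(w)
print("lag-1 correlation, white:  ", round(lag1_correlation(w), 3))
print("lag-1 correlation, colored:", round(lag1_correlation(c), 3))
```

A correlation near zero corresponds to a flat spectrum, while a correlation near the recursion coefficient corresponds to a spectrum concentrated at low frequencies.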

Gaussian noise, and specifically additive white Gaussian noise (AWGN), is the most frequently encountered type of noise in communication systems. It leads to the simplest mathematical model for a communication channel. The channel models listed below are used to investigate the effects of noise on electrical communication and capture the most important characteristics of the transmission channels.

2.4.3.1 The additive noise channel

Contaminating noise in signal transmission usually has an additive effect, in the sense that noise often adds to the information bearing signal at various points between the source and the destination. In the mathematical model of this channel, shown in Fig. 2.3, the transmitted signal s(k) is corrupted by a random additive noise process n(k). The additive noise is white when the random process has a power spectral density (PSD) which is constant over all frequencies, and the model becomes the most often assumed one, additive white Gaussian noise (AWGN), when the noise also has a Gaussian distribution. AWGN has a uniform continuous frequency spectrum over a particular frequency band, and this model is applied to the majority of physical communication channels since it is mathematically tractable.

[Block diagram: the channel adds the noise n(k) to the input s(k), giving x(k) = s(k) + n(k)]

Figure 2.3 The additive Gaussian noise channel [39]
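The additive noise channel of Fig. 2.3 can be simulated with a few lines of Python. The sketch below assumes a real-valued signal and sets the noise variance from a target signal-to-noise ratio; the function names, the test signal and the seed are illustrative assumptions.

```python
import math
import random


def awgn_channel(s, snr_db: float, seed: int = 0):
    """Return x(k) = s(k) + n(k), with white Gaussian n(k) scaled to the target SNR."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in s) / len(s)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in s]


def measured_snr_db(s, x) -> float:
    """SNR estimated from the transmitted samples s and the received samples x."""
    signal_power = sum(v * v for v in s) / len(s)
    noise_power = sum((b - a) ** 2 for a, b in zip(s, x)) / len(s)
    return 10 * math.log10(signal_power / noise_power)


s = [1.0, -1.0] * 5000  # assumed binary antipodal test signal
x = awgn_channel(s, snr_db=10.0)
print("measured SNR:", round(measured_snr_db(s, x), 2), "dB")
```

The measured value fluctuates slightly around the target because the noise power is estimated from a finite number of samples.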

2.4.3.2 The linear filter channel

Filtering is an operation which extracts information about a quantity of interest at time t from noisy data by using data measured up to and including t. A filter is considered linear when the filtered, smoothed or predicted quantity at the filter output is a linear function of the observations applied to the filter input [25].

Linear filter channels are those in which filters ensure that the transmitted signals stay within specified bandwidth limitations and do not interfere with each other. The mathematical model including the additive noise is illustrated in Fig. 2.4, in which s(k) is the channel input and the channel output is represented as

$$x(k) = s(k) * h(k) + n(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau + n(k) \tag{2.9}$$

in which h(τ) is the linear filter impulse response and * denotes convolution.

[Block diagram: inside the channel, the input s(k) passes through the linear filter h(k) and the noise n(k) is added, giving x(k) = s(k) ∗ h(k) + n(k)]

Figure 2.4 Linear filter channel with additive noise [39]

When attenuation is applied to the signal while being transmitted, the received signal becomes

x(k)=αs(k)+n(k) (2.10)

where α is the attenuation factor.

2.4.3.3 The linear time-variant filter channel

Mobile systems such as a moving vehicle and wireless channels such as radio channels give rise to multipath propagation, which results in time-varying fading signals because the frequency response characteristics of these channels are time-variant. The time-varying characteristics of the mobile channel necessitate a channel equalizer which continuously adapts to them, effectively implementing a filter which is matched to these characteristics. Such time-variant linear filters are characterized by a time-variant channel impulse response h(τ;k), which is the response of the channel at time k to an impulse applied at time k-τ, where τ stands for the elapsed-time variable. For the linear time-variant filter channel with additive noise, the channel output signal when s(k) is the input becomes

$$x(k) = s(k) * h(\tau;k) + n(k) = \int_{-\infty}^{\infty} h(\tau;k)\, s(k-\tau)\, d\tau + n(k) \tag{2.11}$$

in which the time-variant impulse response has the following representation

$$h(\tau;k) = \sum_{n=1}^{L} a_n(k)\, \delta(\tau - \tau_n) \tag{2.12}$$

where the {a_n(k)} denote the possibly time-variant attenuation factors for the L multipath propagation paths. Substituting Eq. 2.12 into Eq. 2.11 gives the received signal

$$x(k) = \sum_{n=1}^{L} a_n(k)\, s(k - \tau_n) + n(k) \tag{2.13}$$

where each of the L multipath components is attenuated by {a_n(k)} and delayed by {τ_n}.
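A discrete-time version of Eq. 2.13 can be sketched in Python as a sum of attenuated and delayed copies of the input. Delays are assumed to be whole numbers of samples, and the two paths in the example (one fixed, one with a gain that decays with time) are hypothetical.

```python
def multipath_channel(s, paths, noise=None):
    """x[k] = sum_n a_n(k) * s[k - tau_n] + n[k], a sampled form of Eq. 2.13.

    s:     list of input samples
    paths: list of (gain_fn, delay) pairs, where gain_fn(k) returns a_n(k)
           and delay is tau_n in samples (non-negative integer)
    noise: optional list of noise samples n[k]; omitted means a noiseless channel
    """
    x = []
    for k in range(len(s)):
        total = 0.0
        for gain_fn, delay in paths:
            if k - delay >= 0:  # the delayed copy has not arrived before this
                total += gain_fn(k) * s[k - delay]
        if noise is not None:
            total += noise[k]
        x.append(total)
    return x


# Hypothetical two-path channel: a direct path and a time-variant echo.
paths = [
    (lambda k: 1.0, 0),              # direct path, constant gain
    (lambda k: 0.5 - 0.1 * k, 2),    # echo delayed by 2 samples, gain varies with k
]

impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(multipath_channel(impulse, paths))
```

Applying an impulse shows the channel response directly: one sample from the direct path, followed two samples later by the echo scaled with the gain that applies at that time.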

The three mathematical models defined above adequately describe the large majority of physical channels, and communication systems are analyzed and designed on the basis of these three channel models.

2.5 Summary

This chapter outlined the structure of a data transmission system with channel equalization. The factors causing distortions in the channel and their properties were explained and discussed in detail. The types of noise and interference were described together with their effects on the channel and the ways of removing them from the channel.

The types of channels used within the data transmission system were discussed. Mathematical models representing various types of channels were outlined and described, and the mathematical formulas relating the input, the impulse response and the output of the channel were explained alongside the characteristics of each channel type.

CHAPTER 3

MATHEMATICAL BACKGROUND OF A NEURO-FUZZY EQUALIZER

3.1 Overview

When the channel distortion in communications applications is so severe that linear equalizers cannot deal with it, nonlinear equalizers are employed instead. A linear equalizer does not perform well on channels whose amplitude characteristics contain deep spectral nulls or on channels with nonlinear distortion. In an effort to compensate for the channel distortion, the linear equalizer places a large gain in the vicinity of the spectral null and consequently enhances the additive noise present in the received signal.

Neural networks can be considered mathematical models of brain and mind activities. The

main purpose of neural networks is to form the organization of numerous simple processing

elements into layers for achieving tasks with higher level sophistication. High computation

rates, high capability for nonlinear problems, massive parallelism and continuous adaptation

are among the properties of neural networks. Those features turn neural networks into desired

tools for different sorts of applications [28]. Neural networks have been put forward for

equalization problems because of these attractive properties and their nonlinear capability.

On the other hand, neural networks have some weaknesses related to their individual models: their computational power is low and their learning capability is limited. At this point, fuzzy systems have been considered to compensate for these weaknesses through their capability of reasoning on a higher (linguistic or semantic) level.

This chapter describes the synthesis of fuzzy logic with neural networks, and the structure and operating algorithms of the neuro-fuzzy system that forms the basis for channel equalization of QAM signals.

3.2 Neuro-Fuzzy System

Classical control is rooted in the theory of linear differential equations, whereas intelligent control is largely rule based, because the dependencies involved in its deployment are much too complex to permit an analytical representation. In tackling such dependencies, it is expedient to use the mathematics of fuzzy systems and neural networks. The power of fuzzy systems lies in their ability to quantify linguistic inputs and to quickly provide a working approximation of the complex and frequently unknown input-output rules of a system.

The power of neural networks is in their ability to learn from data. It’s possible to combine

neural networks and fuzzy logic in a number of ways and both have advantages that provide

flexibility and effectiveness when combined. Fuzzy adaptive filters are effective because of

their data approximation ability in nonlinear problems and therefore are widely used in signal

processing problems. Fuzzy logic equalizers usually require fewer training samples than

conventional equalizers, especially for linear channels. They are capable of yielding better

error performance and also perform better in the presence of channel nonlinearities [29].

Neural networks supply algorithms for numeric classification, optimization and associative

storage. When fuzzy logic and neural networks are integrated, the emerging neuro-fuzzy

system becomes capable of training the network in a shorter time as a result of decreased

number of nodes of the network. There is a natural synergy between neural networks and

fuzzy systems that makes their hybridization a powerful tool for intelligent control and other

applications.

3.3 Fuzzy Inference Systems

3.3.1 Architecture of fuzzy inference systems

Fuzzy Inference Systems (FIS) are one of the well known applications of fuzzy sets theory

and fuzzy logic. They are used in achieving classification tasks, process control, offline

process simulation and diagnosis and online decision support tools. The power of FIS depends

on the twofold identity of both being capable of managing linguistic concepts and being

universal approximators which are capable of performing nonlinear mappings between inputs

and outputs.

FIS are often used for process simulation or control. They can be designed from either expert knowledge or data. For complex systems, an FIS based solely on expert knowledge may suffer from a loss of accuracy, which is the main motivation for using fuzzy rules inferred from data [30]. A fuzzy inference system comprises the functional blocks explained below (Figure 3.1).

- Determining a set of fuzzy IF-THEN rules. Fuzzy rules are composed of linguistic

statements which describe the way the FIS makes a decision about the classification of

an input or the controlling of an output.

- Fuzzifying the inputs, which involves transforming the crisp inputs into degrees to

match with linguistic values, using the input membership functions defined by a data

base.

- Combining the fuzzified inputs in accordance with the fuzzy rules to set up a rule

strength (also called weight or fire strength).

- Determining the rule’s consequence by putting together the rule strength and the

membership function of the output.

- Combining the consequences so as to obtain an output distribution.

- Defuzzification of the output distribution, which involves transforming the fuzzy results of the inference into a crisp output.

The operations upon fuzzy IF-THEN rules are explained in the steps below:

1. Mapping the inputs to membership values of each linguistic label, utilizing a set of

input membership functions on the premise part (fuzzification process).

2. Computation of the rule strength by combining the fuzzified inputs (the membership values) through the process of fuzzy combination. (The fuzzy combination operators used in forming a fuzzy rule are “and”, “or” and sometimes “not”; “and” is commonly realized by a T-norm and “or” by a T-conorm.)

3. Generating the qualified fuzzy or crisp consequent of each rule according to the rule

strength.

4. Combining the entirety of the fuzzy rule outputs to attain one fuzzy output distribution

and then aggregating the qualified consequent to produce a single crisp output

(Defuzzification of output distribution).

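Figure 3.1 Structure of fuzzy inference system [39]. Blocks shown: input, fuzzification interface, knowledge base, decision-making unit, defuzzification interface, output.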

3.3.2 Rule base fuzzy if-then rule

The fuzzy knowledge base that includes a set of fuzzy IF-THEN rules forms one of the basic

blocks of a fuzzy system. The following is the form of expression that represents fuzzy IF-

THEN rules or fuzzy conditional statements [37].

If u is A Then y is B (3.1)

where u and y represent the input and output linguistic variables, A and B represent the labels

of the fuzzy sets characterized by appropriate membership functions. A denotes the premise

and B denotes the consequent part of the rule.

IF-THEN rules can take many forms, of which the Single-Input Single-Output (SISO) form given by statement (3.1) is the simplest. The multi-input forms given by statements (3.2) and (3.3) below, with a single output (MISO) and with two outputs respectively, are the others.

If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, ..., and $u_n$ is $A_n^l$ Then $y_q$ is $B_q^p$ (3.2)

If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, ..., and $u_n$ is $A_n^l$ Then $y_1$ is $B_1^r$ and $y_2$ is $B_2^s$ (3.3)

The membership functions describe the fuzzy values A and B and Figure 3.2 demonstrates the

most widely used types of membership functions with their shapes.

Figure 3.2 Examples of membership functions: (a) bell, (b) triangular, (c) trapezoidal

The following exponential function is one representation of a decision function that produces a bell curve:

$$\mu(x) = \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right) \tag{3.4}$$

where $x$ is the independent variable on the universe, $x_0$ denotes the position of the peak relative to the universe and $\sigma$ denotes the standard deviation.

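The expressions (3.5) and (3.6) represent the triangular and trapezoidal membership functions, respectively. In (3.5), $x_0$ denotes the peak and $x_1 < x_0 < x_2$ the end points of the triangle; in (3.6), $x_1 < x_2 < x_3 < x_4$ denote the corner points of the trapezoid. The membership is zero outside the stated intervals.

$$\mu(x) = \begin{cases} 1 - \dfrac{x - x_0}{x_1 - x_0}, & x_1 < x \le x_0 \\ 1 - \dfrac{x - x_0}{x_2 - x_0}, & x_0 < x < x_2 \end{cases} \tag{3.5}$$

$$\mu(x) = \begin{cases} 1 - \dfrac{x - x_2}{x_1 - x_2}, & x_1 < x < x_2 \\ 1, & x_2 \le x \le x_3 \\ 1 - \dfrac{x - x_3}{x_4 - x_3}, & x_3 < x < x_4 \end{cases} \tag{3.6}$$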
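The three membership function shapes of Figure 3.2 can be written down in a few lines. The sketch below is illustrative only; the function names and the parameter names (`left`, `peak`, `right`, and the corner points `a` to `d`) are assumptions introduced here, not notation from the thesis.

```python
import math


def gaussian_mf(x, x0, sigma):
    """Bell-shaped membership with its peak at x0 and standard deviation sigma."""
    return math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))


def triangular_mf(x, left, peak, right):
    """Rises linearly from left to peak, then falls linearly to right."""
    if left < x <= peak:
        return (x - left) / (peak - left)
    if peak < x < right:
        return (right - x) / (right - peak)
    return 0.0


def trapezoidal_mf(x, a, b, c, d):
    """Rises on (a, b), equals 1 on [b, c], falls on (c, d), zero elsewhere."""
    if a < x < b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return 1.0
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


print(gaussian_mf(2.0, 2.0, 1.0))            # membership at the peak of the bell
print(triangular_mf(1.0, 0.0, 2.0, 4.0))     # halfway up the rising edge
print(trapezoidal_mf(3.5, 0.0, 1.0, 3.0, 4.0))  # halfway down the falling edge
```

All three functions return values between 0 and 1, so any of them can serve as an input membership function in the fuzzification step.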


Rules of the following form are called Takagi and Sugeno fuzzy rules; their distinguishing feature is that the consequent part of the rule is a mathematical function of the input variables.

If $A_1(x_1), A_2(x_2), \ldots, A_n(x_n)$ then $Y = f(x_1, x_2, \ldots, x_n)$ (3.7)

where the premise part is fuzzy and the function $f$ in the consequent part is usually a linear or quadratic mathematical function. In the linear case,

$$f = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n \tag{3.8}$$

Fuzzy IF-THEN rules are widely used in modeling. They are considered the local description

of the system being designed and form the basics of the fuzzy inference system (FIS).

Fuzzification: The aim of fuzzification is mapping the crisp input into a fuzzy set. This input

can be from a set of sensors or features of those sensors like amplitude or frequency, and is

mapped into fuzzy numbers of values from 0 to 1, using a set of input membership functions.

The numeric inputs $u_i \in U_i$ are converted into fuzzy sets by the fuzzification process for the fuzzy system to use.

Let $U_i^*$ represent the set of all possible fuzzy sets which can be defined on $U_i$. Given $u_i \in U_i$, fuzzification transforms $u_i$ into a fuzzy set denoted by $\hat{A}_i^{fuzz}$ that is defined on the universe of discourse $U_i$. The fuzzification operator $F$ that produces this transformation is defined by

$$F: U_i \rightarrow U_i^*$$

where

$$F(u_i) = \hat{A}_i^{fuzz}$$

Frequently, “singleton fuzzification” is used. It produces a fuzzy set $\hat{A}_i^{fuzz} \in U_i^*$ with a membership function given by

$$\mu_{\hat{A}_i^{fuzz}}(x) = \begin{cases} 1, & x = u_i \\ 0, & \text{otherwise} \end{cases}$$

Any fuzzy set with this form of membership function is termed a “singleton”. In singleton fuzzification the input fuzzy set has only a single point of nonzero membership, and the number $u_i$ is represented by the singleton fuzzy set. Singleton fuzzification is used in implementations where $u_i$ is taken to be its measured value with no noise involved. “Gaussian fuzzification”, which uses bell type membership functions centered on the input points, and triangular fuzzification, which uses triangle shapes, are other common examples [38].

3.3.3 Fuzzy inference mechanism

Designing a fuzzy inference system (FIS) from data can be separated into two principal stages: (1) automatic rule generation and (2) system optimization [30]. Rule generation leads to a basic system with a given space partitioning and the corresponding set of rules. System optimization can be carried out at several levels. Variable selection can be global or handled rule by rule. The goal of rule base optimization is to select the most efficient rules and to optimize the rule conclusions. Space partitioning can be improved by adding or removing fuzzy sets and by tuning the membership function parameters. Structure optimization is of major importance: it covers selecting variables, reducing the rule base and optimizing the number of fuzzy sets.

There are two main tasks associated with fuzzy inference mechanism:

1. Matching task, which involves determining the degree to which each rule is relevant to the current situation as characterized by the inputs $u_i$, $i = 1, 2, \ldots, n$.

2. Inference step which involves reaching the conclusions from the current inputs ui and

the information in the rule-base.

When the fuzzy set representing the premise of the $i$th rule is denoted by $A_1^j \times A_2^k \times \cdots \times A_n^l$, there will be two steps for matching:

Step 1: Combining inputs with rule premises: This step is about finding the fuzzy sets $\hat{A}_1^j, \hat{A}_2^k, \ldots, \hat{A}_n^l$ with membership functions

$$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1) * \mu_{\hat{A}_1^{fuzz}}(u_1)$$

$$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2) * \mu_{\hat{A}_2^{fuzz}}(u_2)$$

$$\vdots$$

$$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n) * \mu_{\hat{A}_n^{fuzz}}(u_n)$$

(for all j, k, …, l) that combine the fuzzy sets from fuzzification with the fuzzy sets used in each of the terms in the rules’ premises. When singleton fuzzification is used, each of the input fuzzy sets has only a single point of nonzero membership (e.g. $\mu_{\hat{A}_1^j}(\bar{u}_1) = \mu_{A_1^j}(\bar{u}_1)$ for $\bar{u}_1 = u_1$ and $\mu_{\hat{A}_1^j}(\bar{u}_1) = 0$ for $\bar{u}_1 \neq u_1$).

To put it another way, with singleton fuzzification $\mu_{\hat{A}_i^{fuzz}}(u_i) = 1$ for all $i = 1, 2, \ldots, n$ for the given inputs $u_i$, resulting in

$$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1)$$

$$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2)$$

$$\vdots$$

$$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n)$$

Step 2: Determining which rules are on: In this step, membership values $\mu_i(u_1, u_2, \ldots, u_n)$ are determined for the premise of the $i$th rule; they represent the certainty that each rule premise is consistent with the given inputs. Defining

$$\mu_i(u_1, u_2, \ldots, u_n) = \mu_{\hat{A}_1^j}(u_1) * \mu_{\hat{A}_2^k}(u_2) * \cdots * \mu_{\hat{A}_n^l}(u_n)$$


which is a function of the inputs $u_i$, $\mu_i(u_1, u_2, \ldots, u_n)$ represents the certainty that the antecedent of rule $i$ matches the input information when singleton fuzzification is used. $\mu_i(u_1, u_2, \ldots, u_n)$ is a multidimensional certainty surface. It stands for the certainty of the premise of a rule and hence for the degree to which a particular rule holds for a given set of inputs. The inference step is then taken by calculating, for the $i$th rule, the “implied fuzzy set” $\hat{B}_q^i$ with the membership function

$$\mu_{\hat{B}_q^i}(y_q) = \mu_i(u_1, u_2, \ldots, u_n) * \mu_{B_q^p}(y_q) \tag{3.9}$$

The implied fuzzy set $\hat{B}_q^i$ specifies the certainty level that the output should be a specific crisp output $y_q$ within the universe of discourse $Y_q$, taking into consideration only rule $i$. The defuzzification that comes after the inference step is employed to aggregate the conclusions of all the rules, which the implied fuzzy sets represent.
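The matching and inference steps can be sketched as follows for a single rule under singleton fuzzification. This is an illustration under stated assumptions rather than the thesis's implementation: the operator "*" is taken to be either the product or the minimum, and the names `premise_certainty`, `implied_membership` and the `ramp` membership function are hypothetical.

```python
def premise_certainty(inputs, premise_mfs, t_norm="product"):
    """Certainty mu_i(u_1, ..., u_n) of one rule premise under singleton fuzzification."""
    degrees = [mf(u) for mf, u in zip(premise_mfs, inputs)]
    if t_norm == "min":
        return min(degrees)
    certainty = 1.0
    for degree in degrees:
        certainty *= degree
    return certainty


def implied_membership(certainty, consequent_mf, y, t_norm="product"):
    """Membership of y in the implied fuzzy set of the rule, as in (3.9)."""
    if t_norm == "min":
        return min(certainty, consequent_mf(y))
    return certainty * consequent_mf(y)


def ramp(u):
    """Hypothetical fuzzy set: membership rises linearly from 0 at u=0 to 1 at u=1."""
    return max(0.0, min(1.0, u))


mu_i = premise_certainty([0.5, 0.8], [ramp, ramp])             # product of 0.5 and 0.8
mu_i_min = premise_certainty([0.5, 0.8], [ramp, ramp], "min")  # smaller of 0.5 and 0.8
mu_implied = implied_membership(mu_i, ramp, 0.5)               # certainty times mu_B(0.5)
print(mu_i, mu_i_min, mu_implied)
```

Choosing the minimum instead of the product changes only how strongly a weakly matched input pulls the rule certainty down; the structure of the two steps is the same.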

Defuzzification Methods: It is frequently important to find out a single crisp output from a

FIS. For instance, in the case of one attempting to classify a letter drawn by hand on a

drawing tablet, the FIS would be obliged to find out a crisp number to determine the letter that

was drawn. A process called defuzzification is used to attain this crisp number. In other

words, defuzzification means the way of extracting a crisp value from a fuzzy set as a

representative value.

Two known methods can be used for defuzzifying:

Center of Gravity (COG): The method takes the output distribution and finds its center of mass to produce one crisp number:

$$u = \frac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)} \tag{3.10}$$

where the crisp output value $u$ is the abscissa of the center of gravity of the fuzzy set, $\mu(x_i)$ is the membership value in the membership function and $x_i$ is a running point in a discrete universe. This expression can also be regarded as the weighted average of the elements in the support set.

The COG method for singletons attains the following expression

$$u = \frac{\sum_i \mu(s_i)\, s_i}{\sum_i \mu(s_i)} \tag{3.11}$$

where $s_i$ is the position of singleton $i$ in the universe and $\mu(s_i)$ represents the rule strength $\alpha_i$ of rule $i$. This technique has low computational complexity, and $u$ is differentiable with respect to the singletons $s_i$, which is useful in neuro-fuzzy systems.
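Expressions (3.10) and (3.11) share the same form, so one small function covers both. The sketch below is a minimal illustration; the function name `centre_of_gravity` and the sample values are assumptions made for the example.

```python
def centre_of_gravity(points, memberships):
    """Discrete COG: sum(mu(x_i) * x_i) / sum(mu(x_i))."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("all membership values are zero")
    return sum(m * x for x, m in zip(points, memberships)) / total


# Sampled output distribution, as in (3.10).
u_sampled = centre_of_gravity([1.0, 2.0, 3.0], [0.2, 0.6, 0.2])

# Singleton consequents at positions s_i weighted by rule strengths, as in (3.11).
u_singleton = centre_of_gravity([-1.0, 1.0], [0.25, 0.75])
print(u_sampled, u_singleton)
```

For singletons the crisp output is a ratio of sums that are linear in the positions $s_i$, which is why it can be differentiated with respect to them during training.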

Center of Average (COA): In this widely used method, a crisp output $y_q^{crisp}$ is selected using the centers of each of the output membership functions and the maximum certainty of each of the conclusions represented by the implied fuzzy sets, and is described as

$$y_q^{crisp} = \frac{\sum_{i=1}^{R} b_i^q \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}{\sum_{i=1}^{R} \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}} \tag{3.12}$$

where $R$ is the number of rules, $b_i^q$ is the center of the output membership function of rule $i$, and “sup” is the “supremum” (i.e., the least upper bound, which can frequently be regarded as the maximum value). Therefore, $\sup_x\{\mu(x)\}$ can simply be considered the highest value of $\mu(x)$.
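A minimal sketch of expression (3.12) is given below. It assumes that each implied fuzzy set is available as a list of sampled membership values, so that the supremum reduces to a maximum; the function name `centre_average` and the sample numbers are hypothetical.

```python
def centre_average(centres, implied_sets):
    """Centre-average defuzzification over R rules.

    centres      -- centre b_i of the output membership function of each rule
    implied_sets -- for each rule, sampled membership values of its implied fuzzy set
    """
    heights = [max(values) for values in implied_sets]  # sup taken as the maximum
    total = sum(heights)
    if total == 0.0:
        raise ValueError("no rule fired")
    return sum(b * h for b, h in zip(centres, heights)) / total


y_crisp = centre_average(
    centres=[0.0, 10.0],
    implied_sets=[[0.0, 0.25, 0.1], [0.3, 0.75, 0.2]],
)
print(y_crisp)
```

Only the peak of each implied set enters the result, so the shape of the output membership functions does not matter beyond their centers.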

Fig. 3.3 outlines the inference mechanisms on different types of fuzzy systems graphically.

Most fuzzy inference systems can be categorized into three types depending on the types of

fuzzy reasoning.

In Type 1 fuzzy systems, the defuzzifier combines the output sets that correspond to all of the fired rules to obtain a single output set and then finds a crisp number which represents this combined output set; e.g., the centroid defuzzifier forms the union of all the output sets and uses the centroid of the union as the crisp output [31]. The overall output is the weighted average of each rule’s crisp output, which is induced by the rule’s weight and the output membership functions.

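Figure 3.3 Types of fuzzy reasoning mechanisms [11]. The figure shows two rules with premise sets A1, B1 and A2, B2 over the inputs x and y, combined by multiplication or min into the weights w1 and w2. Type 1 and Type 2 use consequent sets C1 and C2 over z (Type 2 aggregates with max), Type 3 uses z1 = ax + by + c and z2 = px + qy + r, and the weighted-average output is z = (w1 z1 + w2 z2)/(w1 + w2).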

In Type 2 fuzzy systems, fuzzy sets are helpful in conditions where it is hard to determine an exact membership function for a fuzzy set; for this reason, they are useful for incorporating uncertainties. These uncertainties stem from the knowledge employed in constructing the rules of a fuzzy logic system and lead to rules with uncertain antecedents and/or consequents, which in turn translate into uncertain antecedent and/or consequent membership functions [31]. The overall fuzzy output is obtained by applying the 'max' operation to the qualified fuzzy outputs, each of which equals the minimum of the rule strength and the rule’s output membership function.

Type 3 uses Takagi and Sugeno’s fuzzy IF-THEN rules. The output of each rule is a crisp number computed by multiplying each of the inputs by a constant and summing the results together with a constant term. The overall output is the weighted average of each rule’s output.

In Fig. 3.3, a fuzzy inference system with two rules and two inputs is used to demonstrate the

different types of fuzzy rules and fuzzy reasoning described above.
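The Type 3 (Takagi and Sugeno) reasoning with two rules and two inputs can be sketched as below. The membership functions `low` and `high`, the rule coefficients and the function name `sugeno_output` are assumptions chosen for illustration; the rule weight is formed by multiplication, with min being the alternative.

```python
def sugeno_output(x, y, rules):
    """Weighted average z = (w1*z1 + w2*z2 + ...) / (w1 + w2 + ...)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for mf_x, mf_y, (a, b, c) in rules:
        w = mf_x(x) * mf_y(y)   # rule weight from the premise memberships
        z = a * x + b * y + c   # crisp rule output, linear in the inputs
        weighted_sum += w * z
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no rule fired for this input")
    return weighted_sum / weight_total


def low(v):
    """Hypothetical set 'low': 1 at v=0, falling linearly to 0 at v=1."""
    return max(0.0, min(1.0, 1.0 - v))


def high(v):
    """Hypothetical set 'high': 0 at v=0, rising linearly to 1 at v=1."""
    return max(0.0, min(1.0, v))


rules = [
    (low, low, (1.0, 1.0, 0.0)),    # If x is low and y is low then z1 = x + y
    (high, high, (2.0, 0.0, 5.0)),  # If x is high and y is high then z2 = 2x + 5
]
z = sugeno_output(0.5, 0.5, rules)
print(z)
```

At the input (0.5, 0.5) both rules fire with equal weight, so the output lies midway between the two rule outputs.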

3.4 Artificial Neural Networks

The recognition that the human brain computes in an entirely different way from the conventional digital computer has been the incentive for research into artificial neural networks, also known as “neural networks”. The brain is a highly complex, nonlinear and parallel computer (information-processing system). It is capable of organizing its structural constituents, called neurons, so as to carry out certain computations (e.g. pattern recognition, perception and motor control) many times faster than the fastest digital computer of the present time [29].

Figure 3.4 Artificial neuron. The figure shows the inputs $x_0, x_1, \ldots, x_n$ weighted by the synaptic weights $w_i$, a processing element that forms the summation $I = \sum w_i x_i$ and applies the transfer function $Y = f(I)$, and the output path.

A neuron is an information-processing unit that is fundamental to the operation of a neural network. The block diagram of Fig. 3.4 shows the model of an artificial neuron, which forms the basis for designing artificial neural networks.

A set of synapses, also called connecting links, are the basic elements of the neuronal model. Each of these synapses is characterized by a weight or strength of its own. Specifically, a signal $x_i$, $i = 1, 2, \ldots, n$, at the input of a synapse connected to neuron $k$ is multiplied by the synaptic connection weight $w_i$. The result is generated by summing these products, feeding the sum through a transfer function and outputting it.

The output of the artificial neuron displayed in Fig. 3.4 is calculated from

$$y_i = f\left(\sum_{j=1}^{n} w_{ij}\, x_j + \theta_i\right) \tag{3.13}$$

where $x_j$ is the input, $y_i$ is the output signal of the neuron, $w_{ij}$ are the synaptic weight coefficients, $\theta_i$ denotes the bias and $f$ is the activation function.

The activation function can be linear or nonlinear, but a nonlinear sigmoid function is frequently used as the activation function (eq. 3.14).

$$y_i = \frac{1}{1 + \exp\left[-\left(\sum_{j=1}^{n} w_{ij}\, x_j + \theta_i\right)\right]} \tag{3.14}$$
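The neuron with a sigmoid activation can be sketched in a few lines. The example assumes that the bias is added to the weighted sum, and the function name `neuron_output` and the sample weights are hypothetical.

```python
import math


def neuron_output(x, w, theta):
    """Sigmoid neuron: weighted sum of the inputs plus the bias, squashed into (0, 1)."""
    induced = sum(w_j * x_j for w_j, x_j in zip(w, x)) + theta
    return 1.0 / (1.0 + math.exp(-induced))


y_balanced = neuron_output([1.0, 2.0], [0.5, -0.25], theta=0.0)  # weighted sum is zero
y_biased = neuron_output([1.0, 2.0], [0.5, -0.25], theta=2.0)    # bias shifts the output up
print(y_balanced, y_biased)
```

A zero weighted sum gives an output of exactly one half, and a positive bias pushes the output toward one.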

Neural networks are formed by a set of neurons arranged in one or more layers. The neurons are interconnected by weighted connections at connection points called nodes. In a layered neural network, the neurons are organized in the form of layers. The simplest form of a layered network has an input layer of source nodes which projects onto an output layer of neurons (computation nodes), but not the other way round. Such a network is of the feedforward or acyclic type. Neurons in the network act as processing elements which multiply their inputs by a set of weights and nonlinearly transform the result into an output value.

On the whole, three fundamentally different classes of network architecture can be identified: single-layer feedforward (non-recurrent), multilayer feedforward and recurrent networks. The feedforward neural network structures are shown in Fig. 3.5.

Figure 3.5 (a) A single layer network, (b) A simple multilayer network [11]. Both panels map the inputs $x_1, x_2, \ldots, x_m$ to the outputs $y_1, y_2$.

3.4.1 Neural network’s learning

The most important property of a neural network is its ability to learn from its environment and to improve its performance through learning. A neural network learns about its environment through an interactive process of adjustments applied to its synaptic weights and bias levels. Ideally, the network becomes more knowledgeable about its environment after each iteration of the learning process. In the context of neural networks, learning can be defined as a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place [29].

A prescribed set of well-defined rules for the solution of a learning problem is referred to as a learning algorithm. As one would expect, there is no unique learning algorithm for the design of neural networks. Learning algorithms differ from each other fundamentally in the way in which the adjustment to a synaptic weight of a neuron is formulated. Another factor to be considered is the manner in which a neural network (learning machine), made up of a set of interconnected neurons, relates to its environment. In this latter context, one speaks of a learning paradigm, which refers to a model of the environment in which the neural network operates [29].

There are two fundamental learning paradigms associated with neural networks: (1) Learning

with a teacher (known as supervised learning) and (2) Learning without a teacher which is

divided into two subdivisions that are unsupervised learning and reinforcement learning.

Supervised learning involves training with a teacher. The teacher can be thought of as having knowledge of the environment, with that knowledge represented by a set of input-output examples. The environment is, however, unknown to the neural network. When a training vector drawn from the environment is applied to both the teacher and the neural network, the teacher is able to supply the neural network with a desired response for that training vector. The network parameters, i.e. the connection weights, are adjusted under the combined influence of the training vector and the error signal. The error signal is the difference between the desired response and the actual response of the network. This adjustment is carried out iteratively, step by step, with the aim of eventually making the neural network emulate the teacher; the emulation is presumed to be optimum in some statistical sense. In this way, the knowledge of the environment available to the teacher is transferred to the neural network through training as fully as possible. When this condition is reached, the teacher may be removed and the neural network deals with the environment entirely on its own.

The form of supervised learning just described is error correction learning. It is a closed-loop feedback system, but the unknown environment is not in the loop. The performance criterion for the system is the mean-square error, or the sum of squared errors over the training samples, defined as a function of the free parameters of the system. This criterion may be visualized as a multidimensional error performance surface, or simply error surface, with the free parameters as coordinates. The true error surface is averaged over all possible input-output examples. Any given operation of the system under the teacher’s supervision is represented as a point on the error surface. For the system to improve its performance over time and thus learn from the teacher, the operating point has to move down successively toward a minimum point of the error surface; the minimum point may be a local minimum or a global minimum. A supervised learning system is able to do this with the information it has about the gradient of the error surface corresponding to the current behavior of the system. The gradient of the error surface at any point is a vector which points in the direction of steepest ascent, so the operating point is moved along the negative gradient, i.e. in the direction of steepest descent. Given an algorithm designed to minimize the cost function, an adequate set of input-output examples and enough time in which to do the training, a supervised learning system is generally capable of performing tasks like pattern classification and function approximation [29].
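Error correction learning can be illustrated on the simplest possible learner, a single linear neuron trained with the LMS (delta) rule. The sketch below is an assumption-laden example, not the thesis's procedure: the teacher is the hypothetical mapping d = 2x + 1, the cost is half the squared error, and the name `lms_train`, the learning rate and the number of epochs are chosen only for illustration.

```python
def lms_train(samples, lr=0.1, epochs=200):
    """Fit y = w*x + b with the LMS (delta) rule, one training sample at a time."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, d in samples:
            y = w * x + b     # actual response of the linear neuron
            e = d - y         # error signal: desired minus actual response
            w += lr * e * x   # step along the negative gradient of 0.5 * e**2
            b += lr * e
    return w, b


# The "teacher" supplies the desired response d = 2x + 1 for each training input.
samples = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = lms_train(samples)
print(w, b)
```

Each update moves the operating point a small step downhill on the error surface, and for this noise-free example the parameters settle at the teacher's values.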

3.4.2 Multilayer perceptrons & backpropagation algorithm

Multilayer feedforward networks form an important class of neural networks. Typically, the network consists of a set of sensory units (source nodes) which constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. These neural networks are called multilayer perceptrons (MLP) and represent a generalization of the single-layer perceptron.

Multilayer perceptrons have been applied successfully to solve difficult and diverse problems by training them in a supervised manner with a widely used algorithm known as the error back-propagation algorithm. This algorithm is based on the error correction learning rule. It may be considered a generalization of an equally popular adaptive filtering algorithm: the least mean square (LMS) algorithm for the special case of a single linear neuron [29].

Error back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network, and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, all of the synaptic weights of the network are fixed. During the backward pass, all the synaptic weights are adjusted according to an error correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. The error signal is then propagated backward through the network, against the direction of the synaptic connections, hence the name “error back-propagation”. The synaptic weights are adjusted so as to move the actual response of the network closer to the desired response in a statistical sense. The error back-propagation algorithm is also referred to in the literature as the back-propagation algorithm, or simply back-prop. The learning process carried out with the algorithm is referred to as back-propagation learning.
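The two passes can be sketched for a network with one hidden layer. The code below is a minimal illustration under stated assumptions: all neurons use the logistic sigmoid, the cost is half the sum of squared errors, weights are updated after every sample, and the names `forward`, `backprop_step` and the initial weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, W1, b1, W2, b2):
    """Forward pass: the weights stay fixed while the input propagates layer by layer."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b) for row, b in zip(W2, b2)]
    return h, y


def backprop_step(x, d, W1, b1, W2, b2, lr=0.5):
    """One forward and one backward pass; returns the error before the update."""
    h, y = forward(x, W1, b1, W2, b2)
    # Output deltas: error signal times the sigmoid derivative y * (1 - y).
    delta_out = [(dk - yk) * yk * (1.0 - yk) for dk, yk in zip(d, y)]
    # Hidden deltas: output deltas propagated backward through the old W2.
    delta_hid = [
        hj * (1.0 - hj) * sum(delta_out[k] * W2[k][j] for k in range(len(W2)))
        for j, hj in enumerate(h)
    ]
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] += lr * delta_out[k] * h[j]
        b2[k] += lr * delta_out[k]
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] += lr * delta_hid[j] * x[i]
        b1[j] += lr * delta_hid[j]
    return 0.5 * sum((dk - yk) ** 2 for dk, yk in zip(d, y))


# Two inputs, two hidden neurons, one output; the initial weights are arbitrary.
W1, b1 = [[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0]
W2, b2 = [[0.2, -0.1]], [0.0]
errors = [backprop_step([1.0, 0.0], [1.0], W1, b1, W2, b2) for _ in range(50)]
print(errors[0], errors[-1])
```

Repeating the step on the same training pair should drive the error down, as the actual response moves toward the desired one.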

There are three distinguishing characteristics of a multilayer perceptron:

1. The model of each neuron in the network includes a nonlinear activation function. The nonlinearity is smooth, in other words, differentiable everywhere. A commonly used form of nonlinearity that satisfies this requirement is the sigmoidal nonlinearity defined by the logistic function:

$$y_j = \frac{1}{1 + \exp(-v_j)} \tag{3.15}$$

where $v_j$ is the induced local field (i.e. the weighted sum of all synaptic inputs plus the bias) of neuron $j$, and $y_j$ is the output of the neuron.

2. The network contains one or more layers of hidden neurons which are not part of the input or output of the network. These hidden neurons enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns (vectors).

3. The network exhibits a high degree of connectivity, determined by the synapses of the network. A change in the connectivity of the network requires a change in the population of synaptic connections or their weights.

The multilayer perceptron derives its computing power from the combination of these characteristics with the ability to learn from experience through training. The back-propagation algorithm has great significance in neural networks since it provides a computationally efficient method for training multilayer perceptrons.

Fig. 3.6 shows the architectural graph of a multilayer perceptron with one hidden layer and an output layer. The illustrated network is fully connected, meaning that a neuron in any layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow through the network progresses in a forward direction, from left to right and on a layer-by-layer basis. The value of each neuron is computed by first summing the weighted inputs and the bias and then applying $f$(sum) (the sigmoid function) to calculate the neuron’s activation.

Figure 3.6 Multilayer feedforward network [11]

Next, the training process of the three-layer feedforward network will be analyzed. First, the feedforward phase is described in three stages, one for each of the input (I), hidden (H) and output (O) layers.

Input Layer (I): The input of the hidden layer is equal to the output of the input layer.


CHAPTER 1

REVIEW ON CHANNEL EQUALIZATION

1.1 INTRODUCTION

Communication systems comprise three fundamental elements: transmitter, channel and

receiver. When signals are transmitted through a communications system, they are obstructed

by some distortions which are mainly intersymbol interference (ISI) and noise. The

transmitted signal is distorted by ISI which is caused by multipath effect in band limited

(frequency selective) time dispersive channels and is the cause of bit errors on the receiver

side. ISI is considered the main factor negatively affecting fast transmission of data over

wireless channels. In order to eliminate or minimize these distortions, equalizers are

employed in these systems. Equalization is the method of compensating for, eliminating or

reducing the amplitude and phase distortion introduced by the transmission medium in

communications systems. In a general meaning, the term equalization refers to any signal

processing operation which minimizes ISI. An equalizing filter overcomes the ISI caused by

individual received symbols of a transmitted data stream, as well as the crosstalk that for

example occurs due to coupling of a transmitted pulse or that results from the capacitive

coupling of the transmitted pulse on an outgoing pair interfering with the received pulse on an

incoming pair. The task of equalizers is to provide efficient and error free communications by

ensuring that signals transmitted through the channel are recovered as original at the end of

the receiver that communications system has.

Distortions may be linear or nonlinear depending on the channel characteristics of channel.

When transmitting information through a physical channel, various mechanisms distort the

transmitted signal significantly, causing degradation or even failure in the communications.

These mechanisms can be classified as additive thermal noise, man-made noise and

atmospheric noise. In practice, many of the physical channels are characterized by various

channel models. The most frequently encountered channel of communications is that with

additive noise, in which an additive random noise process corrupts the transmitted signal. The
additive noise arises from the amplifiers and electronic components at the receiver side of the
communications system, as well as from interference encountered in transmission, as in the case
of radio signal transmission. The noise caused by electronic components and amplifiers is
classified as thermal noise. Statistically, this type of noise is characterized as a Gaussian
random process, and the resulting mathematical model is called the additive Gaussian noise
channel. The model becomes an additive white Gaussian noise (AWGN) channel when the random
process is a white-noise process, that is, when its power spectral density (PSD) is flat (constant) over

all frequencies [1,2].
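The AWGN model described above can be illustrated with a minimal sketch. The function name `add_awgn`, the BPSK test signal and the 10 dB target are assumptions made for illustration only and are not taken from the thesis.

```python
import math
import random

def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise at the requested SNR (dB)."""
    if rng is None:
        rng = random.Random()
    # Average signal power, estimated from the samples themselves.
    power = sum(s * s for s in signal) / len(signal)
    # Noise variance that gives SNR = power / noise_var.
    noise_var = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_var)
    # Independent samples: the noise is white (flat PSD) and Gaussian.
    return [s + rng.gauss(0.0, sigma) for s in signal]

rng = random.Random(0)
tx = [rng.choice((-1.0, 1.0)) for _ in range(20000)]   # hypothetical BPSK symbols
rx = add_awgn(tx, snr_db=10.0, rng=rng)

# Measure the SNR actually obtained (signal power is 1 for +/-1 symbols).
noise_power = sum((r - t) ** 2 for r, t in zip(rx, tx)) / len(tx)
measured_snr_db = 10.0 * math.log10(1.0 / noise_power)
```

Because the noise samples are drawn independently, the measured SNR should sit close to the requested value for a long enough sequence.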

Compared with AWGN channels, mobile radio channels impair the signal more severely: the
received signal is greatly distorted or fades significantly. This fading is classified as a
non-additive signal disturbance and appears as time variation in the signal amplitude. The main
techniques used to compensate for fading channel impairments and to improve the received signal
quality are equalization, channel coding and diversity [3]. This thesis concentrates on equalization.

Equalization techniques can be categorized as linear or nonlinear depending on the way the
output of an adaptive equalizer is used for subsequent control of the equalizer. The decision
making device of the receiver processes the equalizer’s output and determines the value of the
digital data bit being received by applying a slicing or thresholding operation (a nonlinear
operation) that yields the reconstructed message data. If this data is not used in the feedback
path to adapt the equalizer, the equalization is linear; if, on the other hand, the decision
making device feeds the reconstructed data back in order to alter the equalizer’s subsequent
outputs, the equalization is nonlinear [3]. If the channels used are nonlinear, linear
equalizers cannot reconstruct the transmitted signal.

There are various equalizer structures, among which the linear transversal equalizer (LTE) is the
most common. The simplest LTE, whose transfer function is a polynomial in $z^{-1}$, uses only
feedforward taps and has many zeros but poles only at $z = 0$. This filter is called a finite
impulse response (FIR) filter or simply a transversal filter. In this type of equalizer, the filter
coefficients linearly weight the received signal’s current and past values before summing

them to produce the output of the equalizer.
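The weighted sum performed by a transversal filter can be sketched as follows. The two-tap channel and the four equalizer taps (a truncated inverse of that channel) are hypothetical values chosen only to make the example checkable.

```python
def transversal_filter(samples, taps):
    """FIR output y[n] = sum_k taps[k] * samples[n - k] (zeros before n = 0)."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, w in enumerate(taps):
            if n - k >= 0:
                acc += w * samples[n - k]
        output.append(acc)
    return output

symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
channel = [1.0, 0.5]                      # assumed channel: 1 + 0.5 z^-1
received = transversal_filter(symbols, channel)

# Truncated series for 1 / (1 + 0.5 z^-1): 1 - 0.5 z^-1 + 0.25 z^-2 - 0.125 z^-3
equalizer_taps = [1.0, -0.5, 0.25, -0.125]
equalized = transversal_filter(received, equalizer_taps)
```

With these taps the combined channel and equalizer response is $1 - 0.0625 z^{-4}$, so each equalized sample differs from the transmitted symbol by at most 0.0625.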

Some applications employ nonlinear equalizers because linear equalizers cannot cope with severe
channel distortion. Linear equalizers perform poorly on channels with deep spectral nulls in the
passband: in attempting to compensate for the distortion, they place too much gain near the
nulls and thereby enhance the noise present at those frequencies. For these reasons, nonlinear
equalizers outperform linear equalizers on such channels. Three very effective nonlinear methods
that improve on linear equalization and are used in 2G and 3G systems are:

1. Decision Feedback Equalization (DFE)

2. Maximum Likelihood Symbol Detection (MLSD)

3. Maximum Likelihood Sequence Estimation (MLSE) [4]

A large number of studies have addressed channel equalization using various methods, techniques
and algorithms. Recently, neural network based fuzzy technology has been widely used as a
powerful tool in the channel equalization of various types of signals. In this type of equalizer,
experts determine the fuzzy rules by utilizing the channel’s input-output data pairs. As part of
this thesis, adaptive channel equalization based on neural networks and employing a multilayer
perceptron (MLP) has been developed, enabling the equalization of Quadrature Amplitude Modulation
(QAM) signals of various levels. This has been achieved for both linear and nonlinear channels
using a Nonlinear Neuro-Fuzzy Equalizer (NNFE), which offers a relatively high adaptation speed
and accurate equalizer outputs and has proven to be quite effective and practical.

The changeable fuzzy IF-THEN rules which configure the fuzzy adaptive filter are formed by

either human experts or the input-output pairs that are matched throughout a procedure of

adaptation. In this study, neural networks and fuzzy technology are used for the development

of a neuro-fuzzy equalizer for channel distortion of Quadrature Amplitude Modulation

(QAM) signals. Even though the QAM signal has a complex form composed of real (in-phase) and
imaginary (quadrature) parts, the complex signal is not applied directly to the channel and
equalizer, since the neuro-fuzzy filter used is based on real values and best suits signal
processing in real multidimensional space. The modulation and demodulation of M-ary QAM (with
M = 4 and M = 16) is accomplished by splitting the stream of data bits into in-phase (I) and
quadrature (Q) components. Gray coding is employed to map the I and Q components together. The
significant feature of this thesis is the application of a ‘normalization’ method by which the
modulated in-phase and quadrature QAM signal is normalized to a maximum of one. Each component of
the complex signal is mapped to values between 0 and 1 by first shifting the values such that the
minimum value is zero and then scaling them such that the maximum value is 1. Each component is
then input to the channel and equalizer separately and denormalized separately at the equalizer’s
output, where the components are recombined to form the final complex QAM signal. The
normalization method is stable and provides better BER and convergence performance, together with
more accurate equalizer outputs and a relatively small number of iterations

before the minimum error is attained.
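The shift-and-scale normalization described above can be sketched as below. The helper names and the sample 16-QAM symbols (component levels -3, -1, 1, 3) are illustrative assumptions; the thesis may compute the minimum and span differently, for example from the constellation rather than from the data.

```python
def normalize(values):
    """Shift so the minimum is 0, then scale so the maximum is 1."""
    lowest = min(values)
    span = max(values) - lowest
    if span == 0:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lowest) / span for v in values], lowest, span

def denormalize(normalized, lowest, span):
    """Invert normalize(): undo the scaling, then undo the shift."""
    return [v * span + lowest for v in normalized]

# Hypothetical 16-QAM symbols.
symbols = [complex(-3, 1), complex(1, 3), complex(3, -3), complex(-1, -1)]

# Each component is normalized separately ...
i_norm, i_low, i_span = normalize([s.real for s in symbols])
q_norm, q_low, q_span = normalize([s.imag for s in symbols])

# ... and denormalized separately before being recombined.
recovered = [complex(i, q) for i, q in zip(denormalize(i_norm, i_low, i_span),
                                           denormalize(q_norm, q_low, q_span))]
```

In a full system the channel and equalizer would sit between the two steps; here the round trip only confirms that the mapping is invertible.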

This thesis consists of five chapters where:

Chapter 1 presents an overview on channel equalization. The state of application of neuro-

fuzzy system and fuzzy logic as well as their properties and features are explained.

Chapter 2 explains the channel equalization, the distortions and noise in the channel.

Mathematical models and formulas representing the channels and nonlinear neuro-fuzzy

equalizer used in the thesis together with its characteristics are described.

Chapter 3 outlines the architecture and operation principles of the nonlinear neuro-fuzzy

network (NNFN). The used learning algorithm, the linguistic data about the target system and

numerical input-output relationships of NNFN are explained in detail. Fuzzy rule-based fuzzy

sets, the parameters and error calculations are analyzed.

Chapter 4 describes in detail the quadrature amplitude modulation (QAM) and its properties.

The application of QAM on NNFN and the features of the thesis design are explained. The

specific technique of normalization used in equalizing QAM signals and its mathematical

implementation are described.

Chapter 5 illustrates the simulation results of the equalization system demonstrating

graphically and statistically the performance of the equalization system. Bit error rate (BER)

versus signal-to-noise ratio (SNR) analysis is made in tabulated and graphical forms proving

the accuracy of the system. Comparisons between the channels and between the two

constellations of QAM are made to illustrate the performance of the equalizer, as well.

Conclusions are discussed at the end.

1.2 Overview

In order to accurately transmit the input signals from the transmitter to the receiver,

minimization and thus equalization of distortions in the channel is critical. This can be

successfully done by employing efficient equalization algorithms and techniques during the

transmission of the signals from the transmitter to the receiver. This chapter considers

methods used in channel equalization. Neural networks, fuzzy and neuro-fuzzy technologies

which form the basis of the adaptive channel equalization are analyzed and discussed.

1.3 The State of Application of Channel Equalization

Linear and nonlinear distortions are the main obstacles in transmitting the input signals to the

receiver of a communications system in their original state. These distortions, namely ISI and

noise, are caused in the channel and channel equalization is needed in order to transmit the

signals as accurately as possible. Even though both linear and nonlinear equalizers can be

used for this purpose, nonlinear equalizers are preferred because they are capable
of compensating for both linear and nonlinear channel distortions effectively.

Two types of equalization are used: sequence estimation and symbol detection. In this thesis,
the symbol detection technique is used to realize the adaptive channel equalization. This
technique maps the input baseband signal onto a feature space determined by a learnt
representation of the properties of the transmitted signal. The symbols are then separated by
decision regions, which serve to classify the distorted signal.

The ISI problem, which affects all digital communication systems, is mainly caused by restricted
bandwidth. When rectangular multilevel pulses are filtered improperly as they pass through a
band-limited communication system, they spread in time and are smeared into adjacent time slots,
causing ISI [2]. This ISI in turn causes errors when transmitting data over the channel.
Additionally, channel characteristics play a significant role in causing distortions; the channel
response is time-variant and the channel characteristics are not known in advance. Equalizers
must therefore be designed to adjust themselves to the channel response and to track its
variations over time. Such equalizers are called adaptive equalizers and they have received
great attention because of these capabilities. In practice, for example, when the channel
consists of dial-up telephone lines, the channel transfer function changes from call to call. In
such a case, the equalizer should be an adaptive filter.

Adaptive equalizers are categorized as supervised and unsupervised equalizers. When it is

necessary to use a training sequence because of the unpredictable channel characteristics in a

communications system, supervised equalizers are employed. The training sequence allows the
equalizer output to be compared with the known transmitted input so that the parameters of the
equalizer can be updated. On the other hand, some communications systems do not allow a training
sequence to be transmitted. This is when unsupervised equalization is employed. This

equalization that involves a self-recovery method is also referred to as blind equalization [5].
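Supervised adaptation with a training sequence can be sketched with the least mean square (LMS) algorithm, used here purely as a familiar example of a supervised update. The channel, noise level, tap count and step size are hypothetical values, not parameters from the thesis.

```python
import random

def lms_train(received, desired, num_taps, step):
    """Adapt FIR equalizer taps with LMS; return (taps, squared errors)."""
    taps = [0.0] * num_taps
    squared_errors = []
    for n in range(len(received)):
        # Regressor: current and past received samples (zeros before start).
        x = [received[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(w * xi for w, xi in zip(taps, x))
        # Supervised error: known training symbol minus equalizer output.
        e = desired[n] - y
        taps = [w + step * e * xi for w, xi in zip(taps, x)]
        squared_errors.append(e * e)
    return taps, squared_errors

rng = random.Random(1)
training = [rng.choice((-1.0, 1.0)) for _ in range(2000)]
# Assumed channel 1 + 0.5 z^-1 with a small amount of Gaussian noise.
received = [training[n] + 0.5 * (training[n - 1] if n > 0 else 0.0)
            + rng.gauss(0.0, 0.05) for n in range(len(training))]

taps, squared_errors = lms_train(received, training, num_taps=5, step=0.02)
early_mse = sum(squared_errors[:100]) / 100
late_mse = sum(squared_errors[-100:]) / 100
```

The drop from `early_mse` to `late_mse` reflects the equalizer learning the channel from the training symbols; a blind equalizer has no such reference and must rely on signal statistics instead.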

Supervised equalization can be realized by either sequence estimation or symbol detection.
Instead of decoding each received symbol on its own, a sequence estimator tests the possible
data sequences and selects the most likely sequence as the output [4]. This sequence estimator
is also referred to as the maximum likelihood sequence estimator (MLSE).

Unsupervised or blind equalization is used when the signal has no memory, i.e. the signals
transmitted in successive symbol intervals are independent. In this case, each transmitted
symbol is detected separately. The constant modulus algorithm (CMA), introduced by Godard [6] and
Treichler [7], is a highly significant algorithm for blind equalization. Its robustness and its
capability of converging before phase recovery made this algorithm very successful [5]. Another
algorithm, the multimodulus algorithm (MMA) [8,9], improves on CMA since it provides a low
steady-state mean-squared error (MSE) and removes the need for phase recovery in steady-state
operation [9]. Additionally, hybrid blind equalization algorithms form a further class of blind
equalization algorithms that combine or augment existing cost functions to attain improved

performance [5].
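A minimal sketch of the CMA tap update is given below for unit-modulus 4-QAM symbols. It assumes the common cost function $E[(|y|^2 - R)^2]$ with $R = 1$, a single-spike tap initialization, and a hypothetical noiseless channel; none of these choices is taken from the thesis.

```python
import random

def cma_equalize(received, num_taps, step, modulus=1.0):
    """Blind CMA adaptation; returns (taps, equalizer outputs)."""
    taps = [0j] * num_taps
    taps[0] = 1 + 0j                      # spike initialization (assumption)
    outputs = []
    for n in range(len(received)):
        x = [received[n - k] if n - k >= 0 else 0j for k in range(num_taps)]
        y = sum(w * xi for w, xi in zip(taps, x))
        # CMA error term: vanishes when |y|^2 equals the target modulus.
        e = y * (abs(y) ** 2 - modulus)
        # Stochastic gradient step; no training symbols are used.
        taps = [w - step * e * xi.conjugate() for w, xi in zip(taps, x)]
        outputs.append(y)
    return taps, outputs

def dispersion(samples, modulus=1.0):
    """Average squared deviation of |sample|^2 from the modulus."""
    return sum((abs(s) ** 2 - modulus) ** 2 for s in samples) / len(samples)

rng = random.Random(3)
a = 2 ** -0.5
symbols = [complex(rng.choice((-a, a)), rng.choice((-a, a))) for _ in range(6000)]
# Assumed channel 1 + 0.3 z^-1, no noise.
received = [symbols[n] + 0.3 * (symbols[n - 1] if n > 0 else 0j)
            for n in range(len(symbols))]

taps, outputs = cma_equalize(received, num_taps=5, step=0.01)
before = dispersion(received[-500:])
after = dispersion(outputs[-500:])
```

The update depends only on the modulus of the output, which is why CMA can converge before carrier phase recovery; the equalized constellation may still carry an arbitrary phase rotation.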

Nonlinear equalizers are considered significant among signal processing techniques due to both
their superior performance and their improved features compared with linear equalizers, in
addition to the wide variety they offer. One of those features is the ability to form nonlinear
decision boundaries; the Bayesian equalizer serves as the reference for the performance of these
equalizers. Decision Feedback Equalizers (DFEs) are one class of nonlinear equalizers with
relatively improved performance. The basis of decision feedback equalization is that once an
information symbol has been detected and decided upon, the ISI that it induces on future symbols
can be estimated and cancelled [4]. The DFE can be realized in either a direct transversal or a
lattice structure. The direct form is made up of a feedforward filter (FFF) and a feedback filter
(FBF). The FBF is driven by the decisions at the output of a detector located between the FFF and
the FBF, and its coefficients are adjusted to cancel the ISI that past detected symbols cause on
the current symbol.
The remarkable feature of the DFE is its superiority over the linear transversal equalizer (LTE),
which is the most common equalizer structure. This superiority is due to its smaller minimum mean
squared error (MMSE), the basic performance criterion of the DFE. When the channel is severely
distorted or exhibits nulls in its spectrum, the performance of an LTE degrades and the MMSE of
the DFE is considerably better than that of the LTE.
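The direct-form DFE structure can be sketched with fixed (non-adaptive) coefficients. The channel $1 + 0.5 z^{-1}$, the single feedforward tap and the single feedback tap are assumptions chosen so that the feedback filter cancels the postcursor ISI exactly; an adaptive DFE would learn these coefficients instead.

```python
import random

def dfe_detect(received, ff_taps, fb_taps):
    """Direct-form DFE for +/-1 symbols with fixed FFF and FBF coefficients."""
    decisions = []
    for n in range(len(received)):
        # Feedforward filter operates on received samples.
        ff = sum(w * received[n - k]
                 for k, w in enumerate(ff_taps) if n - k >= 0)
        # Feedback filter operates on past decisions only.
        fb = sum(b * decisions[n - 1 - k]
                 for k, b in enumerate(fb_taps) if n - 1 - k >= 0)
        z = ff - fb                       # ISI from past symbols removed
        decisions.append(1.0 if z >= 0 else -1.0)   # detector (slicer)
    return decisions

rng = random.Random(4)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(200)]
# Assumed noiseless channel: each symbol leaks 0.5 of itself into the next slot.
received = [symbols[n] + 0.5 * (symbols[n - 1] if n > 0 else 0.0)
            for n in range(len(symbols))]

decisions = dfe_detect(received, ff_taps=[1.0], fb_taps=[0.5])
```

Because the feedback path uses decisions rather than noisy samples, it removes postcursor ISI without amplifying noise, although a wrong decision can propagate into later symbols.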

The goal in designing a communications system is to transmit information to the receiver with

as little deterioration as possible while satisfying the design constraints of allowed
signal bandwidth, transmitted energy and cost. In digital communications systems, the
probability of bit error (Pe), also called the bit error rate (BER), is generally taken as the
measure of degradation and performance. In analog communications systems, the signal-to-noise
ratio (SNR) at the receiver output is generally the performance criterion. In nonlinear channel
equalization it is important to attain a low mean square error (MSE) and a high convergence rate
in addition to a low BER. Training sequences are also an important factor in determining the
efficiency of a communications system. They are intended to be as

short as possible which requires the adaptation process to end in as few iterations as possible.
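The two numerical criteria mentioned above can be computed as in the following sketch. The function names and sample data are hypothetical.

```python
def bit_error_rate(sent_bits, detected_bits):
    """Fraction of bit positions where the detected bit differs from the sent bit."""
    if len(sent_bits) != len(detected_bits):
        raise ValueError("bit sequences must have equal length")
    errors = sum(1 for s, d in zip(sent_bits, detected_bits) if s != d)
    return errors / len(sent_bits)

def mean_square_error(desired, actual):
    """Average squared difference between desired and actual equalizer outputs."""
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)

sent = [0, 1, 1, 0, 1, 0, 0, 1]
detected = [0, 1, 0, 0, 1, 0, 1, 1]      # two of the eight bits are wrong
ber = bit_error_rate(sent, detected)
mse = mean_square_error([1.0, -1.0], [0.5, -1.0])
```

In a simulation the BER would be estimated over many more bits and repeated for each SNR value to produce a BER versus SNR curve.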

The application of linear equalizers to nonlinear channels does not yield the desired BER

performance since they are based on linear system theory and are used for equalization of

linear channels. Recently, neural networks and fuzzy technology have evolved into a powerful

tool in the equalization of nonlinear channel distortions.

1.4 State of Application of Neural Networks and Fuzzy Technologies for Channel

Equalization

1.4.1 Design of neural network based equalizers

Nonlinear equalizers are capable of compensating for both nonlinear and linear channel

distortion. Adaptive nonlinear equalizers implementing neural network models have been used
extensively, primarily for noise cancellation, in various applications. A multilayer perceptron
(MLP) is one of the neural network structures used in neural network based equalizers. MLP
networks are feedforward neural networks having one or more layers

of neurons, known as hidden neurons that are between the input and output neurons.
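The forward pass of an MLP with a single hidden layer can be sketched as follows. The tanh activation, the linear output neuron and all weight values are assumptions for illustration; the thesis does not specify them at this point.

```python
import math

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """One-hidden-layer MLP: tanh hidden neurons, single linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    output = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return output, hidden

# Hypothetical weights: two inputs, two hidden neurons, one output.
x = [0.5, -0.5]
hidden_w = [[1.0, 1.0], [1.0, -1.0]]
hidden_b = [0.0, 0.0]
out_w = [2.0, 1.0]
out_b = 0.1

output, hidden = mlp_forward(x, hidden_w, hidden_b, out_w, out_b)
```

In an equalizer the input vector would hold the current and past received samples, and the weights would be adapted by a learning algorithm rather than fixed by hand.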

Filtering is the process of changing the relative amplitudes of the frequency components in a

signal or eliminating some frequency components completely in a variety of applications [10].

Assigning k information bits to the $M = 2^k$ possible signal amplitudes, which can be carried
out in a number of ways, is called mapping or transformation. Generally, nonlinear equalization
includes a channel estimator since the channel information is not available at the receiver end
[12]. Filtering comprises two estimation procedures: the estimation of the mapping from the
available samples, and the estimation of the filter output from the input by realizing this
mapping [11]. The mapping is more difficult to realize for a nonlinear filter than for a linear
filter, and research on realizing the mapping of nonlinear filters effectively is ongoing.

1.4.2 Channel equalization by using fuzzy logic

Adaptive equalizers for nonlinear channels can be developed in a variety of effective ways.
Bayes’ probability theory [13] yields the optimal solution for a symbol-decision equalizer, which
is referred to as the Bayesian equalizer. Symbol-decision equalizers are simple and
computationally less complex than the MLSE, and they do not always require a channel estimate.
They are usually treated as inverse filters [14] and are implemented as adaptive filters using
algorithms such as recursive least squares (RLS) or least mean squares (LMS). The adaptive filter
approximates the channel inverse and provides a linear decision boundary. In general, however, an
optimal equalizer requires a decision function that is inherently nonlinear. From this
perspective, equalization is regarded as a nonlinear classification problem, and for this reason
the performance of linear equalizers falls short of the optimum. This has motivated the search
for nonlinear equalizers that provide a nonlinear decision function. Nonlinear equalizers
employing artificial neural networks (ANNs) [15], [16], [17] and radial basis function (RBF)
networks [15], [18], [19] were successfully developed, and were shown to provide superior
performance to linear equalizers for channels corrupted with ISI and AWGN [20]. However, the ANN
equalizers suffered from poor convergence, and the RBF equalizers, although they provided the
localized functional behavior required by the optimal equalizer, had centers that were difficult
to train. These drawbacks prompted the search for alternative nonlinear equalization techniques.
As a result, a fuzzy equalizer based on a fuzzy adaptive filter was suggested in [21] and an
equalizer based on a fuzzy system was offered in [22]. These equalizers were found to perform
well, but they could not realize the Bayesian equalizer decision function; in addition, the
equalizer based on the fuzzy adaptive filter demanded high computational complexity.

Fuzzy logic is based on fuzzy rules that use input-output data pairs of the channel. This type
of adaptive equalizer operates by processing numerical data and linguistic information. A fuzzy
equalizer depends on fuzzy IF-THEN rules which are determined by human experts. These rules use
the channel’s input-output data pairs and are used to construct the filter for a nonlinear
channel. The bit error rate (BER) and adaptation speed can be improved by combining the linguistic
and numerical information.

Digital communications involving quadrature amplitude modulation (QAM) can apply the

fuzzy filter with both linear and nonlinear channel characteristics as has been achieved in this

thesis. The present study proposes a complex fuzzy adaptive filter with changeable fuzzy IF-THEN
rules, which is an extension of the real fuzzy filter. The overall inputs and outputs of the
filter are complex valued. However, the channel inputs are real-valued counterparts of the
modulated complex transmitted signals, and the equalizer outputs are real-valued estimates of
these channel inputs. Afterwards, the normalized real-valued equalizer outputs are denormalized
to form the final complex-valued, equalized outputs of the receiver. This technique, which is
primarily based on normalization applied directly at the transmitter, presents a new method to
successfully equalize complex-valued QAM signals that are severely distorted in both linear and
hostile time-varying nonlinear channel environments, by using real-valued counterparts of the
signals in question. In addition to the methodology, the membership functions derived from the
training data set and the gradient-descent learning algorithm trained on that data set represent
a significant element of the nonlinear neuro-fuzzy equalizer that is capable of this adaptive
channel equalization. Its superiority rests not only on its high equalization performance but
also on its capability of minimizing or eliminating the nonlinear channel distortions that linear
equalizers in general cannot handle. In turn, fuzzy logic based neuro-fuzzy equalization is shown
to be efficient on a complex scheme such as QAM, with high approximation ability in nonlinear
problems in addition to linear ones.

A fuzzy adaptive filter is based on a set of fuzzy IF-THEN rules that change adaptively in order
to minimize some criterion function as new information becomes available [35]. The fuzzy adaptive
filter considered here uses a recursive least squares (RLS) adaptation algorithm.
The construction of the RLS fuzzy adaptive filter is accomplished by the following four steps:

1) Defining fuzzy sets in the filter input space $U \subset \mathbb{R}^n$ whose membership functions cover $U$;

2) Constructing a set of fuzzy IF-THEN rules that either human experts determine or the

adaptation procedure determines by matching input-output data pairs;

3) Constructing a filter that is based on the set of rules; and,

4) Updating the filter’s free parameters by utilizing the RLS algorithm.
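The four steps above can be sketched for a one-dimensional input. Everything specific here is an assumption made for illustration: Gaussian membership functions with hypothetical centres and width, one rule per fuzzy set of the form "IF x is F_l THEN the output is theta_l", a normalized weighted-average filter, and a standard RLS update of the rule consequents. It is not presented as the exact filter of [35].

```python
import math
import random

CENTERS = [0.0, 0.5, 1.0]    # step 1: hypothetical fuzzy-set centres on U = [0, 1]
SIGMA = 0.25                 # hypothetical membership width

def fuzzy_basis(x):
    """Steps 2-3: normalized rule activations; filter output is theta . basis."""
    mu = [math.exp(-((x - c) / SIGMA) ** 2) for c in CENTERS]
    total = sum(mu)
    return [m / total for m in mu]

def rls_step(theta, P, p, desired, lam=1.0):
    """Step 4: one RLS update of the rule consequents theta; returns (theta, P)."""
    size = len(p)
    Pp = [sum(P[i][j] * p[j] for j in range(size)) for i in range(size)]
    denom = lam + sum(pi * v for pi, v in zip(p, Pp))
    gain = [v / denom for v in Pp]
    err = desired - sum(t * pi for t, pi in zip(theta, p))
    theta = [t + g * err for t, g in zip(theta, gain)]
    # P <- (P - gain (P p)^T) / lam; P is symmetric, so p^T P equals (P p)^T.
    P = [[(P[i][j] - gain[i] * Pp[j]) / lam for j in range(size)]
         for i in range(size)]
    return theta, P

rng = random.Random(2)
true_theta = [-1.0, 0.3, 1.0]            # consequents of a hypothetical target system
theta = [0.0, 0.0, 0.0]
P = [[1000.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for _ in range(300):
    x = rng.random()
    p = fuzzy_basis(x)
    d = sum(t * pi for t, pi in zip(true_theta, p))   # noise-free desired output
    theta, P = rls_step(theta, P, p, d)
```

Linguistic knowledge from an expert would enter through the initial value of `theta` (rules stated in advance), while the input-output pairs refine it through the RLS updates.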

The fuzzy adaptive filter’s main advantage is the possibility of integrating linguistic
information (in the form of fuzzy IF-THEN rules) and numerical information (in the form of
input-output pairs) into the filter uniformly. When the fuzzy adaptive filter is applied to
equalization problems for nonlinear communication channels, the following fundamental
observations, including differences between RLS and LMS, are reached:

1) The RLS algorithm converges faster than the LMS algorithm.

2) Incorporating some linguistic description of the channel, expressed in fuzzy terms, into the
fuzzy adaptive filter greatly enhances the adaptation speed of RLS.

3) The fuzzy equalizer’s bit error rate is very close to the bit error rate of the optimal
equalizer.

4) The excess mean-square error of the RLS algorithm tends to zero as the number of iterations
approaches infinity.

The development of a neuro-fuzzy system to equalize channel distortion includes the
following steps:

-First, the methodologies utilized to equalize channel distortions are analyzed and state of

application problems of neural and fuzzy technologies for the development of an equalizer is

considered.

-Second, the data transmission structure is explained and the operation structure of adaptive

channel equalization utilizing neuro-fuzzy network is presented.

-Third, the mathematical model of the neuro-fuzzy network for the development of

equalization system for channel distortion is presented. The learning algorithm of neuro-

fuzzy system is considered.

-Fourth, the development of the neuro-fuzzy equalizer for channel distortion is presented.

-Fifth, the QAM signaling is explained and its application on nonlinear neuro-fuzzy network

is presented. The simulation results of the equalizer using QAM signals and analytical tables

demonstrating the performance of the equalizer are presented. Additionally, tables comparing

the different QAM constellations are presented.

1.5 Summary

In this chapter, the application of channel equalization is explained. The types of distortions in

channels and the types of equalizers used to minimize them are explained with their

classifications and properties. Performance criteria of equalizers, namely bit error rate (BER),

signal-to-noise ratio (SNR) and convergence rate with their ideal indications are stated.

Neural networks and fuzzy logic are particularly discussed and explained with their structures

and features. The methods of equalization using neural networks, specifically filtering, are
described. Different types of algorithms, networks and equalizers used especially for difficult

nonlinear channels are defined.

Fuzzy IF-THEN rules which constitute the basis of fuzzy logic are described to point out their

significance in channel equalization. The steps of constructing a fuzzy adaptive filter using

these rules are defined. The methods used in equalizing QAM signals applied on neuro-fuzzy

network and the gradient-descent learning algorithm as part of the equalization system are

described as well.

CHAPTER 2

STRUCTURE OF CHANNEL EQUALIZATION

2.1 Overview

All communications systems are composed of three fundamental subsystems: transmitter, channel
and receiver (Fig. 2.1). The transmitter’s task is to convert the information signal into a form
convenient for transmission and to send it through the physical channel or transmission medium.
The receiver’s task, on the other hand, is to produce an accurate replica of the transmitted
symbol sequence by recovering the message signal contained in the received signal. The
communications channel connects the transmitter and the receiver, carrying the electrical signal
from the transmitter to the receiver.

The unknown channel characteristics cause distortions to the transmitted signal before it

reaches the receiver.

Figure 2.1 Basic components of a communications system

Digital communications systems are preferred over analog ones due to the increasing demand for
data communication and because digital transmission provides data processing options and
flexibilities that analog transmission cannot offer. The distinguishing feature of a digital
communications system is that it sends a waveform from a finite set of possible waveforms during
a finite interval of time, as opposed to an analog communication system, which transmits a
waveform from an unlimited number of possible waveforms with theoretically infinite resolution.
The message from the source, which is represented by an information waveform, is encoded before
transmission so that transmission errors can be detected and corrected by the receiver. At the
receiver end, the message signal must be decoded before being used. The distortions preventing
the correct transmission of signals are mainly intersymbol interference (ISI) and noise. Noise
refers to unwanted electrical signals which exist in electrical systems. Channel equalization is
an efficient technique

employed to reduce or eliminate the obscuring effect of distortion caused in the channel. This

chapter outlines the structure of data transmission system and the functions of its main

components as well as the equalization of channel distortion.

2.2 Architecture of Data Transmission Systems

A communications channel is an electrical medium which connects the transmitter and the

receiver, providing the data transmission from a source which generates the information to

one or more destinations. In the analysis and design of communication systems, the

characteristics of the physical channels through which the information is transmitted are of
particular importance. Wire lines or free space may be used in the communications path from
the transmitter to the receiver. Examples of wire lines are coaxial cables, wire pairs and
optical fibers. These are widely used in terrestrial telephone networks, although infrared and
optical free-space links, such as those in video links, remote controls for TV and hi-fi
equipment and some security systems, may be used in other situations. The transmission medium
is where most of the attenuation and noise is observed [23].

The receiver functions to reverse the signal processing steps performed by the transmitter

recovering the original message signal by compensating for any signal deteriorations caused

by the channel. This involves amplification, filtering, demodulation and decoding and in

general is a more complex task than the transmitting process.

There are many reasons why digital communication systems (DCSs) are preferred over analog
systems, even though DCSs represent an increase in complexity over the equivalent analog
systems. The principal advantages that make DCSs the preferred option can be listed as:

1. The ease with which digital signals, compared with analog signals, are regenerated.

2. Digital systems are not as prone to distortion and interference as analog systems.

3. Increased demand for data transmission.

4. Increased scale of integration, sophistication and reliability of digital electronics for

signal processing, combined with decreased cost.

5. Facility to source code for data compression.

6. Possibility of channel coding (line and error control coding) to minimize the effects of

noise and interference.

7. Ease with which bandwidth, power and time can be traded off in order to optimize the

use of these limited resources.

8. Standardization of signals, irrespective of their type, origin or the services they

support, leading to an integrated services digital network (ISDN).

9. Digital hardware can be implemented more flexibly than analog hardware.

10. Various types of digital signals such as data, telephone, TV and telegraph can be

considered identical signals in transmission and switching [24].

Modulation, which is part of the transmission and equalization process, involves encoding

information from a message source in a way that is convenient for transmission. It is

accomplished by translating a baseband message signal to a bandpass signal at frequencies

which are quite high when compared with the frequency of baseband. It is also referred to as

the mapping of the baseband input information waveform into the bandpass signal. The

bandpass signal is referred to as the modulated signal and the baseband message signal is

referred to as the modulating signal. Modulation can be accomplished by varying the

frequency, phase or amplitude of a high frequency carrier in conformity with the amplitude of

the message signal. Demodulation, on the other hand, is the process of extracting the

baseband message from the carrier so that the intended receiver (also known as the sink) can
process and interpret it. In digital wireless communication systems, the modulating signal can
be represented as a time sequence of pulses or symbols, where each symbol has m finite states.
Each symbol represents n bits of information, where $n = \log_2 m$ bits/symbol [4].
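The relation between the number of symbol states and the bits carried per symbol can be illustrated with a Gray-coded 16-QAM mapper. The bit-to-level table below is one common Gray assignment and is an assumption; the mapping actually used in the thesis is not given in this section.

```python
import math

# Hypothetical Gray assignment: neighbouring levels differ in exactly one bit.
GRAY_TO_LEVEL = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def bits_per_symbol(m):
    """n = log2(m) bits carried by a symbol with m states."""
    return int(math.log2(m))

def map_16qam(bits):
    """Map groups of 4 bits to complex symbols: 2 bits for I, 2 bits for Q."""
    n = bits_per_symbol(16)
    if len(bits) % n != 0:
        raise ValueError("bit count must be a multiple of 4")
    symbols = []
    for start in range(0, len(bits), n):
        b = bits[start:start + n]
        in_phase = GRAY_TO_LEVEL[(b[0], b[1])]
        quadrature = GRAY_TO_LEVEL[(b[2], b[3])]
        symbols.append(complex(in_phase, quadrature))
    return symbols

qam_symbols = map_16qam([0, 0, 1, 0, 1, 1, 0, 1])
```

With Gray coding, a decision error between adjacent levels corrupts only one of the bits carried by the symbol.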

The block diagram illustrated in Fig. 2.2 can describe communications systems. The source of

data is the signal generator that produces the information to be transmitted and modulated.

This information is in the form of a message symbol that can consist of a single bit or a

grouping of bits.

In order to make the transmission more efficient in terms of the time it takes and/or the
bandwidth it requires, an encoder is employed as a signal processor that converts the digital
information from the source into binary form, i.e. each symbol is encoded as a binary word.
Encoding is also performed to enable the signal processor in the receiver to detect and correct
errors, thereby minimizing and/or eliminating the bit errors caused by noise in the channel.

The procedure used for detecting and correcting errors is called coding. Coding involves

adding redundant (extra) bits to the stream of data. The redundant bits, such as parity bits,

are used by the decoder to correct errors at the receiver output, although a high degree of

redundancy may increase the bandwidth of the encoded signal. Codes can be classified into

two broad categories: block codes and convolutional codes. The main difference is that a

block coder is a memoryless device, whereas a coder with memory produces a convolutional

code. Hamming Codes, Golay Codes, Hadamard Codes, Cyclic Codes, BCH

(Bose-Chaudhuri-Hocquenghem) Codes and Reed-Solomon Codes are some examples of block

codes. In addition to block codes and convolutional codes, a newer family of codes, called

turbo codes, has come into use and is being incorporated in 3G wireless standards. Turbo

codes combine the capabilities of convolutional codes with channel estimation theory and can

be thought of as nested or parallel convolutional codes. When implemented properly, turbo

codes allow coding gains which are far superior to all previous error correcting codes and

permit a wireless communications link to come remarkably close to realizing the Shannon

capacity bound [4].

[Figure 2.2 Architecture of a digital communications system [39]. Transmitter: data source,

encoder, filter, modulator. Physical channel with AWGN. Receiver: demodulator, filter,

equalizer, decision device, decoder.]

Each digital word has n binary digits, so there are M = 2^n possible unique code words, each

corresponding to a certain amplitude level. However, each sample value from the analog signal

can be any one of an infinite number of levels, so the digital word that represents the

amplitude closest to the actual sampled value is used. This is known as quantizing [2]. In

this thesis study, Gray coding was used to map bits along the in-phase and quadrature axes of

the QAM constellation. The Gray code was selected because only one bit changes for each step

change in the quantized level. Multisymbol signaling can be thought of as a coding or bit

mapping process in which n binary symbols (bits) are mapped into a single M-ary symbol. A

detection error in a single symbol can therefore translate into several errors in the

corresponding decoded bit sequence. The bit error rate (BER) therefore depends not only on

the probability of symbol error and the symbol entropy but also on the code or bit mapping

used and the types of error which occur. The most probable type of error is the detection of

a symbol as one of its nearest neighbours; if a Gray code is used to map binary symbols to

phasor states, this type of error results in only a single decoded bit error [23].

Consequently, single errors in the receiver will cause minimal errors in the recovered level.
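The single-bit-change property of the Gray code can be illustrated with a short sketch. It assumes the common binary-reflected Gray code and a hypothetical axis with four amplitude levels (as on one axis of a 16-QAM constellation); the function names are illustrative and are not taken from the thesis.

```python
def gray_encode(value: int) -> int:
    """Return the binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_decode(code: int) -> int:
    """Invert gray_encode by XOR-ing together all right shifts of the code."""
    value = 0
    while code:
        value ^= code
        code >>= 1
    return value


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Assumption: 4 amplitude levels per axis (2 bits per axis).
LEVELS = 4
labels = [gray_encode(level) for level in range(LEVELS)]

# Number of bits that change between neighbouring amplitude levels.
adjacent_bit_changes = [
    hamming_distance(labels[i], labels[i + 1]) for i in range(LEVELS - 1)
]

if __name__ == "__main__":
    for level, label in enumerate(labels):
        print(f"level {level} -> bits {label:02b}")
    print("bit changes between neighbours:", adjacent_bit_changes)
```

With this labelling, mistaking a level for its neighbour corrupts exactly one of the bits carried on that axis.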

There are many criteria for evaluating the performance of a communications system. For

digital systems, the optimum system is the one that minimizes the bit error rate (BER) at the

receiver output subject to constraints on channel bandwidth and transmitted energy. This

raises the question of whether a system can be devised with no bit errors at the output even

when there is noise in the channel. Shannon demonstrated in 1948 that a channel capacity C

(bits/s) can be calculated such that, if the rate of information is less than C, the

probability of error approaches zero.

In this case, the maximum possible bandwidth efficiency $\eta_{B\max}$, which is defined as the

capability of a modulation scheme to accommodate data within a limited bandwidth, is

restricted by the channel noise and is stated by the channel capacity formula in Eq. 2.1.

Shannon’s channel capacity formula is applicable to AWGN and is given by

$$\eta_{B\max} = \frac{C}{B} = \log_2\left(1 + \frac{S}{N}\right) \quad \text{or} \quad C = B \log_2\left(1 + \frac{S}{N}\right) \qquad (2.1)$$

in which C is the channel capacity (bits per second), B is the transmission bandwidth (Hz), S

is the average signal power and N is the average power of the white Gaussian noise within the

bandwidth B. S/N is called the signal-to-noise ratio. Shannon also showed that the errors

induced by a noisy channel can be reduced to any desired level by encoding the information

properly, without sacrificing the rate of information transfer.
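As a numerical illustration of Eq. 2.1, the following sketch evaluates the capacity and the maximum bandwidth efficiency. The bandwidth of 3 kHz and the signal-to-noise ratio of 30 dB are assumed values chosen only for the example, and the function names are hypothetical.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a power ratio from decibels to a linear ratio."""
    return 10.0 ** (snr_db / 10.0)


def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity C = B * log2(1 + S/N) in bits per second (Eq. 2.1)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# Assumed example values, not taken from the thesis.
bandwidth_hz = 3000.0
snr_db = 30.0

capacity = shannon_capacity(bandwidth_hz, db_to_linear(snr_db))
efficiency = capacity / bandwidth_hz  # maximum bandwidth efficiency, bits/s/Hz

if __name__ == "__main__":
    print(f"C = {capacity:.1f} bits/s")
    print(f"C/B = {efficiency:.3f} bits/s/Hz")
```

Under these assumptions the capacity is a little below 30 kbits/s, i.e. just under 10 bits/s/Hz.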

The physical medium, or channel, through which the message signal is transmitted introduces

distortions such as intersymbol interference (ISI) and noise. The receiver, on the other

hand, is responsible for separating the source information from the received modulated

signal, which is corrupted by noise that is usually modeled as random additive white Gaussian

noise (AWGN). The receiver takes the corrupted signal at the output of the channel and

converts it to a baseband signal that the baseband processor can handle. The baseband

processor removes or reduces the corruption in this signal and delivers an estimate of the

source information to the output of the communications system [2]. Demodulation is applied

to the received signal in order to recover the transmitted signal in its baseband form and

make it ready to be processed by the receiver filter. Finally, the decision device

reconstructs the encoded message signal from the output of the equalizer, and the decoder

reconstructs the sequence of transmitted symbols by performing the inverse operation of the

encoder.

2.3 Channel Characteristics

Each channel has a frequency band that is appropriate for its transmission medium, and the

transmitter carrier circuit converts the processed baseband signal into this frequency band.

If the channel is a fiber-optic cable, the carrier circuits convert the baseband input to

light frequencies and the transmitted signal is light.

Channels are classified as wire and wireless channels. Examples of wire channels are coaxial

cables, fiber-optic cables, twisted-pair telephone lines and waveguides, whereas air, vacuum

and seawater are examples of wireless channels.

Channels may introduce constraints that favor a particular type of signaling. Generally, the

channel attenuates the signal so that the channel noise, or the noise introduced by an

imperfect receiver, degrades the delivered information relative to that of the source [2].

Noise has various sources: natural electrical disturbances such as lightning, and artificial

sources such as ignition systems in cars, switching circuits in a digital computer or high

voltage transmission lines. The channel may include amplifying devices, such as satellite

transponders in space communication systems or repeaters in telephone systems, that keep the

signal above the noise level. In addition to noise, a channel may contain multiple paths

between its input and output, each with its own attenuation characteristics and time delay.

The attenuation characteristics may vary with time, which makes the signal fade at the

channel output. Fading of that type can be observed while listening to distant shortwave

stations.

Another significant characteristic of channels is bandwidth. In general terms, bandwidth is

defined as the width of a positive frequency band, for waveforms whose magnitude spectra are

even about the origin f = 0. The bandwidth of a channel must be large enough to accommodate

the signal but reject the noise. A high bandwidth allows more users to be assigned as well as

more information to be transmitted. Telephone channels and digital microwave radio channels

are examples of band limited channels. When the channel is band limited to W Hz, any

frequency components above W will not be passed by the channel. In turn, the bandwidth of the

transmitted signal will be limited to W Hz as well. When the channel is not ideal within

|f| ≤ W, signal transmission at a symbol rate equal to or exceeding W results in intersymbol

interference (ISI) among a number of adjacent symbols. In addition to telephone channels,

there are other physical channels which exhibit some form of time dispersion and thus

introduce ISI. Radio channels such as shortwave ionospheric propagation (HF) and tropospheric

scatter are two examples of time-dispersive channels. In these channels, time dispersion, and

hence ISI, is the consequence of multiple propagation paths that have different path delays

[1]. In addition to noise, multipath propagation and ISI, channels exhibit other

impairments, specifically nonlinear distortion, frequency offset and phase jitter. Channel

impairments affect the transmission rate over the channel and the modulation technique to be

used. Depending on the rate, bandwidth efficient modulation techniques and some form of

equalization are employed accordingly.

2.4 Channel Distortions

Channels which are used to transmit data distort signals in both amplitude and phase. In

addition to the nature of the channel itself, linear distortion, nonlinear distortion and

frequency offset contribute significantly to these distortions.

Linear distortion occurs in linear time-invariant systems in which channels are characterized

as band-limited linear filters. Such channels, for example telephone channels, are part of

digital communications systems where distortionless transmission is highly desired. A linear

time-invariant system can produce two types of linear distortion: amplitude distortion and

phase distortion. For distortionless transmission through a linear time-invariant system, the

transfer function of the channel must be given by

$$H(f) = \frac{Y(f)}{X(f)} = A e^{-j 2\pi f T_d} \qquad (2.2)$$

which means that in order to have no distortion at the system output, the following

requirements have to be met:

1. Flat amplitude response. That is,

$$|H(f)| = \text{constant} = A \qquad (2.3a)$$

2. A phase response that is a linear function of frequency. That is,

$$\theta(f) = \angle H(f) = -2\pi f T_d \qquad (2.3b)$$

When the first condition is satisfied, no amplitude distortion exists and when the second

condition is satisfied, no phase distortion exists. The second requirement is related to the

time delay of the system, which is defined as

$$T_d(f) = -\frac{1}{2\pi f}\,\theta(f) = -\frac{1}{2\pi f}\,\angle H(f) \qquad (2.4)$$

and it is required that

$$T_d(f) = \text{constant} \qquad (2.5)$$

for distortionless transmission. If $T_d(f)$ is not constant, there is phase distortion since the

phase response, $\theta(f)$, is not a linear function of frequency.
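The conditions of Eqs. 2.3a to 2.5 can be checked numerically. The sketch below evaluates the time delay of Eq. 2.4 for an ideal channel of the form of Eq. 2.2 and for a hypothetical channel with an extra quadratic phase term. The gain, delay, phase coefficient and test frequencies are assumptions chosen so that the phase stays within (-π, π] and no phase unwrapping is needed.

```python
import cmath
import math

A = 0.5      # assumed flat gain
TD = 1e-3    # assumed constant delay in seconds


def ideal_channel(f_hz: float) -> complex:
    """H(f) = A * exp(-j*2*pi*f*Td), the distortionless form of Eq. 2.2."""
    return A * cmath.exp(-2j * math.pi * f_hz * TD)


def dispersive_channel(f_hz: float) -> complex:
    """Hypothetical channel whose phase is not linear in frequency."""
    phase = 2.0 * math.pi * f_hz * TD + 2e-6 * f_hz * f_hz
    return A * cmath.exp(-1j * phase)


def time_delay(channel, freqs_hz):
    """Evaluate Td(f) = -theta(f) / (2*pi*f) (Eq. 2.4) at each frequency."""
    return [-cmath.phase(channel(f)) / (2.0 * math.pi * f) for f in freqs_hz]


freqs = [50.0, 100.0, 200.0, 400.0]
ideal_delays = time_delay(ideal_channel, freqs)
dispersive_delays = time_delay(dispersive_channel, freqs)

if __name__ == "__main__":
    print("ideal:     ", ideal_delays)
    print("dispersive:", dispersive_delays)
```

The ideal channel yields the same delay at every frequency, whereas the delay of the second channel grows with frequency, which indicates phase distortion even though its amplitude response is flat.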

Nonlinear distortion in telephone channels arises from nonlinearities in amplifiers and

compandors used in the telephone system. This type of distortion is usually small and it is

very difficult to correct [1]. There will be nonlinear distortion in the output signal if the

second- and higher-order voltage gain coefficients are not zero. Three types of nonlinear

distortion are associated with amplifiers: harmonic distortion, intermodulation distortion

(IMD) and cross-modulation distortion. Harmonic distortion appears at the amplifier output as

components at integer multiples of the input frequency, which are produced by the second- and

higher-order terms. Intermodulation distortion is produced by the cross-product terms of the

amplifier input-output equation, whereas cross-modulation distortion is caused by the third

order distortion products of the amplifier output.

In addition to linear and nonlinear distortions, signals transmitted through telephone

channels are subject to the impairment of frequency offset. A small frequency offset, which

is mostly less than 5 Hz, results from the use of carrier equipment in the telephone channel.

High-speed digital transmission systems that use synchronous phase-coherent demodulation

cannot tolerate this type of offset. This offset is compensated for by the carrier recovery

loop in the demodulator.

Phase jitter is basically a low-index frequency modulation of the transmitted signal with the

low frequency harmonics of the power line frequency. Phase jitter poses a serious problem in

high-rate digital transmission. Yet, it can be tracked and compensated for, to some extent,

at the demodulator.

Distortion can occur within the transmitter, the receiver and the channel. As opposed to

noise and interference, distortion disappears when the signal is turned off.

2.4.1 Multipath propagation

Multipath fading occurs to varying extents in many different radio applications. It is caused

whenever radio energy reaches the receiver by more than one path. Multiple paths may occur

due to ground reflections, reflections from stable tropospheric layers and refraction by

tropospheric layers with extreme refractive index gradients [23]. Scattering obstacles also

cause multipath propagation in other systems such as urban cellular radio systems.

There are two principal effects of multipath propagation on systems, their relative severity

depending essentially on the bandwidth of the resulting channel relative to that of the

signal being transmitted. For fixed point systems such as the microwave radio relay network,

the fading process is governed by changes in atmospheric conditions. The path delay spread is

often sufficiently short for the channel frequency response to be essentially constant over

its operating bandwidth. In that case, the fading is considered flat because all signal

frequency components are subject to the same fade at any given instant. If the path delay

spread is longer, the channel frequency response is likely to change rapidly on a frequency

scale comparable to the signal bandwidth. In that case, the fading is considered frequency

selective and the received signal is subject to severe amplitude and phase distortion.

Adaptive equalizers may then be required to flatten and linearize the overall characteristics

of the channel. The effects of flat fading can be combated by increasing the transmitter

power, whereas the effects of frequency selective fading cannot. For microwave links which

are subject to flat fading, a fade margin is usually designed into the link budget to offset

the expected multipath fades. The magnitude of this margin depends on the required

availability of the link.
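The distinction between flat and frequency selective fading can be made concrete with a two-ray channel, i.e. a direct path plus one delayed echo. The sketch below is only an illustration: the two-ray model, the echo gain of 0.5, the 1 MHz bandwidth and the two delay values are all assumptions and do not come from the thesis.

```python
import cmath
import math


def two_ray_gain(f_hz: float, echo_gain: float, delay_s: float) -> float:
    """Magnitude response |1 + a*exp(-j*2*pi*f*tau)| of a two-ray channel."""
    return abs(1.0 + echo_gain * cmath.exp(-2j * math.pi * f_hz * delay_s))


def gain_spread(bandwidth_hz: float, echo_gain: float, delay_s: float,
                points: int = 101) -> float:
    """Difference between the largest and smallest gain across the band."""
    gains = [
        two_ray_gain(i * bandwidth_hz / (points - 1), echo_gain, delay_s)
        for i in range(points)
    ]
    return max(gains) - min(gains)


BANDWIDTH_HZ = 1e6  # assumed signal bandwidth

# Short delay spread: the response is almost constant over the band (flat fading).
flat_spread = gain_spread(BANDWIDTH_HZ, echo_gain=0.5, delay_s=10e-9)

# Long delay spread: the response varies strongly over the band (selective fading).
selective_spread = gain_spread(BANDWIDTH_HZ, echo_gain=0.5, delay_s=1e-6)

if __name__ == "__main__":
    print(f"gain variation, 10 ns echo: {flat_spread:.4f}")
    print(f"gain variation, 1 us echo:  {selective_spread:.4f}")
```

With the 10 ns echo the gain changes by well under one percent over the band, while the 1 µs echo produces a deep notch inside the band, which is the situation in which an equalizer becomes necessary.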

Multiple propagation paths that have different path delays cause time dispersion and ISI in

time-dispersive channels. These channels are called time-variant multipath channels because

the number of paths and the relative time delays among the paths vary with time. The

time-variant multipath conditions give rise to a wide variety of frequency response

characteristics. Consequently, the frequency response characterization that is used for

telephone channels is inappropriate for time-variant multipath channels. Instead, these radio

channels are characterized statistically by the scattering function. The scattering function

is a two-dimensional representation of the average received signal power as a function of

Doppler frequency and relative time delay.

2.4.2 Intersymbol interference

Rectangular pulse signaling, in principle, has a spectral efficiency of 0 bits/s/Hz since each

rectangular pulse has infinite absolute bandwidth. In practice, of course, rectangular pulses

can be transmitted over channels with finite bandwidth if a degree of distortion can be

tolerated.

In digital communications, it might appear that distortion is unimportant since a receiver

must only distinguish between pulses which have been distorted in the same way. However, if

the pulses are filtered improperly as they pass through a communications system, i.e. if the

distortion is severe enough, they will spread in time. The decision instant voltage might

then arise not only from the current symbol but also from one or more preceding pulses.

Intersymbol interference (ISI) is caused when the pulse for each symbol is smeared into

adjacent time slots. With a restricted bandwidth, the pulses have rounded tops instead of

flat ones. What is important about ISI is the decision instant. The decision instant can be

defined as the sampling instant (or sampling point) at which each time slot of the

transmitted or received waveform begins. It is at this point that ISI matters, because of the

smearing effect of the pulse. This smearing causes unwanted contributions from the adjacent

pulses that are likely to degrade bit error rate (BER) performance. The decision instant

shows an important point: the performance of digital communications systems is affected only

by ISI at the decision instants. If ISI occurs at times that are not decision instants, it

does not matter [23].

If the signal pulses could be persuaded to pass through zero (i.e. to cross the time axis) at

every decision instant except one, then ISI would no longer be a problem. This suggests a

definition for an ISI-free signal: a signal that passes through zero at all but one of the

sampling instants is an ISI-free signal [23].

When information is transmitted with pulses over an analog channel, the original signal is a

discrete time sequence (or an acceptable approximation), whereas the received signal is a

continuous time signal. The channel can be considered a low-pass analog filter that smears or

spreads the impulse train into a continuous signal whose peaks are related to the amplitudes

of the original pulses. Mathematically, the operation can be described as the convolution of

the pulse sequence with a continuous time channel response. The starting point is the

convolution integral:

$$x(k) = h(k) * s(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau = \int_{-\infty}^{\infty} s(\tau)\, h(k-\tau)\, d\tau \qquad (2.6)$$

where x(k) denotes the received signal, h(k) denotes the channel impulse response and s(k)

denotes the input signal. The second integral on the right side of the above equation

illustrates the commutativity property of the convolution operation.

The component s(k) is the input pulse train, which consists of periodically transmitted

impulses of varying amplitudes. Therefore,

$$s(k) = \begin{cases} 0, & k \neq nT \\ s_n, & k = nT \end{cases} \qquad (2.7)$$

where T is the symbol period. This means that the only significant values of the variable of

integration τ in the second integral of Eq. 2.6 are those for which τ = nT. Any other value

of τ amounts to multiplication by 0, and for that reason x(k) can be stated as

$$x(k) = \sum_{n} s_n\, h(k - nT) \qquad (2.8)$$

The above equation for x(k) resembles the convolution sum, but it is nevertheless the

description of a continuous time system. It shows that the received signal is the sum of a

large number of shifted and scaled impulse responses of the continuous time system. The

impulse responses are scaled by the amplitudes of the transmitted pulses of s(k).

When Eq. 2.8 is evaluated at the decision instant k = NT, the term with n = N is the

component of x(k) due to the Nth symbol; it is multiplied by the centre tap h(0) of the

channel impulse response. The other product terms in the summation are the ISI terms: the

input pulses in the neighborhood of the Nth symbol are scaled by the appropriate samples in

the tails of the channel impulse response.
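The split of Eq. 2.8 into a desired term and ISI terms can be illustrated with symbol-spaced samples of the channel impulse response. In the sketch below the three tap values h(-T), h(0), h(T) and the symbol sequence are hypothetical values chosen for the example, and the function name is illustrative.

```python
def received_sample(symbols, taps, centre, decision_index):
    """Evaluate x(NT) = sum_n s_n * h((N - n)T) from Eq. 2.8.

    taps[centre] holds h(0); taps[centre + m] holds h(mT).
    """
    total = 0.0
    for n, s_n in enumerate(symbols):
        tap_index = (decision_index - n) + centre
        if 0 <= tap_index < len(taps):
            total += s_n * taps[tap_index]
    return total


# Hypothetical symbol-spaced channel samples: h(-T), h(0), h(T).
taps = [0.125, 1.0, 0.25]
centre = 1

# Hypothetical transmitted symbol amplitudes s_0 ... s_4.
symbols = [1.0, -1.0, 1.0, 1.0, -1.0]

N = 2
x_N = received_sample(symbols, taps, centre, N)
desired = symbols[N] * taps[centre]   # contribution of the Nth symbol via h(0)
isi = x_N - desired                   # contribution of the neighbouring symbols

if __name__ == "__main__":
    print(f"x(NT) = {x_N}, desired term = {desired}, ISI = {isi}")
```

Here the neighbouring symbols s_1 and s_3 leak into the sample through the tails h(T) and h(-T), so the received value differs from the value the Nth symbol alone would produce.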

2.4.3 Noise

In communications systems, the received waveform is usually divided into the desired part,

which contains the information, and the extraneous or unwanted part. The desired part is the

signal and the unwanted part is the noise. Noise limits our ability to communicate and

increases the power needed to transmit information. The effects of noise can be reduced by

increasing the power of the transmitted signal. Yet, equipment constraints and various

practical limitations restrict the power level of the transmitted signal.

The most frequently encountered problem in the transmission of signals through any channel is

additive noise, which is generally generated internally at the receiver end by components

such as the resistors and solid-state devices used in the implementation of the

communications system. This is sometimes referred to as thermal noise. Thermal noise is

produced by the random motion of free charge carriers (usually electrons) in a resistive

medium. Additive noise generated by the electronic components is usually found in a storage

system’s readback signal, as in the case of a radio or telephone communication system. When

such noise occupies the same frequency band as the desired signal, suitable design of the

transmitted signal and its demodulator at the receiver can minimize its effect [23].

Another source of noise in transmission is shot noise, a form of non-thermal noise. Although

the time averaged current flowing in a device may be constant, statistical fluctuations will

be present if individual charge carriers have to pass through a potential barrier. The

potential barrier may, for example, be the junction of a PN junction diode, the cathode of a

vacuum tube or the emitter-base junction of a bipolar transistor. Such statistical

fluctuations constitute shot noise.

Noise that arises from external sources can be coupled into a communication system by the

receiving antenna. Antenna noise originates from several different sources. Below 30 MHz it

is dominated by the broadband radiation produced in lightning discharges associated with

thunderstorms. This radiation is trapped by the ionosphere and propagates worldwide. Such

noise is sometimes referred to as atmospheric noise.

Noise can be classified into categories as:

a. White noise: A stochastic process which has a flat power spectral density over the

entire frequency range. Because of its wideband character, this type of noise cannot be

expressed in terms of quadrature components. In problems concerning the demodulation

of narrowband signals in noise, however, it is mathematically convenient to model the

additive noise process as white and to represent the noise in terms of quadrature

components. This can be accomplished by postulating that the signals and noise at the

receiver have passed through an ideal bandpass filter whose passband includes the

spectrum of the signals but is much wider. The noise that results from passing the

white noise process through a spectrally flat bandpass filter is referred to as

bandpass white noise.

b. Electromagnetic Noise: Usually found in electrical devices such as television and

radio transmitters and receivers. It can be present at all frequencies.

c. Impulse Noise: An additive disturbance which arises primarily from the switching

equipment in the telephone system. It is made up of short-duration pulses having

random duration and amplitude.

d. Acoustic Noise: Present in almost all conversations; it limits the performance of

telecommunications environments such as telephone circuits and hands-free telephones.

It may be unnoticeable or distinct, depending on the time delay involved. If the delay

between the speech and its echo (noise) is short, the noise is not noticed as a

separate sound but is perceived as a form of spectral distortion referred to as

reverberation. If, however, the delay exceeds a few tens of milliseconds, the noise is

distinctly noticeable [25]. Background noise generated in a car cabin, by air

conditioners and by computer fans are some types of acoustic noise.

e. Processing Noise: Modeled as a zero-mean, white-noise process in data communication

systems. It is the result of the digital/analog processing of signals, e.g. lost data

packets in digital data communications systems or quantization noise in digital coding

of image or speech.

f. Colored Noise: A wideband noise with a non-constant (non-flat) spectrum.

Autoregressive noise and brown noise are some examples of non-white, colored noise.

Gaussian noise, and specifically additive white Gaussian noise (AWGN), is the most frequently

encountered type of noise in communication systems. It represents the simplest mathematical

model for a communication channel. The channel models listed below capture the effects of

noise on electrical communication and the most important characteristics of the transmission

channels.

2.4.3.1 The additive noise channel

Contaminating noise in signal transmission usually has an additive effect in the sense that

noise often adds to the information bearing signal at various points between the source and

the destination. In the additive noise channel, whose mathematical model is shown in

Fig. 2.3, a random additive noise process n(k) corrupts the transmitted signal s(k). The

additive noise is called white when the random process has a power spectral density (PSD)

which is constant over all frequencies; when the noise also has a Gaussian distribution, it

becomes the most often assumed model, additive white Gaussian noise (AWGN). AWGN has a

uniform continuous frequency spectrum over a particular frequency band, and this model is

applied to the majority of physical communication channels because it is mathematically

tractable.

[Figure 2.3 The additive Gaussian noise channel [39]: the channel adds the noise n(k) to the

input s(k), giving the output x(k) = s(k) + n(k).]

2.4.3.2 The linear filter channel

Filtering is an operation which involves extracting information about a quantity of interest

at time t from noisy data, using data measured up to and including time t. A filter is

considered linear when the filtered, smoothed or predicted quantity at the filter output is a

linear function of the observations applied to the filter input [25].

In linear filter channels, filters are used to ensure that the transmitted signals remain

within specified bandwidth limitations and thus do not interfere with each other. The

mathematical model including the additive noise is illustrated in Fig. 2.4, in which s(k) is

the channel input and the channel output is represented as

$$x(k) = s(k) * h(k) + n(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau + n(k) \qquad (2.9)$$

in which h(τ) is the linear filter impulse response and * denotes convolution.

[Figure 2.4 Linear filter channel with additive noise [39]: the input s(k) passes through the

linear filter h(k) and the noise n(k) is added inside the channel, giving the output

x(k) = s(k) ∗ h(k) + n(k).]

When the signal is only attenuated during transmission, the received signal becomes

$$x(k) = \alpha s(k) + n(k) \qquad (2.10)$$

where α is the attenuation factor.
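A discrete-time counterpart of Eqs. 2.9 and 2.10 can be sketched as follows. The integral is replaced by a finite convolution sum, the noise is drawn from a seeded Gaussian generator, and the input sequence, tap values and noise level are assumed example values; the function name is hypothetical.

```python
import random


def linear_filter_channel(s, h, noise_std, rng):
    """Return x[k] = sum_i h[i] * s[k - i] + n[k] for a finite input sequence.

    n[k] is zero-mean Gaussian noise with standard deviation noise_std.
    """
    output = []
    for k in range(len(s) + len(h) - 1):
        acc = 0.0
        for i, h_i in enumerate(h):
            if 0 <= k - i < len(s):
                acc += h_i * s[k - i]
        output.append(acc + rng.gauss(0.0, noise_std))
    return output


rng = random.Random(0)      # fixed seed so the sketch is repeatable
s = [1.0, -1.0, 1.0]        # assumed input symbols
h = [1.0, 0.5]              # assumed channel impulse response samples

# Filtering only (noise switched off), then filtering plus noise (Eq. 2.9).
noiseless = linear_filter_channel(s, h, 0.0, rng)
noisy = linear_filter_channel(s, h, 0.1, rng)

# Eq. 2.10 as a special case: a single tap equal to the attenuation factor alpha.
alpha = 0.5
attenuated = linear_filter_channel(s, [alpha], 0.0, rng)

if __name__ == "__main__":
    print("noiseless: ", noiseless)
    print("noisy:     ", noisy)
    print("attenuated:", attenuated)
```

The second tap of h spreads each input symbol into the following sample, which is the discrete-time picture of the ISI discussed earlier, while the single-tap case only scales the input.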

2.4.3.3 The linear time-variant filter channel

Mobile communication, for example with a moving vehicle, and wireless channels such as radio

channels give rise to multipath propagation, which results in time-varying fading signals

because the frequency response characteristics of such channels are time-variant. The

time-varying characteristics of the mobile channel necessitate a channel equalizer which

continuously adapts to them, effectively implementing a filter which is matched to these

characteristics. Such time-variant linear filters are characterized by a time-variant channel

impulse response h(τ;k), which is the response of the channel at time k to an impulse applied

at time k-τ, where τ stands for the elapsed-time variable. For the linear time-variant filter

channel with additive noise, the channel output signal when s(k) is the input becomes

$$x(k) = s(k) * h(\tau;k) + n(k) = \int_{-\infty}^{\infty} h(\tau;k)\, s(k-\tau)\, d\tau + n(k) \qquad (2.11)$$

in which the time-variant impulse response has the following representation

$$h(\tau;k) = \sum_{n=1}^{L} a_n(k)\, \delta(\tau - \tau_n) \qquad (2.12)$$

where the {a_n(k)} denotes the possibly time-variant attenuation factors for the L multipath

propagation paths. Substituting Eq. 2.12 into Eq. 2.11 makes the received signal

$$x(k) = \sum_{n=1}^{L} a_n(k)\, s(k - \tau_n) + n(k) \qquad (2.13)$$

where each of the L multipath components is attenuated by {a_n(k)} and delayed by {τ_n}.
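Eq. 2.13 can be illustrated in discrete time by restricting the path delays to whole samples. In the following sketch each path is described by a gain function a_n(k) and an integer delay; the two paths, their gains and the input sequence are hypothetical, and the function name is illustrative.

```python
import random


def multipath_channel(s, paths, noise_std=0.0, rng=None):
    """Return x[k] = sum_n a_n(k) * s[k - tau_n] + n[k].

    paths is a list of (gain_function, delay_in_samples) pairs, where
    gain_function(k) gives the possibly time-variant attenuation a_n(k).
    """
    rng = rng or random.Random(0)
    output = []
    for k in range(len(s)):
        acc = 0.0
        for gain_fn, delay in paths:
            if 0 <= k - delay < len(s):
                acc += gain_fn(k) * s[k - delay]
        noise = rng.gauss(0.0, noise_std) if noise_std > 0.0 else 0.0
        output.append(acc + noise)
    return output


# Hypothetical two-path channel (L = 2).
paths = [
    (lambda k: 1.0, 0),                        # direct path, constant gain
    (lambda k: 0.5 if k < 3 else 0.25, 1),     # echo whose gain changes with time
]

s = [1.0, 1.0, -1.0, 1.0, -1.0]                # assumed input symbols
x = multipath_channel(s, paths)                # noise switched off for clarity

if __name__ == "__main__":
    print(x)
```

Because the echo gain changes at k = 3, the same pair of input symbols produces a different output before and after that instant, which is why an equalizer for such a channel has to keep adapting.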

The three mathematical models defined above adequately characterize a large majority of

physical channels, and communication systems are analyzed and designed based on these three

channel models.

2.5 Summary

This chapter outlines the structure of a channel equalization system. The factors causing

distortions in the channel and their properties are explained and discussed in detail. The

noise types and interferences are described in detail in addition to their effects on the

channel and the ways of removing them from the channel.

The types of channels used within the data transmission system have been discussed.

Mathematical models representing various types of channels have been outlined and described.

Mathematical formulas representing the input, impulse response and output of the channel have

been explained along with the channel characteristics of each type.

CHAPTER 3

MATHEMATICAL BACKGROUND OF A NEURO-FUZZY EQUALIZER

3.1 Overview

When the channel distortion in communications applications is extreme and linear equalizers

are not able to deal with it, nonlinear equalizers are employed instead. A linear equalizer

does not perform well on channels whose amplitude characteristics contain deep spectral nulls

or on channels containing nonlinear distortions. In an effort to compensate for the channel

distortion, the linear equalizer places a large gain in the vicinity of the spectral null and

consequently enhances the additive noise in the received signal.

Neural networks can be considered mathematical models of brain and mind activities. The main

purpose of neural networks is to organize numerous simple processing elements into layers in

order to achieve tasks of a higher level of sophistication. High computation rates, high

capability for nonlinear problems, massive parallelism and continuous adaptation are among

the properties of neural networks. These features make neural networks desirable tools for

different sorts of applications [28]. Neural networks have been put forward for equalization

problems because of these attractive properties and their nonlinear capability.

On the other hand, individual neural network models have some weaknesses: their computational

power is low and their learning capability is limited. Fuzzy systems have been considered as

a way to compensate for these weaknesses, owing to their capability of logically reaching

conclusions on a more advanced (linguistic or semantic) level.

This chapter describes the synthesis of fuzzy logic with neural networks, and the operation,

structure and algorithms of the neuro-fuzzy system that serves as the basis for channel

equalization of QAM signals.

3.2 Neuro-Fuzzy System

Intelligent control is largely rule based, whereas classical control is rooted in the theory of

linear differential equations, because the dependencies involved in its deployment are much

32

too complex to permit an analytical representation. In tackling such dependencies, it is

expedient to use the mathematics of fuzzy systems and neural networks. The power of fuzzy

systems lies in their ability to measure the quantity of linguistic inputs and to quickly provide

a working approximation of complex and frequently unknown input-output rules of system.

The power of neural networks is in their ability to learn from data. It’s possible to combine

neural networks and fuzzy logic in a number of ways and both have advantages that provide

flexibility and effectiveness when combined. Fuzzy adaptive filters are effective because of

their data approximation ability in nonlinear problems and therefore are widely used in signal

processing problems. Fuzzy logic equalizers usually require fewer training samples than

conventional equalizers, especially for linear channels. They are capable of yielding better

error performance and also perform better in the presence of channel nonlinearities [29].

Neural networks supply algorithms for numeric classification, optimization and associative storage. When fuzzy logic and neural networks are integrated, the resulting neuro-fuzzy system can be trained in a shorter time because the network needs fewer nodes. There is a natural synergy between neural networks and fuzzy systems that makes their hybridization a powerful tool for intelligent control and other applications.

3.3 Fuzzy Inference Systems

3.3.1 Architecture of fuzzy inference systems

Fuzzy Inference Systems (FIS) are among the best-known applications of fuzzy set theory and fuzzy logic. They are used for classification tasks, process control, offline process simulation and diagnosis, and online decision support. The power of FIS rests on their twofold identity: they can handle linguistic concepts, and they are universal approximators capable of performing nonlinear mappings between inputs and outputs.

FIS are often used for process simulation or control. They can be designed either from expert knowledge or from data. For complex systems, FIS based on expert knowledge alone may suffer from a loss of accuracy, which is the main motivation for using fuzzy rules inferred from data [30].

A fuzzy inference system comprises the functional blocks explained below (Figure 3.1).

- Determining a set of fuzzy IF-THEN rules. Fuzzy rules are composed of linguistic statements which describe how the FIS makes a decision about the classification of an input or the control of an output.

- Fuzzifying the inputs, which involves transforming the crisp inputs into degrees of match with linguistic values, using the input membership functions defined by a data base.

- Combining the fuzzified inputs in accordance with the fuzzy rules to establish a rule strength (also called weight or firing strength).

- Determining the rule’s consequence by combining the rule strength with the membership function of the output.

- Combining the consequences to obtain an output distribution.

- Defuzzifying the output distribution, which involves transforming the fuzzy results of the inference into a crisp output.

The operations upon fuzzy IF-THEN rules are explained in the steps below:

1. Mapping the inputs to membership values of each linguistic label, using a set of input membership functions on the premise part (fuzzification).

2. Computing the rule strength by combining the fuzzified inputs (the membership values) through fuzzy combination. The fuzzy combination operators realize the connectives “and”, “or” and sometimes “not” used in a fuzzy rule; “and” is commonly implemented by a T-norm such as minimum or product.

3. Generating the qualified fuzzy or crisp consequent of each rule according to the rule strength.

4. Combining all of the fuzzy rule outputs into one fuzzy output distribution and then aggregating the qualified consequents to produce a single crisp output (defuzzification of the output distribution).

Figure 3.1 Structure of fuzzy inference system [39]

3.3.2 Rule base fuzzy if-then rule

The fuzzy knowledge base that includes a set of fuzzy IF-THEN rules forms one of the basic

blocks of a fuzzy system. The following is the form of expression that represents fuzzy IF-

THEN rules or fuzzy conditional statements [37].

If u is A Then y is B (3.1)

where u and y represent the input and output linguistic variables, A and B represent the labels

of the fuzzy sets characterized by appropriate membership functions. A denotes the premise

and B denotes the consequent part of the rule.

IF-THEN rules can take many forms, of which the Single-Input Single-Output (SISO) form given by statement (3.1) is the simplest. Other forms are the Multi-Input Single-Output (MISO) form of statement (3.2) and the multi-output form of statement (3.3).

If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, ..., and $u_n$ is $A_n^l$ Then $y_q$ is $B_q^p$ (3.2)

If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, ..., and $u_n$ is $A_n^l$ Then $y_1$ is $B_1^r$ and $y_2$ is $B_2^s$ (3.3)

[Figure 3.1 block labels: Input, Fuzzification Interface, Decision-making, Knowledge base, Defuzzification Interface, Output]

The membership functions describe the fuzzy values A and B and Figure 3.2 demonstrates the

most widely used types of membership functions with their shapes.

Figure 3.2 Examples of membership functions (a) bell, (b) triangular, (c) trapezoidal

The following exponential function is one representation of a decision function that produces

a bell curve.

$\mu(x) = \exp\left(-\dfrac{(x - x_0)^2}{2\sigma^2}\right)$ (3.4)

where x is the independent variable on the universe, x0 denotes the position of the peak

relative to the universe and σ denotes the standard deviation.

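The expressions (3.5) and (3.6) represent the triangular and trapezoidal membership functions, respectively.

$\mu(x) = \begin{cases} 1 - \dfrac{x_2 - x}{x_2 - x_1}, & x_1 < x < x_2 \\ 1 - \dfrac{x - x_2}{x_3 - x_2}, & x_2 < x < x_3 \end{cases}$ (3.5)

$\mu(x) = \begin{cases} 1 - \dfrac{x_2 - x}{x_2 - x_1}, & x_1 < x < x_2 \\ 1, & x_2 < x < x_3 \\ 1 - \dfrac{x - x_3}{x_4 - x_3}, & x_3 < x < x_4 \end{cases}$ (3.6)

where the breakpoints satisfy $x_1 < x_2 < x_3 < x_4$ and $\mu(x) = 0$ outside the outermost breakpoints.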
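The three membership function shapes can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the function names and the breakpoint arguments `x1` to `x4` (ordered left to right) are assumptions chosen for the example, and membership is taken to be zero outside the outermost breakpoints.

```python
import math

def bell(x, x0, sigma):
    """Bell-shaped membership as in eq. (3.4): peak at x0, spread sigma."""
    return math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

def triangular(x, x1, x2, x3):
    """Triangular membership: feet at x1 and x3, peak at x2 (assumed names)."""
    if x1 < x <= x2:
        return 1.0 - (x2 - x) / (x2 - x1)   # rising edge
    if x2 < x < x3:
        return 1.0 - (x - x2) / (x3 - x2)   # falling edge
    return 0.0

def trapezoidal(x, x1, x2, x3, x4):
    """Trapezoidal membership: rises on (x1, x2), flat on [x2, x3], falls on (x3, x4)."""
    if x1 < x < x2:
        return 1.0 - (x2 - x) / (x2 - x1)
    if x2 <= x <= x3:
        return 1.0
    if x3 < x < x4:
        return 1.0 - (x - x3) / (x4 - x3)
    return 0.0
```

For instance, `triangular(0.5, 0.0, 1.0, 2.0)` lies halfway up the rising edge, while `trapezoidal(1.5, 0.0, 1.0, 2.0, 3.0)` lies on the flat top.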

Rules of the following form are called Takagi and Sugeno fuzzy rules; their distinguishing feature is that the consequent part of the rule is a mathematical function of the input variables.

If $A_1(x_1), A_2(x_2), \dots, A_n(x_n)$ then $Y = f(x_1, x_2, \dots, x_n)$ (3.7)

where the premise part is fuzzy and the function $f$ in the consequent part is usually a linear or quadratic mathematical function.

$f = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n$ (3.8)

Fuzzy IF-THEN rules are widely used in modeling. They are considered the local description

of the system being designed and form the basics of the fuzzy inference system (FIS).

Fuzzification: The aim of fuzzification is to map a crisp input into a fuzzy set. The input can come from a set of sensors, or from features of the sensor signals such as amplitude or frequency, and is mapped to membership values between 0 and 1 using a set of input membership functions. The fuzzification process converts the numeric inputs $u_i \in U_i$ into fuzzy sets for the fuzzy system to use.

Let $U_i^*$ denote the set of all possible fuzzy sets that can be defined on $U_i$. Given $u_i \in U_i$, fuzzification transforms $u_i$ into a fuzzy set denoted by $\hat{A}_i^{fuzz}$ that is defined on the universe of discourse $U_i$. The fuzzification operator $F$ that produces this transformation is defined by

$F: U_i \rightarrow U_i^*$

where

$F(u_i) = \hat{A}_i^{fuzz}$

Frequently, “singleton fuzzification” is used. It produces a fuzzy set $\hat{A}_i^{fuzz} \in U_i^*$ with a membership function given by

$\mu_{\hat{A}_i^{fuzz}}(x) = \begin{cases} 1, & x = u_i \\ 0, & \text{otherwise} \end{cases}$

Any fuzzy set with this form of membership function is termed a “singleton”. In singleton fuzzification the input fuzzy set has only a single point of nonzero membership, and the number $u_i$ is represented by the singleton fuzzy set. In implementations where singleton fuzzification is used, $u_i$ simply takes on its measured value, with no noise assumed. Other fuzzification methods include “Gaussian fuzzification”, which places bell-type membership functions about the input points, and triangular fuzzification, which uses triangular shapes [38].

3.3.3 Fuzzy inference mechanism

Designing a fuzzy inference system (FIS) from data can be separated into two principal stages: (1) automatic rule generation and (2) system optimization [30]. Rule generation leads to a basic system with a given space partitioning and the corresponding set of rules. System optimization can be carried out at several levels. Variable selection can be an overall selection or can be handled rule by rule. The goal of rule base optimization is to choose the most efficient rules and to optimize the rule conclusions. Space partitioning can be improved by adding or removing fuzzy sets and by tuning the membership function parameters. Structure optimization is of great significance: it covers selecting variables, reducing the rule base and optimizing the number of fuzzy sets.

There are two main tasks associated with fuzzy inference mechanism:

1. Matching task, which involves determining the degree to which each rule is relevant to the current situation as characterized by the inputs $u_i$, $i = 1, 2, \dots, n$.

2. Inference step, which involves reaching conclusions from the current inputs $u_i$ and the information in the rule-base.

When the fuzzy set representing the premise of the $i$th rule is denoted by $A_1^j \times A_2^k \times \dots \times A_n^l$, matching involves two steps:

Step 1: Combining inputs with rule premises: This step finds the fuzzy sets $\hat{A}_1^j, \hat{A}_2^k, \dots, \hat{A}_n^l$ with membership functions

$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1) * \mu_{\hat{A}_1^{fuzz}}(u_1)$

$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2) * \mu_{\hat{A}_2^{fuzz}}(u_2)$

$\vdots$

$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n) * \mu_{\hat{A}_n^{fuzz}}(u_n)$

(for all $j, k, \dots, l$), which combine the fuzzy sets from fuzzification with the fuzzy sets used in each of the terms in the rules’ premises. When singleton fuzzification is used, each of the input fuzzy sets has only a single point of nonzero membership (e.g. $\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(\bar{u}_1)$ for $u_1 = \bar{u}_1$ and $\mu_{\hat{A}_1^j}(u_1) = 0$ for $u_1 \neq \bar{u}_1$, where $\bar{u}_1$ is the given input value). In other words, with singleton fuzzification $\mu_{\hat{A}_i^{fuzz}}(u_i) = 1$ for all $i = 1, 2, \dots, n$ for the given inputs $u_i$, resulting in

$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1)$

$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2)$

$\vdots$

$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n)$

Step 2: Determining which rules are on: In this step, the membership values $\mu_i(u_1, u_2, \dots, u_n)$ are determined for the premise of the $i$th rule; they represent the certainty that each rule premise is consistent with the given inputs. Defining

$\mu_i(u_1, u_2, \dots, u_n) = \mu_{\hat{A}_1^j}(u_1) * \mu_{\hat{A}_2^k}(u_2) * \dots * \mu_{\hat{A}_n^l}(u_n)$

which is a function of the inputs $u_i$, $\mu_i(u_1, u_2, \dots, u_n)$ represents, when singleton fuzzification is used, the certainty that the antecedent of rule $i$ matches the input information. The function $\mu_i(u_1, u_2, \dots, u_n)$ is a multidimensional certainty surface. It expresses the certainty of the premise of a rule and hence the degree to which a particular rule holds for a given set of inputs. The inference step is then taken by calculating, for the $i$th rule, the “implied fuzzy set” $\hat{B}_q^i$ with the membership function

$\mu_{\hat{B}_q^i}(y_q) = \mu_i(u_1, u_2, \dots, u_n) * \mu_{B_q^p}(y_q)$ (3.9)

The implied fuzzy set $\hat{B}_q^i$ specifies the certainty level that the output should be a specific crisp output $y_q$ within the universe of discourse $Y_q$, considering only rule $i$. The defuzzification that follows the inference step aggregates the conclusions of all the rules, which are represented by the implied fuzzy sets.
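The matching and inference steps can be sketched as follows, assuming singleton fuzzification so that each premise term is simply evaluated at the measured input. The helper names (`gaussian_mf`, `premise_certainty`, `implied_membership`) and the example rule are hypothetical; the combination operator is passed in as `t_norm`, with minimum and product shown as two common choices.

```python
import math

def gaussian_mf(center, sigma):
    """Build a bell-shaped membership function (hypothetical helper)."""
    def mf(u):
        return math.exp(-((u - center) ** 2) / (2.0 * sigma ** 2))
    return mf

def product(a, b):
    """Product T-norm; the built-in min serves as the minimum T-norm."""
    return a * b

def premise_certainty(inputs, premise_mfs, t_norm=min):
    """Step 2 of matching: combine the per-input memberships of one rule."""
    degrees = [mf(u) for mf, u in zip(premise_mfs, inputs)]
    certainty = degrees[0]
    for degree in degrees[1:]:
        certainty = t_norm(certainty, degree)
    return certainty

def implied_membership(certainty, consequent_mf, y, t_norm=min):
    """Inference step: membership of output value y in the implied fuzzy set."""
    return t_norm(certainty, consequent_mf(y))

# Hypothetical rule: If u1 is "near 0" and u2 is "near 1" Then y is "near 2".
near_zero = gaussian_mf(0.0, 1.0)
near_one = gaussian_mf(1.0, 1.0)
near_two = gaussian_mf(2.0, 0.5)

mu_i = premise_certainty([1.0, 1.0], [near_zero, near_one], t_norm=product)
clipped = implied_membership(mu_i, near_two, 2.0)
```

With the minimum operator the consequent membership function is clipped at the premise certainty; with the product operator it is scaled by it.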

Defuzzification Methods: It is frequently necessary to obtain a single crisp output from a FIS. For instance, when classifying a letter drawn by hand on a drawing tablet, the FIS has to produce a crisp number that identifies the letter that was drawn. The process used to obtain this crisp number is called defuzzification. In other words, defuzzification is the extraction of a crisp value from a fuzzy set as a representative value.

Two known methods can be used for defuzzifying:

Center of Gravity (COG): The method takes the output distribution and finds its center of mass to produce one crisp number.

$u = \dfrac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)}$ (3.10)

where the crisp output value $u$ is the abscissa of the center of gravity of the fuzzy set, $\mu(x_i)$ is the membership value in the membership function and $x_i$ is a running point in a discrete universe. This expression can also be regarded as the weighted average of the elements in the support set.

For singletons, the COG method takes the following form

$u = \dfrac{\sum_i \mu(s_i)\, s_i}{\sum_i \mu(s_i)}$ (3.11)

where $s_i$ is the position of singleton $i$ in the universe and $\mu(s_i)$ represents the rule strength $\alpha_i$ of rule $i$. This technique has low computational complexity, and $u$ is differentiable with respect to the singletons $s_i$, which is useful in neuro-fuzzy systems.
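A discrete center-of-gravity defuzzifier is a one-line weighted average. The sketch below is illustrative only; the function name `center_of_gravity` and the sample values are assumptions. The same function covers the singleton case when the points are the singleton positions and the memberships are the rule strengths.

```python
def center_of_gravity(points, memberships):
    """Weighted average of the points, weighted by their membership values."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("no rule fired: all membership values are zero")
    return sum(m * x for x, m in zip(points, memberships)) / total

# Sampled output distribution on a discrete universe (assumed values).
u_sampled = center_of_gravity([0.0, 1.0, 2.0, 3.0, 4.0],
                              [0.0, 0.5, 1.0, 0.5, 0.0])

# Singleton consequents at positions 1.0 and 3.0 with rule strengths 0.25 and 0.75.
u_singletons = center_of_gravity([1.0, 3.0], [0.25, 0.75])
```

The explicit check for a zero denominator is a design choice of this sketch: it makes the case where no rule fires visible instead of failing with a division error.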

Center of Average (COA): In this widely used method, a crisp output $y_q^{crisp}$ is chosen using the centers of each of the output membership functions and the maximum certainty of each of the conclusions represented by the implied fuzzy sets, and is given by

$y_q^{crisp} = \dfrac{\sum_{i=1}^{R} b_i^q \, \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}{\sum_{i=1}^{R} \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}$ (3.12)

where $b_i^q$ is the center of the output membership function associated with rule $i$, $R$ is the number of rules and “sup” is the “supremum” (i.e., the least upper bound, which can often be regarded as the maximum value). Therefore, $\sup_x\{\mu(x)\}$ can simply be considered the highest value of $\mu(x)$.
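The center-average computation can be sketched as below, under the assumption that each implied fuzzy set is available as a list of sampled membership values, so that the supremum reduces to the maximum of the samples. The names `center_average`, `centers` and `implied_sets` are chosen for this example only.

```python
def center_average(centers, implied_sets):
    """Average of the consequent centers, weighted by the height of each implied set."""
    heights = [max(samples) for samples in implied_sets]  # sup taken as max of samples
    total = sum(heights)
    if total == 0.0:
        raise ValueError("no rule fired: all implied sets are empty")
    return sum(b * h for b, h in zip(centers, heights)) / total

# Two rules with consequent centers 1.0 and 3.0; implied sets clipped at 0.2 and 0.6.
y_crisp = center_average([1.0, 3.0],
                         [[0.0, 0.2, 0.2, 0.0],
                          [0.0, 0.6, 0.6, 0.0]])
```

Only the height of each implied set enters the result, so the shape of the output membership functions does not matter beyond their centers.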

Fig. 3.3 outlines the inference mechanisms on different types of fuzzy systems graphically.

Most fuzzy inference systems can be categorized into three types depending on the types of

fuzzy reasoning.

In Type 1 fuzzy systems, the defuzzifier combines the output sets corresponding to all of the fired rules into a single output set and then finds a crisp number that represents this combined set; e.g., the centroid defuzzifier forms the union of all the output sets and uses the centroid of the union as the crisp output [31]. The overall output is the weighted average of each rule’s crisp output induced by the rule’s weight and the output membership functions.

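[Figure 3.3 content: two rules with premise membership functions A1, B1 and A2, B2 over the inputs x and y, combined by multiplication or min to give the weights w1 and w2. Consequent columns: Type 1 (sets C1, C2; z = (w1 z1 + w2 z2)/(w1 + w2)), Type 2 (sets C1, C2 aggregated by max), Type 3 (z1 = ax + by + c, z2 = px + qy + r; z = (w1 z1 + w2 z2)/(w1 + w2)).]

Figure 3.3 Types of fuzzy reasoning mechanisms [11]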

In Type 2 fuzzy systems, fuzzy sets are helpful in conditions where it is hard to determine an exact membership function for a fuzzy set; for this reason, they are useful for incorporating uncertainties. These uncertainties stem from the knowledge used to construct the rules of a fuzzy logic system and lead to rules with uncertain antecedents and/or consequents, which in turn translate into uncertain antecedent and/or consequent membership functions [31]. The overall fuzzy output is obtained by applying the 'max' operation to the qualified fuzzy outputs, each of which equals the minimum of the rule strength and the output membership function of the rule.

Type 3 uses Takagi and Sugeno’s fuzzy IF-THEN rules. The output of each rule is a crisp number computed as a linear combination of the inputs plus a constant term, and the final output is the weighted average of each rule’s output.

In Fig. 3.3, a fuzzy inference system with two rules and two inputs is used to demonstrate the

different types of fuzzy rules and fuzzy reasoning described above.
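A two-rule, two-input system of Type 3 can be sketched as follows. The membership function centers and the consequent coefficients are made-up illustration values, and the product is assumed as the combination operator for the rule weights; none of these choices come from the thesis.

```python
import math

def gauss(x, center, sigma=1.0):
    """Bell-shaped membership value of x (illustrative parameters)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_two_rules(x, y):
    # Rule weights: product of the premise memberships (A1, B1) and (A2, B2).
    w1 = gauss(x, 0.0) * gauss(y, 0.0)
    w2 = gauss(x, 2.0) * gauss(y, 2.0)
    # Crisp rule outputs: linear functions of the inputs plus a constant.
    z1 = 1.0 * x + 1.0 * y + 0.0
    z2 = 2.0 * x - 1.0 * y + 3.0
    # Overall output: weighted average of the rule outputs.
    return (w1 * z1 + w2 * z2) / (w1 + w2)

z_mid = sugeno_two_rules(1.0, 1.0)   # both rules fire equally here
z_low = sugeno_two_rules(0.0, 0.0)   # rule 1 dominates here
```

At the midpoint between the two rule centers both weights are equal, so the output is the plain average of the two rule outputs; near a rule center the output follows that rule’s linear function.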

3.4 Artificial Neural Networks

Research into artificial neural networks, also known simply as “neural networks”, has been motivated by the recognition that the human brain computes in an entirely different way from the conventional digital computer. The brain is a highly complex, nonlinear and parallel computer (information-processing system). It is capable of organizing its structural constituents, called neurons, to carry out certain computations (e.g. pattern recognition, perception and motor control) many times faster than the fastest digital computer of the present time [29].

[Figure 3.4 labels: inputs $x_0, x_1, x_2, \dots, x_n$; synaptic weights $w_1, w_2, \dots, w_n$; processing element with summation $I = \sum w_i x_i$ and transfer $Y = f(I)$; output path]

Figure 3.4 Artificial neuron

A neuron is an information-processing unit that is fundamental to the operation of a neural network. The block diagram of Fig. 3.4 shows the model of an artificial neuron, which forms the basis for designing artificial neural networks.

A set of synapses, also called connecting links, are the basic elements of the neuronal model. Each synapse is characterized by a weight or strength of its own. Specifically, a signal $x_i$, $i = 1, 2, \dots, n$, at the input of a synapse connected to neuron $k$ is multiplied by the synaptic weight $w_i$. The neuron’s output is generated by summing these products and passing the sum through a transfer function.

The output of the artificial neuron displayed in Fig. 3.4 is calculated from

$y_i = f\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)$ (3.13)

where $x_j$ are the inputs, $y_i$ is the output signal of the neuron, $w_{ij}$ are the synaptic weight coefficients, $\theta_i$ denotes the bias and $f$ is the activation function.

The activation function can be linear or nonlinear, but a nonlinear sigmoid function is frequently used as the activation function (eq. 3.14).

$y_i = \dfrac{1}{1 + \exp\left[-\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)\right]}$ (3.14)
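The neuron computation reduces to a weighted sum plus bias followed by the activation function. A minimal sketch, with the function names `sigmoid` and `neuron_output` and the numeric values chosen only for illustration:

```python
import math

def sigmoid(v):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)

# Example: the weighted sum is 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0, so the sigmoid gives 0.5.
y = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
```

Passing a different callable as `activation` (for example the identity) gives the linear neuron mentioned in the text.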

Neural networks are formed by a set of neurons arranged in one or more layers. The neurons are interconnected by weighted connections at connection points called nodes. In a layered neural network, the neurons are organized in the form of layers. In its simplest form, a layered network has an input layer of source nodes that projects onto an output layer of neurons (computation nodes), but not the other way round. Such a network is of the feedforward or acyclic type. Neurons in the network act as processing elements which multiply their inputs by a set of weights and nonlinearly transform the result into an output value.

In general, three fundamentally different classes of network architecture can be identified: single-layer feedforward (non-recurrent), multilayer feedforward and recurrent networks. The feedforward neural network structures are shown in Fig. 3.5.

[Figure 3.5 labels: inputs $x_1, x_2, \dots, x_m$ and outputs $y_1, y_2$ in both panels]

Figure 3.5 (a) A single layer network, (b) A simple multilayer network [11]

3.4.1 Neural network’s learning

The most important property of a neural network is its ability to learn from its environment and to improve its performance through learning. A neural network learns about its environment through an iterative process of adjustments applied to its synaptic weights and bias levels. With each iteration of the learning process, the network becomes more knowledgeable about its environment. In the context of neural networks, learning can be defined as a process by which the free parameters of the network are adapted through stimulation by the environment in which the network is embedded. The way in which the parameter changes occur determines the type of learning [29].

A prescribed set of well-defined rules for solving a learning problem is referred to as a learning algorithm. As one would expect, there is no unique learning algorithm for the design of neural networks. Learning algorithms differ from each other fundamentally in the way the adjustment to a neuron’s synaptic weight is formulated. Another factor to be considered is the way in which a neural network (learning machine), made up of a set of interconnected neurons, relates to its environment. In this latter context, one speaks of a learning paradigm, which refers to a model of the environment in which the neural network operates [29].

There are two fundamental learning paradigms associated with neural networks: (1) Learning

with a teacher (known as supervised learning) and (2) Learning without a teacher which is

divided into two subdivisions that are unsupervised learning and reinforcement learning.

Supervised learning involves training with a teacher. The teacher can be thought of as having knowledge of the environment, represented by a set of input-output examples. The neural network, on the other hand, does not know the environment. When a training vector drawn from the environment is applied to both the teacher and the neural network, the teacher is able to supply the neural network with a desired response for that training vector. The network parameters, i.e. the connection weights, are adjusted under the combined influence of the training vector and the error signal. The error signal is the difference between the desired response and the actual response of the network. This adjustment is carried out iteratively, step by step, with the aim of eventually making the neural network emulate the teacher; the emulation is presumed to be optimum in some statistical sense. In this way, the knowledge of the environment available to the teacher is transferred to the neural network through training as fully as possible. Once this condition is reached, the teacher may be removed and the neural network deals with the environment entirely on its own.

The form of supervised learning just described is error correction learning, which involves a closed-loop feedback system; the unknown environment, however, is not in the loop. The performance criterion for the system is the mean-square error, or the sum of squared errors over the training samples, defined as a function of the free parameters of the system. This criterion may be visualized as a multidimensional error performance surface, or simply error surface, with the free parameters as coordinates. The true error surface is averaged over all possible input-output examples. Any given operation of the system under the teacher’s supervision is represented as a point on the error surface. For the system to improve its performance over time, and thus learn from the teacher, the operating point has to move down successively toward a minimum point of the error surface; the minimum point may be a local minimum or a global minimum. A supervised learning system is able to do this using the information it has about the gradient of the error surface corresponding to the current behavior of the system. The gradient of the error surface at any point is a vector that points in the direction of steepest ascent, so the operating point is moved in the opposite direction, the direction of steepest descent. Given an algorithm designed to minimize the cost function, an adequate set of input-output examples and enough time to carry out the training, a supervised learning system is generally capable of performing tasks such as pattern classification and function approximation [29].
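Error correction learning can be illustrated on the simplest possible learner, a single linear neuron trained sample by sample. The sketch below is an assumption-laden toy: the teacher is taken to be the line d = 2x + 1, and the learning rate, epoch count and the name `train_linear_neuron` are arbitrary choices for the example.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=200):
    """Adjust weight and bias step by step against the gradient of the squared error."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, desired in samples:
            actual = weight * x + bias
            error = desired - actual             # error signal supplied via the teacher
            weight += learning_rate * error * x  # step opposite to the error gradient
            bias += learning_rate * error
    return weight, bias

# Teacher: input-output examples of the assumed mapping d = 2x + 1 on [-1, 1].
training_set = [(k / 4.0, 2.0 * (k / 4.0) + 1.0) for k in range(-4, 5)]
weight, bias = train_linear_neuron(training_set)
```

Because the examples are noise-free and the step size is small, the operating point moves steadily down the error surface and the learned parameters approach those of the teacher.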

3.4.2 Multilayer perceptrons & backpropagation algorithm

Multilayer feedforward networks form an important class of neural networks. Typically, the network consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. These neural networks are called multilayer perceptrons (MLP) and represent a generalization of the single-layer perceptron.

Multilayer perceptrons have been applied successfully to solve difficult and diverse problems by training them with a widely used algorithm known as the error back-propagation algorithm. This algorithm is based on the error correction learning rule. It may be considered a generalization of an equally popular adaptive filtering algorithm: the least mean square (LMS) algorithm for the special case of a single linear neuron [29].

Error back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network, and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, all of the synaptic weights of the network are fixed. During the backward pass, all the synaptic weights are adjusted according to an error correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. The error signal is propagated backward through the network, against the direction of the synaptic connections, hence the name “error back-propagation”. The synaptic weights are adjusted so that the actual response of the network moves closer to the desired response in a statistical sense. The error back-propagation algorithm is also known in the literature as the back-propagation algorithm, or simply back-prop. The learning process carried out with the algorithm is referred to as back-propagation learning.
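The two passes can be sketched for a network with one hidden layer and a single sigmoid output neuron. This is a minimal illustration under stated assumptions: squared-error cost, logistic activations throughout, one training example per update, and made-up initial weights; the names `forward` and `backprop_step` are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: weights stay fixed while the signal moves layer by layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    output = sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)
    return hidden, output

def backprop_step(x, desired, hidden_w, hidden_b, out_w, out_b, lr=0.5):
    """Backward pass: propagate the error signal back and adjust every weight."""
    hidden, output = forward(x, hidden_w, hidden_b, out_w, out_b)
    error = desired - output
    delta_out = error * output * (1.0 - output)            # local gradient at the output
    delta_hid = [delta_out * out_w[j] * hidden[j] * (1.0 - hidden[j])
                 for j in range(len(hidden))]              # error sent back to hidden layer
    new_out_w = [out_w[j] + lr * delta_out * hidden[j] for j in range(len(hidden))]
    new_out_b = out_b + lr * delta_out
    new_hidden_w = [[hidden_w[j][i] + lr * delta_hid[j] * x[i] for i in range(len(x))]
                    for j in range(len(hidden))]
    new_hidden_b = [hidden_b[j] + lr * delta_hid[j] for j in range(len(hidden))]
    return new_hidden_w, new_hidden_b, new_out_w, new_out_b, error

# One training example and arbitrary starting weights (illustrative values).
x, d = [1.0, 0.0], 1.0
params = ([[0.5, -0.5], [-0.3, 0.8]], [0.0, 0.0], [0.4, -0.6], 0.1)
*params_after, error_before = backprop_step(x, d, *params)
error_after = d - forward(x, *params_after)[1]
```

After a single update the actual response is closer to the desired response than before, which is the behaviour the error correction rule is meant to produce.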

A multilayer perceptron has three distinguishing characteristics:

1. The model of each neuron in the network includes a nonlinear activation function. The nonlinearity is smooth, in other words, differentiable everywhere. A commonly used form of nonlinearity that satisfies this requirement is the sigmoidal nonlinearity defined by the logistic function:

$y_j = \dfrac{1}{1 + \exp(-v_j)}$ (3.15)

where $v_j$ is the induced local field (i.e. the weighted sum of all synaptic inputs plus the bias) of neuron $j$, and $y_j$ is the output of the neuron.

2. The network contains one or more layers of hidden neurons that are not part of the input or output of the network. These hidden neurons enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns (vectors).

3. The network exhibits a high degree of connectivity, determined by the synapses of the network. A change in the connectivity of the network requires a change in the population of synaptic connections or their weights.

The multilayer perceptron derives its computational power from the combination of these characteristics with the ability to learn from experience through training. The back-propagation algorithm is of great significance in neural networks because it provides a computationally efficient method for training multilayer perceptrons.

Fig. 3.6 shows the architectural graph of a multilayer perceptron with one hidden layer and an output layer. The illustrated network is fully connected, meaning that a neuron in any layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow through the network progresses in a forward direction, from left to right and on a layer-by-layer basis. The value of each neuron is computed by first summing the weighted inputs and the bias and then applying $f$(sum) (the sigmoid function) to calculate the neuron’s activation.

[Figure 3.6 labels: Input, Output, Bias]

Figure 3.6 Multilayer feedforward network [11]

Next, the training process of the three-layer feedforward network is analyzed. First, the feedforward phase is described in three stages, one for each of the input (I), hidden (H) and output (O) layers.

Input Layer (I): The input of the hidden layer is equal to the output of the input layer.

$\mathrm{Input}_H = \mathrm{Output}_I$