
Contents

Abstract

Chapter 1: INTRODUCTION
1.1 OVERVIEW
1.2 LITERATURE SURVEY
1.3 OBJECTIVE OF THESIS

Chapter 2: ERROR CONTROL CODING
2.1 INTRODUCTION
2.2 ERROR DETECTION AND ERROR CORRECTION
2.3 ERROR DETECTION AND CORRECTION CODES
2.3.1 TERMINOLOGIES AND DEFINITIONS USED IN EDAC CODES
2.3.2 CONCURRENT ERROR DETECTION SCHEMES
2.3.2.1 Parity Codes
2.3.2.2 Checksum Codes
2.3.2.3 m-out-of-n Codes
2.3.2.4 Berger Codes
2.3.3 Concurrent Error Correction Schemes
2.3.3.1 Bose–Chaudhuri–Hocquenghem (BCH) Codes
2.3.3.2 Hamming Single Error-Correcting Codes
2.3.3.3 Burst Error Correcting Codes

Chapter 3: GALOIS FIELD ARITHMETIC
3.1 INTRODUCTION
3.2 DEFINITION OF GALOIS FIELD
3.3 PROPERTIES OF GALOIS FIELDS
3.4 CONSTRUCTION OF GALOIS FIELDS
3.5 GALOIS FIELD ARITHMETIC

Chapter 4: REED SOLOMON CODES
4.1 INTRODUCTION
4.2 RS CODES BACKGROUND
4.3 Characteristics of the RS Encoder
4.4 DESIGN AND IMPLEMENTATION
4.5 SELF-CHECKING RS ENCODER
4.6 CONCURRENT ERROR DETECTION SCHEME OF THE RS DECODER

Chapter 5: Field Programmable Gate Array (FPGA)
5.1 History of FPGA
5.2 Basic Concepts of FPGA
5.3 FPGA Advantages
5.4 Languages Used in FPGA
5.5 Importance of HDLs

Chapter 6: INTRODUCTION TO VERILOG
6.1 Introduction
6.2 Behavioral Level
6.3 Register-Transfer Level
6.4 Gate Level
6.5 History of Verilog
6.6 Features of Verilog HDL
6.7 Simulation
6.8 Synthesis


Concurrent Error Detection in Reed–Solomon Encoders and Decoders

ABSTRACT

Reed–Solomon (RS) codes are widely used to identify and correct errors in transmission and storage systems. When RS codes are used in highly reliable systems, the designer should also take into account the occurrence of faults in the encoder and decoder subsystems. In this project, self-checking RS encoder and decoder architectures are presented. The RS encoder architecture exploits some properties of the arithmetic operations in GF(2^m). These properties are related to the parity of the binary representation of the elements of the Galois field. In the RS decoder, the implicit redundancy of the received codeword, under suitable assumptions explained in this work, allows implementing concurrent error detection schemes applicable to a wide range of decoding algorithms without modifying the decoder architecture. Moreover, the area and delay overhead of the proposed circuits are presented.


CHAPTER 1

INTRODUCTION

A digital communication system is used to transport an information-bearing signal from the source to a user destination via a communication channel. The information signal is processed in a digital communication system to form discrete messages, which makes the information more reliable for transmission. Channel coding is an important signal processing operation for the efficient transmission of digital information over the channel. It was introduced by Claude E. Shannon in 1948, using the channel capacity as an important parameter for error-free transmission. In channel coding, the number of symbols in the source-encoded message is increased in a controlled manner in order to facilitate two basic objectives at the receiver: error detection and error correction. Error detection and error correction to achieve good communication are also employed in electronic devices, where they are used to reduce the level of noise and interference in the electronic medium. The amount of error detection and correction required and its effectiveness depend on the signal-to-noise ratio (SNR).

1.1 OVERVIEW

Every information signal has to be processed in a digital communication system before it is transmitted, so that the user at the receiver end receives error-free information. A digital communication system has three basic signal processing operations: source coding, channel coding and modulation. A digital communication system is shown in the block diagram given below:

Figure-1.1: Block Diagram of a Digital Communication System


In source coding, the encoder maps the digital signal generated at the source output into another signal in digital form. The objective is to eliminate or reduce redundancy so as to provide an efficient representation of the source output. Since the source encoder mapping is one-to-one, the source decoder on the other end simply performs the inverse mapping, thereby delivering to the user a reproduction of the original digital source output. The primary benefit thus gained from the application of source coding is a reduced bandwidth requirement.

In channel coding, the objective for the encoder is to map the incoming digital signal into a channel input, and for the decoder to map the channel output into an output signal, in such a way that the effect of channel noise is minimized. That is, the combined role of the channel encoder and decoder is to provide for reliable communication over a noisy channel. This provision is satisfied by introducing redundancy in a prescribed fashion in the channel encoder and exploiting it in the decoder to reconstruct the original encoder input as accurately as possible. Thus in source coding redundant bits are removed, whereas in channel coding redundancy is introduced in a controlled manner.

Then modulation is performed for the efficient transmission of the signal over the channel. Various digital modulation techniques could be applied, such as Amplitude Shift Keying (ASK), Frequency Shift Keying (FSK) or Phase Shift Keying (PSK). The addition of redundancy in the coded messages implies the need for increased transmission bandwidth. Moreover, the use of coding adds complexity to the system, especially for the implementation of decoding operations in the receiver. Thus bandwidth and system complexity have to be considered in the design trade-offs in the use of error-control coding to achieve acceptable error performance.

Different error-correcting codes can be used depending on the properties of the system and the application in which the error correction is to be introduced. Generally, error-correcting codes have been classified into block codes and convolutional codes. The distinguishing feature for this classification is the presence or absence of memory in the encoders of the two codes. To generate a block code, the incoming information stream is divided into blocks, and each block is processed individually by adding redundancy in accordance with a prescribed algorithm. The decoder processes each block individually and corrects errors by exploiting the redundancy. Many of the important block codes used for error detection are cyclic codes, also called cyclic redundancy check codes.

In a convolutional code, the encoding operation may be viewed as the discrete-time convolution of the input sequence with the impulse response of the encoder. The duration of the impulse response equals the memory of the encoder. Accordingly, the encoder for a convolutional code operates on the incoming message sequence using a "sliding window" equal in duration to its own memory. Hence in a convolutional code, unlike a block code where codewords are produced on a block-by-block basis, the channel encoder accepts message bits as a continuous sequence and thereby generates a continuous sequence of encoded bits at a higher rate.


1.2 LITERATURE SURVEY

Channel coding is a widely used technique for the reliable transmission and reception of data. Generally, systematic linear cyclic codes are used for channel coding. In 1948, Shannon introduced linear block codes for the complete correction of errors [1]. Cyclic codes were first discussed in a series of technical notes and reports written between 1957 and 1959 by Prange [2], [3], [4]. This led directly to the work on BCH codes published in March and September of 1960 by Bose and Ray-Chaudhuri [5], [6], [7]. In 1959, Irving Reed and Gus Solomon described a new class of error-correcting codes called Reed-Solomon codes [8]. Originally, Reed-Solomon codes were constructed and decoded through the use of finite field arithmetic [9], [10], using nonsingular Vandermonde matrices [10]. In 1964, Singleton showed that this was the best possible error correction capability for any code of the same length and dimension [11]. Codes that achieve this "optimal" error correction capability are called Maximum Distance Separable (MDS) codes. Reed-Solomon codes are by far the dominant members, both in number and utility, of the class of MDS codes. MDS codes have a number of interesting properties that lead to many practical consequences.

The generator polynomial construction for Reed-Solomon codes is the approach most commonly used today in the error control literature. This approach initially evolved independently from Reed-Solomon codes as a means for describing cyclic codes. Gorenstein and Zierler then generalized Bose and Ray-Chaudhuri's work to arbitrary Galois fields of size p^m, thus developing a new means for describing Reed and Solomon's "polynomial codes" [12]. It was shown that a vector c is a code word in the code defined by g(x) if and only if its corresponding code polynomial c(x) is a multiple of g(x), so the information symbols can be easily mapped onto code words. All valid code polynomials are multiples of the generator polynomial. It follows that any valid code polynomial must have as roots the same 2t consecutive powers of α that form the roots of g(x). This approach leads to a powerful and efficient set of decoding algorithms.

After the discovery of Reed-Solomon codes, a search began for an efficient decoding algorithm. In 1960, Reed and Solomon proposed a decoding algorithm based on the solution of sets of simultaneous equations [8]. Though much more efficient than a look-up table, Reed and Solomon's algorithm is still useful only for the smallest Reed-Solomon codes. In 1960, Peterson provided the first explicit description of a decoding algorithm for binary BCH codes [13]. His "direct solution" algorithm is quite useful for correcting small numbers of errors but becomes computationally intractable as the number of errors increases. Peterson's algorithm was improved and extended to non-binary codes by Gorenstein and Zierler (1961) [12], Chien (1964) [14], and Forney (1965) [15]. These efforts were productive, but Reed-Solomon codes capable of correcting more than six or seven errors still could not be used in an efficient manner.

In 1967, Berlekamp demonstrated his efficient decoding algorithm for both non-binary BCH and Reed-Solomon codes [16], [17]. Berlekamp's algorithm allows for the efficient


decoding of dozens of errors at a time using very powerful Reed-Solomon codes. In 1968 Massey showed that the BCH decoding problem is equivalent to the problem of synthesizing the shortest Linear Feedback Shift Register capable of generating a given sequence [18]. Massey then demonstrated a fast shift register-based decoding algorithm for BCH and Reed-Solomon codes that is equivalent to Berlekamp's algorithm. This shift register-based approach is now referred to as the Berlekamp-Massey algorithm.

In 1975, Sugiyama, Kasahara, Hirasawa, and Namekawa showed that Euclid's algorithm can also be used to efficiently decode BCH and Reed-Solomon codes [19]. Euclid's algorithm is a means for finding the greatest common divisor of a pair of integers. It can also be extended to more complex collections of objects, including certain sets of polynomials with coefficients from finite fields.

As mentioned above, Reed-Solomon codes are based on finite fields, so they can be extended or shortened. In this thesis, the Reed-Solomon codes used for decoding in compact discs are encoded and decoded. The generator polynomial approach has been used for encoding and decoding of the data.

1.3 OBJECTIVE OF THESIS

The objectives of the thesis are:

1. To analyze the important characteristics of various coding techniques that could be used for error control in a communication system for reliable transmission of digital information over the channel.

2. To study the Galois field arithmetic on which the most important and powerful ideas of coding theory are based.

3. To study the Reed-Solomon codes and the various methods used for encoding and decoding of the codes to achieve efficient detection and correction of the errors.

4. To analyze the simulation results of the Reed-Solomon encoder and decoder.


CHAPTER 2

ERROR CONTROL CODING

2.1 INTRODUCTION

The designer of an efficient digital communication system faces the task of providing a system which is cost effective and gives the user an acceptable level of reliability. The information transmitted through the channel to the receiver is prone to errors. These errors can be controlled by using error-control coding, which provides reliable transmission of data through the channel. In this chapter, a few error control coding techniques are discussed that rely on the systematic addition of redundant symbols to the transmitted information. Using these techniques, two basic objectives are facilitated at the receiver: error detection and error correction.

2.2 ERROR DETECTION AND ERROR CORRECTION

When a message is transmitted or stored, it is influenced by interference which can distort the message. Radio transmission can be influenced by noise, multipath propagation or by other transmitters. In different types of storage, apart from noise, there is also interference due to damage or contaminants in the storage medium. There are several ways of reducing the interference; however, some interference is too expensive or impossible to remove. An alternative is to design the messages in such a way that the receiver can detect if an error has occurred, or even possibly correct the error too. This can be achieved by error-correcting coding. In such coding, the number of symbols in the source-encoded message is increased in a controlled manner, which means that redundancy is introduced [20].

To make error correction possible, the symbol errors must first be detected. When an error has been detected, the correction can be obtained in the following ways:

(1) Asking for a repeated transmission of the incorrect codeword from the receiver (Automatic Repeat Request (ARQ)).

(2) Using the structure of the error correcting code to correct the error (Forward Error Correction (FEC)).

It is easier to detect an error than it is to correct it. FEC therefore requires a higher number of check bits and a higher transmission rate, given that a certain amount of information has to be transmitted within a certain time and with a certain minimum error probability. The reverse is also true; if the channel offers a certain possible transmission rate, ARQ permits a higher information rate than FEC, especially if the channel has a low error rate. FEC however has the advantage of not requiring a reply channel. The choice in each particular case therefore depends on the properties of the system or on the


application in which the error correction is to be introduced. In many applications, such as radio broadcasting or the Compact Disc (CD), there is no reply channel. Another advantage of FEC is that the transmission is never completely blocked, even if the channel quality falls so low that an ARQ system would be asking for retransmission continuously. In a system using FEC, the receiver has no real-time contact with the transmitter and cannot verify whether the data was received correctly. It must make a decision about the received data and do whatever it can to either fix it or declare an alarm.

There are two main methods of introducing error-correcting coding. In one of them, the symbol stream is divided into blocks and coded; this is consequently called block coding. In the other, a convolution operation is applied to the symbol stream; this is called convolutional coding.

Figure-2.1: Error Detection and Correction

FEC techniques repair the signal to enhance the quality and accuracy of the received information, improving system performance. Various techniques used for FEC are described in the following sections.


2.3 ERROR DETECTION AND CORRECTION CODES

The telecom industry has used FEC codes for more than 20 years to transmit digital data through different transmission media. Claude Shannon first introduced techniques for FEC in 1948 [1]. These error-correcting codes compensated for noise and other physical elements to allow for full data recovery.

For an efficient digital communication system, early detection of errors is crucial in preserving the received data and preventing data corruption. This reliability issue can be addressed by making use of Error Detection And Correction (EDAC) schemes for concurrent error detection (CED).

EDAC schemes employ an algorithm which expresses the information message such that any introduced errors can easily be detected and, within certain limitations, corrected, based on the redundancy introduced into the message. A code is said to be e-error detecting if it can detect any error affecting at most e bits, e.g., parity codes, two-rail codes, m-out-of-n codes, Berger codes, etc.

Similarly, it is called e-error correcting if it can correct e-bit errors, e.g., Hamming codes, Single Error Correction Double Error Detection (SECDED) codes, Bose–Chaudhuri–Hocquenghem (BCH) codes, residue codes, Reed-Solomon codes, etc. As mentioned earlier, both error detection and correction schemes require additional check bits for achieving CED. An implementation of this CED scheme is shown in Figure 2.2.

Figure-2.2: Concurrent Error Detection Implementation

Here the message generated by the source is passed to the encoder, which adds redundant check bits and turns the message into a codeword. This encoded message is then sent through the channel, where it may be subjected to noise and hence altered. When this message arrives at the decoder of the receiver, it gets decoded to the most likely message. If any error has occurred during transmission, the error may either get detected and the necessary action taken (error detection scheme), or the error gets corrected and the operations continue (error correction scheme). Detection of an error in an error detection scheme usually leads to the stalling of the operation in progress and results in a possible retry of some or all of the past


computations. On the other hand, the error correction scheme permits the process to continue uninterrupted, but usually requires a higher amount of redundancy than error detection. Each of these schemes has varied applications, depending on the reliability requirements of the system.

2.3.1 TERMINOLOGIES AND DEFINITIONS USED IN EDAC CODES

An EDAC scheme is said to be:

1. Unidirectional: when all components affected by a multiple error change their values in only one direction, say from 0 to 1, or vice versa, but not both.

2. Asymmetric: if its detecting capabilities are restricted to a single error type (0 to 1 or 1 to 0). These codes are useful in applications which expect only a single type of error during their operation.

3. Linear: if the sum of two codewords (encoded data) is also a codeword. Sum here means adding two binary blocks bitwise with XOR.

4. Non-separable or non-systematic: if the check bits are embedded within the codeword and cannot be processed concurrently with the information; otherwise the code is referred to as separable or systematic.

5. A block code: if the codewords can be considered as a collection of binary blocks, all of the same length, say n. Such codes are characterized by the fact that the encoder accepts k information symbols from the information source and appends a set of r redundant symbols derived from the information symbols, in accordance with the code algorithm.

6. Binary: if the elements (or symbols) can assume either one of two possible states (0 and 1).

7. Cyclic: if it is a parity check code with the additional property that every cyclic shift of a code word is also a code word.

8. Forward error correcting: if enough extra parity bits are transmitted along with the original data to enable the receiver to correct a predetermined maximum amount of corrupted data without any further retransmissions.

9. Backward error correcting: if the redundancy is enough only to detect errors, and retransmission is required.

The Hamming distance of two codewords x, y ∈ F^n, denoted by d(x, y), is the number of bit positions in which they differ, i.e., the number of 1s in x XOR y. A code with distance d can detect (d − 1)-bit errors and correct ⌊(d − 1)/2⌋-bit errors.

The error syndrome is a defined function which is used to identify the exact error location by addition of the bad parity bit locations. It is the vector sum of the received parity digits and the parity check digits recomputed from the received information digits. A codeword is a block of n symbols that carries the k information symbols and the r redundant symbols (n = k + r).

For an (n, k) block code, the rate of the code is defined as the ratio of the number of information bits to the length of the code, k/n.
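As an illustration of these definitions, the following minimal Python sketch (illustrative only, not part of the thesis toolchain) computes the Hamming distance of two words and the detection/correction bounds quoted above.

```python
# Sketch: Hamming distance and the derived detection/correction bounds.
def hamming_distance(x, y):
    # number of positions in which the two words differ
    return sum(a != b for a, b in zip(x, y))

d = hamming_distance([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(d)                                  # 2
# A code with minimum distance d detects d - 1 errors
# and corrects (d - 1) // 2 errors.
print("detects:", d - 1, "corrects:", (d - 1) // 2)
```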


2.3.2 CONCURRENT ERROR DETECTION SCHEMES

Schemes for Concurrent Error Detection (CED) find a wide range of applications, since only after the detection of an error can any preventive measure be initiated. The principle of an error detecting scheme is very simple: an encoded codeword must preserve some characteristic of that particular scheme, and a violation is an indication of the occurrence of an error. Some of the CED techniques are discussed below.

2.3.2.1 Parity Codes

These are the simplest form of error detecting codes, with a Hamming distance of two (d = 2) and a single check bit (irrespective of the size of the input data). They are of two basic types: odd and even. For an even-parity code, the check bit is defined so that the total number of 1s in the code word is always even; for an odd-parity code, this total is odd. So, whenever a fault affects a single bit, the total count gets altered and hence the fault is easily detected. A major drawback of these codes is that their multiple fault detection capabilities are very limited.
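A minimal sketch of the mechanism (Python used here purely for illustration): a single check bit catches any odd number of flipped bits but, as noted above, misses double errors.

```python
# Sketch: even-parity check bit generation and checking.
def parity_bit(bits):
    return sum(bits) % 2              # makes the total number of 1s even

data = [1, 0, 1, 1, 0, 1, 0]
codeword = data + [parity_bit(data)]  # append the single check bit
assert sum(codeword) % 2 == 0         # even parity holds for the codeword

codeword[3] ^= 1                      # single-bit fault
print(sum(codeword) % 2 != 0)         # True: error detected

codeword[5] ^= 1                      # a second fault restores even parity
print(sum(codeword) % 2 != 0)         # False: double error goes undetected
```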

2.3.2.2 Checksum Codes

In these codes, the summation of all the information bytes is appended to the information as a b-bit checksum. Any error in the transmission will be indicated as a resulting error in the checksum, which leads to detection of the error. When b = 1, these codes reduce to parity check codes. The codes are systematic in nature and require simple hardware units.
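A sketch of the scheme, assuming an 8-bit (b = 8) modular checksum over information bytes:

```python
# Sketch: b-bit checksum appended to the information bytes (b = 8 assumed).
def checksum(data, b=8):
    return sum(data) % (1 << b)       # summation reduced to b bits

message = [0x12, 0x34, 0x56]
sent = message + [checksum(message)]  # systematic: data then check byte

received = sent[:]
received[1] ^= 0x08                   # one byte corrupted in transit
print(checksum(received[:-1]) == received[-1])   # False: error detected
```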

2.3.2.3 m-out-of-n Codes

In this scheme, the codeword is of a standard weight m and standard length n bits. Whenever an error occurs during transmission, the weight of the code word changes and the error gets detected. If the error is a 0 to 1 transition, an increase in weight is detected; similarly, a 1 to 0 transition leads to a reduction in the weight of the code, leading to easy detection of the error. This scheme can be used for the detection of unidirectional errors, which are the most common form of error in digital systems.

2.3.2.4 Berger Codes

Berger codes are systematic unidirectional error detecting codes. They can be considered as an extension of the parity codes. Parity codes have one check bit, which can be considered as the number of information bits of value 1 counted modulo 2. On the


other hand, Berger codes have enough check bits to represent the count of the information bits having value 0. The number of check bits (r) required for k information bits is given by

r = ⌈log2(k + 1)⌉
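A sketch of Berger encoding under this definition; the check symbol is the binary count of 0s among the information bits, so any purely unidirectional error breaks the relationship.

```python
import math

# Sketch: Berger code - the check bits hold the count of 0s in the information.
def berger_encode(info):
    k = len(info)
    r = math.ceil(math.log2(k + 1))   # enough check bits for counts 0..k
    zeros = info.count(0)
    check = [int(b) for b in format(zeros, f"0{r}b")]
    return info + check, r

word, r = berger_encode([1, 0, 1, 1, 0, 0, 1, 0])
info, check = word[:-r], word[-r:]
# Valid codeword: the check field equals the number of 0s in the data field.
assert info.count(0) == int("".join(map(str, check)), 2)
# A unidirectional error (only 0->1 flips, or only 1->0 flips) can never change
# the data field and the count field in compensating directions, so it is detected.
```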

Of all the unidirectional error detecting codes that exist, [21] suggests m-out-of-n codes to be the most optimal. These codes, however, are not of much application because of their non-separable nature. Amongst the separable codes in use, the Berger codes have been proven to be the most optimal, requiring the smallest number of check bits [22]. The Berger codes, however, are not optimal when only t unidirectional errors need to be detected instead of all unidirectional errors. For this reason a number of different modified Berger codes exist: Hao Dong introduced a code [23] that accepts slightly reduced error detection capabilities, but does so using fewer check bits and smaller checker sizes. In this code the number of check bits is independent of the number of information bits. Bose and Lin [24] have introduced their own variation on Berger codes, and Bose [25] has further introduced a code that improves on the burst error detection capabilities of his previous code, where erroneous bits are expected to appear in groups.

Blaum [26] further improves on the Bose-Lin code. Favalli proposes an approach where the code cost is reduced by means of graph-theoretic optimization [27].

2.3.3 Concurrent Error Correction Schemes

Error-correcting codes (ECC) were first developed in the 1940s following a theorem of Claude Shannon that showed that almost error-free communication could be obtained over a noisy channel [1]. The quality of the recovered signal will however depend on the error correcting capability of the codes.

Error correction coding requires lower rate codes than error detection, but is a basic necessity in safety-critical systems, where it is absolutely critical to get it right the first time. In these special circumstances, the additional bandwidth required for the redundant check bits is an acceptable price.

Over the years, the correcting capability of error correction schemes has gradually increased with a constrained number of computation steps. Concurrently, the time and hardware cost to perform a given number of computational steps have also greatly decreased. These trends have led to greater application of error-correcting techniques.

One application of ECC is to correct or detect errors in communication over channels where the errors appear in bursts, i.e., the errors tend to be grouped in such a way that several neighboring symbols are incorrectly detected. Non-binary codes are used to correct such errors. Since an error is always a number different from zero in the field, it is always one in a binary code. In a non-binary code the error can take many values, and the magnitude of the error has to be determined in order to correct it. Some of the non-binary codes are discussed in the following sections.


2.3.3.1 Bose–Chaudhuri–Hocquenghem (BCH) Codes

BCH codes are among the most important and powerful classes of linear block codes; they are cyclic codes with a wide variety of parameters. The most common BCH codes are characterized as follows. Specifically, for any positive integer m (equal to or greater than 3) and any t less than (2^m − 1)/2, there exists a binary BCH code with the following parameters:

Block length: n = 2^m − 1
Number of message bits: k ≥ n − mt
Minimum distance: d_min ≥ 2t + 1

where at most mt parity bits are used and t is the number of errors that can be corrected. Each BCH code is a t-error correcting code in that it can detect and correct up to t random errors per codeword. The Hamming single-error correcting codes can be described as BCH codes. The BCH codes offer flexibility in the choice of code parameters, block length and code rate.

2.3.3.2 Hamming Single Error – Correcting Codes

Hamming codes can also be defined over a non-binary field. The parity check matrix is designed by setting its columns as the vectors of GF(p)^m whose first non-zero element equals one. There are n = (p^m − 1)/(p − 1) such vectors, and any pair of them is linearly independent.

If p = 3 and r = 3, then a Hamming single error-correcting code is generated. These codes represent a primitive and simple error correcting scheme, which is key to the understanding of more complex correcting schemes. Here the codewords have a minimum Hamming distance of 3 (i.e., d = 3), so that one error can be corrected or two errors detected. For enabling error correction, beyond mere error detection, the location of the error must also be identified. So for one-bit correction on an n-bit frame, if there is an error in one out of the total n bit positions, it must be identified; otherwise it must be stated that there is no error. Once located, the correction is trivial: the bit is inverted.

Hamming codes have the advantage of requiring the fewest possible check bits for their code lengths, but suffer from the disadvantage that, whenever more than a single error occurs, it is wrongly interpreted as a single error, because each non-zero syndrome is matched with one of the single-error events. Thus they are inefficient in handling burst errors.

2.3.3.3 Burst Error Correcting Codes

The transmission channel may be memoryless or it may have some memory. If the channel is memoryless, the errors are independent and identically distributed. Sometimes, however, the channel errors exhibit some kind of memory. The most


common example of this is burst errors. If a particular symbol is in error, then the chances are good that its immediate neighbors are also wrong. Burst errors occur, for instance, in mobile communications due to fading and in magnetic recording due to media defects. Burst errors can be converted to independent errors by the use of an interleaver.

A burst error can also be viewed as another type of random error pattern and be handled accordingly, but some schemes are particularly well suited to dealing with burst errors. Cyclic codes represent one such class of codes. Most of the linear block codes are either cyclic or are closely related to the cyclic codes. An advantage of cyclic codes over most other codes is that they are easy to encode. Furthermore, cyclic codes possess a well-defined mathematical structure, the Galois field, which has led to the development of very efficient decoding schemes for them.

Reed Solomon codes represent the most important sub-class of the cyclic codes [5], [14].


CHAPTER 3

GALOIS FIELD ARITHMETIC

3.1 INTRODUCTION

In Chapter 2, various types of error correcting codes were discussed. Burst errors are efficiently corrected by using cyclic codes. Galois fields, or finite fields, are extensively used in error-correcting codes (ECC) based on linear block codes, and in this chapter these finite fields are discussed thoroughly. A Galois field is a finite set of elements with defined rules for arithmetic. These rules are not algebraically different from those used in arithmetic with ordinary numbers; the only difference is that there is only a finite set of elements involved. Finite fields have been extensively used in Digital Signal Processing (DSP), pseudo-random number generation, and encryption and decryption protocols in cryptography.

The design of efficient multiplier, inverter and exponentiation circuits for Galois field arithmetic is needed for these applications.

3.2 DEFINITION OF GALOIS FIELD

A finite field, also called a Galois field, is a field with a finite order (i.e., number of elements). The order of a finite field is always a prime or a power of a prime, and for each prime power p^m there exists exactly one finite field GF(p^m). A field is said to be infinite if it consists of an infinite number of elements, e.g., the set of real numbers or of complex numbers. A finite field, on the other hand, consists of a finite number of elements.

GF(p^m) is an extension field of the ground field GF(p), where m is a positive integer. For p = 2, GF(2^m) is an extension field of the ground field GF(2) of two elements (0, 1). GF(2^m) is a vector space of dimension m over GF(2) and hence is represented using a basis of m linearly independent vectors. The finite field GF(2^m) contains 2^m − 1 non-zero elements. All finite fields contain a zero element and an element, called a generator or primitive element α, such that every non-zero element in the field can be expressed as a power of this element. The existence of this primitive element (of order 2^m − 1) is asserted by the fact that the non-zero elements of GF(2^m) form a cyclic group. Encoders and decoders for linear block codes over GF(2^m), such as Reed-Solomon codes, require arithmetic operations in GF(2^m). In addition, decoders for some codes over GF(2), such as BCH codes, require computations in extension fields GF(2^m). In GF(2^m), addition and subtraction are simply bitwise exclusive-or. Multiplication can be performed by several approaches, including bit serial, bit parallel (combinational), and software. Division requires the reciprocal of the divisor, which can be computed in


hardware using several methods, including Euclid’s algorithm, lookup tables, exponentiation, and subfield representations. With the exception of division, combinational circuits for Galois field arithmetic are straightforward. Fortunately, most decoding algorithms can be modified so that only a few divisions are needed, so fast methods for division are not essential.
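To make the above concrete, here is a minimal software sketch of GF(2^8) addition and multiplication, assuming the degree-8 primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (bit mask 0x11D) that is also used for the G.709 code in Chapter 4; hardware implementations map the same operations onto XOR networks.

```python
# Sketch: GF(2^8) arithmetic with primitive polynomial x^8+x^4+x^3+x^2+1 (0x11D).
PRIM, M = 0x11D, 8                    # bit i of PRIM = coefficient of x^i

def gf_add(a, b):
    return a ^ b                      # addition/subtraction = bitwise XOR

def gf_mul(a, b, prim=PRIM, m=M):
    p = 0
    for i in range(m):                # carry-less (polynomial) multiply
        if (b >> i) & 1:
            p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1:              # reduce each term of degree >= m
            p ^= prim << (i - m)
    return p

print(gf_add(0x53, 0xCA))             # stays within 8 bits
print(gf_mul(0x53, 0xCA) < 256)       # True: product reduced into the field
```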

3.3 PROPERTIES OF GALOIS FIELDS

1. In every field GF(2^m) there is a primitive element α, such that every element of GF(2^m) except zero can be expressed as a power of α [28]. Every field GF(2^m) can be generated using a primitive polynomial over GF(2), and the arithmetic performed in the GF(2^m) field is modulo this primitive polynomial.

2. If α is a primitive element of GF(2^m), its conjugates α^(2^i) are also primitive elements of GF(2^m).

3. If α is an element of order n in GF(2^m), all its conjugates have the same order n.

4. If α, an element of GF(2^m), is a root of a polynomial f(x) over GF(2), then all the distinct conjugates of α, also elements of GF(2^m), are roots of f(x).

5. The 2^m − 1 non-zero elements of GF(2^m) form all the roots of x^(2^m − 1) − 1 = 0. The elements of GF(2^m) form all the roots of x^(2^m) − x = 0.

3.4 CONSTRUCTION OF GALOIS FIELDS

A Galois field GF(2^m) with primitive element α is generally represented as (0, 1, α, α^2, …, α^(2^m − 2)). The simplest example of a finite field is the binary field consisting of the elements (0, 1). Traditionally referred to as GF(2), the operations in this field are defined as integer addition and multiplication reduced modulo 2. Larger fields can be created by extending GF(2) into a vector space, leading to finite fields of size 2^m. These are simple extensions of the base field GF(2) over m dimensions. The field GF(2^m) is thus defined as a field with 2^m elements, each of which is a binary m-tuple. Using this definition, m bits of binary data can be grouped and referred to as an element of GF(2^m). This in turn allows applying the associated mathematical operations of the field to encode and decode data [10]. Let the primitive polynomial be φ(x), of degree m over GF(2). The i-th element of the field is then given by α^i, reduced modulo φ(α). Hence all the elements of this field can be generated as powers of α. This is the polynomial representation of the field elements, which also assumes the leading coefficient of φ(x) to be equal to 1. Figure 3.1 shows the finite field generated by the primitive polynomial 1 + x^2 + x^3 + x^4 + x^8, represented as GF(2^8) or GF(256).

17

Page 18: Reed Solomon Doc

Figure 3.1: Representation of some elements in GF(2^8)

Note that here the primitive polynomial is of degree 8, and the numeric value of α is considered purely arbitrary. Using the irreducibility property of the polynomial φ(x), it can be proven that this construction indeed produces a field. The primitive polynomial is used in a simple iterative algorithm to generate all the elements of the field; hence different polynomials will generate different fields. In [29], the author claims that even though there are numerous choices for the irreducible polynomial, the fields constructed are all isomorphic.
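The iterative generation mentioned above can be sketched in a few lines: starting from α^0 = 1, repeated multiplication by α (a left shift, reduced by φ(x) whenever the degree reaches 8) enumerates all 255 non-zero elements of GF(2^8).

```python
# Sketch: enumerating GF(2^8) as powers of alpha, phi(x) = 1+x^2+x^3+x^4+x^8.
PRIM, M = 0x11D, 8                    # mask of phi(x); bit i = coeff of x^i

exp_table = []
elem = 1                              # alpha^0
for i in range((1 << M) - 1):
    exp_table.append(elem)            # exp_table[i] = alpha^i
    elem <<= 1                        # multiply by alpha (i.e., by x)
    if elem >> M:
        elem ^= PRIM                  # reduce modulo phi(x)

assert len(set(exp_table)) == 255     # every non-zero element reached exactly once
log_table = {e: i for i, e in enumerate(exp_table)}   # inverse map (discrete log)
```

The exp/log tables built this way are the usual basis for fast multiplication and inversion, by adding or negating exponents modulo 255.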

3.5 GALOIS FIELD ARITHMETIC

Galois field arithmetic (GFA) is attractive in several ways. All operations that would result in overflow or underflow in conventional arithmetic are mapped onto a value inside the field because of the modulo arithmetic followed in GFA; hence rounding issues are also automatically eliminated.

A Galois field facilitates the representation of all elements with a finite-length binary word. For example, in GF(2^m) all operands are m bits wide, where m is always smaller than the conventional bus width of 32 or 64. This in turn introduces a large amount of parallelism into GFA operations. Further, m is often assumed to be a multiple of 8, or a power of 2, because of its inherent convenience for sub-word parallelism [30]. To study GFA, an introduction to the mathematical concepts of the trace and the dual basis is necessary [31], [32].


Definition 1: The trace of an element β belonging to GF(2^m) is defined as Tr(β) = β + β^2 + β^(2^2) + … + β^(2^(m−1)).

Definition 2: A basis {μj} of GF(2^m) is a set of m linearly independent elements of GF(2^m), where 0 ≤ j ≤ m − 1.

Definition 3: Two bases {μj} and {λk} are the dual of one another if Tr(μj λk) = 1 when j = k, and 0 otherwise.

CHAPTER 4


REED SOLOMON CODES

4.1 INTRODUCTION

Highly reliable data transmission and storage systems frequently use error correction codes (ECC) to protect data. By adding a certain degree of redundancy, these codes are able to detect and correct errors in the coded information. In the design of highly reliable electronic systems, both the Reed-Solomon (RS) encoder and decoder should be self-checking in order to avoid faults in these blocks which would compromise the reliability of the whole system. In fact, a fault in the encoder can produce an incorrect codeword, while a fault in the decoder can give a wrong data word even if no errors occur in the codeword transmission. Therefore, great attention must be paid to detecting and recovering from faults in the encoding and decoding circuitry. Nowadays, the most used error correcting codes are the RS codes, based on the properties of finite field arithmetic. In particular, finite fields with 2^m elements are suitable for digital implementations due to the isomorphism between addition, performed modulo 2, and the XOR operation between the bits representing the elements of the field.

The use of the XOR operation in addition and multiplication makes it possible to use parity check-based strategies to check for the presence of faults in the RS encoder, while the implicit redundancy in the codeword is used both to correct erroneous data and to detect faults inside the decoder block.

4.2 RS CODES BACKGROUND

In this section, a short background on RS codes is outlined; more information about finite fields and RS codes can be found in the literature. The finite fields used in digital implementations are of the form GF(2^m), where m represents the number of


bits of a symbol to be coded. An element a(x) ∈ GF(2^m) is a polynomial with coefficients ai ∈ {0, 1} and can be seen as a symbol of m bits. The addition of two elements a(x) and b(x) ∈ GF(2^m) is the sum modulo 2 of the coefficients ai and bi, i.e., it is the bitwise XOR of the two symbols a and b. The multiplication of two elements a(x) and b(x) ∈ GF(2^m) requires the multiplication of the two polynomials followed by reduction modulo i(x), where i(x) is an irreducible polynomial of degree m. Multiplication can be implemented as an AND-XOR network.

The RS(n, k) code is defined by representing the data word symbols as elements of the field GF(2^m); the overall data word is treated as a polynomial d(x) of degree k − 1 with coefficients in GF(2^m). The RS codeword is then generated by using the generator polynomial g(x). All valid code words are exactly divisible by g(x). The general form of g(x) is

g(x) = (x + α)(x + α^2) ··· (x + α^(2t))

where 2t = n − k and α is a primitive element of the field, i.e., α^(2^m − 1) = 1.

The code words of a separable RS(n, k) code correspond to polynomials c(x) of degree n − 1 that can be generated by using the following formulas:

c(x) = d(x) · x^(n−k) + p(x), with p(x) = d(x) · x^(n−k) mod g(x)

where p(x) is a polynomial of degree less than n − k representing the parity symbols. In practice, the encoder takes k data symbols and adds 2t parity symbols, obtaining an n-symbol codeword. The 2t parity symbols allow the correction of up to t symbols containing errors in a codeword. Defining the


Hamming distance H(a(x), b(x)) of two polynomials a(x) and b(x) of degree n as the number of coefficients of the same degree that are different, and the Hamming weight W(a(x)) as the number of non-zero coefficients of a(x), it is easy to prove that H(a(x), b(x)) = W(a(x) − b(x)).

In an RS(n, k) code the minimum Hamming distance between two code words is n − k + 1. After the transmission of the coded data on a noisy channel, the decoder receives as input a polynomial r(x) = c(x) + e(x), where e(x) is the error polynomial. The RS decoder identifies the position and magnitude of up to t errors and is able to correct them.

In other words, the decoder is able to identify the e(x) polynomial if the Hamming weight W(e(x)) is not greater than t. The decoding algorithm provides as output the only codeword having a Hamming distance not greater than t from the received polynomial r(x).

METHODOLOGY AND PREVIOUS WORK

In this section, the motivation for the design methodology used in the proposed implementations is described, starting from an overview of the published literature.

A radiation-tolerant RS encoder, hardened against space radiation effects through circuit and layout techniques, has been presented in the literature. Single and multiple parity bit schemes have been presented to check the correctness of addition and multiplication in the polynomial basis representation of finite fields.

These techniques can be extended to detect faults occurring in the RS encoder, achieving the self-checking property for the RS encoder implementation. Moreover, a method to obtain CED circuits for finite field multipliers and inverters has been proposed. Since both the RS encoder and decoder are based on GF(2^m) addition, multiplication, and inversion, their self-


checking implementation can be obtained by using CED implementations of these basic arithmetic operations.

Moreover, a self-checking algorithm for solving the key equation (which is a part of the overall decoding algorithm) has been introduced. By exploiting this algorithm and substituting the elementary operations with the corresponding CED implementations for the other parts of the decoding algorithm, a self-checking decoder can be implemented. This approach can be used for the encoder, which uses only addition and constant multiplication, and is illustrated in the following subsection; but it is unusable for the decoder, as described later in this work, and a specific technique will be explained in a later section.

4.3 Characteristics of the RS Encoder

I. INTRODUCTION

Reed-Solomon error correcting codes (RS codes) are widely used in communication systems and data storage to recover data from possible errors that occur during transmission or from disc errors, respectively. One typical application of the RS codes is Forward Error Correction (FEC), shown in Fig. 1, in the optical network G.709, which has a fast transmission rate of 40 Gbps.

Before data transmission, the encoder attaches parity symbols to the data using a predetermined algorithm. At the receiving side, the decoder detects and corrects a limited, predetermined number of errors that occurred


during transmission. Transmitting the extra parity symbols requires extra bandwidth compared to transmitting the pure data. However, transmitting the additional symbols introduced by FEC is better than retransmitting the whole package whenever at least one error has been detected by the receiver. Many implementations of the RS codec are targeted to ASIC design, and only a few papers discuss synthesizing the RS codec for reconfigurable devices. Implementing a Reed-Solomon codec on reconfigurable devices is attractive for two main reasons. FPGAs provide flexibility, where the algorithm parameters can be altered to provide different error correction capabilities. They also provide a rapid development cycle, resulting in a short time to market, which is a major factor in industry.

The objective of this work is to implement generic Reed-Solomon VHDL code to measure the performance of the RS codec on Altera's StratixII. The performance of the implemented RS codec will be compared to the performance of Altera's RS codec. The performance metrics to be used are the area occupied by the design and the speed at which the design can run. The Reed-Solomon code to be implemented is RS(255,223). This chapter covers the theory behind the Reed-Solomon code, the architecture of the implemented RS codec, the preliminary results, and future work extending the current research.

II. REED-SOLOMON THEORY

A Reed-Solomon code is a block code and can be specified as RS(n, k), as shown in Fig. 2. The variable n is the size of the codeword in symbols, k is the number of data symbols and 2t is the number of parity symbols. Each symbol contains s bits.


The relationship between the symbol size, s, and the size of the codeword, n, is given by (1). This means that if there are s bits in one symbol, there exist 2^s − 1 distinct symbols, excluding the all-zeros one, in one codeword.

n = 2^s − 1 (1)

The RS code allows correcting up to t symbol errors, where t is given by

t = (n − k)/2 (2)

A. Galois Field

The Reed-Solomon code is defined in a Galois field, which contains a finite set of numbers such that any arithmetic operation on elements of the set results in an element belonging to the same set. Every element, except zero, can be expressed as a power of a primitive element of the field. The non-zero field elements form a cyclic group defined by a binary primitive polynomial. Addition of two elements in the Galois field is simply the exclusive-OR (XOR) operation. However, multiplication in the Galois field is more complex than standard arithmetic: it is multiplication modulo the primitive polynomial used to define the Galois field. For example, the Galois field GF(8) is constructed with the primitive polynomial p(z) = z^3 + z + 1, based on the primitive element α = z.


GF(8) is the basis for the Reed-Solomon code RS(7,3). Because each symbol has log2(8) = 3 bits, the variables for this RS code are

n = 2^3 − 1 = 7, k = 3

(arbitrarily chosen to balance the number of information and parity symbols in one codeword), and

t = (n − k)/2 = 2.

A stream of k data symbols can then be represented as a data polynomial of degree k − 1.

B. Encoder


The transmitted codeword is systematically encoded and defined in (3) as a function of the transmitted message m(x), the generator polynomial g(x) and the number of parity symbols 2t:

c(x) = m(x) · x^(2t) + (m(x) · x^(2t) mod g(x)) (3)

where g(x) is the generator polynomial of degree 2t, given by

g(x) = (x + α)(x + α^2) ··· (x + α^(2t)) (4)

The variable α is a root of the binary primitive polynomial of degree s. In OTN G.709, the binary primitive polynomial is defined as

x^8 + x^4 + x^3 + x^2 + 1.
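As an illustration, the following sketch expands g(x) over GF(2^8) by multiplying the factors (x + α^i) one at a time. The power of α at which the roots start is a convention of the particular standard, so it is left as a parameter (first_root) here; the helper gf_mul is the Chapter 3 sketch, repeated so the snippet runs standalone.

```python
# Sketch: expanding g(x) = (x + a^b)(x + a^(b+1)) ... (x + a^(b+2t-1))
# over GF(2^8) with the G.709 primitive polynomial (mask 0x11D); alpha = x = 2.
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):                          # carry-less multiply
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):       # reduce mod primitive polynomial
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def rs_generator_poly(two_t, first_root=1):
    g = [1]                                     # coefficients, low degree first
    a_i = 1
    for _ in range(first_root):
        a_i = gf_mul(a_i, 2)                    # alpha^first_root
    for _ in range(two_t):
        g = [0] + g                             # multiply g(x) by x ...
        for j in range(len(g) - 1):
            g[j] ^= gf_mul(g[j + 1], a_i)       # ... then add alpha^i * g(x)
        a_i = gf_mul(a_i, 2)                    # advance to the next root
    return g

g = rs_generator_poly(16)                       # 2t = 16 parity symbols
print(len(g) - 1)                               # degree 16, cf. (12) below
```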

C. Decoder

After going through a noisy transmission channel, the encoded data can be represented as

r(x) = c(x) + e(x) (5)

where e(x) represents the error polynomial, with the same degree as c(x) and r(x). Once the decoder evaluates e(x), the transmitted message, c(x), is recovered by adding the received message, r(x), to the error polynomial, e(x), as shown in Equation (6):

C(x) = r(x) + e(x) = c(x) + e(x) + e(x) = c(x) (6)

Note that e(x) + e(x) = 0 because addition in a Galois field is equivalent to an exclusive-OR, and e(x) XOR e(x) = 0.


Five functional blocks form the decoder:

1) Syndrome Calculator: In this block, errors are detected by calculating the syndrome polynomial S(x), as shown in (7); the result is used by the Key Equation Solver functional block:

S(x) = S1 + S2·x + ··· + S2t·x^(2t−1), where Si = r(α^i) (7)

When S(x) = 0, the received codeword is error free. Otherwise, the Key Equation Solver will use S(x) to generate the error locator polynomial, Λ(x), and the error evaluator polynomial, Ω(x).

2) Key Equation Solver: The key equation describes the relationship between the syndromes and the error locator and error evaluator polynomials:

Λ(x) · S(x) ≡ Ω(x) mod x^(2t) (14)

Solving (14) gives the error locator polynomial Λ(x) and the error evaluator polynomial Ω(x), which can be represented in the general forms shown in (10) and (11), respectively.


The error locator polynomial Λ(x) has degree e ≤ t, where e is the number of errors. The error evaluator polynomial Ω(x) has degree at most e − 1 and determines the magnitudes of the e errors. Different algorithms have been used to solve the key equation; two common ones are the Euclidean algorithm and the Berlekamp-Massey (BM) algorithm, shown in Fig. 3 and Fig. 4, respectively. A glance at the flowcharts in Fig. 3 and Fig. 4 reveals that the Euclidean algorithm has a simpler structure than the BM algorithm. However, it needs a significant number of logic elements to implement the polynomial division function. The BM algorithm, on the other hand, has a more complex structure but uses fewer gates. In this project, the BM algorithm is chosen for implementation because of its low hardware utilization in solving the key equation.
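For reference, a compact software model of the Berlekamp-Massey iteration (the hardware version is the state machine of Fig. 8 later in this chapter). This is a sketch over GF(2^8) with the same 0x11D polynomial assumed in the earlier snippets; inversion is done here by exponentiation for brevity.

```python
# Sketch: Berlekamp-Massey over GF(2^8); returns the error locator Lambda(x),
# coefficients low degree first. Helpers repeated so the snippet runs standalone.
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def gf_inv(a):
    r = 1
    for _ in range(254):              # a^254 = a^-1 in GF(256)
        r = gf_mul(r, a)
    return r

def berlekamp_massey(S):
    """S = [S_1, ..., S_2t]: the syndromes."""
    C, B = [1], [1]                   # current / previous connection polynomials
    L, m, b = 0, 1, 1
    for n in range(len(S)):
        d = S[n]                      # discrepancy
        for i in range(1, L + 1):
            d ^= gf_mul(C[i], S[n - i])
        if d == 0:
            m += 1
            continue
        coef = gf_mul(d, gf_inv(b))
        need = len(B) + m
        if len(C) < need:
            C += [0] * (need - len(C))
        T = C[:]
        for i, Bi in enumerate(B):    # C(x) -= (d/b) * x^m * B(x)
            C[i + m] ^= gf_mul(coef, Bi)
        if 2 * L <= n:                # register-length change: swap in old C
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C                          # degree of C = number of located errors
```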

3) Error Locator: The locations of the errors are determined from the error locator polynomial Λ(x) of (10). Each non-zero field element is substituted into Λ(x); if Λ(α^(−i)) = 0, then position i of the codeword is in error. This exhaustive evaluation over the field is known as the Chien search algorithm.

4) Error Evaluator: The magnitudes of the errors are determined from the error evaluator polynomial Ω(x), evaluated for every erroneous symbol position in the codeword.

4.4 DESIGN AND IMPLEMENTATION

A. Encoder


The encoder is architected using a Linear Feedback Shift Register (LFSR) design. The coefficients of the generator polynomial are derived from (4) and are given in (12):

g(x) = x^16 + 59x^15 + 13x^14 + 104x^13 + 189x^12 + 68x^11 + 209x^10 + 30x^9 + 8x^8 + 163x^7 + 65x^6 + 41x^5 + 229x^4 + 98x^3 + 50x^2 + 36x + 59 (12)

Each message is accompanied by a pulse signal which indicates the beginning of the message. After 239 clock cycles, the encoder starts concatenating the 16 calculated parity symbols to the message to make a codeword of 255 symbols.
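The LFSR division performed by this architecture can be modeled in software as follows; a sketch in which one loop iteration corresponds to one clock cycle, g(x) is monic with coefficients low degree first, and gf_mul is the helper from the earlier snippets.

```python
# Sketch: systematic RS encoding, p(x) = m(x) * x^(2t) mod g(x), via the LFSR.
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def rs_encode(msg, g):
    """msg: k data symbols (highest degree first); g: monic, low degree first."""
    two_t = len(g) - 1
    reg = [0] * two_t                     # the 2t LFSR registers
    for s in msg:                         # one symbol per clock cycle
        fb = s ^ reg[-1]                  # feedback = input + top register
        for i in range(two_t - 1, 0, -1): # each slice: multiply, add, shift
            reg[i] = reg[i - 1] ^ gf_mul(fb, g[i])
        reg[0] = gf_mul(fb, g[0])
    return msg + reg[::-1]                # codeword = data then parity symbols
```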

B. Decoder

The high-level architecture of the decoding data path is shown in Fig. 6. The decoder first calculates the syndrome of the received codeword to detect any potential errors that occurred during transmission. If the syndrome polynomial S(x) is not zero, the received codeword is erroneous and will be corrected, provided the number of erroneous symbols does not exceed eight.


1) Syndrome Calculator: The syndrome block takes in codeword after codeword at a rate of 1 symbol per clock cycle. The i_start signal indicates the beginning of each codeword. The syndrome architecture is shown in Fig. 7.

The coefficients are obtained by solving (13). After 255 clock cycles, S(x) is ready to be processed by the Key Equation Solver.
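In software, the per-cycle accumulation performed by the Fig. 7 architecture is a Horner evaluation of the received polynomial at each root. A sketch, assuming roots α^1 … α^2t and repeating the GF helper so it runs standalone:

```python
# Sketch: S_i = r(alpha^i) by Horner's rule, one received symbol per "cycle".
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def syndromes(received, two_t):
    """received: n symbols, highest-degree coefficient first."""
    out = []
    for i in range(1, two_t + 1):
        a_i, s = 1, 0
        for _ in range(i):
            a_i = gf_mul(a_i, 2)      # alpha^i, with alpha represented by 2
        for sym in received:          # s = s * alpha^i + symbol, each cycle
            s = gf_mul(s, a_i) ^ sym
        out.append(s)
    return out                        # all zeros <=> received word is a codeword
```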


2) Key Equation Solver: The Key Equation Solver waits for the i_start signal before capturing the syndrome polynomial S(x). The Berlekamp-Massey algorithm is implemented using a state machine design, as shown in Fig. 8.

The state machine is designed from the BM flowchart of Fig. 4. The state machine is initialized each time a syndrome S(x) is ready to be processed by the Key Equation Solver to generate Λ(x). Once Λ(x) is found, Ω(x) is calculated using (14).

3) Chien Search Error Locator: The Chien search architecture is shown in Fig. 9. In the standard formulation, the Chien and Forney algorithms for calculating the error locations and values proceed roughly as follows:

For i = 1 to 255:
    If Λ(α^(−i)) = 0 then
        e_i = Ω(α^(−i)) / Λ′(α^(−i))
        cc_i = r_i + e_i
    End If
End For


The Chien algorithm calculates the locations of the erroneous symbols in each codeword. In the algorithm above, cc_i and r_i represent the i-th symbol of the corrected codeword and of the received polynomial, respectively, and Λ′(x) is the formal derivative of Λ(x); the ratio Ω(x)/Λ′(x), evaluated at the error locations, therefore gives the error values.

Fig. 10 shows the architecture of the forney error evaluator. It implements

part of the algorithm described above. An”INVERSE ROM” is implemented as a

look-up table to store the inverse field of the Galois field elements since current

state of art’s reconfigurable devices has resources for look-up tables.
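Combining the last two steps, the following sketch runs the Chien search and Forney evaluation together, assuming Λ(x) and Ω(x) with low-degree-first coefficients, roots starting at α^1, and the GF(2^8) helpers repeated from the earlier snippets.

```python
# Sketch: Chien search for error positions + Forney formula for magnitudes.
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def gf_pow(a, e):
    r = 1
    for _ in range(e): r = gf_mul(r, a)
    return r

def gf_poly_eval(poly, x):            # poly: low degree first; Horner's rule
    y = 0
    for c in reversed(poly):
        y = gf_mul(y, x) ^ c
    return y

def chien_forney(Lam, Om, n=255):
    # formal derivative over GF(2): only odd-degree terms of Lambda survive
    deriv = [Lam[i] if i % 2 else 0 for i in range(1, len(Lam))]
    errors = {}
    for j in range(n):                        # try every codeword position
        x_inv = gf_pow(2, (n - j) % n)        # alpha^(-j) = alpha^(n-j)
        if gf_poly_eval(Lam, x_inv) == 0:     # Chien: root => error at position j
            num = gf_poly_eval(Om, x_inv)
            den = gf_poly_eval(deriv, x_inv)
            errors[j] = gf_mul(num, gf_pow(den, 254))   # Forney: Om/Lam'
    return errors                             # map: position -> error magnitude
```

In hardware, the exponentiation used here for the inverse is exactly what the "INVERSE ROM" look-up table replaces.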

4.5 SELF-CHECKING RS ENCODER

Implementations of RS encoders are usually based on an LFSR, which implements the polynomial division over the finite field. In Fig. 1, the implementation of an RS encoder is shown. The additions and multiplications are performed in GF(2^m), and gi are the coefficients of the generator polynomial g(x). The RS encoder architecture is composed of slice blocks, each containing a constant multiplier, an adder, and a register (see the shaded block in Fig. 1). The number of slices needed to implement an RS(n, k) code is n − k. The self-checking


implementation requires the insertion of some parity prediction blocks and a parity checker. The correctness of each slice is checked by using the architecture shown in Fig. 2.


The input and output signals of the slice are as follows.

• Ain is the registered output of the previous slice.
• Pin is the registered parity of the previous slice.
• Fin is the feedback of the LFSR.
• PFin is the parity of the feedback input.
• Aout is the result of the multiplication and addition operations.
• Pout is the predicted parity of the result.

The parity prediction block is implemented by using (5). It must be noted that some constraints on the implementation of the constant multiplier must be added in order to avoid interference between different outputs when a fault occurs. These interferences are due to the sharing of intermediate results between different outputs and can therefore be avoided by using networks with fan-out equal to one; considering the field-programmable gate array (FPGA) implementation of the constant multiplier, this constraint is not a serious drawback.

In fact, each output bit is computed by implementing a XOR network requiring a very limited number of LUTs: for example, considering the field GF(2^8) and an FPGA based on four-input LUTs, three LUTs are required in the worst case. Table I reports the overhead introduced for different constants gi without resource sharing in the case of GF(2^8).
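The parity prediction exploited here rests on a simple fact: a constant multiplier over GF(2^m) is linear over GF(2), so the parity of its output is a fixed XOR of a subset of the input bits. A sketch (the constant 0x3B is a hypothetical coefficient, and equation (5) itself is not reproduced here):

```python
# Sketch: predicting the output parity of a GF(2^8) constant multiplier y = g*a.
def gf_mul(a, b, prim=0x11D, m=8):
    p = 0
    for i in range(m):
        if (b >> i) & 1: p ^= a << i
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1: p ^= prim << (i - m)
    return p

def parity(v):
    return bin(v).count("1") & 1

def parity_mask(g, m=8):
    mask = 0
    for k in range(m):                 # parity contributed by input bit k alone
        if parity(gf_mul(g, 1 << k)):
            mask |= 1 << k
    return mask

g = 0x3B                               # hypothetical generator coefficient g_i
mask = parity_mask(g)
for a in range(256):                   # predicted parity == actual parity, always
    assert parity(a & mask) == parity(gf_mul(g, a))
```

In hardware this mask reduces to a small XOR tree over selected input bits, which is consistent with the few-LUT cost of the prediction logic reported in Table I.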

The predicted parity bit and the outputs of each slice are evaluated by the parity checker block, as shown in Fig. 3, and an error indicator reports whether a difference between the predicted parity bit and the parity of the m slice outputs is detected.


The parity checker block checks whether the parity of its inputs is even or odd. The self-checking implementation of the parity checker is realized with a two-rail circuit. The two outputs are each equal to the parity of one of two disjoint subsets of the inputs, as proposed in the literature. The fault-free behavior of the checker, when a correct set of inputs is provided (i.e., no faults occur in the slices), is the following: the output codes 01 or 10 are generated for an odd parity checker, or the output codes 00 or 11 for an even parity checker. If the checker receives an erroneous codeword as input (i.e., a fault occurs in a slice), the checker provides the output codes 11 or 00 for an odd parity checker, or the output codes 01 or 10 for an even parity checker.

Similarly, if a fault occurs in the checker itself, the outputs provided are 11 or 00 for an odd parity checker, or 01 or 10 for an even parity checker. These considerations guarantee the self-checking property of the checker. It can be noticed that, due to the LFSR-based structure of the RS encoder, there are no control state machines to be protected against faults.
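As a minimal sketch, assuming m = 8 and an even parity checker (fault-free, the m output bits of a slice together with its predicted parity bit have even parity), the two-rail circuit can be written as:

    // Two-rail even parity checker: the inputs are split into two disjoint
    // subsets and each output rail carries the parity of one subset.
    // Fault-free: z = 00 or 11; an error yields z = 01 or 10.
    module two_rail_checker #(parameter M = 8) (
        input  [M-1:0] a,   // output bits of one slice
        input          p,   // predicted parity bit for that slice
        output [1:0]   z    // two-rail encoded check result
    );
        assign z[0] = ^a[M/2-1:0];        // parity of the lower subset
        assign z[1] = (^a[M-1:M/2]) ^ p;  // parity of the upper subset plus p
    endmodule

A fault is flagged whenever the two rails differ, i.e., when z[0] ^ z[1] is asserted.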

Therefore, the use of the described self-checking arithmetic structures allows checking the entire RS encoder. The evaluations in terms of area and delay of this structure have been carried out by using a Xilinx Virtex-II FPGA as the target device, and the design flow has been performed by using the Xilinx ISE Foundation framework.


Table I reports the area of each of the blocks described in this section. The adder is implemented by using one LUT for each output, while the area of the constant multipliers and of the parity prediction block depends on the coefficients g_i. In Table I, the row named "additional logic" represents the logic added to the slice in order to predict the parity bit. The number of LUTs required to implement the parity checker depends on the number of slices of the encoder, i.e., on the number n - k of check symbols of the RS code.

In particular, implementing the parity checker as a network of XOR gates, the number of LUTs is ⌈(n-k)(m+1)/3⌉. Starting from the results shown in Table I, the area overhead has been computed for the given case. The overhead is about 50% and is independent of the number of check symbols (n - k). In fact, for each check symbol (m = 8) the overhead for the single slice is about six LUTs, plus the overhead due to the parity checker (three LUTs), i.e., about nine additional LUTs per check symbol in total.

The characterization of the critical path is different for each slice, depending on the complexity of the constant multiplier g_i. In the worst case, a constant multiplier g_i implemented by using an eight-input XOR network requires three LUTs; therefore, in the worst case five LUTs are crossed along the path.


In order to compute the critical path of the overall self-checking encoder architecture, the following additional signal paths must be considered:

• the path crossing the parity prediction block, which is comparable with the path of the worst-case constant multiplier;

• the path crossing the parity checker. This path depends on the number of bits provided as input to the checker: the number of required LUT levels equals the number of levels of the four-input XOR network, that is, ⌈log4((n-k)(m+1))⌉ (a worked example follows).
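For instance, for an assumed RS(255, 239) code over GF(2^8), i.e., n - k = 16 check symbols and m = 8, the checker size and depth work out to:

    (n-k)(m+1) = 16 · 9 = 144 input bits
    checker area:  ⌈144 / 3⌉   = 48 LUTs
    checker depth: ⌈log4(144)⌉ = ⌈3.59⌉ = 4 LUT levels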

The number of levels of the two-rail parity checker increases very slowly with the growth of the number of check symbols and, therefore, does not represent a problem for the maximum frequency of the self-checking encoder.

4.6 CONCURRENT ERROR DETECTION SCHEME OF THE RS DECODER

In Fig. 4, the CED implementation of the RS decoder is shown. Its main

blocks are as follows.

• The RS decoder, i.e., the block to be checked.

• An optional error polynomial recovery block (the shaded block shown in Fig. 4). This block is needed if the RS decoder does not provide the error polynomial coefficients at its output.

• A Hamming weight counter, which checks the number of nonzero coefficients of the error polynomial.

• A codeword checker, which checks whether the output data of the RS decoder form a correct codeword.


• An error detection block, which takes as inputs the outputs of the Hamming weight counter and of the codeword checker and provides an error detection signal if a fault in the RS decoder has been detected.

The RS decoder can be considered as a black box performing an algorithm for the error detection and correction of the input data (the coefficients of the received data forming the polynomial c(x)). The error polynomial recovery block is composed of a shift register of length L (the latency of the decoder) and of a GF(2^m) adder having as operands the coefficients of the received polynomial c(x) and of the corrected polynomial ĉ(x), so that the error polynomial e(x) = c(x) + ĉ(x) is recovered.

The Hamming weight counter is composed of the following (a Verilog sketch is given after the list):

• a comparator indicating (at each clock cycle) whether the current e(x) coefficient is zero;

• a counter that accumulates the number of nonzero coefficients;

• a comparator between the counter output and t, the maximum allowed number of nonzero elements.
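A minimal sketch, assuming m = 8, an assumed correction capability T, and a start pulse that clears the counter at the beginning of each codeword (all signal names are illustrative):

    // Hamming weight counter: counts the nonzero coefficients of e(x) and
    // flags a fault when the count exceeds T, the correction capability t.
    module hamming_weight_counter #(
        parameter M = 8,
        parameter T = 8              // assumed maximum number of errors
    ) (
        input              clk,
        input              start,    // clears the counter at codeword start
        input      [M-1:0] e_coeff,  // current coefficient of e(x)
        output reg [7:0]   weight,   // nonzero coefficients counted so far
        output             too_many  // asserted when weight exceeds T
    );
        wire nonzero = |e_coeff;     // comparator: coefficient != 0

        always @(posedge clk)
            if (start)        weight <= 8'd0;
            else if (nonzero) weight <= weight + 8'd1;

        assign too_many = (weight > T);  // comparator against t
    endmodule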

The codeword checker block checks whether the reconstructed ĉ(x) is a codeword, i.e., whether it is exactly divisible by the generator polynomial g(x). The following two implementations of this block are proposed.

Implementation 1:

It is based on the computation of the remainder of the polynomial division between ĉ(x) and g(x). If all the coefficients of the remainder polynomial are zero, then the polynomial ĉ(x) is a correct codeword. Computing the remainder of the division by g(x) is exactly the function of a systematic RS encoder. Therefore, a systematic RS encoder with the same g(x) polynomial as the decoder is used to check whether ĉ(x) is a codeword. Faults in the decoder can thus be detected without knowing g(x) and without knowing how the operations in GF(2^m) are performed.


We only need to reuse the same RS encoder used to create the codeword for the computation of the remainder of the polynomial ĉ(x) obtained from the decoder. The drawback of this implementation is the additional latency introduced by the RS encoder, which is n - k clock cycles. This latency must be taken into account by the error detection block, which must wait n - k clock cycles to check the two properties defined in Section III. The area occupation of the RS encoder is smaller than the area occupation of the decoder; therefore, the overhead introduced by this block is about 15% of the decoder area.

Implementation 2:

The codeword checker block is based on the so-called syndrome calculation. This operation is the first to be performed in the decoder; therefore, conceptually, this approach implies a partial duplication of the RS decoder, and it requires knowledge of the used Galois field and of the roots of g(x). The syndrome calculation implies the evaluation of the polynomial ĉ(x) for the values of x in the set A of the roots of g(x), i.e., A = {α_j : g(α_j) = 0}. The polynomial ĉ(x) is exactly divisible by g(x) if and only if it is exactly divisible by all the monomials (x - α_j), where α_j is a root of g(x); and ĉ(x) is divisible by (x - α_j) if and only if ĉ(α_j), i.e., the corresponding syndrome, is zero.

Therefore, the reconstructed polynomial is a codeword if and only if all the computed syndromes are zero. The syndrome computation block is composed of a GF(2^m) constant multiplier, an adder, and an m-bit register (a sketch of one such syndrome cell is given below). The output of this block is valid one clock cycle after the computation of the last coefficient of the polynomial. The area occupation of the syndrome calculation block is equivalent to the encoder area occupation: in both cases we need n - k blocks composed of an adder, a constant multiplier, and an m-bit register.
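As a minimal sketch, assuming GF(2^8) with the reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 and an assumed root α (02 hex), one cell can evaluate ĉ(x) by Horner's rule, one coefficient per clock cycle, highest-degree coefficient first:

    // One syndrome cell: computes S = c(ROOT) by Horner's rule.
    // Field: GF(2^8), reduction polynomial x^8+x^4+x^3+x^2+1 (assumed).
    module syndrome_cell #(parameter [7:0] ROOT = 8'h02) (
        input            clk,
        input            start,   // clears the accumulator at codeword start
        input      [7:0] c_in,    // current coefficient, highest degree first
        output reg [7:0] s        // accumulated syndrome; zero => no error
    );
        // general GF(2^8) multiply, behavioral shift-and-add formulation
        function [7:0] gf_mul (input [7:0] a, input [7:0] b);
            integer i; reg [7:0] p, t;
            begin
                p = 8'h00; t = a;
                for (i = 0; i < 8; i = i + 1) begin
                    if (b[i]) p = p ^ t;
                    t = {t[6:0], 1'b0} ^ (t[7] ? 8'h1D : 8'h00); // t = t*x
                end
                gf_mul = p;
            end
        endfunction

        always @(posedge clk)
            s <= gf_mul(start ? 8'h00 : s, ROOT) ^ c_in; // s = s*ROOT + c_i
    endmodule

Since ROOT is a constant, synthesis folds gf_mul into the constant-multiplier/adder/register structure described above; ĉ(x) is a codeword only if every such cell ends at zero.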

The main difference between implementations 1 and 2 is the latency of the codeword checker block. The error detection block takes as inputs the outputs of


the Hamming weight counter and the outputs of the codeword checker. Its implementation depends on the chosen implementation of the codeword checker. If we use implementation 1, the error detection block must delay the output of the Hamming weight counter for n - k clock cycles and check whether all the coefficients of the remainder polynomial are zero.

On the other hand, if we use the syndrome calculation block, the inputs are the computed syndromes, and the error detection block checks whether all of them are zero. The additional blocks used to detect faults inside the decoder are themselves susceptible to faults; therefore, their implementation must assure the self-checking property, in order to face the age-old question of "who checks the checker." For the codeword checker and the error polynomial recovery blocks, only registers and GF(2^m) additions and constant multiplications are used; therefore, the same considerations of Section IV can be used to obtain the self-checking property of these blocks. For the counters and the comparators used in the Hamming weight counter and error detection blocks, many efficient techniques can be found in the literature.


CHAPTER 5

Field Programmable Gate Array (FPGA)

5.1 History of FPGA

The historical roots of FPGAs are in complex programmable logic devices

(CPLDs) of the early to mid 1980s. A Xilinx co-founder invented the field

programmable gate array in 1984. CPLDs and FPGAs include a relatively large

number of programmable logic elements. CPLD logic gate densities range from

the equivalent of several thousand to tens of thousands of logic gates, while

FPGAs typically range from tens of thousands to several million.

The primary differences between CPLDs and FPGAs are architectural. A

CPLD has a somewhat restrictive structure consisting of one or more

programmable sum-of-products logic arrays feeding a relatively small number of

clocked registers. The result

of this is less flexibility, with the advantage of more predictable timing delays and

a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand,

are dominated by interconnect. This makes them far more flexible (in terms of the

range of designs that are practical for implementation within them) but also far

more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in

most FPGAs of higher-level embedded functions (such as adders and multipliers)

and embedded memories. Some FPGAs have the capability of partial reconfiguration that lets one portion of the device be re-programmed while other

portions continue running.


5.2 Basic concepts of FPGA

An FPGA is a device that contains a matrix of reconfigurable gate-array logic circuitry. The programmable logic components can be programmed to duplicate the functionality of basic logic gates such as AND, OR, XOR, and NOT, or of more complex combinational functions such as decoders or simple mathematical functions. Most FPGAs include memory elements in these programmable logic components, consisting of simple flip-flops or more complete blocks of memory. When an FPGA is configured, the internal circuitry is connected in a way that creates a hardware implementation of the software application. Unlike processors, FPGAs use dedicated hardware for processing logic and do not require an operating system.

FPGAs consist of three components:

• an array of programmable logic blocks, with look-up tables (LUTs), registers, and multiplexers;

• programmable interconnect;

• I/O blocks around the perimeter.

Block Diagram of FPGA


The performance of an application is not affected when additional processing is added to the FPGA, since the device is parallel in nature and processes do not have to compete for the same resources. An FPGA can enforce critical interlock logic and can be designed to prevent I/O forcing by an operator. Unlike hardwired printed circuit board (PCB) designs, which have fixed and limited hardware resources, an FPGA-based system can literally rewire its internal circuitry to allow reconfiguration after the control system is deployed to the field.

FPGA devices deliver the performance and reliability of dedicated hardware circuitry. Thousands of discrete components can be replaced by a single FPGA, which incorporates millions of logic gates in a single integrated circuit. The internal resources of an FPGA chip consist of a matrix of configurable logic blocks (CLBs) connected to a periphery of I/O blocks. Signals are routed within the FPGA matrix by programmable interconnect switches and wire routes.

5.3 FPGA Advantage

FPGAs have many advantages. One of them is a unified flow that allows designing for any silicon, vendor, and language. The silicon targets include PLDs, platform FPGAs, structured ASICs, ASIC prototypes, ASICs, and SoCs. Many vendors are available, such as Altera, Xilinx, Actel, Atmel, ChipExpress, and Lattice. Many languages can be used to program an FPGA, such as VHDL, Verilog, SystemVerilog, C, C++, PSL, and SVA.

FPGAs are well suited to large designs, and they are reconfigurable. When creating designs, we can use simple VHDL or Verilog constructs to build a complex FPGA design. Moreover, FPGAs can deliver a technical edge, such as optimized timing closure with Precision Synthesis, advanced timing analysis, and timing closure through I/O optimization and PCB integration. FPGAs can also streamline the design process, roughly halving the design time through a rapid development flow.


5.4 Language Used in FPGA

For a long time, programming languages such as FORTRAN, Pascal, and

C were being used to describe computer programs that were sequential in

nature. Similarly, in the digital design field, designers felt the need for a standard

language to describe digital circuits. Thus, Hardware Description Languages

(HDLs) came into existence. HDLs allowed the designers to model the

concurrency of processes found in hardware elements. Hardware description

languages such as Verilog HDL and VHDL became popular. Verilog HDL

originated in 1983 at Gateway Design Automation. Later, VHDL was developed

under contract from DARPA. Simulators for both Verilog® and VHDL, capable of simulating large digital circuits, quickly gained acceptance from designers.

Even though HDLs were popular for logic verification, designers had to

manually translate the HDL-based design into a schematic circuit with

interconnections between gates. The advent of logic synthesis in the late 1980s

changed the design methodology radically. Digital circuits could be described at

a register transfer level (RTL) by use of an HDL. Thus, the designer had to

specify how the data flows between registers and how the design processes the

data. The details of gates and their interconnections to implement the circuit were

automatically extracted by logic synthesis tools from the RTL description.

Thus, logic synthesis pushed the HDLs into the forefront of digital design.

Designers no longer had to manually place gates to build digital circuits. They

could describe complex circuits at an abstract level in terms of functionality and

data flow by designing those circuits in HDLs. Logic synthesis tools would

implement the specified functionality in terms of gates and gate interconnections.

HDLs also began to be used for system-level design. HDLs were used for

simulation of system boards, interconnect buses, FPGAs (Field Programmable

Gate Arrays), and PALs (Programmable Array Logic). A common approach is to


design each IC chip, using an HDL, and then verify system functionality via

simulation.

Today, Verilog HDL is an accepted IEEE standard. In 1995, the original

standard IEEE 1364-1995 was approved. IEEE 1364-2001 is the latest Verilog

HDL standard that made significant improvements to the original standard.

5.5 Importance of HDLs

HDLs have many advantages compared to traditional schematic-based design.

• Designs can be described at a very abstract level by use of HDLs. Designers can write their RTL description without choosing a specific fabrication technology. Logic synthesis tools can automatically convert the design to any fabrication technology. If a new technology emerges, designers do not need to redesign their circuit. They simply input the RTL description to the logic synthesis tool and create a new gate-level netlist, using the new fabrication technology. The logic synthesis tool will optimize the circuit in area and timing for the new technology.

• By describing designs in HDLs, functional verification of the design can be done early in the design cycle. Since designers work at the RTL level, they can optimize and modify the RTL description until it meets the desired functionality. Most design bugs are eliminated at this point. This cuts down design cycle time significantly, because the probability of hitting a functional bug later, in the gate-level netlist or physical layout, is minimized.

• Designing with HDLs is analogous to computer programming. A textual description with comments is an easier way to develop and debug circuits. It also provides a concise representation of the design compared to gate-level schematics, which are almost incomprehensible for very complex designs.


CHAPTER 6

INTRODUCTION TO VERILOG

6.1 Introduction

Verilog is a HARDWARE DESCRIPTION LANGUAGE (HDL). A hardware description language is a language used to describe a digital system, for example, a microprocessor, a memory, or a simple flip-flop. This means that, by using an HDL, one can describe any digital hardware at any level. Verilog is one of the HDLs available in the industry for designing hardware. Verilog allows us to describe a digital design at the behavioral level, register-transfer level (RTL), gate level, and switch level. Verilog allows hardware designers to express their designs with behavioral constructs, deferring the details of implementation to a later stage of the design.

Verilog supports a design at many different levels of abstraction. Three of

them are very important:

Behavioral level

Register-Transfer Level

Gate Level

6.2 Behavioral level

This level describes a system by concurrent algorithms (behavioral). Each algorithm itself is sequential, which means it consists of a set of instructions that are executed one after the other. Functions, tasks, and always blocks are the main elements. There is no regard for the structural realization of the design.

6.3 Register-Transfer Level


Designs using the register-transfer level specify the characteristics of a circuit by operations and by the transfer of data between registers. An explicit clock is used. RTL designs contain exact timing information: operations are scheduled to occur at specific times.

6.4 Gate Level

At the gate level, the characteristics of a system are described by logical links and their timing properties. All signals are discrete signals; they can only have definite logical values (`0', `1', `X', `Z'). The usable operations are predefined logic primitives (AND, OR, NOT, etc.). Writing gate-level models by hand is usually not a good idea for logic design: gate-level code is generated by tools such as synthesis tools, and this netlist is used for gate-level simulation and for the backend.
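To make the distinction concrete, here is a minimal sketch of the same 2-to-1 multiplexer written once at the behavioral/RTL level and once at the gate level using Verilog primitives (module names are illustrative):

    // Behavioral / RTL description: the synthesis tool infers the gates
    module mux2_rtl (input a, b, sel, output reg y);
        always @(*)
            y = sel ? b : a;
    endmodule

    // Gate-level description of the same function, built from primitives
    module mux2_gate (input a, b, sel, output y);
        wire nsel, w0, w1;
        not g0 (nsel, sel);
        and g1 (w0, a, nsel);
        and g2 (w1, b, sel);
        or  g3 (y, w0, w1);
    endmodule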

6.5 History of Verilog

Verilog started as a proprietary hardware modeling language from Gateway Design Automation Inc. around 1984. It is rumored that the original language was designed by taking features from the most popular HDL of the time, called HiLo, as well as from traditional computer languages such as C. At that time, Verilog was not standardized, and the language was modified in almost every revision that came out between 1984 and 1990.

The Verilog simulator was first used in 1985 and was extended substantially through 1987. The implementation was the Verilog simulator sold by Gateway. The first major extension was Verilog-XL, which added a few features and implemented the infamous "XL algorithm," a very efficient method for gate-level simulation.


In late 1990, Cadence Design Systems, whose primary products at that time included thin-film process simulators, decided to acquire Gateway Design Automation. Along with other Gateway products, Cadence now became the owner of the Verilog language and continued to market Verilog as both a language and a simulator.

At the same time, Synopsys was marketing the top-down design methodology, using Verilog. This was a powerful combination. In 1990, Cadence

recognized that if Verilog remained a closed language, the pressures of

standardization would eventually cause the industry to shift to VHDL.

Consequently, Cadence organized Open Verilog International (OVI), and in 1991

gave it the documentation for the Verilog Hardware Description Language.

This was the event which "opened" the language. OVI did a considerable amount of work to improve the Language Reference Manual (LRM), clarifying things and making the language specification as vendor-independent as possible. Soon it was realized that, if there were too many companies in the market for Verilog, potentially everybody would want to do what Gateway had done so far, changing the language for their own benefit. This would defeat the main purpose of releasing the language to the public domain.

As a result, in 1994, the IEEE 1364 working group was formed to turn the OVI LRM into an IEEE standard. This effort was concluded with a successful ballot in 1995, and Verilog became an IEEE standard in December 1995. When

Cadence gave OVI the LRM, several companies began working on Verilog

simulators. In 1992, the first of these were announced, and by 1993 there were

several Verilog simulators available from companies other than Cadence. The

most successful of these was VCS, the Verilog Compiled Simulator, from

Chronologic Simulation. This was a true compiler as opposed to an interpreter,

which is what Verilog-XL was. As a result, compile time was substantial, but

simulation execution speed was much faster. In the meantime, the popularity of

Verilog and PLI was rising exponentially.


Verilog as an HDL found more admirers than the well-formed and federally funded VHDL. It was only a matter of time before people in OVI realized the need

of a more universally accepted standard. Accordingly, the board of directors of

OVI requested IEEE to form a working committee for establishing Verilog as an

IEEE standard.

The working committee 1364 was formed in mid-1993, and it had its first meeting on October 14, 1993. The standard, which combined both the Verilog language syntax and the PLI in a single volume, was passed in May 1995 and is now known as IEEE Std 1364-1995. Over the following years, new features were added to Verilog, and the new version is called Verilog-2001. This version fixed a lot of problems that Verilog-1995 had, and it is standardized as IEEE Std 1364-2001. All that remains is for all the tool vendors to implement it.

6.6 Features of Verilog HDL

Verilog HDL has evolved as a standard hardware description language. Verilog

HDL offers many useful features:

• Verilog HDL is a general-purpose hardware description language that is

easy to learn and easy to use. It is similar in syntax to the C programming

language. Designers with C programming experience will find it easy to

learn Verilog HDL.

• Verilog HDL allows different levels of abstraction to be mixed in the same

model. Thus, a designer can define a hardware model in terms of

switches, gates, RTL, or behavioral code. Also, a designer needs to learn

only one language for stimulus and hierarchical design.

• Most popular logic synthesis tools support Verilog HDL. This makes it the

language of choice for designers.


• All fabrication vendors provide Verilog HDL libraries for post-logic synthesis

simulation. Thus, designing a chip in Verilog HDL allows the widest choice

of vendors.

• The Programming Language Interface (PLI) is a powerful feature that

allows the user to write custom C code to interact with the internal data

structures of Verilog. Designers can customize a Verilog HDL simulator to

their needs with the PLI.

6.7 Simulation

Simulation is the process of verifying the functional characteristics of models at any level of abstraction. We use simulators to simulate hardware models, to test whether the RTL code meets the functional requirements of the specification, and to see whether all the RTL blocks are functionally correct. To achieve this we need to write a testbench, which generates the clock, reset, and the required test vectors.

We use the waveform output from the simulator to see whether the DUT (Device Under Test) is functionally correct. Most simulators come with a waveform viewer. As the design becomes complex, we write a self-checking testbench, in which the testbench applies the test vectors and compares the output of the DUT with the expected value.
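A minimal sketch of such a self-checking testbench, assuming the hypothetical mux2_rtl module from Section 6.4 as the DUT (for this combinational DUT no clock or reset generation is needed; for a sequential DUT, an always block toggling clk and an initial reset pulse would be added):

    // Self-checking testbench: applies test vectors and compares the DUT
    // output with the expected value, reporting any mismatch.
    module tb_mux2;
        reg  a, b, sel;
        wire y;
        integer errors;

        mux2_rtl dut (.a(a), .b(b), .sel(sel), .y(y));

        task check (input expected);
            begin
                #1; // let the combinational logic settle
                if (y !== expected) begin
                    errors = errors + 1;
                    $display("FAIL: a=%b b=%b sel=%b y=%b (expected %b)",
                             a, b, sel, y, expected);
                end
            end
        endtask

        initial begin
            errors = 0;
            {a, b, sel} = 3'b000; check(1'b0);
            {a, b, sel} = 3'b100; check(1'b1); // sel = 0 selects a
            {a, b, sel} = 3'b011; check(1'b1); // sel = 1 selects b
            {a, b, sel} = 3'b101; check(1'b0);
            if (errors == 0) $display("PASS: all vectors matched");
            $finish;
        end
    endmodule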

There is another kind of simulation, called timing simulation, which is done after synthesis or after P&R (place and route). Here we include the gate delays and wire delays and see whether the DUT works at the rated clock speed. This is also called SDF simulation or gate-level simulation.

6.8 SYNTHESIS:

Synthesis is the process in which a synthesis tool such as Design Compiler or Synplify takes the RTL in Verilog or VHDL, the target technology, and constraints as inputs and maps the RTL to target technology primitives.


After mapping the RTL to gates, the synthesis tool also performs a minimal amount of timing analysis to see whether the mapped design meets the timing requirements.

SOFTWARE DETAILS:

SPARTAN-III FPGA FAMILY:

Architectural Description:

Spartan-III Array:

The Spartan-III user-programmable gate array is composed of five major

configurable elements:

• IOBs provide the interface between the package pins and the internal logic.

• CLBs provide the functional elements for constructing most logic.

• Dedicated block RAM memories of 4096 bits each.

• Clock DLLs for clock-distribution delay compensation and clock domain control.

• A versatile multi-level interconnect structure.

Values stored in static memory cells control all the configurable logic

elements and interconnect resources. These values load into the memory cells

on power-up, and can reload if necessary to change the function of the device.

Each of these elements will be discussed in detail.

Input/Output Block:


The Spartan-III IOB, as seen in Figure, features inputs and outputs that

support a wide variety of I/O signaling standards. These high-speed inputs and

outputs are capable of supporting various state-of-the-art memory and bus

interfaces.

The three IOB registers function either as edge-triggered D-type flip-flops

or as level-sensitive latches. Each IOB has a clock signal (CLK) shared by the

three registers and independent Clock Enable (CE) signals for each register. In

addition to the CLK and CE control signals, the three registers share a Set/Reset

(SR). For each register, this signal can be independently configured as a

synchronous Set, a synchronous Reset, an asynchronous Preset, or an

asynchronous Clear.

Figure Spartan-III Input/Output Block (IOB)

Input Path:

A buffer in the Spartan-II IOB input path routes the input signal either

directly to internal logic or through an optional input flip-flop. An optional delay


element at the D-input of this flip-flop eliminates pad-to-pad hold time. The delay

is matched to the internal clock-distribution delay of the FPGA, and when used,

assures that the pad-to-pad hold time is zero.

Output Path:

The output path includes a 3-state output buffer that drives the output

signal onto the pad. The output signal can be routed to the buffer directly from

the internal logic or through an optional IOB output flip-flop.

The 3-state control of the output can also be routed directly from the

internal logic or through a flip-flop that provides synchronous enable and disable.

Each output driver can be individually programmed for a wide range of low-voltage signaling standards. Each output buffer can source up to 24 mA and sink

up to 48 mA. Drive strength and slew rate controls minimize bus transients.

Storage Elements:

Storage elements in the Spartan-II slice can be configured either as

edge-triggered D-type flip-flops or as level-sensitive latches. The D inputs can be

driven either by function generators within the slice or directly from slice inputs,

bypassing the function generators.

Block RAM:

Spartan-III FPGAs incorporate several large block RAM memories. These

complement the distributed RAM. Block RAM memory blocks are organized in

columns. All Spartan-III devices contain two such columns, one along each

vertical edge. These columns extend the full height of the chip. Each memory


block is four CLBs high, and consequently, a Spartan-II device eight CLBs high

will contain two memory blocks per column, and a total of four blocks.


Design Implementation:

The place-and-route tools (PAR) automatically provide the implementation flow described in this section. The partitioner takes the EDIF netlist for the design and maps the logic into the architectural resources of the FPGA (CLBs and IOBs, for example).

The placer then determines the best locations for these blocks based on

their interconnections and the desired performance. Finally, the router

interconnects the blocks. The PAR algorithms support fully automatic

implementation of most designs. For demanding applications, however, the user

can exercise various degrees of control over the process. User partitioning,

placement, and routing information are optionally specified during the design-entry process.

The implementation of highly structured designs can benefit greatly from

basic floor planning. The implementation software incorporates Timing Wizard

timing-driven placement and routing. Designers specify timing requirements

along entire paths during design entry. The timing path analysis routines in PAR

then recognize these user-specified requirements and accommodate them.

Timing requirements are entered on a schematic in a form directly

relating to the system requirements, such as the targeted clock frequency, or the

maximum allowable delay between two registers. In this way, the overall

performance of the system along entire signal paths is automatically tailored to


user-generated specifications. Specific timing information for individual nets is

unnecessary.

Configuration:

Configuration is the process by which the bit stream of a design, as

generated by the Xilinx development software, is loaded into the internal

configuration memory of the FPGA.

Spartan-III devices support both serial configuration, using the master/slave serial and JTAG modes, and byte-wide configuration employing the Slave Parallel mode.

Modes

Spartan-III devices support the following four configuration modes:

• Slave Serial mode

• Master Serial mode

• Slave Parallel mode

• Boundary-scan mode

The Configuration mode pins (M2, M1, and M0) select among these

configuration modes with the option in each case of having the IOB pins either

pulled up or left floating prior to configuration.

Serial Modes:


There are two serial configuration modes: In Master Serial mode, the FPGA

controls the configuration process by driving CCLK as an output. In Slave Serial

mode, the FPGA passively receives CCLK as an input from an external agent

(e.g., a microprocessor, CPLD, or second FPGA in master mode) that is

controlling the configuration process. In both modes, the FPGA is configured by

loading one bit per CCLK cycle. The MSB of each configuration data byte is

always written to the DIN pin first.

Slave Parallel Mode:

The Slave Parallel mode is the fastest configuration option. Byte-wide data is

written into the FPGA. A BUSY flag is provided for controlling the flow of data at

a clock frequency FCCNH above 50 MHz.

Slave Serial Mode:

In Slave Serial mode, the FPGA's CCLK pin is driven by an external source, allowing FPGAs to be configured from other logic devices such as microprocessors, or in a daisy-chain configuration.

Master Serial Mode:

In Master Serial mode, the CCLK output of the FPGA drives a Xilinx PROM

which feeds a serial stream of configuration data to the FPGA’s DIN input.

Operating Modes:

Block RAM memory supports two operating modes.

• Read Through

• Write Back


Figure: Configuration Flow Diagram


Read Through (One Clock Edge):

The read address is registered on the read port clock edge and data

appears on the output after the RAM access time. Some memories may place

the latch/register at the outputs, depending on the desire to have a faster clock-to-out versus setup time. This is generally considered to be an inferior solution

since it changes the read operation to an asynchronous function with the

possibility of missing an address/control line transition during the generation of

the read pulse clock.

Write Back (One Clock Edge):

The write address is registered on the write port clock edge and the data input is

written to the memory and mirrored on the write port input.

Features

7.5 ns pin-to-pin logic delays on all pins

fCNT to 125 MHz

72 macrocells with 1,600 usable gates

Up to 72 user I/O pins

5 V in-system programmable (ISP)

Endurance of 10,000 program/erase cycles

Program/erase over full commercial voltage and temperature range

Enhanced pin-locking architecture

90 product terms drive any or all of 18 macrocells within Function Block

Flexible 36V18 Function Block

Global and product term clocks, output enables, set and reset signals

Extensive IEEE Std 1149.1 boundary-scan (JTAG) support

Programmable power reduction mode in each macrocell

Slew rate control on individual outputs


User programmable ground pin capability

Extended pattern security features for design protection

High-drive 24 mA outputs

3.3 V or 5 V I/O capability

Advanced CMOS 5V FastFLASH technology

Supports parallel programming of more than one XC9500 concurrently

Available in 44-pin PLCC, 84-pin PLCC, 100-pin PQFP and 100-pin TQFP

packages

Xilinx ISE

The Xilinx ISE tools allow you to use schematics, hardware description languages (HDLs), and specially designed modules in a number of ways.

Schematics are drawn by using symbols for components and lines for wires. Xilinx Tools is a suite of software tools used for the design of digital circuits implemented using a Xilinx Field Programmable Gate Array (FPGA) or Complex Programmable Logic Device (CPLD).

The design procedure consists of (a) design entry, (b) synthesis and implementation of the design, (c) functional simulation, and (d) testing and verification. Digital designs can be entered in various ways using the above CAD tools: using a schematic entry tool, using a hardware description language (HDL) such as Verilog or VHDL, or a combination of both. In this project we only use the design flow that involves Verilog HDL. The CAD tools enable you to design combinational and sequential circuits starting with Verilog HDL design specifications.

The steps of this design procedure are listed below:

1. Create Verilog design input file(s) using template driven editor.

2. Compile and implement the Verilog design file(s).


3. Create the test-vectors and simulate the design (functional simulation) without

using a PLD (FPGA or CPLD).

4. Assign input/output pins to implement the design on a target device.

5. Download bitstream to an FPGA or CPLD device.

6. Test the design on the FPGA/CPLD device.

A Verilog input file in the Xilinx software environment consists of the following segments (a minimal example is given after the list):

• Header: module name, list of input and output ports.

• Declarations: input and output ports, registers and wires.

• Logic Descriptions: equations, state machines and logic functions.

• End: endmodule
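For illustration, a minimal input file containing all four segments might look as follows (module and signal names are hypothetical):

    // Header: module name and port list
    module and_reg (
        input  clk,      // Declarations: input ports
        input  a, b,
        output reg q     // Declarations: output port declared as a register
    );
        wire w;          // Declarations: internal wire

        // Logic Descriptions: an equation and a clocked logic function
        assign w = a & b;

        always @(posedge clk)
            q <= w;

    endmodule            // End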

ModelSim 6.2C

ModelSim is a UNIX, Linux, and Windows-based simulation and debug environment, combining high performance with a powerful and intuitive GUI.

FEATURES:

• Unified Coverage Database (UCDB), a central point for managing, merging, viewing, analyzing, and reporting all coverage information.

• Source annotation: the source window can be enabled to display the values of objects during simulation or when reviewing simulation results logged to WLF.

• Finite state machine coverage for both VHDL and Verilog is now supported.

• Code coverage results can now be reviewed post-simulation using the graphical user environment.

• Simulation messages are now logged in the WLF file, and new capabilities for managing message viewing are provided in the message viewer; messages are organized by their severity and type.

• SystemC is now supported for x86 Linux 64-bit platforms.

• Transaction recording and viewing is supported for SystemC using the SCV transaction recording facilities.

• The GUI debug and analysis environment continues to evolve to provide greater user customization and better performance.

• SystemVerilog design support continues to expand, with many new constructs added in this release.

BENEFITS:

• The best mixed-language environment and performance in the industry.

• The intuitive GUI makes it easy to view and access the many powerful capabilities of ModelSim. There is no learning curve, as the debug environment is common across all languages.

• All ModelSim products are 100% standards based, so your investment is protected, risk is lowered, reuse is enabled, and productivity is enhanced.

• Award-winning technical support.


Figure 8.a ModelSim

High-Performance Simulation Environment:

ModelSim combines high performance and high capacity with the most

advanced code coverage and debugging capabilities in the industry. ModelSim

offers unmatched flexibility by supporting 32 and 64 bit UNIX and Linux and 32

bit Windows®-based platforms. Model Technology™ was the first to put the

award-winning single kernel simulator (SKS) technology in the hands of

engineers, enabling transparent mixing of VHDL, Verilog, and SystemC in one

design, using a common, intuitive graphical interface for development and debug

at any level, regardless of the language.

The combination of industry-leading performance and capacity with the

best integrated debug and analysis environment makes ModelSim the simulator of

choice for both ASIC and FPGA design. The best standards and platform support

in the industry make it easy to adopt in the majority of process and tool flows.


Verilog for Design:

ModelSim fully supports the Verilog design constructs, providing new

capabilities that aid in modeling at higher levels of abstraction. Some of the most

significant design productivity features include:

• Interfaces

• Enumerated types, structures, unions, and user-defined types

• Assignment and increment/decrement operators

• Enhanced procedural blocks

• Jump statements

• Dynamic arrays

• Associative arrays

• Default task and function arguments and named argument association

• Packages and global declarations

ModelSim's native support of Verilog also includes a fully integrated debug environment.


CONCLUSION

In this project, self-checking architectures for an RS encoder and decoder are described. The parity properties of the binary representation of the elements of GF(2^m) have been studied, and a method for a self-checking implementation of the arithmetic structures used in the RS encoder has been proposed. The problems related to the presence of undetected faults in parity-check-based schemes have been faced by imposing some constraints on the logical netlist implementation of the constant multiplier.

Evaluations of area and delay overhead for the self-checking RS encoder have been provided. For the self-checking RS decoder, two main properties of the fault-free decoder have been identified and used to detect faults inside the decoder. The proposed method can be used for a wide range of algorithms implementing the decoder function. Some concurrent error detection schemes have been explained in this work, and some evaluations of area overhead have been provided. Our method is non-intrusive, i.e., the decoder architecture is not modified. This fact enables reuse in the design of very complex digital systems.


Simulation results

The simulation output waveform of the top module of the Reed-Solomon design


The simulation output waveform of the top module of the Reed-Solomon design


Synthesis results

RTL schematic of the top module of the self-checking Reed-Solomon design


RTL schematic of a sub-module of the self-checking Reed-Solomon design


RTL schematic of an inner sub-block module of the self-checking Reed-Solomon design


REFERENCES

1. R. E. Blahut, Theory and Practice of Error Control Codes. Reading, MA:

Addison-Wesley Publishing Company, 1983.

2. A. R. Masoleh and M. A. Hasan, "Low complexity bit parallel architectures for polynomial basis multiplication over GF(2^m)," IEEE Trans. Comput., vol. 53, no. 8, pp. 945–959, Aug. 2004.

3. J. Gambles, L. Miles, J. Has, W. Smith, and S. Whitaker, “An ultra-low power,

radiation-tolerant Reed-Solomon encoder for space applications," in Proc. IEEE

Custom Integr. Circuits Conf., 2003, pp. 631–634.

4. A. R. Masoleh and M. A. Hasan, “Error Detection in Polynomial Basis

Multipliers over Binary Extension Fields,” in Lecture Notes in Computer Science.

New York: Springer-Verlag, 2003, vol. 2523, pp.515–528.

5. S. B. Sarmadi and M. A. Hasan, “Concurrent error detection of polynomial

basis multiplication over extension fields using a multiple-bit parity scheme,” in

Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 2005, pp. 102–110.

6. G. C. Cardarilli, S. Pontarelli, M. Re, and A. Salsano, "Design of a self-checking Reed-Solomon encoder," in Proc. 11th IEEE Int. On-Line Test. Symp. (IOLTS'05), 2005, pp. 201–202.

7. G. C. Cardarilli, S. Pontarelli, M. Re, and A. Salsano, “A self checking Reed

Solomon encoder: Design and analysis,” in Proc. IEEE Int. Symp. Defect Fault

Tolerance VLSI Syst., 2005, pp. 111–119.


8. M. Gossel, S. Fenn, and D. Taylor, “On-line error detection for finite field

multipliers,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 1997,

pp. 307–311.

9. Y.-C. Chuang and C.-W. Wu, “On-line error detection schemes for a systolic

finite-field inverter,” in Proc. 7th Asian Test Symp., 1998, pp. 301–305.

10. M. Boyarinov, “Self-checking algorithm of solving the key equation,” in Proc.

IEEE Int. Symp. Inf. Theory, 1998, p. 292.

11. C. Bolchini, F. Salice, and D. Sciuto, “A novel methodology for designing TSC

networks based on the parity bit code,” in Proc. Eur. Design Test Conf., 1997, pp.

440–444.

12. Altera Corp., San Jose, CA, “Altera Reed-Solomon compiler user guide

3.3.3,” 2006.

13. Xilinx, San Jose, CA, “Xilinx logicore Reed-Solomon decoder v5.1,” 2006.

14. D. Nikolos, "Design techniques for testable embedded error checkers," Computer, vol. 23, no. 7, pp. 84–88, Jul. 1990.

15. P. K. Lala, Fault Tolerant and Fault Testable Hardware Design. Englewood

Cliffs, NJ: Prentice-Hall, 1985.
