Finite-Precision Analysis of Demappers and Decoders for LDPC-Coded M-QAM Systems

IEEE TRANSACTIONS ON BROADCASTING, VOL. 55, NO. 2, JUNE 2009

Marco Baldi, Franco Chiaraluce, Member, IEEE, and Giovanni Cancellieri

Abstract—LDPC codes are state-of-the-art error correcting codes, included in several standards for broadcast transmissions. Iterative soft-decision decoding algorithms for LDPC codes reach excellent error correction capability; their performance, however, is strongly affected by finite-precision issues in the representation of inner variables. Great attention has been paid, in recent literature, to the topic of quantization for LDPC decoders, but mostly focusing on binary modulations and analysing finite-precision effects in a disaggregated manner, i.e., considering each block of the receiver separately. Modern telecommunication standards, instead, often adopt high order modulation schemes, e.g. M-QAM, with the aim of achieving large spectral efficiency. This raises additional quantization problems that have received little attention in previous literature. This paper discusses the choice of suitable quantization characteristics for both the decoder messages and the received samples in LDPC-coded systems using M-QAM schemes. The analysis also involves the demapper block, which provides initial likelihood values for the decoder, by relating its quantization strategy to that of the decoder. A new demapper version, based on approximate expressions, is also presented; it introduces a slight deviation from the ideal case but yields a low complexity hardware implementation.

Index Terms—Demodulation, digital communication, error correction codes, fixed point arithmetic, quantization.

I. INTRODUCTION

The current scenario of error correcting codes is dominated by schemes using Soft-Input Soft-Output (SISO) decoding. Among them, an important role is played by Low-Density Parity-Check (LDPC) codes, which make it possible to approach the theoretical Shannon limit [1], [2] while ensuring reduced complexity.

For this reason, these codes have been included in some recent telecommunication standards [3]–[5]. The second generation of Digital Video Broadcasting (DVB) standards, in particular, considers LDPC codes in place of the more conventional concatenated schemes formed by Reed-Solomon and convolutional codes that were adopted in first generation DVB standards. Similarly, the second version of the satellite DVB (DVB-S2) standard includes LDPC codes in conjunction with BCH codes [3]. LDPC codes will probably also be adopted in the upcoming second generation of the terrestrial DVB (DVB-T2) standard, which will soon replace its present version [6]. Possible technologies to be included in this new standard are currently under evaluation [7].

Manuscript received May 15, 2008; revised February 09, 2009. First published April 28, 2009; current version published May 22, 2009.

The authors are with the Department of Biomedical Engineering, Electronics and Telecommunications, Polytechnic University of Marche, 60131 Ancona, Italy (e-mail: [email protected]; [email protected], [email protected]).

Digital Object Identifier 10.1109/TBC.2009.2016498

Fig. 1. Block diagram of the LDPC-coded M-QAM system.

Based on the above considerations, a relevant issue concerns the comparison between the error rate performance achievable by using LDPC codes and that ensured by other schemes employing SISO decoding. An example of such a comparison will be given in Section II for the important case of the Digital Video Broadcasting - Return Channel Satellite (DVB-RCS) standard [8].

Moreover, modern broadcast communications are characterized by increasing throughput requirements; this is true, for example, for the DVB-T2 standard, which must support High Definition Television (HDTV) services. So, there is a need for large spectral efficiencies, which is usually satisfied by employing high order modulation schemes [9], [10]. The DVB-T standard adopts QPSK, 16-QAM and 64-QAM schemes in conjunction with OFDM, and the same will probably hold for DVB-T2.

Another issue in broadcast transmissions concerns the complexity of the decoder implementation, which can be reduced to some extent by introducing suitable approximations [11]. In particular, in SISO decoders, complexity is strongly affected by the finite-precision representation of the inner variables.

The aim of this paper is to study finite-precision effects on an LDPC-coded M-QAM system of the type depicted in Fig. 1; it employs binary LDPC codes in conjunction with high order modulation schemes [12]. The meaning of the various blocks and quantities involved in Fig. 1 will be explained in detail in Sections IV and V.

This topic has already been discussed in previous literature, but most previous works were limited to binary modulation. Higher order modulation schemes, like M-QAM, whose adoption is justified by the need to increase the spectral efficiency, pose a number of additional problems. In particular, they require modelling the effect of the demapper block (i.e., the symbol-to-metric calculator) and refining the optimization procedure in order to save quantization bits without incurring significant performance losses. This can suggest, in particular, the adoption of suitable non-uniform quantization schemes that are able to cope efficiently with the clipping effect. If not controlled, this effect can cause the appearance of remarkable and unexpected error floors.

0018-9316/$25.00 © 2009 IEEE

After having derived, through examples, a quantitative evaluation of the quantization and clipping effects for the proposed scenario, we discuss a non-uniform quantization law that represents a good trade-off for both waterfall and error floor performance. Differently from previous proposals, this non-uniform quantization scheme is specifically targeted at overcoming clipping issues arising in M-QAM systems, while keeping the number of quantization bits reasonably small. This solution is obtained through a simple compander-like approach, and can be implemented by exploiting uniform quantization hardware.

We also discuss the relationship that should exist between the number of bits used in the quantization of the received signals and that used for the extrinsic messages, in such a way as to ensure comparable quantization errors. This analysis is based on theoretical arguments on the demapper functions. Finally, we propose a low complexity receiver scheme that, requiring Look-Up Tables of reduced size, can be convenient in a hardware implementation.

The organization of the paper is as follows. In Section II we present a comparison between the performance of LDPC codes and standard turbo codes for the DVB-RCS application. In Section III we provide a short overview of previous works on the quantization problem, which is the main issue of the paper. In Section IV we describe the system model. In Section V we discuss the choice of the quantization law for the decoder messages. In Section VI we find the relationship that should exist between the input signals quantization and the decoder messages quantization in order to have comparable accuracies. In Section VII we develop an approximate analysis of the receiver that makes it possible to express the number of quantization bits directly and, most of all, can be used to conceive a more efficient implementation. Finally, Section VIII concludes the paper.

II. EXAMPLES OF TURBO-LIKE CODES IN DVB

The introduction of turbo codes has substantially changed the scenario of forward error correction, and started a revision process of traditional coding schemes in practical applications.

Turbo codes are able to achieve very good correction performance, and to approach the Shannon capacity limit. This is due to the adoption of soft-decision decoding algorithms implementing the so-called “turbo principle”, which consists in an iterated exchange and update of inner messages estimating the reliability of each received bit.

A very similar decoding approach characterizes LDPC codes, first introduced by Gallager [13] and then, recently, rediscovered by the scientific community. It can be shown that turbo decoding is an instance of Pearl’s “Belief Propagation” algorithm, already implemented in LDPC decoders [14]; so, both turbo and LDPC codes can be included in a wider class of “turbo-like” codes.

Due to their excellent performance, turbo-like codes are being adopted in an increasing number of telecommunication standards and applications, with a special focus on Digital Video Broadcasting.

The DVB-S2 standard, in particular, makes use of semi-random LDPC codes, characterized by a parity-check matrix obtained through the concatenation of a non-structured block and a staircase lower triangular block (which facilitates systematic encoding). LDPC codes with structured parity-check matrices can reduce both the encoding and decoding complexity; among them, an important class is represented by Quasi-Cyclic LDPC (QC-LDPC) codes, which can be encoded through very simple circuits based on barrel shift registers.

The DVB-RCS standard, which deals with the implementation of interactive channels for satellite applications, includes instead a turbo code for error correction. Its turbo encoder uses a double binary circular recursive systematic convolutional code, an optimized two-level interleaver and a puncturing map to deal with variable rates [15]. The information block lengths are also variable, ranging from 12 bytes to 216 bytes.

The performance of double binary turbo codes has been compared with that of structured LDPC codes in [16]. The authors conclude that, for high code length and rate, LDPC codes often outperform turbo codes (they observed that the two schemes achieve comparable performance for rate 3/4 and block length 1152, while, for higher rate and block length, LDPC codes can be better than turbo codes).

So, it seems interesting to investigate the applicability of structured LDPC codes in those applications where rather long and high-rate codes are needed. As a first example, we have considered the DVB-RCS standard turbo code mentioned above, with MPEG2 information block size (that is, 188 bytes [17]) and code rate 4/5.

For comparison, we have simulated two LDPC codes designed with different approaches. Both of them have dimension $k = 1504$ and length $n = 1880$, which coincide with those of the turbo code. The first LDPC code has been designed by means of the Progressive Edge Growth (PEG) algorithm [18], which aims at maximizing the girth of the Tanner graph. The associated parity-check matrix is non-structured; it has column weight 3 and row weight ranging between 14 and 16.

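The second LDPC code is structured and consists of a QC-LDPC code designed through the “Random Difference Families” (RDF) approach [19]. It is characterized by a parity-check matrix formed by a row of 5 binary circulant blocks, i.e., $\mathbf{H} = [\mathbf{H}_0 | \mathbf{H}_1 | \mathbf{H}_2 | \mathbf{H}_3 | \mathbf{H}_4]$. Each block has size $376 \times 376$, and row/column weight 5, 4, 5, 4 and 5, respectively. This implies that matrix $\mathbf{H}$ has average column weight 4.6 and row weight 23.

By assuming, without loss of generality, that the block $\mathbf{H}_4$ is non-singular, a very simple systematic form for the generator matrix $\mathbf{G}$ of the considered QC-LDPC code is as follows:

$$\mathbf{G} = \left[ \mathbf{I} \;\middle|\; \begin{matrix} (\mathbf{H}_4^{-1}\mathbf{H}_0)^T \\ (\mathbf{H}_4^{-1}\mathbf{H}_1)^T \\ (\mathbf{H}_4^{-1}\mathbf{H}_2)^T \\ (\mathbf{H}_4^{-1}\mathbf{H}_3)^T \end{matrix} \right] \tag{1}$$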

where superscripts $-1$ and $T$ denote inversion and transposition, respectively. Matrix $\mathbf{G}$ is formed by a $1504 \times 1504$ identity matrix followed by a column of 4 binary circulant blocks (they are circulant because they are obtained as products of circulant blocks). So, the encoder implementation is very simple, and basically consists in translating the last column of blocks into circuits based on barrel shift registers.
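The systematic construction described above can be sketched numerically. The following is only an illustration under stated assumptions: the block size (7), the number of blocks (3) and the circulant supports are toy values chosen here, not the RDF-designed blocks of the paper, and the helper names are hypothetical. The sketch inverts the last circulant block over GF(2), builds $\mathbf{G} = [\mathbf{I} | \mathbf{P}]$ and checks that every row of $\mathbf{G}$ satisfies the parity checks.

```python
import numpy as np

def circulant(support, p):
    """p x p binary circulant whose first row has ones at the given positions."""
    row = np.zeros(p, dtype=np.uint8)
    row[list(support)] = 1
    return np.array([np.roll(row, i) for i in range(p)], dtype=np.uint8)

def gf2_inv(a):
    """Inverse of a square binary matrix over GF(2) (Gauss-Jordan elimination)."""
    p = a.shape[0]
    aug = np.concatenate([a.copy() % 2, np.eye(p, dtype=np.uint8)], axis=1)
    for col in range(p):
        pivots = np.nonzero(aug[col:, col])[0]
        if pivots.size == 0:
            raise ValueError("matrix is singular over GF(2)")
        piv = col + pivots[0]
        aug[[col, piv]] = aug[[piv, col]]          # bring the pivot row up
        for r in range(p):
            if r != col and aug[r, col]:
                aug[r] ^= aug[col]                 # XOR is addition in GF(2)
    return aug[:, p:]

def systematic_generator(blocks):
    """G = [I | P], with P stacking (H_last^-1 H_i)^T for all blocks but the last."""
    p = blocks[0].shape[0]
    inv_last = gf2_inv(blocks[-1]).astype(int)
    parity = [((inv_last @ b.astype(int)) % 2).T for b in blocks[:-1]]
    p_col = np.concatenate(parity, axis=0).astype(np.uint8)
    k = p * (len(blocks) - 1)
    return np.concatenate([np.eye(k, dtype=np.uint8), p_col], axis=1)

# Toy example (assumed values): three 7 x 7 circulants of weight 3.
P_SIZE = 7
blocks = [circulant(s, P_SIZE) for s in ([0, 1, 3], [0, 2, 3], [0, 1, 2])]
H = np.concatenate(blocks, axis=1)
G = systematic_generator(blocks)
syndrome = (G.astype(int) @ H.T.astype(int)) % 2   # all-zero if G generates the code
```

Since products and inverses of circulants are circulant, each block of the parity part is fully described by its first row, which is what makes a shift-register encoder possible.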

With the purpose of extending the comparison to higher rate codes, we have also considered another QC-LDPC code, still designed through the RDF approach, with the same dimension $k = 1504$ as the previous one, but rate $R = 8/9$ (i.e., length $n = 1692$). Its parity-check matrix is a row of nine circulant blocks with size $188 \times 188$, each having row/column weight equal to 4 or 5. The row weight of the whole matrix is […], while its average column weight is […].

Looking at the DVB-RCS turbo code, it should be noted that the code rate 8/9 is higher than those considered in the standard. However, the optimized two-level interleaver can also be used for this rate, while the puncturing rule can be easily extended, thus giving us the opportunity to make a comparison at a higher rate. Strictly speaking, however, this code cannot be considered a standard code; for this reason, we call it the “DVB-RCS-like” turbo code with rate 8/9.

Fig. 2 reports the simulated performance of the considered codes over the AWGN channel, using BPSK modulation and in the absence of quantization. In simulating turbo codes, we have used 8 iterations for rate 4/5, and 15 iterations for rate 8/9; these choices are adequate for achieving satisfactory convergence of the decoding algorithm. From the figure, we observe that the performance of turbo codes and LDPC codes is similar at both rates. The turbo codes exhibit a slightly earlier waterfall, which yields an initial coding gain over LDPC codes at small signal-to-noise ratios and high error rates. For smaller error rates, however, the curves of the LDPC codes show a more favorable slope, and intersect those of the turbo codes. This means that, consistent with the conclusions in [16], the LDPC codes can provide a valid alternative to the turbo codes for high signal-to-noise ratios. If we focus on the FER curve for codes with rate 8/9, in particular, an error floor effect appears in the turbo code performance, while the LDPC code has no evident floor, at least in the explored region of FER values.

So, well-designed high-rate LDPC codes are less exposed than turbo codes to floors for error rates of practical interest. The presence of an error floor in the performance of LDPC codes is even rarer when adopting high order modulations, which are widely used in modern telecommunication standards. In such a case, however, “artificial” floors may arise when implementing quantized versions of the LDPC decoder, due to approximation and clipping of intrinsic messages. This motivates our work and, in the following sections, we will study quantization effects for the considered high-rate QC-LDPC code, in conjunction with high order modulation schemes.

III. OVERVIEW OF PREVIOUS WORK ON QUANTIZATION

The existence of finite-precision issues in LDPC decoders is well established: in [20] the “Parity Likelihood Ratio” approach, rather than the more conventional “Log-Likelihood Ratio” (LLR) approach, is proposed to overcome some quantization problems that appear when the Sum-Product decoding algorithm is applied. In [2] it is clearly stated that adaptive quantization schemes (unfeasible in many practical applications) can better exploit the channel capacity. Besides quantization of decoder messages, in [21] quantization of the received samples is considered, concluding that a 4-bit representation is a good trade-off between performance and complexity. This conclusion, however, is established only for binary (BPSK) modulation, neglecting the impact of the demapper block. On the other hand, in that paper the authors study the decoder structure and propose non-uniform quantization to implement hyperbolic functions.

Fig. 2. Comparison of turbo and LDPC codes for DVB-RCS: (a) bit error rate (BER) and (b) frame error rate (FER) versus the signal-to-noise ratio per bit $E_b/N_0$.

A similar analysis is developed in [22], where low complexity versions of the logarithmic Sum-Product Algorithm (LLR-SPA) are presented. The authors show that core hyperbolic functions of the LLR-SPA decoder can be effectively implemented through a uniform quantization or a piece-wise linear approximation, in the latter case with negligible performance loss. The relevant issue of an optimal trade-off between resolution and dynamic range for decoding non-binary LDPC codes when used with BPSK modulation is instead addressed in [23].

Many authors suggest the adoption of 6-bit quantization for the decoder messages as the best trade-off between performance and complexity in coded binary modulation [21], [24]–[26]. The same choice can be adopted for low-complexity versions of the Sum-Product Algorithm, like the Min-Sum variant [27]. But, in this case, it is also proved that fewer quantization bits in the implementation of a Min-Sum LDPC decoder can yield a slight performance degradation [28].

When considering higher order modulation schemes, more bits are necessary to represent both the received samples and the decoder messages without incurring significant performance loss. Only a few papers are devoted to the study of this more involved situation. An example is in [29], where the authors consider only uniform quantization schemes. Moreover, quantization is applied to the decoder messages (with the peculiarities of M-ary systems), while that of the received samples is neglected.

Even the several proposals of non-uniform quantization schemes are generally addressed to binary systems [30]; on the other hand, a valuable alternative to non-uniform quantization consists in the Soft-Bit decoding approach presented in [31].

The references above highlight the need for a deeper study in the case of M-ary modulation schemes. An improved analysis should take into account the joint effects of the decoder and the demapper blocks. This is one of the targets of the present paper, and our proposed solutions will be discussed in the next sections.

IV. SYSTEM MODEL

The analysis we have developed is quite general and can be applied, with some distinctions, to any value of $M$. However, for the sake of clarity, in the following we will mainly refer to the specific case of a 16-QAM constellation. For any $M$ equal to an even power of 2, a Gray labeling can be adopted to match every sequence of encoded bits to each symbol. An example of Gray labeling for $M = 16$ is shown in Fig. 3; we will refer to it in the subsequent analysis.

Attention will be focused on the high-rate QC-LDPC code described in Section II. It has length $n = 1692$ and dimension $k = 1504$, coincident with the size of an MPEG2 Transport Stream (TS) packet [17]. The code rate is $R = 8/9$; so, by assuming $M = 16$, the spectral efficiency is about 3.6 bit/s/Hz, which is a large enough value for most broadcast applications.
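As a quick check of the figure quoted above, and assuming that the spectral efficiency is taken as the product of the code rate and the number of bits per symbol (the symbol $\eta$ and this ideal-signalling convention are assumptions made here, not stated in the text), one obtains:

```latex
R = \frac{k}{n} = \frac{1504}{1692} = \frac{8}{9},
\qquad
\eta = R \log_2 M = \frac{8}{9} \cdot 4 = \frac{32}{9} \approx 3.56\ \text{bit/s/Hz}.
```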

Let us look at Fig. 1. The LDPC encoder maps each $k$-bit word produced by the source into an $n$-bit LDPC codeword. Each codeword is then passed to the mapper and modulator block, which transforms groups of $\log_2 M$ code bits into a symbol of the bi-dimensional M-QAM constellation. The modulated signal is then transmitted over an Additive White Gaussian Noise (AWGN) channel. At the receiver side, the demapper block is a maximum a posteriori (MAP) symbol-to-bit metric calculator, which produces an initial likelihood value for each received bit (such values are denoted as intrinsic or channel messages). These messages serve as input for the Sum-Product Algorithm (SPA), which starts iterating and, at each iteration, produces updated versions of the extrinsic and the a posteriori messages [32]. The former are used as input for the subsequent iteration (if needed), while the latter represent the decoder output, and serve to obtain an estimated codeword that is subject to the hard decision and the parity-check test. The efficiency of this scheme, which is very simple to implement, has been tested even in comparison with more involved solutions, like those based on multilevel coding formats, showing excellent error rate performance in all cases [12]; therefore, it is often preferred in practical applications.

Fig. 3. Gray labeling for 16-QAM.

Simulations have been carried out over the AWGN channel. As the QAM constellation is not geometrically uniform, the simulated information patterns cannot be fixed (the all-zero sequence would be the canonical choice) but are generated by a random, uncorrelated source.

V. QUANTIZATION OF DECODER MESSAGES

A. Outline of the Decoding Algorithm

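Let $x$ and $y$ be the magnitude of the in-phase and quadrature components, respectively, for each received pass-band signal. The latter, denoted by $\mathbf{r} = (x, y)$, is the sum of a symbol $\mathbf{s} = (s_x, s_y)$ of the constellation and a sample of a white Gaussian noise $\mathbf{n} = (n_x, n_y)$, where $n_x$ and $n_y$ are independent Gaussian random variables with zero mean and variance $\sigma^2$. Moreover, let us denote by $b_i$, $i = 1, \ldots, \log_2 M$, the $i$-th code bit associated with the symbol, by $S_i^{(0)}$ the subset of signals whose label has the value $b_i = 0$, and by $S_i^{(1)}$ the subset of signals whose label has the value $b_i = 1$.

The LLR of the coded bit $b_i$, given the received signal $\mathbf{r}$, can be expressed as:

$$L(b_i | \mathbf{r}) = \ln \frac{\sum_{\mathbf{s} \in S_i^{(0)}} \exp\left( -\frac{\| \mathbf{r} - \mathbf{s} \|^2}{2\sigma^2} \right)}{\sum_{\mathbf{s} \in S_i^{(1)}} \exp\left( -\frac{\| \mathbf{r} - \mathbf{s} \|^2}{2\sigma^2} \right)} \tag{2}$$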
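A minimal sketch of a MAP bit-metric calculation of this kind is given below. It assumes equiprobable symbols, an LLR defined as the logarithm of the ratio between the likelihoods of bit value 0 and bit value 1, and a hypothetical Gray labeling with two bits per axis and amplitudes $\pm 1, \pm 3$; the labeling of Fig. 3 may differ, and all names are illustrative.

```python
import numpy as np

# Hypothetical Gray labeling: amplitudes per axis and the two bits they carry.
AXIS_LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])
AXIS_BITS = [(0, 0), (0, 1), (1, 1), (1, 0)]

def build_constellation():
    """Return the 16 symbol coordinates and their 4-bit labels."""
    points, labels = [], []
    for sx, (b1, b2) in zip(AXIS_LEVELS, AXIS_BITS):
        for sy, (b3, b4) in zip(AXIS_LEVELS, AXIS_BITS):
            points.append((sx, sy))
            labels.append((b1, b2, b3, b4))
    return np.array(points), np.array(labels)

def exact_llr(x, y, sigma, points, labels):
    """LLR of each label bit given the received sample (x, y) on an AWGN channel."""
    dist2 = (points[:, 0] - x) ** 2 + (points[:, 1] - y) ** 2
    metric = -dist2 / (2.0 * sigma**2)
    llr = np.empty(labels.shape[1])
    for i in range(labels.shape[1]):
        # log-sum-exp over each subset keeps the computation numerically stable
        log_num = np.logaddexp.reduce(metric[labels[:, i] == 0])
        log_den = np.logaddexp.reduce(metric[labels[:, i] == 1])
        llr[i] = log_num - log_den
    return llr

points, labels = build_constellation()
llr_example = exact_llr(0.3, -1.7, 0.8, points, labels)
```

With this sign convention a positive LLR favours bit value 0; the magnitude grows as the noise variance decreases, which is the origin of the large dynamic range discussed later in the paper.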

The values (2), calculated for all bits of a codeword, are the intrinsic messages given as input to the belief propagation algorithm. They serve to initialize extrinsic messages, which are then updated through the iterated exchange of messages between variable and check nodes in the Tanner graph representing the code. At the end of each iteration, a posteriori messages are calculated and, based on their sign, an estimate of the transmitted codeword is derived. The procedure stops when all the parity-check equations are satisfied or when the maximum number of iterations, fixed a priori, is reached.

The detailed description of the SPA for LDPC decoding can be found in several books and papers (see [33] and [34], for example) and is omitted here for the sake of brevity.

B. Uniform Midtread Quantization of the Decoder Messages

As stated in the Introduction, in a practical implementation all the decoder messages are quantized, resulting in a performance degradation compared to the ideal behavior, which is obtained by assuming real (double precision floating point) variables for the involved quantities. In principle, we consider uniform midtread quantization, which converts the real value into a word of $q$ bits. The corresponding law is reported in (3), where $T$ is the saturation threshold, $\Delta$ is the quantization step (dependent on the number of bits $q$) and $v$ and $Q_u(v)$ are the exact and the uniform quantized values, respectively:

$$Q_u(v) = \begin{cases} \mathrm{sgn}(v)\, \Delta \left\lfloor \dfrac{|v|}{\Delta} + \dfrac{1}{2} \right\rfloor, & |v| < T \\ \mathrm{sgn}(v)\, T, & |v| \ge T \end{cases} \tag{3}$$

In this expression, “$\lfloor \cdot \rfloor$” represents the floor function, which gives the largest integer smaller than, or equal to, its argument. When considering uniform midtread quantization, two equivalent approaches are possible: direct fixed point representation and integer rescaling. In direct fixed point representation, $f$ bits of each word are reserved for the fractional part (this is often denoted as Q$m$.$f$) and $v$ is quantized by converting it into its nearest Q$m$.$f$ value. In this case, the quantization step is $\Delta = 2^{-f}$ and the saturation threshold is $T$ = […]. In the integer rescaling approach, instead, the saturation threshold $T$ is fixed in advance and the dynamic range $[-T, T]$ is divided into […] uniform intervals, each with amplitude $\Delta$ = […]. In this case, the value of $Q_u(v)$ can be denoted through the $q$-bit interval index it is associated with, or through its fixed point value, coincident with the product of the interval index by the amplitude $\Delta$ (whose fixed point representation must be suitably chosen). The set of all the possible values of $Q_u(v)$ can be stored in a $q$-bit indexed Look-Up Table (LUT) or calculated, any time, through a suitable multiplier circuit.
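A uniform midtread quantizer with saturation can be sketched as follows. This is an illustration under assumptions: the step and the threshold are passed as independent arguments (the relation between them and the number of bits is not reproduced here), rounding is implemented with the floor function as described in the text, and the function names are hypothetical.

```python
import math

def midtread_index(v, step, threshold):
    """Signed integer index of the quantization interval containing v."""
    clipped = min(abs(v), threshold)               # saturation at the threshold
    index = math.floor(clipped / step + 0.5)       # nearest multiple of the step
    index = min(index, math.floor(threshold / step))
    return -index if v < 0 else index

def midtread_quantize(v, step, threshold):
    """Quantized value reconstructed from the interval index (integer rescaling)."""
    return midtread_index(v, step, threshold) * step

# Example with assumed parameters: step 0.5, threshold 4.
examples = [midtread_quantize(v, 0.5, 4.0) for v in (0.26, -0.24, 7.3, -9.0)]
```

Keeping the integer index and the reconstruction step separate mirrors the integer rescaling approach: a decoder built from additions, comparisons and sign operations can work on the indices alone.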

The integer rescaling approach requires an extra step for reconstructing the quantized values and, depending on the threshold choice, can yield a non-optimal use of the fixed point representation. However, these drawbacks are overcome when the decoder involves only linear operations. For example, the Min-Sum approximate version of the LLR-SPA decoder requires additions for variable nodes update, minimum search operations for check nodes update, and sign operations for estimating each bit when the decoder stops iterating. In this case, all the quantities involved in the decoding process can be scaled by $\Delta$, and the whole decoder can work on integer values, without the need for a fixed point representation. Furthermore, the intrinsic messages can be normalized into a fixed range; for example, if the demapper output is divided by its max amplitude (which, in a practical implementation, cannot diverge), the input LLRs are normalized into the range $[-1, 1]$. This way, the dynamic range of the decoder messages and their quantization threshold become independent of the signal-to-noise ratio. In particular, the choice of unitary threshold $T = 1$ implies clipping of the updated messages but not of the initial messages, and this occurs independently of the signal-to-noise ratio. For this reason, we adopt the integer rescaling approach.

Fig. 4. Max intrinsic message amplitude versus $E_b/N_0$ for different bit positions.

On the other hand, when using the Min-Sum approximate version of the decoder, the amount of memory required to store the extrinsic information can be reduced through other strategies ([35], [36]), due to the fact that, at each iteration, extrinsic messages associated with a check node can assume only two fixed values. This is not the case for the SPA decoder, in which extrinsic information can assume arbitrary values.
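The observation about the Min-Sum check node can be illustrated with a short sketch. It is a generic textbook-style check node update, not the memory-saving schemes of [35], [36]; the function name and the example messages are hypothetical. Each outgoing message takes the product of the signs and the minimum of the magnitudes of the other incoming messages, so only the smallest and second smallest input magnitudes can appear at the output.

```python
def min_sum_check_update(incoming):
    """Min-Sum check node: one outgoing message per edge, from the other edges."""
    outgoing = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        negatives = sum(1 for m in others if m < 0)
        sign = -1 if negatives % 2 else 1
        outgoing.append(sign * min(abs(m) for m in others))
    return outgoing

# Integer-valued messages, as obtained after integer rescaling (assumed values).
messages_in = [3, -5, 2, 7]
messages_out = min_sum_check_update(messages_in)
distinct_magnitudes = {abs(m) for m in messages_out}
```

Because only sign products and minima are involved, the update is unaffected by a common scaling of the inputs, which is why the whole decoder can run on integer indices.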

C. Effect of Quantization and Clipping

Because of the inherent complexity of the decoding process, an analytical approach able to express the impact of the quantization/clipping effect would be very difficult to pursue. Moreover, theoretical arguments yield only asymptotic results [2], which could be quite distant from practical cases. Thus, we resort to numerical simulations.

We consider, for the decoder messages (that is, intrinsic, extrinsic and a posteriori messages), the quantization characteristic of (3) with parameters $q$ and $T$; the value of such parameters can be optimized through a series of numerical simulations. As regards the threshold, in particular, a preliminary analysis is possible based on the intrinsic messages. If we limit the Gauss plane to a finite square area of side $D$ around the signal constellation, it is possible to calculate, through (2), the max intrinsic message amplitude as a function of the average signal-to-noise ratio per bit, $E_b/N_0$, and the bit position. This is shown in Fig. 4 for $D$ = […] and the constellation of Fig. 3.

Fig. 4 shows that, in the considered range of signal-to-noise ratios, the input LLR can assume very high values. As we expect that the clipping effect has a negative impact on the performance of the decoder, according to this figure, the value of $T$ should be set very large. It is interesting to observe that the problem is emphasized by the need to use rather high signal-to-noise ratios because of the adoption of the M-ary modulation. In the case of BPSK, which is a more conventional choice, the problem would be much less dramatic. This is because, for a given code and desired error rate, the signal-to-noise ratios for BPSK are much smaller, and the required value of $T$ can be reduced accordingly.

Fig. 5. Performance of the considered LDPC code for uniform (Msg $q$, $T$) and non-uniform (Msg $q$, $T$, $F$) midtread decoder messages quantization: (a) BER versus $E_b/N_0$; (b) FER versus $E_b/N_0$.

The negative effect of clipping on the initial messages has been confirmed through numerical simulations, whose results are reported in Fig. 5. In running simulations, we have adopted the LLR-SPA, with a maximum number of iterations equal to 100. The same holds for the other performance curves shown in the sequel. From Fig. 5, we see that the curves of BER and FER corresponding to $q$ = […] and $T$ = […] (i.e., $\Delta$ = […]) show a significant error floor; even if the resolution is increased (for example, by setting $q$ = […] and $T$ = […], i.e., $\Delta$ = […]) the error floor remains. This confirms that the error-floor behavior, in these cases, is mainly due to the effect of clipping intrinsic messages.

On the other hand, if the clipping effect is avoided, for example by increasing the dynamic range while maintaining unitary step (which happens when $q$ = […] and $T$ = […] are chosen), the error floor is mitigated (this is evident from the FER curve). Progressively better performance can be achieved by also increasing the quantization resolution: the choice of $q$ = […] and $T$ = […] (i.e., $\Delta$ = […]) ensures, in fact, excellent performance.

However, the values of $T$ and, most of all, $q$, required to obtain the best performance when employing the quantization characteristic described by (3), can become prohibitively high. Therefore, in the next subsection, we introduce a non-uniform quantization characteristic, which is logarithmic in the quantization interval amplitudes.

D. Proposal of a New Non-Uniform Quantization Function

Given the real value $v$ and a positive real number $F$, which we call the logarithmic “factor”, let us define […]. The proposed non-uniform quantization characteristic is as follows:

(4)

where $Q_n(v)$ is the non-uniform quantized version of $v$. This new characteristic has denser quantization levels for small input values and sparser quantization levels for high input values, in line with the observation that nearly zero LLRs (which correspond to the most uncertain condition for the decoder) are more sensitive to quantization effects than high LLRs (which represent a firm belief condition).

Non-uniform quantization, according to (4), is obtained through a classic compander approach based on uniform midtread quantization. Such a choice, however, implies a more involved hardware realization when (even linear) operations must be performed on quantized values; so an accurate complexity assessment must be done when considering this solution.

For $q$ = […] and $T$ = […], the choice of $F$ = […] implies that the quantization characteristic expressed by (4) has a minimum interval amplitude equal to $\Delta_{\min}$ = […], which coincides with the lowest step already considered for uniform quantization. We have applied the non-uniform quantization with this choice of the parameters and the simulated performance is also shown in Fig. 5. We see that, by reducing the impact of the clipping effect, the logarithmic characteristic avoids the presence of the error floor, even assuming $q$ = […]. More specifically, the BER and FER curves relative to non-uniform quantization with $q$ = […] and $T$ = […] are only a small fraction of a dB away from those corresponding to uniform quantization with $q$ = […] and $T$ = […], despite the fact that the former system adopts a smaller number of quantization bits. In conclusion, law (4), although more involved to implement, seems quite suitable in the region of low error rates.
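The compander idea can be illustrated with a generic logarithmic compressor followed by uniform midtread quantization and expansion. This sketch is not law (4): the compression function $\ln(1 + F|v|)$, the parameter names and the numerical values are assumptions chosen only to show how a logarithmic characteristic yields fine steps near zero and coarse steps near the threshold while reusing a uniform quantizer.

```python
import math

def log_compand_quantize(v, threshold, levels_per_side, factor):
    """Compress |v| logarithmically, quantize uniformly, then expand back."""
    magnitude = min(abs(v), threshold)                 # saturation at the threshold
    span = math.log1p(factor * threshold)
    compressed = math.log1p(factor * magnitude) / span # normalized to [0, 1]
    index = math.floor(compressed * levels_per_side + 0.5)  # uniform midtread step
    expanded = math.expm1(index / levels_per_side * span) / factor
    return math.copysign(expanded, v)

def reconstruction_levels(threshold, levels_per_side, factor):
    """Non-negative output levels of the companded quantizer."""
    span = math.log1p(factor * threshold)
    return [math.expm1(i / levels_per_side * span) / factor
            for i in range(levels_per_side + 1)]

# Assumed toy parameters: threshold 64, 15 levels per side, factor 0.5.
levels = reconstruction_levels(64.0, 15, 0.5)
gaps = [b - a for a, b in zip(levels, levels[1:])]
```

In this toy setting the spacing between consecutive levels grows monotonically with the level index, so a large threshold is reached with few bits at the price of coarser resolution for strongly reliable messages.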

VI. QUANTIZATION OF THE RECEIVED SIGNALS

The effect of the quantization on the input received samples can be related, through a simple analytical approach, to the decoder messages quantization. An estimate of the number of quantization bits $q_s$ for the input signals can be easily found that is compatible with the resolution $\Delta$ adopted for the messages, thus avoiding the introduction of further performance degradation.

A. Estimate of the Maximum Quantization Error

The sub-system processing the received samples should implement (2): once $x$ and $y$ have been obtained as the results of an analog-to-digital conversion, these values are used to calculate the LLR for each set of $\log_2 M$ codeword bits ($\log_2 M = 4$, in the considered 16-QAM example). Consistent with the approach followed in Section V, the output of the demapper block is then quantized.

Noting by the dynamic range of the input and (forexample, in Fig. 3) and by the number of quantiza-tion bits adopted, under the hypothesis of using uniform midrisequantization (that is preferable, at the input, for a number ofpractical reasons [37]), the quantization step is .The maximum quantization error at the input, for and , re-spectively, is , and it reflects on a max-imum error on the LLR of the -th bit. Obviously, thispropagated error depends on the value of , and a suitable de-sign criterion should consist in choosing an that satisfies thecondition:

(5)

In (5), represents the constant interval amplitude in the caseof uniform LLR quantization, while it can be replaced by theminimum interval amplitude when non-uniform LLR

Fig. 6. Estimated number of quantization bits for the received signals.

quantization is adopted. If (5) is verified, the signal quantiza-tion has no impact on the decoder messages quantization, andthe BER performance is exactly the same achievable with un-quantized input samples. can be approximated throughthe following expression:

(6)

The partial derivatives appearing in (6) can be easily computed starting from (2); the final result is:

(7)

In this formula, the dynamic range and the number of quantization bits are implicit in the input quantization error; on the other hand, the noise variance appears in (7) and influences the result.
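As an illustration of the input quantization discussed above, the following minimal sketch implements a uniform midrise quantizer and its maximum error. The function names and the convention that the dynamic range is the symmetric interval [-dynamic_range, +dynamic_range] are assumptions made for this example only; they are not taken from the paper.

```python
import math


def midrise_quantize(x, dynamic_range, n_bits):
    """Uniform midrise quantizer (illustrative sketch).

    Assumption: the input range is [-dynamic_range, +dynamic_range] and
    2**n_bits levels are placed at odd multiples of half a step, so that
    zero is never an output level (midrise characteristic).
    """
    step = 2.0 * dynamic_range / (2 ** n_bits)
    index = math.floor(x / step)
    # Clip to the available levels so that out-of-range inputs saturate.
    lowest = -(2 ** (n_bits - 1))
    highest = 2 ** (n_bits - 1) - 1
    index = max(lowest, min(highest, index))
    return (index + 0.5) * step


def max_quantization_error(dynamic_range, n_bits):
    """Half a quantization step: worst-case error for in-range inputs."""
    return dynamic_range / (2 ** n_bits)
```

Under these assumptions, each additional bit halves the worst-case input error, which is the quantity that propagates to the LLR in the first-order analysis.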

B. Optimization of the Signal Quantization Parameters

By computing the maximum LLR error through (7) and inserting it in condition (5), we are able to find pairs of values that, regardless of the in-phase and quadrature components of the received sample, ensure an error on the LLRs, induced by the quantization of the received samples, not larger than that permitted by the quantization of the extrinsic messages. Considering the distance between adjacent symbols in the 16-QAM constellation (see Fig. 3), the following relationship holds:

SNR(8)

where SNR is the ratio between the average signal power and the noise power. Therefore, the maximum LLR error depends, for fixed quantization parameters, on the average signal-to-noise ratio per bit. A plot of the required number of quantization bits versus the signal-to-noise ratio per bit, based on (5) (where has been considered) and (6), is shown in the “exact” plots of Fig. 6 for the first two bits. The approximate points should be disregarded at this stage; their meaning will be described in Section VII. The analysis for the third and fourth bit provides identical results, with the in-phase and quadrature components exchanged, because of the intrinsic symmetry of the constellation, which will be further discussed in Section VII.

Fig. 7. Performance of the considered LDPC code for uniform midrise quantization of the samples (Sig) and uniform midtread quantization of the decoder messages (Msg): (a) BER; (b) FER.
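The link between the symbol spacing and the SNR can be checked numerically. The sketch below is not a reconstruction of (8): it assumes a square 16-QAM constellation with coordinates at odd multiples of half the spacing d, a noise variance sigma^2 per real dimension (so that the noise power is 2 sigma^2), and an SNR equal to the code rate times the bits per symbol times the signal-to-noise ratio per bit. All names are hypothetical.

```python
import math


def average_symbol_energy_16qam(d):
    """Average energy of a square 16-QAM with adjacent-symbol distance d."""
    levels = (-1.5 * d, -0.5 * d, 0.5 * d, 1.5 * d)
    return sum(x * x + y * y for x in levels for y in levels) / 16.0


def d_over_sigma(snr_linear):
    """Spacing-to-noise ratio under the stated assumptions.

    SNR = Es / (2 sigma^2) with Es = 2.5 d^2, hence d / sigma = sqrt(0.8 SNR).
    """
    return math.sqrt(0.8 * snr_linear)


def snr_from_ebn0_db(ebn0_db, code_rate, bits_per_symbol=4):
    """Assumed mapping from the per-bit ratio (in dB) to the linear SNR."""
    return code_rate * bits_per_symbol * 10.0 ** (ebn0_db / 10.0)
```

With these conventions the spacing-to-noise ratio grows with the square root of the SNR, so any error term that scales with the squared spacing over the noise variance is linear in the SNR.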

The required number of quantization bits, for each bit position, is a step-wise increasing function of the signal-to-noise ratio per bit. Clearly, in order to satisfy condition (5) in a given range of signal-to-noise ratios and for all the bit positions, it is necessary to assume the greatest (i.e., most stringent) number of bits. As an example, for (which implies for the considered code and constellation), the suggested value is .
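The selection of the most stringent number of bits can be sketched numerically. The code below is only an illustration under several assumptions that are not in the paper: one axis of a Gray-labeled 16-QAM is modeled as a 4-PAM with the labels shown, the LLR is the logarithm of the ratio between the bit-0 and bit-1 likelihoods, the worst-case LLR error is estimated to first order as the largest slope times half a quantization step, and the admissible error `max_llr_error` is supplied by the caller rather than derived from (5).

```python
import math

LEVELS = (-1.5, -0.5, 0.5, 1.5)            # axis levels, in units of d
LABELS = ((0, 0), (0, 1), (1, 1), (1, 0))  # assumed Gray labels per level


def llr(r, d, sigma, bit):
    """Exact LLR of one bit from one received component (assumed model)."""
    def logsumexp(values):
        top = max(values)
        return top + math.log(sum(math.exp(v - top) for v in values))

    metrics = [[], []]
    for level, label in zip(LEVELS, LABELS):
        metrics[label[bit]].append(-((r - level * d) ** 2) / (2.0 * sigma ** 2))
    return logsumexp(metrics[0]) - logsumexp(metrics[1])


def max_slope(d, sigma, bit, dynamic_range, n_points=401, h=1e-4):
    """Largest |dLLR/dr| over the input range, by central differences."""
    worst = 0.0
    for k in range(n_points):
        r = -dynamic_range + 2.0 * dynamic_range * k / (n_points - 1)
        slope = (llr(r + h, d, sigma, bit) - llr(r - h, d, sigma, bit)) / (2.0 * h)
        worst = max(worst, abs(slope))
    return worst


def min_input_bits(d, sigma, dynamic_range, max_llr_error, max_bits=16):
    """Smallest bit count whose first-order LLR error meets the threshold."""
    # Most stringent bit position: take the largest slope over both bits.
    slope = max(max_slope(d, sigma, bit, dynamic_range) for bit in (0, 1))
    for n_bits in range(1, max_bits + 1):
        input_error = dynamic_range / 2 ** n_bits  # half a quantization step
        if slope * input_error <= max_llr_error:
            return n_bits
    return None
```

In this sketch the returned value can only grow as the noise standard deviation decreases, which mirrors the staircase behavior described in the text.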

This estimate can be used to forecast the actual performance. For the sake of verification, we have considered uniform quantization of the decoder messages (which is the most critical case, having constant resolution) and repeated, in Fig. 7, the simulation in Fig. 5, now also considering the quantization of the received samples for different numbers of quantization bits. Consistent with the theory, the curve obtained with the suggested number of bits is exactly superposed on the unquantized one. However, we also see that the simulated performance degradation for a lower number of bits can be very small, and even with it remains below 0.2 dB. This result is not surprising: the number of bits obtained by imposing (5) is quite conservative; it aims to ensure that the error on the received samples is never greater than that on the decoder messages. When such a condition is not satisfied, it is not realistic to expect that performance degrades immediately: first, the threshold on the right-hand side of (5) could be exceeded for a small fraction of time and by a limited amount; secondly, the sensitivity of the decoding algorithm to the initial condition should be taken into account, so it is not certain that any excess translates into an additional error. Although feasible in principle (the former in analytical terms, by using the probability density functions of the received samples, the latter by using empirical rules drawn from simulation), this further study is rather involved and does not permit general conclusions to be derived. For this reason, the number of bits calculated by means of (5) only represents a “sufficient” condition to obtain the desired good performance. On the other hand, one can object that such an overestimate (in the specified sense) forces the system to operate with an unacceptably high number of quantization bits. However, it should be noticed that this number only affects the demapper, not the decoder (whose registers are involved in the message passing algorithm), and a simple solution can be adopted to reduce the complexity of such a block. This new proposal is described in the following section.

VII. DEMAPPER BASED ON APPROXIMATE EXPRESSIONS

A. Second Order Approximation

When the value of the SNR is sufficiently high, (7) can be greatly simplified by considering, in each sum, the leading term only. This dominant contribution is due to the two signals that, for each bit position, are at minimum distance from the received sample within the respective subsets. This technique coincides with the log-sum approximation and has been successfully applied to both product codes [38] and convolutional codes [39].

Fig. 8. Comparison between the exact and approximate LLRs for the first two bits, as a function of the quadrature component (fixed in-phase component).

Fig. 9. Comparison between the exact and approximate LLRs for the first two bits, as a function of the quadrature component (fixed in-phase component).

Actually, by imposing this simplification and taking into account (8), (7) becomes:

SNR(9)

Fig. 10. Circuit for the evaluation of the LLR.

This relationship is very simple and more expressive than (7): first of all, we notice a linear dependence on the SNR (such a dependence is necessarily more involved in the rigorous expression). Moreover, in general, it can be further simplified. For example, looking at the 16-QAM constellation in Fig. 3, it is easy to see that the two dominant signals always have in common either the in-phase component or the quadrature component, and that the maximum difference between the unequal components is . By substituting (9) into (5), together with the highlighted maximum value, with simple algebra we find:

(10)

where the rounding operator returns the smallest integer greater than its argument. This result is shown in the “approximate” plot of Fig. 6, as a function of the signal-to-noise ratio per bit, and compared to the exact one (for bits 1 and 2). Both the exact and approximate curves exhibit, as expected, a staircase behavior. Small regions usually exist, for low/medium signal-to-noise ratios, where the approximate formula can provide a value one bit higher than that given by the exact formula. Actually, these regions are practically indistinguishable, in the range considered, for the first bit, whilst they are evident for the second bit. This is due to the fact that, when the second bit is considered, the maximum difference between the dominant contributions in is smaller than . So, in principle, an adaptive quantization can be conceived that varies the number of quantization bits according to the bit position. However, it is clear that such a procedure would be difficult to manage in a practical implementation.

The same simplification used in (9) can also be introduced in the LLR expression (2). This looks like the classic max-log approximation. Under the same hypotheses, (2) becomes:

(11)

The residual difference between the exact and approximate LLRs, which is due to the approximation, is appreciable for small signal-to-noise ratios. An example is shown in Fig. 8, for , where the exact and approximate LLRs of the first two bits are plotted as a function of the quadrature component, for an arbitrary in-phase component. The difference becomes smaller and smaller for increasing signal-to-noise ratios and, at the values of interest (i.e., those required to have low error rates), it is usually acceptable for all bits. An example is shown in Fig. 9 for ; in this case the exact and approximate curves are almost overlaid. In comparison with Fig. 8, it is interesting to observe the very different dynamics of the LLRs.
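The comparison between exact and approximate LLRs can be reproduced with a short sketch. It is not the authors' code: it assumes that one axis of the Gray-labeled 16-QAM behaves as a 4-PAM with the labels below, that the LLR is the logarithm of the ratio between the bit-0 and bit-1 likelihoods, and that the noise variance is sigma^2 per dimension. The gap is measured relative to the peak LLR magnitude, since the absolute gap of a max-log approximation stays bounded while the LLR dynamics grow with the SNR.

```python
import math

LEVELS = (-1.5, -0.5, 0.5, 1.5)            # axis levels, in units of d
LABELS = ((0, 0), (0, 1), (1, 1), (1, 0))  # assumed Gray labels per level


def _metrics(r, d, sigma, bit):
    """Split the Gaussian log-metrics by the value of the selected bit."""
    split = [[], []]
    for level, label in zip(LEVELS, LABELS):
        split[label[bit]].append(-((r - level * d) ** 2) / (2.0 * sigma ** 2))
    return split


def _logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def llr_exact(r, d, sigma, bit):
    m0, m1 = _metrics(r, d, sigma, bit)
    return _logsumexp(m0) - _logsumexp(m1)


def llr_maxlog(r, d, sigma, bit):
    """Keep only the dominant (minimum-distance) term of each sum."""
    m0, m1 = _metrics(r, d, sigma, bit)
    return max(m0) - max(m1)


def relative_gap(d, sigma, bit, n_points=401):
    """Worst |exact - approximate| over the range, divided by peak |exact|."""
    span = 2.0 * d
    worst, peak = 0.0, 0.0
    for k in range(n_points):
        r = -span + 2.0 * span * k / (n_points - 1)
        exact = llr_exact(r, d, sigma, bit)
        worst = max(worst, abs(exact - llr_maxlog(r, d, sigma, bit)))
        peak = max(peak, abs(exact))
    return worst / peak
```

Calling `relative_gap` with a smaller `sigma` (higher SNR) yields a smaller value for both bits, in line with the nearly overlaid curves described for the higher signal-to-noise ratio.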


B. Simplified Demapper

The acceptability of the approximation suggests a simple solution to considerably reduce the complexity of the demapper block. The exact expression for the LLR, in fact, requires the implementation of a processor able to calculate it, given its inputs. An alternative solution would be to store the LLR values in a LUT indexed on the quantized versions of the in-phase component, the quadrature component and the signal-to-noise ratio.

Looking at (11), instead, a smarter solution is possible. Due to the linearity in the SNR, the level indexes for the quantized version of the approximate LLR normalized to the SNR can be stored in the LUT, in place of those of the LLR itself. This way, the dependence on the SNR is eliminated, and the output words only depend on the input words, regardless of the channel. To reconstruct the value of the LLR from each stored value, if needed, the circuit shown in Fig. 10 can be adopted. It multiplies each level index by the fixed-point representation of an SNR-dependent scaling factor. This circuit uses an SNR value that is continuously estimated at the receiver side, for example by using the signal-mean square error (S/MSE) ratio. When the multiplication is performed, it is easy to show that the number of bits needed to represent the output LLR is determined, at the most, by the number of bits used to represent the (always positive) scaling factor and by the width of the level index, which includes one sign bit. However, as stated in Section V, when the decoder involves only linear operations, it can be normalized in such a way as to be independent of the signal-to-noise ratio. In this case the demapper does not perform the multiplication step and the LUT output is the initial extrinsic message. The proposed solution requires only one LUT, which contains the quantized values of the normalized LLR and whose address length depends on the greatest number of quantization bits obtained by applying the analysis of the previous section to the considered SNR range.
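The SNR-independent table can be sketched as follows. This is an illustration under stated assumptions, not the circuit of Fig. 10: one axis of the constellation is modeled as a 4-PAM with the Gray labels below, the input is expressed in units of the symbol spacing, the table stores unquantized max-log values (the output quantization into level indexes is omitted for brevity), and `scale` stands for the SNR-dependent factor applied after the look-up. All names are hypothetical.

```python
import math

LEVELS = (-1.5, -0.5, 0.5, 1.5)            # axis levels, in units of d
LABELS = ((0, 0), (0, 1), (1, 1), (1, 0))  # assumed Gray labels per level


def normalized_maxlog(u, bit):
    """Max-log LLR divided by d^2 / (2 sigma^2): depends on the input only."""
    d0 = min((u - lv) ** 2 for lv, lab in zip(LEVELS, LABELS) if lab[bit] == 0)
    d1 = min((u - lv) ** 2 for lv, lab in zip(LEVELS, LABELS) if lab[bit] == 1)
    return d1 - d0


def build_lut(n_bits, dynamic_range=2.0):
    """One entry per midrise input level; each entry holds both axis bits."""
    step = 2.0 * dynamic_range / 2 ** n_bits
    lut = []
    for index in range(2 ** n_bits):
        u = -dynamic_range + (index + 0.5) * step  # reconstruction level
        lut.append((normalized_maxlog(u, 0), normalized_maxlog(u, 1)))
    return lut, step


def demap(u, lut, step, dynamic_range, scale):
    """Look up the normalized values, then apply the SNR-dependent scale."""
    index = int(math.floor((u + dynamic_range) / step))
    index = max(0, min(len(lut) - 1, index))  # saturate out-of-range inputs
    g0, g1 = lut[index]
    return scale * g0, scale * g1
```

Since the table content never changes with the channel, only `scale` has to track the estimated SNR; when the decoder is normalized as described above, the multiplication can be skipped altogether.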

C. Reduction of the LUTs Size

The LUT size is

(12)

i.e., it consists of bits in the 16-QAM case. This value can be further reduced by taking into account the following considerations.

Fig. 11 shows the two bit-wise symbol subsets for the 16-QAM constellation of Fig. 3, calculated for all the bit positions. From Figs. 11(a) and 11(b), we notice that the LLRs of the first two bits depend only on the quadrature component, since the in-phase contributions cancel in their expressions. Similarly, from Figs. 11(c) and 11(d), it is evident that the LLRs of the last two bits depend only on the in-phase component, since the quadrature contributions cancel in their expressions. Moreover, we notice that the two subsets of each of the last two bits coincide with those of the corresponding bit of the first pair when an axial symmetry around the bisector of the first and third quadrant is applied. Therefore, the LLR values of each of the last two bits coincide with those of the corresponding bit of the first pair, once the in-phase and quadrature components are exchanged. Therefore, the same LUT can be used to obtain the values for both bits of each such pair. Hence, the address word length can be halved simply by adding an “input selector” block before the LUT, which forwards only the relevant component of each input signal on the basis of the bit position. The corresponding circuit is shown in Fig. 12.

Fig. 11. Bit-wise symbol subsets (diamond and square markers) for the 16-QAM constellation of Fig. 3; panels (a) to (d) refer to the four bit positions.

Fig. 12. Circuit employing the 16-QAM demapper LUT with reduced size.

In this case, only the values for the first two bits are stored in the LUT. As previously shown, such values only depend on the quadrature component (therefore they can be calculated for an arbitrary value of the in-phase component) and coincide with those of the last two bits when the two components are exchanged. Hence, when the switch in Fig. 12 is in position “A”, the quadrature component of the input signal is used as the index for the LUT, and the values for the first two bits are available at its outputs. On the contrary, when the switch is in position “B”, the in-phase component of the input signal is used as the index, and the values for the last two bits are available at the two outputs. Hence, the LUT shown in Fig. 12 consists of bits, and it is times smaller than that in Fig. 10.

The same arguments hold for any Gray-labeled constellation with an even number of bits per symbol. In all these cases, the demapper block can be implemented by means of a LUT with -bit addresses and -bit outputs, that is, with size

(13)

However, it should be noted that, for demapping each received sample, the circuit in Fig. 12 requires two LUT accesses, while that in Fig. 10 requires only one access. Therefore, two optimized circuits should be used in order to obtain the same latency as for the original scheme. Nevertheless, if we consider two implementations of the optimized circuit and compare their size with that of the original one, we obtain a size gain equal to

(14)

The value of the size gain depends on the number of quantization bits used for the received samples, as expected. As shown in the previous sections, this number can be quite high (up to 10), thus yielding a considerable size gain when adopting the optimized circuit.
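The size comparison can be made concrete with a small calculation. The formulas below are an assumption inferred from the description (a full LUT addressed by both quantized components and delivering one word per bit, against a reduced LUT addressed by a single component and delivering half of the words); they are not a reconstruction of (12)-(14), and all names are hypothetical.

```python
def full_lut_bits(sig_bits, msg_bits, bits_per_symbol):
    """Assumed size of a LUT addressed by both quantized components."""
    return (2 ** (2 * sig_bits)) * bits_per_symbol * msg_bits


def reduced_lut_bits(sig_bits, msg_bits, bits_per_symbol):
    """Assumed size of a LUT addressed by one component only."""
    return (2 ** sig_bits) * (bits_per_symbol // 2) * msg_bits


def size_gain_two_copies(sig_bits, msg_bits, bits_per_symbol):
    """Full LUT size over two reduced LUTs (same latency as one full LUT)."""
    full = full_lut_bits(sig_bits, msg_bits, bits_per_symbol)
    reduced = reduced_lut_bits(sig_bits, msg_bits, bits_per_symbol)
    return full // (2 * reduced)
```

Under these assumptions the gain equals two raised to the number of signal quantization bits and does not depend on the message word length, which is consistent with the remark that the gain is governed by the input resolution.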

VIII. CONCLUSION

Modern telecommunications require more and more reliable and spectrally efficient transmissions. Reliability can be achieved by using LDPC codes, while spectral efficiency requires the adoption of high order modulation schemes, like M-QAM. Practical implementation of these solutions requires reconsidering many of the conclusions already drawn for the more classic LDPC coded binary modulations. The larger signal-to-noise ratio required, as a counterpart to the improved spectral efficiency, makes the M-ary modulated scheme much more sensitive to the clipping effect, to the point that unexpected error floors can appear if the system parameters are not correctly designed. In principle, the number of quantization bits needed can become very large, thus making the system quite unfeasible. To solve the problem, attractive solutions seem to be the adoption of non-uniform quantization and an in-depth analysis of the demapper block functionalities. By exploiting symmetry properties and taking into account the peculiarities of the quantities involved in the decision process, efficient demapping can be achieved with minimum size LUTs. The role of the quantization of the incoming signals can also be controlled in such a way as to avoid altering the trade-off found in the quantization of the decoder messages.

We have studied these aspects for the case of DVB compatible LDPC codes, in conjunction with M-QAM modulation. For the sake of clarity, the results presented in this paper refer to the specific case of 16-QAM, but most of the analysis and the proposed new ideas can be easily extended to higher order constellations and, in principle, to M-ary systems with different modulations.

ACKNOWLEDGMENT

The authors wish to thank Giambattista Di Donna and Sergio Bianchi, at Siemens, for their contribution and helpful discussion.

REFERENCES

[1] D. MacKay and R. Neal, “Near Shannon limit performance of low density parity check codes,” Electronics Letters, vol. 33, no. 6, pp. 457–458, Mar. 1997.

[2] T. Richardson and R. Urbanke, “The capacity of low-density parity-check codes under message-passing decoding,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.

[3] Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broadband Satellite Applications, ETSI EN Std. 302 307 (v1.1.2), Jun. 2006, Rev. 1.1.1.

[4] IEEE P802.11 Wireless LANs WWiSE Proposal: High throughput extension to the 802.11 Standard, IEEE Std. 11-04-0886-00-000n, Aug. 2004.

[5] IEEE Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems - Amendment for Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, IEEE Std. 802.16e-2005, Dec. 2005.

[6] Digital Video Broadcasting (DVB); Framing Structure, Channel Coding and Modulation for Digital Terrestrial Television, ETSI EN Std. 300 744 (v1.5.1), Nov. 2004.

[7] “DVB-T2 call for technologies,” Digital Video Broadcasting Project, Tech. Rep. SB 1644r1, Apr. 2007.

[8] Digital Video Broadcasting (DVB); Interaction channel for Satellite Distribution Systems, ETSI EN Std. 301 790 (v1.4.1), Sep. 2005.

[9] N. H. Tran and H. H. Nguyen, “Signal mappings of 8-ary constellations for bit interleaved coded modulation with iterative decoding,” IEEE Trans. Broadcast., vol. 52, no. 1, pp. 92–99, Mar. 2006.

[10] B. Rong, T. Jiang, X. Li, and M. R. Soleymani, “Combine LDPC codes over GF(q) with q-ary modulations for bandwidth efficient transmission,” IEEE Trans. Broadcast., vol. 54, no. 1, pp. 78–84, Mar. 2008.

[11] S. Papaharalabos, M. Papaleo, P. T. Mathiopoulos, M. Neri, A. Vanelli-Coralli, and G. E. Corazza, “DVB-S2 LDPC decoding using robust check node update approximations,” IEEE Trans. Broadcast., vol. 54, no. 1, pp. 120–126, Mar. 2008.

[12] Y. Li and W. Ryan, “Bit-reliability mapping in LDPC-coded modulation systems,” IEEE Commun. Lett., vol. 9, no. 1, pp. 1–3, Jan. 2005.

[13] R. G. Gallager, “Low-density parity-check codes,” IRE Trans. Inform. Theory, vol. IT-8, pp. 21–28, Jan. 1962.

[14] R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng, “Turbo decoding as an instance of Pearl’s “belief propagation” algorithm,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 140–152, Feb. 1998.

[15] C. Douillard, M. Jézéquel, C. Berrou, N. Brengarth, J. Tousch, and N. Pham, “The turbo code standard for DVB-RCS,” in Proc. Second International Symposium on Turbo Codes, Brest, France, Sep. 2000, pp. 535–538.

[16] T. Lestable, E. Zimmerman, M.-H. Hamon, and S. Stiglmayr, “Block-LDPC codes vs duo-binary turbo-codes for European next generation wireless systems,” in Proc. IEEE VTC-2006 Fall, Montréal, Canada, Sep. 2006, pp. 1–5.

[17] Information Technology - Generic Coding of Moving Pictures and Associated Audio Information - Part 1: System, ISO/IEC Std. 13 818-1, 1996.

[18] X. Y. Hu and E. Eleftheriou, “Progressive edge-growth Tanner graphs,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM’01), San Antonio, TX, Nov. 2001, vol. 2, pp. 995–1001.

[19] M. Baldi and F. Chiaraluce, “Cryptanalysis of a new instance of McEliece cryptosystem based on QC-LDPC codes,” in Proc. IEEE ISIT 2007, Nice, France, Jun. 2007, pp. 2591–2595.

[20] L. Ping and W. Leung, “Decoding low density parity check codes with finite quantization bits,” IEEE Commun. Lett., vol. 4, no. 2, pp. 62–64, Feb. 2000.

[21] T. Zhang, Z. Wang, and K. Parhi, “On finite precision implementation of low density parity check codes decoder,” in IEEE International Symposium on Circuits and Systems ISCAS 2001, Sydney, NSW, May 2001, vol. 4, pp. 202–205.

[22] X.-Y. Hu, E. Eleftheriou, D.-M. Arnold, and A. Dholakia, “Efficient implementations of the sum-product algorithm for decoding LDPC codes,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM’01), San Antonio, TX, Nov. 2001, vol. 2, pp. 1036E–1036E.

[23] H. Wymeersch, H. Steendam, and M. Moeneclaey, “Computational complexity and quantization effects of decoding algorithms for non-binary LDPC codes,” in Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP 2004, Montreal, Canada, May 2004, vol. 4, pp. 669–672.


[24] S. Kim, G. Sobelman, and J. Moon, “Parallel VLSI architectures for a class of LDPC codes,” in Proc. IEEE ISCAS 2002, Scottsdale, AZ, May 2002, vol. 2, pp. II-93–II-96.

[25] S. L. Howard, C. Schlegel, and V. C. Gaudet, “Degree-matched check node decoding for regular and irregular LDPCs,” IEEE Trans. Circuits Syst. II, vol. 53, no. 10, pp. 1054–1058, Oct. 2006.

[26] L. Yang, H. Liu, and C.-J. Richard Shi, “Code construction and FPGA implementation of a low-error-floor multi-rate low-density parity-check code decoder,” IEEE Trans. Circuits Syst. I, vol. 53, no. 4, pp. 892–904, Apr. 2006.

[27] D. Oh and K. K. Parhi, “Performance of quantized min-sum decoding algorithms for irregular LDPC codes,” in Proc. IEEE ISCAS 2007, New Orleans, LA, May 2007, pp. 2758–2761.

[28] Z. Cui and Z. Wang, “Efficient message passing architecture for high throughput LDPC decoder,” in Proc. IEEE ISCAS 2007, New Orleans, LA, May 2007, pp. 917–920.

[29] M. Shen, H. Niu, H. Liu, and J. Ritcey, “Finite precision implementation of LDPC coded M-ary modulation over wireless channels,” in Proc. Asilomar Conference on Signals, Systems and Computers, Nov. 2003, vol. 1, pp. 114–118.

[30] Z. Cui and Z. Wang, “A 170 Mbps (8176, 7156) quasi-cyclic LDPC decoder implementation with FPGA,” in Proc. IEEE ISCAS 2006, Kos, Greece, May 2006, pp. 5095–5098.

[31] S. Howard, V. Gaudet, and C. Schlegel, “Soft-bit decoding of regular low-density parity-check codes,” IEEE Trans. Circuits Syst. II, vol. 52, no. 10, pp. 646–650, Oct. 2005.

[32] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 429–445, Mar. 1996.

[33] S. Lin and D. J. Costello, Error Control Coding, Second ed. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2004.

[34] D. J. C. MacKay, “Good error correcting codes based on very sparse matrices,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399–432, Mar. 1999.

[35] A. Hunt, “Hyper-Codes: High-Performance Low-Complexity Error-Correcting Codes,” Master’s thesis, Carleton University, Ottawa, Canada, 1998.

[36] A. Hunt, J. Lodge, and S. Crozier, “Method of Enhanced Max-Log-a Posteriori Probability Processing,” U.S. Patent 6145114, Nov. 2000.

[37] Private Communication. Apr. 2006, Siemens, Cassina de’ Pecchi, Italy.

[38] R. Pyndiah, A. Picart, and A. Glavieux, “Performance of block turbo coded 16-QAM and 64-QAM modulations,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM’95), Singapore, Nov. 1995, vol. 2, pp. 1039–1043.

[39] F. Tosato and P. Bisaglia, “Simplified soft-output demapper for binary interleaved COFDM with application to HIPERLAN/2,” in Proc. IEEE ICC 2002, New York, May 2002, vol. 2, pp. 664–668.

Marco Baldi was born in Macerata, Italy, in 1979. He received the “Laurea” degree (summa cum laude) in Electronics Engineering in 2003, and the Doctoral degree in Electronics, Informatics and Telecommunications Engineering in 2006 from the Polytechnic University of Marche, Ancona, Italy. At present, he is a post-doctoral researcher and contract Professor at the same university. His main research activity is in channel coding, with particular interest in linear block codes for symmetric and asymmetric channels, low-density parity-check (LDPC) codes and their application in cryptography.

Franco Chiaraluce (M’06) was born in Ancona, Italy, in 1960. He received the “Laurea in Ingegneria Elettronica” (summa cum laude) from the University of Ancona in 1985. In 1987 he joined the Department of Electronics and Automatics of the same university. At present, he is an Associate Professor at the Polytechnic University of Marche. His main research interests involve various aspects of communication systems theory and design, with special emphasis on error correcting codes, sensor networks, cryptography and multiple access techniques. He is co-author of more than 180 papers and two books. He is a member of IEEE and IEICE.

Giovanni Cancellieri was born in Florence, Italy, in 1952. He received the degrees in Electronic Engineering and in Physics from the University of Bologna. Since 1986 he has been Full Professor of Telecommunications at the Polytechnic University of Marche. His main research activities are focused on optical fibers, radio communications and wireless systems, with special emphasis on channel coding and modulation systems. He is co-author of about one hundred fifty papers, five books of scientific contents, and two international patents. Since 2003 he has been president of CReSM (Centro Radioelettrico Sperimentale G. Marconi).
