IEEE - LT Decode

7/27/2019 IEEE - LT Decode

1/5

ISIT 2009, Seoul, Korea, June 28 - July 3, 2009

LT Codes Decoding: Design and AnalysisFeng Lu

Electrical and Computer Engineering DepartmentUniversity of California, San Diego

Email: [email protected]

Abstract-LT codes provide an efficient way to transfer information over erasure channels. Past research has illustrated thatLT codes can per form well for a large number of input symbols.However, it is shown that LT codes have poor performancewhen the number of input symbols is small. We notice thatthe poor performance is due to the design of the LT decodingprocess. In this respect, we present a decoding algorithm calledfull rank decoding that extends the decodability of LT codes byusing Wiedemann algorithm. We provide a detailed mathematicalanalysis on the rank of the random coefficient matrix to evaluatethe probability of successful decoding for our proposed algorithm.Our studies show that our proposed method reduces the overheadsignificantly in the cases of small number of input symbols yetpreserves the simplicity of the original LT decoding process.

I. INTRODUCTIONThe concept of digital fountain codes was first introduced

by Byers et al. [1] in 1998 for information distribution. It iss timulated on an analogy to fountain spraying water drops,which are collected into a bucket. When a bucket is beingfilled from a fountain, which droplet falling into the bucket isnot as important, as long as the bucket is being filled. Once thebucket is full, collection process ends, and further processingon decoding the content in the bucket will take place. Withthis idea, servers are spraying stochastically encoded packetsof data, which clients then collect. Once enough packets areobtained, the original data can be reconstructed. Throughoutthe process , which packets are obtained should not matter.An ideal digital fountain code should exhibit the followingproperties. Given a message that would require k packets oflength l for transmission, this code would allow the sourceto generate an infinite supply of encoded packets of length l.The client can then reconstruct the original message once anyk encoded packets of the infinite supply of encoded packetsare received. Moreover, the complexity of the reconstructionprocess should be linear in k.Approximations to a digital fountain has existed for some

time, such as the Low-Density-Priority Code (LDPC) code[2] and the Reed-Solomon (RS) code [3], which are generatedby loosening the requirements in various ways. LT codes[4] are the first real izat ion of a true digita l fountain code.The symbol length for the code can be arbitrary, from one-bitbinary symbols to general l-bit symbols. If an original file canbe described by k symbols, then each coded packet can beindependently generated, and the content of the original filecan be recovered from any k + 0 (Jk In2 (k / 8)) packets withthe probability of 1 - 8 by an average of 0 (k In(k / 8)) symbol

Chuan Heng Foh, Jianfei Cai and Liang-Tien ChiaSchool of Computer Engineering

Nanyang Technological University, Singapore 639798Email: {aschfoh.asjfcai.asltchia}@ntu.edu.sg

operations. Following [4], Raptor codes were developed byShokrollahi [5] as rateless fountain codes with even smallerencoding and decoding complexities than LT codes. Raptorcodes make use of a pre-coder before the typical LT process.It has been shown in [5] that LT codes perform very wellfor large values of k. By optimizing the degree distribution,the packet overhead E can be as small as 3% for k > 100000.In reality, small numbers of symbols are often encountered.For example, in video streaming applications, k can be thenumber of frames in a GOP, which is typically in the rangeof 10 to 100. Under this configuration, high packet overheadis observed [7], [8]. Hyytia et al. [7] optimize the configuration parameters of the degree distribution for LT codes forsmall numbers of symbols. They proposed two approaches,namely Markov chain and combination methods. However,as indicated in the paper, the methods are not scalable, andcan only handle symbols k ~ 10 (numerically up to 30 forthe combination method). The authors in [8] define a newdegree distribution function and apply the Gaussian elimination method [9] for decoding. While the packet overhead E isreduced, the decoding complexity increases significantly.Addressing the shortcoming of the LT codes and otherapproaches ment ioned above, we propose a new decodingprocess called full rank decoding algorithm. While retainingthe original LT encoding and decoding process in maximalpossible extent to preserve the low complexity benefit of LTcodes, our proposed decoding process introduces a method thatextends the decodability of LT decoding process preventingLT decoding from terminating prematurely. We present theproposed full rank decoding algorithm in the next section. InSection III, a detailed mathematical analysis on the rank of therandom coefficient matrix is given to evaluate success decoding probability for our proposed algorithm. Numerical resultsare shown in Section IV to demonstrate the effectiveness ofour method, with important conclusions drawn in Section V.

II . FULL RANK LT DECODING PROCESSA. LT Decoding DrawbacksIn the LT decoding process, a symbol is taken of f from

the ripple each time and the related packets are updatedaccordingly. The process terminates when there is no moresymbol left in the ripple, which is said to be successful if allsymbols are recovered. Hence, in the process, it is importantnot to let the symbols in the ripple run out before all symbolscan be recovered. Since the arrival of new symbols into the

978-1-4244-4313-0/09/$25.00 2009 IEEE 2492


2/5

2ISIT 2009, Seoul, Korea, June 28 - July 3, 2009Algorithm 1 Full rank LT decoding

symbols and often require high computing power. We shallpresent our method in the following.Let M denote the coefficient matrix. M is of size n x k for

decoding k symbols with n packets. To ease our discussion,we let n == k for the time being. The case of n > k will bediscussed later. The task of decoding is essentially to solve

for x where x is the k x l vector of the symbols, y is the n x lvector of received packets, l is the length of a symbol, and Mis defined over GF (2).Rather than the entire symbol set, our decoding procedure

only needs to solve for a par ticular symbol , i.e. the borrowedsymbol . This allows us to use a solver that deals with solvingfor only a single symbol with much lesser computational cost.The first task is to identify a collection of row vectors from Msuch that the row vectors eliminates all other symbols exceptthe borrowed symbol. Let x' be a k x 1 vector over GF(2)describes the selection of row vectors, where xj == 1 indicatesthat the j th row vector is selected and xj == 0 indicatesotherwise. Let u, be a k x 1 unit vector where the unique 1locates at the index i which is also the index of the borrowedsymbol. We then need to solve the following

1: rank(M):=k, ripple: r, borrowed symbol buffer: bu2: while not all source symbols are recovered3: re lease all degree one symbols to the r ipple4: if ripple is empty5: compute the index of borrowed symbol bj as in (1)6: put index bj into buffer bu7: recover the borrowed symbol8: release the borrowed symbol into ripple9: end if10: remove a symbol rz from the ripple r11: for all encoding symbols c;12: if Ci includes rz13: c; :== c; EB r i14: update Vi(j):=O and reduce degree of Ci by one15: end if16: end for17: end while

(2)x==y

B. Full Rank DecodingKnowing that the LT decoding process terminates early due

to lack of symbols in the ripple, in this paper, we propose amethod, called full rank decoding to extend the decodabilityof LT codes. Specifically, given a set of packets of full rank,whenever the ripple is empty causing an early termination,a par ticular symbol is borrowed and decoded through someother method, and then placed into the ripple for the LTdecoding process to continue. This procedure is repeated untilthe LT decoding process terminates with a success. Note thatin the case of full rank, any picked borrowed symbol can bedecoded with a sui table method to allow continuat ion of theLT decoding process. Unlike the previous approach [8], [10]where the decoding procedure completely or partially relieson Gaussian el iminat ion, our proposed decoding algori thmmain ly uses LT decoding to recover symbols, and triggersWiedemann algorithm when LT decoding fails in order torecover a borrowed symbol, then returns back to LT decodingto recover subsequent symbols.We use the following criteria to decide which symbolshould be borrowed. Considering decoding k symbols withn packets, given the above analogy between a packet anda linear equation, let VI, V2 , ... , Vn denote the coefficientvectors with the entries in GF (2) at the intermediate decodingprocess when the r ipple is empty. Let b denote the index ofthe borrowed symbol, and bj is dec ided by

b == argmaxj (V(j)IV :== L ~ = l Vi)' (1)

ripple also depends on the degrees of the received packets,the degree dis tr ibution used in the encoding is also a crucia lfactor to the final success of the decoding process.Our further investigation on the LT decoding process reveals

that when LT decoding process terminates and reports afailure, often the undecodable packets can be decoded torecover all symbols within the packets by other means, suchas Gaussian elimination [10]. In other words, when viewing apacket as an equation formed by combining linearly a numberof variables (or symbols) in GF(2), the set of availableequations (or packets) may give a full rank such that anumerical solver (or decoder) can determine all variables (orsymbols). Attributing to the design of the LT decoding process,the method recovers only part ia l but not all symbols.

Let S be the space spanned by this sequence, where M,acts in a nonsingular way on S. Let M; be the operator M trestricted to S, and define the minimal polynomial ofM; to be

for x' where M, == M T . The fact that M is of full rankguarantees the existence of a solution. Finally, the innerproduct of (x', y) gives the borrowed symbol. We use theefficient Wiedemann algor ithm [11] to solve (3). The vectoru, is first used to generate the so-called Krylov sequence

The above sum operation is defined in real number field, andV (j) gives the value at index j of the vector V. The basic ideahere is to choose the symbol that is carried by mos t packets.Incorporat ing borrowed symbols into the LT decoding, ourproposed full rank LT decoding is shown in Algor ithm 1.C. Recovering the Borrowed SymbolOne of the critical designs in our proposed idea is the

recovery of a borrowed symbol. We need to seek for a suitablemethod that can recover only a single symbol using a lowcomputat ional cost. Common techniques such as Gauss ianelimination are inadequate as they are designed to recover all

M tx ' == u, (3)

2493


3/5

ISIT 2009, Seoul, Korea, June 28 - July 3, 2009f(z), where the coefficient of f(z) E GF(2). Let d == deg(f)and f[i] be the coefficient of zi in f(z). Provided that f(z)can be found, the solution to (3) can be quickly obtained by

dx' == L f [ i ] M ~ - l u i 'i=l

Algorithm 2 shows how the borrowed symbol is obtained.

(4)

III. RANDOM MATRIX RANKIt is understood that the probability of successful decoding

for our proposed full rank algorithm is equal to the probabilitythat the coefficient matrixM formed by the received packets isof full rank. In other words, as long as M is of full rank, ourproposed algorithm guarantees the success of the decoding.Studying of the rank in a random matrix gives insight to thedecoding performance of our proposed full rank algorithm.

Algorithm 2 Solving the borrowed symbol A. Random matrix rank when n == k

where b == j - i + a and 0 ::; a ::; i,O ::; b ::; (k - i). Thus,the probability that V q + I of degree (a+ b) happens to be inthe arrangement as revealed in Fig. 1 is ( : t ~ k ~ i ) .a+bBased on the above analysis, we can derive the statetransition probability Pij , i.e. the probability that the sum of(q + 1) vectors has degree j given that the sum of the first qvectors has degree i,

where 0d is the probability that an encoded packet has adegree of d. This degree distribution is defined in LT codesgoverned by two other configuration parameters '\,8 [4].The very last result describes the evolution of degree when

an additional vector is added. This allows us to determine thedegree distribution of the sum of any number of vectors. Weshall define a transition matrix Tr with dimension (k + 1) x(k+l), and each element Tr i j == P(i-1)(j-1). Let nq denotes

(6)

(7)

2:: Da+b (:)\kbi) , i < jO::;a::;min(k- j,i) (a+b)b=j-i+a2:: Da+b (:)\kbi) , i = jl::;a::;min(k- j,i) (a+b)b=j-i+a2:: Da+bG) \kbi) , i > ji- j : : ;a: : ;min(k-j ,i) (a+b)b=j-i+a

We first consider the special case of n == k, where Mbecomes a k x k matrix over GF(2). Let Vi be the it h rowvector of M. The row vectors are linearly dependent if thereexists a nonzero vector (C1,"" Ck) E GF (2)k that satisfies2::7=1 CiVi == O. If M is said to have a full rank, any linearcombination of coefficient vectors (VI,V2, . .. ,Vk) will notproduce O. Consider a non-zero vector c == (C1,"" Ck) EGF(2)k, with exactly q non-zero coordinates. Without loss ofgenerality, let us assume that the first q coordinates of carenon-zero. Define Pq to be the probability that 2::;=1 CiVi == O.Clearly, it is the same as the probabil ity that 2::;:::; Vi == Vqbecause the operations are defined over GF(2).Suppose that summing the first q vectors resulting a vector

with degree i, i.e. a vector that carries i number of Is. Wewould like to analyze the condit ions that its degree changesto j with one additional vector added. As illustrated in Fig. 1,we can compute the degree of the (q+ 1)th vector, V q+ I , as

(5)

1: compute M ~ U i for i == 0,1, . . . , 2k - 12: set S == 0 and go(z) == 13: set U s+ I to be the (s+ 1)s t unit vector4: extract from the results of step 1 the sequence

( i ) 2 k - 1 .Us+I,M] u, i=O (Inner product)5: apply gs(z) to this sequence6: set fS+1 to be the minimal polynomial of the sequenceproduced in step 57: update gs+l == fs+1gs8: set s = s + 1; ifdeg(gs) < k and s < k, go to step 39: use f == gs and compute x' as in (4)10: the borrowed symbol is the inner product of < x', y >

Once x' is solved, the corresponding recovered symbol isobtained as < Mex /, y >, where < . > denotes the vectorinner product.

D. Non-square CaseIn practice, it is likely that more than k packets have to

be received in order to ensure the complete recovery of allsymbols. As a result, the coefficient matrix M will be nonsquare, i.e., the matrix is of size n x k with rank equal to kand n > k.Our approach is to find a n x k matrix Me such that

MjM, will be of full rank. Please note that M; should beof full rank, otherwise, all decoding methods fail. One wayto obtain Me is to randomly set an entry of row i in Me toone with probability Pi = min 2 log k + ~ - i ) . According to[12], with the probability of at least 12.5%, the k x k matrixMjM, is non-singular (i.e. full rank). In this way, with a fewrounds of trials, we can find Me . As we now introduce Me,(3) becomes

The first few steps presented in Algorithm 2 are selfexplanatory. For step 5, given a polynomial g(z) of degreel ld and a sequence {Si}i=O' let s(z) == 2::i=O SiZ?' and bydefinition the result of applying g(z) to the sequence {s; } ~ = o is

{ ( g ( z - l ) s ( z ) ) [ i ] } ~ : g . We apply the BM algori thm [12], [13]in step 6 to find the minimal polynomial. The readers canrefer to [14] for a practical algorithm particular ly designedfor sequences defined in GF(2).

2494


4/5

4ISIT 2009, Seoul, Korea, June 28 - July 3, 2009k - I

Fig . 1. An example of the degree chang ing from i to j with the (q+ 1) t hvector added. Note that the dark region indicates the places of Is. (a) k = 30

,,,,,,,,,

(b) k = 60;; 80 90 100 110Numberof encodedpacketsreceived

0.2

,,,,,,, -"",- ' ~ - - - - - - - ,

40 50 60Numberof encodedpacketsreceivedsum of (q+1) vectors

sumof q vectors

lh(q+1) vectorsb

j= l - a+b

a

Fig . 2. The probability of success decoding for FR and LT with respect tothe number of received packets.

where nq can be determined numerically.We now return our focus back on finding Pq , that is theprobability that

the degree distribution of the sum of q vectors and q 2: 1, thefollowing equation can be establishedTrq -l . (no, n 1 , .. . , nk)T = n q (8)

In order to satisfy (9), the summation on the LHS must givea vector that has the same number of Is as that of v q , andthese Is should be located at the same positions. Knowing thata vector of degree d can produce of possible forms, theprobability for (9) to hold can be derived ask n nq - l"\" ' Hd . HdPq = L, (k) (10)

d=O dIf M is of full rank, then any combination of the coefficientvectors should not result in a linear dependency. Thus, theprobability of full rank can be determined by

k kPf(k) = II (1 - Pq)(q) (II)q=2

q-lLVi =V q ,i= 1

(9)

where Por is the probability that a random vector is linearlyindependent to r other linearly independent row vectors. It ish P - Pf(r+l)easy to see t at or - Pf (r ) .With the establishment of the state recurrentrelationship , the system state probabilities can thenbe determined by solving (13) (see page 5). Notethat (P(I , 1), P(1 ,2 ) , . . . ,P(I ,k))T is initialized to(1,0 ,0 , . .. ,O. Since the transition matrix shown in(13) is of full rank, the methods like eigen decompositionor companion matrix and Jordan normal form [IS] can beutilized to derive a close form expression for P(q , r) .Weset the number of input symbols to be 30, 60, and presentthe numerical results for the probability of full rank versus thenumber of packets received for the setting A = 0.2,0 = 0.1(A and 0 are the configuration parameters for encoding degreedistribution). Wefurther include simulation results to show theaccuracy of our numerical computation. As can be seen fromthe results presented in Fig. 2, the probabilities of full rankclimb quickly as soon as k packets are received. Comparingto the case of LT decoding algorithm [4], given a targeted

success decoding probability, our full rank decoding algorithmuses much lesser received packets to decode the same numberof symbols.as there are of c with exactly q non-zero coordinates.B. Random matrix rank when n > kThis subsection extends the case to the condition where

n > k. Additional handling must be introduced since (II)is derived based on the condition that for a full rank matrixno linear dependency exists for any combination of the rowvectors, which is not true for the case of n > k.Let (q,r) denote the state of the coefficient matrix M whereM consists of q row vectors with rank r , and P(q , r) denotethe corresponding state probability. Our objective is to derivethe state probability of P(n, k). The evolution of the rank ofM can be described by the following recurrent relationshipP(q , r ) = (1 - Por)P(q - 1, r) + po(r- 1)P(q - 1, r - 1)

1 < r :::; k, P(I, 1) = 1, Pok = 0P(q ,r ) = 0, i fq < r (12)

IV. N UMERICAL RESULTS AND DISCUSSIONReference [6] and Section III provide the probabilitiesthat all symbols can be successfully decoded after receivinga particular number of packets for LT and FR decodingalgorithms respectively. Using these results, in Fig. 3, wecompare the expected number of packets received for bothalgorithms. Four different set of configuration parameters areused, and the number of source symbols k ranges from 30 to

100. We compare with two sets of configuration parametersrelated to LT codes where both show consistent results.From Fig. 3, it can be seen that our proposed full rankingdecoding yields much better performance in tenns of theexpected number of packets needed for decoding. It can also beseen that the performance of our proposed algorithm is closeto the ideal performance (i.e. n = k). Conversely, LT codesneed additional packets to compensate for the imperfectness

of received degree distribution to avoid the decoding processfrom terminating prematurely.2495


5/5

ISIT 2009, Seoul, Korea, June 28 - July 3, 2009P(q,1)P(q,2) 1 - PolPo l

(q- l) P(l,l)P(1,2)(13)

P(q ,k -1 )P(q,k) Po (k - l ) 1 - P okP(1 ,k -1 )P(l , k)

Fig. 4. Decoding runtime comparison between FR and LTdecoding.

behavior of a random coefficient matrix which shows that a fullrank can be achieved with low overheads in terms of packets.This result encouraged the design for decoding algorithms likeours that achieve successful decoding upon a full rank. In ourdesign, we introduced the concept of borrowed symbols inthe LT decoding process, where when LT decoding processterminate prematurely, a customized Wiedemann method isactivated to recover a new symbol and place into the ripple sothat the LT decoding process can continue. Our results haveshown a lower expected number of packets needed for therecovery of all symbols and a lower runtime.R EFERENCES

[I] J.w. Byers, M. Luby, M. Mitzenmachert, A. Rege, "A Digital FountainApproach to Reliable Distribution of Bulk Data," ACM SIGCOMMComputer Communication Review, Vol.28, no. 4, pp. 56-67, 1998.[2] R.G. Gallager, "Low-Density Parity-Check Codes". Cambridge, MA:MIT Press, 1963.[3] I.S. Reed, G. Solomon, "Polynomial Codes over Certain Finite Fields,"Journal of the Society for Industrial and Applied Mathematics, Vol. 8,no. 2, pp. 300-304, Jun, 1960.[4] M. Luby, "LT Codes," The43rdAnnual IEEE Symposium on Foundationsof Computer Science, pp. 271-280, Nov, 2002.[5] A. Shokrollahi, "Raptor Codes," IEEE Transactions on InformationTheory, Vol. 52, no. 6, pp. 2551-2567, 2006.[6] R. Karp, M. Luby, A. Shokrollahi, "Finite length analysis of LT codes,"The IEEE International Symposium on Information Theory, 2004.[7] E. Hyytia, T. Tirronen, J. Virtamo, "Optimal Degree Distribution forLT Codes with Small Message Length," The 26th IEEE InternationalConference on Computer Communications INFOCOM, pp. 2576-2580,2007.[8] C. Studholme, I. Blake, "Windowed Erasure Codes," The IEEE International Symposium on Information Theory, pp. 509-5I3, 2006.[9] J. Gentle, "Numerical Linear Algebra for Application in Statistics," pp.87-91, Springer-Verlag, 1998.[10] M. A. Shokrollahi, S. Lassen, and R. Karp, "Systems and processesfor decoding a chain reaction code through inactivation," U.S. Patent7,030,785. April 2006.(I I] D.Wiedemann, "Solving sparse linear equations over finite fields," IEEETransactions on Information Theory, Vol.32, no. I , pp. 54-62, 1986.[12] E. Berlekamp, "Algebraic Coding Theory," McGraw-Hili, New York,1968.[13] J. Massey, "Shift-register synthesis and BCII decoding," IEEE Transactions on Information Theory, Vol. 15, no. I, pp. 122-127, 1969.(14] G. Gong, "Lecture Notes on Sequence Design," Chp. 6,http://comsec.uwaterloo.ca/ggong/C0739x/course.html.[15] R.A. Hom, C.R. Johnson, "Matrix Analysis," Cambridge UniversityPress, 1985.[16] I.E Akyildiz, w. Su, Y. Sankarasubramaniam, E. Cayirci, "WirelessSensor Networks: a Survey" Computer Networks, Vol. 38, no. 4, pp.393-422, 2002.

(b) c-eQ)0XW

40 60 80 100Numberof input symbols(b)

IEEE - LT Decode

Documents

Transcript of IEEE - LT Decode