Self-Test Techniques for Crypto-Devices

5
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010 329 Assume that for the th update of the backtracking, filling ’s in and through is assign-compatible, as such, we can prove that filling ’s in the previous patterns and through is also assign-compatible. REFERENCES [1] A. Chandra and K. Chakrabarty, “Test data compression and test re- source partitioning for system-on-a-chip using frequency-directed run- length codes,” IEEE Trans. Comput., vol. 52, no. 8, pp. 1076–1088, Aug. 2003. [2] A. Jas, J. Gosh-Dastidar, and N. A. Touba, “Scan vector compression/ decompression using statistical coding,” in Proc. IEEE VLSI Test Symp. (VTS), 1999, pp. 114–120. [3] A. Jas, J. G. Dastidar, M. E. Ng, and N. A. Touba, “An efficient test vector compression scheme using selective Huffman coding,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 22, no. 6, pp. 797–806, Jun. 2003. [4] X. Ruan and R. Katti, “An efficient data-independent technique for compressing test vectors in systems-on-a-chip,” in Proc. IEEE Comput. Soc. Annu. Symp. Emerging VLSI Technol. Archit., Washington, DC, 2006, pp. 153–158. [5] X. Kavousianos, E. Kalligeros, and D. Nikolos, “Optimal selective Huffman coding for test-data compression,” IEEE Trans. Comput., vol. 56, no. 8, pp. 1146–1152, Aug. 2007. [6] A. Wuertenberger, C. S. Tautermann, and S. Hellebrand, “A hybrid coding strategy for optimized test data compression,” in Proc. Int. Test Conf. (ITC), Charlotte, NC, , 2003, pp. 451–459. [7] M. Nourani and M. H. Tehranipour, “RL-Huffman encoding for test compression and power reduction in scan applications,” ACM Trans. Des. Autom. Electron. Syst., vol. 10, no. 1, pp. 91–115, Jan. 2005. [8] F. Wolff and C. Papachristou, “Multiscan-based test compression and hardware decompression using LZ77,” in Proc. Int. Test Conf. (ITC), 2002, pp. 331–339. [9] K. J. Balakrishnan and N. A. Touba, “Relationship between entropy and test data compression,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 23, no. 4, pp. 386–395, Apr. 2007. [10] J. Y. Long, J. H. Feng, and L. Zhu, “A new test data compression/ decompression scheme to reduce SoC test time,” in Proc. 6th Int. Conf. ASIC, 2005, vol. 2, pp. 685–688. [11] M. Tehranipoor, M. Nourani, and K. Chakrabarty, “Nine-coded com- pression technique for testing embedded cores in SoCs,” IEEE Trans. VLSI Syst., vol. 13, no. 6, pp. 719–731, Jun. 2005. [12] F. Brglez, D. Bryan, and K. Kozminski, “Combinational profiles of sequential benchmark circuits,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 1989, pp. 1929–1934. [13] J. P. Fang, Y. Hao, H. X. Liu, and K. Li, “A hybrid run-length coding for soc test data compression,” Chin. J. Electron., vol. 33, no. 11, pp. 55–59, Nov. 2005. [14] C. Giri, B. M. Rao, and S. Chattopadhyay, “Test data compression by split-VIHC (SVIHC),” in Proc. Int. Conf. Comput.: Theory Appl. (ICCTA), 2007, pp. 146–150. Self-Test Techniques for Crypto-Devices Giorgio Di Natale, Marion Doulcier, Marie-Lise Flottes, and Bruno Rouzeyre Abstract—This paper describes a generic built-in self-test strategy for de- vices implementing symmetric encryption algorithms. Taking advantage of the inner iterative structures of crypto-cores, test facilities are easily set-up for circular self-test of the crypto-cores, built-in pseudorandom test gen- eration and response analysis for other cores in the host device. Main ad- vantages of the proposed test implementation are an architecture with no visible scan chain, 100% fault coverage on crypto-cores with negligible area overhead, availability of pseudorandom test sources, and very low aliasing response compaction for other cores. Index Terms—Digital circuit testing, security, self-testing. I. INTRODUCTION At the core of an electronic device offering digital security services is the cryptographic coprocessor that executes the cryptographic function. Such “crypto-cores” provide security services such as confidentiality, integrity, and authentication. Because weak “crypto-algorithms,” poor design of the device or hardware physical failures can render the product insecure and place highly sensitive information or infrastructure at risk, validation of crypto-algorithm and test of the hardware implementing such func- tions are essential. U.S. National Institute for Standards and Technology (NIST) orga- nized contest for selecting encryption functions. The Data Encryption Standard (DES) [1] was adopted as national standard in 1976, and the Advanced Encryption Standard (AES) [2] has been selected in October 2000. Since the hardware implementation of DES is not expensive, it is still currently used in many applications, frequently in the form of Triple DES for security improvement [3]. The validation of such algo- rithms as cryptographic functions has been widely discussed as part of the standardization process and is not further discussed here. This paper aims at providing efficient test solutions for possible physical failures on the electronic device implementing the cryptographic algorithm. One approach for providing test solutions at different stages of an IC life cycle consists in including built-in self-test (BIST) resources into the circuit under test (CUT). Classically, storage elements are or- ganized into scan chains and additional hardware is used for feeding the scan chains with pseudorandom test data, and sinking the test re- sponses before analysis of the compressed signature [4]. So, BIST does not provide full controllability and observability of the internal storage elements from the IC interface, namely the scan-in/scan-out pins. This major difference with the external testing strategy makes the scan- based attacks not exploitable [5], [6], and thus avoiding the implemen- tation of related countermeasures (e.g.,[6]–[8]). However, BIST must be implemented at low cost and its efficiency must be demonstrated in terms of fault coverage (FC) and test length. Manuscript received June 10, 2008; revised October 06, 2008. First published April 21, 2009; current version published January 20, 2010. This paper was presented in part at the IEEE International Symposium on Electronic Design, Test and Applications, Hong Kong, January 23–25, 2008. The authors are with the Laboratoire d’Informatique de Robotique et de Mi- croélectronique de Montpellier (LIRMM), Montpellier 34392, France (e-mail: [email protected]). Digital Object Identifier 10.1109/TVLSI.2008.2010045 1063-8210/$26.00 © 2009 IEEE

Transcript of Self-Test Techniques for Crypto-Devices

Page 1: Self-Test Techniques for Crypto-Devices

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010 329

Assume that for the �th update of the backtracking, filling �’s in�������� and �������� through �������� is assign-compatible, as such,we can prove that filling �’s in the previous patterns ���������� and���������� through ������ is also assign-compatible.

REFERENCES

[1] A. Chandra and K. Chakrabarty, “Test data compression and test re-source partitioning for system-on-a-chip using frequency-directed run-length codes,” IEEE Trans. Comput., vol. 52, no. 8, pp. 1076–1088,Aug. 2003.

[2] A. Jas, J. Gosh-Dastidar, and N. A. Touba, “Scan vector compression/decompression using statistical coding,” in Proc. IEEE VLSI Test Symp.(VTS), 1999, pp. 114–120.

[3] A. Jas, J. G. Dastidar, M. E. Ng, and N. A. Touba, “An efficient testvector compression scheme using selective Huffman coding,” IEEETrans. Comput.-Aided Des. Integr. Circuits Syst., vol. 22, no. 6, pp.797–806, Jun. 2003.

[4] X. Ruan and R. Katti, “An efficient data-independent technique forcompressing test vectors in systems-on-a-chip,” in Proc. IEEE Comput.Soc. Annu. Symp. Emerging VLSI Technol. Archit., Washington, DC,2006, pp. 153–158.

[5] X. Kavousianos, E. Kalligeros, and D. Nikolos, “Optimal selectiveHuffman coding for test-data compression,” IEEE Trans. Comput.,vol. 56, no. 8, pp. 1146–1152, Aug. 2007.

[6] A. Wuertenberger, C. S. Tautermann, and S. Hellebrand, “A hybridcoding strategy for optimized test data compression,” in Proc. Int. TestConf. (ITC), Charlotte, NC, , 2003, pp. 451–459.

[7] M. Nourani and M. H. Tehranipour, “RL-Huffman encoding for testcompression and power reduction in scan applications,” ACM Trans.Des. Autom. Electron. Syst., vol. 10, no. 1, pp. 91–115, Jan. 2005.

[8] F. Wolff and C. Papachristou, “Multiscan-based test compression andhardware decompression using LZ77,” in Proc. Int. Test Conf. (ITC),2002, pp. 331–339.

[9] K. J. Balakrishnan and N. A. Touba, “Relationship between entropyand test data compression,” IEEE Trans. Comput.-Aided Des. Integr.Circuits Syst., vol. 23, no. 4, pp. 386–395, Apr. 2007.

[10] J. Y. Long, J. H. Feng, and L. Zhu, “A new test data compression/decompression scheme to reduce SoC test time,” in Proc. 6th Int. Conf.ASIC, 2005, vol. 2, pp. 685–688.

[11] M. Tehranipoor, M. Nourani, and K. Chakrabarty, “Nine-coded com-pression technique for testing embedded cores in SoCs,” IEEE Trans.VLSI Syst., vol. 13, no. 6, pp. 719–731, Jun. 2005.

[12] F. Brglez, D. Bryan, and K. Kozminski, “Combinational profiles ofsequential benchmark circuits,” in Proc. IEEE Int. Symp. Circuits Syst.(ISCAS), 1989, pp. 1929–1934.

[13] J. P. Fang, Y. Hao, H. X. Liu, and K. Li, “A hybrid run-length codingfor soc test data compression,” Chin. J. Electron., vol. 33, no. 11, pp.55–59, Nov. 2005.

[14] C. Giri, B. M. Rao, and S. Chattopadhyay, “Test data compressionby split-VIHC (SVIHC),” in Proc. Int. Conf. Comput.: Theory Appl.(ICCTA), 2007, pp. 146–150.

Self-Test Techniques for Crypto-Devices

Giorgio Di Natale, Marion Doulcier, Marie-Lise Flottes, andBruno Rouzeyre

Abstract—This paper describes a generic built-in self-test strategy for de-vices implementing symmetric encryption algorithms. Taking advantage ofthe inner iterative structures of crypto-cores, test facilities are easily set-upfor circular self-test of the crypto-cores, built-in pseudorandom test gen-eration and response analysis for other cores in the host device. Main ad-vantages of the proposed test implementation are an architecture with novisible scan chain, 100% fault coverage on crypto-cores with negligible areaoverhead, availability of pseudorandom test sources, and very low aliasingresponse compaction for other cores.

Index Terms—Digital circuit testing, security, self-testing.

I. INTRODUCTION

At the core of an electronic device offering digital security services isthe cryptographic coprocessor that executes the cryptographic function.Such “crypto-cores” provide security services such as confidentiality,integrity, and authentication.

Because weak “crypto-algorithms,” poor design of the device orhardware physical failures can render the product insecure and placehighly sensitive information or infrastructure at risk, validation ofcrypto-algorithm and test of the hardware implementing such func-tions are essential.

U.S. National Institute for Standards and Technology (NIST) orga-nized contest for selecting encryption functions. The Data EncryptionStandard (DES) [1] was adopted as national standard in 1976, and theAdvanced Encryption Standard (AES) [2] has been selected in October2000. Since the hardware implementation of DES is not expensive, itis still currently used in many applications, frequently in the form ofTriple DES for security improvement [3]. The validation of such algo-rithms as cryptographic functions has been widely discussed as part ofthe standardization process and is not further discussed here. This paperaims at providing efficient test solutions for possible physical failureson the electronic device implementing the cryptographic algorithm.

One approach for providing test solutions at different stages of anIC life cycle consists in including built-in self-test (BIST) resourcesinto the circuit under test (CUT). Classically, storage elements are or-ganized into scan chains and additional hardware is used for feedingthe scan chains with pseudorandom test data, and sinking the test re-sponses before analysis of the compressed signature [4]. So, BIST doesnot provide full controllability and observability of the internal storageelements from the IC interface, namely the scan-in/scan-out pins. Thismajor difference with the external testing strategy makes the scan-based attacks not exploitable [5], [6], and thus avoiding the implemen-tation of related countermeasures (e.g.,[6]–[8]). However, BIST mustbe implemented at low cost and its efficiency must be demonstrated interms of fault coverage (FC) and test length.

Manuscript received June 10, 2008; revised October 06, 2008. First publishedApril 21, 2009; current version published January 20, 2010. This paper waspresented in part at the IEEE International Symposium on Electronic Design,Test and Applications, Hong Kong, January 23–25, 2008.

The authors are with the Laboratoire d’Informatique de Robotique et de Mi-croélectronique de Montpellier (LIRMM), Montpellier 34392, France (e-mail:[email protected]).

Digital Object Identifier 10.1109/TVLSI.2008.2010045

1063-8210/$26.00 © 2009 IEEE

Page 2: Self-Test Techniques for Crypto-Devices

330 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010

Fig. 1. Implementation of the crypto-algorithm.

Random pattern testability of crypto-cores has been discussed in [9].Authors show how random data and possible errors can be easily prop-agated through typical operations involved in symmetric block encryp-tion algorithms. The paper focus on data paths of nonstandard algo-rithms (e.g., 3-WAY cipher [10]) and the test solution lies on a classicalcentralized BIST architecture where extra test resources are inserted inthe design for test pattern generation (TPG) and output response anal-ysis (ORA) functions.

Authors in [11] proposed a self-test procedure for a 128-bit key AEScore. The inner cyclic behavior is exploited to test the hardware of theround while the key generation module is tested using patterns fromthe round output. Faults on the control part are not considered. Authorsclaim 0.76% of area overhead for self-test mode implementation anda test length of 12 ciphering cycles. In this paper, we propose a BISTsolution for crypto-devices implementing standard symmetric block ci-pher algorithms DES and AES. This is an extension of the work pre-sented in [12] that focuses on a specific AES architecture only. Efficientcircular self-test schemes as well as TPG and ORA functions are easilyimplemented on AES and DES crypto-cores due to the iterative processinvolved in their algorithms. Efficiency of test modes (SELF_TEST,TPG, ORA) in terms of randomness, aliasing, and cost of implementa-tion are discussed and supported by experimental results.

The paper is organized as follows. Section II presents the proposedmodification to the original secure device that allows the implementa-tion of the Self Test, the TPG, and the ORA modes. Section III ana-lyzes testability and randomness properties of block cipher algorithms,while Sections IV and V present experimental results about Self Testand TPG, respectively. Section VI discusses the properties of the de-vice when used as ORA. Eventually, Section VII concludes the paper.

II. IMPLEMENTATION OF THE TEST MODES

Fig. 1 (except shaded areas) presents a typical implementation of theAES or DES crypto-algorithm. It is mainly composed of a Key Genera-tion module and a Round module. In mission mode, after an initial op-eration (XOR between Key and Plaintext for AES, and permutation ofthe plaintext for DES), the plaintext block is looped around the Roundmodule several times (10 for AES, 16 for DES) before the final cipheris loaded into the R-out register, possibly after a final operation like thefinal permutation in DES. The widths of the data are 128 for AES and64 for DES.

The grayed areas in Fig. 1 depict the modifications to support thethree new modes: SELF_TEST, TPG, ORA.

In SELF_TEST, TPG, and mission modes, the SA signal is set to 0,so that the gray XOR operation is transparent. During the first round,

the Select signal is set to 0 to load either the plaintext or the seed forthe generator. This signal is set to 1 during next rounds.

In ORA mode, the input of the circuit is the output of the CUT. SAand Select are set to 1, so that an XOR operation is performed betweenone response of the CUT and the result of the previous round in thecrypto-core. The final signature obtained after compaction of all thetest responses is loaded into R-out. In TPG mode, R-out is loaded everyclock cycle.

During the SELF_TEST mode, an initial message M1 is en-crypted into �� � ������� and the process is repeated n times(�� � � ���������, � � �� � ��). Finally, the output data��� is stored into R-out for comparison with the expected goldenvalue. Diagnostic features can be implemented in SELF_TEST andORA modes using the write-out signal to enable the analysis ofintermediate signatures.

Concerning round key generation, either the keys are precomputedand stored in the circuit or the key generation module calculates thesequence of keys. For the latter case, AES is modified in such a way thatduring SELF-TEST, TPG, and SA modes, the tenth round key is usedas the primary key for the next round key generation. In this way, duringself-test, the key generation module receives as many different stimulias rounds. For DES, this is not necessary because the key generationmodule does not contain any logic. The round keys are simply formedof subsets of bits of the initial key.

For security reasons, we assume a shadow register for storing the keyused during the test mode. Test keys are assumed to be equals for allthe circuits since fault simulation is performed only once. Afterwards,crypto-cores can be differentiated using different secret keys for theirmission modes.

TPG, ORA, and SELF_TEST modes have been implemented in DESand AES circuits. Circuits have been described in VHDL and synthe-sized with Synopsys Design Compiler,1 using a 350-nm CMOS libraryprovided by AMS.2 The area overhead is about 6% (271 cells) for theDES and 3% (507 cells) for AES. For comparison, a 128-bits registerbuilt-in logic block observer (BILBO) [13], which supports TPG andORA functions, requires 859 cells. Our solution is a good alternativefor TPG when the secure device is already present into the design.

III. TESTABILITY AND RANDOMNESS

Security provided by block cipher algorithms such as DES and AESrelies on two main properties named Diffusion and confusion[14], [15].Confusion refers to making the relationship between the key and theciphertext as complex and involved as possible. Diffusion refers to theproperty that redundancy in the statistics of the plaintext is dissipatedin the statistics of the ciphertext. For diffusion to occur, a change in asingle bit of the plaintext should result in changing the value of manyciphertext bits. These properties are supported by the Feistel network[16] for the DES and by the substitution–permutation network for theAES. AES and DES also have two common characteristics. First, theyare iterative algorithms. DES is composed of 16 rounds while AES ismade of 10 rounds. All rounds are (quasi) identical, i.e., the result ofa round is used as the input of the next round. Second, since encryp-tion/decryption are bijective operations for a given key, each round isa bijective operation too (on a set of ��� elements for DES on a set of���� elements for AES).

The diffusion property is a very interesting feature with regard to thetest of their hardware implementation. It implies that every input bit ofa round influences many output bits, i.e., every input line of a round isin the logic cone of many output bits. In other words, an error causedby a fault in the body of the round is very likely to propagate to the

1[Online]. Available: http://www.synopsys.com2[Online]. Available: www.austriamicrosystems.com

Page 3: Self-Test Techniques for Crypto-Devices

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010 331

output. Thus, the circuit is very observable. Moreover, since rounds arebijective, the input logic cone of every output contains many inputs. Inother words, each fault is highly controllable. Therefore, these circuitsare highly testable by nature whatever the implementations. This pointmotivates the implementation of a SELF_TEST mode.

By nature, diffusion and confusion decorrelate the output valuesfrom the input ones. It has been demonstrated that by feeding backthe output to the input, the generated output sequence has randomproperties [17], [18]. This property can be used for building a pseu-dorandom test source within a secure device implementing a DES oran AES core. Note that the test sequence generated from a crypto-coreoutput can be composed of either results of successive encryptions, orresults of successive rounds since these algorithms are iterative. In thefirst case, one pattern is generated after each encryption, i.e., 10 cyclesfor AES and 16 cycles for DES. The randomness of the first solutionhas been investigated in [12] for AES. In the second case, a pattern isgenerated after every round, i.e., after every clock cycle for parallelimplementations. We will refer to 1-round-AES and 1-round-DES inthe sequel of the paper to denote this scheme. For test purposes, thissolution is highly preferable because it reduces the overall test time.However, the randomness of the sequence obtained according to thisscenario must be investigated.

Several empirical metrics can be used for evaluating the randomnessof a sequence. We chose the NIST test battery [19] that is composedof 15 statistical tests. For instance, the first test, called Frequency Test,determines whether the numbers of ones and zeros in the bit stream arebalanced.

We first compared 1.5 Mb streams issued from the rightmost bit of,respectively, 1-round-AES, 1-round-DES, and an LFSRwith internalXORs (primitive polynomial ���� � ���� � ��� � ��� � �� � �).These bit streams can be used for instance to feed a single scan chainin a CUT. LFSR seed, initial plaintexts, and key for 1-round-AES and1-round-DES have been randomly chosen. Bit streams issued from1-round-AES and 1-round-DES pass all tests while the LFSR passesonly 11 tests out of the 15 (the poor result for the LFSR is explainedby the fact that the length of the bit stream is far shorter than the wholecycle length, i.e., 1.5 Mb versus ���� � � ��).

For word-based pattern generation (e.g., multiple scan chains archi-tectures), we first checked the randomness of the streams issued fromeach output bit of the 1-round-AES, 1-round-DES, and LFSR output,considering each bit independently. Table I shows, for each NIST test,the percentage of bit streams that pass the test.

Concerning the 1-round-DES, since the 32 right bits of a DES roundare copied to the 32 left bits before execution of the next round, theright bit streams and the corresponding left bit streams are identical butshifted in time by one cycle. In order to reduce this strong dependencyamong bits (and in particular from a test usage perspective), either the32 right bits or alternatively 32 left bits should be used to test a CUT.The 64 bit streams can be used if the CUT is a combinational circuitsince there is no problem of correlation between the test data in the scanchains in this case. Statistical results of Table I (for 1-round-DES) arecollected only from the 32 right output bits.

In the case of multiple scan chains, another important issue is therandomness of the so-formed vectors�� � ���� � � � � � ���� where ��� isthe �th value of the �th bit stream. In fact, it could happen that everybit stream ����� ���� � � � � ���� is random while the sequence of vectorsis not. For instance, if the bit streams are random but equal, the se-quence of vectors is only composed of all-0s vectors and all-1s vectors.The NIST suite allows checking the randomness of bit streams only.Thus, for checking the randomness of the vectors, we built up a “bitstream” obtained by concatenating the successive vectors. The so-ob-tained 1-round-AES and 1-round-DES bit streams pass all the NISTStests, while the LFSR bit stream passes only 2 tests over 15.

TABLE IPROPORTION OF THE BIT STREAMS PASSING EVERY TEST

Another issue is the length of the so-generated test sequences. Inorder to use the random stream generated by crypto-cores as test se-quence for other cores, it is important to check whether the randomsequence presents short cycles or not. Note that the output state graphof the encryption standard (AES or DES) is composed of independentcycles of different lengths. Assuming n encryptions of R rounds each(� � �� for the AES, � � � for the DES), we expect � � dif-ferent data in the so-produced sequence. However, if the chosen initialinput and key are such that the circuit enters in a short cycle (numberof output states� ��), the number of generated data may not suf-ficient for achieving high FC. To our knowledge, there is no publishedgeneral result about the length of the cycles. Nevertheless, we conjec-ture that the length of such a cycle is quite large. Since the supportset is very large (respectively 264 and 2128 possible output states forDES and AES), and the encryption is bijective, the AES and DES algo-rithms can be considered as random permutations. As a consequence,the cycle-length probability distribution function is flat. Thus, the prob-ability that the output states fail in an insufficient length cycle is verysmall. In practice, we never observed such short cycles neither on thestates nor on individual bit streams up to 1.5 M bits.

As a conclusion of these experiments, the randomness properties ofthe bit streams issued from AES and DES crypto-cores are as goodas or even better than LFSR bit streams. Thus, the 1-round-AES and1-round DES can be considered as a candidate for pseudorandom pat-tern generation.

IV. SELF-TEST MODE

This section provides results in terms of stuck-at FC obtained on thecrypto-cores using the SELF_TEST mode. First, a theoretical approachallows precalculating the number of encryptions needed to reach 100%FC with a given confidence level. The theoretical number of requiredencryptions for 100% FC is then confirmed by fault simulation.

We considered the following aspects.1) The stream generated by a crypto-algorithm when the input is fed

by its output can be considered as random. This property has beendiscussed in Section III.

2) Clearly, the presence of a fault modifies the round output and thusthe next test pattern (circular scheme). This is taken into account inour experiments in which we injected every single stuck-at fault,fault simulated the circuit, and only observed the value of the finalsignature produced by the crypto-core (DES or AES).

Page 4: Self-Test Techniques for Crypto-Devices

332 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010

3) One of the operations of the crypto-algorithm is a substitutionfunction that is implemented by S-boxes. S-boxes represent thelargest part of the crypto-cores. Their inputs are independentlyfed by a subpart of the round inputs. We can therefore assume thatthey are fed by a random source, receiving a pattern every clockcycle.

4) An S-box needs k deterministic patterns to be fully tested and itreceives one random pattern every clock cycle.

5) Other parts of the round module (mainly wires and XOR opera-tions) receive one pattern every clock cycle as well. Since the otherparts have lower complexity than the S-boxes, it can be expectedthat they will be fully tested by the time the S-boxes received asufficient number of test patterns.

The theoretical number of required clock cycles (or patterns) canbe precomputed as the theoretical number of clock cycles for testingthe S-boxes. Equation (1) gives the length (n) of the minimal-lengthrandom sequence that includes the k targeted test patterns with a givenconfidence level P [20], with p being the probability of each pattern toappear

� �� � �� � ��

���

������

���� ����� (1)

Conversely to the so-called circular BIST approach where dedicatedsource and sink test resources (respectively TPG and ORA functions)are build from existing FFs from the CUT and extra logic (e.g., LFSR-based pseudo-random generation and signature analysis), the proposedSELF_TEST scheme only requires a slight modification of the crypto-core’s controllers. Circular BIST improvements as proposed in [21] forbreaking correlations in the test patterns, limit cycles, and jump to statesthat detect random-pattern-resistant faults do not improve the proposedapproach due to inherent random-pattern testability of the crypto-coresand their propensity to generate sufficiently long cycles.

A. DES Self-Test

We considered a confidence level of � � ��� for a sequence of� � patterns. This sequence represents an upper bound since it cor-responds to the exhaustive test set for any six-input S-box. Each pat-tern has the same probability to appear in the pseudorandom scheme,i.e., � � ��. From (1), it comes that the minimal-length random se-quence is � � �� patterns.

According to the implementation of the S-boxes and the actualnumber of required test patterns k �� � �, the length of the testprocedure can vary from 440 clock cycles (28 encryptions) to 540cycles (34 encryptions).

Experimental results confirmed our hypothesis. We fault simulatedthe DES with several keys and initial input messages. After 25 encryp-tions (i.e., 400 clock cycles), the whole circuit (round module and con-trol module) has always been fully tested.

Concerning the key generation module, it mainly consists of wiring.If the primary key is not hard-wired, a solution to test this module is touse the Boolean inverse value of the key during the last encryption. Inthis way, all the stuck-at faults in this module are tested.

B. AES Self-Test

From (1), it comes that the minimal random sequence length shouldinclude � � ��� random patterns for exhaustively testing the AESS-boxes �� � �� with a confidence level of 99%.

For various implementations of the AES S-boxes and thus for dif-ferent deterministic test sets, we calculated the length of the randomsequence for including the k targeted deterministic patterns. The min-imal length is ranging from 2400 to 2593 patterns depending on theimplementation.

TABLE IIFAULT COVERAGE

This hypothesis has been confirmed during the experiments since100% FC has been achieved on the whole circuit (round module,key generation module, controller) after 210 encryptions (i.e., 2100rounds). This experiment has been repeated with different plaintextsand secret keys as starting points, and we obtained test sequencesranging from 2100 to 2500 patterns.

V. TEST PATTERN GENERATION MODE

It has been shown in Section III that 1-round-AES and 1-round-DESdeliver sequences with good random properties. Thus, both can be usedfor random testing of other modules in the system. In this section, weinvestigate how good the sequences are compared to standard LFSRgenerated sequences for random testing. For that we fault simulatedsome circuits with the so-obtained sequences. Since 1-round-DES and1-round-AES sequences present similar random properties, we did theexperiments with 1-round-AES sequences only.

Fault simulations have been performed on ISCAS’89 benchmark cir-cuits s9234, s13207, and s38584 using the Synopsys Tetramax testsuite. They contain, respectively, 228, 669, and 1451 flip-flops. Eachcircuit has been analyzed using three scan chains configurations: singlesan chain, 16 scan chains, and 128 scan chains.

Table II reports FC obtained on these circuits. The first column givesthe length of the test sequence. The second one shows the three scanchain configurations. For the single scan chain, the scan path is fed bythe rightmost output bit of the TPG. The 16 scan chains of the secondconfiguration are fed by 16 randomly chosen output bits of the TPG.In the last configuration, the whole set of 128 output bits is used.

It can be seen that similar FC is obtained from LFSR and 1-round-AES. To go into details, among the 18 simulations presented here, thehighest FC is achieved only three times out of 18 by the LFSR TPG.As expected, the LFSR gives good results on single bit streams butit is less successful on multiple bits generation. This is explained bythe correlations that exist between the different streams issued fromthe LFSR. To overcome this problem, an extra XOR network can beimplemented between the TPG and the CUT (i.e., phase shifter) at thecost of additional hardware.

Page 5: Self-Test Techniques for Crypto-Devices

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 2, FEBRUARY 2010 333

VI. OUTPUT RESPONSE ANALYZER MODE

The role of the ORA is to compact into a single word the sequenceof test responses coming from the CUT. The so-obtained signature iscompared to the expected one.

However, two different test response sequences may lead to thesame signature if multiple erroneous responses (due to faults intothe CUT) are generated (i.e., aliasing). The probability of aliasingmeasures the signature analyzer quality. In cryptographic terms, thesignature can be seen as the hash of the input vector sequence andthe aliasing probability as the collision probability. For the proposedstructure, the aliasing probability after n test cycles is

� ����������

���

��

(2)

where � is the data path bit width. The interested reader can refer to[12] for computation details.

With the assumption that all possible errors are equally likely, thefault masking probability tends toward ��� for large n. Note that thisaliasing probability is equivalent to the one of LFSR-based analyzer(i.e., MISR) [22]

� ����������� ��� ���� �

������ � � (3)

VII. CONCLUSION

BIST approaches are effective for secure circuits since they do notrely on visible scan chains, thus preventing scan-based attacks. In thispaper, a generic BIST solution for devices including a symmetric cryp-tographic core has been presented. Conversely to standard BIST so-lution, this technique entails a negligible area overhead. We proposeda self-test procedure whose principle is to feed the core with its ownoutput and let the device run for a certain number of encryptions, andthen to compare the output of the final encryption with a precomputedsignature. In a very short test time, 100% of FC is achieved.

Finally, we also addressed the use of the crypto-cores as TPGs andORAs for the test of other cores in the circuit. Statistical analysis andexperiments show that crypto-cores are good alternatives when alreadypresent into the design.

REFERENCES

[1] National Bureau of Standards, U.S. Department of Commerce, “Dataencryption standard, federal information processing standard (FIPS),”Publication 46, 1977.

[2] D. Joan and R. Vincent, The Design of Rinjael, AES—The AdvancedEncryption Standard. New York: Springer-Verlag.

[3] Recommendation for the Triple Data Encryption Algorithm (TDEA)Block Cipher, Special Publication 800-67, Nat. Inst. StandardsTechnol. (NIST), Gaithersburg, MD, 2008.

[4] P. H. Bardell and W. H. McAnney, “Self-testing of multichip logicmodules,” in Proc. Int. Test Conf., Nov. 1982, pp. 200–204.

[5] B. Yang, K. Wu, and R. Karri, “Scan-based side-channel attack on dedi-cated hardware implementations on data encryption standard,” in Proc.Int. Test Conf., 2004, pp. 339–344.

[6] B. Yang, K. Wu, and R. Karri, “Secure scan: A design-for-test archi-tecture for crypto chips,” IEEE Trans. Comput.-Aided Design Integr.Circuits Syst., vol. 25, no. 10, pp. 2287–2293, Oct. 2006.

[7] J. Lee, M. Tehranipoor, C. Patel, and J. Plusquellic, “Securing scandesign using lock and key technique,” in Proc. IEEE Int. Symp. DefectFault Tolerance VLSI Syst. (DFT), Oct. 2005, pp. 51–62.

[8] D. Hely, F. Bancel, M.-L. Flottes, and B. Rouzeyre, “Securing scancontrol in crypto chips,” J. Electron. Test.: Theory Appl., vol. 23, no. 5,pp. 457–464, Oct. 2007.

[9] A. Schubert and W. Anheier, “On random pattern testability of cryp-tographic VLSI cores,” J. Electron. Test.: Theory Appl., vol. 16, no. 3,pp. 185–192, Jun. 2000.

[10] J. Daemen, “Cipher and Hash Function Design, Strategies based onlinear and differential cryptanalysis,” Ph.D. dissertation, KatholiekeUniversiteit Leuven, Leuven, Belgium, Mar. 1995.

[11] B. Yang and R. Karri, “Crypto BIST: A built-in self test architecturefor crypto chips,” in Proc. 2nd Workshop Fault Diagnosis ToleranceCryptography (FDTC), 2005, pp. 95–108.

[12] M. Doulcier, M.-L. Flottes, and B. Rouzeyre, “AES-based BIST: Self-test, test pattern generation and signature analysis,” in Proc. 4th IEEEInt. Symp. Electron. Des., Test Appl. (DELTA), 2008, pp. 314–321.

[13] B. Konemann, J. Mucha, and G. Zwiehoff, “Built-in logic block obser-vation technique,” in Proc. IEEE Int. Test Conf., 1979, pp. 37–41.

[14] C. Shannon, “A mathematical theory of communication,” Bell Syst.Tech. J., vol. 27, no. 4, pp. 379–423, 1948.

[15] C. Shannon, “Communication theory of secrecy systems,” Bell Syst.Tech. J., vol. 28, no. 4, pp. 656–715, 1949.

[16] H. Feistel, “Cryptography and computer privacy,” Sci. Amer. Mag., vol.228, pp. 15–23, 1973.

[17] B. Schneier, Applied Cryptography, 2nd ed. New York: Wiley, 1996.[18] P. Hellekalek and S. Wegenkittl, “Empirical evidence concerning

AES,” ACM Trans. Model. Comput. Simul., vol. 13, no. 4, pp.322–333, Oct. 2003.

[19] A Statistical Test Suite for Random and Pseudorandom Number Gener-ators for Cryptographic Applications, Special Publication 800-22, Nat.Inst. Standards Technol. (NIST), Gaithersburg, MD.

[20] S. Shioda, “Some upper and lower bounds on the coupon collectorproblem,” J. Comput. Appl. Math., vol. 200, no. 1, pp. 154–167, Mar.2007.

[21] N. A. Touba, “Circular BIST with state skipping,” IEEE Trans. VeryLarge Scale Integr. (VLSI) Syst., vol. 10, no. 5, pp. 668–672, Oct. 2002.

[22] P. H. Bardell, W. H. McAnney, and J. Savir, Built-in Test for VLSI:Pseudorandom Techniques. New York: Wiley, 1987.

Hazard-Based Detection Conditions for ImprovedTransition Fault Coverage of Scan-Based Tests

Irith Pomeranz and Sudhakar M. Reddy

Abstract—We define a new type of detection conditions for delay faults,referred to as hazard-based detection conditions, to enhance the coverage ofdelay faults using the standard scan test application methods. Some delayfaults, including irredundant faults, may be undetectable under the con-ventional detection conditions. These faults may be detectable under thehazard-based detection conditions. The use of hazard-based detection con-ditions thus improves the delay fault coverage achievable for a circuit. Weconsider transition faults under standard scan for the study in this paper.

Index Terms—Broadside tests, functional broadside tests, hazard-baseddetection, skewed-load tests.

I. INTRODUCTION

Two-pattern tests for delay faults can be applied under one of threetest application schemes. For standard scan circuits, the possible testapplication schemes are skewed-load [1] and broadside [2]. Enhancedscan circuits also allow enhanced scan tests to be applied [3]. We de-note a two-pattern test by ���� ��� ��� ���, where �� is the state and ��

Manuscript received June 28, 2008; revised September 24, 2008. First pub-lished April 14, 2009; current version published January 20, 2010. The workof I. Pomeranz was supported in part by Semiconductor Research Corporationunder Grant 2007-TJ-1643. The work of S. M. Reddy was supported in part bySemiconductor Research Corporation under Grant 2007-TJ-1642.

I. Pomeranz is with the School of Electrical and Computer Engi-neering, Purdue University, West Lafayette, IN 47907 USA (e-mail:[email protected]).

S. M. Reddy is with the Electrical and Computer Engineering Department,University of Iowa, Iowa City, IA 52242 USA.

Digital Object Identifier 10.1109/TVLSI.2008.2010216

1063-8210/$26.00 © 2009 IEEE