REDUCED VECTOR TECHNIQUE HOMOMORPHIC ENCRYPTION WITH
VERSORS
A SURVEY AND A PROPOSED APPROACH
by
Suneetha Tedla
M.C.A, Osmania University, India 1998
A dissertation submitted to the Graduate Faculty of the
University of Colorado at Colorado Springs
in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
Department of Computer Science
2019
© COPYRIGHT BY SUNEETHA TEDLA 2019
ALL RIGHTS RESERVED
This dissertation for the Doctor of Philosophy degree by
Suneetha Tedla has
been approved for the
Department of Computer Science
By
Dr. Carlos Araujo, co-Chair
Dr. C. Edward Chow, co-Chair
Dr. T.S. Kalkur
Dr. Jonathan Ventura
Dr. Yanyan Zhuang
Date: September 20, 2023
Tedla, Suneetha (Ph.D., Security)
Reduced Vector Technique Homomorphic Encryption with Versors
A Survey and a Proposed Approach
Dissertation directed by Professors Dr. Carlos Araujo and Dr. C. Edward Chow
ABSTRACT
In the last ten years, scholars have put forth many efforts to achieve the ability to change
data without decrypting it. The most famous is Craig Gentry's Fully Homomorphic
Encryption, also known as IBM's approach to homomorphic encryption, which uses ideal
lattices. Several techniques based on this approach have been published, and I review
them to demonstrate how this state-of-the-art encryption technique works. Recently,
homomorphic encryption research has encountered the natural homomorphism in the
product of vector spaces, specifically by applying Clifford algebra. Clifford algebra, also
known as geometric algebra, allows practical use of homomorphic encryption. In this
research, I use homomorphic encryption based on geometric algebra and versors as the
baseline for a new type of homomorphic encryption, which I call the "Reduced Vector
Technique." Reduced Vector Technique Homomorphic Encryption (RVTHE) is designed,
developed, and analyzed. This new cipher method is optimized to be faster and more
compact in cipher length, while preserving the security strength.
New performance criteria are proposed to generate benchmarks for homomorphic
encryption, allowing a fair comparison to the benchmarks used for non-homomorphic
encryption. The basic premise behind these benchmarks is to establish a baseline for
measuring the variation in performance between different encryption methods on Cloud
Storage Solid State Drives (SSDs). Significant throughput penalties, up to 20-50%, are
observed among encryption software methods on Cloud storage SSDs or encrypted SSDs.
The central point of this thesis is to demonstrate that homomorphic encryption is better
accomplished with the use of versors instead of multivectors. Using properties of versors,
it is possible to design a homomorphic cipher that has a simple structure and versatile key
assignment, while achieving speed that rivals existing non-homomorphic ciphers. In this
thesis, I demonstrate that versor-based homomorphic encryption is faster than an existing
non-homomorphic encryption. I developed the Reduced Vector Technique Homomorphic
Encryption (RVTHE) and show that RVTHE is a symmetric somewhat homomorphic
encryption. RVTHE was developed based on versors and Clifford Geometric Algebra
properties. The evaluation of our implementation shows a file can be edited/appended in
0.001 sec. In the case of full file encryption, RVTHE is 75% faster on encryption and 25%
slower on decryption, compared with the encryption software 'AES Crypt', which
implements the Advanced Encryption Standard (AES). The ciphertext sizes of RVTHE
are reduced on average by 25% from those of previous approaches using multivectors and
Clifford Geometric Algebra. RVTHE has the potential for use as an encryption method on
real workloads.
Keywords: Encryption, Homomorphic, AES, SSD, AES Crypt, Vectors, Versors.
ACKNOWLEDGEMENTS
I am blessed with beautiful people in my life. I am very thankful to all who
supported me with my journey of schooling. I really appreciate all the support,
encouragement, love and understanding provided by my family, friends, colleagues and
Advisory Committee.
A special thank you to Dr. Carlos Araujo and Dr. C. Edward Chow for their support,
sharing their knowledge, and guiding me for the last several years. Dr. Xiaobo Charles
Zhou advised me prior to Dr. Carlos Araujo, and I am very thankful to Dr. Xiaobo for
providing me the skills and insight needed to pursue my Ph.D. I very much enjoyed and
admired Dr. Carlos Araujo's knowledge and the way he articulates his thoughts to create a
new way of doing security, which helped me tremendously in my research. I really
appreciate Dr. Chow's support and knowledge while discussing ideas and analyzing how
to put my thoughts and ideas into action. I am very thankful to both of you. I appreciate
my Advisory Committee members, Dr. Jonathan Ventura, Dr. Yanyan Zhuang, and
Dr. T.S. Kalkur, for providing their feedback and support. Many thanks to Ali Langfels,
who helps all the students with a great smile while managing all the administrative work.
I am very thankful to my parents and my in-laws: one gave me a beautiful life, and one
provided me a beautiful life partner, with their unconditional love and support. I am
blessed with a beautiful friend, my husband Shravan Tedla, and my kids SaiKiran and
Siddhartha, and my gratitude goes to them for supporting me in all aspects of my life,
including my Ph.D. I am very thankful to my friend Tim Murphy for spending so many
hours helping me write this thesis.
I wish to dedicate this body of research to my husband and my best friend Shravan
Tedla; with him, everything is possible for me.
TABLE OF CONTENTS
CHAPTER 1.........................................................................................................................1
1 INTRODUCTION........................................................................................................1
1.1 Contributions..................................................................................................2
1.2 Security Terminology.....................................................................................2
1.3 Security systems.............................................................................................4
1.4 Cloud Storage Security...................................................................................4
1.5 Design Criteria for Cryptographic Algorithm................................................5
1.6 Encryption......................................................................................................7
1.7 Homomorphic Encryption..............................................................................7
1.8 Vector product spaces with Clifford Geometric Algebra...............................9
1.9 Reduce Vector Technique for Homomorphic Encryption.............................9
CHAPTER 2.......................................................................................................................11
2 BACKGROUND........................................................................................................11
2.1 Contributions................................................................................................11
2.2 Cloud Storage SSD.......................................................................................11
2.3 Survey of Various Encryption Approaches..................................................13
2.3.1 Block Ciphers...........................................................................................14
2.3.2 Block Cipher Modes.................................................................................22
2.3.3 Encryption Methods for SSD...................................................................28
2.3.4 Homomorphic Encryption........................................................................32
2.4 Mathematical Foundation.............................................................................33
2.4.1 Inner Product............................................................................................36
2.4.2 Outer Product............................................................................................37
2.4.3 Geometric Product....................................................................................39
2.4.4 Inverse of Vector......................................................................................40
2.4.5 Versors......................................................................................................40
CHAPTER 3......................................................................................................................43
3 PROBLEMS AND LIMITATIONS...........................................................................43
3.1 Contributions................................................................................................43
3.2 Defining the Problem...................................................................................43
3.2.1 Storage and Cloud Security......................................................................44
3.2.2 Cyber Attacks...........................................................................................45
3.2.3 Real Randomness.....................................................................................46
3.2.4 Encryption Security Limitations...............................................................47
3.3 Storage Security Limitations........................................................................48
3.3.1 SSD System Level Induced Limitations...................................................49
3.3.2 Existing research to mitigate the software limitations..............................55
CHAPTER 4.......................................................................................................................63
4 CLOUD STORAGE ENCRYPTION ANALYSIS....................................................63
4.1 Contributions................................................................................................63
4.2 Measurement Environment..........................................................................63
4.3 Fully Homomorphic Encryption Limitations...............................................78
4.3.1 Possible fully homomorphic encryption method:.....................................78
4.3.2 FHE with Vector Space............................................................................81
4.3.3 Previous homomorphic encryption using multivector technique.............81
CHAPTER 5.......................................................................................................................84
5 RVTHE.......................................................................................................................84
5.1 Contributions................................................................................................84
5.2 Design of RVTHE........................................................................................84
5.2.1 RVTHE Encryption and Decryption........................................................85
5.2.2 Encryption of RVTHE..............................................................................85
5.2.3 Decryption of RVTHE..............................................................................85
5.3 Mathematical Implementation of RVTHE Using Versors...........................86
5.4 Homomorphism of RVTHE.........................................................................87
5.4.1 Addition....................................................................................................87
5.4.2 Subtraction................................................................................................88
5.4.3 Multiplication...........................................................................................89
5.4.4 Division....................................................................................................89
5.5 Security of RVTHE......................................................................................90
CHAPTER 6.......................................................................................................................90
6 IMPLEMENTATION AND EVALUATION OF RVTHE........................................90
6.1 Contributions................................................................................................90
6.2 Implementation of RVTHE..........................................................................90
6.3 Experimental Systems..................................................................................91
6.4 Experimental Evaluations............................................................................92
6.4.1 Time measurements on various key sizes.............................................92
6.4.2 Time measurements on various file sizes.................................................93
6.4.3 Size measurements on Encrypted Files....................................................94
6.5 Security Evaluation of RVTHE....................................................................94
CHAPTER 7.......................................................................................................................97
7 FUTURE WORK AND CONCLUSION...................................................................97
7.1 Contributions................................................................................................97
7.2 Challenges and Lessons Learned.................................................................98
7.3 Success of work............................................................................................99
7.4 Future Work...............................................................................................100
7.5 Conclusion..................................................................................................100
REFERENCES.................................................................................................................100
Appendix A – Cloud Storage SSD...................................................................................116
Appendix B – Cloud Storage and Encryptions................................................................118
Appendix C – Multi-Vector Based Encryption................................................................124
Appendix D – RVTHE.....................................................................................................142
Appendix E – Acronym List............................................................................................161
LIST OF FIGURES
Figure 1 - Data Encryption Standard [29]....................................................................16
Figure 2 - TDEA [29]..................................................................................................17
Figure 3 – AES ............................................................................................................18
Figure 4 - Blowfish Algorithm ....................................................................................20
Figure 5 – Twofish [38]...............................................................................................21
Figure 6 – Serpent Algorithm - [42].............................................................................22
Figure 7 - CBC Encryption and Decryption.................................................................24
Figure 8 - CFB mode with 8 bits..................................................................................25
Figure 9 - XTS mode....................................................................................................27
Figure 10 - GCM mode................................................................................................28
Figure 11 - Outer Product.............................................................................................40
Figure 12 - Address Mapping between physical to logical..........................................52
Figure 13 - Flashes and their parallel architecture.......................................................56
Figure 14 - Consumer Vs Enterprise SSD...................................................................58
LIST OF GRAPHS
Graph 1 - IOPS Vs Block Size.....................................................................................72
Graph 2 - Parallelism Vs Throughput..........................................................73
Graph 3 - Random Versus Sequential Operations.......................................................74
Graph 4 - t2.micro Block Size Versus IOPS................................................................75
Graph 5 - t2.micro Block Size Versus KB/Sec............................................................75
Graph 6 - Encrypted SSD Block Size Versus IOPS.....................................................77
Graph 7 - Best Crypt Block Size Vs IOPS...................................................................77
Graph 8 - Dm-crypt Block Size Vs IOPS....................................................................78
Graph 9 - Encrypted EBS SSD Volume Block Size Versus throughput.....................78
Graph 10 - BestCrypt Block Size Versus Throughput.................................................79
Graph 11 - Dm-Crypt Block Size Versus Throughput................................................79
Graph 12 - Encryption Methods versus IOPS..............................................................80
Graph 13 - Encryption Methods versus Throughput....................................................80
Graph 14 - Read workloads for various Block Sizes...................................................81
Graph 15 – Write workloads IOPS for various Block Sizes........................................81
Graph 16 - Mixed Workloads IOPS for Various Block Sizes......................................81
Graph 17 - Multivector Based Homomorphic Encryption...........................................86
Graph 18 - Multivector based encrypted file sizes.......................................................87
Graph 19 - Key size Vs Encryption/Decryption time in Sec.......................................97
Graph 20 - File size and Encryption/Decryption times................................................98
Graph 21 - Encrypted file sizes in MB.........................................................99
LIST OF TABLES
Table 1 - AES Key Size and Number of Rounds..........................................................19
Table 2 - Key and data location in versors...................................................................89
CHAPTER 1
1 INTRODUCTION
Rapid changes in information technology, specifically the need to use data from
anywhere, are leading users to use Cloud environments with the expectations of
availability (able to provide the data access as needed), reliability, solid integrity
(maintain the data reliability accuracy throughout its life cycle), and full security
(assuring the data is accessed by only authorized parties with authorized level of access).
In this digital age, protecting PII (Personally Identifiable Information) is imperative.
Tax IDs, Medical Information, Credit Information, and other extremely sensitive data
needs to be secured at the highest level, because it can be used for Identity theft and other
information crimes [1]. Various methods or processes are implemented to secure the data;
among these methods, encryption techniques are the most commonly used. Scholars have
been implementing different cryptographic algorithms and methods such as the
following: Secure Channel, Public-Key encryption, Digital Signatures, and PKI.
Cryptographic algorithms consist of Block Ciphers (DES, AES, Serpent, and Twofish),
Block Cipher Modes (Padding, ECB, CBC, Fixed IV, Counter IV, Random IV, Nonce-
Generated IV, OFB, CTR, Combined Encryption and Authentication), and Hash functions
(MD5, SHA-1, SHA-2, SHA-256, SHA-512). But even with all these encryption methods,
each one requires full decryption of all the data, including the sensitive data. Also, I
observed significant throughput penalties, up to 20-50%, using encryption software
methods on Cloud storage SSDs or encrypted SSDs. Fully Homomorphic Encryption
(FHE) allows computing on encrypted data without decrypting it, keeping sensitive data
encrypted and thus not exposed [2] [3].
This thesis is organized as follows: Chapter 1 introduces the research. Chapter 2 presents
background, including the most common techniques to secure systems and data and the
mathematical foundation. Chapter 3 discusses problems and limitations. Chapter 4
analyzes cloud storage encryption and shows proof of the performance penalties of cloud
storage SSDs and encryption software methods. Chapter 5 presents RVTHE. Chapter 6
covers the implementation and evaluation of RVTHE on real workloads. Chapter 7
discusses future work and concludes the thesis.
1.1 Contributions of the dissertation research
This chapter introduces the research through a general survey of overall security and
storage. It discusses the terminology of storage, the cloud, security systems, and various
encryption methods and ciphers.
1.2 Security Terminology
The word "security" originated from late Middle English, from Old French "securite" or
Latin "securitas", from "securus", 'free from care'; an example usage is "check to ensure
that all nuts and bolts are secure" [4]. The following are some of the most used terms in
the field of cyber security. They help to clearly define their roles in Information
Technology System Security [5].
Assurance: Specific security method implementation that has adequately met
these four security goals: integrity, availability, confidentiality, and accountability.
Integrity: Ensuring the data is intact, with all modifications made only with proper,
allowable authenticity.
Availability: Able to provide timely reliable access to an entity.
Confidentiality: Ensuring that data is disclosed only to authorized parties.
Accountability: Principle that an authorized individual is responsible to follow
the safeguard controls of the system.
Asymmetric Encryption: An encryption method that uses two unique keys, a
public key for encryption and a private key for decryption. It is computationally
infeasible to derive the private key from the public key.
Authentication: Able to verify the identity of an individual or system accessing
an entity.
Block Cipher: A cipher that operates on fixed-size arrays of bytes, which serve as
the input, output, state, and round key in the encryption process.
o State: Intermediate Cipher of encryption process.
o Round Key: Values derived from Cipher Key.
Cipher and Ciphertext: A cipher is a procedure containing a series of operations
that converts plaintext to ciphertext; ciphertext is the output generated by applying
a cipher to plaintext.
Classified Information: Information requiring the highest level of security and
mandating authorized access.
Cloud Computing: Way to provide network access of shared resources that can
be rapidly provisioned with minimal effort.
Cryptography: Study that incorporates the foundations, mechanisms, or methods
used to hide data and protect it from unauthorized access.
Cyber Attack: Intentionally disrupting the assurance of a system or the data.
Decryption: A technique of converting ciphertext to plaintext.
Encryption: A technique of converting plaintext to ciphertext.
Key: A secret code needed to perform encryption and decryption.
Private Key: A key needed for the decrypting process of asymmetric encryption.
Public Key: A key needed for the encrypting process of asymmetric encryption.
Reliability: The ability of a system to perform consistently with quality.
Symmetric Encryption: A form of encryption that uses the same key for the
encryption and decryption processes.
User: An individual who has the proper level of authorization to access the system.
1.3 Security systems
“A security system is only as strong as its weakest link.” [6]
We can only guarantee a level of security for a system that matches how strongly we
have secured its weakest links. Creating an attack tree for any real system provides
insight into the possible lines of attack [7]. If we leave one single weak link, the rest of
the system is just as vulnerable, even with the strongest security elsewhere. Secrecy
systems break down into three categories:
Concealment Systems hide the existence of a message behind a fake covering.
Privacy Systems need special equipment to recover the original message.
"True" Secrecy Systems use a cipher for recovering the message.
To build a "True" secrecy system, one must follow the design criteria for a
cryptographic algorithm [8].
1.4 Cloud Storage Security
The Cloud uses SSDs, and there has been a lot of research done related to SSD
characteristics, internal design, and performance for different types of workloads [9] [10].
Previous studies have shown that SSDs outperform HDDs in speed when accessing data
from each device [11], but this research did not consider encryption on an SSD. There
has also been a lot of research related to different types of encryption methods, attack
vulnerabilities, and secure methods [6] [12]. These existing algorithms were suitable for
regular HDDs, but they may not be optimal for SSDs: the physical structure of an SSD is
different, so encryption algorithms designed for HDDs might not be ideal, or even
compatible, for SSDs.
There is a need for research to ensure these encryption methods are good enough; they
can be measured by calculating their impact on SSDs in terms of performance and
security. This means rethinking whether existing encryption algorithms are good for
SSDs, or coming up with new algorithms that accommodate new environments like the
Cloud. The best encryption method can be found through an assessment of existing and
new encryption methods. For this, I first study the physical and logical limitations of
SSDs.
Prior research showed that workload performance improved when SSDs were added to,
or used as, the storage. SSDs are faster than HDDs [39] [13], so adding them to the
storage system is expected to improve performance. Very little research has examined
the performance impact of different types of workloads under different encryption
methodologies. When exploring what type of encryption is better for the cloud, we need
to consider data at all stages, which means data in transit and data at rest for the cloud
[14]. This can be accomplished by using fully data-centric security [15]. It can also be
accomplished by using homomorphic encryption methods.
1.5 Design Criteria for Cryptographic Algorithm
Encryption is a small component of the system but provides a higher level of security
during cyber-attacks [6]. Encryption is the original goal of cryptography. Encryption
converts plain text into unreadable data which is also called ciphertext. A good encryption
makes it impossible to find the plaintext from the ciphertext without knowing the key.
With good encryption, the only information that will be accessible is the plaintext length
6
and the time stamp [16]. The following are some of the design principles that will help to
generate stronger cipher [6].
The algorithm should provide effective security, should be easy to use, and should be
completely stated.
Security should depend on the key secrecy, not on the algorithm secrecy.
Algorithm should be available to users, adaptable to applications and systems.
Algorithm must be implementable on a targeted system.
Algorithm should be efficient, verifiable, and portable between systems.
The cipher must be dependent on the key, and modifications in the message should not
unmask the key. Randomness of the key is critical for the security of the system, and the
key must be hard to generate or guess [8]. In 1999, NIST evaluated the AES candidate
encryption methods against criteria divided into three major categories [17]:
Security:
Resistance of the algorithm to cryptanalysis.
Soundness of its mathematical basis.
Randomness of the algorithm output.
Cost:
Licensing requirements.
Computation efficiency on various platforms.
Algorithm and Implementation Characteristics:
Flexibility.
Hardware and software suitability.
Algorithm simplicity.
Key and block size agility.
1.6 Encryption
Encryption is a practical technical derivative of cryptography, and building the optimal
encryption technique is still very important. In the encryption process, the "key" plays a
central role in encrypting and decrypting data; without the key the data cannot be
interpreted. The strength of the key depends on its secrecy, randomness, length (size), and
complexity. Over the years, encryption processes have become more complex with each
iteration of ciphertext generation, and various encryption methods use a unique key
generated for each iteration. However, Kerckhoffs' principle states that the "security of
the encryption depends on the secrecy of the key, not the algorithm." This means that
even if everybody knows how the key is applied in the algorithm, the complexity and
secrecy of the key are all that matter. Most of the common cryptographic methods follow
this principle.
In 1997, the NIST (National Institute of Standards and Technology) received fifteen
new security algorithms from twelve countries. Out of these encryption methods, MARS,
RC6, Rijndael, Serpent, and Twofish were selected as finalists [18]. Out of these finalists
Rijndael, Serpent, and Twofish took top 3 places respectively. The winning algorithm,
Rijndael, also called AES (Advanced Encryption Standard) is still in use by different
encryption methods [19]. All these methods are symmetric encryption ciphers.
1.7 Homomorphic Encryption
The idea of homomorphism for cryptography was first theorized in 1978 by Rivest,
Adleman, and Dertouzos [20]. Homomorphism can be defined in abstract algebra in
terms of functions and algebraic structures: once a function (map) is applied to an
algebraic structure, the result still holds the same algebraic structure from the domain to
the range of the algebraic sets. In group theory, homomorphism theorems are developed
on subgroups as quotient groups. Ideals, introduced in the 19th century, played a parallel
role in defining quotient rings and the comparable homomorphism theorems in ring
theory [21]. In algebra, if A and B are the same type of algebraic structure, a mapping
function f from A to B is a homomorphism from A to B.
For a map f from A → B, an operation µ of arity k, and elements a1, a2, …, ak in A:
f(µ(a1, a2, …, ak)) = µ(f(a1), f(a2), …, f(ak)) (1.1)
When the mapping from A to B preserves µ and f is an epimorphism, B is a
homomorphic image of A. When a homomorphism holds a one-to-one and onto relation
(a bijection), it is an isomorphism, noted A ≅ B.
This same homomorphism also can be derived using lattices, groups, modules, and
monoids [22]. In groups, homomorphism is a category of isomorphism when the
homomorphism must be a bijection. If the A and B are two rings, and f is a function from
A to B, where A is the domain of f and B is the range of f , then each a element belongs
to A, and f ( a )belongs to B. This homomorphism can hold addition, subtraction, and
multiplication algebraic operations (¿). It can be showed as below.
f(a ∗ a′) = f(a) ∗ f(a′) (1.2)
If f satisfies the above, the following are true:
f(0) = 0 (1.3)
f(1) = 1 (1.4)
f(−a) = −f(a) (1.5)
if f(a) = f(b) then a = b (1.6)
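The preservation properties (1.2)-(1.5) can be checked concretely for the reduction map f(a) = a mod n from the integers to Z/nZ; this is a minimal illustrative sketch, not part of the cited material:

```python
# Ring homomorphism f: Z -> Z/nZ given by f(a) = a mod n.
# It preserves addition, multiplication, negation, and identities.
n = 7

def f(a: int) -> int:
    return a % n

for a in range(-20, 21):
    for b in range(-20, 21):
        assert f(a + b) == (f(a) + f(b)) % n   # additive structure
        assert f(a * b) == (f(a) * f(b)) % n   # multiplicative structure, eq. (1.2)
    assert f(-a) == (-f(a)) % n                # eq. (1.5)
assert f(0) == 0 and f(1) == 1                 # eqs. (1.3) and (1.4)
```

Note that this f is not injective (f(0) = f(7)), so property (1.6) holds only for injective homomorphisms.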
9
If the properties of homomorphism are incorporated in an encryption method or
cipher, then it is a homomorphic encryption method. Homomorphic encryption can be
organized into three approaches: partial, somewhat, and fully homomorphic. Partial Homomorphic
Encryption allows only one operation with unlimited iterations. Somewhat Homomorphic
Encryption allows more than one but not all types of operations and limits the iterations.
Fully Homomorphic Encryption allows all types of operations with unlimited iterations
[23].
FHE (Fully Homomorphic Encryption) can be defined as applying an encryption
method (E) on data 1 (D1) and data 2 (D2), where '⨳' represents any operation
(addition, subtraction, multiplication, or division).
This is the mathematical representation: E(D1 ⨳ D2) = E(D1) ⨳ E(D2)
The first feasible form of FHE was proposed by Craig Gentry in 2009 using ideal
lattices with “bootstrappable” encryption methods [2].
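As a concrete (insecure, purely illustrative) instance of a homomorphic property, textbook RSA is homomorphic with respect to multiplication: multiplying two ciphertexts yields the encryption of the product of the plaintexts. The tiny parameters below are for demonstration only:

```python
# Textbook RSA with toy parameters (never use in practice).
p, q = 61, 53
n = p * q                  # modulus
e = 17                     # public exponent
d = 2753                   # private exponent: e*d = 1 (mod (p-1)*(q-1))

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

m1, m2 = 7, 9
# Multiplicative homomorphism: E(m1) * E(m2) = E(m1 * m2) (mod n)
assert (encrypt(m1) * encrypt(m2)) % n == encrypt((m1 * m2) % n)
assert decrypt((encrypt(m1) * encrypt(m2)) % n) == (m1 * m2) % n
```

Textbook RSA is only partially homomorphic (multiplication only); Gentry's construction is what extends homomorphic evaluation to arbitrary operations.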
1.8 Vector product spaces with Clifford Geometric Algebra
Mathematics has long served as the basis for encryption methods. For
example, RSA (Rivest–Shamir–Adleman) uses number theory: its security rests on the
difficulty of factoring keys built from large prime numbers [24].
AES uses mathematics in the form of bit manipulations to increase the "diffusion" of
ciphertext and register-based operations to increase "confusion" of the shared key.
Applying Clifford Geometric Algebra to vector product spaces gives results
that are intractable to invert, because the output is a vector with a different
direction, space, or volume. The geometric product, a Clifford Geometric Algebra
operation, is an extension of the inner product of vectors, and it represents
geometric objects of all dimensions in a vector space. Versors represent the
geometric product of multiple vectors and retain the properties of vectors in the
vector space. Selecting multiple vectors with smaller dimensions and performing a
geometric product on them results in an intractable vector in the vector product space.
1.9 Reduced Vector Technique for Homomorphic Encryption
The use of multivectors for homomorphic encryption was demonstrated by
David Williams Honorio Araujo Da Silva in his master's thesis; his algorithm was
designed using a concept invented by Dr. Carlos Paz de Araujo in 2017. However,
there is another way to use vectors with the geometric product in vector space. Versors
are vectors in the geometric product space which have simpler inverse characteristics.
RVTHE (Reduced Vector Technique for Homomorphic Encryption) is a cryptographic
cipher powered by Clifford Geometric Algebra and versors. This approach is an
efficient method for encryption, decryption, and real-time usage [25].
Securing data involves two stages: data at rest and data in transit. "Data at
rest" describes data before or after it is sent to a server, storage, or cloud. "Data
in transit" describes data moving between a client and a server, storage, or
cloud. I will refer to these two stages in this paper as ESD security (Every Stage of Data).
Enterprises have been using secure networks, servers, and storage, but the data has not
always been stored in a secure state; therefore, there is a need for fully data-centric
security. Data-centric encryption is a way to achieve data-centric security. RVTHE is a
data-centric encryption cipher which is simple to implement and provides ESD security
for the entire system. This method requires fewer resources to encrypt and decrypt, and it
offers real-time data updates. It is also scalable and adaptable from small devices to large
enterprise storage.
CHAPTER 2
BACKGROUND
2.1 Contributions
This chapter discusses the research related to cloud VM storage SSDs and presents a
survey of security methods, mainly covering SSD storage device characteristics,
various encryption approaches, and the mathematical foundation of the new cipher.
2.2 Cloud Storage SSD
Most cloud environments use SSDs as data storage or as flash cache to
increase performance. Data is stored regardless of power availability. An SSD does
not contain an actual disk (platter) as a traditional HDD (Hard Disk Drive) does. SSD
technology uses electronic interfaces like SATA (Serial ATA) Express for compatibility
with any host. It also uses the typical block input/output (I/O) provided by any
host, thus permitting simple replacement of traditional hard disk drive technology in
common applications. SSDs are used as the primary data storage for communication devices,
storage systems, modern computers, etc. [11].
From a security perspective, an SSD strives to achieve the best data reliability,
integrity, secure deletion, and encryption, given the unique physical nature of the device.
These aspects depend on ECCs, reliably erasing data from the storage media (no digital
footprint), and proper encryption methods. SSDs' built-in commands are effective for
ECCs and deletion, but manufacturers sometimes implement them incorrectly. However,
previous research has been done to solve some of the above issues by implementing a
variety of different approaches for achieving better ECCs and encryption methods.
Previous research had not considered the performance sacrificed to encryption when
implementing these methods. This thesis considers those factors in the form of SSD
performance in IOPS for different workloads during the encryption process.
2.2.1.1 Data Reliability and Integrity
ECC is one of the functions of the FTL. ECC schemes are implemented to ensure the raw
reliability of data, but they usually add resource overhead and thus impact
performance. The reliability of conventional ECCs, such as the commonly used BCH
(Bose-Chaudhuri-Hocquenghem) code, degrades as SSD capacity grows. It is important to
implement a powerful ECC engine with LDPC (low-density parity-check) codes to
improve the reliability of SSDs [26]. Previous research proposed different ECC
approaches to increase data reliability. One such approach is a
lightweight EDC (Error Detection Code) per block to achieve better cache
performance [27].
2.2.1.2 Sanitization and Secure Deletion of SSD
The physical architecture of a non-encrypted SSD has limitations for sanitizing
the disk or securely deleting a file. If the vendor did not implement the
host interface's built-in commands correctly, sanitization of the SSD will not be achieved.
No sanitizing technique that worked for HDDs is guaranteed to work for
an SSD. Usually, sanitization of an SSD is attempted by writing to the visible address
space twice using FTL procedures. But this is a time-consuming process and it is not
true sanitization, because it does not take care of the invisible address space (files marked as
deleted whose physical data still exists). The data in an SSD can be erased with erasure-based
sanitization techniques (overwriting the disk with multiple I/O operations) that may be able
to sanitize the SSD, but these techniques have shortcomings and fail to achieve real
sanitization [28].
Completely deleting and securely erasing data on an SSD is challenging. For that reason,
storing unencrypted data creates a risk of exposing that data to unauthorized access. And
although erasing files via sanitization methods will make the SSD more secure, it also
creates a lot of wear and tear on the device, which will shorten its lifespan. To avoid these
problems, the best option is encrypting the data on an SSD. Previous research created two
methods for encrypting files on SSDs: node-level and password-based file-level
encryption. Node-level encryption encrypts the nodes and stores keys in a dedicated
KSA (Key Storage Area). The concern is that KSA blocks can turn into bad blocks, at
which point they can be read [29]. In password-based file-level encryption, files are
encrypted using passwords, but encryption and deletion of the files is slow and accessing
the files each time is tedious [30].
Even with all the challenges of encryption, it is still the best option for securing the
data on an SSD.
2.3 Survey of Various Encryption Approaches
This section describes encryption methods and algorithms used in encryption software
for SSDs and examines their strengths and weaknesses in detail. It also discusses real
randomness for creating keys and the most common types of existing encryption methods.
2.3.1 Block Ciphers
An encryption function on fixed-size blocks of data is called a block cipher. "A
secure block cipher is one for which no attack exists" [6]. A block cipher encrypts a
fixed-size block of plaintext and generates a same-size block of ciphertext using a
secret key (the key size can differ from the block size). Without the secret key,
no plaintext can be produced from the ciphertext. Security of a block cipher is also defined
in terms of attacks: non-generic methods to differentiate between the block cipher and an
'ideal block cipher' [6]. "Block cipher written in terms of E(K,p) or Ek(p) for encryption of
plaintext p with key K and D(K,c) or Dk(c) for decryption of ciphertext c with key K" [6].
In block cipher encryption the key is a critical component; changing a single bit of the key
value can result in a completely different ciphertext [6].
For each key value, a block cipher is a permutation on k-bit values, mapping each of
the 2^k possible inputs to a k-bit ciphertext [6]. For a single permutation on 128-bit
values, this corresponds to a lookup table of 2^128 cipher values (each 128 bits). The ideal
cipher has an independently random permutation for each key value, as if the lookup
table were chosen at random. A "distinguisher" is an algorithm that is given black-box
access to either the block cipher or an ideal block cipher and tries to tell which one it is
interacting with; it has no knowledge of the internal workings of the black-box function.
The amount of computation a distinguisher can perform is limited, since unbounded
computation would make the notion trivial. A practical block cipher
should be designed so that, for each key, the encryption function appears to be a
randomly chosen invertible function [6].
A block cipher approaches an 'ideal block cipher' if it can withstand attacks such as known
plaintext, ciphertext only, related key, chosen plaintext, and other types of attacks. SSD
encryption software uses one or more of the following block ciphers; the following
sections will address them and some of their attacks.
2.3.1.1 DES (Data Encryption Standard)
The DES standard specifies both enciphering and deciphering operations, which are
based on a binary number called a key. DES uses a Feistel cipher design with 16 rounds.
Figure 1 - Data Encryption Standard [24]
In Figure 1, DES starts with a 64-bit input block (binary digits) and a 64-bit key, from
which a 56-bit working key is derived; the remaining 8 bits are used for error detection,
setting the parity of each 8-bit byte so that every byte has an odd number of "1"s. In each
round, a 48-bit subkey derived from the 56-bit key is XORed with the data; these XOR
operations, along with permutations, make the final cipher [18].
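The parity convention above can be sketched directly; this is an illustrative snippet (not production key handling) that forces each byte of a 64-bit key to odd parity via its least significant bit:

```python
# Force each byte of a DES key to odd parity, using the low bit
# of every byte as the parity bit (the DES convention).
def set_odd_parity(key: bytes) -> bytes:
    out = bytearray()
    for b in key:
        if bin(b >> 1).count("1") % 2 == 0:   # the 7 key bits have even weight
            out.append((b & 0xFE) | 1)        # set parity bit to 1
        else:
            out.append(b & 0xFE)              # clear parity bit
    return bytes(out)

key = set_odd_parity(bytes(range(8)))
assert all(bin(b).count("1") % 2 == 1 for b in key)  # every byte has odd parity
```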
Figure 2 - TDEA [24]
In Figure 2, a 3DES (TDEA) key is made up of three DES keys,
which are also referred to as a key bundle. The keys inside the key bundle are different
from each other. This key bundle is used for encryption and decryption. The encryption
process starts with encrypting using the first key, decrypting using the second key and
then encrypting using the third key. The decryption process follows the reverse order of
encryption process. The encryption algorithms specified in this standard are commonly
known among those using standard encryption [31].
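The encrypt-decrypt-encrypt (EDE) composition can be sketched with a stand-in cipher; here a simple XOR "cipher" (purely illustrative, where encryption and decryption coincide) shows how the three keys of the bundle are applied and reversed:

```python
# TDEA-style EDE composition with a toy XOR "block cipher".
# E(k, p) = p XOR k; for XOR, decryption is the same operation.
def E(k: int, p: int) -> int:
    return p ^ k

def D(k: int, c: int) -> int:
    return c ^ k

def ede_encrypt(k1: int, k2: int, k3: int, p: int) -> int:
    return E(k3, D(k2, E(k1, p)))      # encrypt-decrypt-encrypt

def ede_decrypt(k1: int, k2: int, k3: int, c: int) -> int:
    return D(k1, E(k2, D(k3, c)))      # reverse order: decrypt-encrypt-decrypt

k1, k2, k3, p = 0x1234, 0xBEEF, 0x0F0F, 0xCAFE
assert ede_decrypt(k1, k2, k3, ede_encrypt(k1, k2, k3, p)) == p
```

With a real DES primitive the same composition applies; with XOR it degenerates to a single XOR with the combined keys, which is exactly why 3DES must use a real cipher for each stage.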
3DES was heavily used by organizations until researchers discovered active collision
attacks on its modes of operation (e.g., CBC, CTR, GCM, OCB, etc.) [32]. The small key
size, the 64-bit data block size, and the use of the same key for encryption became a
vulnerability, because the same ciphertext appears roughly every two to the power of half
the block size blocks (2^32). These matching collision ciphers expose the cipher to attacks
like birthday attacks. Due to the XOR operation, the XOR of the plaintexts can be
recovered from colliding ciphertexts. A cipher collision alone is not enough to discover
the plaintext, but combined with reuse of the same secret key and some fraction of known
plaintext, it makes successful attacks easier.
18
Due to ever-increasing computing power, these attacks are more easily carried out using
various attack methods like the man-in-the-browser attack [32].
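The collision leak described above can be demonstrated with a toy cipher. The sketch below runs CBC over 8-bit blocks (so a collision is guaranteed by the pigeonhole principle) and checks that two equal ciphertext blocks reveal the XOR of the corresponding plaintext blocks; the toy permutation is an assumption for illustration, not a real cipher:

```python
import random

# Toy 8-bit block cipher: a fixed random permutation of 0..255.
random.seed(1)
perm = list(range(256))
random.shuffle(perm)
E = lambda p: perm[p]

# CBC encryption: C[i] = E(P[i] XOR C[i-1]), with C[-1] = IV.
iv = 0
plain = [random.randrange(256) for _ in range(300)]
cipher, prev = [], iv
for p in plain:
    c = E(p ^ prev)
    cipher.append(c)
    prev = c

# With 300 ciphertext blocks over only 256 values, a collision must exist.
seen = {}
for j, c in enumerate(cipher):
    if c in seen:
        i = seen[c]
        # E is a permutation, so C[i] == C[j] forces equal inputs:
        # P[i] XOR C[i-1] == P[j] XOR C[j-1], leaking the plaintext XOR.
        ci_prev = cipher[i - 1] if i > 0 else iv
        cj_prev = cipher[j - 1] if j > 0 else iv
        assert plain[i] ^ plain[j] == ci_prev ^ cj_prev
        break
    seen[c] = j
```

Real 3DES has 64-bit blocks, so the same collision appears only after about 2^32 blocks, which is still reachable in long-lived sessions.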
2.3.1.2 AES (Advanced Encryption Standard)
Figure 3 – AES encryption process 1
In Figure 3, AES is a symmetric encryption algorithm that was created to replace DES and
3DES encryption. Joan Daemen and Vincent Rijmen developed AES, and it was
adopted as the NIST encryption standard in 2001. Figure 3 shows the
AES encryption process.
AES keys can be 128, 192, or 256 bits long. AES encrypts a 128-bit block of plaintext
and generates a 128-bit block of ciphertext, using one key for both encryption and
decryption. AES encryption consists of repeated rounds of the following steps: SubBytes
(replacing bytes using the S-box table), ShiftRows, MixColumns, and AddRoundKey.
The last round performs all of these steps except MixColumns. AES decryption reverses
the encryption process.
1 https://www.researchgate.net/figure/257006826_fig1_Fig-5-e-Block-diagram-of-the-Rijndael-AES-encryption-algorithm
Key Size    Total Rounds
128         10
192         12
256         14
Table 1 - AES Key Size and Number of Rounds.
Table 1 shows the total rounds performed based on key size [33] [34]. In recent years
most AES implementations use the 256-bit key length instead of the 192-bit key.
Even though AES-256 has a longer key, the way its key schedule is designed makes it
more vulnerable to sub-key attacks. AES is subject to a theoretical brute-force attack, but
even with current technology it would take on the order of a quintillion years to break the
encryption key. There are additional theoretical attacks documented: cryptanalytic
attacks, related-key attacks on AES-192 and AES-256, meet-in-the-middle attacks on
AES-128, and a first key-recovery attack on all of AES. Exploits of AES-256 have
received more focus from the security community than AES-128 and AES-192. Despite
this, all AES versions are considered unbreakable by today's technology [34] [35].
2.3.1.3 Blowfish
Blowfish is a symmetric block cipher that was designed by Bruce Schneier in 1993 to
replace DES.
Figure 4 - Blowfish Algorithm 2
The original design of Blowfish manipulates data in 64-bit blocks with a variable key
size scaling from 32 bits to 256 bits. In Figure 4, the algorithm uses
the XOR operation, table lookups (S-boxes), and modular addition. It has the same
Feistel structure as the DES algorithm. The algorithm uses precomputable sub-keys to
increase the speed of encryption. A year later, to increase security, the maximum key
size was increased from 256 bits to 448 bits and published in Dr. Dobb's Journal [36].
If a small key length is chosen, Blowfish can produce weak keys,
making it vulnerable to chosen-key and related-key attacks. Due to
its Feistel structure and key-dependent S-box substitution, it is also prone to slide attacks
and simple power attacks. Because Blowfish is a block cipher, it is vulnerable to the
attacks block ciphers are already prone to, such as side-channel, exhaustive-search, and
birthday attacks, to name a few [37].
2 http://www.sm.luth.se/csee/courses/smd/102/lek3/lek3.html
2.3.1.4 Twofish
The Twofish symmetric encryption algorithm is similar to AES. It uses key lengths of
128, 192, and 256 bits with a 128-bit block. The National Institute of Standards and
Technology selected it as one of the top five finalists, but it was not selected for
standardization in the end. Still, recently developed encryption software for storage and
file systems incorporates this algorithm (e.g., TrueCrypt, BestCrypt, dm-crypt, and
DiskCryptor). The Twofish algorithm is one of the ciphers included in the OpenPGP
standard, and it is free with no restrictions.
Figure 5 – Twofish [38]
The Twofish algorithm uses key-dependent S-boxes and a predefined key schedule. Half
of the key is used for encryption, and the other half is used for the S-box lookup,
modifying the encryption algorithm. Twofish's design looks like a mix
of DES and AES: one half is like DES in that it uses a Feistel structure, and the other half
is like AES in that it uses S-boxes and a Maximum Distance Separable matrix. Twofish's
128-bit key encryption is slower than its AES counterpart, but its 256-bit key encryption
is faster [38].
Researchers claimed that, when weak key pairs were present, a Twofish cipher might be
vulnerable to partial chosen-key and related-key attacks. But it was determined that the
existence of these key pairs was not realistic, so the proposed attacks would not work [39].
Over time, scholars found that there are vulnerabilities in the Twofish cipher after all.
One attack, SPA (Simple Power Analysis), revealed the secret key of the cipher: because
Twofish uses S-boxes with 8-bit predefined permutations in its round operations, it is
prone to a side-channel attack that can discover the encryption key in one iteration [40].
2.3.1.5 Serpent
Serpent is also a block cipher; it was published in 1998 by Ross Anderson, Eli
Biham, and Lars Knudsen. This algorithm was selected as one of the finalists by the US
National Institute of Standards and Technology [41].
Figure 6 – Serpent Algorithm - [42]
Serpent uses 128, 192, and 256-bit key lengths, and it operates on 32-bit words in a
substitution-permutation network with 4-bit S-boxes, running 32 rounds of substitutions
and key-mixing operations [41].
In 2011, a cryptographic analysis using a multidimensional linear method was performed
to find vulnerabilities in Serpent. Researchers showed that a reduced version of Serpent
with a 128-bit key breaks at 11 rounds, with the key-mixing operations revealing the
encryption key [43].
2.3.2 Block Cipher Modes
A block cipher mode repeatedly applies a cipher's single-block operation to protect data
larger than one block, providing confidentiality and, in some modes, authenticity. Modes
adapt the cipher to different operating environments and requirements. The following are
some of the most commonly used block cipher modes.
2.3.2.1 CBC (Cipher Block Chaining) Mode
CBC mode uses an IV (initialization vector) XORed with the first plaintext block before
encryption. The method uses the previous ciphertext block to encrypt the next plaintext
block: each ciphertext block is stored in a feedback register and XORed with the next
plaintext block before encryption. This process repeats until all the plaintext has been
processed, so from the second block onward every block depends on the previous blocks.
Decryption applies the same idea in reverse: each ciphertext block is decrypted with the
key and the result is XORed with the previous ciphertext block to recover the plaintext
block. After each decryption cycle the ciphertext block is stored in the feedback register.
Encryption: Ci = Ek(Pi ⊕ Ci − 1) and Decryption: Pi = Ci − 1 ⊕ Dk(Ci)
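The two equations can be exercised end-to-end with a stand-in block cipher; the 8-bit keyed permutation below is an illustrative assumption standing in for a real cipher such as AES:

```python
import random

# Toy keyed 8-bit block cipher: the key seeds a random permutation.
def make_cipher(key: int):
    rng = random.Random(key)
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, v in enumerate(perm):
        inv[v] = i
    return perm.__getitem__, inv.__getitem__   # (Ek, Dk)

Ek, Dk = make_cipher(key=42)

def cbc_encrypt(iv: int, blocks):
    out, prev = [], iv
    for p in blocks:
        c = Ek(p ^ prev)          # Ci = Ek(Pi XOR Ci-1)
        out.append(c)
        prev = c
    return out

def cbc_decrypt(iv: int, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(prev ^ Dk(c))  # Pi = Ci-1 XOR Dk(Ci)
        prev = c
    return out

plain = [0x48, 0x65, 0x6C, 0x6C, 0x6F]
assert cbc_decrypt(7, cbc_encrypt(7, plain)) == plain
```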
Figure 7 - CBC Encryption and Decryption
The CBC structure may be exposed to some vulnerabilities. For example, in CBC mode
the encryption process cannot start until there is enough plaintext data to fill the entire
block being processed. In secure network communications, terminals need to send each
character or string of bytes to the destination host immediately; they cannot wait until a
block is full. When the string of bytes is smaller than a block, CBC mode cannot handle
the encryption. Another weakness: because of chaining, the birthday paradox exposes
identical patterns of the plaintext every 2^(m/2) blocks (m = block size). There are ways
to mitigate these issues, for example taking care of the message's starting point and
endpoint, and including controlled redundancy and authentication [24].
If an attacker adds some bits to a ciphertext block and it goes undetected during the
decryption cycle, that block will decrypt to gibberish. Sometimes this may not be an issue,
but other times it can cause problems. Altering the ciphertext by even one bit gives the
subsequent block the wrong input, which affects the decryption of that block. The
combination of SSL v3 or TLS v1 with CBC is not recommended, as a single set of
initialization vectors is used for the entire traffic of the communication. This exposes the
targeted block to a padding oracle attack, where an attacker figures out the padding
information and can then determine plaintext bytes from the ciphertext by running
multiple queries [44]. This was addressed in TLS 1.2, which checks for multiple queries
and stops the connection to prevent that type of query 3, and it was recommended to
upgrade all secure communications by implementing this change.
2.3.2.2 CFB (Cipher Feedback) Mode
Usually block ciphering won't start until a full block of data is received. As mentioned
in the CBC section, CBC cannot handle a string of bytes smaller than a block. CFB
mode, on the other hand, can handle smaller strings of bytes. The process derives the
next keystream block by encrypting the previous ciphertext; this keystream is used in
the next iteration to encrypt the next plaintext bytes.
Figure 8 - CFB mode with 8 bits
Figure 8 shows encryption and decryption for an n-bit block. Both use the block cipher
together with shifting and XOR operations.
3 http://www.iij.ad.jp/en/company/development/iir/pdf/iir_vol25_infra_EN.pdf
Encryption:
Ci = Pi ⊕ Ek(Ci − 1)
Decryption:
Pi = Ci ⊕ Ek(Ci − 1)
CFB turns the block cipher into a stream cipher on both the encryption side and the
decryption side. The encryption and decryption keystream generators must derive exactly
the same keystream on corresponding iterations; if either misses a cycle, the wrong
ciphertext or plaintext is generated. CFB mode is like CBC mode in that one incorrect bit
can propagate into all subsequent processing [24].
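Note that CFB applies the encryption function Ek in both directions, so even a non-invertible stand-in works; the toy keystream function below is an assumption for illustration, not a real cipher:

```python
# CFB mode with a toy 8-bit "encryption" function used as the
# keystream generator. CFB needs only Ek, never Dk.
def Ek(x: int) -> int:
    return (x * 167 + 13) % 256        # toy keyed function, not a real cipher

def cfb_encrypt(iv: int, blocks):
    out, prev = [], iv
    for p in blocks:
        c = p ^ Ek(prev)               # Ci = Pi XOR Ek(Ci-1)
        out.append(c)
        prev = c
    return out

def cfb_decrypt(iv: int, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(c ^ Ek(prev))       # Pi = Ci XOR Ek(Ci-1)
        prev = c
    return out

data = [10, 20, 30, 40]
assert cfb_decrypt(99, cfb_encrypt(99, data)) == data
```

The roundtrip works regardless of whether Ek is invertible, which is exactly why CFB can use the encryption function alone on both sides.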
2.3.2.3 LRW (Liskov, Rivest, and Wagner) Mode
LRW mode was introduced to prevent attacks on CBC mode. It is a tweakable
narrow-block encryption: a permutation selected by a key and a known tweak I is applied
to the plaintext P, producing the ciphertext block C. The method uses two keys: the first
key K encrypts the plaintext after an XOR, and the second key F is used for a finite-field
multiplication. The key F is the same size as a block; it is multiplied in the finite field
with the tweak, and the outcome X is used in the encryption process [45].
Encryption:
C = Ek(P ⊕ X) ⊕ X
where X = F ⊗ I
Decryption:
P = Dk(C ⊕ X) ⊕ X
The XOR (⊕) and multiplication (⊗) are performed using keys K and F on the plaintext
in the finite field (GF(2^128) for AES) with a precomputed tweak:
F ⊗ I = F ⊗ (I0 ⊕ 1) = (F ⊗ I0) ⊕ F
where I takes values in the binary finite field GF(2^128).
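A compact sketch of the LRW equations, with a straightforward (slow) GF(2^128) multiplication and a toy invertible block cipher standing in for AES; both stand-ins are assumptions for illustration:

```python
# LRW mode sketch: C = Ek(P XOR X) XOR X, with X = F (x) I in GF(2^128).
POLY = (1 << 128) | 0x87   # x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiply of two 128-bit polynomials, reduced mod POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= POLY
    return result

K = 0x0123456789ABCDEF0123456789ABCDEF   # toy key constant

def Ek(p: int) -> int:     # toy invertible "cipher": XOR with a constant
    return p ^ K

def Dk(c: int) -> int:
    return c ^ K

def lrw_encrypt(F: int, I: int, P: int) -> int:
    X = gf_mul(F, I)
    return Ek(P ^ X) ^ X

def lrw_decrypt(F: int, I: int, C: int) -> int:
    X = gf_mul(F, I)
    return Dk(C ^ X) ^ X

F, I, P = 0xDEADBEEF, 5, 0x48656C6C6F
assert lrw_decrypt(F, I, lrw_encrypt(F, I, P)) == P
assert gf_mul(1 << 127, 2) == 0x87   # x^128 reduces to x^7 + x^2 + x + 1
```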
This method protects against the CBC mode attacks, but it still has its own leak: if an
attacker changes a single block, it affects only that cipher block and not the subsequent
cipher blocks4.
2.3.2.4 XTS Mode
Figure 9 - XTS mode
In Figure 9, XTS mode is the Advanced Encryption Standard with an XEX (XOR Encrypt
XOR) tweakable code value and ciphertext stealing. In the tweaked AES with XEX
method, the tweak is XORed with the plaintext before encryption; after the AES
encryption, a second XOR of the tweak is applied to generate the final ciphertext.
Ciphertext stealing allows the encryption of messages whose lengths are not divisible by
the block size without padding, so the ciphertext is the same size as the plaintext, at the
cost of extra complexity.
X = Ek(I) ⊗ α^j
C = Ek(P ⊕ X) ⊕ X
P - the plaintext.
I - the number of the sector.
4 https://en.wikipedia.org/wiki/Disk_encryption_theory#Liskov.2C_Rivest.2C_and_Wagner_.28LRW.29
α - the primitive element of GF(2^128) defined by the polynomial.
j - the number of the block within the sector.
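The tweak X = Ek(I) ⊗ α^j can be computed incrementally across the blocks of a sector by multiplying by α once per block; this sketch (reusing a slow GF(2^128) multiply, with a toy stand-in for Ek) checks the incremental form against the direct power:

```python
# XTS-style tweak schedule: X_j = Ek(I) (x) alpha^j in GF(2^128).
POLY = (1 << 128) | 0x87   # x^128 + x^7 + x^2 + x + 1
ALPHA = 2                  # the primitive element x

def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= POLY
    return result

def gf_pow(a: int, n: int) -> int:
    """Square-and-multiply exponentiation in GF(2^128)."""
    r = 1
    while n:
        if n & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        n >>= 1
    return r

def Ek(x: int) -> int:     # toy stand-in for encryption under the tweak key
    return x ^ 0xFEEDFACECAFEBEEF

I = 1234
X = Ek(I)                  # incremental tweak: starts at alpha^0
for j in range(16):
    assert X == gf_mul(Ek(I), gf_pow(ALPHA, j))   # matches the direct X_j
    X = gf_mul(X, ALPHA)   # advance to the next block's tweak
```

The incremental form is why XTS needs only one multiplication per block regardless of the block index j.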
XTS mode has vulnerabilities similar to CBC mode. For example, tampering with the data
can go unrecognized and will generate gibberish when decryption occurs. The system
must be built to recognize this potential threat and protect the data using
checksums and authentication tags. This mode is also prone to other vulnerabilities, such
as replay attacks and randomization attacks: if attackers have access to ciphertext blocks,
they can analyze them and use them for these attacks [46].
2.3.2.5 GCM (Galois/Counter Mode)
Figure 10 - GCM mode
In Figure 10, GCM is a symmetric-key cryptographic block cipher mode. It is derived
from GMAC (Galois Message Authentication Code), an authenticated mode for
incremental message communication. All blocks are numbered and then encrypted using
the XOR operation (similar to a stream cipher, with the order supplied by counters). GCM
uses a hash key H, which is a string of 128 zero bits encrypted under the block cipher. For
encryption, along with the hash key, it uses a unique arbitrary-length initialization vector
for each stream [47].
GCM mode does not have vulnerabilities like CBC's. For example, in CBC mode
tampering can occur without notice, but in GCM the operations are performed using an
authenticated encryption method, which keeps data and communication confidential. It
also maintains integrity by using an authentication tag to verify the data. It uses
reasonable hardware resources (memory, CPU, etc.), performs very efficiently due to
parallel processing, and provides high-speed communication [47].
The key material in GCM mode is used similarly to LRW mode (multiplication in a
Galois field) for each 128-bit block (GF(2^128) for AES). The GF polynomial is defined
as: x^128 + x^7 + x^2 + x + 1.
Feeding the blocks of data into the GHASH function and encrypting the output generates
the authentication tag:
GHASH(H, A, C) = X(m+n+1)
H - hash key
A - authenticated data (plaintext)
C - ciphertext
m - the number of 128-bit blocks in A
n - the number of 128-bit blocks in C
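The GHASH iteration X(i) = (X(i−1) ⊕ block(i)) ⊗ H can be sketched as below. For readability this uses ordinary polynomial bit order and omits GCM's bit-reflection convention and the final length block, so it is a simplified illustration rather than spec-exact GHASH:

```python
# Simplified GHASH: fold each 128-bit block into the accumulator,
# multiplying by the hash key H in GF(2^128) at every step.
POLY = (1 << 128) | 0x87   # x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= POLY
    return result

def ghash(H: int, blocks) -> int:
    X = 0
    for blk in blocks:
        X = gf_mul(X ^ blk, H)   # X_i = (X_{i-1} XOR block_i) (x) H
    return X

H = 0x66E94BD4EF8A2C3B884CFA59CA342B2E     # example hash-key value
assert ghash(H, [0, 0, 0]) == 0            # all-zero input hashes to zero
assert ghash(H, [1]) == gf_mul(1, H) == H  # a single block folds once
```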
This encryption method has been shown to be secure and efficient. Currently, Google
uses GCM as the cipher mode for its websites.
2.3.3 Encryption Methods for SSD
SSDs serve as a typical alternative to HDDs. In fact, an SSD considerably emulates
HDD technology, such as the communication protocol and hardware interfaces, so
HDD technology can be adapted quickly to SSDs. However, the methods SSDs
employ to store, manage, access, and secure data differ from HDDs. Because of the
differences between the two technologies, processing the same commands on an HDD
may produce different results on an SSD [11]. When it comes to encryption, we need to
consider these differences. A couple of encryption techniques have been used for SSDs;
this section discusses those methods.
2.3.3.1 Dm-crypt
Dm-crypt is a disk encryption method compatible with Linux kernel version 2.6 or later.
It uses the kernel's Crypto API routines, and devices are mapped to encrypted containers
using a device mapper [48]. This API supports the AES-256 cryptographic method along
with other methods. Dm-crypt uses LUKS (Linux Unified Key Setup), developed by
Clemens Fruhwirth in 2004, to create encrypted containers which are independent from
outside platforms [49]. Using this method, a user can even encrypt the root device. A
passphrase is required to create encrypted containers.
There has been some research on the drawbacks of dm-crypt. For example, it has
been discovered that hackers can sidestep the passphrase prompt on systems with
encrypted containers by hitting the 'Enter' key a couple of times. They can also delete the
containers, because deleting a container does not require the passphrase. Using disk
commands on the system, an intruder can determine critical components of the hidden
containers relatively easily [50] [51].
2.3.3.1.1 Process Method
Dm-crypt uses the device mapper and the Linux kernel's Crypto API routines. The API
provides cryptographic methods including the AES-256 algorithm, and dm-crypt supports
XTS, LRW, and other modes for encryption. The encrypted containers are stored as files
inside a folder. Users create these containers (volumes) with the LUKS (Linux Unified
Key Setup) encryption specification, protected by a passphrase. Using the system
device mapper, dm-crypt mounts the encrypted containers on top of existing devices.
Clemens Fruhwirth created LUKS in 2004; dm-crypt uses this method to create encrypted
containers which are independent from the existing platform and allow compatibility
from system to system5.
2.3.3.1.2 Weaknesses
Using this method, a user can encrypt the root device, but they may need a smart device
attached to the system so that they can boot the primary system. A passphrase is required
when creating a container, but not even requested when deleting one. This method is
mainly used for Linux-like systems. Some research showed that the passphrase can be
bypassed to access the containers by just pressing the Enter key a couple of times. The
file system information displays the sizes of volumes, which may let someone infer
information about the hidden containers6.
2.3.3.2 BestCrypt
BestCrypt is encryption software first implemented in 1995 and still in use. It creates,
mounts, and manages encrypted volumes called containers. Because this encryption
software is still in use, it will be selected for evaluation.
2.3.3.2.1 Process Method
This encryption method stores files in encrypted containers and keeps them safe from
unauthorized access. A benefit of BestCrypt is that system disk volumes can be stored as
encrypted files when not in use and mounted when needed. This method can be applied to
removable media, network shares, archived storage, and email attachments on Windows
or Linux OS. It uses the following cryptographic methods: AES, Blowfish, DES, Triple
DES, Twofish, Serpent, and GOST 28147-89. All these cryptographic methods use LRW
and CBC modes. AES, Twofish, and Serpent also use XTS mode [52].
5 https://en.wikipedia.org/wiki/Dm-crypt
6 https://threatpost.com/cryptsetup-vulnerability-grants-root-shell-access-on-some-linux-systems/121963/
2.3.3.2.2 Weaknesses
BestCrypt seems a viable option, but like any software it can have bugs, and these errors
can be as severe as damaging entire partitions7.
2.3.3.3 FDE (Full Disk Encryption)
FDE is a hardware encryption method; its implementation started in 2009 and it is still in
use. It encrypts all the partitions, system files, and the operating system using a hardware
component of the drive. This technique is used by Samsung SSDs, which are commonly
used. Applying FDE on an SSD produces an SED (Self-Encrypting Drive). Self-encrypting
SSDs provide better performance than SSDs where encryption software is installed [12].
This encryption implementation method will be selected for evaluation.
7 http://quinxy.com/technology/jeticos-bestcrypt-volume-encryption-can-lead-to-destroyeddamagedlost-data/
2.3.3.3.1 Process Method
This encryption method delegates the encryption logic to a dedicated hardware component
of the drive, using the Opal Storage Specification (a set of specification features for SEDs)
to enhance security. The hard disk controller handles key management, which enhances
security and protects the data from unauthorized access. An SED has two passwords, a
User password and a Master password, and both are stored in the BIOS. The Master
password is generated by the SED, and the User password is set by each user for system
access. If a User password is lost or forgotten, the Master password can be used to unlock
the system. It uses the following cryptographic methods: AES 128 and AES 256. The
BIOS password is used for pre-boot authentication of the system.
2.3.3.3.2 Weaknesses
Several attacks are related to this method: Hot Plug Attack, Hot Unplug Attack, Forced
Restart Attack, and Key Capture Attack. Research has shown that an attacker can bypass
the encryption and access the data, which undermines the purpose of securing the data
[53].
2.3.4 Homomorphic Encryption
The first practical and feasible version of homomorphic encryption was introduced
by Craig Gentry in 2009, applying addition and multiplication to encrypted data over
circuits [54] [2]. Research has shown that there are advantages to leveraging
homomorphic encryption in the Cloud and in Multi-Party Computation environments [55]
[56]. Most of the previous implementations were asymmetric homomorphic methods, but
researchers observed behaviors that are not practical for real-world usage:
Key Sizes: ranged from 17 MB to 2.25 GB
Key Generation Time: ranged from 2.5 seconds to 2.2 hours
Ciphertext Size: much larger ciphertexts
Noise: noise growth exceeding thresholds
Time: very long execution times
These weaknesses made homomorphic encryption impractical to use in the cloud or in
real-time systems [3] [57]. Currently there is no encryption method in production that can
take advantage of homomorphic features for any system [58].
There must be a way to create an encryption methodology that derives great value
from the advantages of the unique features of homomorphic encryption. Using versors
from Clifford Algebra, I developed a symmetric homomorphic encryption scheme. The
next section discusses the mathematical foundation of the new encryption method.
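The homomorphic property itself is easy to illustrate with a toy example. Textbook (unpadded) RSA is multiplicatively homomorphic: the product of two ciphertexts decrypts to the product of the two plaintexts. The sketch below uses deliberately tiny, insecure parameters for clarity; it is not Gentry's scheme and not RVTHE, only a minimal demonstration of what "computing on encrypted data" means.

```python
# Toy demonstration of a homomorphic property: textbook (unpadded) RSA
# is multiplicatively homomorphic. Tiny, INSECURE parameters for clarity.
p, q = 61, 53
n = p * q            # modulus: 3233
e = 17               # public exponent
d = 2753             # private exponent: e * d = 1 (mod (p-1)(q-1))

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

m1, m2 = 7, 3
c1, c2 = encrypt(m1), encrypt(m2)

# Multiply the ciphertexts WITHOUT decrypting...
c_product = (c1 * c2) % n

# ...and the decryption of the result is the product of the plaintexts.
assert decrypt(c_product) == (m1 * m2) % n   # 7 * 3 = 21
```

The same idea, extended to both addition and multiplication with managed noise growth, is what makes a scheme *fully* homomorphic; the cost of doing so is exactly the key-size and timing overhead listed above.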
2.4 Mathematical Foundation
This section discusses the mathematical foundation used to architect RVTHE.
Algebra is the basis for most homomorphic encryption methods. It uses positive
numbers, real numbers, complex numbers, linear algebra, geometric algebra, and function
spaces (e.g., Hilbert Spaces and Clifford Algebra) for number fields. If a Geometric
Algebra uses vector spaces with a quadratic form and its product is associative, it is called
a Clifford Algebra. I chose to use Clifford Algebra for RVTHE because it calculates a
geometric product of vectors whose results are not traceable back to the operands, which
is ideal for the level of security we want to achieve. It is therefore important to understand
these Clifford Geometric Algebra terms [59]:
Vector: "a quantity having direction as well as magnitude, especially as
determining the position of one point in space relative to another."
Vector Dimension: "Let V be a finite dimensional vector space over the field 𝔽.
The Dimension of V, denoted dim𝔽 V, is the number of vectors in any basis of V.
If V is an infinite dimensional vector space over 𝔽 then we write dim𝔽 V = ∞."
We can represent an n-dimensional vector as "nD";
i.e., if n = 2 then "2D" is used to represent a 2-dimensional vector.
Vector Space: "a space consisting of vectors, together with the
associative and commutative operation of addition of vectors, and the associative
and distributive operation of multiplication of vectors by scalars."
Multivector: "a mathematical structure comprising a linear combination of
elements of different grade, such as scalars, vectors, bivectors, tri-vectors, etc."
Geometric Algebra Axioms: To understand combinations of scalars, vectors,
and bivectors, we first need to know the axioms behind the geometric algebra.
These are the proven axioms of geometric algebra. Vectors are represented by
(a, b, c), scalars by (λ, ε), and bivectors by (ab, ba, ac, etc.).
Axiom 1: a(bc) = (ab)c (4.1.1.1)
Axiom 2: a(b + c) = ab + ac; (b + c)a = ba + ca (4.1.1.2)
Axiom 3: (λa)b = λ(ab) = λab [λ ∈ ℝ] (4.1.1.3)
Axiom 4: λ(εa) = (λε)a [λ, ε ∈ ℝ] (4.1.1.4)
Axiom 5: λ(a + b) = λa + λb [λ ∈ ℝ] (4.1.1.5)
Axiom 6: (λ + ε)a = λa + εa [λ, ε ∈ ℝ] (4.1.1.6)
Axiom 7: a² = |a|² (4.1.1.7)
Axiom 8: |a · b| = |a||b| cos θ (4.1.1.8)
Axiom 9: |a ∧ b| = |a||b| sin θ (4.1.1.9)
Axiom 10: ab = a · b + a ∧ b (4.1.1.10)
Axiom 11: a ∧ b = −b ∧ a (4.1.1.11)
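Axioms 8-10 can be checked numerically for concrete 2D vectors. The short sketch below (plain Python; the variable names are mine, chosen for illustration) verifies that the dot and wedge parts of the geometric product match the |a||b| cos θ and |a||b| sin θ identities for a = 2e1 + 3e2 and b = 4e1 + 5e2.

```python
import math

# 2D vectors given by their (e1, e2) coefficients.
a = (2.0, 3.0)
b = (4.0, 5.0)

dot   = a[0] * b[0] + a[1] * b[1]   # a . b  (scalar part of ab)
wedge = a[0] * b[1] - a[1] * b[0]   # coefficient of e12 (bivector part of ab)

mag_a = math.hypot(*a)
mag_b = math.hypot(*b)
theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])  # angle from a to b

# Axiom 8: |a . b| = |a||b| cos(theta)
assert math.isclose(abs(dot), mag_a * mag_b * abs(math.cos(theta)))
# Axiom 9: |a ^ b| = |a||b| sin(theta)
assert math.isclose(abs(wedge), mag_a * mag_b * abs(math.sin(theta)))
# Axiom 10: ab = a . b + a ^ b -- here the parts are (23, -2 e12).
assert (dot, wedge) == (23.0, -2.0)
```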
Product of Vectors: The result of multiplying vectors, via the scalar and cross
products. These two products are the foundation for geometric algebra's inner,
outer, and geometric products of vectors.
o Scalar Product: (Also known as dot product.) The product of the vectors'
magnitudes and the cosine of the angle between them.
o Cross Product: (Also known as vector product.) A binary operation on
two vectors in three-dimensional space.
o Outer Product: (Also known as wedge product.) The tensor product of
two coordinate vectors.
o Inner Product: The dot product of the Cartesian coordinates of
two vectors.
o Geometric Product: The sum of the inner and outer products.
Vector Inverse: When performing the geometric product of vector A and
another vector B, if the result is "1" then vector B is called the inverse of vector A,
and vice versa.
Blade: The outer product of k vectors is called a k-blade: a 1-blade is a vector, a
2-blade a bivector, a 3-blade a tri-vector, and so on, where k indicates the grade
of the blade.
Versors: A versor is the geometric product of multiple vectors, following Clifford
Geometric Algebra.
To show how Clifford Geometric Algebra is represented mathematically, I will use two-
dimensional (2D) vectors for the inner product, outer product, and geometric product
representations [21] [59].
2.4.1 Geometric Algebra Overview
Geometric Algebra combines the work of Hamilton (quaternions) and Grassmann
(non-commutative exterior algebra) into a field that generalizes the product of two vectors,
including the 3-dimensionally restricted cross product, to an n-dimensional subspace of
the vector space V over number fields (ℤ, ℝ, ℂ, ℕ, etc.), such that the subspace is a
product space that allows two vectors to have a geometric product [59]:
V1 V2 = V1 · V2 + V1 ∧ V2
where V1 and V2 are vectors or multivectors (i.e., collections of "blades"). The
operation V1 ∧ V2 is known as the "wedge product" or "exterior product." The operation
V1 · V2 is the "dot product" or "interior product" (a.k.a. the "inner product").
For a simple pair of two-dimensional vectors:
V1 = a1 e1 + a2 e2
V2 = b1 e1 + b2 e2
where the set {e1, e2} are unit basis vectors and {ai}, {bi}, i = 1, 2 are scalars, the
geometric product follows the rules of Geometric Algebra, as described below:
ei ∧ ei = 0
ei ∧ ej = −ej ∧ ei (i ≠ j)
ei ∧ ej = eij (compact notation)
ei · ei = 1
ei · ej = 0 (i ≠ j)
Thus, by performing the geometric product of V1 and V2 we have
V1 V2 = [(a1 b1) e1 · e1 + (a1 b2) e1 · e2 + (a2 b1) e2 · e1 + (a2 b2) e2 · e2]   (dot product)
      + [(a1 b1) e1 ∧ e1 + (a1 b2) e1 ∧ e2 + (a2 b1) e2 ∧ e1 + (a2 b2) e2 ∧ e2]   (wedge product)
Applying the rules ei · ei = 1, ei · ej = 0, ei ∧ ei = 0, and e2 ∧ e1 = −e1 ∧ e2, this results in
V1 V2 = (a1 b1 + a2 b2) + (a1 b2 − a2 b1) e1 ∧ e2
The product V1 V2 produces a scalar plus an object e1 ∧ e2, which in compact notation is
written as e12 and represents an oriented area: e1 ∧ e2 traverses it clockwise and −e2 ∧ e1
anti-clockwise. The orientation is given by the sign of the term in front of the e1 ∧ e2
component.
A versor is a product of vectors in the geometric product space which has simple
inverse characteristics: V = V1 V2 V3 … Vn
2.4.2 Inner Product
The inner product (also called dot product or scalar product) transforms vectors into
scalars. The inner product of vectors 'a' and 'b' is represented by "a · b".
If 'a' and 'b' are vectors defined as a = (a1 e1 + a2 e2) and b = (b1 e1 + b2 e2), then:
a · b = (a1 e1 + a2 e2) · (b1 e1 + b2 e2)
a · b = (a1 b1 e1 · e1 + a1 b2 e1 · e2 + a2 b1 e2 · e1 + a2 b2 e2 · e2)
a · b = a1 b1 + a2 b2
The inner product is symmetric: if we reverse the order of the vectors, the resulting value
will always be the same [59]:
a · b = b · a
Example:
When a = (2 e1 + 3 e2) and b = (4 e1 + 5 e2),
then the inner product a · b is:
a · b = (2 e1 + 3 e2) · (4 e1 + 5 e2)
a · b = (8 e1 · e1 + 10 e1 · e2 + 12 e2 · e1 + 15 e2 · e2)
a · b = 8 + 15
a · b = 23
Reversing the order of the vectors, the inner product b · a is:
b · a = (4 e1 + 5 e2) · (2 e1 + 3 e2)
b · a = (8 e1 · e1 + 12 e1 · e2 + 10 e2 · e1 + 15 e2 · e2)
b · a = 8 + 15
b · a = 23 = a · b
2.4.3 Outer Product
The outer product of vectors 'a' and 'b' (also called wedge product) is represented by
"a ∧ b". If 'a' and 'b' are vectors defined as a = (a1 e1 + a2 e2) and b = (b1 e1 + b2 e2), then
[59]:
a ∧ b = (a1 e1 + a2 e2) ∧ (b1 e1 + b2 e2)
a ∧ b = (a1 b1 e1 ∧ e1 + a1 b2 e1 ∧ e2 + a2 b1 e2 ∧ e1 + a2 b2 e2 ∧ e2)
a ∧ b = (a1 b2 e1 ∧ e2 − a2 b1 e1 ∧ e2)
a ∧ b = (a1 b2 − a2 b1) e1 ∧ e2
a ∧ b = (a1 b2 − a2 b1) e12
In the above formula, "(a1 b2 − a2 b1)" is the scalar coefficient of the area of the
parallelogram associated with the plane containing the two basis vectors e1 and e2.
Figure 11 - Outer Product
The outer product of two vectors is antisymmetric, such that a ∧ b = −b ∧ a.
Example:
'a' and 'b' are vectors, and when a = (2 e1 + 3 e2) and b = (4 e1 + 5 e2):
a ∧ b = (2 e1 + 3 e2) ∧ (4 e1 + 5 e2)
a ∧ b = (8 e1 ∧ e1 + 10 e1 ∧ e2 + 12 e2 ∧ e1 + 15 e2 ∧ e2)
a ∧ b = 10 e1 ∧ e2 − 12 e1 ∧ e2
a ∧ b = −2 e1 ∧ e2 = −2 e12
If we reverse the order of the vectors, then the outer product b ∧ a is:
b ∧ a = (4 e1 + 5 e2) ∧ (2 e1 + 3 e2)
b ∧ a = (8 e1 ∧ e1 + 12 e1 ∧ e2 + 10 e2 ∧ e1 + 15 e2 ∧ e2)
b ∧ a = 12 e1 ∧ e2 − 10 e1 ∧ e2
b ∧ a = 2 e1 ∧ e2 = 2 e12, so −b ∧ a = −2 e12
The math confirms that the outer product is antisymmetric: a ∧ b = −b ∧ a.
2.4.4 Geometric Product
The geometric product of vectors 'a' and 'b' is represented by "ab". If 'a' and 'b' are
vectors defined as a = (a1 e1 + a2 e2) and b = (b1 e1 + b2 e2), then [59]:
ab = (a1 e1 + a2 e2)(b1 e1 + b2 e2)
ab = (a1 e1 + a2 e2) · (b1 e1 + b2 e2) + (a1 e1 + a2 e2) ∧ (b1 e1 + b2 e2)
ab = (a1 b1 e1 · e1 + a1 b2 e1 · e2 + a2 b1 e2 · e1 + a2 b2 e2 · e2) + (a1 b1 e1 ∧ e1 + a1 b2 e1 ∧ e2 + a2 b1 e2 ∧ e1 + a2 b2 e2 ∧ e2)
ab = (a1 b1 + a2 b2) + (a1 b2 e1 ∧ e2 − a2 b1 e1 ∧ e2)
ab = (a1 b1 + a2 b2) + (a1 b2 − a2 b1) e1 ∧ e2
ab = (a1 b1 + a2 b2) + (a1 b2 − a2 b1) e12
The output of the geometric product contains two terms. The first term,
"(a1 b1 + a2 b2)", is a scalar. The second term is the bivector "e12" with a coefficient of
"(a1 b2 − a2 b1)".
The geometric product of two vectors changes when we change the order of the vectors,
such that ab ≠ ba (the exception is when the vectors are parallel, in which case ab = ba).
Example:
When a = (2 e1 + 3 e2) and b = (4 e1 + 5 e2):
ab = (2 e1 + 3 e2) · (4 e1 + 5 e2) + (2 e1 + 3 e2) ∧ (4 e1 + 5 e2)
ab = (8 e1 · e1 + 10 e1 · e2 + 12 e2 · e1 + 15 e2 · e2) + (8 e1 ∧ e1 + 10 e1 ∧ e2 + 12 e2 ∧ e1 + 15 e2 ∧ e2)
ab = (8 + 15) + (10 e1 ∧ e2 − 12 e1 ∧ e2)
ab = 23 − 2 e1 ∧ e2 = 23 − 2 e12
Reversing the order of the vectors, the geometric product ba is:
ba = (4 e1 + 5 e2) · (2 e1 + 3 e2) + (4 e1 + 5 e2) ∧ (2 e1 + 3 e2)
ba = (8 e1 · e1 + 12 e1 · e2 + 10 e2 · e1 + 15 e2 · e2) + (8 e1 ∧ e1 + 12 e1 ∧ e2 + 10 e2 ∧ e1 + 15 e2 ∧ e2)
ba = 8 + 15 + 2 e1 ∧ e2
ba = 23 + 2 e12
The math confirms that ab ≠ ba.
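The worked examples in Sections 2.4.2-2.4.4 can be checked mechanically. The sketch below (plain Python; the four-component representation and function names are my own, not the dissertation's) represents a 2D multivector by its (scalar, e1, e2, e12) coefficients and implements the full geometric product table for Cl(2,0), reproducing a · b = 23, a ∧ b = −2 e12, ab = 23 − 2 e12, and ba = 23 + 2 e12.

```python
# A 2D multivector in Cl(2,0) as coefficients of (1, e1, e2, e12).
# Basis products used: e1e1 = e2e2 = 1, e1e2 = -e2e1 = e12, e12e12 = -1,
# e1e12 = e2, e12e1 = -e2, e2e12 = -e1, e12e2 = e1.
def gp(x, y):
    s, a1, a2, b = x
    t, c1, c2, d = y
    return (
        s*t + a1*c1 + a2*c2 - b*d,   # scalar part
        s*c1 + a1*t - a2*d + b*c2,   # e1 part
        s*c2 + a2*t + a1*d - b*c1,   # e2 part
        s*d + b*t + a1*c2 - a2*c1,   # e12 part
    )

def vec(x1, x2):
    return (0, x1, x2, 0)

a, b = vec(2, 3), vec(4, 5)
ab, ba = gp(a, b), gp(b, a)

inner = (ab[0] + ba[0]) / 2   # symmetric part:    a . b
outer = (ab[3] - ba[3]) / 2   # antisymmetric part: a ^ b (e12 coefficient)

assert inner == 23            # a . b = 23
assert outer == -2            # a ^ b = -2 e12
assert ab == (23, 0, 0, -2)   # ab = 23 - 2 e12
assert ba == (23, 0, 0, 2)    # ba = 23 + 2 e12
```

Splitting ab into its symmetric and antisymmetric halves recovers the inner and outer products, which is exactly Axiom 10 in reverse.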
2.4.5 Inverse of Vector
If the geometric product A⁻¹_L A = 1, then A⁻¹_L is called the left inverse of vector A,
and if A A⁻¹_R = 1, then A⁻¹_R is called the right inverse of vector A. The geometric
product is not commutative, therefore the left inverse and right inverse may or may not
be equal.
2.4.6 Versors
"One type of multivector that lends itself to inversion has the form A = a1 a2 a3 ... an,
where a1, a2, a3, ..., an are vectors, and versor A is their collective geometric product.
Such multivectors are called versors" [59]:
Versor A = a1 a2 a3 ... an, a geometric product of vectors.
The reverse of versor A is A† = an ... a3 a2 a1.
Multiplying A† with A:
A† A = (an ... a3 a2 a1)(a1 a2 a3 ... an)
A† A = |a1|² |a2|² |a3|² ... |an|²
Furthermore, multiplying A with A†:
A A† = (a1 a2 a3 ... an)(an ... a3 a2 a1)
A A† = |a1|² |a2|² |a3|² ... |an|²
So A† A = A A†, and it is a scalar. Since A A⁻¹ = 1,
we can say A† A A⁻¹ = A†, and therefore
A⁻¹ = A† / (A A†)
Checking: A⁻¹ A = A† A / (A† A) = 1.
For versors this implies that A⁻¹_L and A⁻¹_R are the same.
Suppose A = a is a single vector; writing A in reverse order gives A† = a.
Then A A† = |a|², so A⁻¹ = a⁻¹ = a / |a|².
Therefore, given ab we can derive b by multiplying with a⁻¹ on the left:
a⁻¹ a = 1, so a⁻¹ (ab) = b, i.e., b = (a / |a|²)(ab);
similarly, we can obtain a = (ab)(b / |b|²).
Example:
Using versors and inverses, we can derive a component of a geometric product.
Assume a1 = s1 = a, a2 = d1 = b, and a3 = s2 = c.
When a = (2 e1 + 3 e2), b = (4 e1 + 5 e2), and c = (3 e1 + 4 e2):
ab = (2 e1 + 3 e2) · (4 e1 + 5 e2) + (2 e1 + 3 e2) ∧ (4 e1 + 5 e2)
ab = 23 − 2 e12
abc = (23 − 2 e12)(3 e1 + 4 e2)
abc = 61 e1 + 98 e2
To derive the value of b, compute b = a⁻¹ (abc) c⁻¹:
(abc) c⁻¹ = (61 e1 + 98 e2)((3 e1 + 4 e2) / 25) = 23 − 2 e12 = ab
b = a⁻¹ (ab) = ((2 e1 + 3 e2) / 13)(23 − 2 e12)
b = (1/13)((46 + 6) e1 + (69 − 4) e2)
b = 4 e1 + 5 e2
The example demonstrates that, given the full product abc and the outer vectors a and c,
the middle vector b can be recovered exactly using vector inverses.
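The recovery of b from abc is worth verifying numerically. The self-contained sketch below (the (scalar, e1, e2, e12) coefficient representation and function names are my own illustrative choices, not the dissertation's notation) computes abc = 61 e1 + 98 e2 and recovers b = 4 e1 + 5 e2 as a⁻¹ (abc) c⁻¹.

```python
# Cl(2,0) geometric product on (scalar, e1, e2, e12) coefficient tuples.
def gp(x, y):
    s, a1, a2, b = x
    t, c1, c2, d = y
    return (s*t + a1*c1 + a2*c2 - b*d,
            s*c1 + a1*t - a2*d + b*c2,
            s*c2 + a2*t + a1*d - b*c1,
            s*d + b*t + a1*c2 - a2*c1)

def vec(x1, x2):
    return (0.0, float(x1), float(x2), 0.0)

def vec_inverse(v):
    # For a pure vector, v^-1 = v / |v|^2 (Section 2.4.5).
    norm_sq = v[1]**2 + v[2]**2
    return (0.0, v[1] / norm_sq, v[2] / norm_sq, 0.0)

a, b, c = vec(2, 3), vec(4, 5), vec(3, 4)

abc = gp(gp(a, b), c)
assert abc == (0.0, 61.0, 98.0, 0.0)   # abc = 61 e1 + 98 e2

# Recover the middle vector: b = a^-1 (abc) c^-1
recovered = gp(gp(vec_inverse(a), abc), vec_inverse(c))
assert all(abs(r - t) < 1e-9 for r, t in zip(recovered, b))   # 4 e1 + 5 e2
```

Note that the intermediate result abc leaks neither a, b, nor c individually; only someone holding the flanking vectors can undo the product.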
CHAPTER 3
PROBLEMS AND LIMITATIONS
3.1 Overview
In this chapter, I present various security problems with Cloud and SSD storage. I
describe various types of cyberattacks and discuss the importance of randomness in
encryption methods and its limitations. I evaluate existing encryption methods and their
performance on SSDs in the Cloud, and the performance penalties in terms of IOPS. This
section shows that encryption methods and techniques affect workload performance. I
used Amazon Web Services (AWS) for this performance benchmarking. First, I studied
the storage (SSD) performance impact across the various storage options provided by
AWS without encryption. Next, I benchmarked workloads with various block sizes,
read/write ratios, and encryption methods on VMs with regular SSDs, encrypted SSDs,
and software-encrypted containers. This chapter also discusses existing encryption
methods, including homomorphic encryption methods.
3.2 Defining the Problem
In the cloud computing environment there are several security threats, and cloud storage
SSDs bring their own strengths and weaknesses. Here I consider the causes, conditions,
and limitations of enterprise cloud storage that can generate security concerns, to see if
there are practical solutions for all stages of SSD security. I will also explain how these
weaknesses are exploited by cyber-attacks. Lastly, I will discuss the limitations of
existing and proposed encryption methods, including FHE.
3.2.1 Storage and Cloud Security
SSD Physics: Some SSD vendors implemented their FTL (Flash Translation Layer) with
errors; those errors may prevent full sanitization even when overwriting the entire visible
address space. Overwriting the SSD address space is not always sufficient to sanitize the
drive, because the data persists, and it is a time-consuming process [28]. When a file is
deleted, from the OS's perspective it is gone, but on the SSD it may remain until garbage
collection happens via the TRIM process [11].
Persistence of Data: When an SSD write occurs, data is written to new cells, but the old
data still exists in the old cells until a TRIM is executed [28]. If the key and the encrypted
file are stored on the same system, it may be possible to read the encryption key from the
SSD key storage area [29]. The SSD's internal design and the way I/O (Input/Output)
operations happen are different from an HDD's. Yet most encryption software for SSDs
was developed using the same cryptographic algorithms that were used for HDDs, which
does not account for the SSD's ghost data.
Data Exposure: If data is not encrypted, there is a risk of exposing personal data, both at
rest and in transit. The data can be accessed from different devices like PCs, phones, and
public networks, each of which can pose a security threat due to malware, adware, and
non-secured public networks accessible to hackers. The public cloud poses its own
security issues due to other cloud security threats like account hijacking, human error,
etc.
Account Hijacking: One of the major security issues for the cloud is account hijacking,
where someone gains access to account credentials and uses them for nefarious purposes.
Human Error: Human error and negligence can pose a security threat; for example, not
removing the key or plaintext file from the cloud system. In cloud computing, users must
move the key between their system and the cloud. Security issues can be caused by users
not following proper security procedures and practices, such as writing passwords on
sticky notes, forgetting passwords, sharing passwords, or sharing keys in a non-secure
way.
3.2.2 Cyber Attacks
Attackers can perform various kinds of attacks. One must remember, while designing an
encryption cipher, that it should be able to protect the data from these attacks.
Ciphertext-Only: When an attacker has access to ciphertext and nothing else, such as the
key or plaintext, they can use statistical methods to estimate the distribution of characters
and use it to reveal the plaintext or secret key. This is called a Ciphertext-Only attack. It
is the most difficult type of attack for the attacker, since the attacker has the smallest
amount of information [24].
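As a concrete illustration of a ciphertext-only attack, the sketch below (my own toy example, not from the dissertation) breaks a Caesar shift cipher purely from letter-frequency statistics: it scores every candidate shift against typical English letter frequencies with a chi-squared statistic and picks the best fit, with no key or plaintext given to the attacker.

```python
# Ciphertext-only attack on a Caesar cipher via letter-frequency statistics.
from collections import Counter

# Approximate relative frequencies of letters a-z in English text (percent).
ENGLISH = [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.77, 4.0,
           2.4, 6.7, 7.5, 1.9, 0.095, 6.0, 6.3, 9.1, 2.8, 0.98, 2.4, 0.15,
           2.0, 0.074]

def caesar(text, shift):
    return ''.join(chr((ord(ch) - 97 + shift) % 26 + 97) if ch.isalpha() else ch
                   for ch in text.lower())

def crack(ciphertext):
    """Return the shift whose decryption best matches English frequencies."""
    letters = [ch for ch in ciphertext.lower() if ch.isalpha()]
    best_shift, best_score = 0, float('inf')
    for shift in range(26):
        counts = Counter((ord(ch) - 97 - shift) % 26 for ch in letters)
        # Chi-squared distance from the expected English distribution.
        score = sum((counts[i] - ENGLISH[i] * len(letters) / 100) ** 2
                    / (ENGLISH[i] * len(letters) / 100) for i in range(26))
        if score < best_score:
            best_shift, best_score = shift, score
    return best_shift

plaintext = ("when an attacker has access to ciphertext and nothing else "
             "statistical methods can reveal the distribution of characters")
assert crack(caesar(plaintext, 7)) == 7   # key recovered from ciphertext alone
```

Modern ciphers defeat exactly this by making ciphertext statistics indistinguishable from random, which is why the distinguishing attack below targets that property directly.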
Known-Plaintext: In this case the attacker has some plaintext/ciphertext pairs and uses
them to derive the key. This is called a Known-Plaintext attack. I will show how
statistical methods and the manipulation of mathematical operations can be used to
derive keys.
Chosen-Plaintext: This is similar to a Known-Plaintext attack, but the attacker can
choose and manipulate the plaintext input to the encryption algorithm, then evaluate the
resulting ciphertext to obtain the key.
Distinguishing-Attack: The goal of a distinguishing attack is to distinguish the keystream
of the cipher from a truly random sequence. If an attacker can distinguish the cipher
output from random data faster than a brute-force search, this sort of information can be
very valuable to an attacker for revealing the plaintext.
Birthday Attack: A Birthday Attack is based on the statistical concept of the Birthday
Paradox, where the probability of a match between two random items increases as the
number of elements increases. For example, if there are 23 people in a room, the
probability of two people having the same birthday rises to 50.7%. The same concept is
applied to determining an encryption key: while the numbers are larger, the probability
of matching the encryption key is statistically much higher than the true randomness of
the key would suggest.
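The 50.7% figure for 23 people follows directly from the complement product. A few lines (plain Python, illustrative only) reproduce it and show how quickly the collision probability grows with the number of samples, which is the effect a birthday attack exploits against key and hash spaces.

```python
# Probability that at least two of n people share a birthday, assuming 365
# equally likely days: 1 - (365/365)(364/365)...((365-n+1)/365).
def birthday_collision_prob(n, days=365):
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

# With 23 people the probability already passes 50%.
assert round(birthday_collision_prob(23), 3) == 0.507
# Growth is rapid: by 70 people a collision is nearly certain.
assert birthday_collision_prob(70) > 0.999
```

The same formula with `days` set to the size of a key or hash space shows why collisions appear after roughly the square root of the space has been sampled.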
Meet-in-the-Middle Attack: In this method the attacker builds a table of keys and MACs
(Message Authentication Codes). A MAC is computed, using 50% of the possible keys of
the key length, on the same plaintext. The attacker then eavesdrops on each transaction,
compares the cipher with the MAC table, and reveals the key.
There are several more attack methods and cyber threats, such as Spectre and Meltdown.
The impact of an attacker finding a key could be devastating; it would give attackers
access to personal, financial, and medical information and could prevent authorized users
from accessing that information. All of this justifies constantly increasing the strength
and complexity of ciphers, which is an important part of security [6].
3.2.3 Real Randomness
To generate an encryption key, real randomness is critical but extremely hard to achieve
on a computer system. Pseudorandom numbers can be generated from the system's
entropy sources: the timing of keystrokes, the exact movements of a mouse, and
fluctuations of hard-disk access time, to name a few [60]. A key generated from the
randomness of these sources may become suspect if an attacker is able to measure those
sources and use them to simulate the same random number generation; but this is
difficult, due to the amount of entropy generated from these sources.
The timing of a single keystroke will generate 1 to 2 bytes of random data, and
cryptographers think that is not enough entropy to thwart the threat of an attacker
determining the key. Better typists have a consistent typing pace, where the timing
between keystrokes is within milliseconds, limiting the frequency at which keystroke
timing can be sampled, so typing-timing data may not be truly random. In this example,
the attacker may also have access to resources such as the computer's microphone to hear
the keystrokes and determine the timings (pace). Even generating randomness using
quantum physics can force specific patterns that may be prone to attacks, because an
attacker can use an RF (Radio Frequency) field to influence these patterns [52]. Suppose
I have a key with 128 bits of random data; this can still be vulnerable because an attacker
can try 2^128 computations. This brute-force attack is of growing concern as
computation speeds increase.
3.2.4 Encryption Security Limitations
Key Strength: If the data is encrypted, customers must use a key to manage the data
storage process. If the key was generated with low randomness, security will be weaker.
Encryption Algorithm: The degree of the system's security depends on the strength of
the cryptographic method and its implementation. Increased computing power allows
attackers to break encryption algorithms that were once considered state of the art.
Execution of Encryption Methods in the Cloud: Conventional encryption methods
have a couple of issues.
A large amount of data needs to be transferred between the client and the cloud.
If the client is willing to keep the encryption key in the cloud, then the very item
used to decrypt the file will be readily available in case an attacker gets into the
cloud system, which is clearly a security concern.
If the client chooses not to store the key in the cloud, then to update a file they
must download the entire encrypted file, decrypt it, modify it, encrypt it again,
and upload the encrypted file back to the cloud. As the file grows, this increases
the overhead on the resources.
Encryption vs Performance: There is very little research on how various encryption
software methodologies impact the performance of various workloads on SSDs in the
cloud. The problem is that enterprises use the same encryption software for all types of
workloads and different storage systems. Encrypting while simultaneously performing
regular application workload functions will adversely impact the read/write performance
of SSD drives.
Practicality of Homomorphic Encryption: The Practical Homomorphic Encryption
survey [58] says, "A significant amount of research on homomorphic cryptography
appeared in the literature over the last few years; yet the performance of existing
implementations of encryption schemes remains unsuitable for real time applications."
Homomorphic encryption speeds are one of the main reasons for this conclusion: key
generation takes anywhere from 2.5 seconds to 2.2 hours, implementations are complex,
noise growth can exceed thresholds, and large key sizes (17 MB to 2.25 GB) require
substantial memory resources; all of this is impractical in real systems [3]. Fully
Homomorphic Encryption (FHE) is on the "bleeding edge" of encryption technology, but
currently there is no FHE available for real-time applications [58]. There is still a lot of
work to be done to produce a "production ready" version of FHE.
This research focuses on deriving a production-ready secure, efficient, scalable, and
portable homomorphic encryption method.
3.3 Storage Security Limitations
This thesis first evaluates SSD storage security and modern encryption software for
securing the SSD. I will first discuss the importance of the reliability and integrity of the
SSD, and then address security. Cloud storage primarily uses SSDs as storage to achieve
performance guarantees. I studied SSD characteristics to understand SSD strengths and
performance metrics when using various storage-specific encryption methods. Using a
performance benchmarking framework, I want to show that encryption impacts the
performance of storage read and write operations.
3.3.1 SSD System-Level Induced Limitations
The SSD's physical structure poses reliability and scalability limitations. These can result
in system-level limitations like wear leveling (endurance), Bad Block Management, and
performance. Understanding the SSD's limitations can help to determine or derive better
security techniques for the device.
3.3.1.1 Physical Limitations Contribute to Logical (Software) Limitations
This section describes the SSD's physical limitations and how they impact logical SSD
functions. The following four major components of SSD function detail the physical and
logical limitations.
3.3.1.2 Physical-Level Address Map
In an SSD, the address map is applied the same way as in traditional hard disk drives.
The SSD FTL maintains all the address table information. In Figure 12, the top row is the
logical address space and the bottom row is the physical address space. From the host's
perspective the writes and edits happen in plain sight.
Figure 12 - Address mapping between logical and physical
Due to the limitations of the SSD, it does not allow in-place overwrites of previously
written pages in a block; it instead writes to a new page in a new block, which is assigned
in the physical block (in the physical world it is the string). The old pages are not erased,
but they are marked as invalid pages. Writing and rewriting a cell exposes it to repeated
voltage stress, which deteriorates the cell walls and reduces its life span. To avoid the
deterioration of an individual SSD block or set of blocks, each rewrite follows a wear-
leveling algorithm to make sure all the cells deteriorate consistently. Also, when the
current physical block is full, another free one is assigned to the logical block. These
changes add mapping addresses to the translation table (address mapping table), which is
also stored on the SSD; storing this table's data on the SSD itself can decrease the usable
storage capacity of the device [11] [61].
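The out-of-place write behavior described above can be sketched with a toy flash translation layer. The model below is my own simplification (real FTLs also manage blocks, wear counters, and the bad-block map): it keeps a logical-to-physical page map, writes every update to a fresh physical page, and only marks the old page invalid until garbage collection reclaims it, which is why "deleted" data can linger on an SSD.

```python
# Toy flash translation layer: out-of-place writes with deferred reclamation.
class ToyFTL:
    def __init__(self, num_pages):
        self.pages = [None] * num_pages      # physical page contents
        self.invalid = set()                 # stale pages awaiting GC
        self.l2p = {}                        # logical page -> physical page
        self.next_free = 0

    def write(self, lpn, data):
        if lpn in self.l2p:
            self.invalid.add(self.l2p[lpn])  # old copy NOT erased, only marked
        ppn = self.next_free                 # always write to a fresh page
        self.next_free += 1
        self.pages[ppn] = data
        self.l2p[lpn] = ppn

    def read(self, lpn):
        return self.pages[self.l2p[lpn]]

    def garbage_collect(self):
        for ppn in self.invalid:             # only now is stale data destroyed
            self.pages[ppn] = None
        self.invalid.clear()

ftl = ToyFTL(8)
ftl.write(0, "secret v1")
ftl.write(0, "secret v2")                    # the update goes to a new page
assert ftl.read(0) == "secret v2"
assert "secret v1" in ftl.pages              # ghost data persists before GC
ftl.garbage_collect()
assert "secret v1" not in ftl.pages
```

The window between the overwrite and garbage collection is exactly where the "ghost data" security concern from Section 3.2.1 lives.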
Even with the best wear-leveling algorithm, bad blocks will be created due to the
inherent limitations of SSD writes and erases. Blocks that are no longer reliable are
called bad blocks; information about their addresses is maintained by the BBM (Bad
Block Management) map. The limitation is keeping the BBM up to date, which is
important for reliability. If the BBM is not maintained with correct information about bad
blocks, then the system will try to write to those blocks, and the data written to bad
blocks will not be reliable. Monitoring the BER (Bit Error Rate) is also important to
achieve a reliable system. ECC (Error Correction Code) is used to maintain the BER, but
the ECC engine may cause performance issues if it is not designed to operate in parallel
across multiple channels. Correcting too many errors, though, will negatively impact the
efficiency of the drive [62].
3.3.1.3 Physical Wear-Leveling Limitation
TOX (tunnel oxide) is a dielectric material, and its thickness is a limiting factor in SSDs.
Floating-gate cells will lose their charge over time through the TOX, due to the thinness
of the TOX layer. Floating-gate cells also experience wear and tear due to additional
stresses caused by voltage fluctuations. The electric charge for "program" (write)
operations is transferred through the TOX, creating oxide traps. The concentration of the
traps increases with each write and erase operation; this is called oxide stress. When
electrons leak from a floating gate, these traps are used as a path for the electrons to
travel toward the cell's channel region [63]. The number of electrons leaking through the
border of the TOX is lower than the number traveling through SILC (stress-induced
leakage current); a short distance between SILC tunneling steps increases the leakage.
The TOX thickness scalability limit is defined by several important factors: the number
of traps, SILC, and the oxide voltage of the floating-gate cell during retention. It has
been determined that the TOX thickness must be 7.5-8.0 nm [64].
The floating gate cells should be able to hold a charge for minimum of 10 years. This
was determined based on how much leakage is acceptable in a 10-year time span. The
TOX thickness requirement plays an important role in defining the acceptable leakage.
The number of cycles of program/erase operations applied to that cell, also depends on
TOX thickness. After about 10 thousand program/erase cycles the cell voltage threshold
shifts upwards which would then require more voltage to do the operations of the cell.
Physically neighboring cells share the same sensing amplifier, so a voltage shift in one
cell is applied to its neighbors as well. This can damage cells that do not require the
higher voltage. As cells go bad, they consume the over-provisioned capacity (each SSD is
manufactured with more storage than stated, at least 25% more). Over-provisioned cells
play a main role in endurance: as the pool of over-provisioned cells shrinks, the SSD's
life span decreases with it [64].
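The endurance arithmetic implied above can be made concrete. The following sketch uses illustrative numbers (a hypothetical 256GB drive and an assumed write-amplification factor), not figures measured in this work; only the 25% over-provisioning and ~10,000 P/E cycle values come from the text.

```python
# Illustrative endurance estimate: how over-provisioning extends SSD life.
# All figures are example values, not measurements from this dissertation.

def drive_endurance_tb(user_capacity_gb, overprovision_ratio, pe_cycles, write_amplification):
    """Total terabytes the host can write before the flash wears out.

    overprovision_ratio: extra physical flash beyond the stated capacity
    (the text notes drives ship with at least 25% more, i.e. 0.25).
    """
    physical_gb = user_capacity_gb * (1 + overprovision_ratio)
    raw_writes_gb = physical_gb * pe_cycles          # total program/erase budget
    host_writes_gb = raw_writes_gb / write_amplification
    return host_writes_gb / 1024                     # GB -> TB

# A 256 GB drive, 25% over-provisioned, ~10,000 P/E cycles, WA of 3:
print(round(drive_endurance_tb(256, 0.25, 10_000, 3.0)))
```

The same formula shows why a shrinking over-provisioned pool shortens life: with the ratio at 0 instead of 0.25, the writable total drops by a fifth.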
3.2.1.4 Physical Limitation of Parallelism
When I discuss parallelism in terms of SSDs, I mean parallelism of the read, write, and
erase operations. Performing these operations in parallel is faster because multiple
operations are processed at the same time. There are two main ways to increase
parallelism: increasing the number of dies per channel, or increasing the number of
channels. Increasing the dies per channel may cause channel overloading and may not help
write performance. Increasing the number of channels requires a separate Error Correction
Code for each channel, which needs dedicated SRAM (Static RAM); however, this option is
scalable and can increase performance for both read and write operations. In either case,
memory components must be coordinated to operate in parallel. The serial interface over
the flash packages can become a bottleneck for performance.
Other techniques that may improve performance through parallelism include: page size, the
page spanning process, queueing methods, ganging multiple flash packages, interleaving
between flash packages, and the background cleaning process. With the page size technique,
a smaller page size makes lookup times faster and keeps the page table smaller; however,
this may hurt performance if the data blocks are not accessed consistently. With the page
spanning technique, information can be kept on a single flash package or distributed
across multiple packages. If the data stays on the same package the result is faster
performance; otherwise the access crosses different packages, which lowers performance.
With the separate queue technique, each package handles parallel requests simultaneously,
meaning all the flash packages can be accessed at the same time. This process is scalable
and flexible, and wear leveling is maintained evenly. The drawback is that each queue
needs to maintain its own ECC and SRAM, and it complicates the FTL; handling too many
ECCs may decrease performance. In the ganging technique, the SSD algorithms combine
multiple flash packages into a group that shares the same queues, ECCs, and FTL. It
handles multi-page requests with fewer queues than the separate queue technique uses.
This reduces ECC overhead, but too few queues can cause a bottleneck on a busy system.
With interleaving in flash packages, all processes occur within a single die to speed up
the read and write operations. To avoid latency, this approach accesses all related
blocks in one place, which is faster than crossing between flash packages over a serial
connection. The drawback is that the same blocks may be written over and over; when we
focus on interleaving, the benefits of wear leveling are lost. The background cleaning
process of an SSD runs on packages when the system is not busy. When cleaning crosses
between different packages, erase blocks move from one package to another through the
serial connection; this is generally slower than cleaning within the same die, but it
maintains wear leveling. Each technique has its own pros and cons, so we need to analyze
carefully which technique is better for each workload situation [11] [65].
There is another form of parallelism that may improve performance: placing continuously
allocated data over a set of N domains, like a stripe, using a mapping policy. (A domain
is a set of flash memories that share a specific set of resources such as channels,
queues, and ECCs; it can be divided into sub-domains such as packages.) Most flash memory
packages support two-plane operations to read multiple pages from two planes in parallel,
and operations across the dies can be interleaved. Since logical pages are normally
striped over the flash memory array, reading multiple logically continuous pages in
parallel for read-ahead can be performed efficiently [11].
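The stripe-style mapping policy described above can be sketched as follows. The channel/die/plane geometry is an assumed example, not a specific device; the point is that consecutive logical pages land on different channels first, so sequential read-ahead is served in parallel.

```python
# A minimal sketch of the stripe-style mapping policy described above:
# consecutive logical pages are spread round-robin over channels, then dies,
# then planes, so neighbouring pages can be read in parallel.
# Geometry values below are illustrative assumptions.

CHANNELS, DIES_PER_CHANNEL, PLANES_PER_DIE = 4, 2, 2

def stripe(logical_page):
    """Return (channel, die, plane, row) for a logical page number."""
    channel = logical_page % CHANNELS
    die = (logical_page // CHANNELS) % DIES_PER_CHANNEL
    plane = (logical_page // (CHANNELS * DIES_PER_CHANNEL)) % PLANES_PER_DIE
    row = logical_page // (CHANNELS * DIES_PER_CHANNEL * PLANES_PER_DIE)
    return channel, die, plane, row

# Four consecutive logical pages land on four different channels,
# so a sequential read-ahead touches them all in parallel:
print([stripe(p)[0] for p in range(4)])   # channels 0, 1, 2, 3
```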
Figure 13 - Flashes and their parallel architecture
Most SSDs store two bits per MLC cell. It was theorized that storing more (3 to 4) bits
in each cell would increase performance, but research showed that reaching the voltage
threshold (Vth) required for the read, write, and erase operations took longer with 3 or
4 bits per cell than with 2. Strategy-wise, running NAND chips in parallel (Figure 13)
would give the best performance, but it has its own limitations. More chips require more
current flow, which may not be possible given the maximum allowed current. Also, reading
these strings requires thousands of reading circuits with many sensors, which makes the
process complex and more error prone [11].
3.2.1.5 Physical Limitation of Workload Management
In the current market, consumer and enterprise SSD versions are different: vendors build
them according to the anticipated workloads, with different designs and programming
depending on the workload requirements. The consumer version does not need algorithms as
complicated as the enterprise version's. In practice, the consumer version of an SSD
falls short of enterprise needs (Figure 14): it lacks algorithms for zero tolerance of
data loss, uptime reliability, endurance, performance, and error correction code
handling, and it does not need to handle multiple simultaneous I/O operations. Enterprise
SSD systems usually come as pure flash (SSD) storage or hybrid (combined HDD and SSD)
storage. Enterprise SSDs must be able to simultaneously handle workloads like file,
database, and email traffic generated by multiple users with various traffic patterns.
These different traffic patterns are multi-threaded random workloads, handled
independently using multiple initiators. Additionally, for enterprise usage an SSD must
maintain consistent I/O throughput (IOPS), integrity, and availability. The SSD
controller needs to be tested thoroughly before it can be placed into enterprise usage
handling workloads 24/7/365.
Figure 14 - Consumer Vs Enterprise SSD
In the case of power failures or other disruptions in a data center, the workloads must
be protected, so enterprise SSD systems are designed to handle those situations with the
help of ECCs and CRCs (Cyclic Redundancy Checks). Reliability of the workloads is very
important, and SSD systems are built using redundancy techniques (RAID) to cover hardware
failures. If an enterprise wants higher performance, it can replace HDD storage with SSD,
but that can become expensive. The details are discussed in the existing research
section [11].
3.2.2 Existing research to mitigate the software limitations
Some of the main limitations of SSDs are address mapping, parallelism (performance),
wear leveling, and workload management. The user does not have the option to change the
physical structure of the SSD and is limited to software approaches to mitigate the
physical limitations. This section explains the research that has been done to mitigate
these limitations. Most approaches have focused on improving processes within the FTL.
The FTL is a core part of the SSD controller that maintains sophisticated address
mappings (indirect mappings between physical block addresses and logical block
addresses), a log-like write mechanism, GC (garbage collection), wear leveling, ECC,
and over-provisioning [66].
3.2.2.1 Address Mapping
One of the FTL's main functions is to maintain a mapping table of virtual addresses to
physical addresses. Write operations can only happen when the block is in a special state
called "erased". Erase operations happen at a much coarser spatial granularity than
write operations, since page-level erases are extremely time consuming [67]. Page-level
FTL mapping provides compact and efficient utilization of each block, but it requires a
large amount of page-table space (a 32MB SRAM page table for 16GB of flash), and in some
situations the lookup time will also be higher than calculating the offset in block-level
mapping. Block-level FTL mapping uses an offset to calculate the page number, so it
requires just a fraction of the page-table space to maintain page information8. However,
looking up page information in this mapping is more time-consuming than in page-level
mapping. It also forces each logical page to be mapped to a physical page within a
particular block; as a result, garbage collection overhead grows. Still, block-level
address mapping is the better option to use because it uses far less space [68]. The two
schemes are opposite extremes in their weaknesses: page-level mapping uses more space for
the mapping table, while block-level mapping generates more garbage collection [11].
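The size trade-off above can be checked with back-of-the-envelope arithmetic, using the figures from the text (2KB pages, 128KB blocks, 16GB flash); the 4-byte mapping entry is an assumption made for illustration.

```python
# Back-of-the-envelope comparison of the two mapping granularities discussed
# above, using the figures from the text (2KB pages, 128KB blocks, 16GB flash).
# The 4-byte table entry size is an assumed value for illustration.

PAGE, BLOCK, FLASH = 2 * 1024, 128 * 1024, 16 * 1024**3
ENTRY = 4  # bytes per mapping entry (assumed)

page_table_mb = (FLASH // PAGE) * ENTRY / 1024**2    # one entry per page
block_table_mb = (FLASH // BLOCK) * ENTRY / 1024**2  # one entry per block

def block_level_lookup(lpn, block_map):
    """Block-level mapping: translate only the block, compute the page offset."""
    pages_per_block = BLOCK // PAGE                  # 64 in this geometry
    lbn, offset = divmod(lpn, pages_per_block)
    return block_map[lbn] * pages_per_block + offset

print(page_table_mb, block_table_mb)  # the page table is 64x larger
```

With these numbers the page-level table comes out to exactly the 32MB quoted in the text, while the block-level table needs only 0.5MB, at the cost of the rigid in-block offset shown in `block_level_lookup`.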
To address this issue, researchers implemented hybrid FTL9, which combines page-level
and block-level address mapping in the SRAM. In this method, part of the address table is
stored in SRAM while the rest is stored on flash. A problem arises with the hybrid FTL
approach because random writes (which need to look up addresses in both areas) induce
costly garbage collection, affecting the performance of subsequent operations. DFTL
(Demand-based Flash Translation Layer), a demand-based page-mapped FTL, addresses this
problem: it stores only the most recently used address translations in SRAM, while the
rest are stored on flash [68]. The reason for this storage strategy is that most
enterprise-scale workloads exhibit significant temporal10 locality. However, DFTL does
not exploit the spatial locality11 of workloads, which means frequent evictions will
cause extra erase operations and page-mapping lookup overhead for workloads with little
temporal locality. DFTL limits the space used to store the page table, and for
write-intensive workloads it suffers from frequent updates to the flash-resident
page-mapping table and from garbage collection [68]. The CFTL (Convertible Flash
Translation Layer) approach tries not to depend on the size of the SRAM. CFTL is a hybrid
8 Block size / Page size (128KB / 2KB = 64)
9 A hybrid FTL contains page-level and block-level mapping for different operations.
10 http://en.wikipedia.org/wiki/Locality_of_reference
11 http://en.wikipedia.org/wiki/Locality_of_reference
FTL with efficient caching strategies that can change dynamically according to data
access patterns. CFTL's concept is to manage read-intensive data with block-level
mapping and write-intensive data with page-level mapping. CFTL uses a hot-data (data
that is accessed the most by users) identification method to change the page-mapping
table. It uses a bloom-filter-based12 scheme which can capture recently and frequently
accessed information at a fine-grained level, and it considers the temporal and spatial
locality of workloads for its page-level cache. If the page size is large, the chance
that a file spans multiple pages is lower; hence, the consecutive field of CFTL will be
less effective [69]. SCFTL (Strategy Caching Flash Translation Layer) deals with large
page sizes and the page-spanning issue. SCFTL stores a page-mapping table in several TPs
(translation pages), each containing thousands of physical page numbers mapped to
consecutive logical addresses. SCFTL's PMT (page-mapping table) consists of a TPD
(translation page directory) and a CMT (cache mapping table). The TPD resides in RAM and
indexes the CMT by the most significant bits of logical addresses. The performance
degradation from offloading the mapping table is reduced by caching several mapping
entries in the CMT. The CMT integrates two spatial-locality exploitation techniques and
a customized cache replacement policy to enhance the efficiency of SCFTL. SCFTL performs
multilevel page-table lookups for address mapping: on a cache miss the request goes to
the TPs, and if a miss occurs there too, the requested mapping must be fetched from
flash [70]. CA-SSD (Content Aware SSD) is a modified FTL that adds minimal support in
the form of additional hardware for hash functions. It uses hashes as values in the
mapping table instead of page information, and it requires battery-backed RAM to store
the hashes. The drawback of the CA-SSD approach is that it depends on battery power and
extra hardware [71].
12 http://en.wikipedia.org/wiki/Bloom_filter
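The DFTL caching idea described above can be modelled in a few lines. This is a toy sketch, not DFTL's actual data structures: the SRAM-resident cache is an `OrderedDict` with a tiny assumed capacity, and a miss stands in for the costly read of a flash-resident translation page.

```python
# A toy DFTL-style cached mapping table: only the most recently used
# translations stay in "SRAM" (a small OrderedDict); on a miss the entry is
# fetched from the flash-resident table and the least recently used entry is
# evicted. Capacity and table contents are illustrative assumptions.

from collections import OrderedDict

class CachedMapping:
    def __init__(self, flash_table, capacity=4):
        self.flash_table = flash_table        # full table stored on flash
        self.cache = OrderedDict()            # SRAM-resident translations
        self.capacity = capacity
        self.misses = 0

    def translate(self, lpn):
        if lpn in self.cache:
            self.cache.move_to_end(lpn)       # refresh recency on a hit
        else:
            self.misses += 1                  # models a costly flash read
            self.cache[lpn] = self.flash_table[lpn]
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict the LRU entry
        return self.cache[lpn]

m = CachedMapping({i: 100 + i for i in range(10)}, capacity=2)
for lpn in (0, 1, 0, 2, 0):                   # temporal locality on page 0
    m.translate(lpn)
print(m.misses)
```

The access pattern above shows why temporal locality is the premise of the scheme: page 0 is translated three times but misses only once, while a workload with no repeats would miss on every access.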
Implementing encryption on top of the above approaches would be cumbersome. When
scholars studied address-mapping enhancements, they may not have considered encryption.
The existing research results may not hold with encryption, and that needs to be studied
further.
3.2.2.2 Wear Leveling
Due to the locality in most workloads, writes are often performed over a subset of
blocks (e.g., file system metadata blocks). Some flash memory blocks may be frequently
overwritten and tend to wear out earlier than other blocks [11] [65]. FTLs usually employ
a wear-leveling mechanism to 'shuffle' cold blocks with hot blocks to even out writes
over the flash memory blocks. There has been some research with variations on how to
approach wear leveling in the form of managing workloads. Researchers implemented CAFTL
(Content Aware FTL), which removes unnecessary duplicate writes to improve the
efficiency of garbage collection and wear leveling and to reduce the write traffic to
flash [72]. One previous researcher proposed SR-FTL (Smart Retirement FTL), which
addresses the wear-leveling issue by reusing flash blocks that have been cycled to a
specified worn-out threshold [73]. Another approach uses a dual-pool algorithm to store
cold data in blocks that have been identified as more worn and deliberately leave them
alone until wear leveling takes effect [74].
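The dual-pool idea referenced above can be sketched minimally: blocks whose erase count drifts far above the least-worn block are parked in a cold pool (to receive cold data and rest), and released once the gap closes. The threshold value and data structures here are simplified assumptions, not the published algorithm's exact mechanics.

```python
# A minimal sketch of the dual-pool wear-leveling idea: over-worn blocks are
# moved to a "cold pool", cold data is migrated onto them, and they are left
# alone until the erase-count gap closes. THRESHOLD is an assumed value.

THRESHOLD = 100  # max allowed erase-count gap (assumed)

def rebalance(erase_counts, cold_pool):
    """Move over-worn blocks into the cold pool; release recovered ones."""
    floor = min(erase_counts.values())
    for block, count in erase_counts.items():
        if count - floor > THRESHOLD:
            cold_pool.add(block)              # park cold data on this block
        elif block in cold_pool:
            cold_pool.discard(block)          # gap closed: usable again

counts = {"A": 500, "B": 450, "C": 380}
pool = set()
rebalance(counts, pool)
print(sorted(pool))   # block A is 120 erases above the floor
```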
With all the bodies of research on wear-leveling approaches, it remains a complex
process (full of unknown variables) and there may never be a perfect solution, because
there are no consistent workflows or predictable usage of storage. So researchers weigh
the pros and cons of various approaches to evaluate performance versus endurance versus
reliability under different workloads. But the inherent nature of an SSD is to move data
around to maintain wear leveling. In doing so, it leaves valuable data in the invisible
address space; even though it is not retrievable by normal operations, it is still
there. Ideally, purging or overwriting that address space is most desired, but it may
create a lot of wear on an SSD. Encrypting the data allows us to retain existing
wear-leveling algorithms without exposing this valuable data.
3.2.2.3 Parallelism
The bandwidth and operation rate of any given flash chip are not enough to achieve
optimal performance. An SSD has multiple flash arrays, so multiple I/O jobs can run
concurrently, improving the performance of the SSD. A single flash memory package can
only provide limited bandwidth (e.g., 32-40MB/sec). Writes are slower than reads, and
other necessary background jobs like garbage collection and wear leveling can incur
latencies as high as milliseconds [65]. These limitations are addressed by the SSD's
structure: an array of flash memory packages connected through multiple channels to
flash memory controllers provides internal parallelism. The logical block addresses
serve as the logical interface to the host system and can be striped over multiple flash
memory packages. This way data accesses can be conducted independently in parallel,
providing high aggregate bandwidth and hiding high-latency operations; that combination
can result in high performance [72]. One way to improve sequential writes is to divide
the flash array into banks, each of which can read, write, and erase independently. The
performance gains from internal parallelism depend heavily on how the SSD's internal
mapping and resource management compete for critical hardware resources. Workloads mix
reads and writes, which interfere with each other, so proper address-mapping management
and application design are critical. Most applications are designed for HDD storage, and
executing them against an SSD may not be optimal. The critical issues in SSD parallelism
include: the thin interface between the storage device and the host, workload access
patterns, asynchronous background operations generated by reads and writes, the effect
on read-ahead, ill-mapped data layout, and application design [75]. There are different
levels of parallelism in an SSD: channel, package, die, and plane. Previous research [75]
concluded that read-ahead is not affected by access patterns in MLC-SSDs, while writes
are strongly correlated with access patterns. Small random writes suffer from high
latencies and high interference between reads and writes [75]. Adding a disk cache
helped improve performance for read and write operations, but background operations like
erases can interfere with reads and writes, and internal fragmentation becomes too high
under excessive random writes. Studies on the four levels of parallelism (channel, chip,
die, and plane) have shown a direct impact on SSD performance, but they provide limited
information, considering that the SSD's internal structure is a black box. The advanced
commands utilize only the die and plane levels of parallelism; the studies explore how
allocation schemes can set the priority order of the multiple levels of parallelism for
different types of application loads. Channel-level parallelism should be given the
highest priority among the four levels, and it was observed that chip-level parallelism
keeps chips very busy; a service request can only be handled when chips are idle [75].
Parallelism has the biggest impact on SSD performance. The advantages of existing
parallelism can remain viable even with the addition of encryption methodologies for
storage.
3.2.2.4 Workload Management Integrated with SSD
Performance is highly workload-dependent. Well-designed systems, databases, and
applications improve performance. The following are some classic examples of integrating
SSDs into systems to achieve better performance [11]. Integrating an SSD into an existing
system is a complex process. Scalability (replacing 1GB of HDD with 1GB of SSD) is
limited by cost effectiveness, because the gains in performance do not always justify
the added expense. HybridDyn (an integration of HDD and SSD storage) is an innovative
storage design that is cost-effective and improves performance and endurance. It handles
incoming workloads by dynamically partitioning and distributing them between SSD and
HDD; this design showed better performance than HDD alone [76]. Another research
approach is an LSM-tree-based store with an open-channel SSD to utilize channel-level
parallelism. LevelDB (a fast key-value storage library in an LSM-tree-based store) is
extended to be multi-threaded to fully utilize channel-level parallelism, with
evaluation of optimal I/O request scheduling and dispatching. Evaluating the impact of
channel-level parallelism on I/O performance showed that it outperforms conventional
SSDs [13]. Another system, Libra, tracks the I/O consumption of each tenant; it
recognizes the application's dynamic I/O usage profile and provides I/O resources
accordingly. Libra's VOP (virtual I/O operations) captures the non-linear relationship
between SSD I/O bandwidth and I/O operation throughput while considering the disk-I/O
(disk Input/Output) cost model [77]. Hadoop13 workloads showed a performance increase
over HDD alone when an SSD was integrated into the underlying storage system.
The research showed that workload performance always improved when adding an SSD, or
with SSD as the sole storage. SSDs are faster than HDDs, so adding them to the storage
system was expected to improve performance. But in some cases applications are not able
to utilize SSD performance fully due to the nature of write guarantees. This research
studies the performance impact of different types of workloads under different
encryption methodologies.
13 http://www.sandisk.com/assets/docs/increasing-hadoop-performance-with-sandisk-ssds-whitepaper.pdf
CHAPTER 4
CLOUD STORAGE ENCRYPTION ANALYSIS
4.1 Contributions
In this chapter I show how SSD storage performance is affected by storage type
(t2.micro versus i2.xlarge) and by software encryption methods. I show that both
aspects impose performance penalties on workloads.
4.2 Measurement Environment
Each Amazon EC2 (Elastic Compute Cloud) instance can access disk storage from
disks that are physically attached to the host computer. This disk storage is referred to as
an instance store or EBS (Elastic Block Store) volumes. An instance store provides
temporary block-level storage for use with an instance. The size of an Amazon instance
store ranges from 8GB to 48TB, and varies by instance type (i.e., larger instance types
have larger instance stores) for HDD. Using regular SATA SSD, the storage ranges from
8GB to 6.4TB, if the storage type is NVMe (Non-Volatile Memory express) SSD then the
storage ranges from 8GB to 16TB.
Amazon EBS provides two volume types: Standard volumes and Provisioned IOPS
volumes, which differ in performance characteristics and price. Standard volumes offer
storage for applications with moderate or burst I/O requirements. These volumes deliver
approximately 100 IOPS on average but can burst up to hundreds of IOPS. Provisioned
IOPS volumes offer storage with consistent and low-latency performance, which allows
users to predictably scale to thousands of I/O operations per second per Amazon EC2
instance. These volume-types are designed for applications with I/O-intensive
workloads. Backed by SSDs, Provisioned IOPS volumes support up to 30 IOPS per GB,
which enables a system to be provisioned up to a maximum of 4,000 IOPS per volume.
While it is possible to stripe multiple volumes together to achieve up to 48,000 IOPS
when attached to larger EC2 instances, in theory such a stripe behaves like regular SSD
disk volumes, so we did not evaluate this type of VM. When attached to an EBS-optimized
instance, Provisioned IOPS volumes are designed to deliver consistent performance
within 10 percent of the guaranteed rate throughput (Provisioned IOPS) 99.9% of the
time. In addition, the delivered IOPS rate depends on the block size of the various reads
and writes: Amazon Provisioned IOPS volumes process reads and writes in I/O block sizes
of 16KB or less, and for I/O sizes above 16KB the IOPS consumed increase linearly. A
significant amount of data was produced during the experiments and it was used to
analyze the main concepts about SSD performance variations with different variables
including encryption methods.
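The provisioning rule quoted above (up to 30 IOPS per GB, capped at 4,000 IOPS per volume) reduces to a one-line helper; the figures are those stated in the text, and the helper itself is only an illustration, not an AWS API.

```python
# The Provisioned IOPS sizing rule stated in the text: up to 30 IOPS per GB,
# capped at 4,000 IOPS per volume. Purely illustrative arithmetic.

def provisioned_iops(volume_gb, requested_iops):
    """Largest IOPS this volume size could be provisioned with."""
    return min(requested_iops, volume_gb * 30, 4_000)

print(provisioned_iops(100, 3_500))   # 100 GB supports at most 30 * 100 IOPS
print(provisioned_iops(200, 10_000))  # capped at the 4,000 per-volume limit
```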
The experiments in this study have been conducted on three different 64-bit VM
(Virtual Machine) instances in Amazon EC2, the first one was an Amazon Linux AMI
(HVM) 2014.03.1, the remaining two VMs were Amazon Ubuntu Server 16.04 LTS
(HVM). The first VM is an instance store (i2.xlarge) of an 800GB SSD, which can
provide up to 36,000 IOPS. The second VM (standard t2.micro) is an 8GB instance store
with 3,000 IOPS. And the third VM (standard t2.micro) is an 8GB encrypted EBS
General Purpose (SSD) Volume Type with 3,000 IOPS.
The first VM is drastically different from the other two (in memory, vCPUs, and
processor model); I chose these VMs to analyze their unique SSD characteristics. The
second and third VMs are similar (having the same ECUs, 1GB memory, vCPUs (1), and
processor (2.5 GHz, Intel Xeon family)); the only difference between the two is that
one is a standard instance-store SSD without encryption, while the other VM has an
attached EBS SSD volume with encryption.
4.2.1.1 Selection of Encryption Methods
I selected the following encryption methods to compare encrypted SSDs against regular,
unencrypted SSDs. The following describes each method at a very high level, including
the type of algorithm I used in these evaluations.
Dm-crypt:
Dm-crypt is a disk encryption method compatible with Linux kernel version 2.6 or later.
It uses kernel API routines, and devices are mapped to encrypted containers using the
device mapper [48]. This API uses the AES-256 cryptographic method along with other
methods. Dm-crypt uses Linux Unified Key Setup (LUKS), developed by Clemens Fruhwirth in
2004 [49], to create encrypted containers that are independent of outside platforms.
Using this method, a user can even encrypt the root device. A passphrase is required to
create encrypted containers.
There has been some research on the drawbacks of dm-crypt. For example, it has been
discovered that hackers can sidestep the passphrase to access encrypted containers by
hitting the 'Enter' key a couple of times. They can also delete the containers, because
deleting them does not require the passphrase. Using disk commands on the system, an
intruder can determine critical components of the hidden containers relatively
easily [50] [51].
BestCrypt:
BestCrypt is encryption software that creates encrypted volumes at the OS level; these
volumes are mounted as file systems and used to store secure data protected by an
encryption password. I applied the AES encryption algorithm option to gather performance
statistics [52].
Self-Encrypting Drive (SED):
When Full Drive Encryption (FDE) is applied to an SSD, it is called a Self-Encrypting
Drive (SED). FDE was developed in 2009; it is a literal encryption of the entire system,
including all partitions, system files, and the operating system. This encryption method
assigns the work to the hardware component of the drive, which enhances security by
utilizing the Opal Storage Specification (a set of specification features for SEDs). An
SED needs a master password for the drive and a user password for each user; these are
stored in the BIOS and handled by the hard disk controller. SED uses AES-128 and
AES-256.
Researchers have found the following vulnerabilities of this method: Hot Plug Attack,
Hot Unplug Attack, Forced Restart Attack, and Key Capture Attack. They have also
shown that attackers can bypass the encryption and access data; this undermines the
purpose of securing the data [53].
4.2.1.2 Experimental Tools and Workloads
To evaluate the internal parallelism of SSDs by producing the necessary workloads for
this research, the FIO (Flexible I/O) synthetic benchmark was used14. FIO is a tool that
generates multi-threaded workloads with different configuration variables to fully
utilize the hardware, such as the read/write ratio, the block size, and the number of
concurrent jobs.
14 http://freecode.com/projects/fio
This process produces a report containing the bandwidth, IOPS, latency, and many other
measurements as performance metrics of specific I/O workloads for each SSD storage
device over a given period of time (60 seconds in this research study). A sample FIO
command is provided below:
fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw \
    --size=1024m --refill_buffers --norandommap --randrepeat=0 \
    --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 \
    --runtime=60 --group_reporting --name=4krandreadwrite60j8 \
    --output=/home/output/4kdmcryptrandreadwrite60j8
Sample FIO Command
In the sample FIO command, the file size to be written is 1024MB (size=1024m), in block
sizes of 4K (bs=4k), split between 60 percent random reads (rwmixread=60) and 40 percent
random writes (= 100 - 60% reads), with 8 jobs (numjobs=8) running in parallel for 60
seconds (runtime=60).
Experiments were executed independently on each virtual machine to fully utilize the
SSD's parallelism capability while varying the block size (4k, 8k, 16k, 32k, 64k, and
128k), the number of parallel jobs (8), and the random read/write ratio (100 percent
reads, 100 percent writes, and 60/40 read/write workloads). These factors were applied
to an unencrypted SSD, two different software-based encryption methods on SSD, and one
fully Amazon-encrypted SSD. Each experiment was executed for a total of 60 seconds
utilizing the FIO benchmark, version 2.1.7.
The first FIO command executed at 100% writes, the next at a 60/40 read/write ratio, and
the last at 100% reads. Each was executed for six different block sizes with 8 jobs. To
emulate an enterprise workload environment, I used random read/write workloads. The
research examines how these workloads are affected by the encryption method and its
implementation. A queue depth of eight was selected as sufficient, because only a
handful of earlier trials utilized a depth past eight.
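The experiment matrix just described can be generated programmatically: six block sizes crossed with three read/write mixes at a fixed 8 jobs. The flag names mirror the sample FIO command shown earlier; the run names are placeholders, and this generator is an illustration of the design, not a script used in the study.

```python
# The experiment matrix described above: six block sizes x three read/write
# mixes, at a fixed 8 jobs and 60-second runtime. Flag names follow the
# sample FIO command; run names here are placeholder assumptions.

BLOCK_SIZES = ["4k", "8k", "16k", "32k", "64k", "128k"]
READ_MIXES = [0, 60, 100]           # % reads: pure write, 60/40 mix, pure read

def fio_commands(runtime=60, numjobs=8):
    for bs in BLOCK_SIZES:
        for mix in READ_MIXES:
            yield (f"fio --rw=randrw --bs={bs} --rwmixread={mix} "
                   f"--numjobs={numjobs} --runtime={runtime} "
                   f"--ioengine=libaio --direct=1 --group_reporting "
                   f"--name=run_{bs}_r{mix}")

cmds = list(fio_commands())
print(len(cmds))   # 6 block sizes x 3 mixes = 18 runs per storage target
```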
4.2.1.3 SSD performance without Encryption
Having completed lengthy experiments, and equipped with knowledge about the internal
structure of SSDs and background information on the storage options within Amazon EC2,
I am now positioned to evaluate the experimental results and answer several related
questions.
I created different types of VM instances using SSDs with different IOPS ranges. This
research considered all of them to understand the internal characteristics of SSDs.
Baseline metrics were created from those experiments for performance comparisons with
various encryption implementations.
4.2.1.3.1 Performance differences between Amazon EC2 VMs
There were significant differences in the performance between the two Amazon EC2
instances. While this was expected, it was interesting to validate the actual performance
characteristics of the two different instances versus the specs that Amazon provided about
their VMs.
In Graph 1 and Graph 4, the performance of the i2.xlarge instance consistently
outperformed the t2.micro instance in all experimental runs with all block sizes. In
addition, this difference typically increases as the read/write ratio transitions closer
to 100 percent reads, regardless of whether evaluating sequential or random
reads/writes. This is likely because the instance-store volume is physically attached to
the computer on which the EC2 instance is running. Our experiments focused on random
reads and writes. One limitation of this comparison is that the total random reads and
writes were limited to 35,000 IOPS on the i2.xlarge instance and only 3,000 IOPS on the
t2.micro instance. This prompted us to compare the t2.micro instance store against the
existing EBS storage volume to perform a more in-depth comparison of the two storage
mechanisms. In Section 5.4 I will discuss the results.
4.2.1.3.2 Did various block sizes significantly affect I/O throughput?
I observed in both Amazon EC2 instances that as the block size increases, the number of
IOPS decreases, along with the execution time to complete the required reading and
writing of data by FIO. This is most likely because as the block size increases, less
frequent management overhead is required to write the larger blocks. In addition, as
expected with increased block sizes, the reading or writing of data is completed in
increasingly larger chunks. The metrics in Graph 1 plot the ratio of reads and writes
versus the number of IOPS completed at various block sizes. As the block size increased
the IOPS decreased, with one exception: at 100% reads, 16K outperformed 8K.
[Figure: i2.xlarge block size can affect number of IOPS; 4k/8k/16k random reads and random writes; x-axis Read Percentage, y-axis IOPS.]
Graph 1 - IOPS Vs Block Size
4.2.1.3.3 Did various levels of parallelism significantly affect I/O throughput?
Experiments were performed consisting of 8, 16, and 32 threads, or jobs, operating in
parallel on all block sizes. As seen in Graph 2 (using a block size of 8k), I did not
see any significant improvement between 8, 16, or 32 threads; instead, I saw a drop in
IOPS for the 16-thread and 32-thread simulations. This may indicate the SSD is
saturated after 8 threads and cannot provide any increase in performance through
parallelism. The main observation is that 8 threads, or jobs, saturated the SSD's
parallelism, and increasing the number of jobs did not help.
[Figure: i2.xlarge number of jobs vs IOPS; 8, 16, and 32 jobs for random reads, writes, and read/writes; x-axis Read Percentage, y-axis KB/Sec.]
Graph 2 - Parallelism Vs Throughput
4.2.1.3.4 Did random and sequential jobs have significantly different IOPS?
In Graph 3, I observed there was generally no significant difference between the observed
behavior of sequential reads and writes versus that of random reads and writes. The
i2.xlarge instance has been optimized by Amazon for random reads and writes; random
access even performed better than the corresponding sequential reads and writes. This
occurs around 55 percent reads and 45 percent writes and continues until about 90 percent
reads, where sequential access outperforms random reads/writes again. The results showed
that 100 percent sequential writes were significantly slower than the equivalent random
writes. I hypothesize this is related to garbage collection or to the FTL handling the
change of write mode. However, there is no such gain for random reads/writes on
the t2.micro machine. As can be seen in Graph 3, the total random reads/writes are
capped around 3,000 IOPS for 4k or 8k block sizes. This performance is expected per the
performance metrics provisioned by Amazon for the EBS volume attached to this
instance. Additionally, at no time do random read/write operations outperform
sequential read/write operations. This type of performance is more in line with what is
expected from a traditional SSD.
[Figure: i2.xlarge random vs sequential; 8 jobs for random and sequential reads, writes, and read/writes; x-axis Read Percentage, y-axis KB/Sec.]
Graph 3 - Random Versus Sequential Operations
4.2.1.3.5 SSD Random Workload Analysis on t2.micro VM without encryption
From Sections 5.3.1 to 5.3.4, I observed that random and sequential operations are
very close in IOPS. Amazon provisions different numbers of IOPS for the various types of
SSD VMs. I chose an Amazon t2.micro (16.04 LTS HVM, SSD volume type, instance
store in EC2) as the VM, and I used block sizes of 4k, 8k, 16k, 32k, 64k, and
128k on random reads, writes, and read/writes to establish baseline metrics. These
metrics will be used for comparison with the workloads of the different encryption
methods. These experiments used random workloads of 100 percent reads, 100 percent
writes, and 60/40 read/writes (mixed).
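The baseline runs described above can be expressed as an FIO job file. The sketch below is a hypothetical configuration approximating those parameters (8 jobs, random 60/40 mixed workload); the device path, size, and runtime are illustrative assumptions, not the exact values used in the experiments.

```ini
; Hypothetical FIO job approximating the baseline runs.
[global]
; asynchronous I/O on Linux, bypassing the page cache
ioengine=libaio
direct=1
; block size: varied over 4k-128k in the experiments
bs=4k
; 8 threads/jobs, per Section 5.3.3
numjobs=8
runtime=60
time_based=1
group_reporting=1

[rand-mixed-60-40]
; random mixed workload, 60 percent reads / 40 percent writes
rw=randrw
rwmixread=60
; illustrative target path and size
filename=/dev/xvdb
size=1G
```

Sweeping `bs` over the block sizes and `rwmixread` over 0-100 reproduces the full grid of workloads.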
[Figure: IOPS and random workloads without encryption; read, write, and mixed IOPS; x-axis Block Size, y-axis IOPS.]
Graph 4 - t2.micro Block Size Versus IOPS
[Figure: block size and unencrypted SSD performance; read, write, and mixed IO; x-axis Block Size, y-axis KB/Sec.]
Graph 5 - t2.micro Block Size Versus KB/Sec
In Graph 4 and Graph 5, workloads of 100 percent reads, 100 percent writes, and
60/40 read/writes showed similar IOPS (the maximum IOPS Amazon provisioned) for 4k,
8k, and 16k block sizes. At 32k the IOPS decreased by about 40%, at 64k by about 60%,
and at 128k by about 85% relative to the 4k block size IOPS; but as seen in
Graph 5, the overall rate of reading and writing data to the disk increased because of the
increased block size. I hypothesize that this is related to per-block overhead, but the
increase is not proportional to the block size. Another important SSD characteristic
I observed was that reads were faster than writes, as shown in Graph 5 for the block sizes
less impacted by Amazon's maximum provisioned IOPS (32k, 64k, and 128k). This type
of performance is more in line with what is expected from a traditional
SSD. Going back to Graph 4, the evidence of Amazon's IOPS capping is clear at 4k, 8k,
and 16k, plus 32k mixed, where IOPS hovers around 3,110. I used these metrics as the
baseline for future comparisons.
4.2.1.4 SSD performance with Encryption
Having established a set of baseline metrics in Section 5.3, I then ran the same experiments
with various encryption methods, block sizes, and workloads. I chose not to vary the
number of jobs based on the data described in Section 5.3.3, which showed little
difference between 8 jobs versus 16 or 32 jobs. So, I set the number of jobs/threads to 8
for all block sizes and all workloads. These experiments were conducted with two different
software encryption methods (BestCrypt and dm-crypt) and one SSD encrypted by
Amazon. Amazon EBS volumes are encrypted with a unique 256-bit key using the AES-
256 algorithm. Also, when you snapshot these volumes (a way of cloning storage
volumes), they share the same key15. Customers maintain these keys using their own key
management infrastructure.
To execute the experiments, I created a working environment by creating a VM in
Amazon EC2 and installing encryption software and FIO benchmarking software. I used
the same process for both software-based encryption methods. For the encrypted SSD, I
created a VM and attached an encrypted EBS SSD volume to it. The following graphs
(Graphs 6-16) show the different encryption methods and their performance patterns
for the different block sizes versus IOPS and KB/Sec.
15 http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html
4.2.1.4.1 Did various block sizes significantly affect IOPS?
I observed that as the block size increases, the number of IOPS decreases, along with
the execution time to complete the required reading and writing of data by FIO, in all
of the software encryption methods. The performance metrics in Graphs 6, 7, and 8
showed a similar decrease in IOPS for all encryption methods.
[Figure: encrypted SSD block size versus IOPS; read, write, and mixed IOPS; x-axis Block Size, y-axis IOPS.]
Graph 6 - Encrypted SSD Block Size Versus IOPS
[Figure: BestCrypt block size versus IOPS; read, write, and mixed IOPS; x-axis Block Size, y-axis IOPS.]
Graph 7 - BestCrypt Block Size Vs IOPS
[Figure: dm-crypt block size versus IOPS; read, write, and mixed IOPS; x-axis Block Size, y-axis IOPS.]
Graph 8 - Dm-crypt Block Size Vs IOPS
One of the main characteristics of SSDs is that reads outperform writes, but when using
encryption software the results were the opposite: writes performed better than reads
(compare Graph 4 with Graph 7 and Graph 8). This is a very significant discovery about
encryption on SSDs. The finding may indicate that when using software-based
encryption on an SSD, the decryption (read) process takes more time than the encryption
(write) process.
4.2.1.4.2 Did various block sizes significantly affect performance throughput?
In the previous section, I observed that IOPS decreased as block size increased in all
encryption methods.
[Figure: encrypted EBS SSD volume block size versus throughput; read, write, and mixed IO; x-axis Block Size, y-axis KB/Sec.]
Graph 9 - Encrypted EBS SSD Volume Block Size Versus throughput
[Figure: BestCrypt block size versus throughput; read, write, and mixed IO; x-axis Block Size, y-axis KB/Sec.]
Graph 10 - BestCrypt Block Size Versus Throughput
[Figure: dm-crypt block size versus throughput; read, write, and mixed IO; x-axis Block Size, y-axis KB/Sec.]
Graph 11 - Dm-Crypt Block Size Versus Throughput
In Graph 9, I observed no significant difference between encrypted reads and writes
versus unencrypted SSD throughput. Also, I observed that at 32k and higher there is no
significant throughput increase. In Graph 10, using the software encryption
method BestCrypt, I observed that the 32k block size had the lowest performance of all
block sizes. In Graph 11, using software dm-crypt encryption, I observed that the
throughput increases roughly linearly as the block size increases.
4.2.1.4.3 How did the various encryption methods affect performance throughput?
The experiments showed that there is a significant difference between the throughput of
the software encryption methods versus the unencrypted SSD or the encrypted SSD. I ran
workload performance experiments for all block sizes (4k, 8k, 16k, 32k, 64k, and 128k).
In Graph 12 and Graph 13, I observed that the encrypted SSD outperformed the software-
based encryption methods (the graphs show only the 4k block size). In Graph 12, the
encrypted volume showed very similar performance to the regular unencrypted SSD.
[Figure: encryption methods (without encryption, encrypted EBS SSD, dm-crypt, BestCrypt) versus read, write, and mixed IOPS; y-axis IOPS.]
Graph 12 - Encryption Methods versus IOPS
[Figure: encryption methods (without encryption, encrypted EBS SSD, dm-crypt, BestCrypt) versus read, write, and mixed throughput; y-axis KB/Sec.]
Graph 13 - Encryption Methods versus Throughput
4.2.1.4.4 Reads, Writes, and Mixed Workloads Versus Block Sizes
Graph 14, Graph 15, and Graph 16 indicate that as the block size increased, the IOPS
decreased. For the software encryption methods, block sizes of 64k and 128k had lower
performance than 4k, 8k, 16k, and 32k. For the 128k block size with the BestCrypt
encryption method, 100% reads showed such low performance that it measured in just
a single digit (only 9 IOPS); this was by far the lowest performance of all the encryption
methods.
[Figure: read workloads, block sizes versus IOPS, for without encryption, encrypted EBS SSD, dm-crypt, and BestCrypt; x-axis Block Size, y-axis IOPS.]
Graph 14 - Read workloads for various Block Sizes
[Figure: write workloads, block sizes versus IOPS, for without encryption, encrypted EBS SSD, dm-crypt, and BestCrypt; x-axis Block Size, y-axis IOPS.]
Graph 15 - Write workloads IOPS for various Block Sizes
[Figure: mixed workloads, block sizes versus IOPS, for without encryption, encrypted EBS SSD, dm-crypt, and BestCrypt; x-axis Block Size, y-axis IOPS.]
Graph 16 - Mixed Workloads IOPS for Various Block Sizes
4.3 Fully Homomorphic Encryption Limitations
4.3.1 Possible fully homomorphic encryption method:
In 2009 Craig Gentry introduced the first plausible fully homomorphic encryption
method, supporting circuits of arbitrary depth (composed of additions and multiplications)
on encrypted data. This research provided the blueprint of FHE; it starts from what is
referred to as SwHE (Somewhat Homomorphic Encryption), which uses a limited-depth
circuit of additions and multiplications for evaluation [2]. This research helped develop
encryption methods that are lattice-based, integer-based, LWE (learning-with-errors), and
RLWE (ring-learning-with-errors). Further research on SwHE and FHE showed promise
for potential usage in cloud computing environments and other MPC (multi-party
computation) settings [55] [56]. In the Gentry method, the lattice-based scheme takes too
long to generate the key (ranging from 2.5 seconds to 2.2 hours), the implementation is
complex, noise can exceed thresholds, and large key sizes (17MB to 2.25GB) require high
memory resources; all of this becomes impractical in real systems [3]. Fully Homomorphic
Encryption (FHE) is on the “bleeding edge” of encryption technology, but currently
there is no FHE available for real-time applications [58]. There is still a lot of work that
needs to be done to have a “production ready” version of FHE.
Gentry defines the algorithms as follows: a public-key encryption scheme ε consists of
three algorithms, KeyGen_ε, Encrypt_ε, and Decrypt_ε. KeyGen_ε takes a security
parameter λ as input, is randomized, and outputs a public key pk and a secret key sk. The
plaintext space P and ciphertext space C are defined by pk. Gentry’s encryption algorithm
Encrypt_ε is also randomized; it takes pk and a plaintext π ∈ P as input and outputs a
ciphertext ψ ∈ C. His decryption algorithm Decrypt_ε takes sk and ψ as input and
outputs the plaintext π. The computation of all of these algorithms must be polynomial in
λ. Correctness is:
if (sk, pk) ←R KeyGen_ε(λ), π ∈ P, and ψ ←R Encrypt_ε(pk, π), then Decrypt_ε(sk, ψ) → π.
A homomorphic encryption scheme additionally has a (possibly randomized) efficient
algorithm Evaluate_ε. It takes as input the public key pk, a circuit C from a permitted set
C_ε of circuits, and a tuple of ciphertexts Ψ = ⟨ψ1, ..., ψt⟩ for the input wires of C; it
outputs a ciphertext ψ ∈ C. Informally, the functionality we want from Evaluate_ε is
that, if each ψi “encrypts πi” under pk, then ψ ← Evaluate_ε(pk, C, Ψ) “encrypts
C(π1, ..., πt)” under pk. For Evaluate_ε, the minimal requirement is correctness. The
following are a couple of different ways to formalize Gentry’s homomorphic encryption
methods. Gentry defined them as follows.
“Definition 1: (Correctness of Homomorphic Encryption). Gentry says a homomorphic
encryption scheme ε is correct for circuits in C_ε if, for any key-pair (sk, pk) output by
KeyGen_ε(λ), any circuit C ∈ C_ε, any plaintexts π1, ..., πt, and any ciphertexts Ψ =
⟨ψ1, ..., ψt⟩ with ψi ← Encrypt_ε(pk, πi), it is the case that: if ψ ← Evaluate_ε(pk, C, Ψ),
then Decrypt_ε(sk, ψ) → C(π1, ..., πt), except with negligible probability over the
random coins in Evaluate_ε.
By itself, mere correctness fails to exclude trivial schemes. Suppose we define
Evaluate_ε(pk, C, Ψ) to just output (C, Ψ) without “processing” the circuit or ciphertexts
at all, and Decrypt_ε to decrypt the component ciphertexts and apply C to the results.
Definition 2: (Compact Homomorphic Encryption). We say that a homomorphic
encryption scheme E is compact if there is a polynomial f such that, for every value of the
security parameter λ, E’s decryption algorithm can be expressed as a circuit DE of size at
most f(λ).
Definition 3: (“Compactly Evaluates”). We say that a homomorphic encryption scheme
E “compactly evaluates” circuits in CE if E is compact and correct for circuits in CE.
Definition 4: (Fully Homomorphic Encryption). We say that a homomorphic encryption
scheme E is fully homomorphic if it compactly evaluates all circuits.
Definition 5: (Leveled Fully Homomorphic Encryption). We say that a family of
homomorphic encryption schemes {E(d) : d ∈ Z+} is leveled fully homomorphic if, for
all d ∈ Z+, they all use the same decryption circuit, E(d) compactly evaluates all circuits
of depth at most d (that use some specified set of gates), and the computational
complexity of E(d)’s algorithms is polynomial in λ, d, and (in the case of Evaluate_E) the
size of the circuit C.
Definition 6: ((Statistical) Circuit Private Homomorphic Encryption). We say that a
homomorphic encryption scheme E is circuit-private for circuits in CE if, for any keypair
(sk, pk) output by KeyGenE(λ), any circuit C ∈ CE, and any fixed ciphertexts Ψ =
⟨ψ1, ..., ψt⟩ that are in the image of EncryptE for plaintexts π1, ..., πt, the following
distributions (over the random coins in EncryptE and Evaluate_E) are (statistically)
indistinguishable:
EncryptE(pk, C(π1, ..., πt)) ≈ Evaluate_E(pk, C, Ψ)
The obvious correctness condition must still hold.
Definition 7: (Leveled Circuit Private Homomorphic Encryption). Like circuit private
homomorphic encryption, except that there can be a different distribution associated to
each level, and the distributions only need to be equivalent if they are associated to the
same level (in the circuit) [3].”
All the definitions above show, at a very high level, Gentry’s work and the thinking
behind homomorphic encryption. Gentry’s scheme was an asymmetric encryption scheme,
and his work was revolutionary in bringing homomorphic encryption back to the world’s
attention; for that reason all of the definitions are mentioned in this thesis, but the details
of his work are out of the scope of this research. The mathematics used in his scheme has
some shortcomings because the primitive itself is not homomorphic; rather, his circuit-
computation algorithm provides the homomorphic properties. His algorithm organizes the
data and manipulates the circuits to achieve the computations on encrypted data.
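To make the KeyGen/Encrypt/Decrypt/Evaluate interface above concrete, the following is a toy, insecure sketch in the spirit of the integer-based SwHE line mentioned earlier; it is not Gentry's lattice construction, and all parameter sizes are illustrative assumptions. It only shows how Evaluate can satisfy correctness (Definition 1) for shallow circuits of additions and multiplications on encrypted bits.

```python
import random

def keygen(lam=40):
    """KeyGen: the secret key is a random odd integer p (toy, symmetric)."""
    return random.randrange(2**lam, 2**(lam + 1)) | 1  # force p odd

def encrypt(p, m):
    """Encrypt a bit m as c = m + 2r + p*q, with small noise 2r << p."""
    assert m in (0, 1)
    r = random.randrange(0, 2**8)    # noise; must stay well below p/2
    q = random.randrange(1, 2**20)
    return m + 2 * r + p * q

def decrypt(p, c):
    """Decrypt: mod p strips the p*q term, mod 2 strips the noise 2r."""
    return (c % p) % 2

def evaluate(circuit, ciphertexts):
    """Evaluate: the circuit acts directly on ciphertexts.
    Ciphertext addition decrypts to XOR; ciphertext product to AND."""
    return circuit(*ciphertexts)

p = keygen()
c1, c2 = encrypt(p, 1), encrypt(p, 0)

# Correctness in the sense of Definition 1 for a shallow circuit:
xor = evaluate(lambda a, b: a + b, (c1, c2))
and_ = evaluate(lambda a, b: a * b, (c1, c2))
print(decrypt(p, xor))   # 1 XOR 0 = 1
print(decrypt(p, and_))  # 1 AND 0 = 0
```

Each homomorphic operation roughly doubles (addition) or squares (multiplication) the noise, which is exactly why such schemes are only "somewhat" homomorphic until a bootstrapping step is added.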
4.3.2 FHE with Vector Space
The first simple and efficient FHE cipher using multivectors, called EDCHE
(Enhanced Data-Centric Homomorphic Encryption), was presented by DaSilva. It uses
geometric algebra and multivector spaces Rn, where n is 2 or 3; these represent the 2D
and 3D vector spaces respectively. When using a 3D vector space, it will generate an
encrypted file that is 8 to 10 times the size of the original plaintext file [25]. This makes
it hard for users to justify this method for their applications. When creating the most
robust secure algorithms, the cryptographer needs to keep in mind that the algorithms
should be simple, efficient, secure, practical, and able to accommodate computer
resources. This gives an opportunity to develop a new FHE cipher that fulfills these
requirements.
4.3.3 Previous homomorphic encryption using multivector technique.
[Figure: key size versus time in seconds on a regular SSD; AES-Crypt and xlg-Crypt encryption and decryption.]
Graph 17 - Multivector Based Homomorphic Encryption
In this method I used different key sizes, ranging from 64 bits to 1024 bits, for
encryption and decryption. I observed that, when comparing performance in terms of
time, xlg-crypt underperformed AES-crypt for full file encryption and decryption.
Xlg separates itself from AES because it is fully homomorphic encryption and does
not need to encrypt/decrypt an entire file on every update. Due to this unique
characteristic, xlg-crypt will outperform AES-crypt on smaller updates.
Even though xlg-crypt takes more time to encrypt than AES-crypt, it offers additional
security features. For example, the unique nature of xlg allows a client to work with all,
some, or even none of the encrypted files from the server. This allows the system to only
expose necessary parts of encrypted files to the client, keeping the rest of the files
encrypted and secured on the server. When using symmetric encryption methods, during
any update process the decrypted (plaintext) version of the file exists until it is deleted.
As xlg-crypt is homomorphic encryption, I do not need to have any plaintext file on the
VM due to this characteristic.
[Figure: encrypted file size in MB on a regular SSD VM; AES-Crypt versus xlg-Crypt; x-axis key size.]
Graph 18 - Multivector based encrypted file sizes
I also observed that when I encrypt a 100 MB file, AES-crypt creates a 101 MB
encrypted file while xlg-crypt creates an 801 MB encrypted file. In general, I have
observed that the xlg-crypt encrypted file is about 8 times the size of the original
plaintext file. This is due to xlg-crypt’s mathematical calculations, which create a bigger
encrypted file; because its output file is 8 times larger, it also takes a longer time to
decrypt. Xlg uses a 3-dimensional and infinite field space, which causes this growth.
Even though it takes more space, each update does not require a rewrite of cells in the
SSD, which is more aligned with endurance concerns on SSD storage devices.
CHAPTER 5
5 RVTHE
RVTHE (Reduced Vector Technique Homomorphic Encryption) is a new symmetric
homomorphic encryption method, and this chapter describes its design and homomorphic
properties.
5.1 Contributions
In this chapter I will present a new symmetric homomorphic encryption method
“Reduced Vector Technique Homomorphic Encryption”. This section will discuss its
design, mathematical implementation, and homomorphic properties.
5.2 Design of RVTHE
Design of RVTHE depends on Versors. Mathematically a Versor can be represented as:
A=a1 a2 a3 ...an
In the design of RVTHE we will have n−1 key vectors and one data vector. For example, if n=5 then there would be 4 key vectors and 1 data vector.
Vectors   Example 1   Example 2
a1        Key3        Data
a2        Key1        Key1
a3        Data        Key4
a4        Key2        Key2
a5        Key4        Key3
Table 2 - Key and data location in versors
The location of each key and of the data is flexible; their locations are determined by the designer.
To reduce the size of the generated ciphertext, we must choose two-term vectors.
5.2.1 RVTHE Encryption and Decryption
Each key is a randomly generated number that is converted into base 10. We divide
each key and the data into two parts and use them as the two terms (coefficients) of each
vector.
5.2.2 Encryption of RVTHE
Once we have designed the key and data locations, we perform a geometric product
operation on the first two vectors, which generates an intermediate ciphertext.
Next, we perform a geometric product operation between the intermediate ciphertext and
the next vector, and we repeat this calculation for each vector. This generates the ciphertext.
E(d1) = s1 s2 s3 d1 … sn
In the above, s1, s2, s3, …, sn are keys, d1 is the data, and E denotes encryption.
5.2.3 Decryption of RVTHE
For the decryption process, finding the inverses of the key vectors is critical. First, we
perform a geometric product operation between the ciphertext and the inverse of a key,
which generates an intermediate ciphertext. Next, we perform a geometric product
operation between the intermediate ciphertext and the next inverse vector, and we repeat
this calculation for each vector. This recovers the plaintext.
D(c1) = s3⁻¹ s2⁻¹ s1⁻¹ s1 s2 s3 d1 … sn sn⁻¹
In the above, s1⁻¹, s2⁻¹, s3⁻¹, …, sn⁻¹ are the inverse vectors of the keys s1, s2, s3, …, sn, c1 is the ciphertext, and D denotes decryption.
In our implementation we chose to use three vectors: two vectors for keys (s1, s2) and
one vector for data (d1).
5.3 Mathematical Implementation of RVTHE Using Versors
In the versor example from Section 3.3, using the vector inverse, we derived the
value of vector ‘b’. If we present the same example in terms of the encryption
method, then a, b, and c from the math become s1, d1, and s2 in the RVTHE scheme, and
they represent the first secret key, the data value, and the second secret key, respectively.
RVTHE’s mathematical representation, choosing two secret keys and one data value, is
shown in the format s1 d1 s2. In other words, we chose only three vectors a1, a2, a3 in
our implementation.
Encryption is represented as ‘E’ and decryption as ‘D’. Assume
a1 = s1 = a
a2 = d1 = b
a3 = s2 = c
and assign these values, where e1 and e2 are the basis vectors:
a = (2e1 + 3e2)
b = (4e1 + 5e2)
c = (3e1 + 4e2)
Then, for encryption of d1, where E(d1) = abc:
ab = (2e1 + 3e2) · (4e1 + 5e2) + (2e1 + 3e2) ∧ (4e1 + 5e2)
ab = 23 − 2e12
abc = (23 − 2e12)(3e1 + 4e2)
abc = 61e1 + 98e2
For decryption, to derive d1:
D(E(d1)) = D(abc) = a⁻¹ a b c c⁻¹ = d1 = b
b = a⁻¹ ((61e1 + 98e2)(3e1 + 4e2)/25)
b = ((2e1 + 3e2)/13)(23 − 2e12)
b = (1/13)((46 + 6)e1 + (69 − 4)e2)
b = 4e1 + 5e2.
This encryption implementation is based on versors, providing a new way to utilize
the geometric product of geometric algebra.
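The worked example above can be checked mechanically. Below is a minimal sketch (hypothetical helper code, not the dissertation's C implementation) of the 2D geometric product over the basis {1, e1, e2, e12}, using exact rational arithmetic; it reproduces the encryption abc = 61e1 + 98e2 and the decryption back to b = 4e1 + 5e2.

```python
from fractions import Fraction as F

def gp(a, b):
    """Geometric product in GA(2,0); operands are 4-tuples of coefficients
    (scalar, e1, e2, e12), with e1*e1 = e2*e2 = 1 and e12*e12 = -1."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 + a1*b1 + a2*b2 - a3*b3,   # scalar part
            a0*b1 + a1*b0 - a2*b3 + a3*b2,   # e1 part
            a0*b2 + a2*b0 + a1*b3 - a3*b1,   # e2 part
            a0*b3 + a3*b0 + a1*b2 - a2*b1)   # e12 part

def vec(x, y):
    """A pure vector x*e1 + y*e2 as a multivector tuple."""
    return (F(0), F(x), F(y), F(0))

def vec_inverse(v):
    """Inverse of a nonzero vector: v / |v|^2."""
    norm2 = v[1] * v[1] + v[2] * v[2]
    return (F(0), v[1] / norm2, v[2] / norm2, F(0))

s1, d1, s2 = vec(2, 3), vec(4, 5), vec(3, 4)  # a, b, c from the example

cipher = gp(gp(s1, d1), s2)                   # E(d1) = s1 d1 s2
# cipher has e1 coefficient 61 and e2 coefficient 98, as in the text.

plain = gp(gp(vec_inverse(s1), cipher), vec_inverse(s2))  # s1^-1 C s2^-1
# plain recovers d1 = 4e1 + 5e2.
```

The product table follows from e12·e1 = −e2, e12·e2 = e1, e1·e12 = e2, and e2·e12 = −e1, which is all the bookkeeping the scheme needs in two dimensions.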
5.4 Homomorphism of RVTHE
In this section I will show the homomorphic properties of RVTHE. The homomorphism
covers addition, subtraction, (scalar) multiplication, and (scalar) division [78].
5.4.1 Addition
We represent data 1 (d1), data 2 (d2), secret key 1 (s1), and secret key 2 (s2), and prove
the following:
E(d1 + d2) = E(d1) + E(d2)
Example: when d1 = 8 and d2 = 6, then d1 + d2 = 16, with
s1 = (2e1 + 3e2), d1 = (4e1 + 4e2), and s2 = (3e1 + 4e2).
Then:
E(d1) = (2e1 + 3e2)(4e1 + 4e2)(3e1 + 4e2)
E(d1) = 44e1 + 92e2
E(d2) = (2e1 + 3e2)(3e1 + 3e2)(3e1 + 4e2)
E(d2) = 33e1 + 69e2
E(d1 + d2) = (2e1 + 3e2)(7e1 + 7e2)(3e1 + 4e2)
E(d1 + d2) = 77e1 + 161e2
E(d1) + E(d2) = 77e1 + 161e2
This proves E(d1 + d2) = E(d1) + E(d2).
5.4.2 Subtraction
E(d1 − d2) = E(d1) − E(d2)
Example: when d1 = 8 and d2 = 6, then d1 − d2 = 2, with
s1 = (2e1 + 3e2), d1 = (4e1 + 4e2), and s2 = (3e1 + 4e2).
Then:
E(d1) = 44e1 + 92e2
E(d2) = 33e1 + 69e2
E(d1 − d2) = (2e1 + 3e2)(e1 + e2)(3e1 + 4e2)
E(d1 − d2) = 11e1 + 23e2
E(d1) − E(d2) = 11e1 + 23e2
This proves E(d1 − d2) = E(d1) − E(d2).
5.4.3 Multiplication
In vectors we have scalar multiplication. Let d1 = 8 and scalar r1 = 2; then
E(r1 d1) = r1 E(d1). With s1 = (2e1 + 3e2), d1 = (4e1 + 4e2), and s2 = (3e1 + 4e2):
E(r1 d1) = (2e1 + 3e2)(8e1 + 8e2)(3e1 + 4e2)
E(r1 d1) = 88e1 + 184e2
E(d1) = 44e1 + 92e2
r1 E(d1) = 88e1 + 184e2
This proves E(r1 d1) = r1 E(d1) for scalar multiplication.
5.4.4 Division
In vectors we have scalar division. Let d1 = 8 and scalar r1 = 1/2; then
E(r1 d1) = r1 E(d1). With s1 = (2e1 + 3e2), d1 = (4e1 + 4e2), and s2 = (3e1 + 4e2):
E(r1 d1) = (2e1 + 3e2)(2e1 + 2e2)(3e1 + 4e2)
E(r1 d1) = 22e1 + 46e2
E(d1) = 44e1 + 92e2
r1 E(d1) = 22e1 + 46e2
This proves E(r1 d1) = r1 E(d1) for scalar division.
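As a sanity check on the four properties above, the sketch below (hypothetical helper code, not the dissertation's implementation) re-runs the numbers from Sections 5.4.1 through 5.4.4 with the same 2D geometric product and the same keys.

```python
def gp(a, b):
    # Geometric product in GA(2,0) on (scalar, e1, e2, e12) coefficients,
    # with e1*e1 = e2*e2 = 1 and e12*e12 = -1.
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 + a1*b1 + a2*b2 - a3*b3,
            a0*b1 + a1*b0 - a2*b3 + a3*b2,
            a0*b2 + a2*b0 + a1*b3 - a3*b1,
            a0*b3 + a3*b0 + a1*b2 - a2*b1)

def enc(d, s1=(0, 2, 3, 0), s2=(0, 3, 4, 0)):
    # E(d) = s1 d s2, with the keys from the running example.
    return gp(gp(s1, d), s2)

# Componentwise multivector arithmetic (addition, subtraction, scaling).
add = lambda x, y: tuple(xi + yi for xi, yi in zip(x, y))
sub = lambda x, y: tuple(xi - yi for xi, yi in zip(x, y))
scale = lambda r, x: tuple(r * xi for xi in x)

d1, d2 = (0, 4, 4, 0), (0, 3, 3, 0)   # the encodings of 8 and 6

print(enc(add(d1, d2)) == add(enc(d1), enc(d2)))  # addition property
print(enc(sub(d1, d2)) == sub(enc(d1), enc(d2)))  # subtraction property
print(enc(scale(2, d1)) == scale(2, enc(d1)))     # scalar multiplication
print(enc(scale(0.5, d1)) == scale(0.5, enc(d1))) # scalar division
```

All four checks follow from the bilinearity of the geometric product: sums and scalar factors in the data slot pass straight through the two key multiplications.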
5.5 Security of RVTHE
There is a need to make sure this encryption method is good enough in terms of
security. The design of RVTHE depends on versors (here, two-dimensional vectors), the
geometric product, and the vector inverse. The dimensions of the vectors contribute an
extra layer of security. The security of RVTHE is evaluated by applying mathematical
manipulations to known plaintext and known ciphertext and attempting to derive the
keys.
CHAPTER 6
6 IMPLEMENTATION AND EVALUATION OF RVTHE
I evaluated encryption, decryption, and the ability to update/append real-time files
without decrypting and re-encrypting under the RVTHE scheme. I ran these evaluations
on a cloud system provided by one of the leading cloud providers, Amazon (AWS EC2).
6.1 Contributions
In this chapter I will discuss how I converted RVTHE into an executable program. This
section discusses in depth the implementation of RVTHE in various applications and
compares it with the AES-Crypt encryption method in terms of encryption and
decryption performance. I also evaluate RVTHE security at a high level by analyzing
mathematical operations and statistical evaluations.
6.2 Implementation of RVTHE
AES-Crypt is one of the most widely known methods for encrypting individual real-time
files. It offers high speed and security, so I chose it as the model for developing the same
type of executable providing encryption and decryption. I created an executable crypto
program in the ‘C’ language based on the RVTHE scheme, similar to the AES-Crypt
program. I executed it on real-time files for encryption, decryption, and appending new
data to the end of the encrypted file without decrypting the original ciphertext.
I used the following commands to run encryption and decryption.
AES-crypt:
time aescrypt -e -p key plaintext_file_name
time aescrypt -d -p key plaintext_file_name.aes
RVTHE:
time xlg -e -x key1 -y key2 plaintext_file_name
time xlg -d -x key1 -y key2 plaintext_file_name.xlg
time xlg -a -x key1 -y key2 “data” plaintext_file_name.xlg
In the above commands, ‘-e’ indicates encryption, ‘-d’ indicates decryption, and ‘-a’
indicates append.
When we used a 512-bit key for the AES-Crypt program, we split that key into
two 256-bit keys for the RVTHE (xlg) program. I did this for all key sizes from 64-bit to
1024-bit. For the evaluation I chose the 256-bit key size.
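The key-splitting step above can be sketched as follows; the helper name and the hex key value are purely illustrative, not taken from the actual tooling.

```python
def split_key(hex_key):
    """Split one hex-encoded key into two equal halves, e.g. a 512-bit
    (128 hex digit) AES-Crypt key into two 256-bit RVTHE (xlg) keys."""
    if len(hex_key) % 2 != 0:
        raise ValueError("key must split into two equal halves")
    mid = len(hex_key) // 2
    return hex_key[:mid], hex_key[mid:]

key512 = "ab" * 64                   # illustrative 128-hex-digit key
key1, key2 = split_key(key512)
print(len(key1) * 4, len(key2) * 4)  # each half carries 256 bits
```

The two halves then serve as the `-x` and `-y` arguments to the xlg commands shown above.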
6.3 Experimental Systems
Our evaluations were conducted on a 64-bit Amazon EC2 virtual machine SSD
instance. I chose the instance type t2.micro, which has 1 vCPU, 1GB of memory, and 8GB
of maximum storage. I specifically selected a VM with SSD storage, because SSDs have
high performance and have become an industry standard for cloud computing.
AES-Crypt is one of the most widely known methods for encrypting individual real-time
files. It offers high speed and security, and I chose it to develop baseline statistics on
speed and output file size after encryption. It also uses an AES algorithm and is a
symmetric method like RVTHE. I compared the two in terms of encryption speed,
decryption speed, and encrypted file size (disk storage used by the ciphertext).
In addition, I will also explain the additional security and efficiency benefits that are
unique to a homomorphic encryption method.
6.4 Experimental Evaluations
I ran both executables (AES-Crypt and RVTHE) with various key sizes and file sizes.
The key sizes were 64, 128, 256, 512, and 1024-bit, and the file sizes were 1MB, 10MB,
and 100MB. From these runs, we measured the speed of encryption and decryption plus
the storage size of the encrypted file on the cloud server.
6.4.1 Time measurements on various key sizes
[Figure: key size versus time in seconds on a regular SSD for a 100MB file; AES-Crypt and RVTHE encryption and decryption; x-axis Key Sizes in Bits, y-axis Time in Sec.]
Graph 19 - Key size Vs Encryption/Decryption time in Sec
For encryption, I did not observe a sizeable increase in time for any key size with either
encryption method, and across the board RVTHE required less time for encryption than
AES-Crypt. For decryption, the fastest method was RVTHE at 64-bit, but the larger the
key size, the more time the decryption process took for the RVTHE method. The most
commonly used key size is 256-bit; at that size the speeds of both encryption methods are
almost the same.
Note: Having homomorphic features as in RVTHE means that full file decryption
should be rare. In other words, file updates do not require the full file to be decrypted.
6.4.2 Time measurements on various file sizes
[Figure: file size versus time in seconds on a regular SSD; AES-Crypt and RVTHE encryption and decryption; x-axis File size in MB, y-axis Time in Sec.]
Graph 20 - File size and Encryption/Decryption times
I chose the 256-bit key size for all tests. The larger the file size, the more time
encryption and decryption took for both the RVTHE and AES-Crypt methods.
[Figure: file size versus encryption/decryption rates on a regular SSD; AES-Crypt and RVTHE encryption and decryption rates; x-axis File size in MB, y-axis Time in Sec.]
Across the board, RVTHE required less time for encryption than AES-Crypt. RVTHE
also encrypted at about the same rate regardless of file size, and, as with encryption, the
RVTHE decryption process performed at about the same rate regardless of file size.
6.4.3 Size measurements on Encrypted Files
[Figure: encrypted file size in MB on an encrypted volume in SSD; AES-Crypt versus xlg-Crypt; x-axis Key Size, y-axis Size of the file in MB.]
Graph 21 - Encrypted file sizes in MB
The output file generated by RVTHE encryption is always double the size of the original
file, whereas AES-Crypt carries only a 10% size penalty.
Note: Decrypting a full 1 GB file to modify it requires at least another 1 GB of free
space in any case. In practice, working with a 1 GB file requires a minimum of about
2.2 GB of space with AES-Crypt and about 3.2 GB with RVTHE, but full-file decryption
under RVTHE should be rare.
6.5 Security Evaluation of RVTHE
Attackers can mount various kinds of attacks. I evaluate RVTHE against two major
types of attack to show that the RVTHE cipher design is secure.
Ciphertext-Only:
Suppose an attacker has access to RVTHE ciphertext and nothing else (neither the key
nor the plaintext) and tries to reveal the plaintext or the secret key using statistical
methods and mathematical operations.
I represent the data as d1 and d2 and the secret keys as s1 and s2.
Example: Let d1 = 8 and d2 = 6, so d1 + d2 = 14. Encode the data as the vectors
d1 = 4e1 + 4e2 and d2 = 3e1 + 3e2, and choose the keys s1 = 2e1 + 3e2 and
s2 = 3e1 + 4e2. Encryption is the geometric product E(d) = s1 d s2:
E(d1) = (2e1 + 3e2)(4e1 + 4e2)(3e1 + 4e2)
E(d1) = 44e1 + 92e2 = C1
E(d2) = (2e1 + 3e2)(3e1 + 3e2)(3e1 + 4e2)
E(d2) = 33e1 + 69e2 = C2
The ciphertexts C1 and C2 produced from d1 and d2 are very hard to attack statistically
because they are stored as two-dimensional vectors. Even applying statistical methods
and mathematical operations such as addition and subtraction, I see no way to derive the
keys:
C1 + C2 = 77e1 + 161e2
C1 - C2 = 11e1 + 23e2
Known-Plaintext:
In a known-plaintext attack, the attacker has some plaintext/ciphertext pairs and uses
them to try to derive the key. I show below that even with statistical methods and
mathematical manipulation, the keys cannot be derived.
Using the same example: d1 = 8 encoded as 4e1 + 4e2, d2 = 6 encoded as 3e1 + 3e2,
with keys s1 = 2e1 + 3e2 and s2 = 3e1 + 4e2:
E(d1) = (2e1 + 3e2)(4e1 + 4e2)(3e1 + 4e2)
E(d1) = 44e1 + 92e2 = C1
E(d2) = (2e1 + 3e2)(3e1 + 3e2)(3e1 + 4e2)
E(d2) = 33e1 + 69e2 = C2
Even with the pairs (d1, C1) and (d2, C2) in hand, statistical evaluation is very hard
because RVTHE uses two keys and two-dimensional vectors. There is a visible pattern in
the coefficients (11, 33, 44, and 77), but it offers no way to guess the plaintext or the
keys:
C1 + C2 = 77e1 + 161e2
C1 - C2 = 11e1 + 23e2
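The arithmetic in both examples can be checked with a small multivector calculator. The sketch below is my own illustration, not the RVTHE implementation: it works in the two-dimensional Clifford algebra Cl(2,0) with basis {1, e1, e2, e12}, encrypts with the sandwich product E(d) = s1 d s2 used above, and decrypts with the inverse sandwich s1^-1 C s2^-1 (the decryption form is my assumption based on the sandwich structure).

```python
# Illustrative sketch only -- not the RVTHE implementation.
# Multivectors in Cl(2,0) are tuples (scalar, e1, e2, e12), with
# e1*e1 = e2*e2 = 1 and e12 = e1*e2, so e12*e12 = -1.
from fractions import Fraction

def gp(a, b):
    """Geometric product of two multivectors in Cl(2,0)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 + a1*b1 + a2*b2 - a3*b3,   # scalar part
            a0*b1 + a1*b0 - a2*b3 + a3*b2,   # e1 part
            a0*b2 + a2*b0 + a1*b3 - a3*b1,   # e2 part
            a0*b3 + a3*b0 + a1*b2 - a2*b1)   # e12 part

def vec(x, y):
    """A pure vector x*e1 + y*e2."""
    return (Fraction(0), Fraction(x), Fraction(y), Fraction(0))

def inv(v):
    """Inverse of a pure vector: v / (v . v)."""
    n = v[1] * v[1] + v[2] * v[2]
    return tuple(c / n for c in v)

s1, s2 = vec(2, 3), vec(3, 4)    # secret keys from the example
d1, d2 = vec(4, 4), vec(3, 3)    # encodings of the data values 8 and 6

def encrypt(d):                  # E(d) = s1 d s2
    return gp(gp(s1, d), s2)

def decrypt(c):                  # assumed inverse sandwich: s1^-1 c s2^-1
    return gp(gp(inv(s1), c), inv(s2))

c1, c2 = encrypt(d1), encrypt(d2)
show = lambda m: tuple(int(c) for c in m)
print(show(c1))                  # (0, 44, 92, 0): C1 = 44 e1 + 92 e2
print(show(c2))                  # (0, 33, 69, 0): C2 = 33 e1 + 69 e2

csum = tuple(x + y for x, y in zip(c1, c2))
print(show(decrypt(csum)))       # (0, 7, 7, 0): 7 e1 + 7 e2, i.e. 8 + 6 = 14
```

Because the geometric product is associative, the inverse sandwich recovers the data vector exactly; the component sum of the decrypted vector gives back 8 + 6 = 14.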
CHAPTER 7
7 FUTURE WORK AND CONCLUSION
7.1 Contributions
This chapter summarizes the challenges I overcame during this study and the high-level
research flow behind developing the new cipher: how it was designed and implemented,
and how its performance and security strength were evaluated.
The goal of this study is to explain what encryption can do for devices in terms of the
tradeoff between performance and security. The main thesis is that high security and
good performance can be achieved without compromising one for the other. This
prompted me to study how different encryption methods, such as BestCrypt, Dm-crypt,
and SED, affect SSD security and performance. Each of these packages has its own
drawbacks, and the study showed they differ in performance for sequential and random
reads and writes. Little prior research, however, covered mixed workloads such as
random read/writes, even though most enterprise workloads are random. I showed that
SSDs have evolved over the years to handle random workloads better, and I then
evaluated how modern SSDs handle random workloads under encryption. Evaluating
different workloads across encryption methods produces a range of performance metrics.
I analyzed the selected methods (BestCrypt, Dm-crypt, and Amazon's encrypted Elastic
Block Store volumes) for their strengths and weaknesses on SSDs in terms of both
security and performance. This motivated a look at the use of homomorphic encryption
in the Cloud, after which I surveyed previous homomorphic encryption approaches.
Fully homomorphic encryption makes stronger cyber security possible by allowing a
series of computations directly on encrypted data. Technically, one can start with a
zero-byte file, encrypt it, and build it up entirely through homomorphic computation,
never exposing data or leaving a plaintext footprint on disk. Rivest et al. [20] first
proposed this idea, and Gentry [2] first constructed fully homomorphic encryption,
using binary circuits over encrypted data to perform basic mathematical operations.
Later scholars, inspired by Gentry's approach, improved his scheme or contributed new
approaches. Gentry presented a theoretical construction that offered a new way to
achieve encrypted computation, but his solution is not easy to apply and remains
impractical.
RVTHE is a new homomorphic symmetric encryption scheme based on Clifford
geometric algebra. Its foundation is the mathematics of versors, the geometric product,
and inverses. Geometric algebra is critical to its design and framework: the design of
the cipher is simple, but combining versors, the geometric product, and inverses yields a
strong cipher. Building it required careful attention to every aspect of a cipher:
security and defense against various attacks, algorithm design, and performance. In this
work I showed how the design follows established design principles [24], demonstrated
its real-world performance, and gave a mathematical treatment of its defense against
attacks. I created a measurable benchmark to calculate encryption speeds: first I ran
experiments to understand encryption performance penalties on SSDs in the Cloud, and
then I set up an experimental environment to measure RVTHE's performance.
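The speed benchmark can be sketched as a simple timing loop. This is an illustrative harness rather than the actual test scripts, and the `encrypt` function here is a placeholder standing in for an invocation of RVTHE or AES-Crypt:

```python
# Illustrative benchmark harness (not the dissertation's test scripts):
# measures encryption throughput in MB/s for several input sizes.
import os
import time

def encrypt(data: bytes, key: bytes) -> bytes:
    # Placeholder cipher: XOR with a repeating key. NOT secure -- it only
    # gives the harness something measurable to call in place of RVTHE
    # or AES-Crypt.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def benchmark(sizes_mb, key=b"secret"):
    """Return {size_in_MB: throughput_in_MB_per_s}."""
    results = {}
    for mb in sizes_mb:
        data = os.urandom(mb * 1024 * 1024)   # random input of mb megabytes
        start = time.perf_counter()
        encrypt(data, key)
        results[mb] = mb / (time.perf_counter() - start)
    return results

for mb, rate in benchmark([1, 4]).items():
    print(f"{mb} MB: {rate:.1f} MB/s")
```

In the actual experiments, the same loop structure would wrap calls to the RVTHE binary and to AES-Crypt on files of 1 MB to 1 GB, yielding the times plotted in Graphs 19-20.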
7.2 Challenges and Lessons Learned
This section presents the research flow of the work, some of the challenges and issues
faced during the study of encryption methods, and how I overcame them.
1. When I began my research, I first studied SSDs and ran various benchmarks to
measure their performance on AWS. Because the storage is shared among tenants,
blocks are reused between tenants, but this was not easy to prove at the block level
in the Cloud. Instead, I deleted all data from a local SSD drive and was able to
recreate the files using freely available recovery software. That result was the first
step toward investigating SSD sanitization.
2. Once I showed that SSDs have sanitization limitations, I investigated encryption
methods, running them in the Cloud to measure how much performance overhead
they cause and to look for security issues. The research showed there is indeed a
performance penalty, and that encrypted containers leave hidden folders; I found
issues with each of these software packages.
3. A secure and efficient cipher draws its main strength from its mathematical
foundation. Studying geometric algebra deeply enough to determine whether that
mathematics could be used for encryption was certainly challenging.
4. Once I found I could decrease the ciphertext size, designing the cipher was a
challenge, because RVTHE creates a scalar and a multivector in an intermediate step
before producing a vector as the final product. I first proved the construction on
paper, then converted it to a program; choosing a comparable program (AES-Crypt)
to benchmark against was also challenging.
7.3 Success of work
In this work I developed Reduced Vector Technique Homomorphic Encryption
(RVTHE), a symmetric, somewhat homomorphic encryption scheme built on versors and
the properties of Clifford geometric algebra. Evaluation of the implementation shows
that a file can be edited/appended in 0.001 sec. For full-file encryption, RVTHE is 75%
faster on encryption and 25% slower on decryption than the encryption software
'AES Crypt'. The ciphertext RVTHE generates is reduced to 25% of the size produced
by previous multivector-based Clifford geometric algebra approaches, and it has the
potential to be used on real workloads. This is a significant success: the cipher is faster
and more efficient, and its ciphertext is only twice the size of the plaintext.
7.4 Future Work
The RVTHE encryption method can be explored on different types of technologies and
systems, including hardware implementations and additional features. In cloud
computing, homomorphic encryption provides secure computing by allowing users to
compute in the Cloud without converting the ciphertext into plaintext, and RVTHE, as an
implementation of homomorphic encryption, satisfies that requirement efficiently.
RVTHE could be explored in various applications and databases, such as password
stores. There is also more to study in geometric algebra toward new encryption methods.
7.5 Conclusion
Securing data involves two stages: data at rest and data in transit. "Data at rest"
means data before it is sent to the Cloud and while it resides there; "data in transit"
means data moving between client and Cloud. Both can be protected using
homomorphic encryption techniques, and RVTHE can be part of that. In conclusion,
homomorphic encryption provides secure computing in the cloud by allowing users to
compute on data without converting the ciphertext into plaintext, and RVTHE, as an
implementation of homomorphic encryption, satisfies that requirement efficiently.
REFERENCES
[1] E. Aïmeur and D. Schőnfeld, "The ultimate invasion of privacy: Identity
theft," in 2011 Ninth Annual International Conference on Privacy, Security and
Trust, 2011.
[2] C. Gentry, "Fully Homomorphic Encryption Using Ideal Lattices," in
Proceedings of the Forty-first Annual ACM Symposium on Theory of
Computing, New York, NY, USA, 2009.
[3] C. Gentry and S. Halevi, "Implementing Gentry's Fully-homomorphic
Encryption Scheme," in Proceedings of the 30th Annual International
Conference on Theory and Applications of Cryptographic Techniques:
Advances in Cryptology, Berlin, 2011.
[4] Oxford Dictionaries, "Definition of security," [Online]. Available:
https://en.oxforddictionaries.com/definition/security.
[5] R. Kissel, "Glossary of key
information security terms," in NIST Interagency Reports NIST IR 7298
Revision 1, National Institute of Standards and Technology, 2011.
[6] N. Ferguson, B. Schneier and T. Kohno, Cryptography Engineering: Design
Principles and Practical Applications, Wiley Publishing, 2010.
[7] S. Mauw and M. Oostdijk, "Foundations of Attack Trees," in Proceedings
of the 8th International Conference on Information Security and Cryptology,
Berlin, 2006.
[8] C. E. Shannon, "Communication theory of secrecy systems," The Bell
System Technical Journal, vol. 28, pp. 656-715, Oct 1949.
[9] "Intro-Samsung Elec. Datasheet (K9LBG08U0M).," 2007.
[10] J.-U. Kang, J.-S. Kim, C. Park, H. Park and J. Lee, "A Multi-channel
Architecture for High-performance NAND Flash-based Storage System," J.
Syst. Archit., vol. 53, pp. 644-658, sep 2007.
[11] R. Micheloni, A. Marelli and K. Eshghi, Inside Solid State Drives (SSDs),
Springer Publishing Company, Incorporated, 2012.
[12] B. Bosen, "Full Drive Encryption with Samsung Solid State Drives," nov
2010.
[13] P. Wang, G. Sun, S. Jiang, J. Ouyang, S. Lin, C. Zhang and J. Cong, "An
Efficient Design and Implementation of LSM-tree Based Key-value Store on
Open-channel SSD," in Proceedings of the Ninth European Conference on
Computer Systems, New York, NY, USA,, 2014.
[14] D. E. Denning and P. J. Denning, "Data Security," ACM Comput. Surv.,
vol. 11, pp. 227-249, 9 1979.
[15] M. Tebaa, S. E. Hajji and A. E. Ghazi, "Homomorphic encryption method
applied to Cloud Computing," in 2012 National Days of Network Security and
Systems, 2012.
[16] S. Singh, The Code Book: The Science of Secrecy from Ancient
Egypt to Quantum Cryptography, New York: Anchor Books, 2000.
[17] J. Nechvatal, E. Barker, L. Bassham, W. Burr and M. Dworkin, "Report on
the development of the Advanced Encryption Standard (AES)," 2000.
[18] N. I. of Standards and T. (NIST), "FIPS Publication 46-2: Data Encryption
Standard," 1993.
[19] J. Nechvatal, E. Barker, D. Dodson, M. Dworkin, J. Foti and E. Roback,
"Status report on the first round of the development of the Advanced
Encryption Standard," Journal of Research of the National Institute of
Standards and Technology, vol. 104, 1999.
[20] R. L. Rivest, L. Adleman and M. L. Dertouzos, "On Data Banks and
Privacy Homomorphisms," Foundations of Secure Computation, Academia
Press, pp. 169-179, 1978.
[21] L. N. Childs, A Concrete Introduction to Higher Algebra, Volume1,
Springer, 1979.
[22] S. Burris and H. P. Sankappanavar, A Course in Universal Algebra-With 36
Illustrations, 2006.
[23] A. Acar, H. Aksu, A. S. Uluagac and M. Conti, "A Survey on
Homomorphic Encryption Schemes: Theory and Implementation," CoRR, vol.
abs/1704.03578, 2017.
[24] B. Schneier, Applied Cryptography (2Nd Ed.): Protocols, Algorithms, and
Source Code in C, New York, NY, USA,: John Wiley & Sons, Inc., 1995.
[25] D. W. H. A. da Silva, "Fully Homomorphic Encryption over exterior
product spaces," UCCS Master Thesis Report, 2017.
[26] K. Zhao, W. Zhao, H. Sun, X. Zhang, N. Zheng and T. Zhang, "LDPC-in-
SSD: Making Advanced Error Correction Codes Work Effectively in Solid
State Drives," in Presented as part of the 11th USENIX Conference on File and
Storage Technologies (FAST 13), San, 2013.
[27] P. Huang, P. Subedi, X. He, S. He and K. Zhou, "FlexECC: Partially
Relaxing ECC of MLC SSD for Better Cache Performance," in Proceedings of
the 2014 USENIX Conference on USENIX Annual Technical Conference,
Berkeley, 2014.
[28] M. Wei, L. M. Grupp, F. E. Spada and S. Swanson, "Reliably Erasing Data
from Flash-based Solid State Drives," in Proceedings of the 9th USENIX
Conference on File and Storage Technologies, Berkeley, 2011.
[29] J. Reardon, S. Capkun and D. Basin, "Data Node Encrypted File System:
Efficient Secure Deletion for Flash Memory," in Proceedings of the 21st
USENIX Conference on Security Symposium, Berkeley, 2012.
[30] Y. Choi, D. Lee, W. Jeon and D. Won, "Password-based Single-file
Encryption and Secure Data Deletion for Solid-state Drive," in Proceedings of
the 8th International Conference on Ubiquitous Information Management and
Communication, New York, NY, USA,, 2014.
[31] N. I. of Standards and Technology, FIPS PUB 46-3: Data Encryption
Standard (DES), pub-NIST:adr,: pub-NIST, 1999.
[32] K. Bhargavan and G. Leurent, "On the Practical (In-)Security of 64-bit
Block Ciphers: Collision Attacks on HTTP over TLS and OpenVPN," in
Proceedings of the 2016 ACM SIGSAC Conference on Computer and
Communications Security, New York, NY, USA,, 2016.
[33] M. A. Wright, "Feature: The Advanced Encryption Standard," Netw.
Secur., vol. 2001, pp. 11-13, oct 2001.
[34] N. Ferguson, J. Kelsey, S. Lucks, B. Schneier, M. Stay, D. Wagner and D.
Whiting, "Improved Cryptanalysis of Rijndael," in Proceedings of the 7th
International Workshop on Fast Software Encryption, London, 2001.
[35] A. Biryukov, O. Dunkelman, N. Keller, D. Khovratovich and A. Shamir,
Key Recovery Attacks of Practical Complexity on AES Variants With Up To 10
Rounds, 2009.
[36] B. Schneier, "Description of a New Variable-Length Key, 64-bit Block
Cipher (Blowfish)," in Fast Software Encryption, Cambridge Security
Workshop, London, 1994.
[37] A. Biryukov and D. Wagner, Slide Attacks, L. Knudsen, Ed., Berlin,
Heidelber: Springer Berlin Heidelberg, 1999, pp. 245-259.
[38] B. Schneier, J. Kelsey, D. Whiting, D. Wagner and C. Hall, "On the
Twofish Key Schedule," in Proceedings of the Selected Areas in Cryptography,
London, 1999.
[39] N. Ferguson, J. Kelsey, B. Schneier and D. Whiting, "A Twofish Retreat:
Related-Key Attacks Against Reduced-Round Twofish," 2000.
[40] J. J. G. Ortiz and K. J. Compton, "A Simple Power Analysis Attack on the
Twofish Key Schedule," CoRR, vol. abs/1611.07109, 2016.
[41] R. Anderson, E. Biham and L. Knudsen, Serpent: A Proposal for the
Advanced Encryption Standard, 1998.
[42] User:Dake commonswiki, "File:Serpent-linearfunction.png," 2005.
[Online]. Available: https://commons.wikimedia.org/wiki/File:Serpent-
linearfunction.png.
[43] M. Hermelin, J. Y. Cho and K. Nyberg, "Multidimensional Linear
Cryptanalysis of Reduced Round Serpent," in Proceedings of the 13th
Australasian Conference on Information Security and Privacy, Berlin, 2008.
[44] J. Rizzo and T. Duong, "Practical Padding Oracle Attacks," in Proceedings
of the 4th USENIX Conference on Offensive Technologies, Berkeley, 2010.
[45] M. Liskov, R. L. Rivest and D. Wagner, "Tweakable Block Ciphers," in
Proceedings of the 22Nd Annual International Cryptology Conference on
Advances in Cryptology, London, 2002.
[46] L. Martin, "XTS: A Mode of AES for Encrypting Hard Disks," IEEE
Security and Privacy, vol. 8, pp. 68-69, may 2010.
[47] D. A. McGrew and J. Viega, "The Security and Performance of the
Galois/Counter Mode (GCM) of Operation," in Proceedings of the 5th
International Conference on Cryptology in India, Berlin, 2004.
[48] Dm-crypt, "Dm-crypt," [Online]. Available:
https://wiki.archlinux.org/index.php/dm-crypt/Device_encryption. [Accessed
10 12 2016].
[49] C. Fruhwirth, "LUKS- Wikipedia," [Online]. Available:
https://en.wikipedia.org/wiki/Linux_Unified_Key_Setup. [Accessed 2018].
[50] "Hacking Linux system (LUKS weakness)," The Hacker News, 2016.
[Online]. Available: https://thehackernews.com/2016/11/hacking-linux-
system.html.
[51] "Plausible deniability with LUKS," [Online]. Available:
https://blog.linuxbrujo.net/posts/plausible-deniability-with-luks/.
[52] M. Bauer, "Paranoid Penguin: BestCrypt: Cross-platform Filesystem
Encryption," Linux J., vol. 2002, pp. 9--, jun 2002.
[53] B. Daniel and K. Fowler, "Bypassing Self-Encrypting Drives (SED) in
Enterprise Environments," Europe,,, 2015.
[54] C. Gentry, "A fully homomorphic encryption scheme," 2009.
[55] A. López-Alt, E. Tromer and V. Vaikuntanathan, "On-the-fly Multiparty
Computation on the Cloud via Multikey Fully Homomorphic Encryption," in
Proceedings of the Forty-fourth Annual ACM Symposium on Theory of
Computing, New York, NY, USA, 2012.
[56] M. Tebaa and S. E. Hajji, "Secure Cloud Computing through Homomorphic
Encryption," CoRR, vol. abs/1409.0829, 2014.
[57] W. Wang, Y. Hu, L. Chen, X. Huang and B. Sunar, "Accelerating fully
homomorphic encryption using GPU," in 2012 IEEE Conference on High
Performance Extreme Computing, 2012.
[58] C. Moore, M. O'Neill, E. O'Sullivan, Y. Doröz and B. Sunar, "Practical
homomorphic encryption: A survey," in 2014 IEEE International Symposium
on Circuits and Systems (ISCAS), 2014.
[59] J. Vince, Geometric Algebra: An Algebraic System for Computer Games
and Animation, 1st ed., Springer Publishing Company, Incorporated, 2009.
[60] D. Davis, R. Ihaka and P. Fenstermacher, Cryptographic Randomness from
Air Turbulence in Disk Drives, Y. G. Desmedt, Ed., Berlin, Heidelber: Springer
Berlin Heidelberg, 1994, pp. 114-120.
[61] J. Kim, J. M. Kim, S. H. Noh, S. L. Min and Y. Cho, "A Space-efficient
Flash Translation Layer for CompactFlash Systems," IEEE Trans. on Consum.
Electron., vol. 48, pp. 366-375, may 2002.
[62] R. Micheloni, A. Marelli and R. Ravasio, Error Correction Codes for Non-
Volatile Memories, 1st ed., Springer Publishing Company, Incorporated, 2010.
[63] J. H. Stathis, "Reliability Limits for the Gate Insulator in CMOS
Technology," IBM J. Res. Dev., vol. 46, pp. 265-286, mar 2002.
[64] P. Olivo, T. N. Nguyen and B. Ricco, "High-field-induced degradation in
ultra-thin SiO2 films," IEEE Transactions on Electron Devices, vol. 35, pp.
2259-2267, dec 1988.
[65] N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse and R.
Panigrahy, "Design Tradeoffs for SSD Performance," in USENIX 2008 Annual
Technical Conference, Berkeley, 2008.
[66] A. Birrell, M. Isard, C. Thacker and T. Wobber, "A Design for High-
performance Flash Disks," New York, NY, USA,, 2007.
[67] F. Chen, D. A. Koufaty and X. Zhang, "Understanding Intrinsic
Characteristics and System Implications of Flash Memory Based Solid State
Drives," in Proceedings of the Eleventh International Joint Conference on
Measurement and Modeling of Computer Systems, New York, NY, USA,,
2009.
[68] A. Gupta, Y. Kim and B. Urgaonkar, "DFTL: A Flash Translation Layer
Employing Demand-based Selective Caching of Page-level Address
Mappings," in Proceedings of the 14th International Conference on
Architectural Support for Programming Languages and Operating Systems,
New York, NY, USA,, 2009.
[69] D. Park, B. Debnath and D. H. C. Du, "A Workload-Aware Adaptive
Hybrid Flash Translation Layer with an Efficient Caching Strategy," in 2011
IEEE 19th Annual International Symposium on Modelling, Analysis, and
Simulation of Computer and Telecommunication Systems, 2011.
[70] P. Thontirawong, M. Ekpanyapong and P. Chongstitvatana, "SCFTL: An
efficient caching strategy for page-level flash translation layer," in 2014
International Computer Science and Engineering Conference (ICSEC), 2014.
[71] A. Gupta, R. Pisolkar, B. Urgaonkar and A. Sivasubramaniam, "Leveraging
Value Locality in Optimizing NAND Flash-based SSDs," in Proceedings of the
9th USENIX Conference on File and Storage Technologies, Berkeley, 2011.
[72] F. Chen, T. Luo and X. Zhang, "CAFTL: A Content-aware Flash
Translation Layer Enhancing the Lifespan of Flash Memory Based Solid State
Drives," in Proceedings of the 9th USENIX Conference on File and Storage
Technologies, Berkeley, 2011.
[73] P. Huang, G. Wu, X. He and W. Xiao, "An Aggressive Worn-out Flash
Block Management Scheme to Alleviate SSD Performance Degradation," in
Proceedings of the Ninth European Conference on Computer Systems, New
York, NY, USA,, 2014.
[74] L.-P. Chang, "On Efficient Wear Leveling for Large-scale Flash-memory
Storage Systems," in Proceedings of the 2007 ACM Symposium on Applied
Computing, New York, NY, USA,, 2007.
[75] Y. Hu, H. Jiang, D. Feng, L. Tian, H. Luo and C. Ren, "Exploring and
Exploiting the Multilevel Parallelism Inside SSDs for Improved Performance
and Endurance," IEEE Transactions on Computers, vol. 62, pp. 1141-1155, jun
2013.
[76] Y. Kim, A. Gupta, B. Urgaonkar, P. Berman and A. Sivasubramaniam,
"HybridStore: A Cost-Efficient, High-Performance Storage System Combining
SSDs and HDDs," in 2011 IEEE 19th Annual International Symposium on
Modelling, Analysis, and Simulation of Computer and Telecommunication
Systems, 2011.
[77] D. Shue and M. J. Freedman, "From Application Requests to Virtual IOPs:
Provisioned Key-value Storage with Libra," in Proceedings of the Ninth
European Conference on Computer Systems, New York, NY, USA,, 2014.
[78] F. Armknecht, C. Boyd, C. Carr, K. Gjøsteen, A. Jäschke, C. A. Reuter and
M. Strand, A Guide to Fully Homomorphic Encryption, 2015.
[79] S. S. Wagstaff, Jr., Cryptanalysis of Number Theoretic Ciphers, CRC Press, 2002.
[80] C. Swenson, Modern cryptanalysis: techniques for advanced code
breaking., John Wiley & Sons, 2008.
[81] J. Yi-ming and L. Sheng-li, "The Analysis of Security Weakness in
BitLocker Technology," in Proceedings of the 2010 Second International
Conference on Networks Security, Wireless Communications and Trusted
Computing - Volume 01, Washington, 2010.
[82] J. Suter, Geometric Algebra Primer, 2013.
[83] K.-D. Suh, B.-H. Suh, Y.-H. Lim, J.-K. Kim, Y.-J. Choi, Y.-N. Koh, S.-S.
Lee, S.-C. Kwon, B.-S. Choi, J.-S. Yum and others, "A 3.3 V 32 Mb NAND
flash memory with incremental step pulse programming scheme," IEEE
Journal of Solid-State Circuits, vol. 30, pp. 1149-1156, 1995.
[84] D. Stehlé, Floating-Point LLL: Theoretical and Practical Aspects, Springer,
2010, pp. 179-213.
[85] D. Stehlé and R. Steinfeld, "Faster Fully Homomorphic Encryption,"
{IACR} Cryptology ePrint Archive, vol. 2010, p. 299, 2010.
[86] D. Stehlé and R. Steinfeld, "Faster Fully Homomorphic Encryption," in
ASIACRYPT, 2010.
[87] R. Snyder, "Some Security Alternatives for Encrypting Information on
Storage Devices," in Proceedings of the 3rd Annual Conference on Information
Security Curriculum Development, New York, NY, USA,, 2006.
[88] B. Schneier, Secrets & Lies: Digital Security in a Networked World, 1st ed.,
New York, NY, USA: John Wiley & Sons, Inc., 2000.
[89] V. Rijmen and B. Preneel, "Improved Characteristics for Differential
Cryptanalysis of Hash Functions Based on Block Ciphers," in Fast Software
Encryption: Second International Workshop. Leuven, Belgium, 14-16
December 1994, Proceedings, 1994.
[90] N. Palaniswamy, D. M. Dipesh, J. N. D. Kumar and S. G. Raaja, "Notice of
Violation of IEEE Publication Principles Enhanced Blowfish algorithm using
bitmap image pixel plotting for security improvisation," in 2010 2nd
International Conference on Education Technology and Computer, 2010.
[91] packetizer, "AES Crypt or AES-Crypt," 2018. [Online]. Available:
https://www.aescrypt.com.
[92] E. O'Sullivan and F. Regazzoni, "Efficient Arithmetic for Lattice-based
Cryptography: Special Session Paper," in Proceedings of the Twelfth
IEEE/ACM/IFIP International Conference on Hardware/Software Codesign
and System Synthesis Companion, New York, NY, USA, 2017.
[93] R. Olsson, Performance differences in encryption software versus storage
devices, 2012, p. 36.
[94] D. Mittal, D. Kaur and A. Aggarwal, "Secure Data Mining in Cloud Using
Homomorphic Encryption," in 2014 IEEE International Conference on Cloud
Computing in Emerging Markets (CCEM), 2014.
[95] K. Minematsu, "Improved Security Analysis of XEX and LRW Modes," in
Proceedings of the 13th International Conference on Selected Areas in
Cryptography, Berlin, 2007.
[96] G. Campardo, R. Micheloni and D. Novosel, VLSI-Design of
Non-Volatile Memories, New York: Springer, 2005.
[97] D. Micciancio, The Geometry of Lattice Cryptography, A. Aldini and R.
Gorrieri, Eds., Berlin, Heidelber: Springer Berlin Heidelberg, 2011, pp. 185-
210.
[98] L. Martin, "XTS: A Mode of AES for Encrypting Hard Disks," IEEE
Security Privacy, vol. 8, pp. 68-69, may 2010.
[99] J.-D. Lee, S.-H. Hur and J.-D. Choi, "Effects of floating-gate interference
on NAND flash memory cell operation," IEEE Electron Device Letters, vol. 23,
pp. 264-266, may 2002.
[100] S. K. Lai, J. Lee and V. K. Dham, "Electrical properties of nitrided-oxide
systems for use in gate dielectrics and EEPROM," in 1983 International
Electron Devices Meeting, 1983.
[101] D. Kahng and S. M. Sze, "A floating gate and its application to memory
devices," The Bell System Technical Journal, vol. 46, pp. 1288-1295, jul 1967.
[102] C. Gentry, S. Halevi and N. P. Smart, "Homomorphic Evaluation of the
AES Circuit," in Proceedings of the 32Nd Annual Cryptology Conference on
Advances in Cryptology --- CRYPTO 2012 - Volume 7417, New York, NY,
USA, 2012.
[103] N. Ferguson, J. Kelsey, S. Lucks, B. Schneier, M. Stay, D. Wagner and D.
Whiting, "Improved Cryptanalysis of Rijndael," in Proceedings of the 7th
International Workshop on Fast Software Encryption, London, 2001.
[104] Y. Doröz, J. Hoffstein, J. Pipher, J. H. Silverman, B. Sunar, W. Whyte and
Z. Zhang, "Fully Homomorphic Encryption from the Finite Field Isomorphism
Problem," {IACR} Cryptology ePrint Archive, vol. 2017, p. 548, 2017.
[105] Diskcryptor, "Diskcryptor," 2011. [Online]. Available:
https://diskcryptor.net/wiki/Main_Page.
[106] W. Dai, Y. Doröz, Y. Polyakov, K. Rohloff, H. Sajjadpour, E. Savas and B.
Sunar, "Implementation and Evaluation of a Lattice-Based Key-Policy ABE
Scheme," {IEEE} Trans. Information Forensics and Security, vol. 13, pp. 1169-
1184, 2018.
[107] A. Czeskis, D. J. S. Hilaire, K. Koscher, S. D. Gribble, T. Kohno and B.
Schneier, "Defeating Encrypted and Deniable File Systems: TrueCrypt V5.1a
and the Case of the Tattling OS and Applications," Berkeley, 2008.
[108] J. H. Cheon and D. Stehlé, "Fully Homomorphic Encryption over the
Integers Revisited," {IACR} Cryptology ePrint Archive, vol. 2016, p. 837,
2016.
[109] J. H. Cheon and D. Stehlé, "Fully Homomophic Encryption over the
Integers Revisited," in EUROCRYPT (1), 2015.
[110] K. K. Chauhan, A. K. S. Sanger and A. Verma, "Homomorphic Encryption
for Data Security in Cloud Computing," in 2015 International Conference on
Information Technology (ICIT), 2015.
[111] N. Chan, M. F. Beug, R. Knoefler, T. Mueller, T. Melde, M. Ackermann, S.
Riedel, M. Specht, C. Ludwig and A. T. Tilke, "Metal control gate for sub-
30nm floating gate NAND memory," in 2008 9th Annual Non-Volatile Memory
Technology Symposium (NVMTS), 2008.
[112] A. Chakraborti, C. Chen and R. Sion, "POSTER: DataLair: A Storage
Block Device with Plausible Deniability," in Proceedings of the 2016 ACM
SIGSAC Conference on Computer and Communications Security, New York,
NY, USA,, 2016.
[113] Z. Brakerski, C. Gentry and V. Vaikuntanathan, "(Leveled) Fully
Homomorphic Encryption Without Bootstrapping," in Proceedings of the 3rd
Innovations in Theoretical Computer Science Conference, New York, NY,
USA, 2012.
[114] E. Biham, O. Dunkelman and N. Keller, "The Rectangle Attack -
Rectangling the Serpent," in Advances in Cryptology – Proceedings of
EUROCRYPT 2001, LNCS 2045, 2001.
[115] E. Biham, "New Types of Cryptanalytic Attacks Using Related Keys," in
Advances in Cryptology --- Eurocrypt'93, Berlin, 1994.
[116] D. Benarroch, Z. Brakerski and T. Lepoint, "FHE over the Integers:
Decomposed and Batched in the Post-Quantum Regime," in Proceedings, Part
II, of the 20th IACR International Conference on Public-Key Cryptography ---
PKC 2017 - Volume 10175, New York, NY, USA, 2017.
Appendix A – Cloud Storage SSD
The following steps walk through creating VMs with various types of SSD storage in
the Amazon Cloud. Log into the Amazon Cloud and select EC2. Launch an instance,
selecting instance types i2.xlarge and t2.micro; both are SSD-backed VMs.
1. Visit Amazon Cloud Services EC2 website at
https://us-west-2.console.aws.amazon.com/console.
2. Create a VM following instruction from Amazon.
In the first evaluation I compared these two types of VMs and confirmed that
storage-optimized VMs have better performance. First, I created the two types of VMs
in AWS: instance type i2.xlarge and instance type t2.micro.
3. Installed FIO benchmark tool.
root@ip-172-31-17-80:/home/ubuntu/Desktop# wget http://brick.kernel.dk/snaps/fio-2.1.10.tar.gz
root@ip-172-31-17-80:/home/ubuntu/Desktop# gunzip fio-2.1.10.tar.gz
root@ip-172-31-17-80:/home/ubuntu/Desktop# tar -xf fio-2.1.10.tar
Run the following command to calculate benchmarks for performance
fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw --size=1024m --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite60j8 --output=/home/output/4kdmcryptrandreadwrite60j8
Sample Generated output:
From this output I gathered IOPS information for block sizes from 4 KB to 1024 KB on
1 GB files, and calculated the times for sequential and random reads and writes. I used
these performance metrics to understand the performance characteristics of SSDs in the
Cloud.
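Extracting the IOPS figures from fio's classic text output can be scripted. The sketch below assumes fio 2.x's default output format; the sample lines are illustrative rather than measured results:

```python
# Sketch: pull read/write IOPS out of fio's classic text output.
# The output format is assumed from fio 2.x; the sample below is
# illustrative, not measured data.
import re

sample = """
4krandreadwrite60j8: (groupid=0, jobs=8): err= 0: pid=1234
  read : io=614596KB, bw=10243KB/s, iops=2560, runt= 60001msec
  write: io=409864KB, bw=6831KB/s, iops=1707, runt= 60001msec
"""

def parse_iops(text):
    """Return {'read': iops, 'write': iops} found in fio classic output."""
    out = {}
    for op in ("read", "write"):
        # Match e.g. "  read : io=..., bw=..., iops=2560, ..."
        m = re.search(rf"{op}\s*: .*?iops=(\d+)", text)
        if m:
            out[op] = int(m.group(1))
    return out

print(parse_iops(sample))  # {'read': 2560, 'write': 1707}
```

Running this over each saved --output file gives the IOPS series used to compare block sizes and workloads.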
Appendix B – Cloud Storage and Encryptions
After evaluating the storage-optimized SSD VM versus the regular SSD VM, the results
showed that the storage-optimized SSD outperformed the regular SSD. I then evaluated
a regular SSD, a hardware-encrypted SSD, and software-encrypted containers, running
the FIO benchmarks to understand the performance penalties of encryption software in a
Cloud environment.
1. Visit the Amazon Cloud Services EC2 website at
https://us-west-2.console.aws.amazon.com/console.
All the VM types are t2.micro (variable ECUs, 1 vCPU, 2.5 GHz, Intel Xeon family, 1 GiB memory, EBS only) running Ubuntu Server 16.04 LTS (HVM), SSD volume type, ami-efd0428f.
2. Create two VMs following the instructions from Amazon: instance type t2.micro with a regular SSD.
3. Create an instance of type t2.micro with an encrypted SSD.
4. Install the BestCrypt encryption software on a 3 GB volume and the dm-crypt software on a 3 GB volume on one of the t2.micro regular-SSD VMs.
root@ip-172-31-17-80: yum install gcc kernel-devel kernel-headers dkms
root@ip-172-31-17-80: wget -O /etc/yum.repos.d/bestcrypt.repo https://www.jetico.com/packages/el/bestcrypt.repo
root@ip-172-31-17-80: yum install bestcrypt bestcrypt-panel
root@ip-172-31-17-80: bctool new /root/BestCrypt -a Rijndael -s 3gb -d password
root@ip-172-31-17-80: bctool format /root/BestCrypt -t ext3
Enter password:
root@ip-172-31-17-80:/sys/block/xvda/queue# apt-get install cryptsetup
Reading package lists... Done
Building dependency tree
Reading state information... Done
cryptsetup is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 50 not upgraded.
root@ip-172-31-17-80:/sys/block/xvda/queue# fallocate -l 2048M /root/dmcrypt
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

WARNING!
========
This will overwrite data on /root/dmcrypt irrevocably.

Are you sure? (Type uppercase yes): y
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

WARNING!
========
This will overwrite data on /root/dmcrypt irrevocably.

Are you sure? (Type uppercase yes): yes
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

WARNING!
========
This will overwrite data on /root/dmcrypt irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
Passphrases do not match.
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

WARNING!
========
This will overwrite data on /root/dmcrypt irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
root@ip-172-31-17-80:/sys/block/xvda/queue# df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 492M   12K  492M   1% /dev
tmpfs                100M  384K   99M   1% /run
/dev/xvda1           7.8G  3.7G  3.7G  50% /
none                 4.0K     0  4.0K   0% /sys/fs/cgroup
none                 5.0M     0  5.0M   0% /run/lock
none                 497M   68K  497M   1% /run/shm
none                 100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:/sys/block/xvda/queue# cd /root
root@ip-172-31-17-80:~# ls -last
total 2097208
      4 drwx------  8 root   root         4096 Apr  9 20:40 .
      4 drwxr-xr-x 22 root   root         4096 Apr  9 09:06 ..
      8 -rw-------  1 root   root         6914 Apr  9 13:17 .bash_history
      4 -rw-r--r--  1 root   root         3106 Feb 20  2014 .bashrc
      4 drwxr-xr-x  3 root   root         4096 Apr  9 10:11 BestCrypt
      4 drwx------  2 root   root         4096 Apr  9 09:07 .cache
      4 drwxr-xr-x  3 root   root         4096 Apr  9 10:02 .config
      4 drwx------  3 root   root         4096 Apr  9 10:02 .dbus
2097156 -rw-r--r--  1 root   root   2147483648 Apr  9 20:41 dmcrypt
      4 drwxr-xr-x  2 ubuntu ubuntu       4096 Apr  9 13:09 plain
      4 -rw-r--r--  1 root   root          140 Feb 20  2014 .profile
      4 drwx------  2 root   root         4096 Apr  9 09:06 .ssh
      4 -rw-------  1 root   root          648 Apr  9 10:41 .viminfo
root@ip-172-31-17-80:~# file /root/dmcrypt
/root/dmcrypt: LUKS encrypted file, ver 1 [aes, xts-plain64, sha1] UUID: 5302390d-a47a-47cc-99a7-a846d164197c
root@ip-172-31-17-80:~# cryptsetup luksOpen /root/dmcrypt dmcrypt
Enter passphrase for /root/dmcrypt:
root@ip-172-31-17-80:~# df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 492M   12K  492M   1% /dev
tmpfs                100M  388K   99M   1% /run
/dev/xvda1           7.8G  3.7G  3.7G  50% /
none                 4.0K     0  4.0K   0% /sys/fs/cgroup
none                 5.0M     0  5.0M   0% /run/lock
none                 497M   68K  497M   1% /run/shm
none                 100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:~# mkfs.ext4 -j /dev/mapper/dmcrypt
mke2fs 1.42.9 (4-Feb-2014)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
131072 inodes, 523776 blocks
26188 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=536870912
16 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

root@ip-172-31-17-80:~# df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 492M   12K  492M   1% /dev
tmpfs                100M  388K   99M   1% /run
/dev/xvda1           7.8G  3.7G  3.7G  50% /
none                 4.0K     0  4.0K   0% /sys/fs/cgroup
none                 5.0M     0  5.0M   0% /run/lock
none                 497M   68K  497M   1% /run/shm
none                 100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:~# mkdir dmcrypt
mkdir: cannot create directory ‘dmcrypt’: File exists
root@ip-172-31-17-80:~# pwd
/root
root@ip-172-31-17-80:~# cd /
root@ip-172-31-17-80:/# mkdir dmcrypt
root@ip-172-31-17-80:/# mount /dev/mapper/dmcrypt /dmcrypt
root@ip-172-31-17-80:/# df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 492M   12K  492M   1% /dev
tmpfs                100M  388K   99M   1% /run
/dev/xvda1           7.8G  3.7G  3.7G  50% /
none                 4.0K     0  4.0K   0% /sys/fs/cgroup
none                 5.0M     0  5.0M   0% /run/lock
none                 497M   68K  497M   1% /run/shm
none                 100M  8.0K  100M   1% /run/user
/dev/mapper/dmcrypt  2.0G  3.0M  1.9G   1% /dmcrypt
root@ip-172-31-17-80:/#
5. Run the FIO benchmarks on these four storage types: SSD, encrypted SSD, dm-crypt container, and BestCrypt container.
fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw --size=1024m --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite60j8 --output=/home/output/4kdmcryptrandreadwrite60j8
6. Sample generated output:
From this output I gathered IOPS information for block sizes from 4 KB to 1024 KB on 1 GB files, and measured the times for sequential and random reads and writes on all VMs. I used these performance metrics to weigh SSD characteristics against the performance penalties of the encryption software in the cloud. The results showed a clear performance overhead for software-based encryption compared with regular or hardware-encrypted SSDs.
7. Hidden encrypted container information.
Note that the df -h command shown above can be run by anyone, and its output reveals the encrypted container's mount information, which may be a security concern.
Appendix C – Multi-Vector Based Encryption

I then performed a survey of homomorphic encryption techniques. I took the multi-vector-based homomorphic encryption proposed by David William Honorio Araujo da Silva in his Master's thesis, converted it into an executable program, and ran it on an AWS VM for file-level encryption. Because this is a symmetric scheme, I chose the AES-Crypt symmetric encryption software for comparison, wrote the program with a command-line interface similar to AES-Crypt's, and compared the results.
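The core of the scheme can be sketched compactly before the full C listing. Each operand is a two-component multivector (m0, m1) over the rationals; encryption sandwiches the message between two keys via geometric products, and decryption applies the inverse keys on each side. A minimal Python sketch mirroring the xlm_lambda_* products of the listing (key and message values are arbitrary illustrations; the inverse here reduces to division by the scalar amplitude m0^2 + m1^2, which is what xlm_inverse effectively computes):

```python
from fractions import Fraction as F

# Two-component multivector (m0, m1); the three products mirror the
# xlm_lambda_* functions in the C listing.
def gp(a, b):                  # xlm_geometric_product
    return (a[0]*b[0] + a[1]*b[1], a[0]*b[1] - a[1]*b[0])

def gp_bivector(a, b):         # xlm_geometric_product_bivector (same lambdas)
    return gp(a, b)

def gp_bivector_vector(a, b):  # xlm_geometric_product_bivector_vector
    return (a[0]*b[0] - a[1]*b[1], a[1]*b[0] + a[0]*b[1])

def inverse(m):                # divide by the scalar amplitude m0^2 + m1^2
    amp = m[0]*m[0] + m[1]*m[1]
    return (m[0] / amp, m[1] / amp)

def encrypt(msg, k1, k2):      # C = GP_bivector(GP(K1, M), K2)
    return gp_bivector(gp(k1, msg), k2)

def decrypt(cipher, k1_inv, k2_inv):  # M = GP_bv_v(K1^-1, GP(C, K2^-1))
    return gp_bivector_vector(k1_inv, gp(cipher, k2_inv))

k1, k2 = (F(7), F(3)), (F(5), F(2))   # illustrative keys
msg = (F(123), F(124))                # message multivector
c = encrypt(msg, k1, k2)
assert decrypt(c, inverse(k1), inverse(k2)) == msg
```

Working over exact rationals is what lets decryption recover the message bit for bit; the C listing achieves the same with GMP's mpq_t.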
//
// main.c
// XLogos with MPQ
//

#include <stdio.h>
#include "test_xlg.h"
#include "test_xlm.h"
#include "test_xlg_massive_encryption.h"

int main(int argc, const char * argv[]) {
    //====================== TEST XLM ======================
    //test_xlm_set_xlz();
    //test_xlm_set_int();
    //test_xlm_import();
    //test_xlm_encryption_decryption();
    //test_xlm_pack_unpack();
    //xlg_test_pair_unpair();
    //xlg_test_compression();

    //====================== TEST XLG ======================
    //test_xlg_encrypt_decrypt_str();
    //test_xlg_encrypt_decrypt_int();
    //test_xlg_encrypt_decrypt_file();
    //test_xlm_encode();
    //test_xlm_decode();

    encrypt_decrypt_file(argc, argv);
    return 0;
}
//
// xlg.h

#ifndef xlg_h
#define xlg_h

#include <stdio.h>
#include "xlm.h"
#include <time.h>

struct xlg_t {
    xlm_t key1;
    xlm_t key2;
    xlm_t key1_inverse;
    xlm_t key2_inverse;
};
typedef struct xlg_t xlg_t;

//============================== INIT and SET ==============================
void xlg_init(xlg_t *xlg);
void xlg_generate_keys(xlg_t *xlg, int key_size);
void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2);

//============================== OPERATIONS ==============================
void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg);
void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg);

//============================== UTILS ==============================
void xlg_clear(xlg_t *xlg);
void xlg_print(xlg_t xlg);

#endif /* xlg_h */
//
// xlg.c

#include "xlg.h"

//============================== INIT and SET ==============================
void xlg_init(xlg_t *xlg) {
    xlm_init(&xlg->key1);
    xlm_init(&xlg->key2);
    xlm_init(&xlg->key1_inverse);
    xlm_init(&xlg->key2_inverse);
}

void xlg_generate_keys(xlg_t *xlg, int key_size) {
    gmp_randstate_t state;
    gmp_randinit_default(state);
    time_t t;
    gmp_randseed_ui(state, time(&t));

    mpz_t key1_z;
    mpz_init(key1_z);
    mpz_urandomb(key1_z, state, key_size);
    xlm_t key1_m;
    xlm_init(&key1_m);
    xlm_set_z(&key1_m, key1_z);

    mpz_t key2_z;
    mpz_init(key2_z);
    mpz_urandomb(key2_z, state, key_size);
    xlm_t key2_m;
    xlm_init(&key2_m);
    xlm_set_z(&key2_m, key2_z);

    xlg_set_keys(xlg, key1_m, key2_m);

    //Clean up
    mpz_clear(key1_z);
    mpz_clear(key2_z);
    xlm_clear(&key1_m);
    xlm_clear(&key2_m);
    gmp_randclear(state);
}

void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2) {
    //Set keys
    xlm_set(&xlg->key1, k1);
    xlm_set(&xlg->key2, k2);

    //Set inverse keys
    xlm_t key1_inverse;
    xlm_init(&key1_inverse);
    xlm_set(&key1_inverse, k1);
    xlm_inverse(&key1_inverse);
    xlm_set(&xlg->key1_inverse, key1_inverse);
    xlm_clear(&key1_inverse);

    xlm_t key2_inverse;
    xlm_init(&key2_inverse);
    xlm_set(&key2_inverse, k2);
    xlm_inverse(&key2_inverse);
    xlm_set(&xlg->key2_inverse, key2_inverse);
    xlm_clear(&key2_inverse);
}

//============================== OPERATIONS ==============================
void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg) {
    //Encrypt
    xlm_t gp1_encryption;
    xlm_init(&gp1_encryption);
    gmp_printf("Key: %Qd + %Qd\n", xlg.key1.m0, xlg.key1.m1);
    gmp_printf("Message: %Qd + %Qd\n", message.m0, message.m1);
    xlm_geometric_product(&gp1_encryption, &xlg.key1, &message);

    xlm_t cypher;
    xlm_init(&cypher);
    xlm_geometric_product_bivector(&cypher, &gp1_encryption, &xlg.key2);
    gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher.m0, cypher.m1);
    xlm_set(dest_cypher, cypher);

    //Clean up
    xlm_clear(&gp1_encryption);
    xlm_clear(&cypher);
}

void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg) {
    //Decrypt
    xlm_t gp1_decryption;
    xlm_init(&gp1_decryption);
    gmp_printf("\nKey1 Inverse: %Qd + %Qd\n", xlg.key1_inverse.m0, xlg.key1_inverse.m1);
    gmp_printf("Key2 Inverse: %Qd + %Qd\n", xlg.key2_inverse.m0, xlg.key2_inverse.m1);
    xlm_geometric_product(&gp1_decryption, &cypher, &xlg.key2_inverse);
    gmp_printf("Decrypted Values: %Qd + %Qd\n", gp1_decryption.m0, gp1_decryption.m1);
    xlm_t decrypt;
    xlm_init(&decrypt);
    xlm_geometric_product_bivector_vector(&decrypt, &xlg.key1_inverse, &gp1_decryption);
    gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0, decrypt.m1);
    xlm_set(dest_decrypt, decrypt);

    //Clean up
    xlm_clear(&gp1_decryption);
    xlm_clear(&decrypt);
}

//============================== UTILS ==============================
void xlg_clear(xlg_t *xlg) {
    xlm_clear(&xlg->key1);
    xlm_clear(&xlg->key2);
    xlm_clear(&xlg->key1_inverse);
    xlm_clear(&xlg->key2_inverse);
}

void xlg_print(xlg_t xlg) {
    mpz_t key1;
    mpz_init(key1);
    xlm_get_z(&key1, xlg.key1);

    mpz_t key2;
    mpz_init(key2);
    xlm_get_z(&key2, xlg.key2);

    gmp_printf("key 1 => %Zd\n", key1);
    gmp_printf("key 2 => %Zd\n", key2);

    mpz_clear(key1);
    mpz_clear(key2);
}
//
// xlm.h

#ifndef xlm_h
#define xlm_h

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gmp.h>
#include <math.h>
#include "xlg_compression.h"

struct xlm_t {
    mpq_t m0;
    mpq_t m1;
};
typedef struct xlm_t xlm_t;

//============================== INIT and SET ==============================
void xlm_init(xlm_t *dest);
void xlm_set(xlm_t *dest, xlm_t src);
void xlm_set_z(xlm_t *dest, mpz_t z);
void xlm_set_si(xlm_t *dest, signed long int si);
void xlm_import_str(xlm_t *dest, char *str);
void xlm_import_str_w_size(xlm_t *dest, char *str, long size);

//============================== XLM EXPORT ==============================
void xlm_get_z(mpz_t *dest, xlm_t xlm);
signed long int xlm_get_si(xlm_t xlm);
char* xlm_export_str(xlm_t xlm, long *buffer_size);

//============================== UTILS ==============================
void xlm_print(xlm_t m);
void xlm_clear(xlm_t *m);
void xlm_pack(mpz_t dst, xlm_t src);
void xlm_unpack(xlm_t *dst, mpz_t src);
size_t xlm_out_raw(FILE *stream, xlm_t src);
size_t xlm_inp_raw(xlm_t *dst, FILE *stream);

//============================== OPERATIONS ==============================
void xlm_geometric_product(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_geometric_product_bivector(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_clifford_conjugation(xlm_t *m);
void xlm_reverse(xlm_t *m);
void xlm_amplitude_squared(xlm_t *m);
void xlm_amplitude_squared_reversed(xlm_t *m);
void xlm_rationalize(xlm_t *m);
void xlm_scalar_div(xlm_t *m, mpq_t scalar);
void xlm_inverse(xlm_t *m);
void xlm_lambda_0(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_0_bivector(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1_bivector(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_0_bivector_vector(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1_bivector_vector(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);

#endif /* xlm_h */
//
// xlm.c

#include "xlm.h"

//============================== INIT and SET ==============================
void xlm_init(xlm_t *dest) {
    mpq_init(dest->m0);
    mpq_init(dest->m1);
}

void xlm_set(xlm_t *dest, xlm_t src) {
    mpq_set(dest->m0, src.m0);
    mpq_set(dest->m1, src.m1);
}

void xlm_set_z(xlm_t *dest, mpz_t z) {
    //Init base and reminder
    mpz_t base;
    mpz_init(base);
    mpz_t reminder;
    mpz_init(reminder);

    //Compute values
    mpz_div_ui(base, z, 2);
    mpz_mod_ui(reminder, z, 2);

    //Get reminder in mpq
    mpq_t reminder_mpq;
    mpq_init(reminder_mpq);
    mpq_set_z(reminder_mpq, reminder);

    mpq_set_z(dest->m0, base);
    mpq_set_z(dest->m1, base);
    mpq_add(dest->m1, dest->m1, reminder_mpq);

    //Adjust coefficients
    if (mpz_cmp_ui(reminder, 0) == 0) {
        mpq_t mpq_1;
        mpq_init(mpq_1);
        mpq_set_ui(mpq_1, 1, 1);
        mpq_add(dest->m1, dest->m1, mpq_1);
        mpq_clear(mpq_1);
    }

    mpz_clear(base);
    mpz_clear(reminder);
    mpq_clear(reminder_mpq);
}

void xlm_set_si(xlm_t *dest, signed long int si) {
    mpz_t z;
    mpz_init_set_si(z, si);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

void xlm_import_str(xlm_t *dest, char *str) {
    mpz_t z;
    mpz_init(z);
    mpz_import(z, sizeof(str), 1, sizeof(str[0]), 0, 0, str);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

void xlm_import_str_w_size(xlm_t *dest, char *str, long size) {
    mpz_t z;
    mpz_init(z);
    mpz_import(z, size, 1, sizeof(str[0]), 0, 0, str);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

//============================== XLM GET ==============================
void xlm_get_z(mpz_t *dest, xlm_t xlm) {
    mpz_t mpz_m0; mpz_init(mpz_m0);
    mpz_t mpz_m1; mpz_init(mpz_m1);

    mpz_set_q(mpz_m0, xlm.m0);
    mpz_set_q(mpz_m1, xlm.m1);

    mpz_add(*dest, mpz_m0, mpz_m1);

    mpz_clear(mpz_m0);
    mpz_clear(mpz_m1);
}

signed long int xlm_get_si(xlm_t xlm) {
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z, xlm);
    signed long int si = mpz_get_si(z);
    mpz_clear(z);
    return si;
}
char* xlm_export_str(xlm_t xlm, long *buffer_size) {
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z, xlm);

    //Alloc memory to destination buffer
    long size = sizeof(char);
    long nail = 0;
    long numb = 8*size - nail;
    long count = (mpz_sizeinbase(z, 2) + numb - 1) / numb;
    char *buffer = malloc(count * size);

    if (buffer_size != NULL) {
        *buffer_size = count * size;
    }

    //Export to buffer
    mpz_export(buffer, NULL, 1, size, 0, nail, z);
    mpz_clear(z);

    return buffer;
}
//============================== UTILS ==============================
void xlm_clear(xlm_t *m) {
    mpq_clear(m->m0);
    mpq_clear(m->m1);
}

void xlm_print(xlm_t m) {
    gmp_printf("%+Qd e0 ", m.m0);
    gmp_printf("%+Qd e1 \n", m.m1);
}

void xlm_pack(mpz_t dst, xlm_t src) {
    mpz_t m0_m1;
    mpz_t m0, m1;
    mpz_inits(m0_m1, m0, m1, NULL);

    //Get mpz values of coefficients
    mpz_set_q(m0, src.m0);
    mpz_set_q(m1, src.m1);

    //Set absolute values
    mpz_abs(m0, m0);
    mpz_abs(m1, m1);

    //Pair coefficients
    xlg_pair(dst, m0, m1);

    //Pack signs of coefficients
    unsigned int signs = 0;
    signs = signs + (int)((mpq_cmp_si(src.m0, 0, 1) < 0) ? pow(2,7) : 0);
    signs = signs + (int)((mpq_cmp_si(src.m1, 0, 1) < 0) ? pow(2,6) : 0);

    mpz_mul_ui(dst, dst, 256);
    mpz_add_ui(dst, dst, signs);

    mpz_clears(m0_m1, m0, m1, NULL);
}

void xlm_unpack(xlm_t *dst, mpz_t src) {
    mpz_t m0_m1;
    mpz_t m0, m1;
    mpz_inits(m0_m1, m0, m1, NULL);

    //Get signs
    mpz_t signs_z;
    mpz_init(signs_z);
    mpz_mod_ui(signs_z, src, 256);
    mpz_div_ui(src, src, 256);
    unsigned long signs = mpz_get_ui(signs_z);

    //Unpair coefficients
    xlg_unpair(m0, m1, src);

    //Adjust sign
    if ((signs & 1) > 0)
        mpz_mul_si(m0, m0, -1);
    if ((signs & 2) > 0)
        mpz_mul_si(m1, m1, -1);

    //Set coefficients
    mpq_set_z(dst->m0, m0);
    mpq_set_z(dst->m1, m1);

    mpz_clear(signs_z);
    mpz_clears(m0_m1, m0, m1, NULL);
}

size_t xlm_out_raw(FILE *stream, xlm_t src) {
    mpz_t blades[2];
    for (int i = 0; i < 2; i++) {
        mpz_init(blades[i]);
    }
    mpz_set_q(blades[0], src.m0);
    mpz_set_q(blades[1], src.m1);

    size_t size = 0;
    for (int i = 0; i < 2; i++) {
        size += mpz_out_raw(stream, blades[i]);
    }
    for (int i = 0; i < 2; i++) {
        mpz_clear(blades[i]);
    }
    return size;
}

size_t xlm_inp_raw(xlm_t *dst, FILE *stream) {
    mpz_t blades[2];
    for (int i = 0; i < 2; i++) {
        mpz_init(blades[i]);
    }
    size_t rsize = 0;
    for (int i = 0; i < 2; i++) {
        rsize += mpz_inp_raw(blades[i], stream);
    }
    mpq_set_z(dst->m0, blades[0]);
    mpq_set_z(dst->m1, blades[1]);

    for (int i = 0; i < 2; i++) {
        mpz_clear(blades[i]);
    }
    return rsize;
}
//============================== OPERATIONS ==============================
void xlm_geometric_product(xlm_t *dest, xlm_t *m0, xlm_t *m1) {
    xlm_lambda_0(&dest->m0, m0, m1);
    xlm_lambda_1(&dest->m1, m0, m1);
}

void xlm_geometric_product_bivector(xlm_t *dest, xlm_t *m0, xlm_t *m1) {
    xlm_lambda_0_bivector(&dest->m0, m0, m1);
    xlm_lambda_1_bivector(&dest->m1, m0, m1);
}

void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t *m0, xlm_t *m1) {
    xlm_lambda_0_bivector_vector(&dest->m0, m0, m1);
    xlm_lambda_1_bivector_vector(&dest->m1, m0, m1);
}
void xlm_lambda_0(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m1);
    mpq_mul(mb, mv1->m1, mv2->m0);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_0_bivector(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1_bivector(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m1);
    mpq_mul(mb, mv1->m1, mv2->m0);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_0_bivector_vector(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1_bivector_vector(mpq_t *m, xlm_t *mv1, xlm_t *mv2) {
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m1, mv2->m0);
    mpq_mul(mb, mv1->m0, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}
void xlm_clifford_conjugation(xlm_t *m) {
    mpq_t minus_one;
    mpq_init(minus_one);
    mpq_set_si(minus_one, 1, -1);

    //mpq_mul(m->m0, m->m0, minus_one);
    //mpq_mul(m->m1, m->m1, minus_one);
    //mpq_mul(m->m1, m->m1, minus_one);

    mpq_clear(minus_one);
}

void xlm_reverse(xlm_t *m) {
    mpq_t minus_one;
    mpq_init(minus_one);
    mpq_set_si(minus_one, 1, -1);

    //mpq_mul(m->m0, m->m0, minus_one);
    //mpq_mul(m->m0, m->m0, minus_one);
    mpq_mul(m->m1, m->m1, minus_one);

    mpq_clear(minus_one);
}
void xlm_amplitude_squared(xlm_t *m) {
    gmp_printf("Input1: %Qd\n", m->m0);
    gmp_printf("Input2: %Qd\n", m->m1);

    //Compute clifford conjugation of m and store it on clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);
    gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m0);
    gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m1);

    //Compute geometric product of m and clifford_conj and store it on amplitude_squared
    xlm_t amplitude_squared;
    xlm_init(&amplitude_squared);
    xlm_geometric_product(&amplitude_squared, m, &clifford_conj);
    gmp_printf("Amplitude squared: %Qd\n", amplitude_squared.m0);

    //Make the pointer content equal amplitude_squared
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&amplitude_squared);
}

void xlm_amplitude_squared_reversed(xlm_t *m) {
    //Compute amplitude squared of m
    xlm_t amplitude_squared_reversed;
    xlm_init(&amplitude_squared_reversed);
    xlm_set(&amplitude_squared_reversed, *m);
    xlm_amplitude_squared(&amplitude_squared_reversed);

    //Compute reverse of amplitude_squared_reversed in place
    xlm_reverse(&amplitude_squared_reversed);

    //Make the pointer content equal amplitude_squared_reversed
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared_reversed);

    //Clean up
    xlm_clear(&amplitude_squared_reversed);
}
void xlm_rationalize(xlm_t *m) {
    //Compute amplitude squared of m and store it on mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);

    //Compute amplitude squared reversed of m and store it on mv_amplitude_squared_reversed
    xlm_t mv_amplitude_squared_reversed;
    xlm_init(&mv_amplitude_squared_reversed);
    xlm_set(&mv_amplitude_squared_reversed, *m);
    xlm_amplitude_squared_reversed(&mv_amplitude_squared_reversed);

    //Compute geometric product of the two and store it on mv_geometric_product
    xlm_t mv_geometric_product;
    xlm_init(&mv_geometric_product);
    xlm_geometric_product(&mv_geometric_product, &mv_amplitude_squared, &mv_amplitude_squared_reversed);

    //Make the pointer content equal mv_geometric_product
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, mv_geometric_product);

    //Clean up
    xlm_clear(&mv_amplitude_squared);
    xlm_clear(&mv_amplitude_squared_reversed);
    xlm_clear(&mv_geometric_product);
}

void xlm_scalar_div(xlm_t *m, mpq_t scalar) {
    mpq_div(m->m0, m->m0, scalar);
    mpq_div(m->m1, m->m1, scalar);
}

void xlm_inverse(xlm_t *m) {
    //Compute clifford conjugation of m and store it on clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);

    //Compute amplitude squared of m and store it on mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);

    //Rationalize
    xlm_t mv_rationalize;
    xlm_init(&mv_rationalize);
    xlm_set(&mv_rationalize, *m);
    xlm_rationalize(&mv_rationalize);

    xlm_t mv_geometric_product;
    xlm_init(&mv_geometric_product);
    xlm_set(&mv_geometric_product, *m);
    //Perform scalar div on geometric product
    xlm_scalar_div(&mv_geometric_product, mv_amplitude_squared.m0);

    //Make the pointer content equal mv_geometric_product
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, mv_geometric_product);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&mv_amplitude_squared);
    xlm_clear(&mv_geometric_product);
    xlm_clear(&mv_rationalize);
}
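The reason xlm_inverse can reduce to a scalar division is visible from the λ-products above: the self-product of a multivector has no e1 component, so the amplitude is a pure scalar. A quick check over exact rationals (the sample values are arbitrary):

```python
from fractions import Fraction as F

def gp(a, b):  # the lambda_0 / lambda_1 product from the listing
    return (a[0]*b[0] + a[1]*b[1], a[0]*b[1] - a[1]*b[0])

m = (F(9, 4), F(-7, 3))
amp = gp(m, m)                        # (m0^2 + m1^2, 0): a pure scalar
assert amp == (m[0]**2 + m[1]**2, 0)

inv = (m[0] / amp[0], m[1] / amp[0])  # scalar division, as in xlm_inverse
assert gp(m, inv) == (1, 0)           # m times its inverse is the scalar 1
```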
//
// xlg_massive_encryption.h

#ifndef xlg_massive_encryption_h
#define xlg_massive_encryption_h

#include <stdio.h>
#include "xlg.h"

void xlg_encrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_decrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg);
void xlg_encode(xlg_t xlg);
void xlg_decode(xlg_t xlg);

#endif /* xlg_massive_encryption_h */
//
// xlg_massive_encryption.c

#include "xlg_massive_encryption.h"

int BUFFER_SIZE = 1024*10;

void xlg_encrypt_file(char* src, char* dst, xlg_t xlg) {
    FILE *src_file = fopen(src, "rb");
    FILE *dst_file = fopen(dst, "wb");

    while (!feof(src_file)) {
        //Read file
        long nread = 1;
        char buffer[BUFFER_SIZE];
        buffer[0] = 1;
        while (nread < BUFFER_SIZE-1 && !feof(src_file)) {
            int c = getc(src_file);
            if (c != EOF) {
                buffer[nread] = c;
                nread++;
            }
        }

        //Import
        xlm_t message;
        xlm_init(&message);
        xlm_import_str_w_size(&message, buffer, nread);
        gmp_printf("Message Values: %Qd + %Qd\n", message.m0, message.m1);

        //Encrypt
        xlm_t cypher_xlm;
        xlm_init(&cypher_xlm);
        xlg_encrypt(&cypher_xlm, message, xlg);
        gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher_xlm.m0, cypher_xlm.m1);

        //Write to file
        xlm_out_raw(dst_file, cypher_xlm);

        //Clean up
        xlm_clear(&message);
        xlm_clear(&cypher_xlm);
    }
    fclose(src_file);
    fclose(dst_file);
}

void xlg_decrypt_file(char* src, char* dst, xlg_t xlg) {
    FILE *src_file = fopen(src, "rb");
    FILE *dst_file = fopen(dst, "wb");

    while (!feof(src_file)) {
        //Read file
        xlm_t cypher_xlm;
        xlm_init(&cypher_xlm);
        size_t nread = xlm_inp_raw(&cypher_xlm, src_file);
        if (nread <= 0) {
            xlm_clear(&cypher_xlm);
            break;
        }

        //Decrypt
        xlm_t decrypt;
        xlm_init(&decrypt);
        xlg_decrypt(&decrypt, cypher_xlm, xlg);
        gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0, decrypt.m1);

        //Export
        long size;
        char* buffer = xlm_export_str(decrypt, &size);

        //Write file
        long nwrite = 1;
        while (nwrite < size) {
            putc(buffer[nwrite++], dst_file);
        }

        //Clean up
        free(buffer);
        xlm_clear(&cypher_xlm);
        xlm_clear(&decrypt);
    }
    fclose(src_file);
    fclose(dst_file);
}
void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg) {
    FILE *dst_file = fopen(dst_path, "ab");

    char *buffer = malloc(strlen(data_buffer)+2);
    memset(buffer, 0, strlen(data_buffer)+2);
    buffer[0] = 1;
    buffer = strcat(buffer, data_buffer);

    //Import
    xlm_t data;
    xlm_init(&data);
    xlm_import_str_w_size(&data, buffer, strlen(data_buffer)+2);

    //Encrypt
    xlm_t cypher_xlm;
    xlm_init(&cypher_xlm);
    xlg_encrypt(&cypher_xlm, data, xlg);

    //Write to file
    xlm_out_raw(dst_file, cypher_xlm);

    //Clean up
    xlm_clear(&data);
    xlm_clear(&cypher_xlm);
    free(buffer);
    fclose(dst_file);
}

void xlg_encode(xlg_t xlg) {
    while (!feof(stdin)) {
        long nread = 1;
        char buffer[BUFFER_SIZE];
        buffer[0] = 1; // THANKS HANES!!!
        while (nread < BUFFER_SIZE-1 && !feof(stdin)) {
            int c = getchar();
            if (c != EOF) {
                buffer[nread] = c;
                nread++;
            }
        }

        //Import
        xlm_t message;
        xlm_init(&message);
        xlm_import_str_w_size(&message, buffer, nread);

        //Encrypt
        xlm_t cypher_xlm;
        xlm_init(&cypher_xlm);
        xlg_encrypt(&cypher_xlm, message, xlg);

        gmp_fprintf(stdout, "%Qd\n", cypher_xlm.m0);
        gmp_fprintf(stdout, "%Qd\n", cypher_xlm.m1);

        xlm_clear(&message);
        xlm_clear(&cypher_xlm);
    }
}

void xlg_decode(xlg_t xlg) {
    FILE *stream;
    char *line = NULL;
    size_t len = 0;
    ssize_t read;

    stream = stdin;
    if (stream == NULL)
        exit(0);

    int count = 0;
    mpq_t m0;
    mpq_t m1;
    xlm_t cypher;

    while ((read = getline(&line, &len, stdin)) != -1) {
        if (count % 2 == 0) {
            mpq_init(m0);
            mpq_set_str(m0, line, 10);
            count++;
        }
        else if (count % 2 == 1) {
            mpq_init(m1);
            mpq_set_str(m1, line, 10);
            count++;

            xlm_init(&cypher);
            mpq_set(cypher.m0, m0);
            mpq_set(cypher.m1, m1);

            //Decrypt
            xlm_t decrypt;
            xlm_init(&decrypt);
            xlg_decrypt(&decrypt, cypher, xlg);

            //Export
            long size;
            char* buffer = xlm_export_str(decrypt, &size);

            long nwrite = 1; //Thanks Hanes
            while (nwrite < size) {
                putchar(buffer[nwrite++]);
            }

            count = 0;
            xlm_clear(&cypher);
            xlm_clear(&decrypt);
            free(buffer);
            mpq_clear(m0);
            mpq_clear(m1);
        }
    }
    free(line);
    fclose(stream);
}
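The buffer[0] = 1 sentinel in the chunking code (the "THANKS HANES" comment) guards against losing leading zero bytes when a chunk is converted to a big integer, and the matching nwrite = 1 on export skips the sentinel on the way back out. A Python sketch of that round trip (the function names are mine, not from the listing):

```python
def chunk_to_int(chunk: bytes) -> int:
    # Prepend a 0x01 sentinel so leading zero bytes of the chunk survive
    # the bytes -> integer conversion (mirrors buffer[0] = 1 in xlg_encode).
    return int.from_bytes(b"\x01" + chunk, "big")

def int_to_chunk(n: int) -> bytes:
    # Convert back and drop the sentinel byte (mirrors nwrite = 1 on export).
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]

data = b"\x00\x00hello"  # a chunk with leading zero bytes
assert int_to_chunk(chunk_to_int(data)) == data
```

Without the sentinel, b"\x00\x00hello" and b"hello" would map to the same integer and decryption could not restore the original bytes.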
8. Using the following commands, I compared against AES-Crypt.
AES-Crypt:
time aescrypt -e -p key plaintext_file_name
time aescrypt -d -p key plaintext_file_name.aes
RVTHE:
time xlg -e -x key1 -y key2 plaintext_file_name
time xlg -d -x key1 -y key2 plaintext_file_name.xlg
time xlg -a -x key1 -y key2 "data" plaintext_file_name.xlg
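Repeated timing runs of the two tools can also be scripted instead of invoking time by hand. A minimal sketch; the placeholder command stands in for the aescrypt or xlg invocations above, which are not available here:

```python
import subprocess, sys, time

def time_cmd(cmd):
    """Run a command and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True)
    return time.perf_counter() - start, proc.returncode

# Placeholder command; in the experiments this would be e.g.
# ["aescrypt", "-e", "-p", "key", "plaintext_file_name"] or the xlg binary.
elapsed, rc = time_cmd([sys.executable, "-c", "pass"])
print(f"elapsed={elapsed:.3f}s rc={rc}")
```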
Appendix D – RVTHE

Once I designed RVTHE, I converted it into an executable program similar to AES-Crypt and ran it on AWS VMs for file-level encryption. RVTHE and AES-Crypt are both symmetric encryption schemes, which makes them directly comparable; I ran both on file-level encryption and compared the results.
1. //2. // main.c
3. #include <stdio.h>4. #include "test_xlg.h"5. #include "test_xlm.h"6. #include "test_xlg_massive_encryption.h"
7. int main(int argc, const char * argv[]) {8. encrypt_decrypt_file(argc,argv);9. //10. // xlg.h11. // XLogos with MPQ
12. #ifndef xlg_h13. #define xlg_h
14. #include <stdio.h>15. #include "xlm.h"16. #include <time.h>
17. struct xlg_t{18. xlm_t key1;19. xlm_t key2;20. xlm_t key1_inverse;21. xlm_t key2_inverse;22. };23. typedef struct xlg_t xlg_t;
24. //============================== INIT and SET ==============================
25. void xlg_init(xlg_t *xlg);26. void xlg_generate_keys(xlg_t *xlg, int key_size);27. void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2);
28. //============================== OPERATIONS ==============================
29. void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg);30. void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg);
31. //============================== UTILS ==============================32. void xlg_clear(xlg_t *xlg);33. void xlg_print(xlg_t xlg);
34. #endif /* xlg_h */
//
// xlg.c
//
#include "xlg.h"

//============================== INIT and SET ==============================
void xlg_init(xlg_t *xlg){
    xlm_init(&xlg->key1);
    xlm_init(&xlg->key2);
    xlm_init(&xlg->key1_inverse);
    xlm_init(&xlg->key2_inverse);
}

void xlg_generate_keys(xlg_t *xlg, int key_size){
    gmp_randstate_t state;
    gmp_randinit_default(state);
    time_t t;
    gmp_randseed_ui(state, time(&t));

    mpz_t key1_z;
    mpz_init(key1_z);
    mpz_urandomb(key1_z, state, key_size);
    xlm_t key1_m;
    xlm_init(&key1_m);
    xlm_set_z(&key1_m, key1_z);

    mpz_t key2_z;
    mpz_init(key2_z);
    mpz_urandomb(key2_z, state, key_size);
    xlm_t key2_m;
    xlm_init(&key2_m);
    xlm_set_z(&key2_m, key2_z);

    xlg_set_keys(xlg, key1_m, key2_m);

    //Clean up
    mpz_clear(key1_z);
    mpz_clear(key2_z);
    xlm_clear(&key1_m);
    xlm_clear(&key2_m);
    gmp_randclear(state);
}

void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2){
    //Set keys
    xlm_set(&xlg->key1, k1);
    xlm_set(&xlg->key2, k2);

    //Set inverse keys
    xlm_t key1_inverse;
    xlm_init(&key1_inverse);
    xlm_set(&key1_inverse, k1);
    xlm_inverse(&key1_inverse);
    xlm_set(&xlg->key1_inverse, key1_inverse);
    xlm_clear(&key1_inverse);

    xlm_t key2_inverse;
    xlm_init(&key2_inverse);
    xlm_set(&key2_inverse, k2);
    xlm_inverse(&key2_inverse);
    xlm_set(&xlg->key2_inverse, key2_inverse);
    xlm_clear(&key2_inverse);
}

//============================== OPERATIONS ==============================
void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg){
    //Encrypt
    xlm_t gp1_encryption;
    xlm_init(&gp1_encryption);
    xlm_geometric_product(&gp1_encryption, &xlg.key1, &message);

    xlm_t cypher;
    xlm_init(&cypher);
    xlm_geometric_product_bivector(&cypher, &gp1_encryption, &xlg.key2);
    xlm_set(dest_cypher, cypher);

    //Clean up
    xlm_clear(&gp1_encryption);
    xlm_clear(&cypher);
}

void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg){
    //Decrypt
    xlm_t gp1_decryption;
    xlm_init(&gp1_decryption);
    xlm_geometric_product(&gp1_decryption, &cypher, &xlg.key2_inverse);

    xlm_t decrypt;
    xlm_init(&decrypt);
    xlm_geometric_product_bivector_vector(&decrypt, &xlg.key1_inverse, &gp1_decryption);
    xlm_set(dest_decrypt, decrypt);

    //Clean up
    xlm_clear(&gp1_decryption);
    xlm_clear(&decrypt);
}

//============================== UTILS ==============================
void xlg_clear(xlg_t *xlg){
    xlm_clear(&xlg->key1);
    xlm_clear(&xlg->key2);
    xlm_clear(&xlg->key1_inverse);
    xlm_clear(&xlg->key2_inverse);
}

void xlg_print(xlg_t xlg){
    mpz_t key1;
    mpz_init(key1);
    xlm_get_z(&key1, xlg.key1);

    mpz_t key2;
    mpz_init(key2);
    xlm_get_z(&key2, xlg.key2);

    gmp_printf("key 1 => %Zd\n", key1);
    gmp_printf("key 2 => %Zd\n", key2);

    mpz_clear(key1);
    mpz_clear(key2);
}
//
// xlm.h
//
#ifndef xlm_h
#define xlm_h

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gmp.h>
#include <math.h>
#include "xlg_compression.h"

struct xlm_t {
    mpq_t m0;
    mpq_t m1;
};
typedef struct xlm_t xlm_t;

//============================== INIT and SET ==============================
void xlm_init(xlm_t *dest);
void xlm_set(xlm_t *dest, xlm_t src);
void xlm_set_z(xlm_t *dest, mpz_t z);
void xlm_set_si(xlm_t *dest, signed long int si);
void xlm_import_str(xlm_t *dest, char* str);
void xlm_import_str_w_size(xlm_t *dest, char* str, long size);

//============================== XLM EXPORT ==============================
void xlm_get_z(mpz_t *dest, xlm_t xlm);
signed long int xlm_get_si(xlm_t xlm);
char* xlm_export_str(xlm_t xlm, long *buffer_size);

//============================== UTILS ==============================
void xlm_print(xlm_t m);
void xlm_clear(xlm_t *m);
void xlm_pack(mpz_t dst, xlm_t src);
void xlm_unpack(xlm_t *dst, mpz_t src);
size_t xlm_out_raw(FILE* stream, xlm_t src);
size_t xlm_inp_raw(xlm_t *dst, FILE* stream);

//============================== OPERATIONS ==============================
void xlm_geometric_product(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_geometric_product_bivector(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t *m0, xlm_t *m1);
void xlm_clifford_conjugation(xlm_t *m);
void xlm_reverse(xlm_t *m);
void xlm_amplitude_squared(xlm_t *m);
void xlm_amplitude_squared_reversed(xlm_t *m);
void xlm_rationalize(xlm_t *m);
void xlm_scalar_div(xlm_t *m, mpq_t scalar);
void xlm_inverse(xlm_t *m);
void xlm_lambda_0(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_0_bivector(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1_bivector(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_0_bivector_vector(mpq_t *m0, xlm_t *mv1, xlm_t *mv2);
void xlm_lambda_1_bivector_vector(mpq_t *m1, xlm_t *mv1, xlm_t *mv2);

#endif /* xlm_h */
//
// xlm.c
//
#include "xlm.h"

//============================== INIT and SET ==============================
void xlm_init(xlm_t *dest){
    mpq_init(dest->m0);
    mpq_init(dest->m1);
}

void xlm_set(xlm_t *dest, xlm_t src){
    mpq_set(dest->m0, src.m0);
    mpq_set(dest->m1, src.m1);
}

void xlm_set_z(xlm_t *dest, mpz_t z){
    //Init base and remainder
    mpz_t base;
    mpz_init(base);
    mpz_t remainder;
    mpz_init(remainder);

    //Compute values
    mpz_div_ui(base, z, 2);
    mpz_mod_ui(remainder, z, 2);

    //Get remainder in mpq
    mpq_t remainder_mpq;
    mpq_init(remainder_mpq);
    mpq_set_z(remainder_mpq, remainder);

    mpq_set_z(dest->m0, base);
    mpq_set_z(dest->m1, base);
    mpq_add(dest->m1, dest->m1, remainder_mpq);

    //Adjust coefficients
    if(mpz_cmp_ui(remainder, 0) == 0){
        mpq_t mpq_1;
        mpq_init(mpq_1);
        mpq_set_ui(mpq_1, 1, 1);
        mpq_add(dest->m1, dest->m1, mpq_1);
        mpq_clear(mpq_1);
    }

    mpz_clear(base);
    mpz_clear(remainder);
    mpq_clear(remainder_mpq);
}

void xlm_set_si(xlm_t *dest, signed long int si){
    mpz_t z;
    mpz_init_set_si(z, si);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

void xlm_import_str(xlm_t *dest, char* str){
    mpz_t z;
    mpz_init(z);
    mpz_import(z, strlen(str), 1, sizeof(str[0]), 0, 0, str);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

void xlm_import_str_w_size(xlm_t *dest, char* str, long size){
    mpz_t z;
    mpz_init(z);
    mpz_import(z, size, 1, sizeof(str[0]), 0, 0, str);
    xlm_set_z(dest, z);
    mpz_clear(z);
}

//============================== XLM GET ==============================
void xlm_get_z(mpz_t *dest, xlm_t xlm){
    mpz_t mpz_m0; mpz_init(mpz_m0);
    mpz_t mpz_m1; mpz_init(mpz_m1);

    mpz_set_q(mpz_m0, xlm.m0);
    mpz_set_q(mpz_m1, xlm.m1);

    mpz_add(*dest, mpz_m0, mpz_m1);

    mpz_clear(mpz_m0);
    mpz_clear(mpz_m1);
}

signed long int xlm_get_si(xlm_t xlm){
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z, xlm);
    signed long int si = mpz_get_si(z);
    mpz_clear(z);
    return si;
}

char* xlm_export_str(xlm_t xlm, long *buffer_size){
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z, xlm);

    //Alloc memory for the destination buffer
    long size = sizeof(char);
    long nail = 0;
    long numb = 8*size - nail;
    long count = (mpz_sizeinbase(z, 2) + numb - 1) / numb;
    char* buffer = malloc(count * size);

    if(buffer_size != NULL){
        *buffer_size = count * size;
    }

    //Export to buffer
    mpz_export(buffer, NULL, 1, size, 0, nail, z);
    mpz_clear(z);

    return buffer;
}

//============================== UTILS ==============================
void xlm_clear(xlm_t *m){
    mpq_clear(m->m0);
    mpq_clear(m->m1);
}

void xlm_print(xlm_t m){
    gmp_printf("%+Qd e0 ", m.m0);
    gmp_printf("%+Qd e1 \n", m.m1);
}

void xlm_pack(mpz_t dst, xlm_t src){
    mpz_t m0_m1;
    mpz_t m0, m1;
    mpz_inits(m0_m1, m0, m1, NULL);

    //Get mpz values of coefficients
    mpz_set_q(m0, src.m0);
    mpz_set_q(m1, src.m1);

    //Set absolute values
    mpz_abs(m0, m0);
    mpz_abs(m1, m1);

    //Pair coefficients
    xlg_pair(dst, m0, m1);

    //Pack signs of coefficients: bit 7 marks m0 negative, bit 6 marks m1 negative
    unsigned int signs = 0;
    signs = signs + (int)((mpq_cmp_si(src.m0, 0, 1) < 0) ? pow(2,7) : 0);
    signs = signs + (int)((mpq_cmp_si(src.m1, 0, 1) < 0) ? pow(2,6) : 0);

    mpz_mul_ui(dst, dst, 256);
    mpz_add_ui(dst, dst, signs);

    mpz_clears(m0_m1, m0, m1, NULL);
}

void xlm_unpack(xlm_t *dst, mpz_t src){
    mpz_t m0_m1;
    mpz_t m0, m1;
    mpz_inits(m0_m1, m0, m1, NULL);

    //Get signs
    mpz_t signs_z;
    mpz_init(signs_z);
    mpz_mod_ui(signs_z, src, 256);
    mpz_div_ui(src, src, 256);
    unsigned long signs = mpz_get_ui(signs_z);

    //Unpair coefficients
    xlg_unpair(m0, m1, src);

    //Adjust signs (same bits as xlm_pack: 7 for m0, 6 for m1)
    if((signs & 128) > 0)
        mpz_mul_si(m0, m0, -1);
    if((signs & 64) > 0)
        mpz_mul_si(m1, m1, -1);

    //Set coefficients
    mpq_set_z(dst->m0, m0);
    mpq_set_z(dst->m1, m1);

    mpz_clear(signs_z);
    mpz_clears(m0_m1, m0, m1, NULL);
}

size_t xlm_out_raw(FILE* stream, xlm_t src){
    mpz_t blades[2];
    for (int i = 0; i < 2; i++) {
        mpz_init(blades[i]);
    }

    mpz_set_q(blades[0], src.m0);
    mpz_set_q(blades[1], src.m1);

    size_t size = 0;
    for (int i = 0; i < 2; i++) {
        size += mpz_out_raw(stream, blades[i]);
    }

    for (int i = 0; i < 2; i++) {
        mpz_clear(blades[i]);
    }
    return size;
}

size_t xlm_inp_raw(xlm_t *dst, FILE* stream){
    mpz_t blades[2];
    for (int i = 0; i < 2; i++) {
        mpz_init(blades[i]);
    }
    size_t rsize = 0;
    for (int i = 0; i < 2; i++) {
        rsize += mpz_inp_raw(blades[i], stream);
    }

    mpq_set_z(dst->m0, blades[0]);
    mpq_set_z(dst->m1, blades[1]);

    for (int i = 0; i < 2; i++) {
        mpz_clear(blades[i]);
    }
    return rsize;
}

//============================== OPERATIONS ==============================
void xlm_geometric_product(xlm_t *dest, xlm_t *m0, xlm_t *m1){
    xlm_lambda_0(&dest->m0, m0, m1);
    xlm_lambda_1(&dest->m1, m0, m1);
}

void xlm_geometric_product_bivector(xlm_t *dest, xlm_t *m0, xlm_t *m1){
    xlm_lambda_0_bivector(&dest->m0, m0, m1);
    xlm_lambda_1_bivector(&dest->m1, m0, m1);
}

void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t *m0, xlm_t *m1){
    xlm_lambda_0_bivector_vector(&dest->m0, m0, m1);
    xlm_lambda_1_bivector_vector(&dest->m1, m0, m1);
}

void xlm_lambda_0(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m1);
    mpq_mul(mb, mv1->m1, mv2->m0);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_0_bivector(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1_bivector(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m1);
    mpq_mul(mb, mv1->m1, mv2->m0);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_0_bivector_vector(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m0, mv2->m0);
    mpq_mul(mb, mv1->m1, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_sub(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_lambda_1_bivector_vector(mpq_t *m, xlm_t *mv1, xlm_t *mv2){
    mpq_t ma;
    mpq_t mb;
    mpq_init(ma);
    mpq_init(mb);
    mpq_mul(ma, mv1->m1, mv2->m0);
    mpq_mul(mb, mv1->m0, mv2->m1);
    mpq_add(*m, *m, ma);
    mpq_add(*m, *m, mb);
    mpq_clear(ma);
    mpq_clear(mb);
}

void xlm_clifford_conjugation(xlm_t *m) {
    mpq_t minus_one;
    mpq_init(minus_one);
    mpq_set_si(minus_one, -1, 1);

    //mpq_mul(m->m0, m->m0, minus_one);
    //mpq_mul(m->m1, m->m1, minus_one);

    mpq_clear(minus_one);
}

void xlm_reverse(xlm_t *m) {
    mpq_t minus_one;
    mpq_init(minus_one);
    mpq_set_si(minus_one, -1, 1);

    //mpq_mul(m->m0, m->m0, minus_one);
    mpq_mul(m->m1, m->m1, minus_one);

    mpq_clear(minus_one);
}

void xlm_amplitude_squared(xlm_t *m) {
    //Compute Clifford conjugation of m and store it in clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);

    //Compute geometric product of m and clifford_conj and store it in amplitude_squared
    xlm_t amplitude_squared;
    xlm_init(&amplitude_squared);
    xlm_geometric_product(&amplitude_squared, m, &clifford_conj);

    //Make the pointer content equal amplitude_squared
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&amplitude_squared);
}

void xlm_amplitude_squared_reversed(xlm_t *m) {
    //Compute amplitude squared of m
    xlm_t amplitude_squared_reversed;
    xlm_init(&amplitude_squared_reversed);
    xlm_set(&amplitude_squared_reversed, *m);
    xlm_amplitude_squared(&amplitude_squared_reversed);

    //Compute the reverse of amplitude_squared_reversed in place
    xlm_reverse(&amplitude_squared_reversed);

    //Make the pointer content equal amplitude_squared_reversed
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared_reversed);

    //Clean up
    xlm_clear(&amplitude_squared_reversed);
}

void xlm_rationalize(xlm_t *m) {
    //Compute amplitude squared of m and store it in mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);

    //Compute amplitude squared reversed of m and store it in mv_amplitude_squared_reversed
    xlm_t mv_amplitude_squared_reversed;
    xlm_init(&mv_amplitude_squared_reversed);
    xlm_set(&mv_amplitude_squared_reversed, *m);
    xlm_amplitude_squared_reversed(&mv_amplitude_squared_reversed);

    //Compute geometric product of the two and store it in mv_geometric_product
    xlm_t mv_geometric_product;
    xlm_init(&mv_geometric_product);
    xlm_geometric_product(&mv_geometric_product, &mv_amplitude_squared, &mv_amplitude_squared_reversed);

    //Make the pointer content equal mv_geometric_product
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, mv_geometric_product);

    //Clean up
    xlm_clear(&mv_amplitude_squared);
    xlm_clear(&mv_amplitude_squared_reversed);
    xlm_clear(&mv_geometric_product);
}

void xlm_scalar_div(xlm_t *m, mpq_t scalar) {
    mpq_div(m->m0, m->m0, scalar);
    mpq_div(m->m1, m->m1, scalar);
}

void xlm_inverse(xlm_t *m){
    //Compute Clifford conjugation of m and store it in clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);

    //Compute amplitude squared of m and store it in mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);

    //Rationalize
    xlm_t mv_rationalize;
    xlm_init(&mv_rationalize);
    xlm_set(&mv_rationalize, *m);
    xlm_rationalize(&mv_rationalize);

    xlm_t mv_geometric_product;
    xlm_init(&mv_geometric_product);
    xlm_set(&mv_geometric_product, *m);
    //Perform scalar div on geometric product
    xlm_scalar_div(&mv_geometric_product, mv_amplitude_squared.m0);

    //Make the pointer content equal mv_geometric_product
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, mv_geometric_product);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&mv_amplitude_squared);
    xlm_clear(&mv_geometric_product);
    xlm_clear(&mv_rationalize);
}
//
// xlg_massive_encryption.h
//
#ifndef xlg_massive_encryption_h
#define xlg_massive_encryption_h

#include <stdio.h>
#include "xlg.h"

void xlg_encrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_decrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg);
void xlg_encode(xlg_t xlg);
void xlg_decode(xlg_t xlg);

#endif /* xlg_massive_encryption_h */
578.//579.// xlg_massive_encryption.c
580.#include "xlg_massive_encryption.h"581.int BUFFER_SIZE = 1024*10;
582.void xlg_encrypt_file(char* src, char* dst, xlg_t xlg){583.FILE *src_file = fopen(src, "rb");584.FILE *dst_file= fopen(dst, "wb");
585.while (!feof(src_file)) {586.//Read file587.long nread = 1;588.char buffer[BUFFER_SIZE];589.buffer[0] = 1;590.while(nread<BUFFER_SIZE-1 && !feof(src_file)){591.int c = getc(src_file);592.if(c!= EOF){593.buffer[nread]=c;594.nread++;595.}596.}597.//Import598.xlm_t message;599.xlm_init(&message);600.xlm_import_str_w_size(&message,buffer,nread);
601.//suni gmp_printf("Message Values: %Qd + %Qd\n", message.m0, message.m1);
602.//Encrypt603.xlm_t cypher_xlm;
162
604.xlm_init(&cypher_xlm);605.xlg_encrypt(&cypher_xlm, message, xlg);606.//suni gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher_xlm.m0,
cypher_xlm.m1);607.//Write to file608.xlm_out_raw(dst_file,cypher_xlm);
609.//Clean up610.xlm_clear(&message);611.xlm_clear(&cypher_xlm);612.}613.fclose(src_file);614.fclose(dst_file);615.}
616.void xlg_decrypt_file(char* src, char* dst, xlg_t xlg){617.FILE *src_file= fopen(src, "rb");618.FILE *dst_file= fopen(dst, "wb");
619.while(!feof(src_file)){620.//Read File621.xlm_t cypher_xlm;622.xlm_init(&cypher_xlm);623.size_t nread = xlm_inp_raw(&cypher_xlm, src_file);624.if(nread <=0){625.xlm_clear(&cypher_xlm);626.break;627.}
628.//Decrypt629.xlm_t decrypt;630.xlm_init(&decrypt);631.xlg_decrypt(&decrypt,cypher_xlm,xlg);632.//suni gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0,
decrypt.m1);
633.//Export634.long size;635.char* buffer = xlm_export_str(decrypt,&size);
636.//Write file637.long nwrite = 1;638.while(nwrite<size){639.putc(buffer[nwrite++], dst_file);640.}
641.//Clean up642.free(buffer);643.xlm_clear(&cypher_xlm);644.xlm_clear(&decrypt);645.}
646.fclose(src_file);647.fclose(dst_file);648.}
649.void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg){
650.FILE *dst_file= fopen(dst_path, "wb");
651.char * buffer = malloc(strlen(data_buffer)+2);
163
652.memset(buffer,0,strlen(data_buffer)+2);653.buffer[0]=1;654.buffer = strcat(buffer, data_buffer);
655.//Import656.xlm_t data;657.xlm_init(&data);658.xlm_import_str_w_size(&data,buffer,strlen(data_buffer)+2);
659.//Encrypt660.xlm_t cypher_xlm;661.xlm_init(&cypher_xlm);662.xlg_encrypt(&cypher_xlm, data, xlg);
663.//Write to file664.xlm_out_raw(dst_file,cypher_xlm);
665.//Clean up666.xlm_clear(&data);667.xlm_clear(&cypher_xlm);668.free(buffer);
669.fclose(dst_file);670.}
671.void xlg_encode(xlg_t xlg){
672.while (!feof(stdin)) {
673.long nread = 1;674.char buffer[BUFFER_SIZE];675.buffer[0] = 1; // THANKS HANES!!!676.while(nread<BUFFER_SIZE-1 && !feof(stdin)){677.int c = getchar();678.if(c!= EOF){679.buffer[nread]=c;680.nread++;681.}682.}
683.//Import684.xlm_t message;685.xlm_init(&message);686.xlm_import_str_w_size(&message,buffer,nread);
687.//Encrypt688.xlm_t cypher_xlm;689.xlm_init(&cypher_xlm);690.xlg_encrypt(&cypher_xlm, message, xlg);
691.gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m0);692.gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m1);
693.xlm_clear(&message);694.xlm_clear(&cypher_xlm);695.}696.}
697.void xlg_decode(xlg_t xlg){
164
698.FILE *stream;699.char *line = NULL;700.size_t len = 0;701.size_t read;
702.stream = stdin;703.if (stream == NULL)704.exit(0);
705.int count = 0;706.mpq_t m0;707.mpq_t m1;
708.xlm_t cypher;709.while ((read = getline(&line, &len, stdin)) != -1) {710.if(count%2 == 0){711.mpq_init(m0);712.mpq_set_str(m0,line,10);713.count++;714.}715.else if(count%2 == 1){716.mpq_init(m1);717.mpq_set_str(m1,line,10);718.count++;
719.xlm_init(&cypher);720.mpq_set(cypher.m0,m0);721.mpq_set(cypher.m1,m1);
722.//Decrypt723.xlm_t decrypt;724.xlm_init(&decrypt);725.xlg_decrypt(&decrypt,cypher,xlg);
726.//Export727.long size;728.char* buffer = xlm_export_str(decrypt,&size);
729.long nwrite = 1; //Thanks Hanes730.while(nwrite<size){731.putchar(buffer[nwrite++]);732.}733.count =0;734.xlm_clear(&cypher);735.xlm_clear(&decrypt);736.free(buffer);737.mpq_clear(m0);738.mpq_clear(m1);
739.}740.}
741.free(line);742.fclose(stream);743.}
The following commands were used to compare RVTHE against AES-Crypt.
AES-crypt:
time aescrypt -e -p key plaintext_file_name
time aescrypt -d -p key plaintext_file_name.aes
RVTHE:
time xlg -e -x key1 -y key2 plaintext_file_name
time xlg -d -x key1 -y key2 plaintext_file_name.xlg
time xlg -a -x key1 -y key2 "data" plaintext_file_name.xlg
Sample output of the created encrypted file (ciphertext size and contents):
[screenshot omitted]
Sample performance metrics:
[screenshot omitted]
Appendix E – Acronym List
Abbreviation  Term
HDD    Hard Disk Drive
SATA   Serial AT Attachment
SSD    Solid State Drive
FDE    Full-Disk Encryption
AES    Advanced Encryption Standard
DES    Data Encryption Standard
TDEA   Triple Data Encryption Algorithm
RSA    Rivest–Shamir–Adleman
MD5    Message-Digest Algorithm 5
SHA    Secure Hash Algorithm
CBC    Cipher Block Chaining
CTR    Counter
GCM    Galois/Counter Mode
OCB    Offset Codebook Mode
ECB    Electronic Codebook
OFB    Output Feedback
AWS    Amazon Web Services
NIST   National Institute of Standards and Technology
ESD    Every Stage of Data
FHE    Fully Homomorphic Encryption
RVTHE  Reduced Vector Technique Homomorphic Encryption
SSL    Secure Sockets Layer
UCCS   University of Colorado, Colorado Springs