Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for...

11
1 Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification Jiawei Yuan, Shucheng Yu, Member, IEEE Abstract—In past years, the rapid development of cloud storage services makes it easier than ever for cloud users to share data with each other. To ensure users’ confidence of the integrity of their shared data on cloud, a number of techniques have been proposed for data integrity auditing with focuses on various practical features, e.g., the support of dynamic data, public integrity auditing, low communication/computational audit cost, low storage overhead. However, most of these techniques consider that only the original data owner can modify the shared data, which limits these techniques to client read-only applications. Recently, a few attempts started considering more realistic scenarios by allowing multiple cloud users to modify data with integrity assurance. Nevertheless, these attempts are still far from practical due to the tremendous computational cost on cloud users, especially when high error detection probability is required by the system. In this paper, we propose a novel integrity auditing scheme for cloud data sharing services characterized by multi-user modification, public auditing, high error detection probability, efficient user revocation as well as practical computational/communication auditing performance. Our scheme can resist user impersonation attack, which is not considered in existing techniques that support multi-user modification. Batch auditing of multiple tasks is also efficiently supported in our scheme. Extensive experiments on Amazon EC2 cloud and different client devices (contemporary and mobile devices) show that our design allows the client to audit the integrity of a shared file with a constant computational cost of 340ms on PC (4.6s on mobile device) and a bounded communication cost of 77KB for 99% error detection probability with data corruption rate of 1%. Index Terms—Integrity Auditing, Cloud Storage, Public Verification, Dynamic Data, Batch Verification. 1 I NTRODUCTION T HE continuous development of cloud techniques has boosted a number of public cloud storage ap- plications. In particular, more and more cloud storage applications are being used as collaboration platforms, in which data are not only persisted in cloud for storage but also subject to frequent modifications from multiple users. Real-world examples are cloud-based storage syn- chronization platforms such as Dropbox for Business [2] and Sugarsync [3], Version Control Systems (VCS) such as Subversion [4] and Concurrent Versions System [5], which enable multiple team members to work in sync, accessing and modifying same files on cloud servers anywhere anytime. For correct execution of this kind of collaborative applications, one problem is to assure data integrity, i.e., each data modification operation is indeed performed by an authorized group member and the data remains intact and update to date thereafter. This problem is important given the fact that cloud storage platforms, even well-known cloud platforms, may experience hardware/software failures, human er- Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected] The preliminary version of this paper appeared in IEEE INFOCOM 2014 [1]. This work was supported by the US National Science Foundation award CNS-1338102 and Amazon AWS in Education Research Grant. J.Yuan and S.Yu are with Department of Computer Science, University of Arkansas at Little Rock, Little Rock, AR, 72204. E-mail: {jxyuan,sxyu1}@ualr.edu rors and external malicious attacks [6], [7]. In addition, we observed that there have been large discrepancies between the numbers of data corruption events reported by users and those acknowledged by service providers [7], which also causes users to doubt whether or not their data on cloud are truly intact. With the concern on data integrity of cloud storage services, users wish to have a way of auditing the cloud server to ensure that the server stores all their latest data without any corruption. To offer such a service, a series of schemes [8]–[22] have been proposed. However, for most of these existing schemes, only the data owner who holds secret keys can modify the data and all other users who share data with the data owner only have read permission. If these solutions are trivially extended to support multiple writers with data integrity assurance, the data owner has to stay online, collecting modified data from other users and regenerating authentication tags for them. Obviously, this kind of trivial extension will introduce a tremendous workload to the data owner, especially in scenarios with a large number of writers (users) and/or a high frequency of data modification operations. Considering practical scenarios wherein all users share data with each other in cloud have both read and write privileges, Wang et al. [18] proposed a public integrity auditing scheme using ring signature-based homomor- phic authenticators. Nevertheless, the scalability of ref. [18] is limited by the group size and data size. Moreover, user revocation is not considered in this paper. In order IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Transcript of Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for...

Page 1: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

1

Public Integrity Auditing for Dynamic DataSharing with Multi-User Modification

Jiawei Yuan, Shucheng Yu, Member, IEEE

Abstract—In past years, the rapid development of cloud storage services makes it easier than ever for cloud users to share data with

each other. To ensure users’ confidence of the integrity of their shared data on cloud, a number of techniques have been proposed

for data integrity auditing with focuses on various practical features, e.g., the support of dynamic data, public integrity auditing, low

communication/computational audit cost, low storage overhead. However, most of these techniques consider that only the original data

owner can modify the shared data, which limits these techniques to client read-only applications. Recently, a few attempts started

considering more realistic scenarios by allowing multiple cloud users to modify data with integrity assurance. Nevertheless, these

attempts are still far from practical due to the tremendous computational cost on cloud users, especially when high error detection

probability is required by the system.

In this paper, we propose a novel integrity auditing scheme for cloud data sharing services characterized by multi-user modification,

public auditing, high error detection probability, efficient user revocation as well as practical computational/communication auditing

performance. Our scheme can resist user impersonation attack, which is not considered in existing techniques that support multi-user

modification. Batch auditing of multiple tasks is also efficiently supported in our scheme. Extensive experiments on Amazon EC2 cloud

and different client devices (contemporary and mobile devices) show that our design allows the client to audit the integrity of a shared

file with a constant computational cost of 340ms on PC (4.6s on mobile device) and a bounded communication cost of 77KB for 99%

error detection probability with data corruption rate of 1%.

Index Terms—Integrity Auditing, Cloud Storage, Public Verification, Dynamic Data, Batch Verification.

1 INTRODUCTION

THE continuous development of cloud techniqueshas boosted a number of public cloud storage ap-

plications. In particular, more and more cloud storageapplications are being used as collaboration platforms,in which data are not only persisted in cloud for storagebut also subject to frequent modifications from multipleusers. Real-world examples are cloud-based storage syn-chronization platforms such as Dropbox for Business [2]and Sugarsync [3], Version Control Systems (VCS) suchas Subversion [4] and Concurrent Versions System [5],which enable multiple team members to work in sync,accessing and modifying same files on cloud serversanywhere anytime. For correct execution of this kindof collaborative applications, one problem is to assuredata integrity, i.e., each data modification operation isindeed performed by an authorized group member andthe data remains intact and update to date thereafter.This problem is important given the fact that cloudstorage platforms, even well-known cloud platforms,may experience hardware/software failures, human er-

Copyright (c) 2013 IEEE. Personal use of this material is permitted. However,permission to use this material for any other purposes must be obtained fromthe IEEE by sending a request to [email protected]

• The preliminary version of this paper appeared in IEEE INFOCOM 2014[1]. This work was supported by the US National Science Foundationaward CNS-1338102 and Amazon AWS in Education Research Grant.

• J.Yuan and S.Yu are with Department of Computer Science, University ofArkansas at Little Rock, Little Rock, AR, 72204.E-mail: {jxyuan,sxyu1}@ualr.edu

rors and external malicious attacks [6], [7]. In addition,we observed that there have been large discrepanciesbetween the numbers of data corruption events reportedby users and those acknowledged by service providers[7], which also causes users to doubt whether or not theirdata on cloud are truly intact.

With the concern on data integrity of cloud storageservices, users wish to have a way of auditing the cloudserver to ensure that the server stores all their latest datawithout any corruption. To offer such a service, a seriesof schemes [8]–[22] have been proposed. However, formost of these existing schemes, only the data owner whoholds secret keys can modify the data and all other userswho share data with the data owner only have readpermission. If these solutions are trivially extended tosupport multiple writers with data integrity assurance,the data owner has to stay online, collecting modifieddata from other users and regenerating authenticationtags for them. Obviously, this kind of trivial extensionwill introduce a tremendous workload to the data owner,especially in scenarios with a large number of writers(users) and/or a high frequency of data modificationoperations.

Considering practical scenarios wherein all users sharedata with each other in cloud have both read and writeprivileges, Wang et al. [18] proposed a public integrityauditing scheme using ring signature-based homomor-phic authenticators. Nevertheless, the scalability of ref.[18] is limited by the group size and data size. Moreover,user revocation is not considered in this paper. In order

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 2: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

2

to further enhance previous work, another attempt wasmade by Wang et al, [21]. However, ref. [21] still suffersfrom a non-trivial computational cost which is linear tothe number of modifiers (especially for achieving higherror detection probability with small data corruptionrate, e.g., less then 0.1%) and the number of checkingtasks, and thus is limited in scalability. In addition, userrevocation in ref. [21] is based on the assumption thatthe cloud node responsible for updating signatures willnot be compromised nor encounter internal errors, whichhas been proven not always true in practice. As a matterof fact, compromisation and internal errors of the cloudnode in ref. [21] will lead to the disclosure of users’secrets, and thus causing severe attacks such as userimpersonation.

To design an efficient public integrity auditing schemesupporting multi-user modification and efficient userrevocation simultaneously, we need to overcome thefollowing major (not necessarily complete list of) chal-lenges: 1) Aggregation of individually generated authen-tication tags. Specifically, any user with read and writeprivilege should be able to modify the data and generatenew authentication tags without the help of the dataowner. In this context, how to aggregate tags from differ-ent users becomes a challenge problem, because the tagsare signed with individual users’ secret keys which aredifferent from each other. Without aggregation, a dataintegrity verifier has to process tags from different mod-ifiers one by one for an auditing task, and thus limitingthe scalability of scheme. A straightforward solution tothis problem is to let all users share the same secret key,so all authentication tags are in the same format and canbe easily aggregated. Nevertheless, this kind of solutionis limited by another challenge: 2) efficient and secureuser revocation. User revocation is a challenging issuein most security systems and usually involves disablinguser secret keys. Upon user revocation, all authenticationtags generated by the revoked user should be updated,and this heavy task is usually delegated to the cloud bydisclosing partial secrets to it. This method, however, canlead to disclosure of secret keys of valid users once thecloud server node colludes with a revoked user. 3) publicauditing. In practical systems, data integrity auditing canbe performed not only by data owners or other groupusers but also by a third-party auditor or any generaluser who has public keys of the system.

In this work, we address above challenges andpropose an efficient public integrity auditing schemefor cloud data sharing that supports multiple writers.Thanks to our novel design on polynomial-based au-thentication tags, we can empower the cloud to ag-gregate authentication tags from multiple writers intoone when sending the integrity proof information to theverifier. As a result, just a constant size of integrity proofinformation and a constant number of computational op-erations are needed for the verifier, no matter how largethe audited file is and how many writers are associatedwith the data blocks. Moreover, with the novel proxy

authentication tag update technique, our scheme allowssecure delegation of user revocation operations to thecloud and can defeat impersonation attacks from illegit-imate users. By uniquely incorporating Shamir’s SecretSharing scheme [23], we further enhance the reliabilityof our design during the user revocation procedure. Lastbut not least, our proposed scheme allows aggregationof integrity auditing operations for multiple tasks (files)through our batch integrity auditing technique, whichpromote our scheme in terms of auditing efficiency anddata corruption detection probability. Thorough analy-sis and extensive experimental results on Amazon EC2cloud [24] and various client devices (PC and mobiledevices) demonstrate the scalability and efficiency of ourdesign.

The rest of this paper is organized as follows: InSection 2, we introduce models of our scheme. Weprovide technique preliminaries for our design in Section3. Section 4 provides our detailed construction, whichis followed by the security analysis in Section 5. Theperformance evaluation of our scheme is presented inSection 6. Section 7 introduces practical applications, inwhich our proposed scheme is applicable for integrityassurance. In Section 8, we discuss existing works relatedto our design. We conclude this paper in Section 9.

2 MODELS

2.1 System Models

We consider a cloud system composed of three majorentities: the cloud server, group users and the third-partyauditor (TPA). The cloud server is the party that providesdata storage services to group users. Group users consistof a number of general users and a master user, who is theowner of the shared data and manages the membershipof other group users. All group users can access andmodify data. The TPA refers to any party that checksthe integrity of data being stored on the cloud. As ourproposed scheme allows public integrity auditing, theTPA can actually be any cloud user as long as he/shehas access to the public keys. Once the TPA detectsa data corruption during the auditing process, he/shewill report the error to group users. In our design, datacan be uploaded/created by either the master user orother group users. We assume data are stored in formof files which are further divided into a number ofblocks. For integrity auditing, each data block is attachedwith an authentication tag that is originally generatedby the master user. When a user adds or modifiesa block, he/she (the user) updates the correspondingauthentication tag with his/her own secret key withoutcontacting the master user.

2.2 Threat Model

In this work, we assume the cloud server to be curious-but-honest, which is consistent with ref. [21]. Specifically,the cloud server will follow the protocol, but it may lie to

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 3: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

3

the users about the corruption of their data stored on it inorder to preserve the reputation of its services. This kindof situation occurs many times, being it internationallyor not, with existing cloud storage platforms [7]: thedata loss events claimed by users are much more thanthose acknowledged by service providers. In this context,we consider the following factors that may impact dataintegrity: 1) hardware/software failures and operationalerrors of system administrator; 2) external adversariesthat corrupts data stored on the cloud; 3) revoked userswho no longer have data access privilege but try toillegally impersonate valid users. Since valid users arealways allowed to modify data, we assume that theyare always honest when they are in the group. We alsoassume that secure communication channels (e.g., SSL)exist between each pair of entities.

3 TECHNIQUE PRELIMINARIES

3.1 Shamir’s Secret Sharing

A (k, n)-Shamir’s Secret Sharing scheme [23] divides(based on polynomials) a secret S into n shares. Withany k shares, one can recover the secret S. However,knowledge of any k-1 or fewer shares leaves the secretcompletely undetermined. Specifically, as any k pointscan uniquely define a polynomial of degree k − 1, bychoosing k− 1 random positive integers a1, a2, · · · , ak−1

from a finite filed of size q and set a0 = S, we canconstruct the polynomial: f(x) = a0 + a1x+ a2x

2 + · · ·+ak−1x

k−1, where ai < q and q is a prime number. Whenthere are n participants to share the secret S, we canconstruct n points out of f(x) as (j, f(j))1≤j≤n and giveeach participant a point. After that, given any k out ofthese n points, one can compute the coefficients of f(x)using polynomial interpolation and recover the constantterm a0, which is set as the secret S. For more detailsabout Shamir’s Secret Sharing scheme, please refer toref. [23].

4 OUR CONSTRUCTION

4.1 Detailed Construction

We now give the detailed construction of our design.In our scheme, there are K users uk, 0 ≤ k ≤ K − 1in a group sharing data stored on cloud and u0 is themaster user. u0 is also the owner of data and managesthe membership of the group. Therefore, u0 can revokeany other group users when necessary. All users inthe group can access and modify data stored on cloud.For simplicity of expression, we assume that the TPAperforms the data integrity auditing procedure, who inpractice can be any user knowing the public key. Table.1summarizes the notation to be used in our construction.

Setup: To setup the system, the master user u0 firstruns the Key Generation part of the Setup algorithm asshown in Fig.1 and generates public keys (PK), masterkeys (MK) of the system and secret keys (SK) of users.

H(·) One-way hash function [25]G A multiplicative cyclic groupq The prime order of Group G

e : G × G → G1 A bilinear mapg, u Random generators of group Gλ Security parameterF Data file that will be split into n blocksmi A data block of file F and will be split into s elementsmij A block element of data block mi

σi Authentication tag generated for data block mi

f~α(x) a polynomial with coefficient vector ~α = (α0, α1,· · · , αs−1), αj ∈ Z∗

q

ǫk, α, µ,R Random numbers, where 0 ≤ k ≤ K − 1

Table.1 Notation

Algorithm: Setup(1λ, F ) → (PK,MK,SKk, σi)

Key Generation: Select K random number ǫkR← Z∗

q and

generate ν = gαǫ0 , κ0 = gǫ0 , {κk = gǫk , gǫ0ǫk }1≤k≤K−1.

Randomly choose αR← Z∗

q and generate {gαj}, 0 ≤ j ≤ s+1.

The public keys (PK), master keys (MK) of the system andsecret keys (SK) of users are:

PK = {g, u, q, ν, {gαj}0≤j≤s+1, κ0, {κk, g

ǫ0ǫk }1≤k≤K−1}

MK = {ǫ0, α} SKk = {ǫk}1≤k≤K−1

File Processing: Split file F into n data blocks, and each blockinto s elements: {mij}, 1 ≤ i ≤ n, 0 ≤ j ≤ s − 1. Computeauthentication tag σi, 1 ≤ i ≤ n for each data block as:

σi = (uBi ·

s−1∏

j=0

gmijαj+2

)ǫ0 = (uBi · gf ~βi

(α))ǫ0 (1)

where ~βi = {0, 0, βi,0, βi,1, · · · , βi,s−1} and βi,j = mi,j .Bi = H({fname||i||ti||k}), fname is the file name, i is theindex of data block mi, ti is the time stamp and k is the indexof user in the group.

Fig. 1. Detailed Construction: System Setup

In our design, each user will have his/her own secretkeys for data modification.

To outsource a file F , the master user u0 runs the FileProcessing procedure of the Setup algorithm to generatedata blocks mi and the corresponding authenticationtags σi. u0 then uploads data blocks and tags to thecloud. u0 also publishes and maintains a Log for the file,which contains {i||ti||k} information for each block. Notethat the size of each log record is 16 bytes, which is 1

256of the data block size (typically 4KB). For instance, theowner only needs to maintain a 4MB log file for a 1GBfile that is split into 4KB blocks.

Update: We now show how to allow group users tomodify the shared data. Suppose a group user uk, k 6= 0modifies a data block mi to m′

i. uk computes the tag form′

i with his own secret key ǫk as:

σ′i = (uB

i ·

s−1∏

j=0

gm′

ijαj+2

)ǫk = (uB′

i · gf ~β′i

(α))ǫk (3)

where ~β′i = {0, 0, β′

i,0, β′i,1, · · · , β

′i,s−1} and β′

i,j = m′i,j .

B′i = H({fname||i||t′i||k}). uk then uploads updated data

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 4: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

4

Algorithm: Challenge(1λ, PK) → (CM = {D,X, gR, µ})

1) Randomly choose d data blocks as a set D (The size of set D is discussed in Section 4.3.2).

2) Suppose the chosen d blocks are modified by a set of users, denoted by C, 0 ≤ |C| ≤ K − 1. Generate two random numbers R

and µ and produce set X = {(gǫ0ǫk )R}k∈C . If the set D contains blocks last modified by any revoked user, add (g

ǫ0ǫ0+ρ )

R

to set

X , where (gǫ0

ǫ0+ρ ) is generated by the master user during the user revocation procedure (see Fig.3 for details).

Algorithm: Prove(CM,F, PK) → (Prf = {π, ψ, y})

1) Generate {pi = µi mod q}, i ∈ D and compute y = f ~A(µ) mod q, where ~A = {0, 0,

∑i∈D pimi,0, · · · ,

∑i∈D pimi,s−1}.

2) Divide the polynomial f ~A(x) − f ~A

(µ) with (x − µ) using polynomial long division, and denote the coefficients vector of the

resulting quotient polynomial by ~w = (w0, w1, · · · , ws), i.e., f~w(x) ≡f ~A

(x)−f ~A(µ)

x−µ.

With ~w, compute ψ =∏s

j=0(gαj

)wj = gf~w(α) .

3) For data blocks in D that were last-modified by user uk, k ∈ C, compute πi = e(σi, gǫ0R

ǫk ) = e((uBi · gf ~βi

(α)), g)ǫ0R.

For data blocks never modified by any group user or only modified by u0, compute πi = e(σi, gR) = e((uBi · g

f ~βi(α)

), g)ǫ0R.

For data blocks in D last modified by revoked users, compute πi = e(σ′i, (g

ǫ0ǫ0+ρ )

R

) = e((uBi · gf ~βi

(α)), g)ǫ0R, where σ′

i is theupdated authentication tag of blocks that are last modified by revoked users (see Fig.3 for details).

Aggregate πi as π =∏

i∈D πpii .

Algorithm: Verify(Prf, PK) → (VerifyRst)

1) Compute η = uω , where ω =∑

i∈D Bipi.2) Verify the integrity of file as

e(η, κR0 ) · e(ψR, ν · κ−µ0 )

?= π · e(κ−y

0 , gR) (2)

If Eq.2 holds, output the verification result V erifyRst as Accept; otherwise, output V erifyRst as Reject.

Fig. 2. Detailed Construction: Auditing Process

blocks m′i and its corresponding tags σ′

i to the cloud.uk will also updates {i||t′i||k} in the log file. Note that,the size of our authentication tag for each data block isonly 128 bytes, which is 1

32 of the block size and thusintroducing slight communication overhead for users.

Challenge: To audit data integrity, the TPA generatesthe challenge message CM = {D,X, gR, µ} by runningthe Challenge algorithm as shown in Fig.2. Note that,the TPA is aware of the set C used in the Challengealgorithm by looking at records {i||ti||k}i∈D in the logfile. The TPA then sends the challenging message CM

to the cloud.Prove: On receiving the challenging message CM =

{D,X, gR, µ}, the cloud will run the Prove algorithm inFig.2 to generate the proof information Prf = {π, ψ, y},which shows that it actually store the challenged datafile correctly. Note that, polynomials f(x) ∈ Z[x] havethe algebraic property that (x − r) perfectly divides the

polynomial f(x) − f(r), rR← Z∗

p . With this property, thecloud can efficiently divide the polynomial f ~A(x)−f ~A(µ)with (x−µ) using polynomial long division in Step 2 ofthe Prove algorithm. The cloud responds the TPA withproof information Prf = {π, ψ, y}.

Verify: Based on the proof information Prf , the TPAverifies the integrity of file F by running the Verifyalgorithm as shown in Fig.2.

Algorithm: User Revocation – Basic

1) The master user u0 computes χ = ǫ0+ρǫk

mod q and

sends it to the cloud, where ρR← Z∗

q . u0 also generates

gǫ0

ǫ0+ρ and sends it to valid group users and the TPAas part of the PK.

2) By receiving χ the cloud updates the authentication tagsof blocks that are last modified by uk as: σ′

i = σχi =

(uBi · gf ~βi

(α))ǫ0+ρ

.3) The TPA and valid group users discard the public

information gǫ0ǫk .

Fig. 3. Detailed Construction: Basic User Revocation

User Revocation: We now provide a basic User Re-vocation design of our scheme. We will also introducean advanced version of user revocation with improvedreliability in Section 4.3.1.

Whenever there is a user to be revoked, say uk, k 6= 0,the master user u0 and the cloud run the User Revoca-tion - Basic algorithm as shown in Fig.3. In particular,all authentication tags generated by revoked users areupdated so that the revoked users’ secret keys are re-moved from the tags. Depending on how many the tags

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 5: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

5

were modified by the revoked users, these update oper-ations can be potentially communication/computation-intensive. To relieve the master user from this potentialburden, our design securely offloads all tag updateoperations to the cloud, which is resource abundant andsupports parallel processing. The master user only needsto compute two group elements in each user revocationevent (see step 1 of Fig.3).

During the auditing process (as shown in Fig.2), ifthe set D contains blocks last modified by any revokedusers, the TPA, while executing the Challenge algorithm,

adds (gǫ0

ǫ0+ρ )R

to set X as part of the challenge messageCM = {D,X, gR, µ}; on receiving the challenge message,

the cloud generates πi = e(σ′i, (g

ǫ0ǫ0+ρ )

R) for each block

in set D that is modified by a revoked user (see step3) of the Prove algorithm). Finally, the TPA can verifythe integrity of the challenged file by running the Verifyalgorithm.

Correctness: We analyze the correctness of our con-struction based on Eq.2 as:Integrity auditing without user revocation:

= π · e(κ−y0 , gR) (4)

=∏

i∈D

e((uBi · gf ~βi

(α)), g)Rǫ0 · e(g−y, g)Rǫ0

=∏

i∈D

e(uBi , g)Rǫ0 ·∏

i∈D

e(gf ~βi

(α), g)Rǫ0 · e(g−y, g)Rǫ0

=∏

i∈D

e(uBi , g)Rǫ0 · e(gf ~A(α), g)Rǫ0 · e(g−y, g)Rǫ0

= e(u∑

i∈D Bi , g)Rǫ0 · e(gf~w(α), g(α−µ))Rǫ0

= e(η, κR0 ) · e(ψR, ν · κ−µ

0 )

Integrity auditing after user revocation:

= π · e(κ−y0 , gR) (5)

= e(g−y, g)Rǫ0 ·∏

i∈D,i/∈Rev

e((uBi · gf ~βi

(α)), g)Rǫ0

·∏

i∈D,i∈Rev

e((uBi · gf ~βi

(α))ǫ0+ρ

, (gǫ0

ǫ0+ρ )R)

=∏

i∈D

e(uBi , g)Rǫ0 ·∏

i∈D

e(gf ~βi

(α), g)Rǫ0 · e(g−y, g)Rǫ0

= e(u∑

i∈D Bi , g)Rǫ0 · e(gf~w(α), g(α−µ))Rǫ0

= e(η, κR0 ) · e(ψR, ν · κ−µ

0 )

From Eq.4 and Eq.5, it is easy to see that our schemeis correct.

4.2 Efficient Support for Multi-file Auditing

For cloud systems in which data blocks are subject to fre-quent modifications by group users, the TPA may needto audit data integrity often to assure data integrity. Inlarge-scale systems, performing integrity auditing file byfile is inefficient in terms of both bandwidth consump-tion and computational cost. Specifically, given a set of Tdifferent files Ft = {mti,j} 1 ≤ t ≤ T, 1 ≤ i ≤ nt, 0 ≤ j ≤

st − 1, it is desirable if the TPA can aggregate integrityauditing operations of T files into one challenge and oneverification to reduce cost, where nt is the number ofdata blocks in file Ft and st is the number of elementsin each block. To this end, we design a batch verificationalgorithm based on our single file auditing solution,and enable the TPA to handle integrity auditing of Tfiles at the cost comparable to the single file scenario.In the batch verification algorithm, Setup and Updatealgorithms are same as those in the single file scenariorespectively. Here we focus on introducing the Batch-Challenge, Batch-Prove and Batch-Verify algorithms.

Batch-Challenge: The challenge process for auditingT files is same as the single file scenario. Notably, thechallenging message CM = {D,X, gR, µ} now containsthe information for data blocks of all these T files.

Batch-Prove: On receiving the challengingmessage CM = {D,X, gR, µ}, the cloud firstruns the Prove algorithm in Fig.2 for each singlefile and generates ψt, πt, 1 ≤ t ≤ T . Then, thecloud aggregates the proof information into twoelements as π =

∏Tt=1 πt and ψ =

∏Tt=1 ψt. The

cloud also computes y = f ~A(µ) mod q and ~A =

{0, 0,∑T

t=1

∑i∈D pi ∗mti,0, · · · ,

∑Tt=1

∑i∈D pi ∗mti,st−1}.

Finally, the cloud responds the TPA with proofinformation Prf = {π, ψ, y}.

Batch-Verify: On receiving the proof information Prf ,the TPA first computes ωt =

∑i∈D Bti for each file and

generates η = u∑T

t=1 ωt . Then it verifies the integrity ofthese T files as

e(η, κR0 ) · e(ψR, ν · κ−µ

0 )?= π · e(κ−y

0 , gR) (6)

If Eq.6 holds, the TPA outputs the verification resultV erifyRst as Accept for these T files; otherwise, outputsV erifyRst as Reject.

Based on above batch verification construction, wecan see that the computational cost on the TPA forverification of T files is almost the same as that of onefile.

Batch-Correctness: We analyze the correctness of ourbatch verification based on Eq.6 as follows:

e(η, κR0 ) · e(ψR, ν · κ−µ

0 ) (7)

= e(u∑T

i∈D,t=1 Bti , g)Rǫ0 · e(gf ~A(α)−f ~A

(µ), g)Rǫ0

=

T∏

i∈D,t=1

e(uBti , g)Rǫ0 ·

T∏

i∈D,t=1

e(gf ~βti

(α), g)Rǫ0

·e(g−f ~A(µ), g)Rǫ0

=

T∏

i∈D,t=1

e((uBti · gf ~βti

(α)), g)Rǫ0 · e(g−f ~A

(µ), g)Rǫ0

= π · e(κ−y0 , gR)

Based on Eq.7, it is easy to validate the correctness ofour batch verification construction.

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 6: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

6

4.3 Discussion and Extension

In this section, we discuss the error detection probabilityof our design and the choice of set D’s size in theChallenge algorithm. We also provide extensions toour proposed scheme in terms of reliability and errordetection probability.

4.3.1 Advanced User Revocation Design

Algorithm: User Revocation – Advanced

1) The master user u0 runs a (U,N)-Shamir Secret Sharingscheme and generates N points (j, f(j)) of a U − 1degree polynomial f(x) = χ + a1x + a2x

2 + · · · +aU−1x

U−1. N points will be sent to N nodes of a cloudserver.

2) To update an authentication tag σ, any U cloud nodeswith the point (j, f(j)) on f(x) compute Lagrange basispolynomials Lj(x) =

∏0≤m≤U,m 6=j

x−xmxj−xm

.

3) Each cloud node will update a piece of the tag as σ′j =

σf(j)Lj(0).4) Aggregate U updated tag pieces as σ′ =

∏1≤j≤U σ

′j =

σ∑U

j=0 f(j)Lj(0) = σχ.

Fig. 4. Advanced User Revocation Design

In our basic User Revocation algorithm, we utilize asingle cloud node to update the authentication tag lastupdated by the revoked users. In this scenario, if thecloud node responsible for tag update is compromiseddue to internal errors or outside attacks, the revokeduser will be able to generate valid authentication tagsagain. The main issue that causes such compromisationattack is the attacker can access χ that is used to up-date authentication tags when it compromises the cloudnode. Therefore, to prevent this kind of compromisationand enhance the reliability of our design, we uniquelyincorporate a (U,N)-Shamir Secret Sharing [23] techniqueinto our design and distribute χ and the authenticationtag update process to multiple cloud nodes.

Specifically, instead of sending χ to single cloud nodeas our basic User Revocation algorithm, the master useru0 runs the (U,N)-Shamir Secret Sharing on χ and sharesit to N cloud nodes as shown in the Step 1 of Fig.4. Then,any U out of N nodes with the shared χ will be chosen tocompute their own pieces of the updated authenticationtag as shown in Step 2-3 of Fig.4. Note that U cloudnodes can update their own authentication tag piecesin parallel without interaction with each other, thus itcan achieve comparable realtime update performance asthe update on single cloud node. Finally, the updatedtag pieces from U cloud nodes will be aggregated togenerate the final updated tag. In our tag aggregationprocess, the shared χ of each node is built into theircorresponding authentication tag piece σ′

j . Therefore,even the attack can compromise the cloud node thataggregates authentication tag pieces, it cannot access theshared pieces of χ and recover it.

With our extended authentication tag update design,attackers have to compromise at least U cloud nodeswith shared secrets at the same time to recover thesecret χ, because χ is shared to U cloud users usingShamir Secret Sharing [23]. Particularly, the knowledgeof any U−1 or fewer pieces leaves the secret completelyundetermined as shown in Section 3.1. Compared withour basic User Revocation algorithm, which depends onone cloud node for revoked tag updating, our advancedalgorithm significantly enhances the reliability of thescheme.

4.3.2 Error Detection Probability

As mentioned in Section 4.1, instead of choosing all thedata blocks of a file to audit its integrity, we randomlychoose d blocks as set D, in order to save communicationand computational costs while remaining an acceptablelevel of error detection probability. Specifically, as shownin ref. [9], the error detection probability is P = 1 −

(1− E)d, where E is the error rate. Therefore, if there are

1% corrupted data blocks, 460 challenge data blocks willresult in 99% detection probability, and 95% detectionprobability only requires 300 challenge blocks, despitethe total number (greater than 460 and 300 respectively)of data blocks in the file. Therefore, the number d canbe considered a fixed number in our scheme once therequired error detection probability is determined.

To achieve a high error detection probability for smalldata corruption rate, our design can increase the size ofset D. For example, if the system requires 99% detectionconfidence for 0.1% data corruption rate, the size of setD can be set as 4603. In Section 6.1.1, we will show thatincreasing set D’s size to achieve better error detectionconfidence has slight influence on our auditing perfor-mance, which makes our design significantly outperformref. [21] in terms of efficiency.

5 SECURITY ANALYSIS

Theorem 5.1. If there exists a probabilistic polynomial timeadversary Adv that successfully convinces the TPA to acceptthe fake proof information for a corrupted file with non-negligible probability, we can use Adv to construct a poly-nomial time algorithm B that solves the BDH problem or thet-SDH problem with non-negligible probability.

Theorem 5.2. If there exists a probabilistic polynomial timeadversary Adv that can impersonate valid users and generateauthentication tags on behalf of them, we can construct apolynomial time algorithm B that solves the CDH problem.

Due to the space limitation, we provide detailed proofof Theorem 5.1 and Theorem 5.2 in the full version of thiswork [26].

6 PERFORMANCE EVALUATION

6.1 Numerical Analysis

In this section, we numerically analyze our proposedscheme and compare it with ref. [21] in terms of compu-tational cost and communication cost. For simplicity, in

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 7: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

7

Single Task Multiple Tasks User RevocationComp.Cost (User) Comm.Cost Comp.Cost (User) Comm.Cost Comp.Cost (Cloud)(d + |C|)EXP (d + |C|)T EXP

Ref.[21] +(d + 2|C|)MUL d ∗ index + 2|C||G| +(d + 2|C|)T MUL dT ∗ index + 2|C|T |G| Y Exp+(|C| + 1)Pairing +(|C| + 1)T Pairing

Our Scheme 6 EXP + 3 MUL + 3 Pairing d ∗ (index + |log|) 6 EXP + 3 MUL + 3 Pairing d ∗ (index + T ∗ |log|) Y EXP+Y Pairing+(|C| + 3)|G| + 2λ +(|C| + 3)|G| + 2λ

Table.2 Realtime Auditing Complexity Summary: in this table, |C| is number of users that modify files, d is the number of of blocks selectedfor challenging, index is the size of block index, |G| is the size of a group element, λ is the size of security parameter and T is the number offiles for verification; EXP and MUL are one multiplication operation and one exponentiation operation on Group G respectively, Pairing is abilinear pairing operation.

the rest of this paper, we use EXP and MUL1 to denotethe complexity of one exponentiation operation andone multiplication operation on Group G respectively.We ignore hash operations in our evaluation, since itscost is negligible compared to EXP, MUL and Pairingoperations. For example, the cost of one EXP operationon a PC can be over 100,000 times more expensive thanthe cost of one SHA-1 hash operation required in ourdesign.

6.1.1 Computational Cost

As shown in Section 4, there are 6 algorithms in ourscheme: Setup, Challenge, Update, Prove, V erify andUser Revocation. Setup is a pre-processing procedure,which can be performed by group users off-line and willnot influence the real-time verification performance. Inthe Setup algorithm, the master user first needs to per-form (s+K+4) EXP operations to generate public keys,master keys of the system and secret keys of users, wheres is the number of elements in block and K is the numberof group users. To process a data file, the master userconducts (s+ 2)n EXP and sn MUL operations for eachfile, where n is the number of blocks in a file. When auser needs to modify or add data blocks, he executes theUpdate algorithm with (s+2)EXP+(s+1)MUL operationsto generate the corresponding tags. In order to checkthe integrity of a file, the TPA performs the Challenge

algorithm to generate the challenging message CM . InCM , the selection of a constant number of random num-bers with given system requirement is at a negligiblecost. To generate set X in CM , |C| EXP operations arerequired by the TPA, where |C| is number of users whomodify latest file blocks. Note that the set of X can beprecomputed and stored by the TPA without influencingthe realtile auditing performance. For instance, a 1,000users group only requires the TPA to store 128KB forany possible elements of set X in one round integrityauditing (the same size for the auditing of multiple files).The cloud server then runs the Prove algorithm with s

EXP, (s + d) MUL and d Pairing operations, where d isa constant number of blocks selected for challenging asdiscussed in Section 4.3.2. To verify the integrity of thefile, the TPA only needs 6 EXP, 3 MUL and 3 Pairingoperations. This property is interesting because such acost can even be affordable to less powerful devices suchas mobile phones. If there are some blocks last modified

1. When the operation is on the elliptic curve, EXP means scalar multiplicationoperation and MUL means one point addition operation.

by revoked users, the TPA only needs to perform onemore EXP operation compared to the scenarios withoutuser revocation. In case multiple files shall be verifiedat the same time, our design can batch the verificationoperations of these files and aggregate them into oneoperation. Consequentially, the TPA only needs 6 EXP, 3MUL and 3 Pairing operations to perform multiple filechecking, the cost of which is the same as the single filescenario. To revoke a group user, only one EXP operationis required for the master user; the cloud server needs YEXP operations to update the corresponding tags, whereY is the number of data blocks last modified by therevoked user. Considering our Advanced User Revocationdesign, U cloud nodes will perform UY EXP and UY

MUL operations in parallel to update authentication tagslast modified by revoked users.

We now compare our proposed scheme with the exist-ing scheme [21] and summarize the result in Table.2. Toverify the integrity of a single file, ref. [21] costs the TPA(d+ |C|)EXP+(d+2|C|)MUL+(|C|+1)Pairing operations.Note that, the value of d will increase significantly whenusers want to increase the error detection probability,and thus limiting the auditing performance of the TPA.For example, in order to achieve 99% confidence todetect error of a file with 0.1% data error rate, ref. [21]needs to challenge d data blocks and d should satisfy1− (1− 0.001)d ≥ 0.99 as we discussed in Section 4.3.2.In this case, d will be larger than 4603. Differently, ourdesign makes the cost of the TPA independent to thenumber of challenge blocks (d) and group users (|C|).Considering the integrity auditing of multiple files, thecost of the TPA in ref. [21] is linear to the number offiles. In contrast, our scheme can aggregate operationsbrought by multiple files and make the cost similar tothe single file scenario, and thus outperforms ref. [21].

6.1.2 Communication Cost

In our scheme, the communication cost for data integrityauditing mainly comes from the log records, challengingmessage CM and the proof information Prf . To generatethe challenging message, the TPA will require d log

records with size d ∗ |log| (Typically, the size of a log

record is 16 bytes). The size of the CM = {D,X, gR, µ}is d∗ index+(|C|+1)|G|+λ bits, where index is the sizeof block index and |G| is the size of a group element.If there are some blocks last modified by revoked users,one more group element of size |G| will be included inCM . For the proof information Prf = {π, ψ, y}, there are

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 8: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

8

two group elements and the result of a polynomial with2|G|+ λ bits. Therefore, the total communication cost ofan integrity verification task is d∗index+(|C|+3)|G|+2λbits. Considering the simultaneous checking of multiplefiles, our batch verification design allows users to ag-gregate these tasks into one challenge and one proof,and thus achieving the same communication cost as thesingle file scenario. When a user revocation occurs, ourscheme only requires the master user to send one groupelement to the cloud and add one group element to thepublic keys.

Now, we compare our proposed scheme with theexisting scheme [21] and summarize the result in Table.2.In ref.[21] the communication cost for single verificationtask is d∗(index+ |log|)+2|C||G| bits which is attributedto the challenge and proof generation processes and iscomparable to our scheme. However, when processingmultiple files together, the size of proof information inref. [21] will increase linearly to the number of filesas shown in Table.2, where T is the number of filesfor verification. Differently, our batch verification designomits this kind of linear growth for group element withinformation aggregation. As the size of a log record isonly 1

8 of a group element, our scheme can save morethan 80% communication cost compared with ref. [21].In order to revoke a user, although ref. [21] has a similarcost to our scheme, their user revocation solution mayallow the revoked users to impersonate valid users.

6.2 Experimental Results

To evaluate the performance of our proposed scheme,we fully implemented our proposed scheme on AmazonEC2 cloud using JAVA with JAVA Pairing-Based Cryp-tography Library (jPBC) [27]. On the cloud server, wedeploy nodes running Linux with 8-core CPU and 32GBmemory. The verifier is a desktop running Linux with3.4GHz Intel i7-3770 CPU and 16GB memory, or a ASUSAndroid Iconia A200 Tablet with 1-GHz Nvidia Tegra 2Dual Core Mobile CPU and with 1GB memory. Machinesfor group users are laptops running Linux with 2.50GHzIntel i5-2520M CPU and 8GB memory. We set the secu-rity parameter λ = 160 bits, which achieve 1024-bits RSAequivalent security since our implementation is based onECC. We set size of each data block as 4KB, which isconsistent with the ref. [9], [28]. In order to verify the filesize’s influence on the auditing performance, we changeit from 4MB to 400MB in our experiments, which con-tains 1,000 blocks to 100,000 blocks respectively. In orderto compare our scheme with ref. [21], we implement thescheme proposed in ref. [21] with the same experimentenvironment. In our experiments, we mainly evaluatethe computational and communication performance ofour scheme. All experimental results represent the meanof 50 trials. Our implementation is not optimized (e.g.it is a single process/thread program in some parts).Therefore, further performance improvements of ourscheme is possible.

0 50 100 150 200 250 300 350 400 450 500600

700

800

900

1000

1100

1200

1300

Number of Group Users(a)

Ke

y G

en

era

tio

n T

ime

(m

s)

0 1 2 3 4 5 6 7 8 9 10

x 104

0

20

40

60

80

100

120

140

160

180

200

Block Number(b)

Au

the

nti

ca

tio

n T

ag

Ge

ne

rati

on

Tim

e (

se

co

nd

)

Fig. 5. (a) Key generation time (b) Authentication tag

generation time

6.2.1 System Setup

We first evaluate the generation of public keys, masterkeys and secret keys for the system. Our result in Fig.5(a)indicates that the key generation cost is proportional tothe group size, since the master user needs to generatesecret keys for each group user separately. To show theperformance of authentication tag generation, we varythe number of blocks in the file is from 1000 to 100,000.As shown in Fig.5(b), the tag generation time increasesproportionally to the number of blocks, from 2.11s to198.23s. Note that, the system setup cost is one-time,which can be conducted off-line and will not influencethe realtime performance of integrity auditing.

500 1000 1500 2000 2500 3000 3500 4000 4500 50000

20

40

60

80

100

120

140

160

Block Number(a)

Au

then

ticati

on

Tag

Up

date

Tim

e (

seco

nd

)

Our Scheme−PC

Our Scheme−Tablet

Ref.[21]−PC

1000 2000 3000 4000 5000 6000 7000 8000 9000 100000

1

2

3

4

5

6

7

8

Block Signed By Revoked User(b)

Us

er

Re

vo

ca

tio

n C

os

t (s

ec

on

d)

Our Basic Algorithm

Our Advanced Algorithm

Ref.[21]−PC

Fig. 6. (a) Block update cost (b) User revocation cost on

cloud

6.2.2 Update and User Revocation

The update procedure of data blocks in our scheme issimilar to the setup procedure. As shown in Fig.6(a), theupdate cost is proportional to the number of modifiedblocks, since it has to generate a new authentication foreach block.

To revoke a user, the computational cost and com-munication cost are required for group users and theTPA are negligible in our scheme as discussed in Section6.1.1 and Section 6.1.2. The cloud server updates the cor-responding authentication tags when a user revocationoccurs. Fig.6(b) shows that our tag update performancesof the Basic User Revocation and the Advanced User Re-vocation are comparable on cloud server. This is becausethe cloud nodes involved for tag update procedure inour advanced algorithm perform tasks in parallel, and

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 9: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

9

the only additional cost compared with our basic algo-rithm is the final tag aggregation process. Although ouradvanced user revocation algorithm requires more costand cloud node resources, it achieves better reliability forthe system as we discussed in Section 4.3.1. In practicaldeployment, the parameters can be selected to trade onefor the other (i.e., cost versus reliability) based on thesystem’s actual requirement.

Fig.6(a)-(b) show that our scheme is comparable to ref.[21] in terms of performance on block update and userrevocation.

0 1 2 3 4 5 6 7 8 9 10

x 104

0

1000

2000

3000

4000

5000

6000

7000

8000

File Size (Block Number)(a)

Us

er

Ve

rifi

ca

tio

n T

ime

(m

s)

Our Scheme−PC

Our Scheme−Tablet

Ref.[21]−PC

0 1 2 3 4 5 6 7 8 9 10

x 104

10

15

20

25

30

35

40

File Size (Block Number)(b)

Us

er

Co

mm

un

ica

tio

n C

os

t (K

B)

Our Scheme−PC and Tablet

Ref.[21]−PC

Fig. 7. (a) User verification time on different file size (b)

Communication cost on different file size

0 50 100 150 200 250 300 350 400 450 5000

5000

10000

15000

Number of Users (Modified Chellenge Data Blocks)(b)

Us

er

Ve

rifi

ca

tio

n T

ime

(m

s)

Our Scheme−PC

Our Scheme−Tablet

Ref.[21]−PC

0 50 100 150 200 250 300 350 400 450 50010

20

30

40

50

60

70

80

Number of Users (Modified Chellenge Data Blocks)(a)

Us

er

Co

mm

un

ica

tio

n C

os

t (K

B)

Our Scheme−PC and Tablet

Ref.[21]−PC

Fig. 8. (a) User verification time on different number of

users (b) Communication cost on different number of

users

6.2.3 Real-time Auditing – Single File

As discussed in Section 4.3.2, we set the number ofchallenge blocks as 460 in our experiments to achieve99% error detection probability. We first measure theauditing performance of our design in terms of file size.Fig.7(a) and Fig.7(b) show that the computational costand communication cost on the TPA side is constantversus file size, i.e., the file size has no influence onthe TPA’s cost, which is consistent with our previousanalysis in Section 6.1.1. Fig.7(a) and (b) also indicatethat, although ref. [21] has comparable communicationcost to our scheme, its computational cost on the TPA istens that of our scheme. This is because ref. [21] requiresnumber of exponentiation operations and multiplicationoperations on Group linear to the number of challengeblocks (460 in our setting), which are constant and only6 and 3 in our scheme respectively.

We now evaluate the influence of the number of userson the performance of our design. As shown in Fig.2,

the computational/communication costs of our designis related to the number of users who last modifiedthe challenge data blocks in set D. In the best case,none of the group users ever modified these data blocks;in the worst case, however, every data block in D ismodified by a different group user. In our experiment,we vary the number of such group users from 0 to 460(recall that we set the number of challenge blocks as 460in our experiments) and evaluate its impact on systemperformance. Fig.8(a) shows that the computational costof the TPA is constant and independent to the numberof users who last modified the challenge blocks. Thisis because our design can transform authentication tagsunder different users’ secrets to the same format duringthe auditing process, which enables the aggregation ofcomputational tasks into a constant number. However,the communication cost increases proportionally to thenumber of users who last modified the challenge blocksas shown in Fig.8(b). This is because the TPA needs toadd a group element in the challenge message for eachuser who last modified the challenge blocks. Note that inthe worst case (i.e., 460 users modified the 460 challengedata blocks in D), the communication cost of our designis only 76.71KB, which can be efficiently transmitted viamost of today’s communication networks.

Ref. [21] has a similar communication cost to ours.However, its computational cost on the TPA is pro-portional to the number of users who last modifiedthe challenge blocks, which can be several magnitudeshigher than our scheme as shown in Fig.8(a). Theseexperimental results are also consistent with our anal-ysis in Section 6.1.1. Considering the limited computa-tional/communication costs, we believe that our designallows efficient auditing via resource-less-abundant de-vices such as smartphones, which is not necessarily truefor ref. [21].

0 200 400 600 800 1000 1200 1400 1600 1800 20000

1000

2000

3000

4000

5000

6000

7000

8000

Task Number(a)

Av

era

ge

Us

er

Ve

rifi

ca

tio

n T

ime

(m

s)

Our Scheme−Batch Verification

Our Scheme−Linear Verification

Ref.[21]−PC

0 200 400 600 800 1000 1200 1400 1600 1800 20006

8

10

12

14

16

18

20

Task Number(b)

Av

era

ge

Us

er

Co

mm

un

ica

tio

n C

os

t (K

B)

Our Scheme−Batch Verification

Our Scheme−Linear Verification

Ref.[21]−PC

Fig. 9. (a) Average user verification time on different task

number (b) Communication cost on different task number

6.2.4 Real-time Auditing – Multiple Files

To show the benefits of our batch auditing designfor multiple-file scenario, we change the number ofsimultaneous auditing tasks from 16 to 2000. Here,we measure the average integrity auditing cost perfile and compare our batch auditing design withstraightforward auditing (i.e., processing the tasks

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 10: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

10

one by one). As shown in Fig.9(a), compared withstraightforward auditing, our batch auditing achievesthe average computational cost per file on the TPAis nearly inversely proportional to the task number.This is because our design allows the most expensiveauditing operations to be aggregated into one. Eachadditional auditing task only introduces a constantnumber of hash operations, which are extremelyefficient. Fig.9(b) shows that our batch auditing designalso reduces the average communication cost by about60%. Therefore, our batch auditing design significantlyenhances scalability of our scheme in terms of thenumber of tasks. Since ref. [21] does not support batchauditing, its computational performance is similar tothe straightforward verification as shown in Fig.9(a)and its communicational cost almost doubles that of ourbatch verification. Thus, our batch verification designsignificantly outperforms ref. [21] in terms of multipletasks processing.

7 APPLICATIONS

A potential application of our design is Version ControlSystems (VCS) [4], [5], which are widely utilized inautomating the management of source code, documen-tation and configuration files for software development.In a VCS, a number of developers work as a team anddevelop on shared source codes. To record all changesof development files from version to version, a VCSwill store the first version of a file entirety and itssubsequent versions of stored with the difference fromthe immediate previous version. In order to utilize ourproposed scheme for the integrity auditing for a VCS,we treat the developers as a group and let the developerwho creates the initial repository as the master user u0.u0 will setup the auditing system and generate keyswith our Setup algorithm. As the a file in VCS changesfrom one version to another by appending subsequentdifference to the original file, we can treat them asa whole file F and the commit operation as updateoperation of adding blocks. When a developer makeschanges to files and commits them to the server, he/she(i.e., his/her computer) will also generate authenticationtags for modified blocks with our Update algorithm.When developers want to audit the integrity of filesand their changes on the version control server, theycan first perform our Challenge algorithm to enforcethe server to generate corresponding proof informationusing our Prove algorithm. Then, by running our V erifyalgorithm with the proof information and public keys,the integrity can be checked with high error detectionprobability. As our design efficiently supports batchauditing, we can audit all development files at the sametime to save cost. Thus, our scheme can be easily appliedto existing VCSs to efficient support integrity assurancewithout changing their original design.

8 RELATED WORK

The problem of data integrity auditing in cloud havebeen extensively studied in past years by a number ofProof of Retrievability (POR) and Proof of Data Posses-sion (PDP) schemes [8-22]. In ref. [8], [9], concepts ofPOR and PDP were first proposed separately using RSA-based homomorphic authentication tags. The efficiencyof POR scheme was later enhanced by Shacham et al.[11] based on the BLS (Boneh-Lynn-Shacham) signature[29]. To further enhance the efficiency of data integrityauditing, batch integrity auditing was introduced byWang et al. [14]. Recently, Xu et al. [16] and Yuanet al. [20] proposed private and public POR schemesrespectively with constant communication cost by usinga nice algebraic property of polynomial.

To support dynamic operations in verification, Ate-niese et al. [10] proposed another private PDP schemewith symmetric encryption. A Public integrity auditingwith dynamic operations is introduced by Wang et al.[13] based on the Merkle Hash Tree. Based on the rankinformation, Erway et al. also achieve the dynamic PDP.Zhu et al. [15] later utilized the fragment structure tosave storage overhead of authentication tags with thesupport of dynamic data. A private POR scheme withthe support of dynamic data is recently proposed byCash et al. [22] by utilizing Oblivious RAM.

Although many efforts have been made to guaranteethe integrity of data on remote server, most of them onlyconsider single data owner who has the system secretkey and is the only party allowed to modify the shareddata on cloud. In order to improve the previous worksto support multiple writers, Wang et al. [18] first pro-posed a public integrity auditing scheme for shared dataon cloud based on ring signature-based homomorphicauthenticators. In their scheme, user revocation is notconsidered and the auditing cost grows with group sizeand data size. Recently, Wang et al. [21] enhanced theirprevious public integrity verification scheme with thesupport of user revocation. However, if the cloud noderesponsible for tag update is compromised during userrevocation process, attackers can discover the secret keysof all other valid users. What is more, verification costof the TPA (can also be users) in ref. [21] is significantlyinfluenced by the error detection probability requirementand is also linear to the number of data modifier. Batchverification is not supported in their design. Therefore,this scheme is limited in its scalability.

9 CONCLUSION

In this paper, we propose a novel data integrity auditingscheme that supports multiple writers for cloud-baseddata sharing services. Our proposed scheme is featuredby salient properties of public integrity auditing andconstant computational cost on the user side. We achievethis through our innovative design on polynomial-basedauthentication tags which allows aggregation of tags ofdifferent data blocks. For system scalability, we further

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015

Page 11: Public Integrity Auditing for Dynamic Data Sharing with Multi-User ... Integrity Auditing for … · Public Integrity Auditing for Dynamic Data Sharing with Multi-User Modification

11

empower the cloud with the ability to aggregate au-thentication tags from multiple writers into one whensending the integrity proof information to the verifier(who may be general cloud users). As a result, just aconstant size of integrity proof information needs to betransmitted to the verifier no matter how many datablocks are being checked and how many writers are asso-ciated with the data blocks. Moreover, our novel designallows secure delegation of user revocation operations tothe cloud with an efficient basic design and an advanceddesign with enhanced reliability. Last but not least,our proposed scheme allows aggregation of integrityauditing operations for multiple tasks (files) through ourbatch integrity auditing technique. We provide practicalapplication scenarios of proposed scheme. Extensive nu-merical analysis and real-world experiments validate theperformance of our scheme.

REFERENCES

[1] J. Yuan and S. Yu, “Efficient public integrity checking for clouddata sharing with multi-user modification,” in Proceedings of the33rd Annual IEEE International Conference on Computer Communi-cations, ser. INFOCOM’14, Toronto, Canada, 2014.

[2] “Dropbox for business,” https://www.dropbox.com/business.[3] “Sugarsync,” https://www.sugarsync.com/business/.[4] “Apache subversion,” http://subversion.apache.org/.[5] “Concurrent versions system,” http://cvs.nongnu.org.[6] “Amazon EC2 and Amazon RDS Service disruption,”

http://aiweb.techfak.uni-bielefeld.de/content/bworld-robot-control-software/.

[7] “Dropbox Forums on Data Loss Topic,”https://www.dropboxforum.com/hc/en-us/search?utf8=%E2%9C%93& query=data+loss&commit=Search.

[8] A. Juels and B. S. Kaliski, Jr., “Pors: Proofs of retrievability forlarge files,” in Proceedings of the 14th ACM Conference on Computerand Communications Security, ser. CCS ’07. Alexandria, Virginia,USA: ACM, 2007, pp. 584–597.

[9] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Pe-terson, and D. Song, “Provable data possession at untrustedstores,” in Proceedings of the 14th ACM Conference on Computer andCommunications Security, ser. CCS ’07. Alexandria, Virginia, USA:ACM, 2007, pp. 598–609.

[10] G. Ateniese, R. Di Pietro, L. V. Mancini, and G. Tsudik, “Scalableand efficient provable data possession,” in Proceedings of the 4thInternational Conference on Security and Privacy in CommunicationNetowrks, ser. SecureComm ’08. New York, NY, USA: ACM, 2008,pp. 9:1–9:10.

[11] H. Shacham and B. Waters, “Compact proofs of retrievability,”in Proceedings of the 14th International Conference on the Theoryand Application of Cryptology and Information Security: Advances inCryptology, ser. ASIACRYPT ’08. Melbourne, Australia: Springer-Verlag, 2008, pp. 90–107.

[12] C. Wang, Q. Wang, K. Ren, and W. Lou, “Ensuring data stor-age security in cloud computing,” in In Proceedings of the 17thIEEE International Workshop on Quality of Service, ser. IWQoS’09,Charleston, South Carolina, July 2009.

[13] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling publicverifiability and data dynamics for storage security in cloud com-puting,” in Proceedings of the 14th European conference on Researchin computer security, Saint-Malo, France, 2009, pp. 355–370.

[14] C. Wang, Q. Wang, K. Ren, and W. Lou, “Privacy-preservingpublic auditing for data storage security in cloud computing,”in Proceedings of the 29th IEEE International Conference on ComputerCommunications, ser. INFOCOM’10, San Diego, California, USA,2010, pp. 525–533.

[15] Y. Zhu, H. Wang, Z. Hu, G. J. Ahn, H. Hu, and S. S. Yau, “Dynamicaudit services for integrity verification of outsourced storages inclouds,” in Proceedings of the 2011 ACM Symposium on AppliedComputing, ser. SAC ’11. New York, NY, USA: ACM, 2011, pp.1550–1557.

[16] X. Jia and C. Ee-Chien, “Towards efficient proofs of retrievability,”in Proceedings of the 7th ACM Symposium on Information, Computerand Communications Security, ser. ASIACCS ’12, Seoul, Korea, 2012.

[17] H. Wang, “Proxy provable data possession in public clouds,”Services Computing, IEEE Transactions on, vol. 6, no. 4, pp. 551–559, Oct 2013.

[18] B. Wang, B. Li, and H. Li, “Oruta: Privacy-preserving publicauditing for shared data in the cloud,” in Proceedings of the 2012IEEE Fifth International Conference on Cloud Computing, ser. CLOUD’12, Washington, DC, USA, 2012, pp. 295–302.

[19] C. Wang, S. S. Chow, Q. Wang, K. Ren, and W. Lou, “Privacy-preserving public auditing for secure cloud storage,” IEEE Trans-actions on Computers, vol. 62, no. 2, pp. 362–375, 2013.

[20] J. Yuan and S. Yu, “Proofs of retrievability with public verifiabilityand constant communication cost in cloud,” in Proceedings of theInternational Workshop on Security in Cloud Computing, ser. CloudComputing ’13. Hangzhou, China: ACM, 2013, pp. 19–26.

[21] B. Wang, L. Baochun, and L. Hui, “Public auditing for shareddata with efficient user revocation in the cloud,” in Proceedings ofthe 32nd IEEE International Conference on Computer Communications,ser. INFOCOM ’13, Turin, Italy, 2013, pp. 2904–2912.

[22] D. Cash, A. Kp, and D. Wichs, “Dynamic proofs of retrievabilityvia oblivious ram,” in EUROCRYPT 2013, ser. Lecture Notes inComputer Science, T. Johansson and P. Nguyen, Eds. SpringerBerlin Heidelberg, 2013, vol. 7881, pp. 279–295.

[23] A. Shamir, “How to share a secret,” Commun. ACM, vol. 22, no. 11,pp. 612–613, Nov. 1979.

[24] “Amazon ec2 cloud,” http://aws.amazon.com/ec2/.[25] D. E. Eastlake and P. E. Jones, “US Secure Hash Algorithm 1

(SHA1),” http://www.ietf.org/rfc/rfc3174.txt?number=3174.[26] “Public integrity auditing for dynamic data sharing with multi-

user modification,” http://ualr.edu/jxyuan/papers/IEEE-TIFS-15.pdf.

[27] A. De Caro and V. Iovino, “jpbc: Java pairing based cryptogra-phy,” in Proceedings of the 16th IEEE Symposium on Computers andCommunications, ISCC 2011, Kerkyra, Corfu, Greece, June 28 - July1, 2011, pp. 850–855.

[28] J. Yuan and S. Yu, “Secure and constant cost public cloud stor-age auditing with deduplication,” in Proceedings of the 1st IEEEConference on Communications and Network Security, ser. CNS’13,Washington, USA, 2013, pp. 145–153.

[29] D. Boneh, B. Lynn, and H. Shacham, “Short signatures from theweil pairing,” in Proceedings of the 7th International Conference onthe Theory and Application of Cryptology and Information Security:Advances in Cryptology, ser. ASIACRYPT ’01. London, UK, UK:Springer-Verlag, 2001, pp. 514–532.

Jiawei Yuan is a Ph.D candidate at University ofArkansas at Little Rock. He received his BS in2011 from University of Electronic Science andTechnology of China. His research interests arein the areas of cloud computing and networksecurity, with current focus on securing data andcomputation outsourced into the cloud.

Shucheng Yu (S’07-M’10) received his Ph.Din Electrical and Computer Engineering fromWorcester Polytechnic Institute. He joined theComputer Science department at the Univer-sity of Arkansas at Little Rock as an assistantprofessor in 2010. His research interests arein the general areas of Network Security andApplied Cryptography. His current research in-terests include Secure Data Services in CloudComputing, Attribute Based Cryptography, andSecurity and Privacy Protection in Cyber Physi-

cal Systems. He is a member of IEEE and ACM.

IEEE Transactions on Information Forensics and Security Volume: 10,Year: 2015