EFF: 114-distributed-storage

Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
Version: $Revision: 12715 $
Last-Modified: $Date: 2007-12-07 21:27:58 +0000 (Fri, 07 Dec 2007) $
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed

Change history:

  13-May-2007  Initial proposal
  14-May-2007  Added changes suggested by Lasse Øverlier
  30-May-2007  Changed descriptor format, key length discussion, typos
  09-Jul-2007  Incorporated suggestions by Roger, added status of specification
               and implementation for upcoming GSoC mid-term evaluation
  11-Aug-2007  Updated implementation statuses, included non-consecutive
               replication to descriptor format
  20-Aug-2007  Renamed config option HSDir as HidServDirectoryV2
  02-Dec-2007  Closed proposal

Overview:

The basic idea of this proposal is to distribute the tasks of storing and serving hidden service descriptors from currently three authoritative directory nodes among a large subset of all onion routers. The three reasons to do this are better robustness (availability), better scalability, and improved security properties. Further, this proposal suggests changes to the hidden service descriptor format to prevent new security threats coming from decentralization and to gain even better security properties.

Status:

As of December 2007, the new hidden service descriptor format is implemented and usable. However, servers and clients do not yet make use of descriptor cookies, because there are open usability issues of this feature that might be resolved in proposal 121. Further, hidden service directories do not perform replication by themselves, because (unauthorized) replica fetch requests would allow any attacker to fetch all hidden service descriptors in the system. As neither issue is critical to the functioning of v2 descriptors and their distribution, this proposal is considered as Closed.

Motivation:

The current design of hidden services exhibits the following performance and security problems:

First, the three hidden service authoritative directories constitute a performance bottleneck in the system. The directory nodes are responsible for storing and serving all hidden service descriptors. As of May 2007 there are about 1000 descriptors at a time, but this number is assumed to increase in the future. Further, there is no replication protocol for descriptors between the three directory nodes, so that hidden services must ensure the availability of their descriptors by manually publishing them on all directory nodes. Whenever a fourth or fifth hidden service authoritative directory is added, hidden services will need to maintain an equally increasing number of replicas. These scalability issues have an impact on the current usage of hidden services and put an even higher burden on the development of new kinds of applications for hidden services that might require storing even more descriptors.

Second, besides posing a limitation to scalability, storing all hidden service descriptors on three directory nodes also constitutes a security risk. The directory node operators could easily analyze the publish and fetch requests to derive information on service activity and usage and read the descriptor contents to determine which onion routers work as introduction points for a given hidden service and need to be attacked or threatened to shut it down. Furthermore, the contents of a hidden service descriptor offer only minimal security properties to the hidden service. Whoever becomes aware of the service ID can easily find out whether the service is active at the moment and which introduction points it has. This applies to (former) clients, (former) introduction points, and of course to the directory nodes. It only requires requesting the descriptor for the given service ID, which can be done by anyone anonymously.

This proposal suggests two major changes to approach the described performance and security problems:

The first change affects the storage location for hidden service descriptors. Descriptors are distributed among a large subset of all onion routers instead of three fixed directory nodes. Each storing node is responsible for a subset of descriptors for a limited time only. It is not able to choose which descriptors it stores at a certain time, because this is determined by its onion ID which is hard to change frequently and in time (only routers which are stable for a given time are accepted as storing nodes). In order to resist single node failures and untrustworthy nodes, descriptors are replicated among a certain number of storing nodes. A first replication protocol makes sure that descriptors don't get lost when the node population changes; therefore, a storing node periodically requests the descriptors from its siblings. A second replication protocol distributes descriptors among non-consecutive nodes of the ID ring to prevent a group of adversaries from generating new onion keys until they have consecutive IDs to create a 'black hole' in the ring and make random services unavailable. Connections to storing nodes are established by extending existing circuits by one hop to the storing node. This also ensures that contents are encrypted. The effect of this first change is that the probability that a single node operator learns about a certain hidden service is very small and that it is very hard to track a service over time, even when it collaborates with other node operators.

The second change concerns the content of hidden service descriptors. Obviously, security problems cannot be solved only by decentralizing storage; in fact, they could also get worse if done without caution. First, a descriptor ID needs to change periodically in order to be stored on changing nodes over time. Next, the descriptor ID needs to be computable only for the service's clients, but should be unpredictable for all other nodes. Further, the storing node needs to be able to verify that the hidden service is the true originator of the descriptor with the given ID even though it is not a client. Finally, a storing node should learn as little information as necessary by storing a descriptor, because it might not be as trustworthy as a directory node; for example it does not need to know the list of introduction points. Therefore, a second key is applied that is only known to the hidden service provider and its clients and that is not included in the descriptor. It is used to calculate descriptor IDs and to encrypt the introduction points. This second key can either be given to all clients together with the hidden service ID, or to a group or a single client as an authentication token. In the future this second key could be the result of some key agreement protocol between the hidden service and one or more clients. A new text-based format is proposed for descriptors instead of an extension of the existing binary format for reasons of future extensibility.

Design:

The proposed design is described by the required changes to the current design. These requirements are grouped by content, rather than by affected specification documents or code files, and numbered for reference below.

Hidden service clients, servers, and directories:

/1/ Create routing list

All participants can filter the consensus status document received from the directory authorities to one routing list containing only those servers that store and serve hidden service descriptors and which have been running for at least 24 hours. A participant only trusts its own routing list and never learns about routing information from other parties.
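The filtering step in /1/ could be sketched as follows. This is only an illustration: the RouterStatus type, its field names, and the idea that uptime is available as a number of seconds are assumptions made for the example and are not taken from the proposal or from any Tor data structure.

```python
from dataclasses import dataclass

# "running for at least 24 hours", expressed in seconds
MIN_UPTIME_SECONDS = 24 * 60 * 60


@dataclass(frozen=True)
class RouterStatus:
    """Hypothetical, simplified view of one consensus entry."""
    identity: bytes            # assumed: the router's ID on the ring
    stores_descriptors: bool   # assumed: router serves hidden service descriptors
    uptime_seconds: int        # assumed: uptime as known to the participant


def build_routing_list(consensus):
    """Keep only stable descriptor-storing routers, sorted by ring ID."""
    eligible = [
        r for r in consensus
        if r.stores_descriptors and r.uptime_seconds >= MIN_UPTIME_SECONDS
    ]
    return sorted(eligible, key=lambda r: r.identity)
```

Sorting by ID is a design choice of the sketch: it makes the list directly usable as the ID ring in later steps.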

/2/ Determine responsible hidden service directory

All participants can determine the hidden service directory that is responsible for storing and serving a given ID, as well as the hidden service directories that replicate its content. Every hidden service directory is responsible for the descriptor IDs in the interval from its predecessor, exclusive, to its own ID, inclusive. Further, a hidden service directory holds replicas for its n predecessors, where n denotes the number of consecutive replicas. (requires /1/)
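A minimal sketch of the lookup in /2/, assuming the routing list is available as a sorted list of directory IDs. The function name and the default n_consecutive=2 are choices made for this example; IDs may be integers or byte strings as long as they compare consistently.

```python
from bisect import bisect_left


def responsible_directories(ring_ids, descriptor_id, n_consecutive=2):
    """Return the responsible directory followed by the directories that
    hold its consecutive replicas.

    ring_ids must be sorted in ascending order. The responsible directory
    is the first one whose ID is >= descriptor_id (interval from its
    predecessor, exclusive, to its own ID, inclusive), wrapping around at
    the end of the ring. Since each directory holds replicas for its n
    predecessors, the replicas sit on the n following directories.
    """
    if not ring_ids:
        return []
    start = bisect_left(ring_ids, descriptor_id) % len(ring_ids)
    count = min(n_consecutive + 1, len(ring_ids))
    return [ring_ids[(start + i) % len(ring_ids)] for i in range(count)]
```

For a ring [10, 20, 30, 40], the ID 21 falls into the interval (20, 30], so directory 30 is responsible and 40 and 10 hold the consecutive replicas.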

[/3/ and /4/ were requirements to use BEGIN_DIR cells for directory requests which have not been fulfilled in the course of the implementation of this proposal, but elsewhere.]

Hidden service directory nodes:

/5/ Advertise hidden service directory functionality

Every onion router that has its directory port open can decide whether it wants to store and serve hidden service descriptors by setting a new config option "HidServDirectoryV2" 0|1 to 1. An onion router with this config option being set includes the flag "hidden-service-dir" in its router descriptors that it sends to directory authorities.

/6/ Accept v2 publish requests, parse and store v2 descriptors

Hidden service directory nodes accept publish requests for hidden service descriptors and store them to their local memory. (It is not necessary to make descriptors persistent, because after disconnecting, the onion router would not be accepted as storing node anyway, because it has not been running for at least 24 hours.) All requests and replies are formatted as HTTP messages. Requests are directed to the router's directory port and are contained within BEGIN_DIR cells. A hidden service directory node stores a descriptor only when it thinks that it is responsible for storing that descriptor based on its own routing table. Every hidden service directory node is responsible for the descriptor IDs in the interval of its n-th predecessor in the ID circle up to its own ID (n denotes the number of consecutive replicas). (requires /1/)

/7/ Accept v2 fetch requests

Same as /6/, but with fetch requests for hidden service descriptors. (requires /2/)

/8/ Replicate descriptors with neighbors

A hidden service directory node replicates descriptors from its two predecessors by downloading them once an hour. Further, it checks its routing table periodically for changes. Whenever it realizes that a predecessor has left the network, it establishes a connection to the new n-th predecessor and requests its stored descriptors in the interval of its (n+1)-th predecessor and the requested n-th predecessor. Whenever it realizes that a new onion router has joined with an ID higher than its former n-th predecessor, it adds it to its predecessors and discards all descriptors in the interval of its (n+1)-th and its n-th predecessor. (requires /1/)

[Dec 02: This function has not been implemented, because arbitrary nodes would have been able to download the entire set of v2 descriptors. An authorized replication request would be necessary. For the moment, the system runs without any directory-side replication. -KL]

Authoritative directory nodes:

/9/ Confirm a router's hidden service directory functionality

Directory nodes include a new flag "HSDir" for routers that decided to provide storage for hidden service descriptors and that have been running for at least 24 hours. The last requirement prevents a node from frequently changing its onion key to become responsible for an identifier it wants to target.

Hidden service provider:

/10/ Configure v2 hidden service

Each hidden service provider that has set the config option "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2 descriptors and conform to the v2 connection establishment protocol. When configuring a hidden service, a hidden service provider checks if it has already created a random secret_cookie and a hostname2 file; if not, it creates both of them. (requires /2/)

/11/ Establish introduction points with fresh key

If configured to publish only v2 descriptors and no v0/v1 descriptors any more, a hidden service provider that is setting up the hidden service at introduction points does not pass its own public key, but the public key of a freshly generated key pair. It also includes these fresh public keys in the hidden service descriptor together with the other introduction point information. The reason is that the introduction point does not need to and therefore should not know for which hidden service it works, so as to prevent it from tracking the hidden service's activity. (If a hidden service provider supports both v0/v1 and v2 descriptors, v0/v1 clients rely on the fact that all introduction points accept the same public key, so that this new feature cannot be used.)

/12/ Encode v2 descriptors and send v2 publish requests

If configured to publish v2 descriptors, a hidden service provider publishes a new descriptor whenever its content changes or a new publication period starts for this descriptor. If the current publication period would only last for less than 60 minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind and the client 30 minutes ahead), the hidden service provider publishes both a current descriptor and one for the next period. Publication is performed by sending the descriptor to all hidden service directories that are responsible for keeping replicas for the descriptor ID. This includes two non-consecutive replicas that are stored at 3 consecutive nodes each. (requires /1/ and /2/)
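The 60-minute overlap rule in /12/ can be illustrated with a small helper. It assumes, purely for the example, that publication periods are numbered with consecutive integers; the function and constant names are hypothetical.

```python
# 2 x 30 minutes: server up to 30 minutes behind, client up to 30 minutes ahead
OVERLAP_SECONDS = 2 * 30 * 60


def periods_to_publish(current_period, seconds_until_period_end):
    """Return the period numbers a descriptor should be published for."""
    periods = [current_period]
    if seconds_until_period_end < OVERLAP_SECONDS:
        # The current period ends within the hour, so also publish
        # a descriptor for the next period.
        periods.append(current_period + 1)
    return periods
```

For each returned period, the provider would then compute one descriptor ID per non-consecutive replica and send the descriptor to the three consecutive directories responsible for that ID.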

Hidden service client:

/13/ Send v2 fetch requests

A hidden service client that has set the config option "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion addresses by requesting a v2 descriptor from a randomly chosen hidden service directory that is responsible for keeping a replica for the descriptor ID. In total there are six replicas of which the first and the last three are stored on consecutive nodes. The probability of picking one of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the fact that the availability will be the highest on the node with next higher ID. A hidden service client relies on the hidden service provider to store two sets of descriptors to compensate clock skew between service and client. (requires /1/ and /2/)
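The weighted choice among three consecutive directories described in /13/ might look like the sketch below. The text does not state unambiguously which directory receives which weight, so the mapping is left as an explicit parameter; applying the weights 1, 2, 3 in the order of the list passed in is an assumption of this example, as are all names used.

```python
import random

# Assumption: weights in the order the text lists them (1/6, 2/6, 3/6).
# Which ring position gets which weight is up to the caller.
REPLICA_WEIGHTS = (1, 2, 3)


def pick_directory(consecutive_nodes, weights=REPLICA_WEIGHTS, rng=random):
    """Pick one of the consecutive directories with the given weights."""
    if len(consecutive_nodes) != len(weights):
        raise ValueError("need exactly one weight per directory")
    return rng.choices(consecutive_nodes, weights=weights, k=1)[0]
```

Passing a seeded random.Random instance as rng makes the choice reproducible in tests.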

/14/ Process v2 fetch reply and parse v2 descriptors

A hidden service client that has sent a request for a v2 descriptor can parse it and store it to the local cache of rendezvous service descriptors.

/15/ Establish connection to v2 hidden service

A hidden service client can establish a connection to a hidden service using a v2 descriptor. This includes using the secret cookie for decrypting the introduction points contained in the descriptor. When contacting an introduction point, the client does not use the public key of the hidden service provider, but the freshly-generated public key that is included in the hidden service descriptor. Whether or not a fresh key is used instead of the key of the hidden service depends on the available protocol versions that are included in the descriptor; by this, connection establishment is to a certain extent decoupled from fetching the descriptor.

Hidden service descriptor:

(Requirements concerning the descriptor format are contained in /6/ and /7/.) 

The new v2 hidden service descriptor format looks like this:

  onion-address = h(public-key) + cookie
  descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
  descriptor-content = {
    descriptor-id,
    version,
    public-key,
    h(time-period + cookie + replica),
    timestamp,
    protocol-versions,
    { introduction-points } encrypted with cookie
  } signed with private-key

The "descriptor-id" needs to change periodically in order for the descriptor to be stored on changing nodes over time. It may only be computable by a hidden service provider and all of his clients to prevent unauthorized nodes from tracking the service activity by periodically checking whether there is a descriptor for this service. Finally, the hidden service directory needs to be able to verify that the hidden service provider is the true originator of the descriptor with the given ID.

Therefore, "descriptor-id" is derived from the "public-key" of the hidden service provider, the current "time-period" which changes every 24 hours, a secret "cookie" shared between hidden service provider and clients, and a "replica" denoting the number of this non-consecutive replica. (The "time-period" is constructed in a way that time periods do not change at the same moment for all descriptors by deriving a value between 0:00 and 23:59 hours from h(public-key) and making the descriptors of this hidden service provider expire at that time of the day.) The "descriptor-id" is defined to be 160 bits long. [extending the "descriptor-id" length suggested by LØ]
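The derivation of "descriptor-id" can be sketched as below. Several details are assumptions of this example and not stated in the proposal: SHA-1 as h (chosen only because it yields 160 bits), the byte encodings of "time-period" and "replica", the way the per-service offset is taken from h(public-key), and the use of the untruncated digest of the public key.

```python
import hashlib

PERIOD_SECONDS = 24 * 60 * 60  # "time-period" changes every 24 hours


def h(data: bytes) -> bytes:
    """Assumed hash function: SHA-1, giving a 160-bit result."""
    return hashlib.sha1(data).digest()


def time_period(public_key: bytes, now: int) -> int:
    """Period number with a per-service rollover time of day.

    The offset in [0, 86400) is derived from h(public-key), so that
    periods of different services do not all change at the same moment.
    """
    offset = int.from_bytes(h(public_key)[:4], "big") % PERIOD_SECONDS
    return (now + offset) // PERIOD_SECONDS


def descriptor_id(public_key: bytes, cookie: bytes, replica: int, now: int) -> bytes:
    """descriptor-id = h(h(public-key) + h(time-period + cookie + replica))"""
    period = time_period(public_key, now).to_bytes(4, "big")  # assumed encoding
    secret_id_part = h(period + cookie + bytes([replica]))    # assumed encoding
    return h(h(public_key) + secret_id_part)
```

Only parties that know the cookie can compute the inner hash, and therefore future descriptor IDs; the two non-consecutive replicas differ only in the replica value.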

Only the hidden service provider and the clients are able to generate future "descriptor-id"s. Hence, the "onion-address" is extended from what is now only the hash value of "public-key" to also include the secret "cookie". The hash of the "public-key" is defined to be 80 bits long, whereas the "cookie" is dimensioned to be 120 bits long. This makes a total of 200 bits or 40 base32 chars, which is quite a lot to handle for a human, but necessary to provide sufficient protection against an adversary generating a key pair with the same "public-key" hash or guessing the "cookie".

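The length arithmetic for the extended onion address (80 + 120 = 200 bits, 5 bits per base32 character, hence 40 characters) can be checked with a short sketch. The function name, the lower-casing, and the order of concatenation are assumptions made for the example.

```python
import base64

PUBLIC_KEY_HASH_BYTES = 80 // 8   # 80-bit hash of the public key
COOKIE_BYTES = 120 // 8           # 120-bit secret cookie


def onion_address(public_key_hash: bytes, cookie: bytes) -> str:
    """Encode h(public-key) + cookie as 40 base32 characters."""
    if len(public_key_hash) != PUBLIC_KEY_HASH_BYTES:
        raise ValueError("public key hash must be 80 bits")
    if len(cookie) != COOKIE_BYTES:
        raise ValueError("cookie must be 120 bits")
    # 25 bytes = 200 bits = 40 base32 characters, so no padding is produced.
    return base64.b32encode(public_key_hash + cookie).decode("ascii").lower()
```
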
A hidden service directory can verify that a descriptor was created by the hidden service provider by checking if the "descriptor-id" corresponds to the "public-key" and if the signature can be verified with the "public-key".
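The first of the two checks, that "descriptor-id" corresponds to "public-key", needs no knowledge of the cookie, because the descriptor carries h(time-period + cookie + replica) as a field. A minimal sketch follows, assuming SHA-1 as h and using the hypothetical name secret_id_part for that field. The signature check is deliberately left out, since it depends on a key format the text above does not specify.

```python
import hashlib
import hmac


def descriptor_id_matches(descriptor_id: bytes, public_key: bytes,
                          secret_id_part: bytes) -> bool:
    """Recompute h(h(public-key) + secret-id-part) and compare.

    secret_id_part stands for the h(time-period + cookie + replica) field
    that is included in the descriptor.
    """
    expected = hashlib.sha1(
        hashlib.sha1(public_key).digest() + secret_id_part
    ).digest()
    return hmac.compare_digest(expected, descriptor_id)
```

A directory would reject a publish request when this check fails, and only then go on to verify the signature.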

The "introduction-points" that are included in the descriptor are encrypted using the same "cookie" that is shared between hidden service provider and clients. [correction to use another key than h(time-period + cookie) as encryption key for introduction points made by LØ]

A new text-based format is proposed for descriptors instead of an extension of the existing binary format for reasons of future extensibility.

Security implications:

The security implications of the proposed changes are grouped by the roles of nodes that could perform attacks or on which attacks could be performed.

Attacks by authoritative directory nodes

Authoritative directory nodes are no longer the single places in the network that know about a hidden service's activity and introduction points. Thus, they cannot perform attacks using this information, e.g. track a hidden service's activity or usage pattern or attack its introduction points. Formerly, it would only require a single corrupted authoritative directory operator to perform such an attack.

Attacks by hidden service directory nodes

A hidden service directory node could misuse a stored descriptor to track a hidden service's activity and usage pattern by clients. Though there is no countermeasure against this kind of attack, it is very expensive to track a certain hidden service over time. An attacker would need to run a large number of stable onion routers that work as hidden service directory nodes to have a good probability to become responsible for its changing descriptor IDs. For each period, the probability is:

1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N as total number of hidden service directories, c as compromised nodes, and r as number of replicas
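The formula above can be evaluated directly with the standard library. The function name is hypothetical; the parameters follow the symbols N, c, and r from the text.

```python
from math import comb


def p_attacker_holds_replica(N: int, c: int, r: int) -> float:
    """Probability that at least one of r replicas lands on one of the
    c compromised directories out of N in total:
    1 - (N-c choose r) / (N choose r) for N-c >= r, and 1 otherwise.
    """
    if N - c < r:
        return 1.0
    return 1 - comb(N - c, r) / comb(N, r)
```

For example, with N=4, c=1, and r=1 the result is 1 - 3/4 = 0.25.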

The hidden service directory nodes could try to make a certain hidden service unavailable to its clients. To this end, they could discard all stored descriptors for that hidden service and reply to clients that there is no descriptor for the given ID or return an old or false descriptor content. The client would detect a false descriptor, because it could not contain a correct signature. But an old content or an empty reply could confuse the client. Therefore, the countermeasure is to replicate descriptors among a small number of hidden service directories, e.g. 5. The probability of a group of collaborating nodes to make a hidden service completely unavailable is in each period:

(c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, with N as total number of hidden service directories, c as compromised nodes, and r as number of replicas
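The availability formula can be evaluated the same way. Again the function name is hypothetical and the parameters follow the symbols N, c, and r from the text.

```python
from math import comb


def p_all_replicas_compromised(N: int, c: int, r: int) -> float:
    """Probability that all r replicas land on compromised directories:
    (c choose r) / (N choose r) for c >= r and N >= r, and 0 otherwise.
    """
    if c < r or N < r:
        return 0.0
    return comb(c, r) / comb(N, r)
```

For example, with N=4, c=2, and r=2 the result is 1/6: only one of the six possible pairs of directories consists of compromised nodes alone.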

A hidden service directory could try to find out which introduction points are working on behalf of a hidden service. In contrast to the previous design, this is not possible anymore, because this information is encrypted to the clients of a hidden service.

Attacks on hidden service directory nodes

An anonymous attacker could try to swamp a hidden service directory with false descriptors for a given descriptor ID. This is prevented by requiring that descriptors are signed.

Anonymous attackers could swamp a hidden service directory with correct descriptors for non-existing hidden services. There is no countermeasure against this attack. However, the creation of valid descriptors is more expensive than verification and storage in local memory. This should make this kind of attack unattractive.

Attacks by introduction points

Current or former introduction points could try to gain information on the hidden service they serve. But due to the fresh key pair that is used by the hidden service, this attack is not possible anymore.

Attacks by clients

Current or former clients could track a hidden service's activity, attack its introduction points, or determine the responsible hidden service directory nodes and attack them. There is nothing that could prevent them from doing so, because honest clients need the full descriptor content to establish a connection to the hidden service. At the moment, the only countermeasure against dishonest clients is to change the secret cookie and pass it only to the honest clients.

Compatibility:

The proposed design is meant to replace the current design for hidden service descriptors and their storage in the long run.

There should be a first transition phase in which both the current design and the proposed design are served in parallel. Onion routers should start serving as hidden service directories, and hidden service providers and clients should make use of the new design if both sides support it. Hidden service providers should be allowed to publish descriptors of the current format in parallel, and authoritative directories should continue storing and serving these descriptors.

After the first transition phase, hidden service providers should stop publishing descriptors on authoritative directories, and hidden service clients should not try to fetch descriptors from the authoritative directories. However, the authoritative directories should continue serving hidden service descriptors for a second transition phase. As of this point, all v2 config options should be set to a default value of 1.

After the second transition phase, the authoritative directories should stop serving hidden service descriptors.