14 wuyw Data Socialization and Cloud Storage

Data Socialization and Cloud Storage Yongwei Wu Dept. of Computer Science, Tsinghua U. [email protected] Outline Cloud Storage Case Study: Corsair and MeePo Security of cloud storage from a system perspective Applications based on cloud storage


Page 1: 14 wuyw Data Socialization and Cloud Storage

Data Socialization and Cloud Storage

Yongwei Wu

Dept. of Computer Science, Tsinghua U.

[email protected]

Outline

Cloud Storage

Case Study: Corsair and MeePo

Security of cloud storage from a system perspective

Applications based on cloud storage


Story One

4+ billion phones by 2010 [Source: Nokia]

PCs

TVs

PDAs

London, UK

Story Two

Manage

Contact

Work

Share

Google Charts


Company/Lab Clouds ~

Cloud Computing and Data Centers

Cloud computing as a service is an important form of information infrastructure in the Internet age.

Cloud computing is likely to become one of the most popular computing and business models in the future.

Cloud storage: the right arm of cloud computing; there is no fun in computing without storage.


A Vision of Cloud Storage from Social Network

Meet new people
A virtual world
Message delivery
ICQ, QQ

Contact with acquaintances
Real-life connections
Message delivery, profiles, and file sharing (limited and directed)
Facebook, LinkedIn, QQ groups

Data socializing: meet people with the same interests
Real-life connections or shared characteristics
Based on data creation and sharing
On campus: Mac fruits, Swimming team, C.S.53, etc.; more than 600 communities in Corsair.

Cloud Storage

Features of cloud storage

Gain through network: Patterns of cloud

Easy to get: Anytime, anywhere, any way

Never lost: life-time possession

Safe: Manageable, Controllable, Auditable

Group sharing and downloading: Movies & TV, Software, Learning Materials, Music, …

Current cloud storage
Network backup of user data
Sharing across different times and places

Current network storage service

FTP


Cloud Storage Service: Corsair

Storage and sharing of data, providing a unified view for managing local resources together with remote resources

all users, group users, personal users


Unified View: combines local and remote resources into a unified view

Data Trans.: parallel transfer, breakpoint resume, flow control

Data Search: resource search in Corsair
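The slide names breakpoint resume and flow control without showing how they work. A minimal Python sketch of the general idea follows; it is not Corsair's actual code, and the function name `resume_copy` and the `max_chunks` parameter (a crude stand-in for flow control or an interrupted session) are assumptions made for illustration.

```python
import io


def resume_copy(src, dst, chunk_size=4, max_chunks=None):
    """Copy src into dst, starting from the number of bytes dst already holds.

    Hypothetical helper: resuming from the destination's current length is
    one common way to implement breakpoint resume.
    """
    offset = dst.seek(0, io.SEEK_END)  # bytes already received
    src.seek(offset)                   # skip what was transferred earlier
    sent = 0
    while max_chunks is None or sent < max_chunks:
        chunk = src.read(chunk_size)
        if not chunk:
            break                      # source exhausted, transfer complete
        dst.write(chunk)
        sent += 1
    return dst.tell()                  # total bytes now present in dst


source = io.BytesIO(b"0123456789")
target = io.BytesIO()
first = resume_copy(source, target, max_chunks=1)  # "interrupted" after one chunk
second = resume_copy(source, target)               # resumes at byte 4
```

A parallel transfer could apply the same offset arithmetic to several byte ranges at once; that part is left out to keep the sketch short.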

Corsair: System Architecture


Distributed Storage Service

Adapter: FTP Interface, ZettaDS Interface, ……

Service: Virtual Dir., Bookmarks, Search, User Manage, Data Transfer, Group Manage

Application: GUI, Shell, API, Web Portal


Corsair: Interface

Unified view
Virtual directories and index service
User and group management
Distributed information search

Corsair: Transferring performance



Corsair: Features

Rapid sharing: a 100GB group space after a simple application and approval

Portable flash disk: 2GB personal space after registration

Disk expansion: 70TB of public network resources for every user

Fast response and rapid data transfer (5MBps on campus)

Easy to use (the client looks like Windows Resource Manager or Nautilus under Linux GNOME)

Complete application programming interface

Six months after the system went live, most departments and associations at Tsinghua U. chose to shut down their own FTP servers and migrate to Corsair.

Corsair: Deployment Overview

Deployed in: Tsinghua U., Lanzhou U., Dalian U. of Tech., Shanghai U.

Statistics (Tsinghua U.)

33,000 students

80,000 downloads

16,000 registered users

3,000 daily online users

600 groups

100 TB data

63.1% > 5MB/s

1.3TB daily data transfer

Competition

Product form

Client for Windows, Linux, Mac

Web site: Register, Manage, Forum


Typical Groups

MeePo: a New Cloud Storage System

File data storage and sharing
Seamless integration with the local resource manager
Different cache strategies based on various requirements, optimizing user experience

Public Groups, My Space, Group Space

My Space: every registered user owns a 20GB space, available online and offline
Group Space: a 500GB group space after application and approval
Public Groups: groups created by the system, open to all users


A Glance at MeePo’s Architecture

Client

Layout Manager

Data Service

MeePo Client Design

The client creates a file system in user space through FUSE/Dokan.
It looks just like another hard disk drive to users.
Users can do anything on the disk that MeePo mounts, with any legacy software, and all user requests are handled by the MeePo Client.
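To make the idea of a client that handles every file request concrete, here is a small Python sketch of the kind of handlers a user-space file system exposes. It is an in-memory stand-in written under assumptions: the class name `InMemoryVolume` and its method set are hypothetical, the FUSE/Dokan binding is omitted entirely, and nothing here reflects MeePo's real implementation.

```python
class InMemoryVolume:
    """Toy user-space volume: each method mirrors one file system request."""

    def __init__(self):
        self.files = {}  # path -> bytearray holding the file contents

    def create(self, path):
        self.files[path] = bytearray()

    def write(self, path, data, offset):
        buf = self.files[path]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad a sparse gap
        buf[offset:offset + len(data)] = data
        return len(data)                             # bytes written

    def read(self, path, size, offset):
        return bytes(self.files[path][offset:offset + size])

    def readdir(self):
        return sorted(self.files)

    def getattr(self, path):
        return {"st_size": len(self.files[path])}


volume = InMemoryVolume()
volume.create("/notes.txt")
volume.write("/notes.txt", b"hello", 0)
volume.write("/notes.txt", b"LL", 2)  # overwrite in place, as legacy software may
```

In a real client, a FUSE or Dokan binding would route kernel requests to methods like these, and the methods would talk to cache and remote storage rather than a dictionary.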

Linux VFS/Win Driver Service

MeePo Client

Video & Audio Player, Office, other software

OS Kernel

FUSE/Dokan

NTFS

Ext3

Sync Comm.


MeePo: a Preview Version - 1

MeePo: a Preview Version - 2


MeePo: Backend System Architecture

Clients

Metadata Servers

Supervisors

Data Servers

C1 C2 C3 C4

Mnesia Distributed Database

DFS

Security door: access control

Specifying road: route authorization

Body armor: data process protection

Disaster prevention: based on multiple copies of each data chunk

Monitor: trail tracing


Security Structure in Tsinghua Data Center


Authorization and access control based on switch (1)

Internet

Trusted terminal

DFS client

Hierarchical dynamic secure LAN

DFS monitor

DFS metadata server

Gateway

DFS data server

DFS data server

Dynamic monitor switch set (access authorization)

Trusted terminal

Trusted terminal

Trusted terminal

Trusted terminal

……


Packet switching based on security policy

Secure network structure of DCN

1. Integrate the security policy of the DFS and access control into one uniform policy

2. Use OpenFlow-like flow management tools to map flows onto switches. Place middleboxes on a branch of the flow via flow redirection, and process flows on demand (serial -> parallel)

• Strengthens security, improves network performance, and allows on-demand changes of network topology

• Enhances flow control and management, keeping traffic on specified paths and avoiding threats caused by untrusted or malicious flows
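The policy-to-flow mapping described above can be sketched as a small match table in Python. This is an illustrative model only, not an OpenFlow API: the `Rule` type, the `route` function, and host names such as `trusted-1` and `inspector` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str     # source host, or "*" for any source
    dst: str     # destination host, or "*" for any destination
    path: tuple  # ordered hops the flow must traverse


def route(rules, src, dst):
    """Return the path of the first matching rule, or None (default deny)."""
    for rule in rules:
        if rule.src in ("*", src) and rule.dst in ("*", dst):
            return rule.path
    return None


rules = [
    # A trusted terminal reaches the data server directly.
    Rule("trusted-1", "dfs-data", ("switch", "dfs-data")),
    # Any other source is redirected through a middlebox first.
    Rule("*", "dfs-data", ("switch", "inspector", "dfs-data")),
]

direct = route(rules, "trusted-1", "dfs-data")
inspected = route(rules, "guest", "dfs-data")
denied = route(rules, "guest", "dfs-meta")
```

Rule order matters in this sketch: the first match wins, so specific rules go before wildcard rules, and a flow with no matching rule is dropped.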


Data process protection (1)

Hardware (with TPM, VT-d)

Bare-metal hypervisor Root TCB, measured and verified

Secure App 1 (in cooperation with Root TCB)

Commercial OS: security irrelevant and no need to measure

Secure App 2 (in cooperation with Root TCB)

Other App

Commercial OS

. . . . . . .

Isolation Boundary

The Root TCB is “a kernel providing services for all other kernels”

Why is cooperation possible?

1. The Root TCB, as a hypervisor, manages the use of CPU, memory and peripherals of the whole computing platform

2. For example, the in-memory pages of an application can be protected by the Root TCB to make them unavailable to all others. Of course, the application must request this.
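The page protection example can be modeled as an ownership table kept by the hypervisor. The Python sketch below is a toy model under stated assumptions: the `RootTCB` class, its methods, and the application names are hypothetical, and a real hypervisor enforces this through hardware page tables rather than a dictionary.

```python
class RootTCB:
    """Toy model: the hypervisor records which application owns each page."""

    def __init__(self):
        self.owner = {}  # page address -> owning application

    def protect(self, app, page):
        # Protection happens only when the application asks for it.
        if self.owner.get(page, app) != app:
            raise PermissionError("page already protected by another application")
        self.owner[page] = app

    def can_access(self, app, page):
        # Unprotected pages stay open; protected pages are owner-only.
        return self.owner.get(page, app) == app


tcb = RootTCB()
tcb.protect("secure-app-1", 0x1000)
```

With this table, `secure-app-1` can still reach page `0x1000`, while a commercial OS or any other application is refused, and pages that nobody asked to protect remain accessible to everyone.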


Data process protection in Daoli (2)

Image from Wikipedia

Southbridge (I/O device) is the path that information must go through. Chunk-based storage is adopted and isolation is provided by the IOMMU.

New chipset, e.g., Intel VT-d

Northbridge allows no info out. Isolation is controlled by the MMU.

Front-side bus

LPC bus



Distributed file system

Chunk-Based

Distributed Database

Clients

Metadata Servers

Supervisors

Data Servers

FUSE


1. Coarse-grained architecture

2. Multiple metadata servers, maintaining the directory tree, file metadata and chunk mapping table, based on a distributed database (reliable)

3. Multiple chunk servers; each chunk is a file in the local file system, with support for garbage collection and replica allocation under a dynamic replica strategy (reliable)

4. Multiple supervisors, in charge of system monitoring, failure recovery, replica management, garbage collection, history log archiving, and the audit interface (reliable and controllable)

• Supports files of all sizes
• Supports rewrite
• Provides a POSIX-like interface
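The chunk table kept by the metadata servers can be illustrated with a short Python sketch. It assumes a static round-robin placement, which is a simplification and not the dynamic replica strategy the slide mentions; the function name `place_chunks` and the server names are hypothetical.

```python
def place_chunks(file_size, chunk_size, servers, replicas=3):
    """Map each chunk index of a file to the data servers holding its replicas."""
    if replicas > len(servers):
        raise ValueError("not enough data servers for the requested replica count")
    n_chunks = -(-file_size // chunk_size)  # ceiling division
    table = {}
    for index in range(n_chunks):
        # Consecutive servers (wrapping around) give distinct replica holders.
        table[index] = [servers[(index + r) % len(servers)] for r in range(replicas)]
    return table


# A 250-byte file with 100-byte chunks needs three chunks.
chunk_table = place_chunks(250, 100, ["ds1", "ds2", "ds3", "ds4"], replicas=2)
```

Because every chunk has more than one holder, a supervisor that detects a failed data server can look up the affected chunks in this table and copy them from a surviving replica.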

Encryption/Decryption of Data during Write/Read

sys_read

Clear Data

Daoli Encrypting

Data

Encrypted Data

Read Data

Encrypted Data

Encrypted Data

Storage


Large file (700M) Transmission Performance

Server           CPU Utilization          Speed          CPU Utilization/Speed
Original         3.56%                    10.75622M/s    0.330971
+ SSL            15.39%                   3.75903M/s     4.094141
+ SSL + Daoli    19.45%                   3.73893M/s     5.202023

Client           Client CPU Utilization   Speed          CPU Utilization/Speed
Original         20.057%                  10.75622M/s    1.864689
+SSL             52.732%                  3.75903M/s     14.08706

Transmission Performance of Many Small Files (10,000*100KB)

Server           CPU Utilization          Speed          CPU Utilization/Speed
Original         3.56%                    10.75622M/s    0.330971
+ SSL            15.39%                   3.75903M/s     4.094141
+ SSL + Daoli    19.45%                   3.73893M/s     5.202023

Client           Client CPU Utilization   Speed          CPU Utilization/Speed
Original         20.057%                  10.75622M/s    1.864689
+SSL             52.732%                  3.75903M/s     14.08706
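The CPU Utilization/Speed figures appear to be the CPU utilization percentage divided by the transfer speed in M/s. The Python sketch below recomputes that ratio from the reported server rows and the client Original row; the function name `cpu_per_speed` is an assumption, and the inputs are copied from the figures above.

```python
def cpu_per_speed(cpu_percent, speed_m_per_s):
    """CPU cost per unit of throughput: utilization (%) divided by speed (M/s)."""
    return cpu_percent / speed_m_per_s


# (label, CPU utilization %, speed in M/s, ratio reported on the slide)
rows = [
    ("server Original", 3.56, 10.75622, 0.330971),
    ("server + SSL", 15.39, 3.75903, 4.094141),
    ("server + SSL + Daoli", 19.45, 3.73893, 5.202023),
    ("client Original", 20.057, 10.75622, 1.864689),
]

recomputed = {label: cpu_per_speed(cpu, speed) for label, cpu, speed, _ in rows}
```

Read this way, enabling SSL raises the server's CPU cost per unit of throughput by roughly a factor of twelve, and adding Daoli on top raises it further to about 5.2. The client + SSL row is left out because its listed inputs give about 14.03 rather than the reported 14.08706.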


Current Results

Performance Analysis
Lowered access control overhead, improved efficiency
The performance of the file system is as good as HDFS

Process protection with existing technologies (cooperation with EMC)

Trusted computing technology: Chinese TCM standard available
VT-d: already shipped by Intel and AMD in their products
Software can be based on Xen

Carrier has been used to support the Tsinghua storage cloud (Corsair): security modules to be added
Secure LAN prototype available: to be verified
Packet switching based on security policy: under experiment
Verification environment: Tsinghua cloud storage system

MStreaming



MCamera


MFriend



Thanks!

Q&A