NTU Cloud

2010/05/30

System Diagram

Architecture

• Gluster File System
  – Provides a distributed shared file system for migration
• NFS
  – A prototype image storage space

[Diagram: five nodes sharing the Gluster File System, each holding a compute image (C-Img) or a storage image (S-Img); the prototype image (Prototype Img) resides on NFS.]

Architecture

• Prototype Image
  – Original image, e.g. Hadoop, MPI
• Compute Image
  – Modified images for the user
  – Content is not preserved after cluster shutdown
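
The split between the two image types can be sketched as follows. This is a minimal illustration, not the project's actual tooling: the function names, the directory layout, and the assumption that compute images are plain file copies of the prototype are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def create_compute_image(prototype: Path, compute_dir: Path, vm_name: str) -> Path:
    """Copy the prototype image to a per-VM compute image (hypothetical helper)."""
    compute_dir.mkdir(parents=True, exist_ok=True)
    target = compute_dir / f"{vm_name}.img"
    shutil.copyfile(prototype, target)
    return target


def shutdown_cluster(compute_dir: Path) -> None:
    """Discard every compute image; the prototype image is left untouched."""
    shutil.rmtree(compute_dir, ignore_errors=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        prototype = root / "nfs" / "hadoop.img"         # stands in for the NFS store
        prototype.parent.mkdir(parents=True)
        prototype.write_bytes(b"prototype image contents")
        compute_dir = root / "gluster" / "cluster-01"   # stands in for the shared file system
        image = create_compute_image(prototype, compute_dir, "slave-01")
        image.write_bytes(b"user changes")              # the user modifies only the copy
        shutdown_cluster(compute_dir)
        print(prototype.exists(), compute_dir.exists())
```

User changes land only in the copy, so removing the compute directory at shutdown returns the system to the original prototype.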

XEN

• A hypervisor
• Virtualization

Cloud Master

• Monitor system state
• Scheduling
• Use NFS to store Prototype Image
• Web server

OpenNebula

• A middleware
• Provides an interface to manage virtual infrastructure (computation and network)
• VM migration

=> We use OpenNebula to manage VM deployment and migration and to set up virtual local area networks (VLANs).
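
As a rough illustration of what VM deployment and migration through OpenNebula can look like, the sketch below renders a minimal VM template and builds (without running) a live-migration command. The VM name, image path, network name, and IDs are hypothetical, and the template attributes and the `onevm livemigrate` subcommand are assumed to match the OpenNebula release in use.

```python
def render_vm_template(name, cpu, memory_mb, image_path, network):
    """Render a minimal OpenNebula-style VM template as text (illustrative values)."""
    return "\n".join([
        f"NAME   = {name}",
        f"CPU    = {cpu}",
        f"MEMORY = {memory_mb}",
        f'DISK   = [ source = "{image_path}", target = "sda", readonly = "no" ]',
        f'NIC    = [ NETWORK = "{network}" ]',
    ])


def live_migrate_command(vm_id, host_id):
    """Build, but do not execute, the CLI call for a live migration."""
    return ["onevm", "livemigrate", str(vm_id), str(host_id)]


if __name__ == "__main__":
    # All names and paths below are hypothetical placeholders.
    print(render_vm_template("hadoop-slave-01", 1, 1024,
                             "/srv/cloud/images/slave-01.img", "cluster-vlan"))
    print(" ".join(live_migrate_command(42, 3)))
```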

Gluster file system

• User-level distributed file system
• Client/server architecture
• Uses TCP/IP to transfer data

=> We use GlusterFS to build our shared file system environment for VM live migration.

=> Our deployment is "symmetrical": every machine is both a server and a client.

System Flow

Hadoop Benchmark

• Case 1
  – M1: Master + Slave-01 + Slave-02
• Case 2
  – M1: Master
  – M2: Slave-01 + Slave-02
• Case 3
  – M1: Master
  – M3: Slave-01 + Slave-02
• Case 4
  – M1: Master
  – M2: Slave-01
  – M3: Slave-02
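
The four placements can be written down as data, which makes it easy to see how many VMs share each host in every case. The dictionary layout and the helper name are illustrative only.

```python
# VM-to-host placement for each benchmark case, as listed on the slide.
PLACEMENTS = {
    "Case 1": {"Master": "M1", "Slave-01": "M1", "Slave-02": "M1"},
    "Case 2": {"Master": "M1", "Slave-01": "M2", "Slave-02": "M2"},
    "Case 3": {"Master": "M1", "Slave-01": "M3", "Slave-02": "M3"},
    "Case 4": {"Master": "M1", "Slave-01": "M2", "Slave-02": "M3"},
}


def vms_per_host(placement):
    """Group VM names by the host machine they run on."""
    hosts = {}
    for vm, host in placement.items():
        hosts.setdefault(host, []).append(vm)
    return hosts


if __name__ == "__main__":
    for case, placement in PLACEMENTS.items():
        print(case, vms_per_host(placement))
```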

[Figure: Hadoop benchmark run time in seconds (y-axis, 0 to 300) over iterations 1 to 148 (x-axis). Series: all in M1; master-06, slave-01 and slave-02 in M2; master-06, slave-01 in M2, slave-02 in M3; master-06, slave-01 and slave-02 in M3.]

Placement | Sec
All in M1 | 215.45
Slave-01, Slave-02 in M2 | 188
Slave-01, Slave-02 in M3 | 191.86
Slave-01 in M2, Slave-02 in M3 | 139.59
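
To compare the placements, each time can be expressed relative to the all-in-M1 case. This is a small sketch; pairing the four labels with the four values in reading order is an assumption about how the slide was laid out.

```python
# Times in seconds from the slide; the label-to-value pairing is assumed
# to follow reading order.
TIMES_SEC = {
    "All in M1": 215.45,
    "Slave-01, Slave-02 in M2": 188.0,
    "Slave-01, Slave-02 in M3": 191.86,
    "Slave-01 in M2, Slave-02 in M3": 139.59,
}


def relative_to_baseline(times, baseline="All in M1"):
    """Return each time as a fraction of the baseline placement's time."""
    base = times[baseline]
    return {name: seconds / base for name, seconds in times.items()}


if __name__ == "__main__":
    for name, ratio in relative_to_baseline(TIMES_SEC).items():
        print(f"{name}: {TIMES_SEC[name]:.2f} s ({ratio:.1%} of all-in-M1)")
```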

Set 1

VM | Host Machine | VCPU | Mem | Purpose
Set 1.1 Single machine
Master | M1 | 1 | 2.2G | Namenode + Datanode + Jobtracker + Tasktracker
Worker | M1 | 1 | 1.2G | Datanode + Tasktracker
Set 1.2 Two machine



• M1 and M2 have the same CPU and memory size.
• HADOOP_HEAPSIZE=500MB
• mapred.child.java.opts=100MB
• RandomWriter: 10M for 30 maps
• Sort
• HDFS_BYTES_READ=210543161
• HDFS_BYTES_WRITTEN=210541669
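
The two memory settings above would normally live in Hadoop's configuration files. The sketch below generates both fragments; it assumes a Hadoop 0.20-style layout (`hadoop-env.sh` and `mapred-site.xml`) and reads the slide's "100MB" as a `-Xmx100m` JVM option, which is an interpretation rather than something the slide states.

```python
import xml.etree.ElementTree as ET


def hadoop_env_line(heap_mb):
    """Line for hadoop-env.sh; HADOOP_HEAPSIZE is given in megabytes."""
    return f"export HADOOP_HEAPSIZE={heap_mb}"


def mapred_site_xml(child_heap_mb):
    """Build a mapred-site.xml fragment that caps each task JVM's heap."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "mapred.child.java.opts"
    # Assumption: the slide's "100MB" corresponds to a 100 MB maximum heap.
    ET.SubElement(prop, "value").text = f"-Xmx{child_heap_mb}m"
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    print(hadoop_env_line(500))
    print(mapred_site_xml(100))
```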

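Sort

[Figure: Sort run time in seconds (y-axis, 0 to 100) per iteration (x-axis, 1 to 5), single machine vs. two machine.]

Iteration | Single machine (Sec) | Two machine (Sec)
1 | 73 | 69
2 | 73 | 71
3 | 84 | 70
4 | 86 | 68
5 | 72 | 67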

Therefore, putting the two VMs on one machine slows the sort: the two-machine run needs only 88.92% of the single-machine time (two machine / single machine = 69.0 s / 77.6 s = 88.92%).
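
The 88.92% figure can be reproduced from the per-iteration sort times. In this sketch the assignment of the ten chart labels to the two series is inferred from the stated ratio, so treat the lists as an assumption about the chart rather than a record of it.

```python
from statistics import mean

# Sort run times in seconds for iterations 1 to 5; the split of the chart
# labels into two series is inferred from the stated 88.92% ratio.
SINGLE_MACHINE_SEC = [73, 73, 84, 86, 72]
TWO_MACHINE_SEC = [69, 71, 70, 68, 67]


def ratio_percent(numerator_times, denominator_times):
    """Ratio of the two average run times, as a percentage with two decimals."""
    return round(mean(numerator_times) / mean(denominator_times) * 100, 2)


if __name__ == "__main__":
    print(mean(SINGLE_MACHINE_SEC), mean(TWO_MACHINE_SEC))
    print(ratio_percent(TWO_MACHINE_SEC, SINGLE_MACHINE_SEC))
```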

Launched reduce tasks=4

Others=3

Reduce shuffle bytes=203039958

Reduce shuffle bytes=199629523

Exactly the same!

Set 2

VM | Host Machine | VCPU | Mem | Purpose

Set 2.1 Single machine



Set 2.2 Two machine



1. RandomWriter: 10M for 30 maps
2. Sort

• HADOOP_HEAPSIZE=500MB
• mapred.child.java.opts=100MB

RandomWriter

[Figure: RandomWriter run time in seconds (y-axis, 0 to 70) per iteration, single machine vs. two machine, with bars labeled by bytes written. The per-iteration values are listed in the RandomWriter table.]

Therefore, putting the two VMs on one machine slows RandomWriter: the two-machine run needs only 80.70% of the single-machine time (two machine / single machine = 46.00 s / 57.00 s = 80.70%).

RandomWriter

Iteration | Single machine Sec | Single machine HDFS_BYTES_WRITTEN | Two machine Sec | Two machine HDFS_BYTES_WRITTEN
1 | 55 | 210522479 | 49 | 210545910
2 | 55 | 210542022 | 46 | 210549359
3 | 57 | 210548147 | 47 | 210505092
4 | 55 | 210545917 | 44 | 210578791
5 | 63 | 210562466 | 44 | 210508035
Avg. | 57.00 | 210544206.20 | 46.00 | 210537437.40
Avg. on 1,2,4 | 55.00 | 210536806.00 | 46.33 | 210558020.00
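
The averages in the RandomWriter table, including the row restricted to iterations 1, 2 and 4, can be recomputed with a few lines. The helper name is illustrative; the timings are copied from the table.

```python
from statistics import mean

# RandomWriter run times in seconds for iterations 1 to 5, from the table.
SINGLE_SEC = [55, 55, 57, 55, 63]
TWO_SEC = [49, 46, 47, 44, 44]


def average_over(times, iterations=None):
    """Average run time over all iterations, or over the given 1-based subset."""
    if iterations is None:
        return mean(times)
    return mean(times[i - 1] for i in iterations)


if __name__ == "__main__":
    subset = (1, 2, 4)
    print(f"all: {average_over(SINGLE_SEC):.2f} vs {average_over(TWO_SEC):.2f}")
    print(f"1,2,4: {average_over(SINGLE_SEC, subset):.2f} vs {average_over(TWO_SEC, subset):.2f}")
    print(f"ratio: {average_over(TWO_SEC) / average_over(SINGLE_SEC):.2%}")
```

Over all five iterations the ratio of the averages is 46 / 57, which matches the 80.70% quoted on the slide.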

Sort

[Figure: Sort run time in seconds (y-axis, 72 to 90) per iteration (x-axis, 1 to 5). Data labels as extracted: 88, 78, 82, 82, 85, 87, 86, 87, 85, 86.]

Current Progress

• Xen 4.0 is ready on each node.
• We can offer two kinds of images
  – Hadoop
  – MPI
• VMs are started on the destination node automatically.
• The MPI and Hadoop environments are configured for use automatically.
