Computer Cluster Final

Cluster Computer for Bioinformatics Applications. Nile University, Bioinformatics Group. Hisham Adel, 2008.

Page 1: Computer Cluster Final

Cluster Computer for Bioinformatics Applications

Nile University, Bioinformatics Group

Hisham Adel, 2008

Page 2: Computer Cluster Final

Done by: Hisham Adel Hassan

Supervised by: Dr. Mohamed Aboualhouda

Page 3: Computer Cluster Final

Points

• Introduction.
• Cluster and Supercomputers.
• Cluster Types and Advantages.
• Our Cluster.
• Cluster Performance.
• Cluster Computer for Basic Problems.
• General Idea about Sequence Alignment.
• BLAST and Parallel BLAST Algorithm.
• Sequence Alignment and Parallel Sequence Alignment.
• Learned Skills.

Page 4: Computer Cluster Final

Introduction

Page 6: Computer Cluster Final

Cluster Definition

•A cluster is a group of computers and servers, connected together, that act like a single system.

•Each computer in the cluster is called a node.

•A node contains one or more processors, RAM, a hard disk, and a LAN card.

•Nodes work in parallel.

•Performance can be increased by adding more nodes.

Page 10: Computer Cluster Final

Cluster types

•Load-balancing cluster (used for Parallel BLAST).

•Computing cluster (used for parallel sequence alignment).

•High-availability (HA) cluster.

Page 11: Computer Cluster Final

Cluster types: Load Balancing Cluster

[Figure: diagram labelled "Task"]

Page 12: Computer Cluster Final

Cluster types: Computing Cluster

[Figure: diagram labelled "Task"]

Page 13: Computer Cluster Final

Cluster types: High-availability Clusters

Page 14: Computer Cluster Final

Cluster advantages

•Performance.

•Scalability.

•Maintenance.

•Cost.

Page 16: Computer Cluster Final

Our Cluster

[Figure: cluster diagram showing Node 1, Node 2, Node 3, and Node 4 around a switch, with four "Internet" labels]

Page 17: Computer Cluster Final

Our Cluster specification

Communication: 5-port 10/100 Mbps switch.

Processor and RAM:
- Master node: dual-core processor, 1.86 GHz, 1 GB RAM.
- Node 1: Pentium 4, 1 GB RAM.
- Node 2: Pentium 4, 1 GB RAM.
- Node 3: Pentium 4, 512 MB RAM.

Page 18: Computer Cluster Final

Our Cluster specification (cont.)

Operating system: openSUSE 10.3

http://software.opensuse.org/

Message passing: MPICH2

http://www.mcs.anl.gov/research/projects/mpich2/

Page 20: Computer Cluster Final

Cluster performance is affected by:

1- Node speed.

2- The program being run.

Page 21: Computer Cluster Final

Running Program (sequential)

[Animation spanning slides 21 to 24; the only surviving label is "Working…".]

Page 25: Computer Cluster Final

Running Program (Parallel)

[Animation spanning slides 25 to 27: data is sent to three nodes ("Data sent"), the nodes work at the same time ("Working…"), and once they finish ("Finished…") their results are collected ("Get results…").]
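
The send, work, and collect pattern on the parallel slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: worker threads stand in for cluster nodes, the "work" is a toy sum of squares, and the names `split_evenly`, `worker`, and `run_parallel` are hypothetical. The cluster itself used MPICH2 between machines, which this sketch does not reproduce.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for k in range(parts):
        size = base + (1 if k < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def worker(chunk):
    """Toy stand-in for the work a single node does on its share of the data."""
    return sum(x * x for x in chunk)


def run_parallel(items, nodes=3):
    """Send one chunk to each worker, let them run concurrently, then combine the results."""
    chunks = split_evenly(items, nodes)            # "Data sent"
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial = list(pool.map(worker, chunks))   # "Working…" then "Finished…"
    return sum(partial)                            # "Get results…"


print(run_parallel(list(range(10)), nodes=3))
```

Threads are used here only because they need nothing beyond the standard library; on a real cluster each chunk would travel over the network, which is where the communication time comes from.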

Page 29: Computer Cluster Final

Sequence Alignment

Page 30: Computer Cluster Final

Sequence Alignment

Used to:

1- Compare sequences.

2- Search databases.

Page 31: Computer Cluster Final

How to Align Two Sequences

Suppose we have two sequences, AAACGA and AATGA, and let match = 1, mismatch = 0, gap = -1.

They can be aligned as:

1-  A A A C G A
    | | | | | |      Score = 3
    A A T _ G A

2-  A A A C _ G A
    | | | | | | |    Score = 1
    A A _ _ T G A
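
The two scores can be checked with a short Python sketch that adds up the score of each column. The function name `alignment_score` and the use of "_" as the gap character are choices made for this illustration, following the notation on the slide.

```python
def alignment_score(top, bottom, match=1, mismatch=0, gap=-1):
    """Score two already-aligned sequences column by column ("_" marks a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "_" or y == "_":
            total += gap        # one sequence has a gap in this column
        elif x == y:
            total += match      # identical letters
        else:
            total += mismatch   # different letters
    return total


print(alignment_score("AAACGA", "AAT_GA"))    # alignment 1
print(alignment_score("AAAC_GA", "AA__TGA"))  # alignment 2
```

With match = 1, mismatch = 0, and gap = -1, the first call gives 3 and the second gives 1, matching the slide.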

Page 33: Computer Cluster Final

BLAST (Basic Local Alignment Search Tool)

Searching databases

Page 34: Computer Cluster Final

BLAST Algorithm

(High-scoring pairs)

Page 35: Computer Cluster Final

BLAST search types

BLASTN - Compares a nucleotide query sequence against a nucleotide sequence database.

BLASTP - Compares an amino acid query sequence against a protein sequence database.

TBLASTN - Compares a protein query sequence against a nucleotide sequence database.

BLASTX - Compares a nucleotide query sequence against a protein sequence database.

Page 36: Computer Cluster Final

Why do we need BLAST to be parallelized?

Page 37: Computer Cluster Final

Our Program: Parallel BLAST

Page 38: Computer Cluster Final

Parallel BLAST (cont.)

Formatdb.c

Nucleotide sequence database: "formatdb -i DATABASE -p F".

Protein sequence database: "formatdb -i DATABASE -p T".

Page 39: Computer Cluster Final

Parallel BLAST (cont.)

Linux_Cluster_BLASTALL.c

"blastall -p BLAST_SEARCH_TYPE -d DATABASE -i QUERY_FILE -o out.txt"
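
As an illustration of the command lines quoted on these two slides, the Python sketch below builds the argument lists for `formatdb` and `blastall` without running them. The helper names `formatdb_command` and `blastall_command` are hypothetical, the file names in the usage lines are placeholders, and nothing here is taken from the Formatdb.c or Linux_Cluster_BLASTALL.c sources; only the `-i`, `-p`, `-d`, and `-o` options shown on the slides are used.

```python
def formatdb_command(database, protein):
    """Argument list for formatting a database: -p T for protein, -p F for nucleotide."""
    return ["formatdb", "-i", database, "-p", "T" if protein else "F"]


def blastall_command(program, database, query_file, out_file="out.txt"):
    """Argument list for one blastall search with the given search type."""
    allowed = {"blastn", "blastp", "tblastn", "blastx"}  # search types listed on the slides
    if program not in allowed:
        raise ValueError("unsupported BLAST search type: " + program)
    return ["blastall", "-p", program, "-d", database, "-i", query_file, "-o", out_file]


# Placeholder file names, for illustration only.
print(" ".join(formatdb_command("DATABASE", protein=False)))
print(" ".join(blastall_command("blastn", "DATABASE", "QUERY_FILE")))
```

Returning a list rather than a single string keeps each argument separate, so a list like this could be handed to a process launcher without worrying about spaces in file names.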

Page 40: Computer Cluster Final

Results

Average of running 1000 queries, 1000 times.

[Figure: bar chart "Nucleotide-Nucleotide". X axis: Database (size): month.htgs (573 MB), drosoph.nt (118.6 MB), igseqnt (67.5 MB), Yeastnt (3.2 MB), mito.nt (3.2 MB), Pdbnt (1.7 MB). Y axis: Time (s), 0 to 1.8. Series: 1 node; 3 nodes, query time; 3 nodes, query and communication time.]

Page 41: Computer Cluster Final

Results (cont.)

Average of running 1000 queries, 1000 times.

[Figure: bar chart "Amino acid-Amino acid". X axis: Database (size): env_nr (1.6 GB), nr (573 MB), SwissProt (160 MB), Pdbaa (20 MB), Yeast.aa (3.2 MB). Y axis: Time (s), 0 to 90. Series: 1 node, query time; 3 nodes, query time; 3 nodes, query and communication time.]

Page 42: Computer Cluster Final

Results (cont.)

Average of running 1000 queries, 1000 times.

[Figure: bar chart "Amino acid-Nucleotide". X axis: Database (size): env_nr (1.6 GB), Swissprot (160 MB), nr (84.7 MB), Pdbaa (20.4 MB), yeast.aa (3.2 MB). Y axis: Time (s), 0 to 90. Series: 1 node, query time; 3 nodes, query time only; 3 nodes, query and communication time.]

Page 43: Computer Cluster Final

Conclusion about Parallel BLAST

•Performance: better when using the cluster.

•Scalability: adding more nodes decreases the run time.
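
One common way to put a number on these two conclusions is speedup: the one-node time divided by the cluster time, where the cluster time includes communication. The Python sketch below shows the arithmetic. The function names are hypothetical and the times in the usage lines are made-up values chosen for easy arithmetic, not the measurements from the result charts.

```python
def speedup(one_node_time, parallel_query_time, communication_time=0.0):
    """Speedup of the cluster over one node; communication counts toward the cluster time."""
    total = parallel_query_time + communication_time
    if total <= 0:
        raise ValueError("total parallel time must be positive")
    return one_node_time / total


def efficiency(speedup_value, nodes):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup_value / nodes


# Made-up times in seconds, for illustration only.
s = speedup(one_node_time=9.0, parallel_query_time=3.0, communication_time=1.5)
print(s, efficiency(s, nodes=3))
```

In this made-up case the query alone runs three times faster on three nodes, but communication reduces the overall speedup to 2, which is the kind of gap the "query time" and "query and communication time" series make visible.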

Page 45: Computer Cluster Final

Sequence Alignment

Comparing sequences

Page 46: Computer Cluster Final

Sequence Alignment

•Introduction.

•Sequence Alignment Benefits.

•Sequence Alignment Types.

Page 47: Computer Cluster Final

Needleman-Wunsch Algorithm

Page 48: Computer Cluster Final

Why do we need sequence alignment to be parallelized?

Page 49: Computer Cluster Final

Parallel Sequence Alignment algorithm

Page 50: Computer Cluster Final

Our Sequence Alignment Program

•Pairwise alignment.

•Built using the Needleman-Wunsch algorithm.
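
For readers unfamiliar with the algorithm, here is a minimal sequential sketch of Needleman-Wunsch global alignment in Python. It is not the program described in this talk, which was written in C and parallelized with MPICH2. The default scores (match = 1, mismatch = 0, gap = -1) and the "_" gap character follow the earlier alignment example, and the function name `needleman_wunsch` is chosen for this illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=-1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align the two letters
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("_")
            i -= 1
        else:
            out_a.append("_")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("AAACGA", "AATGA")
print(best)
print(top)
print(bottom)
```

For the sequences AAACGA and AATGA the best score is 3. Several alignments can reach that score, so the one printed depends on the order in which the traceback checks its moves.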

Page 51: Computer Cluster Final

Learned Skills

• Using the Linux (openSUSE 10.3) operating system.

• Programming in the C language.

• Cluster computers and how to build one.

• MPICH2 for message passing between nodes.

• LaTeX.

• Teamwork and helping each other.

• Presentation skills.

Page 52: Computer Cluster Final

Thank you for your time.

Hisham Adel