An MPI-IO Cloud Cluster Bioinformatics Summer Project (BDT205) | AWS re:Invent 2013
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
An MPI-IO Cloud Cluster Bioinformatics Summer Project
Brandon Posey, Dougal Ballantyne, Boyd Wilson
November 13, 2013
Filesystems on AWS
What filesystem *MUST* you use on AWS?
The one that meets the unique needs of your application!
Some things to consider:
• Total amount of storage required?
• Resilience required?
• Expected number of clients?
• Locality of servers and clients?
• Average file sizes? (KB, MB, GB, TB)
• Block sizes used by applications?
• IO profile? Read/Write%?
• Typical IO use case?
Filesystems on AWS are all about building blocks!
Building Blocks
• Amazon Elastic Compute Cloud (Amazon EC2)
  – 1ECU to 88ECU of compute power
  – 613MB to 240GB of memory
  – Shared network, EBS optimized, dedicated 10Gb
• Amazon Simple Storage Service (Amazon S3)
  – Unlimited capacity
  – Web-scale
  – Lifecycle management
Building Blocks
• Local storage (ephemeral)
  – 150GB to 3360GB per instance
  – HDD and SSD
  – FREE! (part of instance cost)
• Amazon Elastic Block Store (Amazon EBS)
  – 1GB to 1000GB per volume
  – Standard and Provisioned IOPS
  – Multiple volumes per instance
  – Supports snapshot to Amazon S3
Storage-optimized EC2 instances
http://aws.amazon.com/ec2/instance-types/
"This family includes the HI1 and HS1 instance types, and provides you with Intel Xeon processors and direct-attached storage options optimized for applications with specific disk I/O and storage capacity requirements."
• HI1 instances feature SSD storage
• HS1 instances feature direct-attached HDD
Amazon EBS optimized instances http://aws.amazon.com/ebs/ "To enable your Amazon EC2 instances to fully utilize the IOPS provisioned on an EBS volume, you can launch selected Amazon EC2 instance types as “EBS-Optimized” instances."
What Are Your Needs?
• Temporary or long-term storage?
• Shared or per instance?
• How much?
• How fast?
Long-term storage
• Use Amazon S3
• Pull datasets when needed
• Easy to access using AWS CLI or API
  $ aws s3 cp s3://mybucket/dataset/input /ephemeral/input
• Lifecycle to Amazon Glacier
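The "Lifecycle to Amazon Glacier" bullet can be made concrete with a small sketch. The function below only builds a lifecycle rule as a plain Python dictionary in the shape the S3 lifecycle configuration API accepts. The rule ID, the `dataset/` prefix, and the 30-day threshold are assumptions chosen for illustration, not values from the talk.

```python
import json


def glacier_lifecycle_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,                    # hypothetical rule name
                "Filter": {"Prefix": prefix},     # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Prefix and age are placeholders for illustration only.
    print(json.dumps(glacier_lifecycle_rule("dataset/", 30), indent=2))
```

A dictionary of this shape could be passed as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`. The call itself is left out so the sketch runs without AWS credentials.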
Temporary Storage
• Local ephemeral for scratch
• Distributed filesystem for high-performance scratch
  – OrangeFS
  – Lustre
  – Ceph
• Pull data from Amazon S3
How much?
• With Amazon S3, you pay for what you use
• With Amazon EBS, you pay for what you provision
• Keeping data in Amazon S3 and only pulling what is needed helps manage cost
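The difference between the two billing models is simple arithmetic, sketched below. The per-GB price is a placeholder, not an AWS quote, and the same figure is used for both models only so that the comparison isolates "used" versus "provisioned" capacity.

```python
def monthly_storage_cents(used_gb, provisioned_gb, cents_per_gb, billed_on):
    """Return a monthly storage charge in cents.

    billed_on is "used" (the pay-for-what-you-use model) or
    "provisioned" (the pay-for-what-you-provision model).
    """
    if billed_on == "used":
        return used_gb * cents_per_gb
    if billed_on == "provisioned":
        return provisioned_gb * cents_per_gb
    raise ValueError("billed_on must be 'used' or 'provisioned'")


if __name__ == "__main__":
    PLACEHOLDER_CENTS_PER_GB = 10  # hypothetical price, for illustration only
    by_use = monthly_storage_cents(200, 1000, PLACEHOLDER_CENTS_PER_GB, "used")
    by_provision = monthly_storage_cents(200, 1000, PLACEHOLDER_CENTS_PER_GB, "provisioned")
    print("billed on use:", by_use, "cents")
    print("billed on provisioned size:", by_provision, "cents")
```

With 200GB of data on a 1000GB volume, the provisioned model charges for the full 1000GB, which is why pulling only what is needed from Amazon S3 keeps the provisioned footprint small.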
How fast?
• Ephemeral storage can deliver up to 2.2GB/sec
  – more instances == more throughput
• Amazon EBS volumes support up to 4000 IOPS
  – more volumes == more IOPS
• Amazon S3 scales horizontally
  – more clients == more throughput
  – more connections == more throughput
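The "more volumes == more IOPS" and "more instances == more throughput" rules suggest a rough sizing calculation, sketched below. It uses the per-volume and per-instance ceilings quoted on the slide and assumes ideal linear scaling, so the results are lower bounds on the resources needed rather than a guarantee.

```python
import math

EBS_MAX_IOPS_PER_VOLUME = 4000     # ceiling quoted on the slide
EPHEMERAL_MAX_GB_PER_SEC = 2.2     # ceiling quoted on the slide


def volumes_for_iops(target_iops, per_volume=EBS_MAX_IOPS_PER_VOLUME):
    """Minimum number of EBS volumes to reach target_iops, assuming linear scaling."""
    return math.ceil(target_iops / per_volume)


def instances_for_throughput(target_gb_per_sec, per_instance=EPHEMERAL_MAX_GB_PER_SEC):
    """Minimum number of instances to reach target_gb_per_sec, assuming linear scaling."""
    return math.ceil(target_gb_per_sec / per_instance)


if __name__ == "__main__":
    print("volumes for 10000 IOPS:", volumes_for_iops(10000))
    print("instances for 10 GB/sec:", instances_for_throughput(10))
```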
Making filesystems persist
• Use Amazon EBS for block storage
• Use Amazon EBS snapshots for recovery
• Use a replicated distributed filesystem
Automating deployments
• AWS CloudFormation
• Drive storage through parameters
• Easy to set up and tear down
• Track template changes in SCM
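As an illustration of "drive storage through parameters", the sketch below generates a minimal CloudFormation template in which the size of an EBS volume comes from a stack parameter. The parameter and resource names (`VolumeSizeGiB`, `VolumeAZ`, `ScratchVolume`) are hypothetical and do not come from the presenters' template.

```python
import json


def storage_template():
    """Return a minimal CloudFormation template whose EBS volume size is a parameter."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: EBS volume sized by a stack parameter",
        "Parameters": {
            # Hypothetical parameter names, chosen for this example.
            "VolumeSizeGiB": {"Type": "Number", "Default": 100, "MinValue": 1, "MaxValue": 1000},
            "VolumeAZ": {"Type": "AWS::EC2::AvailabilityZone::Name"},
        },
        "Resources": {
            "ScratchVolume": {
                "Type": "AWS::EC2::Volume",
                "Properties": {
                    "Size": {"Ref": "VolumeSizeGiB"},
                    "AvailabilityZone": {"Ref": "VolumeAZ"},
                },
            }
        },
    }


if __name__ == "__main__":
    # sort_keys gives a stable key order, which keeps diffs small under source control.
    print(json.dumps(storage_template(), indent=2, sort_keys=True))
```

Writing the output to a file and committing it is one way to follow the "track template changes in SCM" advice; changing storage then means changing a parameter value rather than editing resources.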
Solutions on AWS
• OrangeFS from Omnibond
• Red Hat Storage 2.0
• Intel Cloud Edition Lustre - Private Beta
Customer presentation
RNA-Seq Differential Gene Expression Workflow
Dr. Alex Feltus, a professor at Clemson University, had been discussing ways to optimize the Gene Expression Workflow with Eddie Duffy and Dr. Barr Von Oehsen. As a result, a summer project with Brandon Posey was started to carry out this optimization in the AWS cloud. The FastQ steps were the longest processing steps, so the optimization started there.
*Workflow chart provided with permission from Allele Systems (www.allelesystems.com)
OrangeFS – Scalable Parallel File System on AWS
Available on the AWS Marketplace and brought to you by Omnibond
[Diagram labels: OrangeFS Instance, Unified High Performance File System, Amazon DynamoDB, Amazon EBS volumes]
Cloud Cluster Built using AWS, Torque/Maui, OrangeFS
Optimization Areas
• Data uploaded and retrieved via the OrangeFS WebDAV interface
• MPI jobs are submitted via the Torque & Maui scheduler
• All built with an AWS CloudFormation template
[Diagram labels: OrangeFS WebDAV, Torque / Maui, MPI-IO Clients, OrangeFS Servers, Amazon DynamoDB]
AWS CloudFormation Prompts
"KeyName" : {
"VpcId" : {
"VpcPublicSubnetId" : {
"NAT & OrangeFS… AccessFrom" : {
"FSConfigDDB" : {…
"WorkerConfigDDB" : {…
"Type" : "AWS::DynamoDB::Table",
"CfnUser" : { ….
"Type" : "AWS::IAM::User",…
AWS CloudFormation – Amazon DynamoDB
"FSConfigDDB" : {
"Type" : "AWS::DynamoDB::Table",
…
"WorkerConfigDDB" : {
"Type" : "AWS::DynamoDB::Table",
…
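The slide elides the table properties, so the following is only a sketch of what a complete `AWS::DynamoDB::Table` resource can look like, not a reconstruction of `FSConfigDDB` or `WorkerConfigDDB`. The hash key name `NodeId` and the throughput values are assumptions, and the property layout follows current CloudFormation documentation, which may differ from what the 2013 template used.

```python
import json


def dynamodb_table_resource(hash_key, read_units=5, write_units=5):
    """Return a CloudFormation resource for a table with a single string hash key."""
    return {
        "Type": "AWS::DynamoDB::Table",
        "Properties": {
            "AttributeDefinitions": [
                {"AttributeName": hash_key, "AttributeType": "S"},
            ],
            "KeySchema": [
                {"AttributeName": hash_key, "KeyType": "HASH"},
            ],
            "ProvisionedThroughput": {
                "ReadCapacityUnits": read_units,
                "WriteCapacityUnits": write_units,
            },
        },
    }


if __name__ == "__main__":
    # "ExampleConfigTable" and "NodeId" are hypothetical names for illustration.
    resources = {"ExampleConfigTable": dynamodb_table_resource("NodeId")}
    print(json.dumps({"Resources": resources}, indent=2))
```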
AWS CloudFormation – IAM & Network
"instanceRootRole" : {
"instanceRootProfile" : {
"HostKeys" : {
"PrivateSubnet" : {
"PrivateRouteTable" : {
"PrivateSubnetRouteTableAssociation" : {
"PrivateNetworkAcl" : {
"NATIPAddress" : {…
"Type" : "AWS::EC2::EIP",
AWS CloudFormation – Instances
"NATDevice" : {…
"Type" : "AWS::EC2::Instance",
"MasterCoordinator" : {…
"Type" : "AWS::EC2::Instance",
"OrangeFSFleet" : {…
"Type" : "AWS::AutoScaling::AutoScalingGroup",
"WorkerFleet" : {…
"Type" : "AWS::AutoScaling::AutoScalingGroup",
"WebDavDevice" : {…
"Type" : "AWS::EC2::Instance",
AWS CloudFormation – Cloud Init (Python & Boto)
"sudo /usr/bin/python2.7 /home/ec2-user/TorqueMasterConfigure.py -l DEBUG -f /home/ec2-user/MasterConfig.log",
" -n ", {"Ref" : "WorkerConfigDDB"},
" -o ", {"Ref" : "FSConfigDDB"},
" -s ", {"Fn::FindInMap" : [ "ConfigParameters", "OrangeFSFleetSize", "item"]},
" -z ", {"Fn::FindInMap" : [ "ConfigParameters", "WorkerFleetSize", "item"]},
" -m ", {"Fn::FindInMap" : [ "ConfigParameters", "WorkerMaxFleetSize", "item"]},
" -p ", {"Fn::FindInMap" : [ "ConfigParameters", "OrangeFSPort", "item"]},
" -a ", {"Fn::FindInMap" : [ "ConfigParameters", "FSName", "item"]},
" -d ", {"Fn::FindInMap" : [ "ConfigParameters", "FSID", "item"]},
"\n",
Demo • Spin up a cluster on AWS live
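The cloud-init command on the preceding slide passes ten single-letter flags to TorqueMasterConfigure.py, whose source is not shown in the talk. The sketch below is a hypothetical stand-in showing how such a script might parse them with `argparse`. The flag letters come from the slide; the meaning of each flag is inferred from the template value passed to it, and the destination names are assumptions.

```python
import argparse


def build_parser():
    """Parser for the flags seen in the cloud-init command (meanings are inferred)."""
    p = argparse.ArgumentParser(description="Hypothetical stand-in for TorqueMasterConfigure.py")
    p.add_argument("-l", dest="log_level", default="INFO")         # e.g. DEBUG
    p.add_argument("-f", dest="log_file")                          # log file path
    p.add_argument("-n", dest="worker_table")                      # Ref WorkerConfigDDB
    p.add_argument("-o", dest="fs_table")                          # Ref FSConfigDDB
    p.add_argument("-s", dest="fs_fleet_size", type=int)           # OrangeFSFleetSize
    p.add_argument("-z", dest="worker_fleet_size", type=int)       # WorkerFleetSize
    p.add_argument("-m", dest="worker_max_fleet_size", type=int)   # WorkerMaxFleetSize
    p.add_argument("-p", dest="fs_port", type=int)                 # OrangeFSPort
    p.add_argument("-a", dest="fs_name")                           # FSName
    p.add_argument("-d", dest="fs_id")                             # FSID
    return p


if __name__ == "__main__":
    # All values below are placeholders, not settings from the talk.
    argv = "-l DEBUG -f /tmp/MasterConfig.log -n workers -o fsconfig -s 4 -z 8 -m 16 -p 3334 -a orangefs -d 1"
    print(vars(build_parser().parse_args(argv.split())))
```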
RNA-Seq Differential Gene Expression Workflow
Optimization Areas
• FastQ Splitter rewritten in MPI-IO to leverage OrangeFS in AWS
• Merge-FastQ also rewritten in MPI-IO to leverage OrangeFS in AWS
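The talk does not show the splitter's source, so the following is only a sketch of one way a parallel split can be planned: give each rank a contiguous byte range that begins and ends on a FASTQ record boundary. It assumes exactly four lines per record and indexes the file in a single pass, with a loop over rank numbers standing in for MPI ranks. The function names are hypothetical.

```python
def record_offsets(data):
    """Return the byte offset of every FASTQ record, assuming four lines per record.

    Lines are counted rather than searched for '@', because a quality line
    may itself begin with '@'.
    """
    offsets = []
    pos = 0
    line_no = 0
    while pos < len(data):
        if line_no % 4 == 0:
            offsets.append(pos)
        newline = data.find(b"\n", pos)
        pos = len(data) if newline == -1 else newline + 1
        line_no += 1
    return offsets


def plan_chunks(data, n_ranks):
    """Return one (offset, length) pair per rank, each aligned to record boundaries."""
    starts = record_offsets(data)
    n = len(starts)
    chunks = []
    for rank in range(n_ranks):
        first = rank * n // n_ranks          # index of this rank's first record
        last = (rank + 1) * n // n_ranks     # index one past its last record
        begin = starts[first] if first < n else len(data)
        end = starts[last] if last < n else len(data)
        chunks.append((begin, end - begin))
    return chunks


if __name__ == "__main__":
    # Synthetic two-record sample; the second quality line starts with '@'.
    sample = b"@read1\nACGTN\n+\nCCCF#\n@read2\nGGCAT\n+\n@@@FF\n"
    for rank, (offset, length) in enumerate(plan_chunks(sample, 2)):
        print(rank, offset, length)
```

In an MPI-IO program, each rank could read its own range with a call such as `MPI_File_read_at` and write it to a separate output file, so that no rank has to read the whole input.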
Genomics – Data
@@@FFF=BFHFDHCCDECJHIIIHG@GEEGAGEHFDHDHGIF@FGDEBFGIIGG=CGFGCDCEGHFEEECEBADBB?BCCCC<5:>@CCCA<9>C@A@ACB
@HWI-ST1097:170:C1LBBACXX:6:1101:1379:2208 1:N:0:CGATGT
CCTGTTATTGCCTCAAACTTCCGTGGCCTAAAACGCCAAAGTCCCCCTAAGAAGATAGCTGCGGGGGGGTGGCTCCGCCTAGCTAGTTAGGAAGCTGAGGG
+
CCCFFFFFHHHHHJJJJJJJJJJFAC8A*1?E#####################################################################
@HWI-ST1097:170:C1LBBACXX:6:1101:1582:2059 1:N:0:CGATGT
GTATTGTCATAAGCAGTTAAAGCTGATGTGCGCCTGTCATGTAATGCTGTAGAAACAAGCTCAGCAAGCTGCTGCTTTTGTGTTCTTGCACCGGAGNTCTT
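The sample above is in FASTQ format, where each read occupies four lines: a header starting with `@`, the base sequence, a separator line starting with `+`, and a quality string of the same length as the sequence. A minimal parser for that layout might look like the sketch below. It assumes strictly four lines per record and uses a short synthetic sample rather than the reads on the slide.

```python
def parse_fastq(text):
    """Yield (header, sequence, quality) tuples from four-line FASTQ records."""
    lines = text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of four lines")
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError("record header must start with '@'")
        if not separator.startswith("+"):
            raise ValueError("third line of a record must start with '+'")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], sequence, quality


if __name__ == "__main__":
    # Synthetic sample for illustration; not data from the talk.
    sample = "@read1 1:N:0:CGATGT\nACGTN\n+\nCCCF#\n@read2 1:N:0:CGATGT\nGGCAT\n+\n@@@FF\n"
    for header, sequence, quality in parse_fastq(sample):
        print(header, len(sequence))
```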
Torque/Maui Job
#!/bin/bash
#PBS -l nodes=4
#PBS -l walltime=4:00:00
#PBS -j oe
#PBS -q batch
#PBS -N AWS
cd /mnt/orangefs
mpirun /usr/local/bin/concat -p '/mnt/orangefs/Sample_Feltus1_L006_R2.cat.fastq.*' -o Combined.fastq >> /mnt/orangefs/Results.txt
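The source of the `concat` program is not shown, so the following is only a sketch of the general idea behind a parallel merge: each part's position in the output is the sum of the sizes of the parts before it, so every part can be written at its own offset independently of the others. Plain file I/O and a loop stand in for MPI ranks here, and the function names are hypothetical.

```python
import os


def merge_offsets(sizes):
    """Exclusive prefix sum: the output offset at which each part begins."""
    offsets = []
    total = 0
    for size in sizes:
        offsets.append(total)
        total += size
    return offsets


def merge_parts(part_paths, out_path):
    """Concatenate parts into out_path by writing each one at its computed offset."""
    sizes = [os.path.getsize(path) for path in part_paths]
    offsets = merge_offsets(sizes)
    # Create the output at its final size so writes may happen in any order.
    with open(out_path, "wb") as out:
        out.truncate(sum(sizes))
    # Each iteration stands in for one rank; reversed order shows order independence.
    for path, offset in reversed(list(zip(part_paths, offsets))):
        with open(path, "rb") as src, open(out_path, "r+b") as out:
            out.seek(offset)
            out.write(src.read())   # reads the whole part into memory; fine for a sketch


if __name__ == "__main__":
    print(merge_offsets([4, 3, 2]))
```

In an MPI program the offsets could be obtained with a collective such as `MPI_Exscan` and the writes issued with `MPI_File_write_at`, which is where a parallel file system can serve all ranks at once.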
[Chart: FastQ Splitter Time (seconds). Instance types: m1.xlarge, m3.xlarge, cc2.8xlarge, with series Read Input, Transfer, and Write Output on a 0 to 100 second axis; the Old Method is shown on a separate 0 to 4000 second axis.]
[Chart: FastQ Merge Time (seconds). Instance types: m1.xlarge, m3.xlarge, cc2.8xlarge, with series Merge Time on a 0 to 120 second axis; the Old Method is shown on a separate 0 to 2500 second axis.]
Demo
• Torque/Maui job on the cluster that was spun up
More Info
• AWS Marketplace…
  – OrangeFS Community Edition
  – OrangeFS Advanced Edition
• Community… Orangefs.org
• Pipeline – Allele Systems… allelesystems.com