Facilitating Breakthroughs With Next-Generation HPC



Business need

To answer increasingly complex questions in science and other disciplines, ASU researchers needed a faster and easier way to analyze any type of big data with any model.

Solution

ASU created a holistic First Generation Data Science Research Instrument. It uses Dell and Intel technologies to support high-performance computing, big data, parallel processing and tiered storage.

Benefits

• Supports “big picture” analysis that was previously impossible
• Speeds research by weeks with 30 teraflops and 83% faster provisioning
• Reduces high-performance storage costs by 70 percent
• Combines HPC and Hadoop in a unified fabric for next-generation data science
• Helps ASU remain a top research institution in the 21st century

Solutions at a glance

• Big Data
• Data Center Networking
• High-Performance Computing
• Infrastructure Consulting
• Storage

Arizona State University unleashes research potential with a 30-teraflop Dell cluster that supports big data, HPC and massively parallel processing, and saves money

Customer profile

Company: Arizona State University (ASU)
Industry: Education
Country: United States
Employees: 8,150
Website: www.asu.edu



Historically, researchers and educators have approached challenges from a departmental perspective. However, to solve the extremely complex issues of the 21st century, multidisciplinary researchers are increasingly collaborating to get answers from multifaceted data models that include more detail and perspectives. Facilitating such collaboration requires next-generation, high-performance computing (HPC) solutions that can easily analyze more data, regardless of type or origin.

Arizona State University (ASU) is a tier-one research institution and a national space-grant institution in Phoenix. ASU is the largest public university in the United States by enrollment. It’s also ranked among the nation’s top 25 research institutes in terms of research output, innovation, development, research expenditures, patents and research-grant proposals. ASU’s mission is to create a “New American University” that encourages more collaborative teaching and research through an approach it calls intellectual fusion. University researchers are currently applying this trans-disciplinary approach to help solve complex problems, such as who is predisposed to certain types of cancer and what causes obesity.

A new way to solve more complex problems

To stay at the forefront of scientific thinking, ASU needed a more flexible HPC cluster that could support increased multidisciplinary collaboration, as well as larger, more complex data models. Sethuraman Panchanathan, senior vice president of the Office of Knowledge Enterprise Development at ASU, says, “Researchers not only approach a problem with a hypothesis and then look to prove it by collecting and analyzing data that supports the theory, but also look at massive amounts of data and identify patterns which reveal potential issues that can further be solved by analyzing the data.”

When it comes to HPC, one cluster type does not fit all

ASU wanted to update its HPC cluster so that it could easily support structured and unstructured data, diverse biomedical genomics tools and platforms, as well as any kind of algorithm design (serial, parallel or in-memory). Ken Buetow, Computational Science and Informatics program director and professor in the School of Life Sciences at ASU, says, “We need a research data science tool set that allows investigators to not have to bend their problems and research to meet available computing resources. Instead, we should have an HPC solution that can readily support the hardware, software and people needed to take on and solve some of today’s most vexing problems.”

Products & Services

Services

• Dell Infrastructure Consulting
• Dell Managed Services
• Dell ProSupport Plus

Hardware

• Dell Compellent SC8000 Storage Center controllers
• Dell Networking S4810 switches
• Dell Networking Z9000 switches with Intel® Xeon® processors
• Dell PowerEdge M1000e blade enclosure
• Dell PowerEdge R720xd servers with Intel Xeon E5-2670 processors

Software

• Cloudera® Apache™ Hadoop® 5 (CDH5)
• Bright Cluster Manager 7

Getting answers in less time with 30 teraflops and 83% faster provisioning

ASU turned to Dell and Intel to expand its HPC cluster into what’s called the First Generation Data Science Research Instrument, or Next Generation Cyber Capability (NGCC). Gordon Wishon, chief information officer at ASU, says, “We’ve had long-standing, deep relationships with Dell and Intel for many years, so we engaged Dell Consulting Services to help us design and construct the NGCC and its networking infrastructure.”

The NGCC delivers 29.98 teraflops of sustained performance for HPC, big data and massively parallel (or transactional) processing with 150 nodes and 2,400 cores. The HPC side of the NGCC includes 100 Dell PowerEdge M620 servers with Intel® Xeon® E5-2660 processors and 1,360 cores. The transactional side includes 20 Dell PowerEdge M420 servers, each with Intel Xeon E5-2430 processors, 400GB of solid state drives and 64GB of memory, that connect to 275TB of Terascala® storage running the Lustre® file system. The M420 and M620 servers reside in Dell PowerEdge M1000e blade enclosures. The cluster’s big data resource delivers 88 sockets, 704 cores and 6TB of memory with 44 Dell PowerEdge R720xd rack servers. Each server features Intel® Xeon® E5-2640 processors and the Cloudera® Hadoop® distribution.

IT staff can provision resources faster today because they use automated, template-based tools in Bright Cluster Manager. Jay Etchings, director of operations, research computing at ASU, says, “It used to take us two to three hours to deploy resources for a new project. With our Dell HPC cluster and tools like Bright Cluster Manager, we can stand up a new environment in 30 minutes.”
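As a rough check on the 83% provisioning figure, the arithmetic can be sketched as follows. The sketch assumes the figure compares the upper end of the quoted range (three hours) with the new 30-minute time; the case study does not say how the number was derived, and the function name is illustrative.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Return the percentage by which elapsed time shrank."""
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return (before_minutes - after_minutes) / before_minutes * 100.0


AFTER_MINUTES = 30  # "we can stand up a new environment in 30 minutes"

# The quote gives a range of two to three hours for the old process.
for before_hours in (2, 3):
    reduction = percent_reduction(before_hours * 60, AFTER_MINUTES)
    print(f"{before_hours} h -> {AFTER_MINUTES} min: {reduction:.1f}% less time")
# 2 h -> 30 min: 75.0% less time
# 3 h -> 30 min: 83.3% less time
```

Under that assumption, the three-hour case reproduces the quoted 83%, while the two-hour case gives a 75% reduction.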

Increasing storage flexibility while decreasing costs by 70 percent

To give researchers 2PB of highly elastic storage, ASU deployed two Dell Compellent arrays. Each 1PB array features dual SC8000 controllers that engineers configured in active/active high-availability configurations. Johnathan Lee, lead server administrator of HPC and Hadoop at ASU, says, “Data sets may be 100TB or more, so being able to quickly store a copy saves time because two people can work on the same data simultaneously and we avoid having to re-download the data if one image is corrupted. Another nice thing about the Dell Compellent SC8000 is that it supports tiering.” ASU stores data that needs to be readily available on 10,000 RPM drives, while inactive files reside on 7,200 RPM drives. “We can reduce disk costs by 70 percent with the Dell Compellent SC8000s because we can store active files on faster disks and cold data on more affordable disks,” says Etchings.
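The tiering idea described above can be illustrated with a small sketch. It is not how the Compellent SC8000 implements tiering: the 30-day activity window, the DataSet class and the choose_tier function are all assumptions made for illustration. Only the two drive classes come from the case study.

```python
from dataclasses import dataclass

# Drive classes named in the case study.
FAST_TIER = "10,000 RPM"      # data that needs to be readily available
CAPACITY_TIER = "7,200 RPM"   # inactive ("cold") files


@dataclass
class DataSet:
    """Hypothetical record describing a stored data set."""
    name: str
    days_since_last_access: int


def choose_tier(dataset: DataSet, active_window_days: int = 30) -> str:
    """Pick a tier from access recency.

    The 30-day default is an assumed threshold, not a documented setting.
    """
    if dataset.days_since_last_access <= active_window_days:
        return FAST_TIER
    return CAPACITY_TIER


if __name__ == "__main__":
    for ds in (DataSet("active-genomes", 2), DataSet("archived-run", 400)):
        print(f"{ds.name}: {choose_tier(ds)}")
```

The cost saving in such a scheme comes from keeping only recently used data on the faster, more expensive disks.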

Commenting on the NGCC’s design, Panchanathan says, “We created more than just a number-crunching machine. We built an HPC cluster that can assimilate and analyze massive amounts of different kinds of data. Our collaboration with Dell is so exciting because it doesn’t just say, ‘Here’s all of this hardware and good luck.’ Instead, our relationship with Dell and Intel is highly synergistic, which makes it easier to evolve our architectures, algorithms and approaches — and to realize outcomes that we would like to accomplish.”


Moving data around the cluster and between networks faster

To further accelerate performance, ASU deployed a network based on a leaf/spine design and the OpenFlow protocol for supercomputing infrastructure applications. The spine includes four 40GbE Dell Networking Z9000 switches with Intel Xeon processors. Engineers connect servers to the fabric with twenty 10GbE Dell Networking S4810 leaf switches. “We chose the Dell Z9000 switches for their large buffers and ultra-low latency,” says Etchings. “With them, we can easily download 300TB genomic datasets over the 100G Internet2 connections supporting greater educational network collaborative efforts.” Engineers connect Dell Compellent arrays to the fabric with 16Gb Fibre Channel switches, and they use 56Gb Mellanox X3 InfiniBand® switches for the Terascala storage.

Moving data between cluster resources is also seamless. “The NGCC is unique because it uses the same data architecture for HPC, big data, transactional and storage capabilities, so we don’t have to wait hours or days to move data using complex extract, transform and load processes,” says Lee.
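For a sense of scale, the sketch below estimates how long a 300TB download would take at the link speeds named above. It assumes decimal units (1TB = 10^12 bytes) and ideal line-rate throughput by default. The efficiency parameter is a hypothetical knob for protocol overhead, not a value measured at ASU.

```python
def transfer_hours(size_terabytes: float, link_gbps: float,
                   efficiency: float = 1.0) -> float:
    """Estimate transfer time in hours for a data set over a network link.

    size_terabytes: data set size in decimal terabytes (10**12 bytes).
    link_gbps: nominal link speed in gigabits per second.
    efficiency: assumed fraction of line rate actually achieved (0 < e <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for gbps in (100, 10):
        hours = transfer_hours(300, gbps)
        print(f"300 TB at {gbps} Gbps: {hours:.1f} hours")
    # 300 TB at 100 Gbps: 6.7 hours
    # 300 TB at 10 Gbps: 66.7 hours
```

Real transfers will take longer than these line-rate figures, since storage throughput and protocol overhead also limit the achievable rate.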

University boosts IT staff efficiency and cost savings with Dell Services

To give IT staff extra support, ASU engaged Dell Managed Services to work on-site for one year. Consultants help manage NGCC and provide on-the-job knowledge transfer. “We save money and time by working with Dell Managed Services because we can rapidly augment our staff with subject matter experts,” explains Etchings. “It would have taken us significant effort to try and find similar resources on our own.” IT personnel also boost efficiency and uptime by backing all components with Dell ProSupport Plus. Etchings says, “In HPC computing, it’s not uncommon for hardware to fail. If we need replacement parts, Dell ProSupport delivers them by the next business day.”

Produces answers days or weeks faster, and spurs breakthroughs

Already, researchers are achieving more. For example, David Nolin, assistant research scientist at ASU, says, “Here at ASU we’re researching causes and consequences of obesity by applying big data methods to healthcare records. Using the NGCC cluster will allow us to greatly reduce the processing time it takes us to do these analyses and make medical breakthroughs that we would not otherwise be able to make.”

Another group at ASU is using the NGCC to understand certain types of cancer by analyzing patients’ genetic sequences and mutations. Sheetal Shetty, postdoctoral scholar in Biomedical Informatics at ASU, says, “This HPC cluster at ASU is immensely greater and more powerful than any of the clusters I’ve used. I can analyze data from 40 patients in about two hours. The same analysis on a lower-end cluster would take two days. With the kind of computing power we have with Dell and Intel technologies, I’m no longer limited to looking at only some types or parts of data, which makes it very difficult to see patterns. Now, I can see the whole picture and make sense of it to get answers faster.”

Dell, the Dell logo, Compellent, PowerEdge and ProSupport, are trademarks of Dell Inc. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell disclaims any proprietary interest in the marks and names of others. Availability and terms of Dell Software, Solutions and Services vary by region. This case study is for informational purposes only. Dell makes no warranties – express or implied—in this case study. Reference Number: 10021044 © April 2015, Dell Inc. All Rights Reserved

View all Dell case studies at Dell.com/CustomerStories


For more information go to www.DellHPCSolutions.com