Think Big Data, think Amazon Web Services

Solution Overview | July 2013

In today’s competitive business environment, effectively leveraging large-scale data analysis can be critical for business success. Data can drive insights into customer sentiment, inform better business decisions, and ultimately increase the bottom line.

But while these benefits are compelling, not all organizations are taking full advantage. In fact, while the data generated by companies, governments and individuals has increased exponentially over the last few decades, research from Gartner¹ and IDC² shows that the data being analyzed by companies has remained virtually static. The result is an ever-widening gap between raw data and available business insight.


By eliminating the traditionally high ‘cost of entry’ for big data analysis, Amazon Web Services is enabling customers to convert more raw data into more valuable business insight, says James Brown, Business Development Manager at AWS.

1. Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
2. IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares

Why is this happening?

Pure economics. Until recently, it has been far cheaper and easier to produce and store data than to process it. Companies have the option to buy large, multi-million-dollar hardware devices to process their big data, but with budgets under pressure and no clear way to calculate return on investment, the business case is difficult to make. And when analysts explain to the CTO that they only need the hardware for six months, for one day a week, or for one hour a day, the investment becomes impossible to justify.

So what’s the answer?

Big data in the cloud gives companies affordable access to computing resources for the first time, providing compute, storage and network resources on demand. As a result, your hardware and procurement budgets no longer limit your ability to analyze big data and extract business insight.


Why choose the Amazon Web Services (AWS) cloud for big data analysis?

Our on-demand infrastructure services, and our pay-per-use model, make big data analysis accessible to even the smallest startups. For example, you could spin up 1,000 servers for two hours, turn them off when you no longer need them, and pay only for the time you used.
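To make the spin-up-and-tear-down pattern concrete, here is a minimal sketch (our illustration, not from the original document) using the boto3 Python SDK; the AMI ID, instance type and region are hypothetical placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a fleet of instances for a short-lived analysis job.
fleet = ec2.run_instances(
    ImageId="ami-12345678",   # placeholder AMI
    InstanceType="m1.large",  # placeholder instance type
    MinCount=1000,
    MaxCount=1000,
)
instance_ids = [i["InstanceId"] for i in fleet["Instances"]]

# ... run the two-hour analysis job ...

# Terminate the fleet; from this point, compute charges stop.
ec2.terminate_instances(InstanceIds=instance_ids)
```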

As well as vastly reducing infrastructure costs and increasing flexibility, AWS reduces the ‘cost of entry’ for big data in other ways. For example, our end-to-end portfolio of big data solutions helps you minimize manual deployment and administration tasks, reduce the need for costly specialist skills, and deploy the full range of big data workloads in the cloud quickly and inexpensively.

AWS big data services also increase your business agility, as you can add new services on demand to support your emerging big data needs, from Hadoop and NoSQL to hosted OLTP and data warehousing solutions and low-cost data archiving.



How can I start doing big data analysis in the cloud with AWS?

It’s simple. Our big data services cover every step of the big data journey, from getting your data onto the cloud, to analysis, storage and archiving. In the following sections, we explain how our services help you distill business insight from your big data at the right cost for your business.


Step 01: Getting your big data onto the cloud

In the past, storing big data meant estimating just how much data was involved and buying new hardware to store it on. Now, with Amazon Simple Storage Service (Amazon S3), you can scale storage up and down on demand and pay only for what you use, which means even the smallest companies can do big data analysis.
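As an illustrative sketch (ours, not the document’s), uploading a dataset to Amazon S3 with the boto3 Python SDK; the bucket and file names are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Create a bucket and upload a raw dataset. There is no capacity to
# pre-provision: you are billed only for the bytes actually stored.
s3.create_bucket(Bucket="example-big-data-bucket")  # placeholder name
s3.upload_file(
    "clickstream-2013-07.csv",      # local file (placeholder)
    "example-big-data-bucket",
    "raw/clickstream-2013-07.csv",  # object key in the bucket
)
```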

To get your data to us in the first place, we also offer a range of cost-effective solutions. These include AWS Direct Connect, which provides a dedicated network connection from your premises to AWS, and AWS Import/Export, which transfers your data directly onto and off your storage devices using Amazon’s high-speed internal network.

If you have less data to move, that’s fine too. You can upload it to AWS quickly and easily over the internet via a secure VPN connection, or use AWS Storage Gateway, an on-premises software appliance that securely integrates your on-premises IT environment with AWS’s cloud-based storage. Either way, your data ends up stored securely in the AWS cloud, ready for big data analysis.


Step 02: Analyzing your big data

It used to be the same story for compute resources as for storage: you bought servers up front to process your data, which meant significant up-front investment with no guaranteed financial returns. There was also a risk that expensive hardware would stand idle at times of low demand. Compute resources available on AWS have changed all this. You can speed up your analysis projects by distributing workloads across ten, a hundred, or even thousands of additional nodes, and pay only for the resources you use, when you use them.

One great example of how AWS services are helping customers to gain valuable business insight is Amazon Redshift, our hosted data warehousing solution. Amazon Redshift, which is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more, dramatically reduces the costs associated with building, deploying and managing a data warehouse. In fact, it costs less than $1,000 per terabyte per year, approximately one-tenth of the cost of most traditional data warehousing solutions.
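That $1,000 per terabyte per year, against the roughly $10,000-plus per terabyte typical of traditional warehouses, is the source of the 90% figure quoted below. As a hedged sketch of provisioning a cluster (our example, using the boto3 Python SDK; all identifiers, credentials and the node type are placeholders):

```python
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Provision a small two-node warehouse cluster. It can be resized
# later with modify_cluster instead of buying bigger hardware.
redshift.create_cluster(
    ClusterIdentifier="example-dw",        # placeholder
    NodeType="dw.hs1.xlarge",              # placeholder node type
    NumberOfNodes=2,
    MasterUsername="admin",
    MasterUserPassword="ExamplePassw0rd",  # placeholder credential
)
```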

If you need to process vast quantities of data but can’t afford to wait around for the results, we offer the Amazon Elastic MapReduce (Amazon EMR) service. This hosted Hadoop platform distributes processing workloads across multiple nodes simultaneously, reducing processing times by orders of magnitude.
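A minimal sketch of launching a transient Amazon EMR cluster with boto3 (our illustration; the bucket names, scripts, release label and instance sizes are placeholders):

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# A transient cluster: it runs one Hadoop streaming step over data in
# S3, then terminates itself so no idle nodes accrue charges.
emr.run_job_flow(
    Name="example-sentiment-analysis",
    ReleaseLabel="emr-5.36.0",
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 10,
        "KeepJobFlowAliveWhenNoSteps": False,  # shut down when done
    },
    Steps=[{
        "Name": "analyze-tweets",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                "-input", "s3://example-bucket/tweets/",
                "-output", "s3://example-bucket/sentiment/",
                "-mapper", "mapper.py",
                "-reducer", "reducer.py",
            ],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```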

Coupled with the vast infrastructure resources of AWS, Amazon EMR gives you the power to run analyses on vast datasets in seconds, recommend products and services to end customers based on what other customers bought, and gain insight into customer sentiment through analysis of social media sites. It also speeds up processor-intensive operations, such as transcoding movies for different viewing devices, running complex climate simulations, or rendering animations.

Best of all, with Amazon EMR, you can eliminate the costs of hosting, configuring and maintaining Hadoop infrastructure in house. The service has provided significant efficiency and cost benefits for S&P Capital IQ, a major provider of data, research and analytics to institutional investors and investment advisors. Jeff Sternberg, Chief Data Scientist at S&P Capital IQ, says, “We were very interested in getting into Hadoop, and AWS provides Amazon Elastic MapReduce, which allowed us to do that very easily... We don’t have to go through a long ramp-up time to bring hardware in-house, or go through a purchasing cycle, so we can get going really quickly with new ideas.”

90%: Approximate saving using Amazon Redshift compared with traditional data warehousing solutions.


If you have to handle hundreds of thousands of writes per second in the cloud to cope with seasonal orders or major events, we can help you reduce the cost of doing so. Our hosted NoSQL environment, Amazon DynamoDB, which supports up to half a million writes per second, powered a successful advertising campaign for top mobile media app Shazam at the 2012 Super Bowl. This innovative approach to television allowed viewers to gain access to additional information during the show, such as live statistics, as well as access to MP3s and even the chance to win a new car. Jason Titus, Shazam’s CTO, says, “AWS gave us the flexibility to bring a massive amount of capacity online in a short period of time and allowed us to do so in an operationally straightforward way. AWS is now Shazam’s cloud provider of choice.”
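As a hedged illustration of the write path (the table, key and attribute names are our placeholders), using boto3’s batch writer, which buffers and retries puts automatically:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("example-events")  # placeholder table name

# Stream event writes into the table. Provisioned write throughput can
# be raised ahead of a big event and lowered again once it is over.
with table.batch_writer() as batch:
    for i in range(100_000):
        batch.put_item(Item={
            "event_id": str(i),           # partition key (placeholder)
            "campaign": "superbowl-2012",
        })
```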

For companies that need to operate and scale relational databases, we have developed a relational database in the cloud, Amazon Relational Database Service (Amazon RDS). This solution provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business.

Amazon RDS gives you access to the capabilities of a familiar MySQL, Oracle or Microsoft SQL Server database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS. Amazon RDS automatically patches the database software and, if you wish, backs up your database, storing the backups for a user-defined retention period and enabling point-in-time recovery. You also benefit from the flexibility of being able to scale the compute resources or storage capacity associated with your database instance via a single API call.
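A sketch of both operations with boto3 (our example; the identifiers, instance classes and credentials are placeholder values, not a recommended configuration):

```python
import boto3

rds = boto3.client("rds")

# Launch a managed MySQL instance; patching and backups are handled
# by the service.
rds.create_db_instance(
    DBInstanceIdentifier="example-db",     # placeholder
    Engine="mysql",
    DBInstanceClass="db.m5.large",         # placeholder class
    MasterUsername="admin",
    MasterUserPassword="ExamplePassw0rd",  # placeholder credential
    AllocatedStorage=100,                  # GB
    BackupRetentionPeriod=7,               # days of point-in-time recovery
)

# The "single API call" to scale compute up later:
rds.modify_db_instance(
    DBInstanceIdentifier="example-db",
    DBInstanceClass="db.m5.2xlarge",
    ApplyImmediately=True,
)
```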

Last, but not least, AWS offers High Performance Computing (HPC) infrastructure. This allows scientists and engineers to solve complex science, engineering and business problems using applications that require high bandwidth, low-latency networking, and vast compute capabilities. One AWS customer, Schrödinger, builds software to help pharmaceutical researchers bring new products to market faster. For one major customer project, the company deployed its Glide solution, which virtually screens billions of molecules in a fraction of the time it would take to test them in labs, on a 50,000-core AWS HPC cluster.

When a simulation was complete, Schrödinger switched off the 50,000 cores and stopped incurring charges immediately. When the next project is scheduled, the company can quickly spin up the resources required — without any up-front investment.

The code, applications, and tools you already use today with your existing databases can be used with Amazon RDS.


Step 03: Storing and archiving your big data

When big data is analyzed, even more data is created. In the past, companies had to choose between throwing away the original data and buying expensive equipment to store it on. To enable organizations to archive huge volumes of data long term, at very low cost, AWS created Amazon Glacier. The solution is optimized for data that is infrequently accessed and where retrieval times of several hours are suitable. Customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, which represents major savings compared to on-premises archiving solutions.
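At $0.01 per gigabyte per month, archiving 10 TB costs roughly 10,000 GB × $0.01 = $100 a month. A minimal upload sketch with boto3 (our illustration; the vault and file names are placeholders):

```python
import boto3

glacier = boto3.client("glacier")

# Create a vault, then archive a compressed results file. Retrieval
# takes hours, which is the trade-off for the low storage price.
glacier.create_vault(vaultName="example-archive")  # placeholder vault
with open("analysis-results-2013.tar.gz", "rb") as f:
    resp = glacier.upload_archive(vaultName="example-archive", body=f)
print(resp["archiveId"])  # keep this ID; it is needed for retrieval
```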

Step 04: Managing your big data strategy

A major challenge for organizations is how to manage big data workflows cost efficiently. To minimize manual administration and coding and further reduce big data costs, AWS has developed AWS Data Pipeline, a web service that helps you reliably process and move data between different AWS compute and storage services, with minimal manual intervention, and no need for specialist skills.

This service makes it possible to access data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR.

By tying all the elements of big data in the cloud together, AWS Data Pipeline helps customers lower management costs and create complex data processing workloads that are fault tolerant, repeatable, and highly available. With no need to worry about resource availability, inter-task dependencies, transient failures, or timeouts in individual tasks, organizations can focus on insight, not process.
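A minimal sketch of registering and activating a pipeline with boto3 (our example; the names are placeholders, and the actual definition of data nodes, activities and schedules would be supplied via put_pipeline_definition, omitted here):

```python
import boto3

dp = boto3.client("datapipeline")

# Create an empty pipeline shell, then activate it. The uniqueId makes
# the create call idempotent across retries.
resp = dp.create_pipeline(name="example-nightly-etl", uniqueId="etl-001")
pipeline_id = resp["pipelineId"]

# In a real workflow, put_pipeline_definition(...) would first define
# the data sources, activities, schedules and retry policies.
dp.activate_pipeline(pipelineId=pipeline_id)
```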

$0.01: Customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month.


Conclusion

As we continue to generate more data with every click, like, tweet, check-in, and API call, the business insight available to companies is increasing exponentially. However, the cost of storing and processing vast quantities of data with traditional infrastructure means that the vast majority of potentially enlightening business intelligence is never fully exploited.

Understanding that big data analysis is an economic issue, rather than a technology issue, Amazon Web Services is pioneering cloud-based big data services that remove traditional cost barriers to entry.

Underpinning our big data proposition is the world’s largest, most scalable cloud platform, which provides virtually unlimited compute, storage and networking resources to customers on demand. However, to ensure that customers can meet varied and rapidly changing big data needs, we have developed a comprehensive portfolio of on-demand big data services too.

Our end-to-end, agile big data approach means that customers can use, and pay for, only the resources they need, when they need them. What’s more, they can add new functionality to their big data toolsets on demand, from relational databases and data warehouses to Amazon EMR and long-term data archiving, and tie all the elements together with an automated, low-touch workflow.


...with every click, like, tweet, check-in, and API call, the business insight available to companies is increasing exponentially.


Next Steps

To discuss anything in this article, or to find out more about increasing business insight with Amazon Web Services big data solutions, please contact us online at

aws.amazon.com/contact-us/

To learn more about how AWS can help your data needs, visit our big data details page:

http://aws.amazon.com/big-data/
