Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Successes, Challenges and Pitfalls Migrating a SAAS Business to Hadoop Shaun Klopfenstein, CTO Eric Kienle, Chief Architect

Transcript of Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Page 1: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Successes, Challenges and Pitfalls Migrating a SAAS Business to Hadoop
Shaun Klopfenstein, CTO
Eric Kienle, Chief Architect

Page 2: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

The Vision

Page 3: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Requirements

Page 4: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Page 4Marketo Proprietary and Confidential | © Marketo, Inc. 05/03/2023

Business Requirements
• Near real-time activity processing
• 1 billion activities per customer per day
• Improve cost efficiency of operations while scaling up
• Global enterprise grade security and governance
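To put the volume requirement in perspective, the arithmetic below converts 1 billion activities per day into an average per-second rate. It assumes activities are spread evenly over 24 hours, which real traffic will not be, so peak rates would be higher. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(activities_per_day: int) -> float:
    """Average arrival rate, assuming an even spread over the day."""
    return activities_per_day / SECONDS_PER_DAY


rate = average_rate_per_second(1_000_000_000)
print(f"{rate:,.0f} activities/second on average")  # prints "11,574 activities/second on average"
```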

Page 5: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Architecture Requirements
• Maximize utilization of hardware
• Multitenancy support with fairness
• Encryption, Authorization & Authentication
• Applications must scale horizontally

Page 6: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Technology Bake Off

Page 7: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Bake Off
• Technology Selection
  • Storm/Spark Streaming
  • HBase/Cassandra
• Built POC with each permutation + Kafka
• Load tested with one day of web traffic

Page 8: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

The Winner Is… Our First Challenge
• We hoped to find a clear winner… we didn’t exactly
• Truth is all the POCs worked at the scale we tested
• It’s possible if we had scaled up the test, we would have found more differences

Page 9: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

How We Chose
• Community
• Features
• Team Skillset
• History
• The winners: HBase/Kafka/Spark Streaming

Page 10: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Architecture & Design

Page 11: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Page 12: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

High Level Architecture
• Enhanced Lambda Architecture
• Inbound activities written to Ingestion Processor
  • HBase and then Kafka
• High volume (e.g. web) activities
  • First written to Kafka, then enriched
• Spark Streaming applications consume events from Kafka
  • Solr Indexing
  • Email Reports
  • Campaign Processing
• HBase is used for simple historical queries, and is system of record
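The write ordering on this slide can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not Marketo's implementation: `InMemoryStore`, `InMemoryLog`, `ingest`, and `enrich` are hypothetical stand-ins for HBase, Kafka topics, and the Ingestion Processor, and the `high_volume` flag is an assumed way of marking web-style activities. The slide does not say where enriched events go next, so the sketch simply returns them.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the HBase system of record."""
    rows: dict = field(default_factory=dict)

    def put(self, key: str, value: dict) -> None:
        self.rows[key] = value


@dataclass
class InMemoryLog:
    """Hypothetical stand-in for a Kafka topic (append-only list)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)


def ingest(activity: dict, store: InMemoryStore,
           events: InMemoryLog, raw: InMemoryLog) -> None:
    if activity.get("high_volume"):
        # High-volume path: land in the log first, enrich downstream.
        raw.append(activity)
    else:
        # Standard path: system of record first, then publish for consumers.
        store.put(activity["id"], activity)
        events.append(activity)


def enrich(raw: InMemoryLog, lead_by_visitor: dict) -> list:
    """Attach a lead id to each raw event (assumed enrichment step)."""
    return [{**e, "lead": lead_by_visitor.get(e["visitor"])} for e in raw.events]


store, events, raw = InMemoryStore(), InMemoryLog(), InMemoryLog()
ingest({"id": "a1", "type": "form_fill"}, store, events, raw)
ingest({"id": "a2", "type": "web_visit", "high_volume": True, "visitor": "v9"},
       store, events, raw)
enriched = enrich(raw, {"v9": "lead-42"})
```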

Page 13: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Build It

Implementation

Page 14: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Expertise
• We had a few people with Hadoop and Spark experience
• We decided to grow knowledge in house
• Focus on training - Hortonworks boot camp for operations
• In house courses and tech talks for engineering/QE

Page 15: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Expertise - Successes
• Critical to kick start the project
• Built excitement
• Created foundation for the design process

Page 16: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Expertise – Context Challenge

Challenge
• Training packed a lot of information into a short period
• Teams that didn’t leverage the training right away lost context

Recommendation
• Create environments for hands on experience early
• Hands on experience across all teams right after training

Page 17: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Expertise – Experience Challenge

Challenge
• Hadoop technology is like playing a piano… knowing how to read music doesn’t mean you can play
• Many ways to design, configure, manage - Only a few right ways and the reasons can be subtle

Recommendation
• Find your experts!
• Partner and hire

Page 18: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Our First Cluster
• Initial sizing and capacity planning of first Hadoop Clusters
• Perform load tests to get initial capacity plan
• Decided that disk I/O and storage would be the leading indicator
• Went with industry best practice on hardware and network configuration

Page 19: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Our First Cluster - Success
• Leading indicator ended up being compute
• But cluster sizing ended up being close enough to start
• Clusters can always be expanded… So don’t get too hung up

Page 20: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Building Our First Cluster – Zookeeper & VM

Challenge
• We started with Zookeeper virtualized
• Didn’t perform properly (we think because of disk IO)
• Caused random outages

Recommendation
• We ended up migrating Zookeeper to physical boxes
• Don’t use VMs for Zookeeper!

Page 21: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Security
• All data at rest must be encrypted
• Applications sharing Hadoop must be isolated from each other
• Applications must have hard quotas for both compute and disk resources

Page 22: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Security - Success
• Enabled Kerberos security for Hadoop cluster
• Kerberos allowed us to leverage HDFS native encryption
• Used encrypted disks for Kafka servers
• Created separate secure YARN queues to isolate applications
• Each application uses a separate Kerberos principal
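One way the queue-per-application and principal-per-application setup could show up at job submission is sketched below. The slides do not say how jobs were submitted, so this is an assumption: it builds a `spark-submit` command line using the standard YARN options `--queue`, `--principal`, and `--keytab`. The jar path, queue name, principal, and keytab location are hypothetical.

```python
import shlex


def spark_submit_command(app_jar: str, queue: str,
                         principal: str, keytab: str) -> list:
    """Build a spark-submit argv that pins a job to one queue and one principal."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--queue", queue,          # YARN queue dedicated to this application
        "--principal", principal,  # Kerberos principal dedicated to this application
        "--keytab", keytab,        # keytab used to log in and renew tickets
        app_jar,
    ]


cmd = spark_submit_command(
    app_jar="/opt/apps/campaign-processing.jar",      # hypothetical path
    queue="campaign",                                 # hypothetical queue
    principal="campaign@EXAMPLE.COM",                 # hypothetical principal
    keytab="/etc/security/keytabs/campaign.keytab",   # hypothetical keytab
)
print(shlex.join(cmd))
```

Hard compute quotas would still have to be configured on the queue itself in the YARN scheduler; the command only selects the queue.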

Page 23: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Security – Kerberos Challenge

Challenge
• Kerberos can’t be added to a Hadoop cluster without prolonged downtime and patches
• Needed weeks of developer time to accommodate security changes
• Added several months to the overall rollout schedule

Recommendation
• Allow extra time for Kerberos
• Educate your team beforehand, find an expert to guide you
• Be prepared for different levels of Kerberos support across the Hadoop ecosystem

Page 24: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Security – Kafka and Spark Challenge

Challenge
• Kafka doesn’t support data encryption (and won’t)
• The HDP version we had didn’t fully support Kerberos for Kafka and Spark clients

Recommendation
• Move Kafka and Spark out of Ambari
• Only encrypt Kafka data if you absolutely must, as it adds complexity

Page 25: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Test It

Page 26: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Validation
• Changing the engines on a plane while in flight is hard
• Required all components to implement “Passive mode”
  • The new code ran in the background and continuously compared results with the legacy system
• Automated functional tests kicked off from Jenkins
• Performance testing at AWS
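The passive mode idea can be sketched as a wrapper that always serves the legacy result, runs the new code path alongside it, and records any difference. This is an illustrative sketch, not the actual Marketo code: `run_with_shadow` and the `mismatches` list are hypothetical names, and the new path runs inline here for simplicity.

```python
import logging

logger = logging.getLogger("passive_mode")


def run_with_shadow(request, legacy_fn, new_fn, mismatches):
    """Serve from the legacy path; run the new path passively and compare."""
    legacy_result = legacy_fn(request)
    try:
        new_result = new_fn(request)
        if new_result != legacy_result:
            mismatches.append((request, legacy_result, new_result))
    except Exception:
        # A failure in the new path must never reach the caller.
        logger.exception("passive path failed for %r", request)
        mismatches.append((request, legacy_result, None))
    return legacy_result


mismatches = []
legacy = lambda n: n * 2
candidate = lambda n: n * 2 if n != 3 else 7  # deliberately wrong for n == 3
results = [run_with_shadow(n, legacy, candidate, mismatches) for n in range(5)]
```

Callers only ever see `results` from the legacy function, while `mismatches` accumulates the cases where the new code disagreed.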

Page 27: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Validation - Success
• Passive mode is one of the best moves we made!
• Allowed for testing of components with real world data and load
• Found countless performance and logic issues with minimal operational impact

Page 28: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Validation – Passive Mode “Minimal Impact”

Challenge
• By design passive mode wrote to both Legacy and Hadoop systems
• We impacted performance during an outage of our cluster

Recommendation
• Use asynchronous writes or tight timeouts in passive mode
• Monitoring for the Hadoop cluster should be in place before passive testing
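The asynchronous-write recommendation can be sketched with a thread pool: the legacy write stays synchronous and authoritative, while the passive write is handed off so a slow or failing cluster cannot delay the caller. This is a minimal sketch under assumptions. `write_activity`, `shadow_pool`, and `shadow_failures` are hypothetical names, and the failing write function simulates a cluster outage.

```python
import time
from concurrent.futures import ThreadPoolExecutor

shadow_pool = ThreadPoolExecutor(max_workers=4)
shadow_failures = []


def _record_failure(future):
    """Collect passive-write errors for monitoring instead of raising them."""
    if future.cancelled():
        return
    exc = future.exception()
    if exc is not None:
        shadow_failures.append(exc)


def write_activity(activity, legacy_write, shadow_write):
    legacy_write(activity)                               # authoritative, synchronous
    future = shadow_pool.submit(shadow_write, activity)  # passive, asynchronous
    future.add_done_callback(_record_failure)
    return future


legacy_rows = []


def slow_failing_shadow(activity):
    time.sleep(0.2)  # simulate a degraded cluster
    raise ConnectionError("cluster unavailable")


start = time.monotonic()
write_activity({"id": 1}, legacy_rows.append, slow_failing_shadow)
elapsed = time.monotonic() - start  # the caller did not wait for the passive write
shadow_pool.shutdown(wait=True)
```

The executor's work queue is unbounded, so a long outage could pile up pending writes. A production version would likely bound the queue or drop passive writes once it fills.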

Page 29: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Deploying It

Page 30: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Migration and Management
• We are here!
• Migrate over 6,000 subscriptions with no service interruption or data loss
• Track and monitor migration and provide management tools for the new platform
• Achieve the end goal of removing the safety net

Page 31: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Migration and Management - Successes
• Created a new management console called Sirius
• Close architectural coordination of all teams during migration
• If problems arose, we had a quick, automated, fallback path to the legacy system
• Daily cross-functional standup meetings to track the rollout

Page 32: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Page 33: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Migration and Management Challenges

Challenge
• Oozie workflows can be challenging to build and debug
• Capacity planning and resource management in the shared Hadoop cluster is very complex

Recommendation
• Only use Oozie workflows for automating complex or long running processes, or use a different orchestration platform
• Constantly reevaluate your capacity plan based on current deployment

Page 34: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Running It

Page 35: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Monitoring
• Needed to monitor hundreds of new Hadoop and other infrastructure servers
• Our custom Spark Streaming applications required all new metrics and monitors
• Capacity planning requires trend analysis of both the infrastructure and our applications
• Don’t overwhelm our already busy Cloud Platform Team
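As a rough illustration of trend analysis for capacity planning, the sketch below fits a straight line to evenly spaced usage samples and estimates how many more samples it takes to reach a limit. The linear model, the function names, and the sample figures are all assumptions made for illustration. They are not Marketo's data or method.

```python
def linear_trend(values):
    """Least-squares slope and intercept for samples at x = 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def samples_until(values, limit):
    """Samples after the last observation until the fitted line reaches `limit`.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    last_x = len(values) - 1
    return max(0.0, (limit - intercept) / slope - last_x)


daily_used_tb = [40.0, 42.0, 44.0, 46.0, 48.0]  # hypothetical daily disk usage
days_left = samples_until(daily_used_tb, limit=80.0)
print(days_left)  # prints 16.0
```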

Page 36: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Monitoring - Successes
• Built a custom monitoring infrastructure using OpenTSDB and Grafana
• Added business SLA metrics to our Sirius console to provide real-time alerts
• Added comprehensive Hadoop monitors into our pre-existing production monitoring system
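For readers unfamiliar with OpenTSDB, the sketch below builds a data point in the JSON shape accepted by its HTTP `/api/put` endpoint (`metric`, `timestamp`, `value`, `tags`). The metric name and tags are hypothetical, the helper function is illustrative, and nothing is sent over the network.

```python
import json
import time


def opentsdb_datapoint(metric, value, tags, timestamp=None):
    """Build one data point in the shape expected by OpenTSDB's /api/put."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    return {
        "metric": metric,
        "timestamp": int(timestamp if timestamp is not None else time.time()),
        "value": value,
        "tags": dict(tags),
    }


point = opentsdb_datapoint(
    "streaming.batch.processing_ms",                     # hypothetical metric name
    850,
    {"app": "campaign-processing", "host": "worker01"},  # hypothetical tags
    timestamp=1_700_000_000,
)
body = json.dumps([point])  # request body for a POST to /api/put on the TSD host
```

Tagging by application and host is what lets a Grafana dashboard slice the same metric per streaming job or per server.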

Page 37: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Monitoring - Challenges

Challenges
• Adding hundreds of servers and a dozen new applications makes for a huge monitoring task
• Nagios is a very general purpose system and isn’t designed to monitor Hadoop out of the box

Recommendations
• Make sure that you have monitors and trend analysis in place and tested before migration
• Be prepared to constantly refine and improve your monitors and alerts

Page 38: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Patching and Upgrading
• We have a zero-downtime requirement for applications
• Patching and upgrading of either the infrastructure or our own applications is problematic
• Keeping up with the community requires frequent patching
• Eventually hundreds of Spark Streaming jobs will need to be constantly processing data with no interruption

Page 39: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Patching and Upgrading - Successes
• Use Sirius console to manage Spark Streaming jobs
• Marketo’s Kafka consumer allows streaming jobs to pick up where they left off after a restart
• Integrated existing Jenkins infrastructure with the Sirius console to provide painless automated patching/upgrades
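The slides do not describe how Marketo's Kafka consumer works internally. The general technique for resuming after a restart is to persist the next offset to read, which the sketch below simulates in memory. `OffsetStore` and `consume` are hypothetical names, and a plain list stands in for a Kafka partition.

```python
class OffsetStore:
    """Hypothetical durable store for the next offset to read per partition."""

    def __init__(self):
        self._offsets = {}

    def load(self, partition):
        return self._offsets.get(partition, 0)

    def save(self, partition, next_offset):
        self._offsets[partition] = next_offset


def consume(log, partition, store, handler, max_messages=None):
    """Process messages from the stored offset, committing after each one."""
    offset = store.load(partition)
    processed = 0
    while offset < len(log) and (max_messages is None or processed < max_messages):
        handler(log[offset])
        offset += 1
        store.save(partition, offset)  # commit only after the handler succeeds
        processed += 1
    return processed


log = ["e0", "e1", "e2", "e3", "e4"]
store, seen = OffsetStore(), []
consume(log, 0, store, seen.append, max_messages=2)  # job stopped for a patch
consume(log, 0, store, seen.append)                  # restarted job resumes at offset 2
```

Committing after the handler gives at-least-once behavior: a crash between handling and committing replays that message on restart, so handlers should tolerate duplicates.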

Page 40: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

Infrastructure Patching and Upgrading - Challenges

Challenges
• Patches/upgrades managed with Ambari – not perfect!
• We almost never get through an upgrade without one or more Hadoop components having downtime (so far)

Recommendations
• Test all infrastructure patches and upgrades in a loaded non-production environment
• Check out the start and stop scripts from the component specific open source communities, rather than rely on Ambari

Page 41: Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop

We’re Hiring! Http://Marketo.Jobs

Q & A