Is Your Project in Trouble on System Performance?
May 1–4, 2017
©2017 Charles Chow
SATURN 2017
Charles Chow
May 2017
Agenda
Why do so many projects have system performance issues?
How to salvage a troubled project with performance issues?
Performance and Storage Trade-offs by example
Lessons learned on architectural impact on performance
Q&A
Why do so many projects have system performance issues?
1. Lack of non-functional requirements on performance at an early stage of the architectural design
2. Lack of a user-centric approach in detailed design and implementation
3. Performance testing overlooked at the start of the System Integration Test
Most common causes of system performance issues
Lack of Network / Server Capacity
Unoptimized and/or Diminutive Databases
Unmanaged Growth of Data
Poor Resource Utilization
Poor Code Quality
Peak User Load or Traffic Spikes
How to salvage a troubled project with performance issues?
Application / Service Stack Monitoring Metrics*
Application Layer
• Online Transaction Metrics: response time, throughput, transaction pass/fail rate, error rate, distribution
• Batch Transaction Metrics: records processed and execution time
Server Layer
• Server Metrics: HTTP server, app server, database server, other servers
Platform Layer
• Platform Metrics: cloud services, OS, storage, etc.
* Metrics available may vary by vendor’s willingness to allow for monitoring.
• Identify the root causes by monitoring key metrics at each layer of the application stack
• Conduct performance testing to analyze the performance of the overall application
• Collaborate with business stakeholders on performance issue resolution
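As an illustration of the online transaction metrics listed for the application layer, the following is a minimal sketch of how average response time, a 95th-percentile response time (one view of the distribution), throughput, and error rate might be derived from raw transaction samples. The `Transaction` record, its field names, and the sample values are assumptions made for this example; they do not come from the project or from any specific monitoring tool.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    """One sampled online transaction (hypothetical record layout)."""
    start_s: float      # start time, in seconds since the test began
    duration_s: float   # response time in seconds
    ok: bool            # True if the transaction passed


def summarize(transactions):
    """Compute basic online transaction metrics from a list of samples."""
    if not transactions:
        raise ValueError("at least one transaction is required")

    durations = sorted(t.duration_s for t in transactions)
    # Nearest-rank 95th percentile of response time.
    rank = max(1, math.ceil(0.95 * len(durations)))
    p95 = durations[rank - 1]

    # Observation window: first start to last completion.
    first_start = min(t.start_s for t in transactions)
    last_end = max(t.start_s + t.duration_s for t in transactions)
    window_s = last_end - first_start

    failures = sum(1 for t in transactions if not t.ok)
    return {
        "avg_response_s": mean(durations),
        "p95_response_s": p95,
        "throughput_per_s": len(transactions) / window_s if window_s > 0 else 0.0,
        "error_rate": failures / len(transactions),
    }


# Illustrative sample data only.
samples = [
    Transaction(0.0, 1.0, True),
    Transaction(1.0, 2.0, True),
    Transaction(2.0, 3.0, False),
    Transaction(3.0, 2.0, True),
]
summary = summarize(samples)
```

Tracking the same summary at the server and platform layers alongside it helps show which layer a slow transaction is spending its time in.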
Performance and Storage Trade-offs by example
Background
• This is a web portal project for a customized Sales Opportunities solution
• The architecture was built on top of an existing mobile application
• The client had already invested over $2 million in the solution
• A Quality Attribute Workshop (QAW) was not conducted
• Performance was not defined as a critical non-functional requirement in the early phase
[Architecture diagram. Labels: On-Premise Client Applications; Hosted Applications; 3rd Party Providers; EDW Postgres (nightly reload: archive old tables and create new copies); OAM; iOS App; Portal; SAML SSO Login provider; SFDC Application; User Notes; API web services; Frontend Proxy Server; DB I/O]
Problem Statement
• Currently, the API queries the PostgreSQL database to return results to the Portal front end
• These calls are especially expensive due to data volume and data propagation
• Timeouts occurred during UAT when testing with 50 or more concurrent users
• The client wanted to keep the current architecture and tech stack, and the go-live date is 4 weeks away
• Options for database tuning, code optimization, and hardware capacity increases were exhausted
Proposed Solution
• Introduce Solr into the architecture as an indexing service, moving data aggregation and sorting to Solr
• Target indexing at the data sets that take the longest to return due to complex calculations
• Have the API services query the Solr index for results instead of querying the PostgreSQL database directly
• Serve results from a pre-generated index on the Solr instance hosted on AWS
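The trade-off behind the proposed solution, spending extra storage on a pre-generated index so that requests avoid expensive aggregation and sorting, can be sketched in a few lines. This is an illustration only: a plain Python dictionary stands in for the Solr index, and the field names (`id`, `owner`, `amount`) and sample records are hypothetical, not the project's schema or its actual Solr configuration.

```python
from collections import defaultdict


def build_index(opportunities, top_n=3):
    """Precompute per-owner totals and a ranked top-N list.

    In this sketch the work happens once, offline (for example after a
    nightly reload), so request handling never aggregates or sorts.
    """
    by_owner = defaultdict(list)
    for opp in opportunities:
        by_owner[opp["owner"]].append(opp)

    index = {}
    for owner, opps in by_owner.items():
        ranked = sorted(opps, key=lambda o: o["amount"], reverse=True)
        index[owner] = {
            "total_amount": sum(o["amount"] for o in opps),
            "top": [o["id"] for o in ranked[:top_n]],
        }
    return index


def query_index(index, owner):
    """Serve a request with a single lookup against the prebuilt index."""
    return index.get(owner, {"total_amount": 0, "top": []})


# Hypothetical sample data standing in for the opportunities table.
opportunities = [
    {"id": "A", "owner": "kim", "amount": 500},
    {"id": "B", "owner": "kim", "amount": 1200},
    {"id": "C", "owner": "lee", "amount": 300},
]
index = build_index(opportunities)
kim_result = query_index(index, "kim")
```

The cost is the extra copy of the data and the staleness between index builds, which is acceptable in a design like this one only because the source data is itself reloaded on a schedule.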
[Updated Architecture diagram. Labels: On-Premise Client Applications; Hosted Applications; 3rd Party Providers; EDW Postgres (nightly reload: archive old tables and create new copies); OAM; Portal; iOS App; SFDC Application; User Notes; API web services; Frontend Proxy Server; DB I/O; SAML SSO Login provider]
Performance Testing: defined and simulated the user “day in the life” flow through the different functionalities of the Portal, based on user role, with a sustained load of 100 users.
Scenarios and weightages for each scenario:
User Group    Number of scenarios  Weightage per user group  Weightage per scenario in user group
User Group 1  3                    45%                       Scenario 1: 45%, Scenario 2: 20%, Scenario 3: 35%
User Group 2  3                    35%                       Scenario 1: 40%, Scenario 2: 40%, Scenario 3: 20%
User Group 3  2                    10%                       Scenario 1: 75%, Scenario 2: 25%
User Group 4  2                    10%                       Scenario 1: 60%, Scenario 2: 40%
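The two levels of weighting combine multiplicatively: a scenario's share of the total load is its user group's weight times its weight within that group. The sketch below works that out for a 100-user load using the percentages from the table; the dictionary layout and function name are illustrative choices, not part of the original test plan.

```python
# Weightage per user group, in percent (from the scenario table).
GROUP_WEIGHTS = {
    "User Group 1": 45,
    "User Group 2": 35,
    "User Group 3": 10,
    "User Group 4": 10,
}

# Weightage per scenario within each user group, in percent.
SCENARIO_WEIGHTS = {
    "User Group 1": {"Scenario 1": 45, "Scenario 2": 20, "Scenario 3": 35},
    "User Group 2": {"Scenario 1": 40, "Scenario 2": 40, "Scenario 3": 20},
    "User Group 3": {"Scenario 1": 75, "Scenario 2": 25},
    "User Group 4": {"Scenario 1": 60, "Scenario 2": 40},
}


def users_per_scenario(total_users):
    """Split a total virtual-user count across (group, scenario) pairs."""
    if sum(GROUP_WEIGHTS.values()) != 100:
        raise ValueError("group weights must sum to 100%")

    allocation = {}
    for group, group_pct in GROUP_WEIGHTS.items():
        scenarios = SCENARIO_WEIGHTS[group]
        if sum(scenarios.values()) != 100:
            raise ValueError(f"scenario weights for {group} must sum to 100%")
        for scenario, scenario_pct in scenarios.items():
            # Both weights are percentages, so divide by 100 * 100.
            allocation[(group, scenario)] = (
                total_users * group_pct * scenario_pct / 10_000
            )
    return allocation


allocation = users_per_scenario(100)
```

The results are fractional (for example, 20.25 users for User Group 1, Scenario 1), so a load tool would need to round them to whole threads while keeping the total at 100.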
Parameters                        Configuration                Details
Duration                          50 minutes                   Ramped up from 0 to 100 virtual users (threads) in 10 minutes, sustained load for 30 minutes, and ramped down from 100 to 0 in 10 minutes
User think time                   4-5 seconds / 10-12 seconds  Simulated wait time between steps in scenarios as per user behavior
Data volume (open opportunities)  17 million to 33 million     Simulated increase in opportunities based on estimated growth
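The load profile in the Duration row can be written as a simple piecewise function of elapsed time. This is a minimal sketch assuming linear ramps, which the slide does not state explicitly; the function name and parameters are hypothetical.

```python
def virtual_users_at(minute, peak_users=100, ramp_up=10, sustain=30, ramp_down=10):
    """Return the target number of virtual users at a given elapsed minute.

    Assumes linear ramps: 0 -> peak over `ramp_up` minutes, hold for
    `sustain` minutes, then peak -> 0 over `ramp_down` minutes.
    """
    total = ramp_up + sustain + ramp_down
    if minute < 0 or minute > total:
        return 0.0
    if minute < ramp_up:
        return peak_users * minute / ramp_up
    if minute <= ramp_up + sustain:
        return float(peak_users)
    return peak_users * (total - minute) / ramp_down


# Sample the 50-minute profile every 5 minutes.
profile = [(m, virtual_users_at(m)) for m in range(0, 51, 5)]
```

Response times measured during the 30-minute sustained window are the ones comparable against an SLA, since the ramp periods run at lower load.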
Major Performance Activities:
Action                       Accomplishments
Implemented Solr indexing    Resolved the major performance issue
Query and API tuning         Resolved the specific functional area performance issue
Load testing                 Captured a Redis cache issue and mitigated a future production issue
Database environment tuning  Identified a database connection pooling issue and resolved a scalability issue
Performance result: the average response time for user actions in each scenario was recorded to be within the SLA.
• User actions in all page loads except the performance dashboard: average response time was below 3 seconds
• User actions in the performance dashboard: average response time was below 3 seconds, with spikes averaging 4.2 seconds
[Charts: Home Page Load; Opportunity Search Results; My Scorecard Page Load; My Account Dashboard – Drill into top/bottom value]
Questions?
Thank You