Scaling to 150,000 Builds a Month... and Beyond
PETER LESCHEV • TEAM LEAD • ATLASSIAN • @PETERLESCHEV
Build Engineering @ Atlassian: Scaling to 150k builds per month & beyond
Summit 2015
Agenda: Introduction • Team • Infrastructure • Bamboo Servers • Conclusion
Introduction
Build platform & services used internally within Atlassian to build, test & deliver software
Developers expect a reliable infrastructure & fast CI feedback
Build Engineering today @ Atlassian
• 12 Bamboo Servers
• maven.atlassian.com / 9 Nexus instances / 9 TB
• 7 Nexus proxies for internal traffic
• Monitoring: opsview, graphite, statsd, newrelic, datadog
• 1200 build agents on EC2, which include SCM clients, JDKs, JVM build tools, databases, headless browser testing, Python builds, NodeJS, installers & more
• Maintain 20 AMIs of various build configurations
4 years ago: 21k builds per month
Last month: 225k builds per month
Build Engineering @ Atlassian
JIRA alone has 49k automated tests
3 stories of gaining maturity to handle Atlassian growth
Team
History of team roles
Individual Engineers
• Information silos
• Very interrupt driven: fault investigation, requests for advice, unplanned work
• Little project work: limited to customer driven changes
• Duplication of effort
Disturbed role (2 week rotation)
• Knowledge transfer: reduction in duplication of effort, promotes collaboration within the team
• More project work: non-disturbed engineers can focus on larger tasks
• Context switching: switching between project / disturbed roles is difficult
Team expands
Infra Engineers
Developers
Build Engineers
Disturbed for Dev & Infra
• Too interrupt driven: couldn’t handle smaller customer raised requests & interrupt driven work
• Disturbed pairing: to encourage knowledge transfer between infra & dev
• Staggered changeovers: minimising disruption due to context switching
Supporting Developers
• Team channel
• Questions for Confluence
1. Measure the pain
2. Continuous Improvement
Technical Debt
Contact Rate = (Customer JIRA issues + Confluence Questions + Hipchat queries) ÷ Number of Developers
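The contact rate formula above can be sketched in a few lines of Python. The function name, parameter names and the sample figures are illustrative assumptions, not numbers from the talk.

```python
def contact_rate(jira_issues: int, confluence_questions: int,
                 hipchat_queries: int, developers: int) -> float:
    """Support contacts per developer over one reporting period."""
    if developers <= 0:
        raise ValueError("developers must be positive")
    contacts = jira_issues + confluence_questions + hipchat_queries
    return contacts / developers


# Hypothetical period: 30 issues, 10 questions, 20 chat queries, 600 developers.
rate = contact_rate(30, 10, 20, 600)
print(f"{rate:.2f} contacts per developer")
```

Tracking this ratio per period, rather than the raw number of contacts, keeps the measure comparable as the number of developers grows.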
The Shield
Rebranding Disturbed as Maintenance
Removing the negative attitude towards the old role within the team
Diagram labels: Project work • Maintenance • The Shield
“How do we avoid this in the future?” PETER LESCHEV
Fix it now, fix it for the future
Self service
Chat bots
Maven Self Help Tool
Infrastructure
Infrastructure as Code = Puppet + SCM?
4 years ago…
Started using Puppet
Manually maintained snowflakes
Production rollout: puppetmaster, build agents
Production rollout failure: puppetmaster, build agents
Low confidence of change
• Coding on Puppet Master
• Culture of manually modifying production: Configuration Drift
• Impact on Builds
Using Staging for Development
puppetmaster
build agents
staging puppet environment
Vagrant
www.vagrantup.com
Mitchell Hashimoto
@mitchellh
Packer
packer.io
Rolling out to staging
Rolling out to production
Broken build agents
Developing locally
“But it works on my machine” EVERY DEVELOPER
Continuous Integration
‘From scratch’ provisioning
Confidence that you can rebuild in disaster
“The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them.” TIM BELL, CERN
Broken builds: master
Branch builds: BUILDENG-5670, BUILDENG-5669, master
Infrequent Releases
Manual Puppet Rollouts
git clone
librarian-puppet install
symlink update on puppet master
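The symlink update step listed above can be sketched as follows. This is a minimal illustration assuming a POSIX filesystem and a hypothetical layout in which each release is unpacked into its own directory and a single symlink points at the live one; the function name and directory names are assumptions, not the team's actual scripts.

```python
import os
import tempfile


def switch_release(link_path: str, release_dir: str) -> None:
    """Repoint link_path at release_dir without a window where the link is missing."""
    tmp_link = f"{link_path}.tmp-{os.getpid()}"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted run
    os.symlink(release_dir, tmp_link)
    # Renaming over the old link swaps it in a single step on POSIX systems.
    os.replace(tmp_link, link_path)


def demo() -> str:
    """Switch a throwaway 'current' link between two hypothetical releases."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "release-1")
        new = os.path.join(root, "release-2")
        os.mkdir(old)
        os.mkdir(new)
        link = os.path.join(root, "current")
        switch_release(link, old)
        switch_release(link, new)
        return os.path.basename(os.readlink(link))
```

Creating the new link under a temporary name and renaming it means readers of the link always see either the old release or the new one, which is one reason this step is easy to automate once the manual process is understood.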
Bamboo Deployments
Diagram: How environments work (a release, e.g. 1.3, runs an environment's task list on available agents against a destination server such as Production)
Diagram: How artifacts work (Repository, Build artifacts, Release; a build plan produces build results (artifacts), which a deployment project turns into versioned releases with release notes for the Development and Production environments)
staging
production
build
• git clone
• librarian-puppet
test
• ‘delta’ & ‘from scratch’ vagrant provisions
deploy
• to specific environments
• scp to puppet master & symlink update
build & test AMIs
• Generated using Packer
deploy AMIs
• AMIs on Bamboo Servers updated
Puppet Build, Test & Deploy Pipeline
‘open prs’ Bot
Less human effort through automation = Increased frequency & reliability of releases
Snowflakes
Pets
Cattle
Stateless Machines
Infrastructure consistency is key
Bamboo Servers
At scale is hard
• 12 Bamboo Servers
• 3500 Build Plans
• 14k Plan Branches
Bamboo is great, but hard to manage at scale
Build Configuration as code
Bamboo Plugin: Plan Templates
• Checked into SCM: changes can be code reviewed
• Reusable snippets
• Export existing plans: export plans for backup, or move to another Bamboo instance easily
• Bulk changes: update 100s of job requirements with a single commit
Pushing Bamboo to its limits
Bamboo Plugin: Agent Smith Wallboard
• Trend data sent to Graphite
https://marketplace.atlassian.com/plugins/com.atlassian.bamboo.plugin.agent-smith-wallboard
Add metrics, then alert on them
Bamboo Plugin: Bamboo Monitoring Plugin
• Metrics to Graphite
• Bamboo Health: ActiveMQ, database connections, Tomcat, JVM memory usage, background thread workers, number of plans / plan branches, plans / plan branches for deletion
https://marketplace.atlassian.com/plugins/com.atlassian.bamboo.plugin.bamboo-monitoring-plugin
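For readers unfamiliar with how metrics reach Graphite, the sketch below shows Graphite's plaintext protocol: one line per data point in the form "path value timestamp", sent over TCP (port 2003 by default). It does not reflect how the Bamboo plugins are implemented; the metric path, host name and function names are hypothetical.

```python
import socket
import time
from typing import Iterable, Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one line of Graphite's plaintext protocol."""
    if any(ch.isspace() for ch in path):
        raise ValueError("metric path must not contain whitespace")
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metrics(lines: Iterable[str], host: str, port: int = 2003,
                 timeout: float = 5.0) -> None:
    """Send pre-formatted metric lines to a carbon receiver over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Hypothetical metric name; sending requires a reachable Graphite host.
line = format_metric("bamboo.example.agents.online", 42, 1445000000)
# send_metrics([line], host="graphite.example.com")
```

Once a value is recorded as a time series like this, a threshold on it can drive an alert, which is the "add metrics, then alert on them" idea.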
When a Bamboo Server starts misbehaving…
Infrastructure differences? Is it Bamboo Configuration?
Is it a Bamboo Plugin? Is it Bamboo the product?
How is it being used?
Infrastructure consistency of Bamboo Servers is key
Bamboo Puppet provider + REST API for Administration
Bamboo Puppet provider: REST calls
https://forge.puppetlabs.com/atlassian/bamboo_rest
Hipchat Notification
Managed via Puppet
1-click upgrades of Bamboo Plugins
‘Continuous Plugin Deployment’ Task
All Bamboo Servers
build
Deploy
build & test AMIs
Build
https://marketplace.atlassian.com/plugins/com.atlassian.bamboo.plugins.deploy.continuous-plugin-deployment
1-click upgrades of Bamboo Servers
Using scp / ssh & puppet
Upgrade Bamboo
Build Bamboo
jira-bamboo
servicedesk-bamboo
Infrastructure differences? Is it Bamboo Configuration?
Is it a Bamboo Plugin? Is it Bamboo the product?
How is it being used?
Conclusion
Constant improvement
We’ve matured to handle the growth of Atlassian
Thank you!