0to100 in 18 months

MoodleMoot 2013 Grzegorz Dostatni 0 to 100 in 18 months

Transcript of 0to100 in 18 months

Page 1: 0to100 in 18 months

MoodleMoot 2013 Grzegorz Dostatni

0 to 100 in 18 months

Page 2: 0to100 in 18 months

Agenda

Introduction

Organization

Results

Architecture and Setup

Incidents

What is next?

Questions

Page 3: 0to100 in 18 months

Introduction

In 2010 the University of Alberta was looking for a new LMS.

Objectives:
- Reduce licensing costs
- Create a service people want to use
- Improve service reliability
- Collaborate with other institutions across the province and beyond

Page 4: 0to100 in 18 months

Organization

Vice-Provost Information Technology - Sponsor

Centre for Teaching and Learning - Application Support and Development

Academic Information and Communication Technologies - System Administration, Database Administration, Networking

And many more...

Page 5: 0to100 in 18 months

Organization

Developers
• New features
• Bug fixes
• Frequent updates

(See their talk at 3:30, "Collaboration without Compromising")

VS.

System Admins
• Stability
• Security
• Redundancy

Page 6: 0to100 in 18 months

Results

Uptime: 99.992%

On Oct 1, 2012:
- 508,681 page views
- 3.8 million Apache hits
- 0.332 s average page return time
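To put these figures in perspective, the quick arithmetic below converts them into average rates. It is only a back-of-envelope sketch: it assumes traffic was spread evenly over the 24 hours of Oct 1, 2012 (real load is peakier), and the variable names are illustrative rather than taken from the talk.

```python
# Back-of-envelope rates derived from the figures on this slide.
# Assumption: traffic is spread evenly over 24 hours; real peaks are higher.
SECONDS_PER_DAY = 24 * 60 * 60

page_views = 508_681       # page views on Oct 1, 2012
apache_hits = 3_800_000    # Apache hits on the same day (rounded on the slide)
uptime = 0.99992           # reported uptime, as a fraction

views_per_second = page_views / SECONDS_PER_DAY
hits_per_page_view = apache_hits / page_views
# Downtime that 99.992% uptime allows in a 30-day month, in minutes.
downtime_minutes_per_30_days = (1 - uptime) * 30 * 24 * 60

print(f"{views_per_second:.2f} page views per second on average")
print(f"{hits_per_page_view:.2f} Apache hits per page view")
print(f"{downtime_minutes_per_30_days:.2f} minutes of downtime per 30 days")
```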

Page 7: 0to100 in 18 months

Results

Daily Page Views

Total number of page views: 65 million

Page 8: 0to100 in 18 months

Results

Daily Unique Visitors

Total number of unique visitors who have logged in at least once: 54,844

Page 9: 0to100 in 18 months

Architecture

Decisions

Infrastructure

Hardware considerations:
- 6 physical hosts
- VMware cluster
- Application cluster behind hardware load balancers
- Failover systems in another data centre
- F5 BIG-IP load balancers
- EMC CX-4 fiber-attached storage

Page 10: 0to100 in 18 months

Architecture

Decisions

Software

Software decisions:
- Ubuntu LTS
- PostgreSQL 9.0
- Database in a VM
- Fileserver replication using LVM and DRBD
- Nearly hourly backups with 30-day retention
- Backups happen on standby servers
- eAccelerator
- "Make everything as simple as possible, but not simpler"
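The 30-day retention rule can be illustrated with a small pruning sketch. The talk does not describe how old backups are actually removed, so the function name `expired_backups` and the sample timestamps below are hypothetical; only the 30-day window comes from the slide.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # retention period stated on the slide


def expired_backups(backup_times, now):
    """Return the backup timestamps that fall outside the retention window."""
    cutoff = now - RETENTION
    # A backup taken exactly at the cutoff is still kept.
    return [t for t in backup_times if t < cutoff]


# Hypothetical sample data: backups taken 1, 15, 29, 31 and 45 days ago.
now = datetime(2013, 2, 1, 12, 0)
backups = [now - timedelta(days=d) for d in (1, 15, 29, 31, 45)]

for stamp in expired_backups(backups, now):
    print("would delete backup from", stamp.isoformat())
```

Returning the list of candidates instead of deleting directly keeps the rule easy to test and lets a dry run be reviewed before anything is removed.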

Page 11: 0to100 in 18 months

Architecture

All Environments

3 production environments:
• Main Production
• Archive (old content)
• CPD (non-credit)

Page 12: 0to100 in 18 months

Architecture

Production

Defined scalability paths:
• Adding nodes to the cluster
• Increasing performance of the DB

Page 13: 0to100 in 18 months

Architecture

Backups

Consistent backups require a point-in-time snapshot of both the DB and the FS. Database point-in-time recovery is used to synchronize the two.
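One way to picture the synchronization: take the timestamp of the filesystem snapshot and ask PostgreSQL to replay its WAL archive up to that same moment. The sketch below only renders the text of a `recovery.conf` file (the mechanism PostgreSQL 9.0 uses for point-in-time recovery) with the `restore_command` and `recovery_target_time` settings. The function name, the archive path and the sample timestamp are assumptions for illustration, not details from the talk.

```python
from datetime import datetime


def render_recovery_conf(snapshot_time, wal_archive_dir):
    """Build recovery.conf text that stops WAL replay at snapshot_time.

    snapshot_time is when the filesystem snapshot was taken, so the restored
    database matches the restored files. It is written without a time zone,
    which PostgreSQL interprets in the server's configured time zone.
    """
    target = snapshot_time.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"restore_command = 'cp {wal_archive_dir}/%f %p'\n"
        f"recovery_target_time = '{target}'\n"
    )


# Hypothetical values: archive location and snapshot time are made up.
conf = render_recovery_conf(datetime(2012, 10, 1, 3, 0, 0), "/srv/wal_archive")
print(conf)
```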

Page 14: 0to100 in 18 months

Architecture

Monitoring

Catch problems before they turn into outages.

All machines are monitored for:
- CPU load
- Disk
- Free memory

Database:
- PostgreSQL errors
- Long-running processes

Fileserver:
- DRBD mirror status

Application:
- Number of Apache processes
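As an illustration of the DRBD mirror status check, the sketch below parses status text in the style of `/proc/drbd` from DRBD 8.x and reports any resource that is not connected and fully up to date. The talk does not say which DRBD version or monitoring tool was used, so the function name, the sample status text and the alert wording are assumptions.

```python
import re

# Matches resource lines such as:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
RESOURCE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def drbd_problems(proc_drbd_text):
    """Return one message per DRBD resource that is not fully healthy."""
    problems = []
    for line in proc_drbd_text.splitlines():
        match = RESOURCE_LINE.match(line)
        if not match:
            continue  # version header or counter lines
        minor, conn_state, _roles, disk_states = match.groups()
        if conn_state != "Connected" or disk_states != "UpToDate/UpToDate":
            problems.append(f"drbd{minor}: cs={conn_state} ds={disk_states}")
    return problems


# Hypothetical sample status text, not captured from the real systems.
HEALTHY = """version: 8.3.7 (api:88/proto:86-91)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""
DEGRADED = """version: 8.3.7 (api:88/proto:86-91)
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
"""

print(drbd_problems(HEALTHY))
print(drbd_problems(DEGRADED))
```

On a real fileserver the text would come from reading `/proc/drbd`; keeping the parser separate from the file read makes it testable without a DRBD device.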

Page 15: 0to100 in 18 months

Incidents

Outages

1 hour outage when our UPS failed

0.5 hour outage, cause unknown (disk contention?)

Page 16: 0to100 in 18 months

Incidents

Problems

- Missing database index after upgrade to 2.2.3
- DRBD replication failure
- Moodle cron issues

Page 17: 0to100 in 18 months

What is next?

Suggestions for improvements:
- Moodle cron.php
- Application functional testing
- Application monitoring
- Create our own Ubuntu repository
- Scaling up and clustering PostgreSQL

Page 18: 0to100 in 18 months

Questions?

For more information, including scripts, documentation, disaster recovery procedures, and installation instructions, please go to http://www.ualberta.ca/~dostatni/moodlemoot2013