Transcript
Page 1: Scaling Servers and Storage for Film Assets

Scaling Servers and Storage for Film Assets

Mike Sundy, Digital Asset System Administrator

David Baraff, Senior Animation Research Scientist

Pixar Animation Studios

Page 2: Scaling Servers and Storage for Film Assets
Page 3: Scaling Servers and Storage for Film Assets

Environment Overview Scaling Storage Scaling Servers Challenges with Scaling

Page 4: Scaling Servers and Storage for Film Assets

Environment Overview

Page 5: Scaling Servers and Storage for Film Assets

Environment

As of March 2011:
•  ~1000 Perforce users (80% of the company)
•  70 GB db.have
•  12 million p4 ops per day (on the busiest server)
•  30+ VMWare server instances
•  40 million submitted changelists (across all servers)
•  On 2009.1, but planning to upgrade to 2010.1 soon

Page 6: Scaling Servers and Storage for Film Assets

Growth & Types of Data

Pixar grew from one code server in 2007 to 90+ Perforce servers storing all types of assets:
•  art – reference and concept art; inspirational art for a film.
•  tech – show-specific data, e.g. models, textures, pipeline.
•  studio – company-wide reference libraries, e.g. animation reference, config files, a flickr-like company photo site.
•  tools – code for our central tools team; software projects.
•  dept – department-specific files, e.g. Creative Resources has “blessed” marketing images.
•  exotics – patent data, casting audio, data for live-action shorts, story gags, theme park concepts, the intern art show.

Page 7: Scaling Servers and Storage for Film Assets

Scaling Storage

Page 8: Scaling Servers and Storage for Film Assets

Storage Stats

•  115 million files in Perforce.
•  20+ TB of versioned files.

Page 9: Scaling Servers and Storage for Film Assets

Techniques to Manage Storage

•  Use the +S filetype for the majority of generated data (see the sketch below). Saved 40% of storage on Toy Story 3 (1.2 TB).
•  Work with teams to migrate versionless data out of Perforce. Saved 2 TB by moving binary scene data out.
•  De-dupe files – saved 1 million files and 1 TB.
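As a concrete (and hypothetical) illustration of the +S technique, the sketch below appends a temporary-object typemap entry from Python so that matching generated files keep only their most recent archived revision; the depot path pattern and extension are invented for the example, not Pixar's actual mapping.

$ cat add-plus-s.py
# Sketch only: append a +S typemap entry so newly added files matching the
# pattern are stored as temporary objects (older archive revisions not kept).
# The depot path pattern below is hypothetical.
import subprocess

ENTRY = "\tbinary+S //shows/*/generated/....exr"

# Read the current typemap spec form from the server.
spec = subprocess.run(["p4", "typemap", "-o"],
                      capture_output=True, text=True, check=True).stdout

if ENTRY.strip() not in spec:
    # Append the new mapping to the TypeMap field and write the spec back.
    subprocess.run(["p4", "typemap", "-i"],
                   input=spec.rstrip("\n") + "\n" + ENTRY + "\n",
                   text=True, check=True)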

Page 10: Scaling Servers and Storage for Film Assets

De-dupe Trigger Cases

•  p4 submit file1 file2 ... fileN
   p4 submit file1 file2 ... fileN    # only file2 actually modified

•  p4 submit file                     # contents: revision n
   # five seconds later: “crap!”
   p4 submit file                     # contents: revision n-1

•  p4 delete file
   p4 submit file                     # user deletes file (revision n)
   # five seconds later: “crap!”
   p4 add file
   p4 submit file                     # contents: revision n

Page 11: Scaling Servers and Storage for Film Assets

De-dupe Trigger Mechanics

[Diagram: archive files repfile.14, repfile.15, repfile.24, repfile.26, repfile.34, and repfile.38 all hold identical contents (AABBCC…), while repfile.25 holds different contents (XXYYZZ…); sequences of revisions (file#n, file#n+1, file#n+2) map onto these archive files.]

Page 12: Scaling Servers and Storage for Film Assets

De-dupe Trigger Mechanics

[Diagram, before de-dupe: repfile.24 (AABBCC…), repfile.25 (XXYYZZ…), repfile.26 (AABBCC…) – repfile.24 and repfile.26 hold identical contents.]

•  +F for all files; detect duplicates via checksums.
•  Safely discard the duplicate (a Python sketch follows the diagram):
   $ ln repfile.24 repfile.26.tmp
   $ rename repfile.26.tmp repfile.26

[Diagram, after de-dupe: repfile.26.tmp is hardlinked to repfile.24 and then renamed to repfile.26, so repfile.24 and repfile.26 share one copy of AABBCC…; repfile.25 (XXYYZZ…) is unchanged.]
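The trigger's core move, as described above, is to detect a byte-identical archive file and swap it for a hardlink. Below is a minimal Python sketch of that step; the checksum choice, the filenames, and the absence of any locking are simplifications for illustration, not the production trigger.

$ cat dedupe-sketch.py
# Sketch only: replace a newly written archive file with a hardlink to an
# existing archive file that has identical contents.
import hashlib
import os

def checksum(path, bufsize=1 << 20):
    """MD5 of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()

def dedupe(existing, duplicate):
    """If 'duplicate' matches 'existing', swap it for a hardlink atomically."""
    if checksum(existing) != checksum(duplicate):
        return False
    tmp = duplicate + ".tmp"
    os.link(existing, tmp)       # hardlink: no file data is copied
    os.rename(tmp, duplicate)    # atomic replace; readers never see a gap
    return True

# e.g. dedupe("repfile.24", "repfile.26")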

Page 13: Scaling Servers and Storage for Film Assets

Scaling Servers

Page 14: Scaling Servers and Storage for Film Assets

Scale Up vs. Scale Out

Why did we choose to scale out?
•  Shows are self-contained.
•  Performance of one depot won’t affect another.*
•  Easy to browse other depots.
•  Easier administration/downtime scheduling.
•  Fits with our workflow (e.g. no merging of art).
•  Central code server – share where it matters.

Page 15: Scaling Servers and Storage for Film Assets

Pixar Perforce Server Spec

•  VMWare ESX version 4.
•  RHEL 5 (Linux 2.6).
•  4 GB RAM.
•  50 GB “local” data volume (on EMC SAN).
•  Versioned files on Netapp GFX.
•  90 Perforce depots on a 6-node VMWare cluster – a special 2-node cluster for the “hot” tech show.
•  For more details, see our 2009 conference paper.

Page 16: Scaling Servers and Storage for Film Assets

Virtualization Benefits

•  Quick to spin up new servers.
•  Stable and fault-tolerant.
•  Easy to administer remotely.
•  Cost-effective.
•  Reduces datacenter footprint, cooling, power, etc.

Page 17: Scaling Servers and Storage for Film Assets

Reduce Dependencies

•  Clone all servers from a VM template.
•  RHEL vs. Fedora.
•  Reduce triggers to a minimum.
•  Default tables, p4d startup options.
•  Versioned files stored on NFS.
•  VM on a cluster.
•  Can build a new VM quickly if one ever dies.

Page 18: Scaling Servers and Storage for Film Assets

Virtualization Gotchas

•  Had severe performance problems when one datastore grew to over 90% full.
•  Requires some jockeying to ensure load stays balanced across multiple nodes – manual vs. auto.
•  Physical host performance issues can cause cross-depot issues.

Page 19: Scaling Servers and Storage for Film Assets

Speed of Virtual Perforce Servers

•  Used the Perforce Benchmark Results Database tools.
•  Virtualized servers achieved 95% of the performance of physical servers on the branchsubmit benchmark.
•  85% of physical performance on the browse benchmark (not as critical to us).
•  VMWare flexibility outweighed the minor performance hit.

Page 20: Scaling Servers and Storage for Film Assets

Quick Server Setup

•  Critical to be able to spin up new servers quickly.
•  Went from 2-3 days of setup to 1 hour.

1-hour setup:
•  Clone a p4 template VM (30 minutes).
•  Prep the VM (15 minutes).
•  Run the “squire” script to build out the p4 instance (8 seconds).
•  Validate and test (15 minutes).

Page 21: Scaling Servers and Storage for Film Assets

Squire

A script which automates p4 server setup (a rough sketch follows this list). It sets up:
•  p4 binaries
•  metadata tables (protect/triggers/typemap/counters)
•  cron jobs (checkpoint/journal/verify)
•  monitoring
•  permissions (filesystem and p4)
•  init.d startup script
•  linkatron namespace
•  pipeline integration (for tech depots)
•  config files
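To make the automation concrete, here is a minimal sketch of the kind of step such a setup script performs: seeding a fresh server's metadata tables over the wire. The port, user names, and table contents are illustrative assumptions, not Pixar's actual defaults.

$ cat squire-sketch.py
# Sketch only: seed a brand-new p4 server with a default protections table
# and a bookkeeping counter. All values below are made up for illustration.
import subprocess

P4PORT = "newdepot:1666"   # hypothetical address of the freshly cloned VM

DEFAULT_PROTECT = """Protections:
\twrite user * * //...
\tsuper user p4admin * //...
"""

def p4(args, stdin=None):
    """Run a p4 command against the new server and return its stdout."""
    return subprocess.run(["p4", "-p", P4PORT] + args,
                          input=stdin, capture_output=True,
                          text=True, check=True).stdout

def seed_tables():
    p4(["protect", "-i"], stdin=DEFAULT_PROTECT)   # load the permissions table
    p4(["counter", "squire-version", "1"])         # record that setup ran

if __name__ == "__main__":
    seed_tables()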

Page 22: Scaling Servers and Storage for Film Assets

Superp4

A script for managing p4 metadata tables across multiple servers.
•  Preferable to hand-editing 90 tables.
•  Database-driven (i.e. a list of depots).
•  Scopable by depot domain (art, tech, etc.).
•  Rollback functionality.

Page 23: Scaling Servers and Storage for Film Assets

Superp4 example

$ cd /usr/anim/ts3
$ p4 triggers -o
Triggers:
	noHost form-out client "removeHost.py %formfile%"

$ cat fix-noHost.py
def modify(data, depot):
    return [line.replace("noHost form-out",
                         "noHost form-in")
            for line in data]

$ superp4 -table triggers -script fix-noHost.py -diff
•  Copies the triggers table to a restore dir.
•  Runs fix-noHost.py to produce a new triggers table for each depot.
•  Shows me a diff of the above.
•  Asks for confirmation; finally, modifies the triggers table on each depot.
•  Tells me where the restore dir is!
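Roughly, the core of such a tool might look like the sketch below: fetch the table from each depot, save a copy for rollback, run the user-supplied modify() over it, show a diff, and write it back after confirmation. The server list, invocation details, and file layout are assumptions for illustration, not the real superp4.

$ cat superp4-sketch.py
# Sketch of the superp4 idea: apply modify(data, depot) to one metadata
# table on many depots, keeping a restore copy. Not the real script.
import difflib
import subprocess
import sys

DEPOTS = {"ts3": "ts3:1666", "art": "art:1666"}   # hypothetical depot -> P4PORT

def fetch_table(port, table):
    out = subprocess.run(["p4", "-p", port, table, "-o"],
                         capture_output=True, text=True, check=True).stdout
    return out.splitlines(keepends=True)

def write_table(port, table, lines):
    subprocess.run(["p4", "-p", port, table, "-i"],
                   input="".join(lines), text=True, check=True)

def run(table, script, restore_dir):
    # Load the user's script; it must define modify(data, depot).
    namespace = {}
    exec(open(script).read(), namespace)
    modify = namespace["modify"]

    for depot, port in DEPOTS.items():
        old = fetch_table(port, table)
        with open(f"{restore_dir}/{depot}.{table}", "w") as f:
            f.writelines(old)                     # saved copy for rollback
        new = modify(old, depot)
        sys.stdout.writelines(difflib.unified_diff(
            old, new, f"{depot} {table} (current)", f"{depot} {table} (new)"))
        if input(f"Apply to {depot}? [y/N] ").lower() == "y":
            write_table(port, table, new)

# e.g. run("triggers", "fix-noHost.py", "/tmp/superp4-restore")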

Page 24: Scaling Servers and Storage for Film Assets

Superp4 options

$ superp4 -help
  -n                          Don't actually modify data.
  -diff                       Show diffs for each depot using xdiff.
  -category category          Pick depots by category (art, tech, etc.).
  -units unit1 unit2 ...      Specify an explicit depot list (regexps allowed).
  -script script              Python file to be execfile()'d; must define a function named modify().
  -table tableType            Table to operate on (triggers, typemap, ...).
  -configFile configFile      Config file to modify (e.g. admin/values-config).
  -outDir outDir              Directory to store working files, and for restoral.
  -restoreDir restoreDir      Directory previously produced by running superp4, for when you screw up.

Page 25: Scaling Servers and Storage for Film Assets

Challenges With Scaling

Page 26: Scaling Servers and Storage for Film Assets

Gotchas

•  //spec/client filled up.
•  User-written triggers were sub-optimal.
•  “Shadow files” consumed server space.
•  Monitoring is difficult – cue templaRX and mayday.
•  Cap renderfarm ops.
•  Beware of automated tests and clueless GUIs.
•  verify can be dangerous to your health (cross-depot).

Page 27: Scaling Servers and Storage for Film Assets

Summary

•  Perforce scales well for large amounts of binary data.
•  Virtualization = fast and cost-effective server setup.
•  Use the +S filetype and de-duping to reduce storage usage.

Page 28: Scaling Servers and Storage for Film Assets

Q & A

Questions?