Migrating the BBC website to Apache...

Post on 10-Mar-2021

Migrating the BBC website to Apache 2

By Nick Holmes BBC New Media

Who are the BBC

What is this talk about

Migrating from Apache 1.3.x to 2.0.x

Why we moved

What benefits we achieved

Bugs/Problems we encountered

What we added in a time of change

What’s next with Apache and bbc.co.uk

Who am I?

Nick Holmes – Technical Lead

Standards Focused, Audience Driven

Mainly the HTML, mod_include and .htaccess side of Apache.

Apache Advocate

Why we moved

The Business Case

Why we moved

Public service vs Cost

1.7 billion page requests, 44 million users

Licence fee funded

Architecture

Solaris servers / POSIX threads

Threaded ‘worker’ MPM

KeepAlive & Filters

Benefits of upgrading

What was actually achieved

What was achieved

50% less server load at rollout

Techniques to further this goal

Pages more dynamic

Easier to build

Quicker to deliver

Solaris & ‘worker’ MPM

Solaris supports POSIX threads

‘Worker’ (multithreaded) MPM

10x the number of connections for 3x the memory footprint
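
As a rough check on that trade-off, the ratio can be worked through directly. This is a minimal sketch that uses only the two factors quoted on the slide (10x connections, 3x memory); the function name is illustrative and not part of any Apache tooling.

```python
def relative_memory_per_connection(connection_factor: float, memory_factor: float) -> float:
    """Return memory per connection after the change, relative to before."""
    return memory_factor / connection_factor


# Factors taken from the slide: 10x the connections for 3x the memory.
ratio = relative_memory_per_connection(connection_factor=10, memory_factor=3)
print(f"Each connection costs about {ratio:.0%} of what it did before")
```

In other words, under these figures each connection needs roughly a third of the memory it needed before.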

Chart: Memory Usage (kBytes) by day, May 2004

Chart: Servers by day, May 2004 (series: Total, Busy)

Chart: Load Avg by day, May 2004 (series: Loadavg 15min, Loadavg 5min)

Solaris & ‘worker’ MPM

Chart: CPU Usage (% CPU) by day, May 2004 (series: cpu sys, cpu user)

Chart: Load Avg by day, May 2004 (series: Loadavg 15min, Loadavg 5min)

Chart: Hits (Hits/s) by day, May 2004

Chart: Network out (Byte/s) by day, May 2004 (series: Max Out, Avg Out)

KeepAlive

Multiple requests, same TCP connection

Previously risked DoS due to memory footprint (even with a timeout of 1 sec)

Threading model allowed this

bbc.co.uk content

Dynamic elements built on:

mod_include

.htaccess

Proxied mod_perl

Proxied IIS servers with XSLT

Filters

Wrapping cgi output with html templates

Previously cgi script driven (BBC::parse)

Output filters allow wrapping on the way through the web servers
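
The idea of wrapping output as it streams through can be sketched outside Apache. The Python below is an illustration only, not the Apache filter API: `wrap_output`, the header and the footer are hypothetical names chosen for the example.

```python
from typing import Iterable, Iterator


def wrap_output(chunks: Iterable[str], header: str, footer: str) -> Iterator[str]:
    """Pass CGI output through unchanged, adding a template before and after it."""
    yield header          # top of the HTML template
    for chunk in chunks:  # the CGI output, streamed chunk by chunk
        yield chunk
    yield footer          # bottom of the HTML template


cgi_output = ["<p>first chunk</p>", "<p>second chunk</p>"]
page = "".join(wrap_output(cgi_output, "<html><body>", "</body></html>"))
print(page)
```

The CGI script no longer needs to know about the template; the wrapping happens on the way out.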

Mod_include

PCRE* regular expressions

$0 - $9 captures

Previously used mod_rewrite

.htaccess file or the server conf

Not efficient / not maintainable

Reg Ex - example

url : http://foo.uk/bar.shtml?img46438.jpg

<!--#set var="bob" value="$QUERY_STRING" -->

<!--#if expr="$bob = /img([^.]*)\.(.*)/" -->

<!--#set var="bob_num" value="$1" -->

<!--#set var="bob_ext" value="$2" -->

<img src="<!--#echo var="bob" -->" />

<p>This is image number <!--#echo var="bob_num" -->. It is a <!--#echo var="bob_ext" --></p>

<!--#endif -->
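
For readers more used to a scripting language, the same capture logic can be sketched in Python. The pattern and the variable names `bob`, `bob_num` and `bob_ext` come from the slide; the function name is an assumption made for the example.

```python
import re
from typing import Dict, Optional


def parse_image_query(query_string: str) -> Optional[Dict[str, str]]:
    """Apply the slide's pattern and return the values the SSI example sets."""
    match = re.search(r"img([^.]*)\.(.*)", query_string)
    if match is None:
        return None  # corresponds to the #if test failing
    return {
        "bob": query_string,
        "bob_num": match.group(1),  # $1 in the SSI example
        "bob_ext": match.group(2),  # $2 in the SSI example
    }


print(parse_image_query("img46438.jpg"))
```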

Reg Ex - output

Contact with Apache group

Resolved bugs (seg faulting)

Positive response

Inspired open source goals in business

Resulted in new modules

Coding Practices

Apache 2 is less forgiving

Who was experimenting

Who had knowledge

Standards working groups

Enabled development of new techniques

Problems & Bugs

What we had to do in order to roll out

Bugs / Problems

Seg Faulting – resolved in later versions

Cgi daemon dying – resolved in later versions

PDF chunking problems – resolved by

AddHandler pdf-rewrite pdf

Action pdf-rewrite /cgi-bin/byteserve.pl

String Searches

<!--#if expr="$QUERY_STRING = '/yellow/'" -->

<!--#if expr="$QUERY_STRING = /yellow/" -->

Special Character escaping

<!--#if expr="$QUERY_STRING = /colour=yellow/" -->

<!--#if expr="$QUERY_STRING = /colour\=yellow/" -->

Problems/Bugs

Filename matching

RewriteRule news/index.shtml /totp/news/2003/11/10/7892.shtml

RewriteRule news/ /totp/news/2003/11/10/7892.shtml

Redirect temp /totp/news /totp/news/2003/11/10/7892.shtml

Using server variables without ‘set’-ing

<!--#config timefmt="%Y/%m/%d" -->

<!--#include virtual="/foo/$DATE_GMT/fact.ssi" -->

<!--#set var="datefolder" value="${DATE_GMT}" -->

<!--#include virtual="/foo/$datefolder/fact.ssi" -->
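
To make the resulting include path concrete, here is a small Python sketch of the same date-folder construction. The `%Y/%m/%d` format and the `/foo/.../fact.ssi` layout come from the slide; the function name and the sample date are assumptions.

```python
from datetime import datetime, timezone


def dated_include_path(now: datetime, base: str = "/foo", name: str = "fact.ssi") -> str:
    """Build the include path from a GMT date, as the timefmt/DATE_GMT pair does."""
    datefolder = now.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"{base}/{datefolder}/{name}"


# Sample date chosen for illustration only.
print(dated_include_path(datetime(2004, 5, 17, 12, 0, tzinfo=timezone.utc)))
```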

Exec cgi - replace with include virtual

Including parsed javascript – set vars outside .js

Security issues with application mime-types

Problems/Bugs

AddHandler conflicted with SetOutputFilter

Additional space in call <!-- #include

Using “ inside values – use &quot;

value="Bler od Buša traži. Bler: "Istorijska bitka u Iraku". Neophodan konkretan plan za Irak"

Trailing / on file call

http://www.bbc.co.uk/bbcfour/index.shtml/

Pathinfo on ‘file’ includes – replace with

<!--#include virtual="file.ssi?a=somepathinfo&b=aQueryString" -->

While we were upgrading

Our development of modules

New Modules

Based on mod_include

Parent #func module

LoadModule in conf

New functionality or Easier for builders

Random include

Magazine style pages’ element

Snippets of code or content

Randomly Changing block

Random Include - Example

<!--#config timefmt="%S" -->

<!--#set var="rand_num" value="$DATE_GMT" -->

<!--#if expr="$rand_num > 55" --> random choice 1

<!--#elif expr="$rand_num > 50" --> random choice 2

<!--#elif expr="$rand_num > 45" --> random choice 3

---

<!--#elif expr="$rand_num > 5" --> random choice 11

<!--#else --> random choice 12

<!--#endif -->
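
One way to see the weakness of the seconds-based approach is to count how often each choice comes up over a full minute. The sketch below assumes the elided branches continue in steps of 5 (40, 35, ... 10), which the slide does not show; the function names are illustrative.

```python
from collections import Counter

# Thresholds 55, 50, ..., 5. The middle values are an assumption: the slide
# elides them, and this sketch supposes they keep stepping down by 5.
THRESHOLDS = list(range(55, 0, -5))


def choice_for_second(second: int) -> int:
    """Return which 'random choice' the if/elif chain picks for a given second."""
    for index, threshold in enumerate(THRESHOLDS, start=1):
        if second > threshold:
            return index
    return len(THRESHOLDS) + 1  # the final #else branch


def distribution() -> Counter:
    """Count how many of the 60 possible seconds map to each choice."""
    return Counter(choice_for_second(second) for second in range(60))


for choice, seconds in sorted(distribution().items()):
    print(f"choice {choice}: {seconds} of 60 seconds")
```

Under that assumption the first choice covers 4 seconds, the last covers 6 and the rest cover 5 each, so the selection is tied to the clock and slightly uneven.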

Random include - Solution

<!--#func var="rnd" func="random" min="1" max="12" -->

<!--#include file="file${rnd}.ssi" -->

<!--#func var="rnd" func="random" value="red" value="green" value="blue" value="cyan" -->

<!--#echo var="rnd" -->
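
The behaviour the two calls appear to describe (a random integer in a range, or a random pick from listed values) can be sketched in Python. This is not the module's implementation, which the slides do not show; `ssi_random` and its parameters are hypothetical.

```python
import random
from typing import Optional, Sequence


def ssi_random(minimum: Optional[int] = None,
               maximum: Optional[int] = None,
               values: Sequence[str] = ()) -> str:
    """Return a random listed value, or a random integer in [minimum, maximum]."""
    if values:
        return random.choice(list(values))
    if minimum is None or maximum is None:
        raise ValueError("give either values or both minimum and maximum")
    return str(random.randint(minimum, maximum))  # inclusive at both ends


rnd = ssi_random(minimum=1, maximum=12)
print(f"file{rnd}.ssi")
print(ssi_random(values=["red", "green", "blue", "cyan"]))
```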

SetSplitVars - previously

RewriteEngine On

RewriteCond %{HTTP_COOKIE} BBCWEACITY=uk([0-9][0-9][0-9])

RewriteRule (.*) http://www.bbc.co.uk/$1 [env=wea_var:%1]

Could be done in Apache 2 using

<!--#if expr="$HTTP_COOKIE = /BBCWEACITY\=uk([0-9][0-9][0-9])/" -->

<!--#set var="uk_weather" value="$1" -->

<!--#endif -->

SetSplitVars - Solution

url : http://foo.uk/bar.shtml?a=14&b=28&c=94

In Apache 2

<!--#if expr="$QUERY_STRING = /a\=([^&]*)/" -->

<!--#set var="a" value="$1" -->

<!--#endif -->

<!--#if expr="$QUERY_STRING = /b\=([^&]*)/" -->

<!--#set var="b" value="$1" -->

<!--#endif -->

<!--#if expr="$QUERY_STRING = /c\=([^&]*)/" -->

<!--#set var="c" value="$1" -->

<!--#endif -->

With SetSplitVars

<!--#setsplitvars value="$QUERY_STRING" -->

Delimiter, Separator, exceptions
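
What a single split-into-variables call saves can be sketched in Python. The slides do not show how the module is implemented or which character it calls the delimiter and which the separator, so the parameter names below are assumptions.

```python
from typing import Dict


def split_vars(value: str, pair_separator: str = "&",
               key_value_separator: str = "=") -> Dict[str, str]:
    """Split a string such as a query string into name/value variables."""
    variables: Dict[str, str] = {}
    for pair in value.split(pair_separator):
        if not pair:
            continue  # skip empty segments, e.g. from a trailing '&'
        name, _, val = pair.partition(key_value_separator)
        variables[name] = val
    return variables


print(split_vars("a=14&b=28&c=94"))
```

One call replaces the three regular-expression tests, and the two separator arguments let the same routine handle other formats.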

Math

Apache comparison = string comparison

Now we can use =, !=, >, <, etc. as numerical comparisons

addition, subtraction (negative addition)

multiplication and division

Negative of numbers
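
Why numerical comparison matters is easiest to see with values of different lengths. The Python sketch below contrasts the two kinds of comparison; it illustrates the general pitfall and is not code from the module.

```python
def string_greater(a: str, b: str) -> bool:
    """Compare character by character, as a plain string comparison does."""
    return a > b


def numeric_greater(a: str, b: str) -> bool:
    """Convert to numbers first, then compare."""
    return float(a) > float(b)


print(string_greater("9", "10"))   # True: "9" sorts after "1"
print(numeric_greater("9", "10"))  # False: 9 is less than 10
```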

FLastMod

Extended existing to assign to a variable

Plan to include newest file

Missed part of the process

Now use for checking file existence

…else include default file
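
The existence check with a default can be sketched as follows. This shows the pattern in Python rather than the FLastMod extension itself; the function and file names are hypothetical.

```python
import os
import tempfile


def choose_include(candidate: str, default: str) -> str:
    """Return the candidate file if it exists, otherwise the default file."""
    return candidate if os.path.isfile(candidate) else default


# Demonstration with a temporary directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    present = os.path.join(folder, "today.ssi")
    with open(present, "w") as handle:
        handle.write("today's content")
    missing = os.path.join(folder, "tomorrow.ssi")
    print(os.path.basename(choose_include(present, "default.ssi")))
    print(os.path.basename(choose_include(missing, "default.ssi")))
```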

What’s Next

Migrating the News site

Geo-IP

Dedicated image serving

Mod_deflate (gzip)

Issues

Solutions (page flattener)

Progressions (page packaging)
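
The potential gain from gzip compression on repetitive markup can be estimated with Python's standard library. This is a sketch only: the sample page is invented, and real savings depend on the content and on how mod_deflate is configured.

```python
import gzip


def compressed_size(page: bytes) -> int:
    """Return the size in bytes of the page after gzip compression."""
    return len(gzip.compress(page))


# Invented sample: repetitive list markup, which compresses well.
page = b'<li class="item">headline</li>\n' * 200
print(f"original: {len(page)} bytes, gzipped: {compressed_size(page)} bytes")
```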

Thank you

Thank you for your time this afternoon

Please feel free to contact me at :

nick.holmes@bbc.co.uk

Questions?