Migrating the BBC website to Apache 2, by Nick Holmes, BBC New Media


Page 1: Migrating the BBC website to Apache 2 (people.apache.org/~jim/ApacheCons/ApacheCon2004/pdf/TU17.pdf)

Migrating the BBC website to Apache 2

By Nick Holmes BBC New Media

Page 2

Who are the BBC

Page 3

What is this talk about

Migrating from Apache 1.3.x to 2.0.x

Why we moved

What benefits we achieved

Bugs/Problems we encountered

What we added in a time of change

What’s next with Apache and bbc.co.uk

Page 4

Who am I?

Nick Holmes – Technical Lead

Standards Focused, Audience Driven

Mainly the HTML, mod_include and .htaccess side of Apache.

Apache Advocate

Page 5

Why we moved

The Business Case

Page 6

Why we moved

Public service vs Cost

1.7 billion page requests, 44 million users

Licence fee funded

Architecture

Solaris servers / POSIX threads

Threaded ‘worker’ MPM

KeepAlive & Filters

Page 7

Benefits of upgrading

What was actually achieved

Page 8

What was achieved

50% less server load at rollout

Techniques to further this goal

Pages more dynamic

Easier to build

Quicker to deliver

Page 9

Solaris & ‘worker’ MPM

Solaris supports POSIX threads

‘Worker’ (multithreaded) MPM

10x the number of connections for 3x the memory footprint

[Chart: Memory Usage (kBytes), May 2004]

[Chart: Servers (Total, Busy), May 2004]

[Chart: Load Avg (Loadavg 15min, Loadavg 5min), May 2004]
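
As a rough illustration of what enabling the threaded 'worker' MPM looks like in an Apache 2.0 configuration, here is a minimal sketch. The values are the stock defaults shipped with Apache 2.0, not the BBC's production settings, which the slides do not give.

```apache
# Sketch only: stock Apache 2.0 defaults, not the BBC's actual tuning.
# Applies only when httpd was built with the worker MPM.
<IfModule worker.c>
    # Child processes started at launch.
    StartServers          2
    # Upper limit on simultaneous connections (threads across all children).
    MaxClients          150
    # Idle-thread band the server tries to stay within.
    MinSpareThreads      25
    MaxSpareThreads      75
    # Threads served by each child process.
    ThreadsPerChild      25
    # 0 means a child is never recycled on request count.
    MaxRequestsPerChild   0
</IfModule>
```

Because each connection is handled by a thread rather than a whole process, raising MaxClients costs far less memory than it did under the Apache 1.3 process model.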

Page 10

Solaris & ‘worker’ MPM

[Chart: CPU Usage (% CPU; cpu sys, cpu user), May 2004]

[Chart: Load Avg (Loadavg 15min, Loadavg 5min), May 2004]

[Chart: Hits (Hits/s), May 2004]

[Chart: Network out (Bytes/s; Max Out, Avg Out), May 2004]

Page 11

KeepAlive

Multiple requests, same TCP connection

DoS risk due to memory footprint (even at a 1 sec timeout)

Threading model allowed KeepAlive to be used
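
For reference, KeepAlive is controlled by three core directives. The sketch below uses illustrative values (the 1-second timeout echoes the figure on the slide); the BBC's real settings are not stated.

```apache
# Sketch only: illustrative values, not the BBC's configuration.
# Allow several requests over one TCP connection.
KeepAlive On
# Cap the number of requests served on a single connection.
MaxKeepAliveRequests 100
# Seconds to wait for the next request before closing the connection.
KeepAliveTimeout 1
```

With a threaded MPM, an idle kept-alive connection ties up one thread rather than one full process, which is what makes the memory cost tolerable.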

Page 12

bbc.co.uk content

Page 13

bbc.co.uk content

Dynamic elements built on:

mod_include

.htaccess

Proxied mod_perl

Proxied IIS servers with XSLT

Page 14

Filters

Wrapping cgi output with html templates

Previously cgi script driven (BBC::parse)

Output filters allow wrapping on the way through the web servers
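
One way this can be set up in Apache 2 is to pass CGI output through mod_include's INCLUDES output filter, so a script can emit SSI directives that pull in shared templates. This is a minimal sketch; the /cgi-bin/ location is a hypothetical example and the BBC's actual configuration is not shown in the slides.

```apache
# Sketch only: hypothetical location, not the BBC's real layout.
<Location "/cgi-bin/">
    # Permit SSI processing for responses from this location.
    Options +Includes
    # Run every response through mod_include on its way out.
    SetOutputFilter INCLUDES
</Location>
```

A script under that location could then print an include directive for a header or footer template instead of assembling the page wrapper itself.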

Page 15

Mod_include

PCRE (Perl Compatible Regular Expressions)

$0 - $9 captures

Previously used mod_rewrite

.htaccess file or the server conf

Not efficient / not maintainable

Page 16

Reg Ex - example

url: http://foo.uk/bar.shtml?img46438.jpg

<!--#set var="bob" value="$QUERY_STRING" -->

<!--#if expr="$bob = /img([^.]*)\.(.*)/" -->

<!--#set var="bob_num" value="$1" -->

<!--#set var="bob_ext" value="$2" -->

<img src="<!--#echo var="bob" -->" />

<p>This is image number <!--#echo var="bob_num" -->. It is a <!--#echo var="bob_ext" --></p>

<!--#endif -->

Page 17

Reg Ex - output

Page 18

Contact with Apache group

Resolved bugs (seg faulting)

Positive response

Inspired open source goals in business

Resulted in new modules

Page 19

Coding Practices

Apache 2 is less forgiving

Who was experimenting

Who had knowledge

Standards working groups

Enabled development of new techniques

Page 20

Problems & Bugs

What we had to do in order to roll out

Page 21

Bugs / Problems

Seg Faulting – resolved in later versions

Cgi daemon dying – resolved in later versions

PDF chunking problems – resolved by

AddHandler pdf-rewrite pdf

Action pdf-rewrite /cgi-bin/byteserve.pl

String Searches

<!--#if expr="$QUERY_STRING = '/yellow/'" -->

<!--#if expr="$QUERY_STRING = /yellow/" -->

Special Character escaping

<!--#if expr="$QUERY_STRING = /colour=yellow/" -->

<!--#if expr="$QUERY_STRING = /colour\=yellow/" -->

Page 22

Problems/Bugs

Filename matching

RewriteRule news/index.shtml /totp/news/2003/11/10/7892.shtml

RewriteRule news/ /totp/news/2003/11/10/7892.shtml

Redirect temp /totp/news /totp/news/2003/11/10/7892.shtml

Using server variables without ‘set’-ing

<!--#config timefmt="%Y/%m/%d" -->

<!--#include virtual="/foo/$DATE_GMT/fact.ssi" -->

<!--#set var="datefolder" value="${DATE_GMT}" -->

<!--#include virtual="/foo/$datefolder/fact.ssi" -->

Exec cgi - replace with include virtual

Including parsed javascript – set vars outside .js

Security issues with application mime-types

Page 23

Problems/Bugs

AddHandler conflicted with SetOutputFilter

Additional space in call <!-- #include

Using “ inside values – use &quot;

value="Bler od Buša traži. Bler: "Istorijska bitka u Iraku". Neophodan konkretan plan za Irak"

Trailing / on file call

http://www.bbc.co.uk/bbcfour/index.shtml/

Pathinfo on ‘file’ includes – replace with

<!--#include virtual="file.ssi?a=somepathinfo&b=aQueryString" -->

Page 24

While we were upgrading

Our development of modules

Page 25

New Modules

Based on mod_include

Parent #func module

LoadModule in conf

New functionality or Easier for builders

Page 26

Random include

Magazine style pages’ element

Snippets of code or content

Randomly Changing block

Page 27

Random Include - Example

<!--#config timefmt="%S" -->

<!--#set var="rand_num" value="$DATE_GMT" -->

<!--#if expr="$rand_num > 55" --> random choice 1

<!--#elif expr="$rand_num > 50" --> random choice 2

<!--#elif expr="$rand_num > 45" --> random choice 3

---

<!--#elif expr="$rand_num > 5" --> random choice 11

<!--#else --> random choice 12

<!--#endif -->

Page 28

Random include - Solution

<!--#func var="rnd" func="random" min="1" max="12" -->

<!--#include file="file${rnd}.ssi" -->

<!--#func var="rnd" func="random" value="red" value="green" value="blue" value="cyan" -->

<!--#echo var="rnd" -->

Page 29

SetSplitVars - previously

RewriteEngine On

RewriteCond %{HTTP_COOKIE} BBCWEACITY=uk([0-9][0-9][0-9])

RewriteRule (.*) http://www.bbc.co.uk/$1 [E=wea_var:%1]

Could be done in Apache 2 using

<!--#if expr="$HTTP_COOKIE = /BBCWEACITY:uk([0-9][0-9][0-9])/" -->

<!--#set var="uk_weather" value="$1" -->

<!--#endif -->

Page 30

SetSplitVars - Solution

url: http://foo.uk/bar.shtml?a=14&b=28&c=94

In Apache 2

<!--#if expr="$QUERY_STRING = /a\=([^&]*)/" -->

<!--#set var="a" value="$1" -->

<!--#endif -->

<!--#if expr="$QUERY_STRING = /b\=([^&]*)/" -->

<!--#set var="b" value="$1" -->

<!--#endif -->

<!--#if expr="$QUERY_STRING = /c\=([^&]*)/" -->

<!--#set var="c" value="$1" -->

<!--#endif -->

With SetSplitVars

<!--#setsplitvars value="$QUERY_STRING" -->

Delimiter, Separator, exceptions

Page 31

Math

Apache comparison = string comparison

Now we can use: =, !=, >, <, etc. as numerical comparisons

addition, subtraction (negative addition)

multiplication and division

Negative of numbers

Page 32

FLastMod

Extended the existing flastmod to assign to a variable

Plan to include newest file

Missed part of the process

Now use for checking file existence

…else include default file

Page 33

What’s Next

Migrating the News site

Geo-IP

Dedicated image serving

Mod_deflate (gzip)

Issues

Solutions (page flattener)

Progressions (page packaging)
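
As a point of reference for the mod_deflate item, a basic setup in Apache 2.0 compresses selected content types on the way out. This is a minimal sketch with an illustrative list of types; it does not reflect any configuration the BBC had chosen, and the issues the slide alludes to are not addressed here.

```apache
# Sketch only: illustrative MIME types, not a BBC configuration.
# Load the compression module (path is relative to ServerRoot).
LoadModule deflate_module modules/mod_deflate.so
# Compress responses of these types with the DEFLATE output filter.
AddOutputFilterByType DEFLATE text/html text/plain text/xml
```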

Page 34

Thank you

Thank you for your time this afternoon

Please feel free to contact me at :

[email protected]

Questions?