Migrating the BBC website to Apache 2
By Nick Holmes, BBC New Media
Who are the BBC
What is this talk about
Migrating from Apache 1.3.x to 2.0.x
Why we moved
What benefits we achieved
Bugs/Problems we encountered
What we added in a time of change
What’s next with Apache and bbc.co.uk
Who am I?
Nick Holmes – Technical Lead
Standards Focused, Audience Driven
Mainly the HTML, mod_include and .htaccess side of Apache.
Apache Advocate
Why we moved
The Business Case
Why we moved
Public service vs Cost
1.7 billion page requests, 44 million users
Licence fee funded
Architecture
Solaris servers / POSIX threads
Threaded ‘worker’ MPM
KeepAlive & Filters
Benefits of upgrading
What was actually achieved
What was achieved
50% less server load at rollout
Techniques to further this goal
Pages more dynamic
Easier to build
Quicker to deliver
Solaris & ‘worker’ MPM
Solaris supports POSIX threads
‘Worker’ (multithreaded) MPM
10x the number of connections for 3x the memory footprint
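The ratio on this slide can be turned into a per-connection figure. The short sketch below only restates the two multipliers quoted above; it uses no measured data, and the variable names are illustrative.

```python
# Ratios quoted on the slide: ten times the connections for three times the memory.
connection_ratio = 10.0
memory_ratio = 3.0

# Memory per connection relative to the previous setup (1.0 would mean unchanged).
relative_memory_per_connection = memory_ratio / connection_ratio

print(f"Memory per connection: {relative_memory_per_connection:.0%} of the previous figure")
```

In other words, each connection would cost roughly a third of the memory it did before, if both multipliers hold.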
Chart: Memory Usage (kBytes), daily, May 2004
Chart: Servers (Total, Busy), daily, May 2004
Chart: Load Avg (15 min, 5 min), daily, May 2004
Solaris & ‘worker’ MPM
Chart: CPU Usage (% CPU; sys, user), daily, May 2004
Chart: Load Avg (15 min, 5 min), daily, May 2004
Chart: Hits (hits/s), daily, May 2004
Chart: Network out (bytes/s; Max Out, Avg Out), daily, May 2004
KeepAlive
Multiple requests, same TCP connection
Previously a DoS risk due to memory footprint (even with a 1 sec timeout)
Threading model allowed this
bbc.co.uk content
Dynamic elements built on:
mod_include
.htaccess
Proxied mod_perl
Proxied IIS servers with XSLT
Filters
Wrapping cgi output with html templates
Previously cgi script driven (BBC::parse)
Output filters allow wrapping on the way through the web servers
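The wrapping idea above can be sketched outside Apache. This is not the BBC filter or the Apache filter API, only a minimal illustration of taking script output and placing template markup around it; the template fragments and the function name are hypothetical.

```python
def wrap_output(body: str, header: str, footer: str) -> str:
    """Return the body with the template header before it and the footer after it."""
    return header + body + footer


# Hypothetical template fragments and script output, for illustration only.
header = "<html><body>\n"
footer = "</body></html>\n"
cgi_body = "<p>Generated by a script</p>\n"

page = wrap_output(cgi_body, header, footer)
print(page)
```

The point of doing this in an output filter rather than in each script is that the script only has to produce the body.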
Mod_include
PCRE* regular expressions
$0 - $9 captures
Previously used mod_rewrite
.htaccess file or the server conf
Not efficient / not maintainable
Reg Ex - example
url : http://foo.uk/bar.shtml?img46438.jpg
<!--#set var="bob" value="$QUERY_STRING" -->
<!--#if expr="$bob = /img([^.]*)\.(.*)/" -->
<!--#set var="bob_num" value="$1" -->
<!--#set var="bob_ext" value="$2" -->
<img src="<!--#echo var="bob" -->" />
<p>This is image number <!--#echo var="bob_num" -->. It is a <!--#echo var="bob_ext" --></p>
<!--#endif -->
Reg Ex - output
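The original output slide is not preserved in this transcript. As a rough stand-in, the sketch below applies the same pattern to the example query string using Python's re module instead of mod_include, so the captured values are what that pattern yields in Python and not a copy of the slide.

```python
import re

# Query string from the example URL http://foo.uk/bar.shtml?img46438.jpg
query_string = "img46438.jpg"

# Same pattern as the SSI expression: group 1 is the number, group 2 the extension.
match = re.search(r"img([^.]*)\.(.*)", query_string)

bob_num = match.group(1) if match else ""
bob_ext = match.group(2) if match else ""

print(f"This is image number {bob_num}. It is a {bob_ext}")
```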
Contact with Apache group
Resolved bugs (seg faulting)
Positive response
Inspired open source goals in business
Resulted in new modules
Coding Practices
Apache 2 is less forgiving
Who was experimenting
Who had knowledge
Standards working groups
Enabled development of new techniques
Problems & Bugs
What we had to do in order to roll out
Bugs / Problems
Seg faulting – resolved in later versions
CGI daemon dying – resolved in later versions
PDF chunking problems – resolved by
AddHandler pdf-rewrite pdf
Action pdf-rewrite /cgi-bin/byteserve.pl
String Searches
<!--#if expr="$QUERY_STRING = '/yellow/'" -->
<!--#if expr="$QUERY_STRING = /yellow/" -->
Special Character escaping
<!--#if expr="$QUERY_STRING = /colour=yellow/" -->
<!--#if expr="$QUERY_STRING = /colour\=yellow/" -->
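The difference between the quoted and unquoted forms above is, in effect, the difference between comparing against a literal string and searching with a pattern. The sketch below shows that distinction in Python; it assumes the quoted form is treated as a plain string, and the sample query string is made up.

```python
import re

query_string = "colour=yellow&size=large"  # hypothetical request query string

# Literal comparison: true only if the whole string equals "/yellow/".
literal_match = query_string == "/yellow/"

# Pattern search: true if "yellow" appears anywhere in the string.
pattern_match = re.search(r"yellow", query_string) is not None

# Escaping "=" is harmless in a Python pattern; it still matches a literal "=".
escaped_match = re.search(r"colour\=yellow", query_string) is not None

print(literal_match, pattern_match, escaped_match)
```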
Problems/Bugs
Filename matching
RewriteRule news/index.shtml /totp/news/2003/11/10/7892.shtml
RewriteRule news/ /totp/news/2003/11/10/7892.shtml
Redirect temp /totp/news /totp/news/2003/11/10/7892.shtml
Using server variables without ‘set’-ing
<!--#config timefmt="%Y/%m/%d" -->
<!--#include virtual="/foo/$DATE_GMT/fact.ssi" -->
<!--#set var="datefolder" value="${DATE_GMT}" -->
<!--#include virtual="/foo/$datefolder/fact.ssi" -->
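The include above builds a path from the current date. As a minimal sketch of the same path construction in Python, using the time format from the slide and a fixed example date chosen only for illustration:

```python
from datetime import datetime, timezone


def dated_include_path(now: datetime) -> str:
    """Build the include path for the given moment using the %Y/%m/%d layout."""
    datefolder = now.strftime("%Y/%m/%d")
    return f"/foo/{datefolder}/fact.ssi"


# Fixed example date (an assumption, not taken from the slide).
example = datetime(2003, 11, 10, tzinfo=timezone.utc)
print(dated_include_path(example))
```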
Exec cgi - replace with include virtual
Including parsed javascript – set vars outside .js
Security issues with application mime-types
Problems/Bugs
AddHandler conflicted with SetOutputFilter
Additional space in call <!-- #include
Using " inside values – use &quot;
value="Bler od Buša traži. Bler: "Istorijska bitka u Iraku". Neophodan konkretan plan za Irak"
Trailing / on file call
http://www.bbc.co.uk/bbcfour/index.shtml/
Pathinfo on ‘file’ includes – replace with
<!--#include virtual="file.ssi?a=somepathinfo&b=aQueryString" -->
While we were upgrading
Our development of modules
New Modules
Based on mod_include
Parent #func module
LoadModule in conf
New functionality or Easier for builders
Random include
Magazine style pages’ element
Snippets of code or content
Randomly Changing block
Random Include - Example
<!--#config timefmt="%S" -->
<!--#set var="rand_num" value="$DATE_GMT" -->
<!--#if expr="$rand_num > 55" --> random choice 1
<!--#elif expr="$rand_num > 50" --> random choice 2
<!--#elif expr="$rand_num > 45" --> random choice 3
---
<!--#elif expr="$rand_num > 5" --> random choice 11
<!--#else --> random choice 12
<!--#endif -->
Random include - Solution
<!--#func var="rnd" func="random" min="1" max="12" -->
<!--#include file="file${rnd}.ssi" -->
<!--#func var="rnd" func="random" value="red" value="green" value="blue" value="cyan" -->
<!--#echo var="rnd" -->
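The two uses of the random function above, a number in a range and a pick from a list of values, can be sketched in Python as follows. This is not the module's implementation; in particular, treating min and max as inclusive bounds is an assumption, and the helper names are hypothetical.

```python
import random


def random_int(minimum: int, maximum: int) -> int:
    """Pick a whole number between minimum and maximum, both inclusive (assumed)."""
    return random.randint(minimum, maximum)


def random_value(values: list) -> str:
    """Pick one entry from the supplied list of values."""
    return random.choice(values)


rnd = random_int(1, 12)
include_file = f"file{rnd}.ssi"  # mirrors file${rnd}.ssi in the slide
colour = random_value(["red", "green", "blue", "cyan"])

print(include_file, colour)
```

Compared with the seconds-based chain of conditions, one call replaces a dozen branches and the choice does not depend on when the request arrives.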
SetSplitVars - previously
RewriteEngine On
RewriteCond %{HTTP_COOKIE} BBCWEACITY=uk([0-9][0-9][0-9])
RewriteRule (.*) http://www.bbc.co.uk/$1 [E=wea_var:%1]
Could be done in Apache 2 using
<!--#if expr="$HTTP_COOKIE = /BBCWEACITY\=uk([0-9][0-9][0-9])/" -->
<!--#set var="uk_weather" value="$1" -->
<!--#endif -->
SetSplitVars - Solution
url : http://foo.uk/bar.shtml?a=14&b=28&c=94
In Apache 2
<!--#if expr="$QUERY_STRING = /a\=([^&]*)/" -->
<!--#set var="a" value="$1" -->
<!--#endif -->
<!--#if expr="$QUERY_STRING = /b\=([^&]*)/" -->
<!--#set var="b" value="$1" -->
<!--#endif -->
<!--#if expr="$QUERY_STRING = /c\=([^&]*)/" -->
<!--#set var="c" value="$1" -->
<!--#endif -->
With SetSplitVars
<!--#setsplitvars value="$QUERY_STRING" -->
Delimiter, Separator, exceptions
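To make the splitting step concrete, here is a minimal Python sketch of turning the example query string into named variables. It is not the SetSplitVars module; the function name, the default delimiter and separator, and the handling of empty pairs are all assumptions for illustration.

```python
def split_vars(value: str, delimiter: str = "&", separator: str = "=") -> dict:
    """Split 'a=1&b=2' style text into a dictionary of names and values."""
    variables = {}
    for pair in value.split(delimiter):
        if not pair:
            continue  # skip empty fragments such as a trailing delimiter
        name, _, val = pair.partition(separator)
        variables[name] = val
    return variables


# Query string from the example URL http://foo.uk/bar.shtml?a=14&b=28&c=94
print(split_vars("a=14&b=28&c=94"))
```

One call replaces a separate pattern match per variable, which is the saving the slide describes.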
Math
Apache comparison = string comparison
Now we can use =, !=, >, <, etc. as numerical comparisons
addition, subtraction (negative addition)
multiplication and division
Negative of numbers
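Why string comparison is a problem for numbers shows up with any pair of values of different lengths. The Python sketch below compares the same two values as strings and as integers; the values are chosen only for illustration.

```python
left, right = "10", "9"

# As strings, comparison goes character by character: "1" sorts before "9".
string_result = left > right

# As numbers, 10 is greater than 9.
numeric_result = int(left) > int(right)

print(string_result, numeric_result)
```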
FLastMod
Extended existing to assign to a variable
Plan to include newest file
Missed part of the process
Now use for checking file existence
…else include default file
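The check-then-fall-back pattern described above can be sketched in Python. This is not the FLastMod extension, only an illustration of choosing a file when it exists and a default otherwise; the function and file names are hypothetical.

```python
import os
import tempfile


def choose_include(candidate: str, default: str) -> str:
    """Return the candidate path if that file exists, otherwise the default."""
    return candidate if os.path.isfile(candidate) else default


# Demonstration in a temporary folder so nothing on disk is assumed.
with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "latest.ssi")
    with open(existing, "w") as handle:
        handle.write("<p>latest</p>")
    missing = os.path.join(folder, "missing.ssi")

    found = choose_include(existing, "default.ssi")
    fallback = choose_include(missing, "default.ssi")

print(found, fallback)
```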
What’s Next
Migrating the News site
Geo-IP
Dedicated image serving
Mod_deflate (gzip)
Issues
Solutions (page flattener)
Progressions (page packaging)
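For a sense of what gzip compression does to markup, the sketch below compresses a repetitive block of HTML with Python's gzip module. It says nothing about how mod_deflate is configured or about the issues mentioned above; the sample page is invented.

```python
import gzip

# Invented sample page: repetitive markup, which tends to compress well.
page = ("<p>Repeated markup compresses well.</p>\n" * 200).encode("utf-8")

compressed = gzip.compress(page)
restored = gzip.decompress(compressed)

print(len(page), len(compressed))
```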
Thank you
Thank you for your time this afternoon
Please feel free to contact me at :
Questions ??