Transcript
Page 1: Inside the PostgreSQL Project Infrastructure

Inside the PostgreSQL Project Infrastructure Slide: 1

Dave Page, 25th March 2010

Inside the PostgreSQL Project Infrastructure

Dave Page, PostgreSQL Core Team

Senior Software Architect, EnterpriseDB

Page 2: Inside the PostgreSQL Project Infrastructure

In the beginning...

Date: Tue, 23 Apr 1996 16:06:10 -0400 (EDT)
From: "Marc G. Fournier" <[email protected]>
Subject: Re: [PG95]: postgres95 TODO list posted on the web
To: Chad Robinson <[email protected]>
cc: Jolly Chen <[email protected]>, [email protected]

...

If it helps, I’d be willing to setup a cvs database, including appropriate accounts for a core few developers that patches can go through.

From there, it wouldn’t be too hard to do a weekly "distribution" that is ftpable.

I don’t know enough about the server backend to offer much more then that :(

Page 3: Inside the PostgreSQL Project Infrastructure

The first server

Hosted in Toronto, Canada

Page 4: Inside the PostgreSQL Project Infrastructure

Early services

• Mailing lists

• CVS repository

• FTP site

Page 5: Inside the PostgreSQL Project Infrastructure

14 Years later...

• 20+ Physical servers

• 35+ Virtual Machines

• Hosted in:
  – France
  – Panama City
  – Austria
  – Canada
  – USA (4 independent locations)

Page 6: Inside the PostgreSQL Project Infrastructure

Current services

• FTP site

• Website

• Source control – CVS and Git

• Mailing lists

• Wiki

• Mailing list archives

• Website/archives search

• pgFoundry

• Commitfest management server

• Buildfarm and Hudson servers

• Development servers

… and more!

Page 7: Inside the PostgreSQL Project Infrastructure

OS “zoo”

• Primarily using FreeBSD jails:
  – Easy to back up
  – Easy to relocate
  – Per-function jails

• Also running:
  – Ubuntu
  – Slackware
  – CentOS
  – Windows

Page 8: Inside the PostgreSQL Project Infrastructure

Problems

• Little FreeBSD experience in the community.

• FreeBSD Ports are hard to upgrade, especially with lots of jails.

• Hosting companies don't tend to like FreeBSD – and we can't be too picky!

• No centralised management or deployment.

Page 9: Inside the PostgreSQL Project Infrastructure

New infrastructure

• Runs Debian virtual machines under KVM on Debian hosts.

• Pre-built packages set up base hosts and VMs.

• Management system automates:
  – VM creation and configuration.
  – Addition and removal of user accounts and SSH keys.
  – Package installation and upgrades.
  – Detection of unexpected user accounts or unauthorised services.
  – Setup and configuration of Nagios and Munin monitoring.
  – Auto-backup configuration.
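
The "detection of unexpected user accounts" item can be illustrated with a small sketch. This is not the project's management system, whose implementation the slides do not describe; the names `parse_passwd` and `find_unexpected_accounts`, and the UID threshold of 1000 separating system accounts from human accounts, are assumptions made for illustration.

```python
from typing import Iterable, NamedTuple


class Account(NamedTuple):
    name: str
    uid: int


def parse_passwd(lines: Iterable[str]) -> list:
    """Parse passwd(5)-style lines (name:password:uid:...) into Account records."""
    accounts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        accounts.append(Account(name=fields[0], uid=int(fields[2])))
    return accounts


def find_unexpected_accounts(accounts, expected_names, min_uid=1000):
    """Return the sorted names of non-system accounts missing from the expected set.

    min_uid is an assumed boundary below which accounts are treated as
    system accounts and ignored.
    """
    expected = set(expected_names)
    return sorted(
        a.name for a in accounts if a.uid >= min_uid and a.name not in expected
    )
```

A central service could run a comparison like this against each VM's account list and raise an alert whenever the returned list is non-empty.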

Page 10: Inside the PostgreSQL Project Infrastructure

Monitoring

• Nagios

• Munin

• Smokeping

• Auto-backup

• Google Analytics

Page 11: Inside the PostgreSQL Project Infrastructure

Nagios

• 64 Hosts

• 514 Services

• Service checks include:
  – Service availability – NTP, SSH, HTTP, FTP, RSYNC etc.
  – Utilisation – disk usage, logged in users, processes, mail queue
  – Management – software update availability
  – “Our stuff” – buildfarm status, search indexer, database backups
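
As an illustration of the "utilisation" category, a disk usage check in the style of a Nagios plugin might look like the sketch below. Nagios plugins report their result through the exit code (0 for OK, 1 for WARNING, 2 for CRITICAL) and a one-line status message. The thresholds and function names here are assumptions, not the checks the project actually runs.

```python
import shutil

# Exit codes defined by the Nagios plugin conventions.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_usage(percent_used: float, warn: float = 80.0, crit: float = 90.0):
    """Map a disk usage percentage to an (exit_code, status_line) pair."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_disk(path: str = "/", warn: float = 80.0, crit: float = 90.0):
    """Measure usage of the filesystem containing `path` and classify it."""
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total
    return classify_usage(percent_used, warn, crit)
```

A real plugin would print the status line returned by `check_disk` and exit with the accompanying code so that Nagios can record the service state.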

Page 12: Inside the PostgreSQL Project Infrastructure

Nagios

Page 13: Inside the PostgreSQL Project Infrastructure

Munin

• Monitors resource trends:
  – Disk usage
  – Network utilisation
  – Processes
  – Sendmail/Postfix stats
  – CPU/Memory utilisation
  – Apache stats

Page 14: Inside the PostgreSQL Project Infrastructure

Munin

• CPU usage – postgresql01.managed.contegix.com

Page 15: Inside the PostgreSQL Project Infrastructure

Smokeping

• Monitors network latency to various hosts from the Conova Communications data centre in Salzburg, Austria.

Page 16: Inside the PostgreSQL Project Infrastructure

Auto-backup

• Automatically backs up changes to key configuration files to Subversion.
  – Gives us a simple backup of config files
  – Allows us to trace the history of changes to a file

• Alerts the sysadmins to changes to monitored files
  – Helps us see what the other team members are doing
  – Acts as a simple Intrusion Detection System
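
The change-detection half of this idea can be sketched as follows. The slides do not say how auto-backup decides that a file has changed, so this example assumes a simple approach: hash each monitored file, compare against the previous run, and report the differences. The function names are hypothetical, and the Subversion commit and email alert steps are deliberately left out.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Return {path: SHA-256 hex digest} for each monitored file that exists."""
    digests = {}
    for p in map(Path, paths):
        if p.is_file():
            digests[str(p)] = hashlib.sha256(p.read_bytes()).hexdigest()
    return digests


def diff_snapshots(old, new):
    """Compare two snapshots and report added, removed and modified paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return {"added": added, "removed": removed, "modified": modified}
```

Any non-empty entry in the result would be the trigger for both actions on the slide: committing the new file contents and notifying the sysadmins.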

Page 17: Inside the PostgreSQL Project Infrastructure

Google Analytics

• Monitors website utilisation.

• Helps us understand how the website is used.

• Can be hampered by disabled scripting support in browsers, common with computer geeks!

Page 18: Inside the PostgreSQL Project Infrastructure

Google Analytics

Page 19: Inside the PostgreSQL Project Infrastructure

FTP Site

• Primary site: ftp.postgresql.org

• 62 regional mirrors in 39 countries

• Mirrors may also serve content via:
  – HTTP (supported)
  – RSYNC (unsupported)

• Content includes main FTP site, and pgFoundry downloads

Page 20: Inside the PostgreSQL Project Infrastructure

FTP Mirror monitoring

• All mirrors are checked daily by the 'mirrorbot'

• The mirrorbot checks that content is up to date:

  – Fresh mirrors have a DNS hostname, e.g. ftp.uk.postgresql.org
  – Fresh mirrors are listed on the website for users to choose

• Out of date or broken mirrors:
  – Are automatically removed from the website and DNS.
  – Are reported to their maintainers via email.
  – Are automatically purged from the system if un-fixed.
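
The mirrorbot's decision logic might be modelled roughly as below. This is a sketch under stated assumptions rather than the real mirrorbot: the slides give no thresholds, so the 48-hour freshness window and 30-day purge limit are invented, as are the function names and the `example.org` hostnames. Each mirror is represented by the last time it was seen fully up to date.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds; the slides do not state the real values.
MAX_AGE = timedelta(hours=48)
PURGE_AFTER = timedelta(days=30)


def classify_mirror(last_good, now, max_age=MAX_AGE, purge_after=PURGE_AFTER):
    """Classify a mirror by how long ago it was last seen up to date."""
    age = now - last_good
    if age <= max_age:
        return "fresh"  # keep its DNS hostname and website listing
    if age <= purge_after:
        return "stale"  # remove from DNS and website, email the maintainer
    return "purge"      # un-fixed for too long: drop from the system


def plan_actions(mirrors, now):
    """Group mirror hostnames by classification.

    mirrors maps hostname -> datetime of the last successful freshness check.
    """
    plan = {"fresh": [], "stale": [], "purge": []}
    for host, last_good in sorted(mirrors.items()):
        plan[classify_mirror(last_good, now)].append(host)
    return plan
```

Running `plan_actions` once a day over the full mirror list would reproduce the daily cycle the slide describes, with each group mapped to the corresponding action.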

Page 21: Inside the PostgreSQL Project Infrastructure

Website infrastructure

• Developed following the great 8.0 Slashdotting incident.

• Capable of handling high-load scenarios on release days.

• Minimised points of failure.

Page 22: Inside the PostgreSQL Project Infrastructure

wwwmaster.postgresql.org

• Dynamic, master server.

• Runs custom-built PHP framework for:
  – Static page rendering (general content)
  – Dynamic page rendering (docs, news, events etc)
  – Form processing

• Dynamic content stored in PostgreSQL.

• Static version of content generated hourly by a spider, and pushed via RSYNC to the static servers.

• Users redirected back to static servers where possible.

Page 23: Inside the PostgreSQL Project Infrastructure

www.postgresql.org

• Static, slave servers

• Serve HTML, CSS, images and files such as PDFs.

• Currently 2 servers.

• Geographically diverse.

• Round-robin load balanced via DNS.

• Monitoring system dynamically removes servers from DNS within minutes of a failure.
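
The combination of round-robin DNS and automatic removal of failed servers can be sketched as a filter over the server pool. The slides do not describe how the monitoring system probes servers or updates DNS, so the TCP probe, the function names and the fallback policy (publish the full list if every probe fails, rather than an empty record) are all assumptions for illustration.

```python
import socket


def tcp_probe(address, port=80, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_pool(servers, is_healthy=tcp_probe, min_pool=1):
    """Return the servers that should remain in the round-robin DNS record.

    If fewer than min_pool servers pass the probe, the full list is returned
    unchanged. This is an assumed safety policy so that a monitoring fault
    cannot empty the record.
    """
    alive = [s for s in servers if is_healthy(s)]
    return alive if len(alive) >= min_pool else list(servers)
```

The returned list would then be written to the zone as the set of address records for the site, with a short TTL so that removals take effect within minutes.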

Page 24: Inside the PostgreSQL Project Infrastructure

Website problems

• PHP framework is complex, and understood by few.

• Adding new dynamic content can require significant effort to build administration pages.

• The framework includes lots of features and code we thought we needed, but then never used.

• Spider can take hours to process the entire site.

• Spidering the site is very inefficient.

Page 25: Inside the PostgreSQL Project Infrastructure

New website infrastructure

• Slimmed down and vastly simplified framework, built using Django and Python.

• Django's administration module makes it easy to add and manage content.

• Spider and static slaves will be replaced with Varnish cache servers:
  – Pages dynamically cached from wwwmaster on first request.
  – Last available content served if wwwmaster goes down.
  – Cache invalidation of individual pages or groups of pages as they change on wwwmaster.
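
The three cache behaviours listed above can be modelled in a few lines of Python. This is a conceptual sketch only: Varnish itself is configured in its own language (VCL), and none of the names below come from Varnish or from the project. The class `StaleOnErrorCache`, its `fetch` callable and the prefix-based invalidation are assumptions chosen to show the idea.

```python
class StaleOnErrorCache:
    """Cache pages on first request and serve the last good copy if the origin fails."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(path) -> body; raises if the origin is down
        self._store = {}     # path -> last successfully fetched body
        self._valid = set()  # paths whose stored copy is considered current

    def get(self, path):
        if path in self._valid:
            return self._store[path]  # cache hit, origin not contacted
        try:
            body = self._fetch(path)
        except Exception:
            if path in self._store:
                return self._store[path]  # origin down: serve last available content
            raise  # nothing cached yet, so the failure is passed on
        self._store[path] = body
        self._valid.add(path)
        return body

    def invalidate(self, prefix):
        """Mark every cached path starting with prefix as needing a refetch.

        The old body is kept as a fallback in case the origin is unavailable.
        """
        self._valid = {p for p in self._valid if not p.startswith(prefix)}
```

Invalidating by prefix covers both cases on the slide: a full path invalidates a single page, while a shorter prefix such as a section root invalidates a group of pages.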

Page 26: Inside the PostgreSQL Project Infrastructure

Questions?

Thank you.