This is Your PostgreSQL on Drugs

Aaron Thul Electronic Medical Office Logistics (EMOL) http://chasingnuts.com/oscon1.08.pdf

description

PostgreSQL is known as a powerful open source relational database with many uses. One such use is warehousing EMRs (Electronic Medical Records) from oncology practices across the country. PostgreSQL, Perl, Apache, Ubuntu Linux, and OpenBSD are each used for their strengths to deliver information to pharmaceutical companies about what their drugs are doing for individuals in real-world scenarios. Do you have a large amount of data that needs to be searchable, aggregated, and extremely secure at the same time? See many of the creative solutions we have deployed to put PostgreSQL to the task of drugs.

Transcript of This is Your PostgreSQL on Drugs

Page 1: This is Your PostgreSQL on Drugs

Aaron Thul Electronic Medical Office Logistics (EMOL) http://chasingnuts.com/oscon1.08.pdf

Page 2: This is Your PostgreSQL on Drugs

Sorry, no free samples

Page 3: This is Your PostgreSQL on Drugs

339

Page 4: This is Your PostgreSQL on Drugs

Who am I?

  Computer & Database Geek, just like you
  Former OSCON Presenter
  Presently an IT manager at EMOL
  PostgreSQL Evangelist
  Penguicon Organizer

Page 5: This is Your PostgreSQL on Drugs
Page 6: This is Your PostgreSQL on Drugs
Page 7: This is Your PostgreSQL on Drugs

With PostgreSQL and OSS, EMOL is

  Data collection from EMRs and other sources
  Aiding in Adherence to standards
  Providing Physician and Practice level benchmarking
  Data Brokering
  Enabling Automation of National initiatives
  Improving patient care

Page 8: This is Your PostgreSQL on Drugs

EMOL PostgreSQL Data

  Patient Records
  Billing Records
  Lab Results
  Clinical Records
  Inventory Management
  Patient Reported Data

Page 9: This is Your PostgreSQL on Drugs

Metadata

  Physicians' Dictations
  Scanned Documents
  Images
    X-rays
    MRIs
    CAT Scans

Page 10: This is Your PostgreSQL on Drugs

Metadata Storage

ReiserFS with tail packing
Each practice/doctor has a folder

Sun OpenSolaris & ZFS???
Linux and XFS???
NetApp WAFL???

Page 11: This is Your PostgreSQL on Drugs

EMOL Software

  Ubuntu Linux LTS (8.04)
  PostgreSQL (8.3)
  Perl (5.8.x)
  Windows Unified Data Storage Server 2003
    Yes, Windows

Page 12: This is Your PostgreSQL on Drugs

EMOL Hardware

  HP ProCurve Switches
  SonicWALL Firewalls & IDS
  Large number of SCSI and SATA Hard Drives
  iSCSI Servers and DAS

Page 13: This is Your PostgreSQL on Drugs

Why PostgreSQL?

Capable
Required Features
Database Team Experience
Security
Community
  Documentation Project
  Mailing Lists
  IRC
  Events Like This!

Page 14: This is Your PostgreSQL on Drugs

Why PostgreSQL?

Page 15: This is Your PostgreSQL on Drugs

Why Perl?

  Development team experienced with Perl
  Unix-centric, and available for Windows
  Text parsing and normalizing
  I know it, Perl is not sexy like INSERT ‘new_popular_language’ INTO languages;

Page 16: This is Your PostgreSQL on Drugs

Who is Where?

OS and PostgreSQL binaries on local disks
  RAID 1 Mirror
  15k spindle drives
  EXT3

Page 17: This is Your PostgreSQL on Drugs

Who is Where?

WAL (write-ahead log) on local disks
  RAID 1 Mirror
  15k spindle speed
  EXT2

Page 18: This is Your PostgreSQL on Drugs

Who is Where?

INDEXES
  DAS (Direct Attached Storage) Units
  RAID 6
  10k spindle speed SCSI
  EXT3

Page 19: This is Your PostgreSQL on Drugs

Who is Where?

TABLES
  Multiple iSCSI Servers on SANs
  4 x 1 Gigabit Ethernet Interfaces, Bonded
  8 x 1 Terabyte SATA drives per SAN Node, RAID 6
  EXT3
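The RAID levels on these slides trade raw disk space for redundancy, and the arithmetic is easy to check. The sketch below is a minimal illustration, not EMOL's tooling: the function names are made up for this example, and it ignores filesystem overhead and hot spares. It relies only on two general facts: a RAID 1 mirror exposes one drive's worth of space, and RAID 6 gives up two drives' worth to parity.

```python
def usable_drives(raid_level: int, drive_count: int) -> int:
    """Return how many drives' worth of space remain usable (hypothetical helper)."""
    if raid_level == 1:
        # Mirror: every drive holds the same data, so one drive's worth is usable.
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return 1
    if raid_level == 6:
        # Dual parity: two drives' worth of space is consumed by parity.
        if drive_count < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return drive_count - 2
    raise ValueError(f"unsupported RAID level: {raid_level}")


def usable_capacity_tb(raid_level: int, drive_count: int, drive_tb: float) -> float:
    """Usable capacity in the same units as drive_tb, before filesystem overhead."""
    return usable_drives(raid_level, drive_count) * drive_tb


if __name__ == "__main__":
    # SAN node from the slide: 8 x 1 TB SATA drives in RAID 6.
    print(usable_capacity_tb(6, 8, 1.0))
```

Under these assumptions, each 8-drive SAN node offers about six drives' worth of space for tables, and it can lose any two drives without losing data.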

Page 20: This is Your PostgreSQL on Drugs

Data Daily

Loading 10 GB data daily into PostgreSQL
Loading 10 GB metadata daily

Page 21: This is Your PostgreSQL on Drugs

Data Size

SELECT relname, (relpages*8)/1024 AS MB
FROM pg_class
ORDER BY relpages DESC;

Page 22: This is Your PostgreSQL on Drugs

Data Size

SELECT relname, (relpages*8)/1024 AS MB
FROM pg_class
ORDER BY relpages DESC;

This does not account for pg_toast

This does provide more precision
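The arithmetic in that query is worth spelling out. The expression multiplies a page count by 8 (kilobytes per page) and divides by 1024 to get megabytes. Because both operands are integers, PostgreSQL performs integer division and drops the fractional part. Also, relpages is an estimate that is refreshed by VACUUM and ANALYZE, not a live figure. The sketch below mirrors the calculation in Python. It assumes the default 8 kB block size, and the function names are invented for illustration.

```python
BLOCK_SIZE_KB = 8  # PostgreSQL's default block size; an assumption, custom builds can differ


def pages_to_mb_truncated(relpages: int) -> int:
    """Mirror the slide's integer expression (relpages*8)/1024."""
    return (relpages * BLOCK_SIZE_KB) // 1024


def pages_to_mb_exact(relpages: int) -> float:
    """Same conversion without discarding the fractional megabytes."""
    return (relpages * BLOCK_SIZE_KB) / 1024


if __name__ == "__main__":
    for pages in (100, 127, 128, 1000):
        print(pages, pages_to_mb_truncated(pages), pages_to_mb_exact(pages))
```

For multi-terabyte tables the truncation is negligible, but small relations can show up as 0 MB.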

Page 23: This is Your PostgreSQL on Drugs

Data Size, Really

-- Size of each relation outside the system schemas.
-- Passing C.oid instead of a rebuilt name avoids quoting problems with mixed-case names.
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND nspname !~ '^pg_toast'
  AND pg_relation_size(C.oid) > 0
ORDER BY pg_relation_size(C.oid) DESC;

Page 24: This is Your PostgreSQL on Drugs

How much data are we talking?

Largest Table: 1,844.73 GB
Second Largest Table: 1,289.36 GB

Page 25: This is Your PostgreSQL on Drugs

How much data are we talking?

Largest Index: 411.91 GB
Second Largest Index: 405.08 GB

Page 26: This is Your PostgreSQL on Drugs

How much data are we talking?

Total DB size on disk: 16,800.39 GB

Page 27: This is Your PostgreSQL on Drugs

Better make sure we need that INDEX

SELECT indexrelid::regclass AS index, relid::regclass AS table
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0 AND indisunique IS FALSE;

More details at:

http://people.planetpostgresql.org/xzilla/index.php?/archives/351-Index-pruning-techniques.html

Page 28: This is Your PostgreSQL on Drugs

Run it twice and make it faster

Maintain a 1/500 set of random sample data
ALL queries hit that database first
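One way to picture the 1/500 sample is as a coin flip per row: each row is kept with probability 1/500, and a count measured on the sample is scaled back up by 500 to estimate the full result. The sketch below is only an illustration of that idea under stated assumptions. The slides do not say how EMOL draws its sample, and the function names, the fixed seed, and the integer stand-in rows are all hypothetical.

```python
import random

SAMPLE_RATE = 500  # keep about 1 row in 500, the ratio quoted on the slide


def draw_sample(rows, rate=SAMPLE_RATE, seed=0):
    """Keep each row independently with probability 1/rate.

    A fixed seed makes the draw repeatable, which helps when comparing
    query results between runs.
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < 1.0 / rate]


def estimate_full_count(sample_count, rate=SAMPLE_RATE):
    """Scale a count measured on the sample up to the full data set."""
    return sample_count * rate


if __name__ == "__main__":
    rows = range(1_000_000)  # stand-in for primary keys of a large table
    sample = draw_sample(rows)
    print(len(sample), estimate_full_count(len(sample)))
```

Running a query against the small set first gives a quick check that it is correct and a rough idea of its result before it is pointed at the multi-terabyte tables.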

Page 29: This is Your PostgreSQL on Drugs

How do I sleep at night?

First Names
Last Names
Social Security Numbers
Birth Dates

Needed to track people over time and geography

Page 30: This is Your PostgreSQL on Drugs

How do I sleep at night?

"By default, PostgreSQL is probably the most security-aware database available..."

Database Hacker's Handbook

Page 31: This is Your PostgreSQL on Drugs

Protecting the Warehouse

  Simple processes that are followed
  Intrusion Prevention & Firewalls
  Security Monitoring & Management - MSSP
  Encrypted Communication
  Centralized management of users and groups
    mitigates vulnerabilities that occur due to inconsistencies

Page 32: This is Your PostgreSQL on Drugs

Protecting the Warehouse

  Role-based security
  SECURITY DEFINER Functions where we can
  Identity data symmetrically encrypted
  Data is anonymized in all but a few tables
  Role-based security and schemas
  All data is anonymized before it is sent out
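The tension in these slides is that identity fields are needed to follow a patient over time and geography, yet outgoing data must be anonymized. One common way to reconcile the two is a keyed pseudonym: the same person always maps to the same opaque token, and the token cannot be recomputed without the secret key. The sketch below illustrates that idea only. It uses an HMAC, which is a one-way keyed hash and not the symmetric encryption the slide mentions, and the function name, field normalization, and sample values are all assumptions made for this example.

```python
import hashlib
import hmac


def pseudonym(secret_key: bytes, first: str, last: str, ssn: str, birth_date: str) -> str:
    """Derive a stable, opaque token from identity fields (hypothetical helper)."""
    # Normalize so trivial formatting differences map to the same person.
    parts = [
        first.strip().lower(),
        last.strip().lower(),
        ssn.replace("-", "").strip(),
        birth_date.strip(),
    ]
    # Join with a separator that cannot appear in ordinary field values.
    message = "\x1f".join(parts).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-key-kept-outside-the-database"  # made-up key for the demo
    # Made-up patient; both spellings yield the same token.
    print(pseudonym(key, "Jane", "Doe", "123-45-6789", "1970-01-01"))
    print(pseudonym(key, " jane ", "DOE", "123456789", "1970-01-01"))
```

Because the token is stable, records for one patient can still be joined across practices, while the exported rows carry no name, Social Security number, or birth date.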

Page 33: This is Your PostgreSQL on Drugs

PostgreSQL scaling

Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest

Page 34: This is Your PostgreSQL on Drugs

PostgreSQL scaling

Based on a modified PostgreSQL engine, the year-old database processes 24 billion events a day, according to Waqar Hasan, vice president of engineering in Yahoo's data group.

Page 35: This is Your PostgreSQL on Drugs

PostgreSQL scaling

GridSQL from EnterpriseDB
  Built using multiple standard PostgreSQL servers
  Open Source Project

Page 36: This is Your PostgreSQL on Drugs

Lessons Learned

Server Ethernet Cards are not all made the same

With 100+ drives, be ready to RMA some disks

You can’t have too big a cache on your RAID controller

Page 37: This is Your PostgreSQL on Drugs

More Lessons Learned

pg_resetxlog is not THAT scary
Don’t ever use this!!!

You can never have too many PCI-X Slots

Auto-vacuum is not always your friend

Page 38: This is Your PostgreSQL on Drugs

More Lessons Learned

Worry when a developer says “I have an idea”

Some mistakes are just too much fun to make only once

Page 39: This is Your PostgreSQL on Drugs

More Lessons Learned

I am used to hearing “It seems like you are doing something fundamentally wrong”

Never ask for directions from a two-headed tourist!

- Big Bird

Page 40: This is Your PostgreSQL on Drugs

Questions

Web: http://www.chasingnuts.com
Email: [email protected]
IRC: AaronThul on irc.freenode.org
Jabber: [email protected]
Twitter: @AaronThul
AIM: AaronThul