Aaron Thul Electronic Medical Office Logistics …assets.en.oreilly.com/1/event/12/This is Your...


Aaron Thul
Electronic Medical Office Logistics (EMOL)
http://chasingnuts.com/oscon1.08.pdf

Who am I?

  Computer & Database Geek, just like you
  Formerly a SysAdmin at Autoweb Communications
    PostgreSQL Build Your Car
  Presently an IT manager at EMOL
  PostgreSQL Evangelist
  Penguicon Organizer

With PostgreSQL and other Open Source software EMOL is

  Allowing Data collection from EMRs and other sources
  Aiding in Adherence to national standards
  Providing Physician and Practice level benchmarking
  Data Brokering
  Enabling Automation of National initiatives, such as the CMS PQRI

EMOL PostgreSQL Data

  Patient Records
  Billing Records
  Lab Results
  Clinical Records
  Inventory Management
  Patient Reported Data

Metadata

  Physicians' Dictations
  Scanned Documents
  Images
    X-rays
    MRIs
    CAT Scans

Metadata Storage

  ReiserFS with tail packing
  Each practice/doctor has a folder
  Sun OpenSolaris & ZFS???
  Linux and XFS???
  NetApp WAFL???

EMOL Software Building Blocks

  Ubuntu Linux LTS (8.04)
  PostgreSQL (8.3)
  Perl (5.8.x)
  Windows Unified Data Storage Server 2003 (R2)
    Yes, Windows

EMOL Hardware Building Blocks

  HP ProCurve Switches
    Support considerably cheaper than Smartnet
  SonicWall Firewalls
    Support considerably cheaper than Smartnet
  Large number of SCSI and SATA Hard Drives
  iSCSI Servers and DAS (Direct Attached Storage) Systems

Why PostgreSQL?

  Capable
  Required Features
  Database Team Experience
  Security
  Community
    Documentation Project
    Mailing Lists
    IRC
    Events Like This!

Why Perl?

  Practical Extraction and Report Language
  Development team experienced with Perl
  Unix-centric, and available for Windows
  Text parsing and normalizing
  I know Perl is not sexy like
    INSERT 'new_popular_language' INTO languages;
  Rapid prototyping
  Weakly typed
  Interpreted, though very fast
  Supports objects

Who is Where?

  OS and PostgreSQL binaries on local disks
    RAID 1 Mirror
    15k spindle drives
    EXT3
  WAL Buffers on local disks
    RAID 1 Mirror
    15k spindle speed
    EXT2
  INDEXes
    DAS (Direct Attached Storage) Units
    RAID 6
    10k spindle speed SCSI
    EXT3
  TABLES
    Multiple iSCSI Servers on SANs
    4 x 1 Gigabit Ethernet Interfaces Bonded
    8 x 1 Terabyte SATA drives per SAN Node, RAID 6
    EXT3

Data Daily

  Loading 10 GB data daily into PostgreSQL
  Loading 10 GB metadata daily

Data Size

SELECT relname, (relpages*8)/1024 AS MB
FROM pg_class
ORDER BY relpages DESC;

  This does not account for pg_toast
  This does provide more precision
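The arithmetic in the query above can be checked outside the database. The following is a minimal Python sketch, assuming the default 8 kB PostgreSQL block size; the function names are hypothetical and do not come from the talk. It mirrors the integer division in (relpages*8)/1024, which truncates to whole megabytes, and compares it with the untruncated value.

```python
BLOCK_SIZE_KB = 8  # assumption: default PostgreSQL block size of 8 kB


def relpages_to_mb_truncated(relpages: int) -> int:
    """Mirror (relpages*8)/1024 as integer division, truncating to whole MB."""
    return (relpages * BLOCK_SIZE_KB) // 1024


def relpages_to_mb_exact(relpages: int) -> float:
    """Same conversion without truncation, for comparison."""
    return (relpages * BLOCK_SIZE_KB) / 1024


if __name__ == "__main__":
    for pages in (100, 128, 1000):
        print(pages, relpages_to_mb_truncated(pages), relpages_to_mb_exact(pages))
```

Under these assumptions, any relation smaller than 128 pages shows up as 0 MB in the truncated form. Note also that relpages is only an estimate, refreshed by VACUUM and ANALYZE.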

Data Size Really

SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(nspname || '.' || relname)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND nspname !~ '^pg_toast'
  AND pg_relation_size(nspname || '.' || relname) > 0
ORDER BY pg_relation_size(nspname || '.' || relname) DESC

How much data are we talking

  Largest Table: 1,844.73 GB
  Second Largest Table: 1,289.36 GB
  Largest Index: 411.91 GB
  Second Largest Index: 405.08 GB
  Total DB size on disk: 16,800.39 GB

Better make sure we need that INDEX

select indexrelid::regclass as index, relid::regclass as table
from pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
where idx_scan = 0 and indisunique is false;

More details at:

http://people.planetpostgresql.org/xzilla/index.php?/archives/351-Index-pruning-techniques.html

Run it twice and make it faster

  Maintain a 1/500 set of random sample data
  ALL queries hit that database first
  Only once the query succeeds there is it moved on to the production database server
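The sample-first workflow above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not EMOL's implementation: draw_sample, run_sample_first, and the two runner callables are hypothetical names, and the slide does not say how the sample is drawn or how success is judged. Here "successful" simply means the sample run raised no exception.

```python
import random
from typing import Any, Callable, Iterable, List, Optional

SAMPLE_RATE = 1 / 500  # the 1/500 ratio named on the slide


def draw_sample(rows: Iterable[Any], rate: float = SAMPLE_RATE,
                seed: Optional[int] = None) -> List[Any]:
    """Keep each row independently with probability `rate` (assumed method)."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < rate]


def run_sample_first(query: str,
                     run_on_sample: Callable[[str], Any],
                     run_on_production: Callable[[str], Any]) -> Any:
    """Run the query on the sample database first, then on production.

    If the sample run raises, the exception propagates and the
    production database is never touched.
    """
    run_on_sample(query)
    return run_on_production(query)
```

In practice the two callables would wrap connections to the sample and production servers. Bad queries fail fast on roughly 1/500 of the data instead of tying up a multi-terabyte table.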

How do I sleep at night

  First Name
  Last Names
  Social Security Numbers
  Birth Dates
    Needed to track people over time and geography

How do I sleep at night

"By default, PostgreSQL is probably the most security-aware database available..."

Database Hacker's Handbook

Protecting the Warehouse

  Simple processes that are followed
  Intrusion Prevention & Firewalls
  Security Monitoring & Management - MSSP
  Encrypted Communication
  Identity Management - Centralized management of users and groups; mitigates vulnerabilities that occur due to inconsistencies

Protecting the Warehouse

  Role-based security
  Functions every place we can
  Identity data symmetrically encrypted
  Data is anonymized in all but a few tables
  Role-based security
  All data is anonymized before it is sent out
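The slides do not say how the anonymization is done. As one possible illustration, the sketch below replaces identity fields with a keyed hash (HMAC-SHA256), so the same person maps to the same token over time without the exported data exposing the identifier. This is a swapped-in technique, not the symmetric encryption mentioned above and not necessarily what EMOL uses. Unlike encryption, it cannot be reversed, even by the key holder. The field names, the key handling, and the function names are all hypothetical.

```python
import hashlib
import hmac
from typing import Any, Dict, Iterable

# Hypothetical field names; the slides list names, SSNs, and birth dates.
IDENTITY_FIELDS = ("first_name", "last_name", "ssn", "birth_date")


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable token for `value` using HMAC-SHA256 with a secret key."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def anonymize_record(record: Dict[str, Any], key: bytes,
                     identity_fields: Iterable[str] = IDENTITY_FIELDS) -> Dict[str, Any]:
    """Copy `record`, replacing identity fields with tokens before export."""
    fields = set(identity_fields)
    return {
        name: pseudonymize(str(value), key) if name in fields else value
        for name, value in record.items()
    }
```

Because the token depends on the secret key, someone holding only the exported data cannot recompute it from a guessed name or number. The key itself would need the same protection as the identity data.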

Lessons Learned

  Server Ethernet Cards are not all made the same
  With 100+ drives be ready to RMA some disks
  You can never have too many DIMM slots
  You do get what you pay for with RAID controllers
  You can’t have too big a cache on your RAID controller

More Lessons Learned

  pg_resetxlog is not THAT scary
  You can never have too many PCI-X Slots
  Auto-vacuum is not always your friend

More Lessons Learned

  Worry when a developer says “I have an idea”
  Some mistakes are just too much fun to make only once
  I am used to hearing “It seems like you are doing something fundamentally wrong”
  Never ask for directions from a two-headed tourist!
    - Big Bird

Looking Forward

  I don’t think I need to worry about PostgreSQL scaling
  Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest
    http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyId=18&articleId=9087918&intsrc=hm_topic

Looking Forward

  GridSQL from EnterpriseDB
  Built using multiple standard PostgreSQL servers
  Open Source Project

Questions

  Web: http://www.chasingnuts.com
  Email: aaron@chasingnuts.com
  IRC: AaronThul on irc.freenode.org
  Jabber: apthul@gmail.com
  Twitter: @AaronThul
  AIM: AaronThul