Carnegie Mellon
Andrew System E-mail Architecture at Carnegie Mellon University
Rob Siemborski, Walter Wong
[email protected] (@andrew.cmu.edu), [email protected] (@cmu.edu)
Computing Services, Carnegie Mellon University
5000 Forbes Ave, Pittsburgh, PA 15213
Last Revision: 01/27/2004 (wcw)

Presentation Overview
  History & Goals
  The Big Picture
  Mail Transfer Agents
  Mail Processing (Spam & Virus Detection)
  The Directory
  The Cyrus IMAP Aggregator
  Clients and Andrew Webmail
  Current Andrew Hardware Configuration
  Future Directions

The Early Years
  Early 80s – The Andrew Project
    Campus-wide computing
    Joint IBM/CMU Venture
    One of the first large-scale distributed systems, challenging the ‘mainframe’ mentality
  The Andrew File System (AFS)
  The Andrew Message System (AMS)

Goals of the Andrew Message System
  Reliability
  Machine and Location Independence
  Integrated Message Database
    Personal Mail and Bulletin Boards
  Separation of Interface from Functionality
  Support for Multi-Media
  Scalability
  Easy to Extend, Easy to Use

End of AMS
  AMS was a nonstandard system
    Avoid becoming a “technology island”
    Desire to not maintain our own clients
  AMS was showing scalability problems
  Desire to decouple the file system from the mail system

Project Cyrus Goals
  Scalable to tens of thousands of users
  Support wide use of bulletin boards
  Use widely accepted standards-based technologies
  Comprehensive client support on all major platforms
  Support a disconnected mode of operation for the mobile user

Project Cyrus Goals (2)
  Support Kerberos authentication
  Allow easy sharing of private folders with select individuals
  Separation of the mail store from a distributed file system
  Can be independently installed, managed, and set up for use in small departmental computing facilities

More CMU Mail System Goals
  Allow users to have a single @cmu.edu address no matter where their actual mail store is located
    “CMUName” Service
  Ability to detect and act on incoming Spam and Virus Messages
  Provide access to mail over the Web
  Integration of messaging into the overall Computing Experience

The Big Picture
  [Diagram: The Internet; Mail Transfer Agents (Three Pools); LDAP Directory Servers; Cyrus IMAP Aggregator; Users / Mail Clients]

Mail Transfer Agents
  Andrew has 3 Pools of Mail Transfer Agent (MTA) Machines
    Mail exchangers (MX Servers) receive and handle mail from the outside world for the ANDREW.CMU.EDU domain
    The “SMTP Servers” process user-submitted messages (SMTP.ANDREW.CMU.EDU)
    Mail exchangers for the CMU.EDU domain (the CMU.EDU MXs)
  All Andrew MTAs run Sendmail

Mail Transfer Agents (2)
  Why 3 Pools?
    MX Servers
      Subject to the ebb and flow of the outside world
      Significant CPU-intensive processing
      Typically handle much larger queues (7,000+ messages each)
    SMTP Servers
      Speak directly to our clients
      Need to be very responsive
      Very small queues (200 messages each)

Mail Transfer Agents (3)
  CMU.EDU MXs
    Service separation from Andrew MX servers
    Mostly just forwarding
    No real need to duplicate processing done on Andrew MX servers
  All Three Pools are Redundant
    Minimize impact of a machine failure

Mail Transfer Agents (4)
  Separate MTA pools give significant control over incoming email
  A message may touch multiple pools
  Example:
    1. User submits message to [email protected] via SMTP servers
    2. Message processed by CMU.EDU MX, bound for [email protected]
    3. Message processed by ANDREW MX
    4. Final delivery to Cyrus Aggregator
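The four-step example can be traced with a small sketch. It is only an illustration under stated assumptions: the pool labels come from the slide titles, the `directory` dict stands in for the LDAP lookup, and the `jdoe` addresses are hypothetical. The real MTAs are Sendmail instances, not Python code.

```python
# Pool labels taken from the slides; this is a toy trace, not Sendmail configuration.
SMTP_POOL = "SMTP.ANDREW.CMU.EDU"
CMU_MX_POOL = "CMU.EDU MX"
ANDREW_MX_POOL = "ANDREW.CMU.EDU MX"
AGGREGATOR = "Cyrus Aggregator"
INTERNET = "The Internet"


def route(recipient, directory):
    """Return the list of pools a user-submitted message touches.

    `directory` is a hypothetical stand-in for the directory lookup:
    it maps a cmu.edu address to the address it forwards to.
    """
    hops = [SMTP_POOL]                    # step 1: user submits via the SMTP servers
    address = recipient.lower()
    if address.endswith("@cmu.edu"):
        hops.append(CMU_MX_POOL)          # step 2: CMU.EDU MX mostly just forwards
        address = directory[address]      # KeyError means no such address
    if address.endswith("@andrew.cmu.edu"):
        hops.append(ANDREW_MX_POOL)       # step 3: heavier processing happens here
        hops.append(AGGREGATOR)           # step 4: final delivery
    else:
        hops.append(INTERNET)             # anything else leaves the site
    return hops


if __name__ == "__main__":
    directory = {"jdoe@cmu.edu": "jdoe@andrew.cmu.edu"}
    print(route("jdoe@cmu.edu", directory))
```

A message addressed directly to an andrew.cmu.edu address skips the CMU.EDU MX pool in this sketch, which matches the point that a message may touch some pools and not others.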

Mail Processing
  All mail through the system is “processed” to some degree
    Audit Logging
    Cleaning badly formed messages
    Blocking restricted senders/recipients/relays
  More substantial processing is done by Andrew MX Servers

Mail Processing (2)
  Spam Detection
    Uses Heuristic Algorithms to identify Spam Messages (SpamAssassin)
    Tags message with a header and score
    User-initiated filters (SIEVE) can detect the header and act upon it (bounce the message or file it into an alternate folder)
    Very computationally expensive on MX
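The tag-then-filter split can be sketched as follows: the MX only writes a score header, and a per-user rule decides what to do with it. This is a rough Python analogue of a user filter, not Sieve syntax. The header name `X-Spam-Score`, both thresholds, and the `Junk` folder are assumptions made for illustration; the slides do not name them.

```python
from email import message_from_string

SCORE_HEADER = "X-Spam-Score"   # hypothetical header name, not confirmed by the slides
FILE_THRESHOLD = 5.0            # hypothetical: file into an alternate folder at or above this
REJECT_THRESHOLD = 15.0         # hypothetical: bounce at or above this


def filter_action(raw_message):
    """Decide what a user-level filter would do with an already-tagged message."""
    msg = message_from_string(raw_message)
    try:
        score = float(msg.get(SCORE_HEADER, "0"))
    except ValueError:
        score = 0.0                       # unparseable tag: treat as untagged
    if score >= REJECT_THRESHOLD:
        return "reject"
    if score >= FILE_THRESHOLD:
        return "fileinto:Junk"
    return "keep"


if __name__ == "__main__":
    tagged = "From: a@example.org\nX-Spam-Score: 7.5\n\nBuy now\n"
    print(filter_action(tagged))
```

Keeping the decision in the user's filter means the expensive scoring runs once on the MX, while each user chooses how aggressive to be.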

Mail Processing (3)
  Virus Detection
    Uses signatures to match virus messages (ClamAV)
    “Bounce” message immediately at the incoming RCPT
    Debate between bounce vs. tag

The Directory
  Mail delivery and routing is assisted by an LDAP-accessible database
  Every valid destination address has an LDAP entity
  LDAP lookups can do “fuzzy matching”
  LDAP queries are done against a replicated pool

The Directory (2)
  Every account has a mailRoutingAddress (mRA): the “next hop” of the delivery process
    mRA is not generally user configurable
  Some accounts have a user-configurable mailForwardingAddress (mFA)
    mFA will override the mRA
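The override rule fits in a few lines. In this sketch a directory entry is modeled as a plain dict keyed by the two attribute names from the slide. The example values are hypothetical, and a real lookup would go through LDAP and not a dict.

```python
def next_hop(entry):
    """Return the next-hop address for a directory entry.

    A user-set mailForwardingAddress (mFA), when present and non-empty,
    overrides the administratively set mailRoutingAddress (mRA).
    """
    mfa = entry.get("mailForwardingAddress")
    if mfa:
        return mfa
    return entry["mailRoutingAddress"]


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    plain = {"mailRoutingAddress": "jdoe@backend.example.edu"}
    forwarded = dict(plain, mailForwardingAddress="jdoe@elsewhere.example.org")
    print(next_hop(plain))
    print(next_hop(forwarded))
```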

The Cyrus IMAP Aggregator

The IMAP Protocol
  Standard Protocol developed by the IETF
  Messages Remain on Server
  MIME (Multipurpose Internet Mail Extensions) Aware
  Support for Disconnected Operation
  AMS-Like Features (ACLs, Quota, etc.)

The Cyrus IMAP Server
  CMU-Developed IMAP/POP Server
    Released to the public and maintained as an active Open Source project under a BSD-like License
  No available servers implemented all of the features needed to replace AMS
  Designed to be a “Black Box” server
  Performance and Scalability were key to the Design

Initial Cyrus IMAP Deployment
  Single monolithic server (1994-2002)
  Originally deployed alongside AMS
    Features were implemented incrementally
    Users were transitioned incrementally
  Local users provided a good testing pool
  Scaled surprisingly well

Cyrus IMAP Aggregator Design
  IMAP is not well suited to clustering
    No real concept of mailbox “location”
    Clients expect consistent views of the server and its mailboxes
    Significantly varying client implementation quality
  Aggregator was designed to make many machines look like one, so any user can share a folder with any other user

Cyrus IMAP Aggregator Design (2)
  Three Participating Types of Servers
    IMAP Frontends (“dataless” Proxies)
    IMAP Backends (“Normal” IMAP Servers; your data here)
    MUPDATE (Mailbox Database)
  [Diagram: Users / Mail Clients; Frontends proxy requests for clients; Backends hold traditional mailbox data; MUPDATE Server maintains list]

IMAP Frontends
  Fully redundant
    All are identical
  Maintain local replica of mailbox list
  Proxy most requests, querying backends as needed
  May also send IMAP referrals to capable clients
  [Diagram: Users / Mail Clients; Frontends proxy requests for clients; Backends hold traditional mailbox data; MUPDATE Server propagates mailbox list changes to frontends]
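A minimal sketch of the "dataless proxy" idea, assuming in-memory objects in place of real IMAP connections: the frontend stores no mail, only a replica of the mailbox list that says which backend owns each mailbox. The class names, method names, and backend names are hypothetical and are not taken from the Cyrus source.

```python
class Backend:
    """Stands in for a normal IMAP server that actually holds mailbox data."""

    def __init__(self, name):
        self.name = name
        self.mailboxes = {}               # mailbox name -> list of messages

    def fetch(self, mailbox):
        return self.mailboxes[mailbox]


class Frontend:
    """Holds no mail; routes each request using its local mailbox-list replica."""

    def __init__(self, backends):
        self.backends = backends          # backend name -> Backend
        self.mailbox_list = {}            # local replica: mailbox name -> backend name

    def fetch(self, mailbox):
        backend_name = self.mailbox_list.get(mailbox)
        if backend_name is None:
            raise KeyError("no such mailbox: " + mailbox)
        # Proxy the request to whichever backend owns the mailbox.
        return self.backends[backend_name].fetch(mailbox)


if __name__ == "__main__":
    b1, b2 = Backend("backend1"), Backend("backend2")
    b1.mailboxes["user.alice"] = ["hello alice"]
    b2.mailboxes["user.bob"] = ["hello bob"]
    fe = Frontend({"backend1": b1, "backend2": b2})
    fe.mailbox_list = {"user.alice": "backend1", "user.bob": "backend2"}
    print(fe.fetch("user.bob"))
```

Because every frontend carries the same replica and no mail, any number of identical frontends can be placed in front of the backends, which is what makes them fully redundant.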

IMAP Backends
  Basically Normal IMAP Servers
  Mailbox Operations are approved & recorded by the MUPDATE server
    Create / Delete
    Rename
    ACL Changes
  [Diagram: Users / Mail Clients; requests are proxied by Frontends; Backends hold traditional mailbox data; MUPDATE Server approves mailbox operations]

MUPDATE Server
  Specialized Location Server (similar to VLDB in AFS)
  Provides guarantees about replica consistency
  Simpler than maintaining database consistency between all the frontends
  [Diagram: Users / Mail Clients; Frontends update local mailbox list replicas; Backends send mailbox list updates; MUPDATE Server approves and replicates updates]
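The approve-and-replicate role can be sketched as one authoritative table that accepts or refuses each change and then pushes accepted changes to every replica. This is a toy in-memory model under stated assumptions: replicas are plain dicts, and the method names are hypothetical. It does not implement the MUPDATE wire protocol.

```python
class MupdateMaster:
    """Single authoritative mailbox list; frontends hold replicas of it."""

    def __init__(self):
        self.mailboxes = {}               # mailbox name -> owning backend name
        self.replicas = []                # one dict per subscribed frontend

    def subscribe(self, replica):
        """Bring a new replica up to date, then keep it updated."""
        replica.update(self.mailboxes)
        self.replicas.append(replica)

    def create(self, mailbox, backend):
        """Approve a CREATE only if the name is unused anywhere."""
        if mailbox in self.mailboxes:
            return False
        self.mailboxes[mailbox] = backend
        for replica in self.replicas:     # replicate the approved change
            replica[mailbox] = backend
        return True

    def delete(self, mailbox, backend):
        """Approve a DELETE only from the backend that owns the mailbox."""
        if self.mailboxes.get(mailbox) != backend:
            return False
        del self.mailboxes[mailbox]
        for replica in self.replicas:
            replica.pop(mailbox, None)
        return True


if __name__ == "__main__":
    master = MupdateMaster()
    frontend_a, frontend_b = {}, {}
    master.subscribe(frontend_a)
    print(master.create("user.alice", "backend1"))   # approved
    print(master.create("user.alice", "backend2"))   # refused: name already taken
    master.subscribe(frontend_b)
    print(frontend_a == frontend_b)
```

Funnelling every change through one approver is what keeps two backends from creating the same mailbox name, at the cost of making that approver a single point of failure for mailbox-list changes.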

Cyrus Aggregator: Data Usage
  User INBOXes and subfolders
    Users can share their folders
  Internet mailing lists as public folders
  Netnews Newsgroups as public folders
  Public folders for “workflow”, general discussion, etc.
  Continued “bboard” paradigm: 30,000+ folders visible

Cyrus IMAP Aggregator: Advantages
  Horizontal Scalability
    Adding new capacity to frontend and/or backend is easy to do and can be done with no user-visible downtime
  Management possible through a single IMAP client session
  Wide client interoperability
  Simple Client configuration
  Ability to (mostly) transparently move users from one backend to another
  Failures are partitioned

Cyrus IMAP Aggregator: Limitations
  Backends are NOT redundant
  MUPDATE is a single point of failure
    Failure only results in an error when trying to CREATE/DELETE/RENAME or change ACLs on mailboxes

Cyrus IMAP Aggregator: Backups
  Disk partition backup via Kerberized Amanda (http://www.amanda.org)
    Restores are manual
    21 day rotation – no archival
    Backup to disk (no tapes)
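One way to read "21 day rotation – no archival" is that a backup is kept for 21 days and then discarded. The sketch below assumes that reading. The slides do not spell out the exact policy, and Amanda has its own scheduling that this does not model.

```python
from datetime import date, timedelta

ROTATION_DAYS = 21   # from the slide; the exact cutoff semantics are an assumption


def split_by_rotation(backup_dates, today):
    """Split backup dates into (keep, expire) under a fixed-length rotation."""
    cutoff = today - timedelta(days=ROTATION_DAYS)
    keep = sorted(d for d in backup_dates if d > cutoff)
    expire = sorted(d for d in backup_dates if d <= cutoff)
    return keep, expire


if __name__ == "__main__":
    today = date(2004, 1, 27)
    nightly = [today - timedelta(days=i) for i in range(30)]
    keep, expire = split_by_rotation(nightly, today)
    print(len(keep), "kept;", len(expire), "expired")
```

With no archival tier, anything in the expired set is gone for good, so a restore request only helps if it arrives inside the rotation window.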

Cyrus IMAP Aggregator: Other Protocol Support
  POP3 support for completeness
    Possibly creates more problems than not (where did my INBOX go?)
  NNTP to populate bboards
  NNTP access to mail store
  LMTP w/AUTH for mail transport from MTA to backends

Clients
  IMAP has many publicly available clients
    Varying quality
    Varying feature sets
  Central computing recommends Mulberry
    Roaming Profiles via IMSP
    Many IMAP extensions supported (e.g. ACL)
    UI not as popular

Clients - Webmail
  Use SquirrelMail as a Webmail Client
  Local Modifications
    Interaction with WebISO (pubcookie) Authentication
    Kerberos Authentication to Cyrus
    Local proxy (using imtest) to reduce connection load on server
  Preferences and session information shared via AFS (simple, non-ideal)

Clients – Mailing Lists
  +dist+ for “personal” mailing lists
    +dist+~user/[email protected]
  Majordomo for “Internet-style” mailing lists
  Prototype web interface for accessing bboards
    Authenticated (for protected bboards): http://bboard.andrew.cmu.edu/bb/org.acs.asg.coverage
    Unauthenticated (for mailing list archives): http://asg.web.cmu.edu/bb/archive.info-cyrus

Andrew Mail Statistics
  Approximately 30,000 Users
  12,000+ Peak Concurrent IMAP Sessions
  8+ IMAP Connections / Second
  650 Peak Concurrent Webmail Sessions
  Approximately 1.5 Million Emails/week
  See Also: http://graphs.andrew.cmu.edu
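To put the weekly volume on the same scale as the per-second connection figure, the quoted numbers can be converted to averages. These are derived figures, not measurements from the slides. Real traffic is bursty, so peaks will sit well above the average.

```python
EMAILS_PER_WEEK = 1_500_000      # "Approximately 1.5 Million Emails/week"
USERS = 30_000                   # "Approximately 30,000 Users"
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

avg_emails_per_second = EMAILS_PER_WEEK / SECONDS_PER_WEEK
emails_per_user_per_week = EMAILS_PER_WEEK / USERS

if __name__ == "__main__":
    print(f"average emails/second: {avg_emails_per_second:.2f}")
    print(f"emails/user/week:      {emails_per_user_per_week:.0f}")
```

The average works out to roughly 2.5 messages per second and 50 messages per user per week.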

Andrew Hardware
  5 frontends
    3 Sun Ultra 80s (2x450 MHz UltraSparc II; 2 GB memory; internal 10000 RPM disk)
    2 SunFire 280Rs (2x1 GHz UltraSparc III; 4 GB memory; internal 10000 RPM disk)
  5 backends
    4 Sun 220R (450 MHz UltraSparc II; 2 GB memory; JetStor II-LVD RAID5 8x36 GB 15000 RPM disks)
    1 SunFire 280R (2x1 GHz UltraSparc III; 4 GB memory; JetStor III U160 RAID5 8x73 GB 15000 RPM disks)
  1 mupdate
    Dell 2450 (Pentium III 733 MHz; 1 GB memory; PERC3 RAID5 4x36 GB 10000 RPM disks)
  3 ANDREW.CMU.EDU MX
    Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1 2x73 GB 15,000 RPM disks)
  3 SMTP.ANDREW.CMU.EDU
    Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1 2x73 GB 15,000 RPM disks)
  2 CMU.EDU MX
    Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1 2x73 GB 15,000 RPM disks)
  1 mailing list
    Dell 2650 (Pentium 4 2.8 GHz; 1 GB memory; PERC3 RAID1 2x73 GB 15,000 RPM disks)
  3 webmail
    Dell Optiplex GX 260 small form factor (Pentium 4 2.4 GHz; 1 GB memory; 80 GB ATA disk)

Current Issues
  Lack of client support for ‘check new’ for IMAP folders (even when client supports NNTP)
  Large number of visible folders can be problematic for clients (e.g. PocketInbox)

Potential Future Work
  Online/Self-Service Restores (e.g. AFS “OldFiles”, delayed EXPUNGE)
  Virtual “Search” Folders
  Fault tolerance
    Replicate backends
    Support multiple MUPDATE servers
  Multi-Access Messaging Hub
    One Mail Store, many APIs
    IMAP, POP, NNTP, HTTP/DAV/RSS, XML/SOAP
    Web Bulletin Boards / blog interface
    Remove Shared Folder / Mailing List Distinction

Current Software
  MTA: Sendmail 8.12.10
  LDAP: OpenLDAP 2.0
  Cyrus: 2.2.3
  MIMEDefang: 2.28
  SpamAssassin: 2.61
  ClamAV: 0.63
  SquirrelMail: 1.4.2 (w/ Local Modifications)