Python at Elemental Security
EuroPython - June 29, 2005
Guido van Rossum
Elemental Security, Inc.
[email protected]@python.org
Elemental Security, Inc.
• Enterprise security software
– product: Elemental Compliance System (ECS)
• express, monitor and enforce security policies for any computer connecting to the network (cross-platform)
• scored 9.3 in recent InfoWorld Test Center
• Startup (no longer in stealth mode!)
– C round just closed; 11M led by Lehman Brothers
• Using lots of Python (and Java!)
• We're always hiring!
• See http://www.elementalsecurity.com
– Now a real website :-)
ECS Application Structure
• One Central Server
– Java, J2EE (Tomcat), some Python, Oracle
– front-end: rich web UI (JavaScript + XML-RPC)
– back-end: agent connector (HTTP+SSL)
• Many Agents
– Python and C
– runs on Windows, Solaris, Linux, ...
– main components:
• scheduler
• server connector
• policy engine – I'll get back to this later
• packet filter – nearly the only part written in C
Why Does Elemental Use Python?
A. Because I'm There :-)
B. Python is the best tool for the job
– small footprint
– runs everywhere (or almost runs :-)
– access to platform-specific APIs (e.g. registry)
– much of what we do is "script-like"
• gather various configuration information about the host
• check specific policy rules
– this is so important we have a custom language for it!
– application changes frequently
• we continually learn to understand the problem better
• quickly refactor code as needed
ElementClass – a Simpler XML API
• Use cases:
– exchange data with central server
• policies, reports, etc.
– persist structured data within agent
• policies, schedule, etc.
– tool to manage policy definitions (Tkinter UI)
• XML an obvious choice
• Want better mapping between Python & XML
• example:
– XML: <schedule start="1" offset="100" />
– Py: sch.start + sch.offset  # not int(sch.getattr("start"))
ElementClass – Example Input
<group name="PSF">
  <employee name="Guido" age="49" />
  <employee name="Tim" age="99" />
  <employee name="Ben" age="17" />
  <employee name="Dan" age="15" />
</group>
ElementClass – Example Code
• from xmlparse import ElementClass, String, Integer
• class Employee(ElementClass):
      __element__ = "employee"
      __attributes__ = {"name": String, "age": Integer}
• class Group(ElementClass):
      __element__ = "group"
      __attributes__ = {"name": String}
      __children__ = {Employee: "employees[]"}
• group = Group.__parseFile__(filename)
• minors = [e for e in group.employees if e.age < 18]
• group.employees = minors
• f = open(filename, "w"); group.__render__(f); f.close()
ElementClass – Example Output
<group name="PSF">
  <employee
      age="17"
      name="Ben" />
  <employee
      age="15"
      name="Dan" />
</group>
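The xmlparse module itself is not shown in the talk, so the following is only a minimal sketch of the same idea built on the standard library's xml.etree.ElementTree. The names ElementSketch, from_element and to_element are hypothetical, and String and Integer are simple stand-ins for the converter types named on the slide. It illustrates typed attribute access and the filter-and-render round trip from the example, not the real ElementClass API.

```python
import xml.etree.ElementTree as ET

# Hypothetical converters standing in for the String and Integer types
# named on the slide; the real xmlparse module is not shown in the talk.
String = str
Integer = int


class ElementSketch:
    """Illustrative base class: maps XML attributes to typed Python attributes."""

    element = ""      # tag name handled by this class
    attributes = {}   # XML attribute name -> converter
    children = {}     # child class -> Python attribute name (always a list here)

    @classmethod
    def from_element(cls, node):
        obj = cls()
        for name, convert in cls.attributes.items():
            raw = node.get(name)
            # Every attribute is optional; a missing one becomes None.
            value = None if raw is None else convert(raw)
            setattr(obj, name.replace("-", "_"), value)
        for child_cls, attr in cls.children.items():
            found = node.findall(child_cls.element)
            setattr(obj, attr, [child_cls.from_element(c) for c in found])
        return obj

    def to_element(self):
        node = ET.Element(self.element)
        for name in self.attributes:
            value = getattr(self, name.replace("-", "_"))
            if value is not None:
                node.set(name, str(value))
        for child_cls, attr in self.children.items():
            for child in getattr(self, attr):
                node.append(child.to_element())
        return node


class Employee(ElementSketch):
    element = "employee"
    attributes = {"name": String, "age": Integer}


class Group(ElementSketch):
    element = "group"
    attributes = {"name": String}
    children = {Employee: "employees"}


SAMPLE = """<group name="PSF">
  <employee name="Guido" age="49" />
  <employee name="Tim" age="99" />
  <employee name="Ben" age="17" />
  <employee name="Dan" age="15" />
</group>"""

group = Group.from_element(ET.fromstring(SAMPLE))
# e.age is already an int, so no int(...) call is needed in the comparison.
group.employees = [e for e in group.employees if e.age < 18]
rendered = ET.tostring(group.to_element(), encoding="unicode")
print(rendered)
```

Declaring the metadata on the class keeps the conversion logic in one place, which is the point the slides make about preferring sch.start over int(sch.getattr("start")).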
ElementClass – Limitations, Features
• No namespace support
• attribute names must be Python identifiers
– (except '-' mapped to '_')
• Can have CDATA or subelements but not both
• Subelement choices for number of occurrences:
– zero or once: Python attribute is None or object
– any number: Python attribute is a list, may be empty
• Ordering of attributes and subelements is lost
– except for relative ordering of similar elements
• All attributes and elements are optional
• Optionally, can ignore unrecognized attrs/elements
ElementClass – What's Next?
• Improve the API a bit?
– use lists of tuples instead of dicts for metadata
• this allows specifying attribute/subelement ordering
– decide what to do with Unicode values
• convert to str if ASCII only, or not?
– add more attribute data types?
• currently String, Integer, Boolean, Timestamp
• add Float; what else? enumerations?
– add required attributes, subelements? (which API?)
– tidy up output (fewer line breaks)
• Document it
• Contribute it to the PSF in time for Python 2.5!
– ESI lawyers to look at PSF Contribution Agreement
Really Hammering The Server
• Server scalability requirement: support 4000 agents
– Available: a few dozen test machines
– How to do server load testing?
• Solution 1: run 50 agents on one test machine
– test machines overloaded
– test machines look too similar
– can't quite reach scalability requirement
• Solution 2: run 500 synthetic agents on one box
– skips work that doesn't affect what the server sees
– started out as a private hack, adopted very quickly
– full potential not yet reached (next: 20K agents!)
– can easily inject additional test data into server
The Approach
• Share as much code as possible with real agent
– fortunately, most agent code is in library modules
• N agent objects, K worker threads (K ≤ N)
• 1 scheduler thread
– real-time event queue managed using heapq module
– main loop sleeps until next event ready
• beware: event queue may be updated while sleeping!
– distributes events to workers via Queue.Queue
– worker main loop:
      while True:
          callable, args = workQueue.get()
          callable(*args)
– callable is typically a bound method of an agent object
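The design above can be sketched in modern Python roughly as follows. This is an illustration under assumptions, not the actual ECS load tester: the Scheduler, Agent and worker names are hypothetical, the module is spelled queue (Python 3) rather than Queue, and a threading.Condition is one possible way to handle the "event queue may be updated while sleeping" caveat.

```python
import heapq
import queue
import threading
import time


class Scheduler:
    """Real-time event queue feeding worker threads (illustrative sketch)."""

    def __init__(self, work_queue):
        self._events = []   # heap of (due_time, seq, func, args)
        self._seq = 0       # tie-breaker so callables are never compared
        self._cond = threading.Condition()
        self._work_queue = work_queue
        self._stopped = False

    def schedule(self, delay, func, *args):
        with self._cond:
            due = time.monotonic() + delay
            heapq.heappush(self._events, (due, self._seq, func, args))
            self._seq += 1
            self._cond.notify()   # the earliest event may have changed

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify()

    def run(self):
        while True:
            with self._cond:
                if self._stopped:
                    return
                if not self._events:
                    self._cond.wait()
                    continue
                remaining = self._events[0][0] - time.monotonic()
                if remaining > 0:
                    # Sleep until the next event is due, then re-check:
                    # an earlier event may have been added meanwhile.
                    self._cond.wait(remaining)
                    continue
                _, _, func, args = heapq.heappop(self._events)
            self._work_queue.put((func, args))


def worker(work_queue):
    while True:
        item = work_queue.get()
        if item is None:   # sentinel: shut this worker down
            break
        func, args = item
        func(*args)


class Agent:
    """Hypothetical synthetic agent; report() stands in for a server upload."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def report(self, what):
        self.log.append((self.name, what))


log = []
work = queue.Queue()
sched = Scheduler(work)
workers = [threading.Thread(target=worker, args=(work,)) for _ in range(2)]
for t in workers:
    t.start()
sched_thread = threading.Thread(target=sched.run)
sched_thread.start()

agents = [Agent("agent%d" % i, log) for i in range(4)]   # N = 4, K = 2
for i, agent in enumerate(agents):
    sched.schedule(0.01 * i, agent.report, "heartbeat")

deadline = time.monotonic() + 5
while len(log) < len(agents) and time.monotonic() < deadline:
    time.sleep(0.01)

sched.stop()
sched_thread.join()
for _ in workers:
    work.put(None)
for t in workers:
    t.join()
```

Waiting on a condition variable with a timeout, instead of a plain sleep, lets schedule() wake the loop early whenever a new event becomes the earliest one.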
The Outcome
• Works really well despite its simplicity
– didn't have to use asynchronous I/O
• Randomized synthetic data sent to server
– example: simulate all agents being "nmapped"
• Probably bounded by number of threads
– can't have too many agents per thread
• Inexplicable slow memory leak (not M2Crypto!)
A Policy Implementation Language
• ECS is all about policy compliance
– each host has a policy compliance score: 0-100%
– composed of individual (Boolean) policy rule scores
– some (not all) policy rules can also be enforced
• So what's a policy rule? Examples:
– all passwords must be at least 6 characters
– ftpd should be disabled
– all email must go through server X
• Elemental has a library of 1000+ policy rules
– user selects some and deploys to group of hosts
– agent gets rule list, executes rules, uploads results
• repeat on user-selected schedule (30 min – 7 days)
How To Implement Policy Rules
• Requirements:
– Cost to add another rule must be low
– Some rules are relatively complex programming tasks
– Rule authors are security experts, not programmers
• Some possibilities:
– shell scripts (Titan)
– Perl, Python, etc.
– XML
– custom language
Why Write Another Language
• Need a library of policy-checking methods, e.g.:
– assert that a file has a specific mode, owner, group
– assert that a registry entry has a specific value
– parse a configuration file using "name = value" syntax and then check a specific name/value pair
• Ideal: constraint-based (declarative) language
– execution order doesn't matter
– compiler can check for conflicts between rules
• Python would be fine if I were writing all the rules
– still fairly low-level; risk of using the wrong approach
• Compromise: nearly-declarative language
– resembles Python except where it doesn't
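To make the kind of policy-checking library method listed above concrete, here is a small Python sketch of two of them: checking a file's mode and checking a value in a "name = value" configuration file. The function names and the comment syntax handled (# to end of line) are assumptions for illustration; the actual ECS library is not shown in the talk.

```python
import os
import stat
import tempfile


def file_has_mode(path, expected_mode):
    """Return True if the permission bits of path equal expected_mode."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected_mode


def parse_name_value(text):
    """Parse 'name = value' lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def config_has_value(text, name, expected):
    """Return True if the configuration text sets name to expected."""
    return parse_name_value(text).get(name) == expected


# Usage on a throwaway file and a made-up configuration snippet.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    sample_path = tmp.name
os.chmod(sample_path, 0o600)
mode_ok = file_has_mode(sample_path, 0o600)
os.remove(sample_path)

sample_conf = "PermitRootLogin = no  # hardening\n\nPort = 22\n"
login_ok = config_has_value(sample_conf, "PermitRootLogin", "no")
```

Each helper returns a Boolean, which matches the slides' description of rule scores as Boolean values.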
How Fuel Differs From Python
– func has_localhost(host: Host, group: str): bool:
      for ip in host.gethostgroup(group):
          if substr(ip, 0, 4) == "127.":
              return true
      return false
• Declarations required; all code is type-checked
– interfaces used for library code written in Python
• Single-assignment language with immutable values
– let var [: type] = expr
• Argument defaults computed dynamically
• Many Python features left out (e.g. slicing!)
• Container types: immutable set and struct
• Fuel is not Turing-complete!
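For comparison, the Fuel function above might look like this in plain Python. This is a sketch under assumptions: FakeHost is a hypothetical stand-in for the Host interface, gethostgroup is assumed to return an iterable of IP address strings, and substr(ip, 0, 4) is assumed to select the first four characters.

```python
class FakeHost:
    """Hypothetical stand-in for the Host interface used in the Fuel example."""

    def __init__(self, groups):
        self._groups = groups

    def gethostgroup(self, group):
        return self._groups.get(group, ())


def has_localhost(host, group):
    for ip in host.gethostgroup(group):
        # Slicing replaces Fuel's substr(); Fuel itself has no slicing.
        if ip[0:4] == "127.":
            return True
    return False


host = FakeHost({"trusted": ["10.0.0.5", "127.0.0.1"], "dmz": ["192.168.1.1"]})
trusted_has_loopback = has_localhost(host, "trusted")
dmz_has_loopback = has_localhost(host, "dmz")
```

The visible differences are the ones the slide lists: Fuel requires type declarations on parameters and the return value, and it offers a substr() function where Python would use a slice.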
Implementing Fuel
• Process grammar with pgen
– eventually reimplemented pgen in Python
• Use tokenize.py for tokenization
• Implemented pgen parsing automaton
– as-we-go parse tree reduction
• Use visitor pattern to translate to Python source
• Parse tree node classes have grammar in docstrings
• Run-time library in Python
– defines some mutable object types
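The visitor-based translation step can be illustrated with a toy example. The node classes, the grammar fragments in their docstrings, and the PythonEmitter visitor below are all hypothetical; they only show the shape of the technique (parse tree nodes that accept a visitor which emits Python source), not Fuel's real grammar or compiler.

```python
class Node:
    def accept(self, visitor):
        # Dispatch to visit_<ClassName> on the visitor.
        return getattr(visitor, "visit_" + type(self).__name__)(self)


class Name(Node):
    """atom: NAME"""

    def __init__(self, ident):
        self.ident = ident


class Const(Node):
    """atom: NUMBER | STRING"""

    def __init__(self, value):
        self.value = value


class Compare(Node):
    """comparison: expr comp_op expr"""

    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right


class Let(Node):
    """let_stmt: 'let' NAME '=' expr"""

    def __init__(self, name, expr):
        self.name, self.expr = name, expr


class PythonEmitter:
    """Visitor that turns the toy parse tree into Python source text."""

    def visit_Name(self, node):
        return node.ident

    def visit_Const(self, node):
        return repr(node.value)

    def visit_Compare(self, node):
        left = node.left.accept(self)
        right = node.right.accept(self)
        return "(%s %s %s)" % (left, node.op, right)

    def visit_Let(self, node):
        return "%s = %s" % (node.name, node.expr.accept(self))


# Tree for the Fuel-like statement: let is_minor = age < 18
tree = Let("is_minor", Compare(Name("age"), "<", Const(18)))
source = tree.accept(PythonEmitter())

# The generated text is ordinary Python and can be executed directly.
namespace = {"age": 17}
exec(source, namespace)
```

Keeping the grammar fragment in each node's docstring, as the slide describes, puts the syntax next to the class that represents it.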
Challenges in Writing Fuel
• Not enough users yet to know we're doing it right
– yes, we should open-source it!
• Main challenge is to keep the language expressive without compromising its declarative nature
– Fuel 2.0 will tweak the design quite a bit
• host.runscript("userdel", "-r", acct.name)
– admission of defeat – but unavoidable sometimes
• Source code organization
– linkage between source & hierarchical menu of rules
– metadata repeated in source & XML
– same rule implemented differently per platform
How We Use Fuel
• ~1400 policy rules implemented in Fuel
• Written by about 4 people part-time over 1 year
• Rules cover Solaris, Linux, Windows (2k+), ...
• Rules cover all areas of security:
– accounts, network, filesystem, system, hardware, software, packet filter, trust, authentication, logging
Question Time