Evolving Secure Computer Infrastructures
Errin Fulp
Computer Science Department
Wake Forest University
1
Computer Science @ Wake Forest University
Research groups: bio-informatics, mobile, security, . . .
Security research has been network oriented
High-speed firewall company, GreatWall Systems
Research is more nature-inspired, due to PNNL
2
WFU + Pacific Northwest National Laboratory (PNNL)
Glenn Fink and David McKinnon @ PNNL
Together developed Digital Ants concept
Swarm intelligence approach to cyber security
Sponsored by PNNL, DOE, NITRD [a], and NSF
[a] U.S. Networking and Information Technology Research and Development.
3
Nature-Inspired Cyber-Defense Research @ Wake
swarm defense
evolving computers
watching for patterns
4
What makes cyber security a difficult problem?
Mathematical approaches often seek an optimal solution
Traditional methods need well defined problems
Security problems are often ill-conditioned
Many steps and inputs may be unknown
Security problems are adversarial, which makes them more difficult
5
Natural systems routinely cope with ill-conditioned problems
Do not strive for optimal, just try to be good enough
If a situation changes... no problem we can adapt
Multi-stability allows coexistence of many stable states
Robust, tolerant of mistakes, perhaps learning
Scalable
6
How does our work fit within biomimicry?
7
Bi·o·mim·ic·ry
Biomimicry is learning from and then emulating nature to create designs for
complex problems.
Three levels of biomimicry
Natural form - emulating structure found in nature
Natural processes - emulating how things are done
Ecosystems - emulating the integration of different entities
8
Form                            Processes
Bird-inspired aircraft wings    Swarm intelligence
Fish-scale swim suits           Ant colony wireless bandwidth optimization
Frog-inspired tire treads       Genetic algorithms/programming
Seashell-structured ceramics    Immune system based cyber defenses
Mollusk adhesion approaches     Artificial neural networks
Biomimicry seems to have gained in popularity.
9
Measuring Biomimicry Popularity [a]
Da Vinci Index       Patents, articles, and grants
Projected sales      Bio-inspired products and solutions
Scholarly activity   Publications by country
Biomimicry centers   Established locations for research and development
[a] “Bioinspiration: An Economic Progress Report,” Fermanian Business & Economic Institute, 2013
10
[Figure: Da Vinci Index by year, 2000 to 2012; y-axis: Trend Index (0 to 600)]
Da Vinci Index is based on the number of patents linked to bio-inspiration,
scholarly articles published, and the dollar amounts of grants issued by National
Institutes of Health (NIH) and National Science Foundation (NSF).
11
Buy Bio-Inspired Stock Today! [a]
[Figure: bar chart of projected percent of industry sales (x-axis 0 to 20) for the industries listed below]
Chemical manufacturing
Materials
Architectural, engineering, and related services
Plastics and rubber products manufacturing
Waste management and remediation services
Textile mills and textile product mills
Transportation equipment manufacturing
Utilities
Warehousing and storage
Construction
Computer, electronic products, equip, and appliances
Printing and related activities
Food and beverage and tobacco products
Air, rail, water, truck and pipeline transp serv
Mining, quarrying, and oil and gas extraction
Petroleum and coal products manufacturing
Mining
Furniture manufacturing
Paper manufacturing
Information technology
Apparel, leather and allied products
Agriculture
Forestry and aquaculture
Based on the numbers of patents, scholarly articles, and commercial products
either in development or on the market, bio-inspiration is making the most
significant inroads in chemistry, materials science, and engineering.
[a] Not necessarily an endorsement by Fulp, NatSec, IRI, or IEEE.
12
Scholarly Activity
[Figure: bar chart; y-axis: Percentage of Scholarly Articles (0 to 30)]
Journal articles related to bio-inspiration published in 2012. Of the top 10
universities, 5 were Chinese and 2 were American.
13
Biomimicry Centers
[Figure: map of biomimicry centers, numbered 1 to 25 as listed below]
1. Biomimetic Surface Engineering
2. The Biomimetic Materials Laboratory
3. Biomimicry 3.8 Institute
4. Center for Biologically Inspired Design of
the Georgia Institute of Technology
5. Centre for Bio-Inspired Technology
6. CiBER
7. Bay Area Biomimicry Network
8. Biomimicry Chicago
9. Biomimicry Mexico
10. Biomimicry Netherlands
11. Biomimicry NY
12. Biomimicry Oregon
13. Biomimicry Puget Sound
14. Biomimicry Quebec
15. Biomimicry South Africa
16. Biomimicry Texas
17. Great Lakes Biomimicry
18. Swiss Cleantech
19. World Biomimetic Foundation
20. Wyss Institute
21. San Diego Zoo Global Center for
Bioinspiration
22. Chinese Academy of Science
23. National University Singapore
24. Zhejiang University
25. Seoul National University
Also possible to earn degrees associated with bio-inspiration from Columbia
University, Stanford University, Northwestern University, Harvard, Oxford,
University of Cambridge, ...
14
Measuring Biomimicry Popularity in Computer Science [a]
The preceding measures were not discipline specific, so let’s consider biomimicry
popularity within computer science related fields
Publications   IEEE, ScienceDirect, and Google
Grants         NSF
[a] Generated by Fulp, searching for (bio-inspired || biological inspired ||
biomimicry) && (computer || algorithm || network).
15
[Figure: four panels of counts per year: IEEE citations, citations for a second source (panel title lost in extraction), Science Direct citations, and NSF funded proposals]
Number of published or funded grant proposals.
16
Consistent Trend
[Figure: percent bio-inspired (0 to 0.2) by year, 1995 to 2013, with one curve each for IEEE, Google, Science Direct, and NSF]
Trend is consistent across venues, and similar to the Da Vinci Index.
17
Biomimicry Hype?
Hype Cycle represents maturity, adoption, and application of technologies.
[Figure: Hype Cycle curve with phases: technology trigger, inflated expectations, trough of disillusionment, slope of enlightenment, plateau of productivity]
Does the Hype Cycle apply to research areas? If so, where are we?
18
[Figure omitted] [a]
Given the possibilities, perhaps we’ve only begun...
[a] “The AskNature Database: Enabling Solutions in Biomimetic Design,” Jon-Michael
Deldin and Megan Schuknecht.
19
Biomimicry Hype?
[Figure: Hype Cycle curve redrawn tilted, with the same phase labels]
Perhaps the curve should be tilted?
20
Let’s consider computer science biomimicry.
21
Biomimetic Design
  Natural Computing
    synthesize phenomena: artificial life
    natural materials: molecular, quantum
    nature inspired: swarm, river, social, human, ecology, evolutionary
22
[Diagram repeated, with one node annotated]
Biomimetic Design: imitation of nature to solve complex problems
23
[Diagram repeated, with one node annotated]
Natural Computing: synthesizes, uses, or is inspired by nature
24
[Diagram repeated, with one node annotated]
Artificial life: simulates life to understand organisms
25
[Diagram repeated, with one node annotated]
Natural materials: use natural materials for computation
26
[Diagram repeated, with one node annotated]
Nature inspired: use algorithms inspired by nature
27
nature inspired algorithms
  evolution: genetic algorithm, evolutionary programming, evolutionary algorithm
  social: river, swarm, human
  ecology: bio-geo, weed, symbiosis
28
[Diagram repeated, with each branch annotated]
evolution: algorithms based on the adaptive nature of organisms
social: algorithms based on the social nature of organisms
ecology: algorithms based on how organisms interact with/in the environment
29
Different Types of Nature-Inspired Algorithms
Nature-Inspired Algorithms
  Evolution
    EAs: Genetic Algorithm, Genetic Programming, Evolutionary System, Differential Evolution, Paddy Field Algorithm
  Swarm
    River: Intelligent Water
    Social: Producer Group Search
    Bird: Particle Swarm
    Stigmergic: Ant Colony
    Fish: Fish Swarm
    Bacteria: Bacterial Foraging
    Firefly: Firefly Algorithm
    Bees: Bee Colony
    Frog: Shuffled Frog-Leaping
    Human: Artificial Immune
  Ecology
    Biogeo: Biogeography-Based Optimization
    Weed: Invasive Weed Optimization
    Symbiosis: Multi-species Optimization
30
Nature Inspired Strategies
Accelerated PSO Yang et al.
Atmosphere clouds model Yan and Hao
Ant colony optimization Dorigo
Biogeography-based optimization Simon
Artificial bee colony Karaboga and Basturk
Brain Storm Optimization Shi
Bacterial foraging Passino
Differential evolution Storn and Price
Bacterial-GA Foraging Chen et al.
Dolphin echolocation Kaveh and Farhoudi
Bat algorithm Yang
Japanese tree frogs calling Hernandez and Blum
Bee colony optimization Teodorovic and Dell'Orco
Eco-inspired evolutionary algorithm Parpinelli and Lopes
Bee system Lucic and Teodorovic
Egyptian Vulture Sur et al.
BeeHive Wedde et al.
Fish-school Search Lima et al.
Wolf search Tang et al.
Flower pollination algorithm Yang
Bees algorithms Pham et al.
Gene expression Ferreira
Bees swarm optimization Drias et al.
Great salmon run Mozaffari
Bumblebees Comellas and Martinez
Group search optimizer He et al.
Cat swarm Chu et al.
Human-Inspired Algorithm Zhang et al.
Consultant-guided search Iordache
Invasive weed optimization Mehrabian and Lucas
Cuckoo search Yang and Deb
Marriage in honey bees Abbass
Eagle strategy Yang and Deb
OptBees Maia et al.
Fast bacterial swarming algorithm Chu et al.
Paddy Field Algorithm Premaratne et al.
Firefly algorithm Yang
Roach infestation algorithm Havens
Fish swarm/school Li et al.
Queen-bee evolution Jung
Good lattice swarm optimization Su et al.
Shuffled frog leaping algorithm Eusuff and Lansey
Glowworm swarm optimization Krishnanand and Ghose
Termite colony optimization Hedayatzadeh et al.
Hierarchical swarm model Chen et al.
Krill Herd Gandomi and Alavi
Big bang-big Crunch Zandi et al.
Monkey search Mucherino and Seref
Black hole Hatamlou
Particle swarm algorithm Kennedy and Eberhart
Central force optimization Formato
Digital ants Fink
Virtual ant algorithm Yang
Virtual bees Yang
Charged system search Kaveh and Talatahari
Electro-magnetism optimization Cuevas et al.
Weightless Swarm Algorithm Ting et al.
Galaxy-based search algorithm Shah-Hosseini
Gravitational search Rashedi et al.
Anarchic society optimization Shayeghi and Dadashpour
Harmony search Geem et al.
Artificial cooperative search Civicioglu
Intelligent water drop Shah-Hosseini
Backtracking optimization search Civicioglu
River formation dynamics Rabanal et al.
Differential search algorithm Civicioglu
Self-propelled particles Vicsek
Grammatical evolution Ryan et al.
Simulated annealing Kirkpatrick et al.
Imperialist competitive algorithm Atashpaz-Gargari and Lucas
Stochastic diffusion search Bishop
League championship algorithm Kashan
Spiral optimization Tamura and Yasuda
Social emotional optimization Xu et al.
Water cycle algorithm Eskandar et al.
31
Let’s consider evolutionary algorithms and security.
32
Evolutionary algorithms try to solve problems by evolving sets of search points
evolutionary algorithms
  genetic algorithm: mimics selection, crossover, and mutation processes
  genetic programming: evolution to find computer programs that perform a user-defined task
  evolutionary strategies: macro-level (species-level) evolution process
  differential evolution: similar to GA, but mutation is the result of arithmetic combinations of individuals
  paddy field algorithm: uses pollination and dispersal to find a solution
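To make "mutation is the result of arithmetic combinations of individuals" concrete, here is a minimal sketch of one common differential evolution variant, in which a mutant is formed as a + F * (b - c) from three population members. The function name `de_mutant`, the scale factor value, and the example vectors are illustrative assumptions, not taken from the talk.

```python
from typing import List, Sequence


def de_mutant(a: Sequence[float], b: Sequence[float], c: Sequence[float],
              f: float = 0.5) -> List[float]:
    """Return a + f * (b - c), computed component by component."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("all three vectors must have the same length")
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]


# Three hypothetical individuals (two settings each).
mutant = de_mutant([1.0, 800.0], [0.0, 600.0], [1.0, 200.0], f=0.5)
print(mutant)  # [0.5, 1000.0]
```

In practice the mutant would still need to be rounded or clipped back into each setting's allowed range before it could be used as a configuration.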
33
Evolution Defense Strategies
Genetic programming defining security policies
Differential evolution intrusion detection
Evolutionary strategies providing situational awareness
Paddy field too new?
In these examples, evolutionary approaches were used to solve a difficult
optimization problem. But is security an optimization problem?
34
Evolution as a Defense Strategy
Evolutionary strategies are used as search heuristics
Maintain a population of solutions
Find the best solution based on survival of the fittest
Typically given a cyber security problem, there isn’t one solution
Diversity and adaptability are what is helpful
“Static nature of defenses cannot hold up against the dynamic nature of
attacks” – Cohen 1993
35
Computer Evolution as a Moving Target Defense [a]
Sponsored by the National Science Foundation SaTC, Award 1252551
Adaptation makes organisms better suited to their habitat
Survival of the fittest, later generations improve
Changes can be a type of defense
[a] David John, Daniel Canas, William Turkett, Robert Smith, Scott Seal, Bryan Prosser,
and Don Gage.
36
Problems with Computing Infrastructures
Computers are often deployed in a deterministic fashion
Configurations are rarely updated
Unfortunately this environment is great for attackers
One type of defense is to create a Moving Target environment
37
Attack Process and Moving Targets
[Figure: attack process flowchart: start, reconnaissance, "risk OK?" decision, attack, "success?" decision, with yes/no branches]
Normally reconnaissance occurs before an attack
Gather intelligence about potential targets
Can require substantial time and effort
Moving Target (MT) defense makes reconnaissance ineffective
Defended system changes over time
Security through diversity defense
38
Moving Target Examples
Categories of MT environments
[Figure: MT categories (axis labels garbled in the source except "computer") with examples: route remap, network shuffle, TCP/IP obfuscation, ASLR, and EAMT (you are here)]
Several successful examples of moving target environments
Memory address randomization (ASLR)
Network address shuffling
Route remapping
Want to create a moving target defense for computers
Change properties of the system, do not hide or cloak
Hopefully evolve to more secure systems
39
Configurations and Moving Targets
...
# Create "/keygen" if it doesn’t exist.
class gen_class
{
  file { "/etc/cipher/keygen.lst":
    ensure => present,
    mode   => '0644',
    owner  => 'root',
    group  => 'root'
  }
}
...
...
NameVirtualHost 172.20.30.40
<VirtualHost 172.20.30.40>
# primary vhost
DocumentRoot /www/subdomain
RewriteEngine On
RewriteRule /.* /www/subdomain/index.html
ServerSignature Off
ServerTokens Prod
# end definition
</VirtualHost>
...
...
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# increase Linux autotuning TCP buffer limits
# min, default, and max number of bytes to use
# (only change 3rd value, make it 16 MB or more)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# recommended to increase this for 10G NICS
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_congestion_control=cubic
# are you really reading this?
...
Attacker observes computer properties during reconnaissance phase
Configurations define properties of the OS and applications
Changing the configuration could create a moving target (which is
different than masking)
Defining one good configuration is difficult
Windows system typically has over 200,000 registry entries
We want to find multiple good configurations...
40
Difficulty of the Problem
Let C be a configuration of n individual settings, C = {p1, p2, ..., pn}
Parameter               Description                                  Possible Values
KeepAlive               allow requests over the same connection      0, 1
KeepAliveTimeout        time to wait for the next request            0 - 1800
Indexes                 automatic directory indexing                 0, 1
LimitRequestBody        limit the message size                       0 - 65535
LimitRequestFields      limit number of HTTP request header fields   0, 1
LimitRequestFieldSize   limit HTTP header field size                 0 - 32767
C = {1, 800, 1, 32767, 0, 1}
Parameters may be interdependent, forming parameter chains
Certain subset of settings may be required for feasibility or security
Setting Apache KeepAliveTimeout and StartServers too high
can enable a DoS attack
Can model the configuration as a Boolean expression
Dependencies can be expressed using AND, OR, and NOT
Finding a secure configuration is similar to the satisfiability problem
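A minimal sketch of the Boolean-expression view, assuming a tiny hypothetical setting space so that every configuration can be enumerated. The domains, the thresholds, and the single rule (NOT (timeout too high AND too many start servers)) are illustrative assumptions loosely based on the KeepAliveTimeout/StartServers remark above. They are not the constraints used in the actual work.

```python
from itertools import product

# Hypothetical, deliberately tiny domains (3 settings, 18 combinations).
DOMAINS = {
    "KeepAlive": [0, 1],
    "KeepAliveTimeout": [5, 300, 1800],
    "StartServers": [2, 8, 64],
}


def is_feasible(cfg):
    """Boolean expression over settings, built from AND and NOT."""
    # The timeout only matters when KeepAlive is enabled.
    timeout_high = cfg["KeepAlive"] == 1 and cfg["KeepAliveTimeout"] > 300
    servers_high = cfg["StartServers"] > 8
    # Assumed rule: both being high together is not allowed.
    return not (timeout_high and servers_high)


def feasible_configs(domains):
    """Enumerate every assignment and keep those satisfying the expression."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        cfg = dict(zip(names, values))
        if is_feasible(cfg):
            yield cfg


print(len(list(feasible_configs(DOMAINS))))  # 17 of the 18 combinations
```

Exhaustive enumeration only works for toy spaces like this one. With hundreds of interdependent parameters the search space grows exponentially, which is the motivation for a heuristic search.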
41
Moving Target Objectives
Want configurations to be diverse, temporally and spatially
Temporally - difference in a single computer over time
Spatially - difference between computers at any point in time
Also want configurations to be feasible and secure
42
Configuration Guidelines
Several guidelines exist for securing configurations
Defense Information Systems Agency (DISA) STIG
Common Vulnerabilities and Exposures (CVE)
Open Web Application Security Project (OWASP)
Guidelines tend to be generic
Settings for simple computer/server installations
Could there be alternatives to known secure settings?
43
Evolution and Genetic Algorithms
GAs mimic evolution to find solutions
Better solutions created from good solutions
Mutations can help form a MT defense
Model configurations as chromosomes and apply evolution
44
Modeling Configurations as Chromosomes
Configuration consists of multiple parameters
Individual configuration parameters are the chromosome traits
Therefore configuration is a chromosome, C
Name              Description                        Value   Purpose
file permission   permissions for the secret file    222     security
login banner      message displayed upon login       2       diversity
file ownership    ownership of the secret file       774     security
max open files    change max number of open files    220     diversity
max file          change max file descriptor         409     diversity
...               ...                                ...     ...
C = {222, 2, 774, 220, 409, ...}
Chromosome may have security and/or diversity parameters
Security parameters affect the security of the system
Diversity parameters affect something observable in the system
All chromosomes must represent feasible configurations
Chromosomes will be ranked, so a measure of fitness is needed
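One way to picture the encoding is the short sketch below, which stores each trait with its name, value, and purpose and then flattens the values into the chromosome C. The `Trait` class and `to_chromosome` helper are hypothetical names introduced here for illustration; the values are the ones from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trait:
    name: str
    value: int
    purpose: str  # "security" or "diversity"


def to_chromosome(traits: List[Trait]) -> List[int]:
    """Flatten a list of traits into the value vector C."""
    return [t.value for t in traits]


traits = [
    Trait("file permission", 222, "security"),
    Trait("login banner", 2, "diversity"),
    Trait("file ownership", 774, "security"),
    Trait("max open files", 220, "diversity"),
    Trait("max file", 409, "diversity"),
]

print(to_chromosome(traits))  # [222, 2, 774, 220, 409]
print([t.name for t in traits if t.purpose == "security"])
```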
45
Configuration Security
Fitness of a chromosome based on the attacks encountered
Attack difficulty and impact
Common Vulnerability Scoring System (CVSS) is a six-part vector
Group        Part   Description               Possible Values
Difficulty   AV     access vector             Local, Adjacent-network, Network-external
             AC     access complexity         High, Medium, Low
             Au     authentication required   Multiple, Single, None
Damage       C      confidentiality           None, Partial, Complete
             I      integrity                 None, Partial, Complete
             A      availability              None, Partial, Complete

CVE                    CVSS Vector                   CVSS Score
Default ssh password   AV:N/AC:L/Au:N/C:P/I:P/A:P    7.5 (high)
46
Chromosome/Configuration Fitness
CVSS uses a convoluted equation for the numerical value (0-10)
Three values (1, 10, and 100) used to score each CVSS vector part
AV/AC/Au/C/I/A
Score    Example CIA Description        Value
Good     Suffered no attack             100
Medium   Attacked with minimal impact   10
Bad      Attacked with maximum impact   1
Therefore best parameter score is 600, the worst is 6
Parameter scores added to provide a configuration score
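The arithmetic above can be sketched as follows: each of the six vector parts receives 1, 10, or 100, the six values are summed into a parameter score (6 to 600), and parameter scores are summed into a configuration score. How an observed attack is mapped to a value for each part is not specified here, so the function simply takes those values as input; the function names are assumptions.

```python
PARTS = ("AV", "AC", "Au", "C", "I", "A")
ALLOWED = {1, 10, 100}  # bad, medium, good


def parameter_score(part_values):
    """Sum the six per-part values for one parameter (range 6 to 600)."""
    for part in PARTS:
        if part_values.get(part) not in ALLOWED:
            raise ValueError(f"part {part!r} must be scored 1, 10, or 100")
    return sum(part_values[part] for part in PARTS)


def configuration_score(parameters):
    """Add the parameter scores to obtain the configuration score."""
    return sum(parameter_score(p) for p in parameters)


best = {part: 100 for part in PARTS}   # suffered no attack
worst = {part: 1 for part in PARTS}    # attacked with maximum impact

print(parameter_score(best))               # 600
print(parameter_score(worst))              # 6
print(configuration_score([best, worst]))  # 606
```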
47
Chromosome/Configuration Fitness Problems
Chromosome fitness value will not always be correct
No incidences does not necessarily indicate a secure configuration
New vulnerabilities will be discovered
48
Genetic Algorithm Components
[Figure: two chromosomes, each with traits p1 to p7, passing through selection, crossover, and mutation]
Selection - Determine the best pair of chromosomes from pool
Crossover - Combine best chromosomes to produce new chromosomes
Mutation - Randomly change traits in the new chromosome
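The three components can be sketched in a few lines. This is a generic textbook version (best-pair selection, single-point crossover, per-trait random mutation) with a toy fitness function, not the framework's actual operators; the pool, domains, and mutation rate are hypothetical.

```python
import random


def select_best_pair(pool, fitness):
    """Selection: return the two fittest chromosomes in the pool."""
    if len(pool) < 2:
        raise ValueError("need at least two chromosomes")
    ranked = sorted(pool, key=fitness, reverse=True)
    return ranked[0], ranked[1]


def crossover(a, b, rng):
    """Crossover: swap the tails of two parents at one random cut point."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, domains, rate, rng):
    """Mutation: with probability `rate`, redraw a trait from its domain."""
    return [rng.choice(domains[i]) if rng.random() < rate else value
            for i, value in enumerate(chromosome)]


rng = random.Random(0)
pool = [[0, 5, 0], [1, 800, 1], [1, 30, 0]]
domains = [[0, 1], [5, 30, 800], [0, 1]]  # allowed values per trait

parents = select_best_pair(pool, fitness=sum)  # toy fitness: sum of values
children = crossover(parents[0], parents[1], rng)
offspring = [mutate(child, domains, rate=0.1, rng=rng) for child in children]
print(parents, offspring)
```

Drawing mutated values from a per-trait domain keeps each trait within its allowed range, but it does not by itself guarantee that the whole configuration is feasible; that still has to be checked separately.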
49
Evolution and Genetic Algorithms
[Figure: flowchart in two parts. Discovering new configurations: start, select chromosomes, crossover, mutate, new chromosomes generated, activate chromosome on VM, feasible? (yes/no), evaluate chromosome performance, update pool. Managing configurations on host: intermittently select best chromosome, chromosome active on host, update chromosome performance.]
50
EAMT Framework (process and software view)
[Figure: process view (the flowchart from the previous slide) beside the software component view: EA (discovery), VM (implementation) on a VM farm, and Assessment (scoring), linked by configuration XML documents, security event reports, and configuration scores, together with the host-side steps (chromosome active on host, intermittently select best chromosome, update chromosome performance)]
Python-based framework created to manage computer configurations
MTD for RedHat Enterprise Linux (RHEL5), Apache 2.2 servers
EAMT processes mapped to four primary software components
51
EAMT Framework Operation
[Figure: software component view: EA (discovery), VM (implementation) on a VM farm, and Assessment (scoring), linked by configuration XML documents, security event reports, and configuration scores, together with the host-side steps]
Four components continually iterate
Configuration database (chromosome pool) should improve
Periodically, chromosome selected and instantiated on actual server
This provides a moving target defense for the actual server
52
Experimental Results
Experiments using 140 RHEL5 and Apache 2.2 installed servers
Managed configurations have 102 parameters, initially insecure
Parameters have different possible value ranges
EA (+PDM) compared with Beam and random search algorithms
Performance based on security and diversity (spatial) provided
Fitness score based on parameter settings (CVSS-like)
Diversity is the sampled Hamming distance between configurations
Seek high fitness and diversity values
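The diversity measure can be sketched as an average Hamming distance over pairs of configurations, optionally over a random sample of pairs when the population is large. The exact sampling scheme used in the experiments is not described here, so `sample_size` and the example configurations are assumptions. The same function covers spatial diversity (different hosts at one time) and temporal diversity (one host at different times), depending on which configurations are passed in.

```python
import random
from itertools import combinations


def hamming(c1, c2):
    """Number of positions at which two configurations differ."""
    if len(c1) != len(c2):
        raise ValueError("configurations must have the same length")
    return sum(x != y for x, y in zip(c1, c2))


def mean_pairwise_hamming(configs, sample_size=None, rng=None):
    """Average Hamming distance over all pairs, or over a random sample."""
    pairs = list(combinations(range(len(configs)), 2))
    if not pairs:
        return 0.0
    if sample_size is not None and sample_size < len(pairs):
        rng = rng or random.Random()
        pairs = rng.sample(pairs, sample_size)
    total = sum(hamming(configs[i], configs[j]) for i, j in pairs)
    return total / len(pairs)


configs = [[1, 800, 1], [1, 800, 0], [0, 30, 0]]
print(mean_pairwise_hamming(configs))  # 2.0
```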
53
EAMT Results
[Figure: average fitness score (4 to 5.5, x10^4) versus generation (0 to 40) for EAMT, Beam, and Random]
[Figure: average pairwise Hamming distance (0 to 80) versus generation (0 to 40) for EAMT, Beam, and Random]
EA was able to discover the most secure configurations
EA had less diversity than random, but better than Beam
54
Star Plots
[Figure: star plots: example stars, generation 1, and generation 41]
Each star is a configuration and consists of 104 lines (parameters), line length
represents the value.
55
Discovering Secure Settings
Can apply machine learning given configurations
Discover secure settings for a certain attack
Using decision trees for classification
Rules (settings) can be applied to future configurations
Help administrators understand secure parameter settings
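As a rough illustration of rule discovery, the sketch below fits the simplest possible decision tree, a single-split "stump", to labelled configurations. It searches every parameter and threshold for the rule that best separates attacked from non-attacked configurations. The data, the function name `best_stump`, and the use of a stump instead of a full tree are all simplifying assumptions made here.

```python
def best_stump(configs, attacked):
    """Return (param_index, threshold, attacked_if_above, errors).

    The rule predicts "attacked" when value > threshold if
    attacked_if_above is True, and when value <= threshold otherwise.
    """
    best = None
    for i in range(len(configs[0])):
        for threshold in sorted({c[i] for c in configs}):
            for above in (True, False):
                errors = sum(((c[i] > threshold) == above) != label
                             for c, label in zip(configs, attacked))
                if best is None or errors < best[3]:
                    best = (i, threshold, above, errors)
    return best


# Hypothetical observations: [KeepAlive, KeepAliveTimeout] and outcome.
configs = [[1, 1800], [0, 1800], [1, 30], [0, 15]]
attacked = [True, True, False, False]

print(best_stump(configs, attacked))  # (1, 30, True, 0)
```

Read as a rule, the result says that in this toy data a configuration was attacked exactly when parameter 1 exceeded 30, which is the kind of setting-level guidance an administrator could apply to future configurations.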
56
Final Remarks (for this talk)
Biomimicry is gaining popularity
Does the Hype Curve apply?
Several centers, majors, and courses
Bio-inspired security has several great characteristics
Does not strive for optimal, just tries to be good enough
Adaptive, resilient, scalable, and diverse
Diversity is an important aspect of moving target defenses
Possible to use evolutionary strategies to discover configurations
Evolved configurations are more secure and diverse
57
Turn back, you’ve gone too far
58