UNM CS Dept. Profile

Technological Networks: From Empirical Laws to Theory Stephanie Forrest Dept. of Computer Science University of New Mexico http://cs.unm.edu/~forrest [email protected]


Page 1: UNM CS Dept. Profile

Technological Networks: From Empirical Laws to Theory

Stephanie Forrest
Dept. of Computer Science
University of New Mexico

http://cs.unm.edu/~forrest
[email protected]

Page 2: UNM CS Dept. Profile

UNM CS Dept. Profile

• 18 Faculty:
  – 9 tenured (5 full, 4 assoc), 5 assistant, 1 lecturer
  – 3 openings (2 junior, 1 senior)
  – Prince of Asturias Endowed Chair in Information Technology
  – External faculty appointments from other departments and national labs.
• Close collaborations with SFI and national labs (SNL and LANL)
• Students:
  – Degrees: BS: ~40, MS: ~35, PhD: ~5
  – Undergraduate: ~200 majors; BS degrees
    • >20% Female; >35% Minorities
  – Graduate: ~120 MS, ~80 Ph.D.
    • >20% Female; 40-60% Foreign
• Funding (2004-2005):
  – Total: $3.5M
  – NSF, DARPA, DOE, NIH, Sandia and Los Alamos

Page 3: UNM CS Dept. Profile

UNM CS Dept. Research

• Strongly Interdisciplinary:
  – Adaptive computation
  – New paradigms of computing (molecular and quantum computation)
  – Computational biology and bioinformatics (phylogenetic tree reconstruction, radiology, RNAi)
• Graphics and visualization
• High-performance computing:
  – Light-weight distributed operating systems
• Automated reasoning and machine learning:
  – Otter
  – POMDP
• Complex networks:
  – Provably robust scalable algorithms for P2P networks
  – Phase transitions in NP-complete problems

Page 4: UNM CS Dept. Profile

Themes of Talk

• The real world isn't exactly scale free.
• Understanding and predicting network structure is important for engineering:
  – Network properties can be exploited to enhance computer security (computer epidemiology / border gateway protocol)
• We lack theory to explain/predict the structure of technological networks:
  – Preferential attachment isn’t good enough.
• Initial steps toward theory.

Page 5: UNM CS Dept. Profile

Technological Networks

• Distribute resources:
  – Energy
  – Materials
  – Information
• Energy distribution:
  – Power grids
  – Gas pipelines
• Transportation:
  – Highways
  – Airline routes
• The Internet:
  – Physical connectivity
  – Autonomous systems (AS)
  – World-wide web
  – Social contacts, e.g., email
• Microarchitecture

Page 6: UNM CS Dept. Profile

Network Structure

• Network topology affects network properties:
  – Shortest distance between two nodes
  – Bisection width
  – Rate and extent of contagion
• Analysis:
  – Epidemiological models.
  – The epidemic threshold.
• Degree distribution of network:
  – Scale-free (power law) networks: $P_k \propto k^{-c}$
• Controlling infections on scale-free networks:
  – Random vaccination is ineffective (e.g., anti-virus software).
  – Targeted vaccination of high-connectivity nodes.
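A minimal sketch of why the degree distribution matters for the epidemic threshold. It assumes the standard heterogeneous mean-field estimate for SIS-type spreading on an uncorrelated network, threshold = ⟨k⟩/⟨k²⟩, which is not stated on the slide. The two degree sequences are synthetic, and the function names and parameters (c = 2.5, degree range 2 to 1000) are illustrative choices only.

```python
import random

def epidemic_threshold(degrees):
    """Mean-field threshold estimate <k> / <k^2> for a degree sequence."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2

def power_law_degrees(n, c, k_min, k_max, rng):
    """Draw n degrees with P(k) proportional to k**(-c) on [k_min, k_max]."""
    ks = list(range(k_min, k_max + 1))
    weights = [k ** (-c) for k in ks]
    return rng.choices(ks, weights=weights, k=n)

rng = random.Random(0)
homogeneous = [4] * 10_000                      # every node has degree 4
scale_free = power_law_degrees(10_000, c=2.5, k_min=2, k_max=1000, rng=rng)

print("homogeneous threshold:", epidemic_threshold(homogeneous))
print("scale-free threshold: ", epidemic_threshold(scale_free))
```

Under this estimate the heavy tail inflates ⟨k²⟩, so the scale-free sequence has a much lower threshold than the homogeneous one. That is the usual argument for why the few high-degree nodes are the ones worth vaccinating.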


Page 7: UNM CS Dept. Profile

Example 1: Computer Epidemiology
Justin Balthrop, Mark Newman, Matt Williamson

• Viruses and worms spread over networks of contacts between computers:
  – Email address books.
  – URL links.
• Different types of networks are exploited by different types of infections.

[Figure: UNM CS Dept. network of address books]

Page 8: UNM CS Dept. Profile

Degree Distributions of Four Networks Relevant to Computer Security

– Not scale-free.
– Targeted vaccination unlikely to be effective:
  • > 10% of nodes required for address book data
  • > 87% of nodes required for email traffic data
– Computer infections can choose their own topology, so network topology is not static.
– Viruses spread faster than repairs.

[Figure: degree distributions (x-axis: degree k) for four networks: IP network, administrator network, email traffic, and address books.]

Science 304:527-529 (2004)

Page 9: UNM CS Dept. Profile

Throttling: Generic Control of Epidemics
Matt Williamson and Justin Balthrop

• Control network topology in time rather than space.
  – Limit the rate at which a computer can make new connections.
  – Limits spreading rates rather than stopping the spread.
• Assumes that virus traffic is significantly different from normal traffic:
  – Nimda infects up to 400 new machines per second, compared to a normal rate of connections to new Web servers of about one connection per second or slower (Williamson, 2002).
  – Throttling Nimda would have increased the epidemic time from one day to over one year.
• Advantages:
  – Effective when the form of infection is not known in advance.
  – Reduces the amount of traffic generated during an epidemic.
• Limitations:
  – Effectiveness is reduced if not all nodes are throttled (altruism) and by stealth attacks.
• Implementations:
  – RIOT
  – HP’s Virus Throttle
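A rough sketch of a connection throttle in the spirit of the bullets above: connections to recently contacted hosts pass immediately, while connections to new hosts wait in a queue that drains at a fixed rate. The class name, the working-set size of 5, and the one-release-per-tick rate are assumptions made for illustration. They are not taken from RIOT or from HP's Virus Throttle.

```python
from collections import deque

class ConnectionThrottle:
    """Delay connections to hosts outside a small working set of recent hosts."""

    def __init__(self, working_set_size=5, releases_per_tick=1):
        self.recent = deque(maxlen=working_set_size)  # recently contacted hosts
        self.delayed = deque()                        # queued new-host requests
        self.releases_per_tick = releases_per_tick

    def request(self, host):
        """Return True if the connection may proceed now, False if queued."""
        if host in self.recent:
            return True
        self.delayed.append(host)
        return False

    def tick(self):
        """Call once per time step (e.g., one second): release queued hosts."""
        released = []
        for _ in range(min(self.releases_per_tick, len(self.delayed))):
            host = self.delayed.popleft()
            self.recent.append(host)
            released.append(host)
        return released

throttle = ConnectionThrottle()
# A worm-like burst: 400 distinct new hosts requested within one time step.
for i in range(400):
    throttle.request(f"host-{i}")   # hypothetical host names
seconds = 0
while throttle.delayed:
    throttle.tick()
    seconds += 1
print(seconds)  # 400: one release per tick drains the burst in 400 ticks
```

If spreading time scales inversely with the rate of new connections (a simplifying assumption), a slowdown from 400 connections per second to one per second stretches a one-day epidemic to roughly 400 days. That is consistent with the "over one year" figure on the slide.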

Page 10: UNM CS Dept. Profile

Responsive I/O Throttle (RIOT)
Justin Balthrop and Matt Williamson

• Vision: An adaptive desktop firewall:
  – Hamper worms, viruses, DoS, misconfigurations, etc.
  – Graduated automated response (throttling) on all connections.
  – Adaptive (learn normal).
  – Robust to false positives
• Personal desktop firewalls:
  – Coarse-grained
  – Static and preprogrammed
• Generic rate limit (throttle):
  – Delay network connections that occur at an anomalous rate.
  – Detector activation triggers delay and packets are dropped.
• How to detect anomalous connections?
  – Set of detectors (lymphocytes) that observe TCP connections:
    • Data Path (below)
    • Meta-information (direction, TCP flags)
  – Each detector matches some portion of IP space.
  – Each detector has its own normal activity level.

Data Path: IP Address 1 (32 bits) | IP Address 2 (32 bits) | TCP Port 1 (16 bits) | TCP Port 2 (16 bits)
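One way to picture the 96-bit data path and a detector that "matches some portion of IP space" is sketched below. The field layout (32 + 32 + 16 + 16 bits) comes from the slide. The wildcard matching rule, the function names, and the example addresses are assumptions for illustration, since the slide does not say how RIOT detectors actually match.

```python
import ipaddress

def connection_bits(ip1, ip2, port1, port2):
    """Pack two IPv4 addresses and two TCP ports into a 96-bit string."""
    a = int(ipaddress.IPv4Address(ip1))
    b = int(ipaddress.IPv4Address(ip2))
    return f"{a:032b}{b:032b}{port1:016b}{port2:016b}"

def detector_matches(pattern, bits):
    """Hypothetical matching rule: '*' is a wildcard, other characters must be equal."""
    return len(pattern) == len(bits) and all(
        p == "*" or p == b for p, b in zip(pattern, bits)
    )

bits = connection_bits("64.106.20.1", "192.0.2.7", 51234, 80)

# A detector covering every connection whose first address lies in 64.106.0.0/17:
# fix the first 17 bits and leave the remaining 79 bits as wildcards.
prefix = f"{int(ipaddress.IPv4Address('64.106.0.0')):032b}"[:17]
pattern = prefix + "*" * (96 - 17)

print(len(bits), detector_matches(pattern, bits))  # 96 True
```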

Page 11: UNM CS Dept. Profile

Learning and Throttling Connections

1. A new connection is initiated (SYN Packet)

2. The connection information is translated to a bitstring

3. The bitstring is shown to all detectors

4. An immature detector exceeds its activation threshold

5. The detector’s activation threshold is raised

6. Detectors with stable activation thresholds become mature
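Steps 4 to 6 above can be sketched as a detector that raises its own threshold while it is immature and signals a delay once it is mature. This is a sketch under stated assumptions: counting per fixed time window, the `maturity_age` parameter, and the rule "mature after N windows with no threshold change" are illustrative guesses, not the RIOT implementation.

```python
class Detector:
    """Hypothetical detector with an adaptive activation threshold."""

    def __init__(self, threshold=1, maturity_age=3):
        self.threshold = threshold        # allowed matches per time window
        self.count = 0                    # matches seen in the current window
        self.stable_windows = 0           # consecutive windows with no change
        self.maturity_age = maturity_age  # stable windows needed to mature
        self.raised = False               # threshold raised in this window?

    @property
    def mature(self):
        return self.stable_windows >= self.maturity_age

    def observe(self):
        """Record one matching connection; return True if it should be delayed."""
        self.count += 1
        if self.count <= self.threshold:
            return False
        if self.mature:
            return True                   # anomalous rate: throttle
        self.threshold = self.count       # still learning: raise the threshold
        self.raised = True
        return False

    def end_window(self):
        """Close the current time window and update the stability counter."""
        self.stable_windows = 0 if self.raised else self.stable_windows + 1
        self.raised = False
        self.count = 0

d = Detector()
for rate in [3, 2, 3, 3, 3]:              # normal traffic: at most 3 per window
    for _ in range(rate):
        d.observe()
    d.end_window()

burst = [d.observe() for _ in range(10)]  # 10 connections in a single window
print(d.threshold, d.mature, burst.count(True))  # 3 True 7
```

During the learning windows nothing is delayed. After maturity, the first three connections in a window pass and the remaining seven of the burst are flagged for delay.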

Page 12: UNM CS Dept. Profile

Autonomic Responses: A Repertoire

• LISYS (Steven Hofmeyr): E-mail the system administrator when there is an anomalous network connection.
• Personal Desktop Firewall: Block network connections based on user-specified policy.
• pH (Anil Somayaji): Slow down system calls for anomalous processes.
• Throttling (Matt Williamson): Limit the overall rate of new network connections for a computer.
• RIOT (Justin Balthrop): Limit the rate of all anomalously behaving TCP connections. Adaptive.
• PGBGP (Josh Karlin): Limit the rate at which new BGP routes are adopted.

Page 13: UNM CS Dept. Profile

Example 2: Inter-Domain Routing
Josh Karlin, Stephanie Forrest, and Jennifer Rexford

• ~20,000 Autonomous Systems (ASs) connected via the Border Gateway Protocol (BGP)
• ASs route blocks of IP addresses, known as prefixes:
  – 64.106.0.0/17 (owned by UNM)
• ~170,000 prefixes owned by ASs today.
• Border Gateway Protocol (BGP):
  – Tell neighbors about new routes (Announcements).
  – Tell neighbors about old routes gone bad (Withdrawals).
  – That’s it.

[Figure: example AS topology showing SWCP, UNM, Time Warner Telecom, Comcast, and AT&T.]
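A small sketch of what a prefix is and of the two BGP message types named above, using Python's standard `ipaddress` module. The `routes` table, the `announce` and `withdraw` helpers, and the AS numbers (taken from the range reserved for documentation) are hypothetical. A real BGP speaker keeps routes per neighbor and applies policy, which this sketch omits.

```python
import ipaddress

unm = ipaddress.ip_network("64.106.0.0/17")    # prefix from the slide
print(unm.num_addresses)                        # 2**(32 - 17) = 32768 addresses
print(ipaddress.ip_address("64.106.20.1") in unm)

routes = {}  # prefix -> AS path most recently announced by a neighbor

def announce(prefix, as_path):
    """Handle an announcement: install or replace the route for a prefix."""
    routes[ipaddress.ip_network(prefix)] = list(as_path)

def withdraw(prefix):
    """Handle a withdrawal: forget the route for a prefix, if any."""
    routes.pop(ipaddress.ip_network(prefix), None)

announce("64.106.0.0/17", [64500, 64496])       # hypothetical AS path
print(routes[unm])
withdraw("64.106.0.0/17")
print(unm in routes)
```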

Page 14: UNM CS Dept. Profile

BGP Networks are Interesting and Important

• Distributed:
  – Nodes are AUTONOMOUS systems.
  – No centralized routing information (routes stored and maintained locally).
  – No authentication of new nodes or routes.
• Dynamic:
  – Network connectivity changes routinely and continually.
  – Network updates are spread through local contact (BGP).
• Confluence of technological and economic constraints:
  – A “policy” network as well as a routing network.
• All inter-domain Internet traffic relies upon BGP.
• Vulnerable:
  – Trivial to inject false routing information into network.
  – Man-in-the-Middle attacks.
• Pretty Good BGP (PGBGP) and Internet Alert Registry (IAR).
  – Throttle the adoption of new routes.

Page 15: UNM CS Dept. Profile

PGBGP Algorithm

• Main Idea: Delay Suspicious Routes
  – Lower the preference of suspicious routes (24hr)
• Detection:
  – Monitor BGP update messages
  – Treat origin ASs for a prefix seen within the past few days as normal
  – Treat new origin ASs as suspicious for 24 hours, then accept as normal (possible prefix hijack)
  – Treat new sub-prefixes as suspicious for 24 hours, then accept as normal (possible sub-prefix hijack)
• Response:
  – Suspicious origin AS routes are temporarily given low local preference
  – Suspicious sub-prefixes are temporarily ignored (not forwarded to)
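The detection rules above can be sketched as a small history of (prefix, origin AS) pairs. This is a simplified illustration, not the PGBGP implementation. The class and method names are hypothetical, the AS numbers come from the documentation-reserved range, and the sketch omits expiry of old history ("the past few days") and the response step.

```python
import ipaddress

SUSPICIOUS_SECONDS = 24 * 3600  # 24-hour suspicious period from the slide

class OriginHistory:
    """Track when each (prefix, origin AS) pair was first announced."""

    def __init__(self):
        self.first_seen = {}  # (network, origin AS) -> timestamp in seconds

    def _is_normal(self, key, now):
        t = self.first_seen.get(key)
        return t is not None and now - t >= SUSPICIOUS_SECONDS

    def classify(self, prefix, origin, now):
        """Return 'normal', 'suspicious-origin', or 'suspicious-subprefix'."""
        net = ipaddress.ip_network(prefix)
        key = (net, origin)
        self.first_seen.setdefault(key, now)
        if self._is_normal(key, now):
            return "normal"
        normal_nets = [n for (n, o) in self.first_seen
                       if self._is_normal((n, o), now)]
        if net in normal_nets:
            return "suspicious-origin"      # known prefix, new origin AS
        if any(n.version == net.version and net != n and net.subnet_of(n)
               for n in normal_nets):
            return "suspicious-subprefix"   # new, more specific prefix
        return "normal"                     # unrelated new prefix

h = OriginHistory()
h.classify("64.106.0.0/17", 64496, now=0)             # first sighting of the owner
day2 = 2 * 24 * 3600
print(h.classify("64.106.0.0/17", 64496, now=day2))   # normal
print(h.classify("64.106.0.0/17", 64511, now=day2))   # suspicious-origin
print(h.classify("64.106.8.0/24", 64511, now=day2))   # suspicious-subprefix
print(h.classify("64.106.0.0/17", 64511, now=day2 + SUSPICIOUS_SECONDS))  # normal
```

A router using such a classifier would then give "suspicious-origin" routes a low local preference and skip "suspicious-subprefix" routes until the 24 hours have passed, as the Response bullets describe.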

Page 16: UNM CS Dept. Profile

PGBGP Advantages

• Incremental deployability
  – No change to BGP protocol, just to path selection
  – Immediate benefits to adopting AS and customers
• Automated and immediate response
  – Avoid using and propagating the bogus route
  – Network has chance to stop the attack before it spreads
• Robust to false positives
  – Lowering preference for suspicious routes
  – No loss of reachability
  – Accidental short-term delays do no harm
• Offline investigation of suspicious route
  – Internet Alert Registry, active probing, …
• Adaptive, simple

Page 17: UNM CS Dept. Profile

Incremental Deployment

[Figure: incremental deployment results; panels: Prefix Hijack and Subprefix Hijack.]

ICNP (2006)

• Limitations:
  – Doesn’t address path spoofing, redistribution attacks.
  – Negligence

Page 18: UNM CS Dept. Profile

BGP Network Structure
Petter Holme and Josh Karlin

• Barabasi-Albert (BA) model (Barabasi and Albert, 1999):
  – Vertices and edges added iteratively
  – Probability of attaching to vertex i is proportional to k(i)
• Inet model (Winick and Jamin, 2000)
  – No simple growth principle
  – Generate random graph with known degree distribution
  – Augment to mimic additional known correlations (e.g., connecting all high-degree nodes)

Proc. Royal Acad. A (in press)
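A minimal sketch of the preferential-attachment growth rule in the BA model described above. The seed graph (a small clique), the choice of m distinct targets per new vertex, and the parameters n = 5000 and m = 2 are assumptions for illustration. Sampling distinct targets deviates slightly from strict degree-proportional attachment.

```python
import random
from collections import Counter

def barabasi_albert(n, m, rng):
    """Grow an n-vertex graph; each new vertex attaches to m existing vertices
    chosen with probability proportional to their current degree k(i)."""
    # Seed: a clique on m + 1 vertices.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each vertex appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw over vertices.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = barabasi_albert(5000, 2, random.Random(1))
degree = Counter(v for e in edges for v in e)
print(len(edges), min(degree.values()), max(degree.values()))
```

The minimum degree equals m, while a few early vertices accumulate very large degrees. That heavy tail is what preferential attachment produces and what the talk argues is not sufficient to explain the real AS graph.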

Page 19: UNM CS Dept. Profile

Tentative Conclusions

• Real AS graph is more heterogeneous than can be expected from degree distribution alone:
  – Core providers in the low-d tail
  – Peak at d=3 (vertices directly connected to the core)
  – Second peak at d=4 (vertices directly connected to d=3 nodes)
  – More structure in periphery than predicted by earlier models.
• Preferential attachment is a poor model for network growth.
• What constraints determine network architecture and growth?

Page 20: UNM CS Dept. Profile

Allometric Scaling in Biology

• Dominant design constraint:
  – Distribute resources to every cell in the organism
• Internal, space-filling, hierarchical networks (vascular system)
• Invariant terminal units (capillaries)
• Optimality (minimize transport time, maximize metabolism)

[Figure: ln(metabolic rate × e^{E/kT}) versus ln(body mass) for endotherms, reptiles, fish, amphibians, invertebrates, unicells, and plants. Fitted line: y = 0.71x + C.]

$Y = Y_0 X^{3/4}$
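The scaling law Y = Y0 X^(3/4) becomes a straight line on log-log axes, ln Y = ln Y0 + (3/4) ln X, which is why the exponent is read off as a slope. The sketch below fits that slope by least squares on synthetic data generated with an exponent of 3/4 and multiplicative noise. It does not use the data behind the figure, and Y0 = 3.0 and the noise level are arbitrary assumptions.

```python
import math
import random

def fit_log_log(xs, ys):
    """Least-squares fit of ln(y) = slope * ln(x) + intercept."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(0)
# Synthetic "body masses" spanning nine orders of magnitude.
masses = [10 ** rng.uniform(-3, 6) for _ in range(200)]
# Synthetic "metabolic rates": Y0 * X**0.75 with log-normal noise.
rates = [3.0 * m ** 0.75 * math.exp(rng.gauss(0, 0.1)) for m in masses]

slope, intercept = fit_log_log(masses, rates)
print(f"fitted exponent: {slope:.3f}, fitted Y0: {math.exp(intercept):.3f}")
```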

Page 21: UNM CS Dept. Profile

Examples: Scaling in Software
H. Inoue

[Figures: HelloWorld: unique function calls vs. invocation frequency; LimeWire behavior.]

Page 22: UNM CS Dept. Profile

Example: Scaling in Software
Dave Ackley and Terry Van Belle

• How to measure evolvability?
  – Likelihood (how likely is a location to change).
  – Impact (change in one location affects other locations).
• Work = [Likelihood x Impact]:
  – Software maintenance costs.
  – Expected time to evolve.
• Study long-lived Java code bases:
  – Code change sizes and frequencies are power-law-ish.

Van Belle, 2004
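One possible reading of "Work = [Likelihood x Impact]" is an expected cost: for each code location, the probability that it changes times the number of locations affected by that change, summed over locations. This interpretation, the function name, and the module names and numbers below are all assumptions for illustration. The slide does not define the bracket notation.

```python
def expected_work(likelihood, impact):
    """Sum over locations of P(location changes) * locations affected."""
    return sum(likelihood[loc] * impact[loc] for loc in likelihood)

# Hypothetical code base with three locations.
likelihood = {"parser": 0.5, "ui": 0.25, "core": 0.25}  # chance of being changed
impact = {"parser": 2, "ui": 1, "core": 8}              # locations touched per change

print(expected_work(likelihood, impact))  # 0.5*2 + 0.25*1 + 0.25*8 = 3.25
```

Under this reading, a rarely changed but highly connected location ("core") can dominate the expected work, which is the kind of trade-off a likelihood-and-impact measure is meant to expose.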

Page 23: UNM CS Dept. Profile

A Theory of Network Scaling: Conclusions

• Why it’s important
  – Networking infrastructure continues to expand by orders of magnitude:
    • NSF GENI project: Redesign the Internet from the ground up.
    • Interplanetary networking.
  – Architecture:
    • Move from performance-oriented to power-aware designs.
    • The end of silicon.
  – Scaling problems in software, security problems everywhere
• Why it’s hard
  – Terminal units aren’t necessarily invariant sized
  – Dimensionality and geometry are not obvious
  – May require new mathematics