BEHOLD, WE HAVE SIGNAL

51
1 Institute for Software Research BEHOLD, WE HAVE SIGNAL

description

BEHOLD, WE HAVE SIGNAL. Software Architecture in Cyberspace. Mary Shaw Carnegie Mellon University http://www.cs.cmu.edu/~shaw/. The Internet infrastructure supports a creative and thriving social and economic system. The Internet is much richer than its communication infrastructure. - PowerPoint PPT Presentation

Transcript of BEHOLD, WE HAVE SIGNAL

Page 1: BEHOLD, WE HAVE SIGNAL

1

Institute for Software Research

BEHOLD, WE HAVE SIGNAL

Page 2: BEHOLD, WE HAVE SIGNAL

2Institute for Software Research

Software Architecturein

Cyberspace

Mary ShawCarnegie Mellon University

http://www.cs.cmu.edu/~shaw/

Page 3: BEHOLD, WE HAVE SIGNAL

3

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system.

The Internet is much richer than its communication infrastructure.

Architectural styles help explain software systems.

What is an appropriate theory of architectural style for Internet applications?

Page 4: BEHOLD, WE HAVE SIGNAL

4

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system.

The Internet is much richer than its communication infrastructure.

Architectural styles help explain software systems.

What is an appropriate theory of architectural style for Internet applications?

Page 5: BEHOLD, WE HAVE SIGNAL

5

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system. End users are integral to the Internet,

not simply its audience end users are active participants, not passive

consumers end user producers outnumber technical

professionals

Architectural styles help explain software systems.

What is an appropriate theory of architectural style for Internet applications?

Page 6: BEHOLD, WE HAVE SIGNAL

6

Institute for Software Research

Demographics of US Internet users

Overall Total adults 73% Women 73 Men 73Age 18-29 90% 30-49 85 50-64 70

65+ 35Geography urban 74% suburban 77

rural 63Education < high school 44%

high school 63some college 84college + 91

Pew Internet & American Life Project, 2008

Page 7: BEHOLD, WE HAVE SIGNAL

7

Institute for Software Research

Activities of online users (“have you ever?”)

Activity % of ‘net users DateSend or read email 92 Dec 07Use search engine 89 May 08Search for map/directions 86 Dec 06Look for info on hobby/interest 83 F-M 07Check on product before buying 81 Sep 07Check weather 80 May 08Look for health/medical info 75 Dec 07Get travel info 73 M-J 04Get news 73 May 08Buy product 71 Dec 07Visit government website 66 May 08Buy travel reservation 64 Sep 07Surf for fun 62 F-A 06 + 65 more

Pew Internet & American Life Project, 2008

Page 8: BEHOLD, WE HAVE SIGNAL

8

Institute for Software Research

Content from online users (“have you ever?”)

Activity % of ‘net users DateUpload photos to share 37 Aug 06Rate product with online rating system 32Sep 07Post comment or review of product 30 Sep 07Use online social networking site 29 Dec 06Categorize or tag online content 28 Dec 06Share files from your computer 27 M-J 05Post comments to online group or blog 22Dec 07Share something online you created 21 Dec 07Create content for the internet 19 Nov 04 !!Share files with peer-to-peer sharing 15 May 08Create or work on your own webpage 14 Dec 07Create web pages/blogs for others 13 Dec 07Create or work on your own blog 12 May 08Participate in online health discussion 12 Aug 06

Pew Internet & American Life Project, 2008

Page 9: BEHOLD, WE HAVE SIGNAL

9

Institute for Software Research

Generation differences

Online Gen Y Gen X Trailing LeadingMatures After

Teens BoomerBoomer Work Activity 12-17 18-28 29-40 41-50 51-59 60-69 70+

Go online 87 84 87 79 75 64 21Teens and Gen Y more likely than older users:

Online games 81 54 37 29 25 25 32Inst message 75 66 52 38 42 33 25Download music 51 45 28 16 14 8 5Download video 31 27 22 14 8 8 1Gen X or older generations dominate

Get health info 73 84 80 84 68 72Travel rez 50 72 64 64 59 60Bank online 38 50 44 37 35 22

Pew Internet & American Life Project, 2005

Page 10: BEHOLD, WE HAVE SIGNAL

10

Institute for Software Research

There are lots of end users

C. Scaffidi, M. Shaw, and B. Myers. Estimating the Numbers of End Users and End User Programmers. VL/HCC'05: Proc 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 207-214, 2005.

Using data from the Bureau of Labor Statistics, we estimate that over 90M Americans will use computers at work in 2012. Of these, only about 2.5M will be professional programmers; 40.5M will be managers and (non-software) professionals.

This does not include home users or non-US users, so there will be many more than 90M total end users. Most of them will “program” in some way.

Page 11: BEHOLD, WE HAVE SIGNAL

11

Institute for Software Research

End users are not software engineers

End users do not have rich and robust mental models of their computing systems they fail to do backups They misunderstand storage models (especially

local vs network storage) they execute malware they innocently engage in other risky behavior.

The responses of SE to this mismatch between real computing systems and end users’ models has been to seek ways to help the users act “rationally”.

But there is lots of evidence that people do not reason that way.

Page 12: BEHOLD, WE HAVE SIGNAL

12

Institute for Software Research

Human network is part of the Internet

Intrinsic property (say, dependability) of a unit in a specific (or assumed) environment based on a specific set of attributes

+ context, value proposition

Situational dependency, reflecting the environment and a specific user’s needs, tolerances, priorities, expectations

+ perception, understanding humans

Pragmatic dependability – as realized in practice – which results from decisions made by people and social groups and by the ways users behave

Page 13: BEHOLD, WE HAVE SIGNAL

13

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system.

The Internet is much richer than its communication infrastructure. This is an Ultra-Large Scale System

Usual software system assumptions don’t hold

Architectural styles help explain software systems.

What is an appropriate theory of architectural style for Internet applications?

Page 14: BEHOLD, WE HAVE SIGNAL

14

Institute for Software Research

Ultra-Large-Scale Systems

More than “systems of systems” or “networks of networks”

Large size on many dimensions … Lines of code, amount of data, users, dependencies

among and complexity of components, etc … but more than that

Decentralized operation and control Conflicting, unknowable, diverse requirements Continuous evolution and deployment Heterogeneous, inconsistent, changing elements Indistinct people/system boundary Normal failures New forms of acquisition and policy

SEI. Ultra-Large-Scale Systems. 2006

Page 15: BEHOLD, WE HAVE SIGNAL

15

Institute for Software Research

Decentralized operation and control

ULS system scale offers only limited possibilities for central or hierarchical control Long life, multiple users and objectives, span of

physical jurisdictions are the norm at ULS scale Many versions of subsystems must work

together Modifications are developed and installed by

independent groups Spontaneous, unanticipated new uses arise

Undermines common assumption: All conflicts must be resolved and must be

resolved uniformly

Page 16: BEHOLD, WE HAVE SIGNAL

16

Institute for Software Research

Conflicting, unknowable, diverse requirements

ULS systems serve wide range of competing needs Competing users contend for requirements Understanding of problem evolves Dependability is “better/worse”, not

“right/wrong” Wicked problems

Undermines common assumptions: Requirements are known in advance, evolve

slowly Tradeoff decisions will be stable

Page 17: BEHOLD, WE HAVE SIGNAL

17

Institute for Software Research

Continuous evolution and deployment

ULS systems have long lives and multiple independent developers Different groups may install capability for their

own needs; this may conflict with other groups Evolution can’t be controlled centrally, must be

shaped by rules and policies that protect critical services and allow diversity at the edges

Undermines common assumption: System improvements introduced in

“releases” Users/developers know about releases and

can choose to accept them or not

Page 18: BEHOLD, WE HAVE SIGNAL

18

Institute for Software Research

Heterogeneous, inconsistent, changing elements

ULS systems will be composed from independently-created components Heterogeneous: many sources, no single interface

standard, often incorporating legacy systems Inconsistent: evolution spontaneous, not planned;

different objectives may cause inconsistent versions Changing: hardware, software, operating

environment change based on local decisions

Undermines common assumptions: Effect of change can be predicted adequately Configuration information is accurate and

controlled Components and users are fairly homogeneous

Page 19: BEHOLD, WE HAVE SIGNAL

19

Institute for Software Research

Indistinct people/system boundary

ULS systems’ service to a user depends on actions of other users; user/developer distinction soft User actions may affect overall system health System must adapt to changing usage patterns Aggregate analysis may be better than exact

analysis

Undermines common assumption: Users’ behavior doesn’t affect overall system Collective behavior of people is not relevant Social interactions are not relevant

Page 20: BEHOLD, WE HAVE SIGNAL

20

Institute for Software Research

Normal failures

ULS system scale implies inevitable failures, so systems must do protection/recovery/enforcement Hardware failures inevitable because of scale Legitimate use of software and services outside

planned capability will cause degradation/failure Malicious use will cause problems

Undermines common assumptions: Failures will be infrequent and exceptional Defects can be removed

Page 21: BEHOLD, WE HAVE SIGNAL

21

Institute for Software Research

New forms of acquisition and policy

ULS systems will evolve, but there must be governance to prevent anarchy Success of system depends on organic evolution Individual developers won’t fully understand

core infrastructure Need effective guidance on allowed/unallowed

change

Undermines common assumption: There is a single agent responsible for

system development, operation, and evolution

Page 22: BEHOLD, WE HAVE SIGNAL

22

Institute for Software Research

Analogy: Cities and city planning

Cities are complex systems Built of individual components chosen by

individuals Constantly evolve Withstand failures and attacks

Cities are not centrally controlled Standards for infrastructures

Building codes, highway standards Policies that allow individual action within

constraints Zoning laws

Regulations that govern individual action Enforcement after the fact, rather than prior

constraint

Page 23: BEHOLD, WE HAVE SIGNAL

23

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system.

The Internet is much richer than its communication infrastructure.

Architectural styles help explain software systems. Style establishes the abstract structure of a system Character of problem should lead to style for

solution

What is an appropriate theory of architectural style for Internet applications?

Page 24: BEHOLD, WE HAVE SIGNAL

24

Institute for Software Research

Architectural style in software

Styles are rooted in common knowledge, intuition, informal prose, box-and-line diagrams

A style is a set of design rules that identify which kinds of components make up a system which kinds of connectors compose the components how control is shared among components how data is transmitted through the system what type of reasoning is compatible with the style

Focus on abstractions used by designers In practice, implemented as processes and calls … … but that loses the designer’s intent

M Shaw and P. Clements, A Field Guide to Boxology… COMPSAC 1997

Page 25: BEHOLD, WE HAVE SIGNAL

25

Institute for Software Research

Examples of styles

Data flow styles Batch sequential, dataflow network, closed-loop

control Call-and-return styles

Main program/subroutines; object-oriented systems Interacting process styles

Communicating processes, event subsystems Data-centered repository styles

Transactional database, blackboard Data-sharing styles

Compound document, hypertext, Fortran common Hierarchical styles

Layers

Page 26: BEHOLD, WE HAVE SIGNAL

26

Institute for Software Research

Members of Interacting Process Family

All members of the family are dominated by message-passing protocols among independent, usually concurrent processes with sporadic low-volume traffic.

===Control=== ====Data====Topo- Synch- Topo-

Substyle logy roniciy logy Mode

Comm proc arb not seq arb any1way data flow linear asynch linear passedClient-server star synch star passedHeartbeat hier ls/par hier/star pass,shareBroadcast arb asynch star bdcastToken passing arb asynch arb passed

G. Andrews, Paradigms for Process Interaction in Distributed Programs. ACM Comp Surv 1991

Page 27: BEHOLD, WE HAVE SIGNAL

27

Institute for Software Research

Language of Styles: Constituent Parts

Component: unit of software that performs some function at runtime process stand-alone

programprocedure transducermanager memory

Connector: mechanism that mediates communication, coordination, cooperation among components procedure call data stream

shared representation direct accessmessage protocol implicit invocation

Page 28: BEHOLD, WE HAVE SIGNAL

28

Institute for Software Research

Language of Styles: Control Issues

Topology: Geometric form of control flowlinear acyclichierarchical arbitrary

Synchronicity: dependency among components synchronous asynchronous

opportunistic direct access Binding time: when are control relations set

upmessage protocol compile timeinitialization time while running

Page 29: BEHOLD, WE HAVE SIGNAL

29

Institute for Software Research

Language of Styles: Data Issues

Topology: Geometric form of data flow linear acyclic

hierarchical arbitrary Continuity: how continuous is flow

continuous sporadic (discrete times)

Mode: how data is made availableexplicit passing sharingbroadcast multicast

Binding time: when are control relations set upmessage protocol compile timeinitialization time while running

Page 30: BEHOLD, WE HAVE SIGNAL

30

Institute for Software Research

Language of Styles: Reasoning

Data flow styles Functional composition

Call-and-return styles Hierarchy (local reasoning)

Interacting process styles Nondeterminism

Data-centered repository styles Data integrity (ACID, convergence, invariants)

Data-sharing styles Representation

Hierarchical styles Levels of service

Page 31: BEHOLD, WE HAVE SIGNAL

31

Institute for Software Research

Utility of Arch Description Languages

Establish uniform descriptive notation Clarify informal concepts Document design intent

Discriminate among different styles Bring out significant differences that affect suitability

for various tasks Provide advice on selecting style for a problem

Following Jackson, characterize problem domain and use this to select appropriate abstract architecture

Separate concerns about fit of architecture to problem from implementation and performance issues

Allow comparison of alternatives, potentially simulation and analysis

Page 32: BEHOLD, WE HAVE SIGNAL

32

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system.

The Internet is much richer than its communication infrastructure.

Architectural styles help explain software systems.

What is an appropriate theory of architectural style for Internet applications? Style establishes the abstract structure of a system Character of problem should lead to style for solution

Page 33: BEHOLD, WE HAVE SIGNAL

33

Institute for Software Research

Picking a target in cyberspace

Cyberspace supports many sophisticated applications, interactions, and communities, including everything from IM/email to enterprise integration.

Focus here on the web as used by everyday people This is the most different from conventional

software systems We need to start with a specific focus End users are in dire need of support

Page 34: BEHOLD, WE HAVE SIGNAL

34

Institute for Software Research

Example: Twitter Vote Report

On US Election Day 2008, this application, set up by an interested user community, collected reports on status of polling places and displayed it on a map in near real time so that interested people, especially the public, could spot problem areas

Page 35: BEHOLD, WE HAVE SIGNAL

35

Institute for Software Research

VoteReport input stream

Data was collected by twitter, SMS, and telephone call.

A stylized format was provided (and often ignored)

The message feed was available

Page 36: BEHOLD, WE HAVE SIGNAL

36

Institute for Software Research

VoteReport composition

The VoteReport application interpreted the Twitter/SMS feed, geo-referenced the data, and mapped them, presumably through the efforts of professional programmers.

The report could be embedded in a web site by anyone

The input stream was also available

Page 37: BEHOLD, WE HAVE SIGNAL

37

Institute for Software Research

Mashups

“Mashup”s combine functionality of multiple websites or augment existing websites; they show how users build on existing systems in new and innovative ways.

Wong did a qualitative survey of high-quality mashups to see how they used/improved existing web sites, how they combined data from multiple sites, and what kinds of user tasks they support

Overall impression: mashups are ad hoc and idiosyncratic

J. Won and J Hong. What do we “mashup” when we make mashus? WEuSE IV 08

Page 38: BEHOLD, WE HAVE SIGNAL

38

Institute for Software Research

Capabilities of Mashups

Mashups provided a variety of capabilities, often more than one per mashup Search: Does it have a search interface? Visualization: Does it add visualization to the

date? Real-time: Is the purpose to allow user to

monitor original websie as realtime data set? Widget: Is it actually a widget or plugin? Personalized: Does it use personal information or

enable personalization? Folksonomy: Does it use or add tagging? In-situ use: Does it simply tailor a website for a

specific situation or use?

Page 39: BEHOLD, WE HAVE SIGNAL

39

Institute for Software Research

Categories of mashups

Aggregation: aggregate multiple websites or summarize sets of data

Alternate UI & In-situ use: provide new ways t interact with a website or support specific use cases

Personalization: specialize a website based on personal data from that site or another source

Focused View of Data: index or categorize contents of another website

Real-time Monitoring: react to changes in a website and make user aware of them

Page 40: BEHOLD, WE HAVE SIGNAL

40

Institute for Software Research

Implementations of mashups

Inputs RSS feeds, search engine feeds, personal names,

user ids, settings, ranges, geographical points and directions, search terms, tracking numbers, ISBNs, headlines, tags, “clippings” (screen scraping?)

Outputs RSS feeds, list of links, visualizations, text, maps,

pictures, numeric tables, lists, reports APIs

Delicious, yelp, craiglist, flickr, city information feeds, eBay, Google Maps, Google Earth, USGS DRM, amazon, UPS, USPS, FedEx, iTunes, Napster, news sites, wikipedia daylife, Rhapsody, search engines, Twitter, YouTube, MySpace

Page 41: BEHOLD, WE HAVE SIGNAL

41

Institute for Software Research

Architectural Task for Web Applications

How do web applications differ from classical software systems? Richer set of component types Much stronger role for data and its presentation More integration by inclusion or embedding More significance for initiative (push/pull) Minimal reliance on compiler for integration

How much of classic architectural style carries over? How much extension is required? Structure (component/connector/composition)

remains valid Details partially hold, but require extension

Page 42: BEHOLD, WE HAVE SIGNAL

42

Institute for Software Research

Online Activities

Between user and …

Interactive Synchronous Asynchronous

… few known people

IM, chat,online meetings

email private blog, photo sharing

… many unknown people

multiplayer games, broadcast

newsgroups, mailing lists, RSS feeds

public blog, wiki, social networks, ratings

… a server e-commerce, banking, games, remote desktop

search, news, job hunt

web browsing, music download, storage/backup

Page 43: BEHOLD, WE HAVE SIGNAL

43

Institute for Software Research

Distinguishing the Classes of Solutions

One role of an architecture for cyberspace is to explain the structural distinctions among the applications that support user activities

Sometime differences matter You wouldn’t use IM for offline backup of photo

files What attributes of IM, photo files, and the backup

task show the mismatch? Sometimes the same application can be

realized in different ways Both FTP and P2P support file sharing, but with

different resource implications What attributes show the abstract similarity, and

what attributes show the resource implications?

Page 44: BEHOLD, WE HAVE SIGNAL

44

Institute for Software Research

Elementary Component Types (provisional)

Text (human-readable, file or message) Encoded file (typed set of bits) Database (structured, with query capability) Web page (human-readable, possibly

generated) Computation or service Stream (continuous, nonpersistent) Human (yes, people are part of the system)

Attributes of interest state persistence, qualitative size, type/internal

structure, rate of change (static/dynamic)

Page 45: BEHOLD, WE HAVE SIGNAL

45

Institute for Software Research

Connector Primitives (provisional)

The lowest level of primitive is the TCP/UDP port

20, 21 FTP 23 Telnet25 SMTP 80, 443 HTTP, HTTPS109, 110 POP2, POP3 143, 220 IMAP194 IRC 666 Doom 3724 World Warcraft 4664 Google desktop5190 AOL IM 6881-6988 BitTorrent

… and a couple of thousand others Attributes of interest

activation (push/pull/continuous/interactive query-respond); arity; signature; knowledge of targets; abstract protocol; synchronicity; performance (rates, bandwidth, latency); state of interaction (persistence, location); duration of interaction

Page 46: BEHOLD, WE HAVE SIGNAL

46

Institute for Software Research

Composition of primitives

Abstractly, we can recognize Simple connections

Scripting, linking Remote call in various forms

RPC, service invocation Extended sessions

Browser session with session ID User-facing compositions

Installation of plug-in, mashups Enterprise compositions

SOA, SaaS

Page 47: BEHOLD, WE HAVE SIGNAL

47

Institute for Software Research

Example: Internet Messaging

A human creates short unstructured text units and pushes them via IRC to another known person. They are delivered with low latency, but they are not persistent.

Page 48: BEHOLD, WE HAVE SIGNAL

48

Institute for Software Research

Example: Grapevine model of email

An email system has five components: a composer, a reader, a database for archiving messages, a directory system, and a transport mechanism. A message is a structured text entity. It has formatted

headers, including an address text string, and may have internal formatting and/or attachments.

The composer is an interactive editor (computation) that produces a mail message

The reader can display the message, and it is integrated with the database for archiving messages, which it can also classify and retrieve.

The directory system translates a symbolic address to a destination.

The transport mechanism moves mail to is destination.

Grapevine paper, CACM, sometime in the 70s

Page 49: BEHOLD, WE HAVE SIGNAL

49

Institute for Software Research

Example: VoteReport

Here is a plausible architectural description of the VoteReport application The input is a stream of messages, merged from

twitter, SMS, and transcriptions of phone calls. The input is supposed to be structured but often is not

A parser receives the input stream and transduces the messages (to the extent possible) to georeferenced reports

A report has a text, a timestamp, and an icon Recent reports are displayed on a zoomable map

embedded in a web page Filters are provided, and summaries are displayed The map and input stream are elements that can

be incorporated in other web pages.

Page 50: BEHOLD, WE HAVE SIGNAL

50

Institute for Software Research

The Internet infrastructure supports a creative and thriving social and economic system. Some of the wires in the network have myelin

sheaths The Internet is much richer than its

communication infrastructure. The problems are wicked, not just technical

Architectural styles help explain software systems. Software architecture provides a model for a

descriptive theory of software in cyberspace What is an appropriate theory of architectural

style for Internet applications? Can we help end users at the ends of the network?

Page 51: BEHOLD, WE HAVE SIGNAL

51

Institute for Software Research

BEHOLD, WE HAVE SIGNAL