CS 290C: Formal Models for Web Software Lecture 10: Language Based Modeling and Analysis of Navigation Errors Instructor: Tevfik Bultan
Date posted: 19-Dec-2015

CS 290C: Formal Models for Web Software

Lecture 10: Language Based Modeling and Analysis of Navigation Errors

Instructor: Tevfik Bultan

Web Application Modeling

• So far we have discussed various approaches for modeling, analyzing and verifying web applications

• We have seen two main approaches:

– Model driven development approaches where the application is specified or enhanced using a formal model

  • For example: WebML, navigation state machines

– Reverse engineering approaches where a formal model is extracted from the application

  • For example: Extracting a state machine model for navigation by analyzing the links that are inserted in web pages

Model Driven Development Approach

• The model driven development approach enables:

– Specification of the behavior of the application at a high level of abstraction, making it easier to develop applications.

– The actual implementation can be automatically or semi-automatically generated from the high level models

– Separation of concerns can be achieved by specifying different concerns about the application (such as the data model or the navigation constraints) using different specification mechanisms

• However, model driven development requires the developers to learn and use the modeling languages

• There is a concern about the mapping between the actual implementation and the model (they have to be maintained together)

Reverse Engineering Approach

• Reverse engineering approaches do not require developers to learn a new specification language

• Since reverse engineering approaches extract a model directly from the code, there are no maintenance issues (when the application changes, we can extract a new model)

• However, reverse engineering is hard:

– Extracting sound models using static analysis can yield very approximate models that do not contain much information, while extracting more precise models can be undecidable

– Extracting models by observing runtime behavior is not sound and cannot be used to guarantee correctness

How About a Language Based Approach?

• Both model driven development and reverse engineering approaches can be considered software engineering approaches

• Another approach would be to use a programming language based approach

• Can we model the problems that appear in Web applications in programming language terms and possibly suggest solutions using programming language mechanisms (such as type checking)?

A Language Based Approach

• Today I will discuss the following paper which presents a language based approach for modeling and analyzing navigation problems in Web applications:

“Modeling Web Interactions and Errors,” S. Krishnamurthi, R. B. Findler, P. Graunke, and M. Felleisen.

Web Applications

• A Web program’s execution consists of a series of interactions between a Web browser and a Web server

• When a browser submits an HTTP request whose URL points to a Web program, the server invokes the program with the request using some protocol

– CGI, Java servlets, ASP.NET

• The server then waits for the program to terminate and turns the program’s output into a response that the browser can display, i.e., it returns a Web page.

• Each such program is called a “script” since it only reads some inputs and writes some output

Web Applications

• This simple request-response style programming using scripts makes design of multi-stage Web interactions difficult

• A multi-stage interactive Web program consists of many scripts, each handling one request

– These scripts communicate with each other via external media since they must remember the earlier part of the interaction

– Forcing scripts to communicate this way causes problems since it leads to unstated and easily violated invariants

Web Applications

• Use of the Web browser creates further complications

– A browser is designed to let a user navigate a web of hyperlinked nodes

– When a user uses this power to navigate an interaction with an application, many unexpected scenarios can happen

  • User can backtrack to an earlier stage of the interaction

  • User can duplicate a page and generate parallel interactions

A Language Based Approach

• We will first describe a formal model that captures the essence of Web application behavior

• Then we will investigate the use of language based techniques to address the navigation problems

A Formal Model

• A Web application (W) consists of

– a server (S) and

– a client (C)

• Server consists of

– a storage, and

– a dispatcher

• Dispatcher contains

– a table (P) of programs that associates URLs with programs and

– an evaluator that applies programs from the table to the submitted form

A Formal Model

• Every page is simply a form (F) that contains

– the URL to which the form is submitted, and

– a set of form fields

• A field names a value that can be edited by the client

• The client stores

– the current form and

– the sequence of all the forms that have been visited by the client so far (cached pages)

Web Program Behavior

• The behavior of the Web program is described using three types of actions:

– Fill-form: This corresponds to the client editing values of fields in the current form. The modified form becomes the current form and is added to the cache

– Switch: Makes a form from the cache the current form

– Submit: Dispatches on the current form’s URL to find a program in the table P. This program accesses the server state and the current form, updates the server state, and generates a new form, which becomes the current form
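The three actions can be sketched as operations on a small client/server state. This is only an illustrative Python model, not the formal semantics of the paper: the names `Form`, `Client`, `Server`, `fill_form`, `switch`, `submit` and the `greet` script are all hypothetical, and adding the server's response to the cache is an assumption made so that the user can later switch back to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Form:
    url: str                                  # URL the form is submitted to
    fields: Tuple[Tuple[str, str], ...] = ()  # (field name, value) pairs

    def get(self, name: str) -> str:
        return dict(self.fields)[name]

    def with_fields(self, **edits: str) -> "Form":
        merged = dict(self.fields)
        merged.update(edits)
        return Form(self.url, tuple(sorted(merged.items())))


Storage = Dict[str, str]
Program = Callable[[Storage, Form], Form]


@dataclass
class Client:
    current: Form
    cache: List[Form] = field(default_factory=list)


@dataclass
class Server:
    storage: Storage
    table: Dict[str, Program]  # the table P: URL -> program


def fill_form(client: Client, **edits: str) -> None:
    # The edited form becomes the current form and is added to the cache.
    client.current = client.current.with_fields(**edits)
    client.cache.append(client.current)


def switch(client: Client, index: int) -> None:
    # Any cached form may become the current form again (back button, cloned window).
    client.current = client.cache[index]


def submit(server: Server, client: Client) -> None:
    # Dispatch on the current form's URL, run the program, show its output.
    program = server.table[client.current.url]
    client.current = program(server.storage, client.current)
    client.cache.append(client.current)  # assumption: responses are cached too


def greet(storage: Storage, form: Form) -> Form:
    storage["last"] = form.get("name")
    return Form("/done", (("message", "hello " + form.get("name")),))


server = Server(storage={}, table={"/greet": greet})
client = Client(current=Form("/greet", (("name", ""),)))
client.cache.append(client.current)

fill_form(client, name="ada")
submit(server, client)
message = client.current.get("message")
switch(client, 1)  # backtrack to the filled-in form
```

Because `switch` can select any cached form, the server cannot assume that the form it receives is the one it produced most recently; this is the source of the navigation problems discussed next.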

A Simple Web Programming Language

• A simple functional programming language can be specified to characterize the basic operations that are required to write a web application:

– Extract a field from a form

– Construct a new form

– Modify fields of a form

• To allow stateful programming we can introduce read and write operations that allow read and write access to the server storage
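As a rough illustration, the five operations can be mimicked with a handful of Python helpers. The helper names (`extract`, `construct`, `modify`, `read`, `write`), the dictionary representation of forms, and the `choose_flight` script are assumptions made for this sketch; they are not the syntax of the language defined in the paper.

```python
from typing import Dict

Form = Dict[str, str]          # field name -> value; the key "url" holds the target URL (assumption)
storage: Dict[str, str] = {}   # server-side storage shared by all scripts


def extract(form: Form, name: str) -> str:
    """Extract a field from a form."""
    return form[name]


def construct(url: str, **fields: str) -> Form:
    """Construct a new form that will be submitted to `url`."""
    return {"url": url, **fields}


def modify(form: Form, **fields: str) -> Form:
    """Return a copy of `form` with some fields changed or added."""
    return {**form, **fields}


def read(location: str) -> str:
    """Read a location in the server storage."""
    return storage[location]


def write(location: str, value: str) -> None:
    """Write a location in the server storage."""
    storage[location] = value


def choose_flight(form: Form) -> Form:
    # A script built only from the operations above.
    write("selected", extract(form, "flight"))
    page = construct("/confirm", flight=read("selected"))
    return modify(page, status="pending")


page = choose_flight({"url": "/choose", "flight": "UA12"})
```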

Navigation Problems

• Two navigation problems can be characterized formally in this model:

– Script communication problem: A script accepts a different type of form than what is delivered to it. For example, the script tries to access a field that does not exist in the form

– HTTP observer problem: Since HTTP does not allow a proper implementation of the observer pattern (which enables independent observers to be notified of state changes), a page received by the client can become outdated when the model (in the MVC sense) changes on the server.
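Both problems can be reproduced in a few lines. The sketch below is a hypothetical flight-booking example in Python: the scripts `view`, `book` and `confirm`, the storage location `selected`, and the field names are invented for illustration. The first part shows a stale page leading to the wrong booking; the second shows a script receiving a form without the field it expects.

```python
storage = {}  # server-side state shared by the scripts


def view(form):
    # Remembers the flight being viewed in server state and returns a page for it.
    storage["selected"] = form["flight"]
    return {"url": "/book", "shown": form["flight"]}


def book(form):
    # Books whatever flight the server state currently points at.
    return {"url": "/done", "booked": storage["selected"]}


def confirm(form):
    # Expects a "seat" field that the pages produced by `view` never contain.
    return {"url": "/done", "seat": form["seat"]}


# HTTP observer problem: the user views A, views B in a second window,
# then switches back to the page showing A and submits it.
page_a = view({"url": "/view", "flight": "A"})
page_b = view({"url": "/view", "flight": "B"})
result = book(page_a)  # page shows A, but B is booked

# Script communication problem: the form delivered to `confirm` has the wrong shape.
try:
    confirm(page_a)
    missing = None
except KeyError as err:
    missing = err.args[0]
```

The page the user submitted still shows flight A, yet `result["booked"]` is flight B, because nothing notified the old page that the server state had changed.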

Script Communication Problem and Types

• The main issue in the script communication problem is a type mismatch between the forms generated and consumed by different scripts

• Since these scripts are loosely coupled programs, there is no standard type checking mechanism that can be used to make sure that these type mismatches do not happen

• Checking all scripts together is not feasible since they are developed incrementally and may reside on different Web servers and may be written using different programming languages

An Incremental Type System for Web Applications

• The proposed solution is the following:

– When the Web server receives a request for a URL that is not already in its table, it installs the relevant program

– Before installing the relevant program, it checks that there is no type mismatch between the input form and the installed program (internal consistency check)

– Furthermore, it generates type constraints that this newly installed program imposes on other programs in the server that it interacts with (these become external consistency checks)

• If either the internal or external check fails, the program is rejected, resulting in an error
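A minimal sketch of install-time checking, under strong simplifying assumptions: form types are plain dictionaries from field names to type names, two types match only if they are equal, and each program declares up front which form types it sends to which URLs. The classes `Program` and `Server`, the method `install`, and the exception `TypeMismatch` are hypothetical; the type system in the paper derives constraints from the program text rather than from declarations like these.

```python
from dataclasses import dataclass, field
from typing import Dict

FormType = Dict[str, str]  # field name -> type name, e.g. {"flight": "str"}


class TypeMismatch(Exception):
    pass


@dataclass
class Program:
    consumes: FormType                                        # declared type of the input form
    sends: Dict[str, FormType] = field(default_factory=dict)  # URL -> type of forms sent there


@dataclass
class Server:
    table: Dict[str, Program] = field(default_factory=dict)
    constraints: Dict[str, FormType] = field(default_factory=dict)  # URL -> required input type

    def install(self, url: str, program: Program, incoming: FormType) -> None:
        # Internal check: the submitted form must match the program's declared input.
        if incoming != program.consumes:
            raise TypeMismatch(f"{url}: incoming form does not match declared input")
        # Constraints that already installed programs imposed on this URL.
        required = self.constraints.get(url)
        if required is not None and required != program.consumes:
            raise TypeMismatch(f"{url}: violates a constraint from an installed program")
        # External checks: constraints this program imposes on other URLs.
        for target, form_type in program.sends.items():
            consumer = program if target == url else self.table.get(target)
            if consumer is not None and consumer.consumes != form_type:
                raise TypeMismatch(f"{url}: sends an ill-typed form to {target}")
            pending = self.constraints.get(target)
            if pending is not None and pending != form_type:
                raise TypeMismatch(f"{url}: conflicting constraint on {target}")
        # Only commit once every check has passed.
        self.constraints.update(program.sends)
        self.table[url] = program


server = Server()
flight = {"flight": "str"}
server.install("/view", Program(consumes=flight, sends={"/book": flight}), incoming=flight)

try:
    # This version of /book expects a "seat" field, contradicting what /view sends.
    server.install("/book", Program(consumes={"seat": "str"}), incoming={"seat": "str"})
    rejected = False
except TypeMismatch:
    rejected = True

server.install("/book", Program(consumes=flight), incoming=flight)
```

Programs are checked one at a time as they are installed, so the server never needs all scripts to be available together.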

A Simple Typed Web Programming Language

• The simple functional Web programming language can be extended with types by requiring type declarations for function arguments

• The type system for this language shows how external type checking can be done

– While traversing the program, the type system generates a set of type constraints on external programs

– Each constraint states a condition such as: a program associated with a particular URL should consume Web forms of a particular type

Solving Script Communication Problem with Type Checking

• Using type checking with this incremental system, it can be guaranteed that

– Scripts do not get stuck when they are processing appropriately typed forms

– The server does not apply the scripts to forms with wrong types


Solving the HTTP Observer Problem with Timestamps

• The server keeps track of the number of processed submissions (this represents time)

• The external storage is changed so that it maps each location to a value plus the timestamp of the last write

• The server also maintains the set of all storage locations read or written during the execution of a script (called a carrier set CS)

– When the server sends a page to the client, it adds the current timestamp and this set of locations as an extra hidden field

Solving the HTTP Observer Problem with Timestamps

• A form with carrier set CS and time stamp T submitted to a server is out of date if and only if any of the locations in CS have a timestamp at the server that is greater than T

• A runtime error can be generated when out of date forms are submitted, preventing execution of scripts with out of date data

– This approach solves the example problem of booking an unintended flight

• However, this approach can also generate false positives (for example, a page counter value may make the form out of date)

– So the programmers must specify which reads or writes are relevant, and an error is generated only when a relevant field is out of date
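The timestamp scheme can be sketched as follows. This is an illustrative Python model rather than the construction from the paper: the class `Server`, the exception `OutOfDateForm`, the hidden field names `_time` and `_carrier`, the `relevant` set, and the flight scripts are all assumptions. The clock counts processed submissions, every write is stamped with the current clock value, and a submitted form is rejected when a relevant location in its carrier set has been written after the form was produced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple


class OutOfDateForm(Exception):
    pass


@dataclass
class Server:
    relevant: Set[str] = field(default_factory=set)  # locations marked relevant by the programmer
    clock: int = 0                                   # number of processed submissions
    storage: Dict[str, Tuple[Any, int]] = field(default_factory=dict)  # loc -> (value, last write)
    carrier: Set[str] = field(default_factory=set)   # locations touched by the running script

    def read(self, loc, default=None):
        self.carrier.add(loc)
        return self.storage.get(loc, (default, 0))[0]

    def write(self, loc, value):
        self.carrier.add(loc)
        self.storage[loc] = (value, self.clock)

    def submit(self, script, form):
        stamp = form.get("_time")
        if stamp is not None:
            # Out of date: a relevant carried location was written after the page was sent.
            stale = sorted(loc for loc in form["_carrier"]
                           if loc in self.relevant
                           and self.storage.get(loc, (None, 0))[1] > stamp)
            if stale:
                raise OutOfDateForm(stale)
        self.clock += 1
        self.carrier = set()
        page = script(self, form)
        # Hidden fields: current timestamp and the carrier set of this execution.
        return {**page, "_time": self.clock, "_carrier": frozenset(self.carrier)}


def view(server, form):
    server.write("hits", server.read("hits", 0) + 1)  # page counter, not relevant
    server.write("selected", form["flight"])
    return {"shown": form["flight"]}


def ping(server, form):
    server.write("hits", server.read("hits", 0) + 1)
    return {}


def book(server, form):
    return {"booked": server.read("selected")}


server = Server(relevant={"selected"})
page_a = server.submit(view, {"flight": "A"})
page_b = server.submit(view, {"flight": "B"})
server.submit(ping, {})  # bumps only the page counter

try:
    server.submit(book, page_a)  # stale: "selected" changed after page_a was sent
    rejected = False
except OutOfDateForm:
    rejected = True

result = server.submit(book, page_b)  # accepted although "hits" changed
```

If `hits` were added to `relevant`, the second booking would be rejected as well, which is the false positive that the relevance annotations are meant to avoid.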