SEG 2106 SOFTWARE CONSTRUCTION
INSTRUCTOR: HUSSEIN AL OSMAN
THE COURSE MATERIAL IS BASED ON THE COURSE CONSTRUCTED BY PROFS:
GREGOR V. BOCHMANN (HTTPS://WWW.SITE.UOTTAWA.CA/~BOCHMANN/)
JIYING ZHAO (HTTP://WWW.SITE.UOTTAWA.CA/~JYZHAO/)
Slide 2
COURSE SECTIONS
Section 0: Introduction
Section 1: Software development processes + Domain Analysis
Section 2: Requirements + Behavioral Modeling (Activity Diagrams)
Section 3: More Behavioral Modeling (State Machines)
Section 4: More Behavioral Modeling (Case Study)
Section 5: Petri Nets
Section 6: Introduction to Compilers
Section 7: Lexical Analysis
Section 8: Finite State Automata
Section 9: Practical Regular Expressions
Section 10: Introduction to Syntax Analysis
Section 11: LL(1) Parser
Section 12: More on LL Parsing (Error Recovery and non-LL(1) Parsers)
Section 13: LR Parsing
Section 14: Introduction to Concurrency
Section 15: More on Concurrency
Section 16: Java Concurrency
Section 17: Process Scheduling
Section 18: Web Services
EVALUATION SCHEME
Assignments (4): 25%
Labs (7): 15%
Midterm Exam: 20%
Final Exam: 40%
Late assignments are accepted for a maximum of 24 hours and they will receive a 30% penalty.
Slide 7
LABS Seven labs in total Three formal labs (with a report)
Worth between 3% and 4% each The other labs are informal (without a
report) 1% for each one You show your work to the TA at the end of
the session
Slide 8
INFORMAL LABS
Your mark will be proportional to the number of tasks successfully completed:
All the tasks are completed: 1%
More than half completed: 0.75%
Almost half completed: 0.5%
You at least tried (given that you attended the whole session): 0.25%
Slide 9
MAJOR COURSE TOPICS Chapter 1: Introduction and Behavioral
Modeling Introduction to software development processes Waterfall
model Iterative (or incremental) model Agile model Behavioral
modeling UML Use case models (seen previously) UML Sequence
diagrams (seen previously) UML activity diagrams (very useful to
model concurrent behavior) UML state machines (model the behavior
of a single object) Petri Nets SDL
Slide 10
MAJOR COURSE TOPICS Chapter 2: Compilers, formal languages and
grammars Lexical analysis (convert a sequence of characters into a
sequence of tokens) Formal languages Regular expressions (method to
describe strings) Deterministic and Non-deterministic Finite
Automata Syntax analysis Context-free grammar (describes the syntax
of a programming language) Syntactic analysis Syntax trees
Slide 11
MAJOR COURSE TOPICS Chapter 3: Concurrency Logical and physical
concurrency Process scheduling Mutual exclusion for access to
shared resources Concurrency and Java programming Design patterns
and performance considerations
Slide 12
MAJOR COURSE TOPICS Chapter 4: Cool topics! We will vote on one
or more of these topics to cover (given that we have completed the
above described material, with some time to spare) Mobile
programming (mostly Android) Web services J2EE major components
Spring framework Agile programming (especially SCRUM) Other
suggestions
Slide 13
CLASS POLICIES Late Assignments Late assignments are accepted
for a maximum of 24 hours and they will receive a 30% penalty.
Slide 14
CLASS POLICIES Plagiarism Plagiarism is a serious academic
offence that will not be tolerated. Note that the person providing
solutions to be copied is also committing an offence as they are an
active participant in the plagiarism. The person copying and the
person copied from will be reprimanded equally according to the
regulations set by the University of Ottawa. Please refer to this
link for more information:
www.uottawa.ca/academic/info/regist/crs/0305/home_5_ENG.htm
Slide 15
CLASS POLICIES Attendance Class attendance is mandatory. As per
academic regulations, students who do not attend 80% of the class
will not be allowed to write the final examinations. All components
of the course (i.e., laboratory reports, assignments, etc.) must be
fulfilled; otherwise students may receive an INC as a final mark
(equivalent to an F). Absence from a laboratory session or an
examination because of illness will be excused only if you provide
a certificate from Health Services (100 Marie Curie, 3rd Floor)
within the week following your absence.
Slide 16
SECTION 1 SOFTWARE DEVELOPMENT PROCESS AND DOMAIN ANALYSIS
Slide 17
LECTURE TOPICS This lecture will briefly touch on the following
topics: Software Development Process Domain Analysis
Slide 18
TOPIC 1 SOFTWARE DEVELOPMENT PROCESS
Slide 19
LIFE CYCLE The life cycle of a software product from inception
of an idea for a product through domain analysis requirements
gathering architecture design and specification coding and testing
delivery and deployment maintenance and evolution retirement
Slide 20
MODELS ARE NEEDED Symptoms of inadequacy: the software crisis
scheduled time and cost exceeded user expectations not met poor
quality The size and economic value of software applications
required appropriate "process models"
Slide 21
PROCESS AS A "BLACK BOX"
Slide 22
PROBLEMS The assumption is that requirements can be fully
understood prior to development Unfortunately the assumption almost
never holds Interaction with the customer occurs only at the
beginning (requirements) and end (after delivery)
Slide 23
PROCESS AS A "WHITE BOX"
Slide 24
ADVANTAGES Reduce risks by improving visibility Allow project
changes as the project progresses based on feedback from the
customer
Slide 25
THE MAIN ACTIVITIES They must be performed independently of the
model The model simply affects the flow among activities
Slide 26
WATERFALL MODELS Invented in the late 1950s for large air
defense systems, popularized in the 1970s They organize activities
in a sequential flow Standardize the outputs of the various
activities (deliverables) Exist in many variants, all sharing
sequential flow style
Slide 27
A WATERFALL MODEL Domain analysis and feasibility study
Requirements Design Coding and module testing Integration and
system testing Delivery, deployment, and maintenance
Slide 28
WATERFALL STRENGTHS Easy to understand, easy to use Provides
structure to inexperienced staff Milestones are well understood
Sets requirements stability
Slide 29
WATERFALL WEAKNESSES All requirements must be known upfront
Deliverables created for each phase are considered frozen, which inhibits
flexibility Can give a false impression of progress Does not
reflect the problem-solving nature of software development (iterations
of phases) Integration is one big bang at the end Little opportunity
for customer to preview the system (until it may be too late)
Slide 30
WHEN TO USE WATERFALL Requirements are very well known Product
definition is stable Technology is very well understood New version
of an existing product (maybe!) Porting an existing product to a
new platform High risk for new systems because of specification and
design problems. Low risk for well-understood developments using
familiar technology.
Slide 31
WATERFALL WITH FEEDBACK Domain analysis and feasibility study
Requirements Design Coding and module testing Integration and
system testing Delivery, deployment, and maintenance
Slide 32
ITERATIVE DEVELOPMENT PROCESS Also referred to as incremental
development process Develop the system through repeated cycles
(iterations) Each cycle is responsible for the development of a
small portion of the solution (slice of functionality) Contrast
with waterfall: Waterfall is a special iterative process with only
one cycle
Slide 33
ITERATIVE DEVELOPMENT PROCESS [Diagram labels: Domain Analysis and
Initial Planning; Cycle of Iteration Planning, Requirements Update,
Architecture and Design, Implementation, Testing, Evaluation
(involving end user); Deployment]
Slide 34
AGILE METHODS Dissatisfaction with the overheads involved in
software design methods of the 1980s and 1990s led to the creation
of agile methods. These methods: Focus on the code rather than the
design Are based on an iterative approach to software development
Are intended to deliver working software quickly and evolve this
quickly to meet changing requirements The aim of agile methods is
to reduce overheads in the software process (e.g. by limiting
documentation) and to be able to respond quickly to changing
requirements without excessive rework.
Slide 35
AGILE MANIFESTO We are uncovering better ways of developing
software by doing it and helping others do it. Through this work we
have come to value: Individuals and interactions over processes and
tools Working software over comprehensive documentation Customer
collaboration over contract negotiation Responding to change over
following a plan That is, while there is value in the items on the
right, we value the items on the left more.
Slide 36
THE PRINCIPLES OF AGILE METHODS
Principle: Description
Customer involvement: Customers should be closely involved throughout the development process. Their role is to provide and prioritize new system requirements and to evaluate the iterations of the system.
Incremental delivery: The software is developed in increments with the customer specifying the requirements to be included in each increment.
People not process: The skills of the development team should be recognized and exploited. Team members should be left to develop their own ways of working without prescriptive processes.
Embrace change: Expect the system requirements to change and so design the system to accommodate these changes.
Maintain simplicity: Focus on simplicity in both the software being developed and in the development process. Wherever possible, actively work to eliminate complexity from the system.
Slide 37
SCRUM PROCESS
Slide 38
PROBLEMS WITH AGILE METHODS It can be difficult to keep the
interest of customers who are involved in the process Team members
may be unsuited to the intense involvement that characterizes agile
methods Prioritizing changes can be difficult where there are
multiple stakeholders Minimizing documentation: almost nothing is
captured, the code is the only authority
Slide 39
TOPIC 2 DOMAIN ANALYSIS
Slide 40
DOMAIN MODELING The aim of domain analysis is to understand the
problem domain independently of the particular system we intend to
develop. We do not try to draw the borderline between the system
and the environment. We focus on the concepts and the terminology
of the application domain with a wider scope than the future
system.
Slide 41
ACTIVITIES AND RESULTS OF DOMAIN ANALYSIS
1. A dictionary of terms defining the common terminology and concepts of the problem domain
2. Description of the problem domain from a conceptual modeling viewpoint
  We normally use UML class diagrams (with as little detail as possible)
  Remember, we are not designing, but just establishing the relationship between entities
3. Briefly describe the main interactions between the user and the system
Slide 42
EXAMPLE PROBLEM DEFINITION
Slide 43
We want to design the software for a simple Point of Sale
Terminal that operates as follows: Displays the amount of money to
pay for the goods to be purchased Asks the user to insert a
financial card (debit or credit) If the user inserts a debit card,
he or she is asked to choose the account type Asks the user to
enter a pin number Verifies the pin number against the one stored
on the chip Contacts the bank associated with the card in order to
perform the transaction
Slide 44
EXAMPLE DICTIONARY OF TERMS (1) Point of Sale Terminal: machine
that allows a retail transaction to be completed using a financial
card Credit card: payment card issued to users as a system of
payment. It allows the cardholder to pay for goods and services
based on the holder's promise to pay for them Debit card: plastic
payment card that provides the cardholder electronic access to his
or her bank account
Slide 45
EXAMPLE DICTIONARY OF TERMS (2) Bank: financial institution
that issues financial cards and where the user has at least one
account from which he or she can withdraw money or into which he or
she can deposit money Bank Account: a financial account between a
user and a financial institution User: client of a bank who
possesses a debit card and
benefits from the use of a point of sale terminal Pin number:
personal identification number (PIN, pronounced "pin"; often
erroneously PIN number) is a secret numeric password shared between
a user and a system that can be used to authenticate the user to
the system
Slide 46
EXAMPLE PROBLEM DOMAIN [Class diagram: PinNumber, FinancialCard,
DebitCard, CreditCard, Bank, BankAccount, User, PosTerminal;
association multiplicities shown include *, 1, 1..2 and 1..*]
Slide 47
EXAMPLE MAIN INTERACTIONS Inputs to POS Terminal: Insertion of
Financial Card, Pin Number, Specify Account, Confirm purchase
Outputs from POS Terminal: Error Message (regarding pin or funds),
Confirmation of Purchase
Slide 48
SECTION 2 REQUIREMENTS + BEHAVIORAL MODELING
Slide 49
TOPICS Review of some notions regarding requirements Client
requirements Functional requirements Non-functional requirements
Introduction to Behavioral Modeling Activity Diagrams
Slide 50
TOPIC 1 REQUIREMENTS
Slide 51
We will describe three types of requirements: Customer
requirements (a.k.a informal or business requirements) Functional
requirements Non-functional requirements
Slide 52
CUSTOMER REQUIREMENTS We have completed the domain analysis, so we
are ready to get our hands dirty We need to figure out exactly what
the customer wants: Customer Requirements This is where the
expectations of the customer are captured Composed typically of
high level, non-technical statements Example Requirement 1: We need
to develop an online customer portal Requirement 2: The portal must
list all our products
Slide 53
FUNCTIONAL REQUIREMENTS Capture the intended behavior of the
system May be expressed as services, tasks or functions the system
performs Use cases have quickly become a widespread practice for
capturing functional requirements This is especially true in the
object-oriented community where they originated Their applicability
is not limited to object-oriented systems
Slide 54
USE CASES A use case defines a goal-oriented set of
interactions between external actors and the system under
consideration Actors are parties outside the system that interact
with the system An actor may be a class of users or other systems A
use case is initiated by a user with a particular goal in mind, and
completes successfully when that goal is satisfied It describes the
sequence of interactions between actors and the system necessary to
deliver the service that satisfies the goal
Slide 55
USE CASE DIAGRAMS
Slide 56
Include relationship: use case fragment that is duplicated in
multiple use cases Extend relationship: use case conditionally adds
steps to another first class use case Example:
Slide 57
USE CASE ATM EXAMPLE Actors: ATM Customer ATM Operator Use
Cases: The customer can withdraw funds from a checking or savings
account query the balance of the account transfer funds from one
account to another The ATM operator can Shut down the ATM Replenish
the ATM cash dispenser Start the ATM
Slide 58
USE CASE ATM EXAMPLE
Slide 59
Validate PIN is an Inclusion Use Case It cannot be executed on
its own Must be executed as part of a Concrete Use Case On the
other hand, a Concrete Use Case can be executed on its own
Slide 60
USE CASE VALIDATE PIN (1) Use case name: Validate PIN Summary:
System validates customer PIN Actor: ATM Customer Precondition: ATM
is idle, displaying a Welcome message.
Slide 61
USE CASE VALIDATE PIN (2)
Main sequence:
1. Customer inserts the ATM card into the card reader.
2. If system recognizes the card, it reads the card number.
3. System prompts customer for PIN.
4. Customer enters PIN.
5. System checks the card's expiration date and whether the card has been reported as lost or stolen.
6. If card is valid, system then checks whether the user-entered PIN matches the card PIN maintained by the system.
7. If PIN numbers match, system checks what accounts are accessible with the ATM card.
8. System displays customer accounts and prompts customer for transaction type: withdrawal, query, or transfer.
Slide 62
USE CASE VALIDATE PIN (3)
Alternative sequences:
Step 2: If the system does not recognize the card, the system ejects the card.
Step 5: If the system determines that the card date has expired, the system confiscates the card.
Step 5: If the system determines that the card has been reported lost or stolen, the system confiscates the card.
Step 7: If the customer-entered PIN does not match the PIN number for this card, the system re-prompts for the PIN.
Step 7: If the customer enters the incorrect PIN three times, the system confiscates the card.
Steps 4-8: If the customer enters Cancel, the system cancels the transaction and ejects the card.
Postcondition: Customer PIN has been validated.
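The main and alternative sequences of Validate PIN can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the course's reference solution. The names `Card`, `validate_pin`, the `"CANCEL"` marker and the returned outcome strings are hypothetical, and treating exhausted input as a cancellation is an assumption made only to keep the function total.

```python
from dataclasses import dataclass
from typing import Iterable

MAX_PIN_ATTEMPTS = 3  # "three times" in the alternative sequence for step 7


@dataclass
class Card:
    number: str
    recognized: bool = True
    expired: bool = False
    reported_lost_or_stolen: bool = False


def validate_pin(card: Card, entries: Iterable[str], stored_pin: str) -> str:
    """Return 'validated', 'ejected', 'confiscated' or 'cancelled'."""
    # Step 2 (alternative): an unrecognized card is ejected.
    if not card.recognized:
        return "ejected"
    failed_attempts = 0
    # Steps 3-4: each item in `entries` is one customer response to the PIN prompt.
    for entry in entries:
        # Steps 4-8 (alternative): Cancel ends the transaction.
        if entry == "CANCEL":
            return "cancelled"
        # Step 5: expired, lost or stolen cards are confiscated.
        if card.expired or card.reported_lost_or_stolen:
            return "confiscated"
        # Steps 6-7: compare the entered PIN with the stored one.
        if entry == stored_pin:
            return "validated"
        failed_attempts += 1
        if failed_attempts >= MAX_PIN_ATTEMPTS:
            return "confiscated"
        # Otherwise the loop continues, which models the re-prompt.
    return "cancelled"  # assumption: no more input is treated as a cancel
```

For example, `validate_pin(Card("4000"), ["0000", "1234"], "1234")` models one wrong attempt followed by a correct one.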
Slide 63
USE CASE WITHDRAW FUNDS (1) Use case name: Withdraw Funds
Summary: Customer withdraws a specific amount of funds from a valid
bank account. Actor: ATM Customer Dependency: Include Validate PIN
use case. Precondition: ATM is idle, displaying a Welcome
message.
Slide 64
USE CASE WITHDRAW FUNDS (2)
Main sequence:
1. Include Validate PIN use case.
2. Customer selects Withdrawal, enters the amount, and selects the account number.
3. System checks that the customer has enough funds in the account and that the daily limit will not be exceeded.
4. If all checks are successful, system authorizes dispensing of cash.
5. System dispenses the cash amount.
6. System prints a receipt showing transaction number, transaction type, amount withdrawn, and account balance.
7. System ejects card.
8. System displays Welcome message.
Slide 65
USE CASE WITHDRAW FUNDS (3)
Alternative sequences:
Step 3: If the system determines that the account number is invalid, then it displays an error message and ejects the card.
Step 3: If the system determines that there are insufficient funds in the customer's account, then it displays an apology and ejects the card.
Step 3: If the system determines that the maximum allowable daily withdrawal amount has been exceeded, it displays an apology and ejects the card.
Step 5: If the ATM is out of funds, the system displays an apology, ejects the card, and shuts down the ATM.
Postcondition: Customer funds have been withdrawn.
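The checks in steps 3 to 5 of Withdraw Funds can be sketched as a sequence of guards. This is only an illustration under stated assumptions: the `Account` class, the `withdraw` function, the `DAILY_LIMIT` value and the outcome strings are all hypothetical, amounts are whole currency units, and the Validate PIN inclusion is assumed to have already succeeded.

```python
from dataclasses import dataclass
from typing import Dict

DAILY_LIMIT = 500  # hypothetical limit; the use case does not give a value


@dataclass
class Account:
    balance: int
    withdrawn_today: int = 0


def withdraw(accounts: Dict[str, Account], account_number: str,
             amount: int, atm_cash: int) -> str:
    """Apply the step 3 and step 5 checks in order and return the outcome."""
    account = accounts.get(account_number)
    if account is None:
        return "error: invalid account, card ejected"
    if amount > account.balance:
        return "apology: insufficient funds, card ejected"
    if account.withdrawn_today + amount > DAILY_LIMIT:
        return "apology: daily limit exceeded, card ejected"
    if amount > atm_cash:
        return "apology: ATM out of funds, card ejected, ATM shut down"
    # Steps 4-5: all checks passed, so the cash is dispensed.
    account.balance -= amount
    account.withdrawn_today += amount
    return "dispensed"
```

Ordering the guards this way mirrors the order of the alternative sequences; a real system would also have to print the receipt and return to the Welcome message.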
Slide 66
NON-FUNCTIONAL REQUIREMENTS Functional requirements define what
a system is supposed to do Non-functional requirements define how a
system is supposed to be Usually describe system attributes such as
security, reliability, maintainability, scalability, usability
Slide 67
NON-FUNCTIONAL REQUIREMENTS Non-Functional requirements can be
specified in a separate section of the use case description In the
previous example, for the Validate PIN use case, there could be a
security requirement that the card number and PIN must be encrypted
Non-Functional requirements can be specified for a group of use
cases or the whole system Security requirement: System shall
encrypt ATM card number and PIN. Performance requirement: System
shall respond to actor inputs within 5 seconds.
Slide 68
TOPIC 2 BEHAVIORAL MODELING
Slide 69
SOFTWARE MODELING UML 2.0 defines thirteen basic diagram types,
divided into two general sets: Structural Modeling Behavioral
Modeling Structural Models define the static architecture of a
model They are used to model the things that make up a model: the
classes, objects, interfaces and physical components In addition,
they are used to model the relationships and dependencies between
elements
Slide 70
BEHAVIORAL MODELING Behavior Models capture the dynamic
behavior of a system as it executes over time They provide a view
of a system in which control and sequencing are considered Either
within an object (by means of a finite state machine) or between
objects (by analysis of object interactions).
Slide 71
UML ACTIVITY DIAGRAMS In UML an activity diagram is used to
display the sequence of actions They show the workflow from start
to finish Detail the many decision paths that exist in the
progression of events contained in the activity Very useful when
parallel processing may occur in the execution of some
activities
Slide 72
UML ACTIVITY DIAGRAMS An example of an activity diagram is
shown below (We will come back to that diagram)
Slide 73
ACTIVITY An activity is the specification of a parameterized
sequence of behavior Shown as a round-cornered rectangle enclosing
all the actions and control flows
Slide 74
ACTIONS AND CONSTRAINTS An action represents a single step
within an activity Constraints can be attached to actions
Slide 75
CONTROL FLOW Shows the flow of control from one action to the
next Its notation is a line with an arrowhead. Initial Node Final
Node, two types: Activity Final Node and Flow Final Node
Slide 76
OBJECT FLOW An object flow is a path along which objects or
data can pass An object is shown as a rectangle A shorthand for
the above notation
Slide 77
DECISION AND MERGE NODES Decision nodes and merge nodes have
the same notation: a diamond shape The control flows coming away
from a decision node will have guard conditions
Slide 78
FORK AND JOIN NODES Forks and joins have the same notation:
either a horizontal or vertical bar They indicate the start and end
of concurrent threads of control A join synchronizes two or more inflows and
produces a single outflow The outflow from a join cannot execute
until all inflows have been received
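A fork followed by a join maps naturally onto starting threads and then waiting for all of them. The sketch below uses Python's standard `threading` module; the action names ("pack items", "charge card", "ship order") and the function `run_fork_join` are hypothetical and only illustrate the synchronization rule stated above.

```python
import threading
from typing import List


def run_fork_join() -> List[str]:
    """Run two actions concurrently, then one action after the join."""
    log: List[str] = []
    lock = threading.Lock()

    def action(name: str) -> None:
        # The lock makes the shared log safe to update from both threads.
        with lock:
            log.append(name)

    # Fork: two concurrent threads of control start here.
    threads = [
        threading.Thread(target=action, args=("pack items",)),
        threading.Thread(target=action, args=("charge card",)),
    ]
    for t in threads:
        t.start()

    # Join: the outflow cannot proceed until every inflow has arrived.
    for t in threads:
        t.join()

    log.append("ship order")
    return log
```

The two forked actions may complete in either order, but "ship order" is always last because it runs only after both `join()` calls return.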
Slide 79
PARTITION Shown as a horizontal or vertical swimlane Represents
a group of actions that have some common characteristic
Slide 80
UML ACTIVITY DIAGRAMS Coming back to our initial example
Slide 81
ISSUE HANDLING IN SOFTWARE PROJECTS Courtesy of
uml-diagrams.org
Slide 82
MORE ON ACTIVITY DIAGRAMS Interruptible Activity Regions
Expansion Regions Exception Handlers
Slide 83
INTERRUPTIBLE ACTIVITY REGION Surrounds a group of actions that
can be interrupted Example below: Process Order action will execute
until completion, when it will pass control to the Close Order
action, unless a Cancel Request interrupt is received, which will
pass control to the Cancel Order action.
Slide 84
EXPANSION REGION An expansion region is an activity region that
executes multiple times to consume all elements of an input
collection Example of books checkout at a library modeled using an
expansion region: Checkout Books Find Books to Borrow Checkout Book
Show Due Date Place Books in Bags
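In code, an expansion region behaves much like a loop body applied to every element of a collection. The sketch below is one possible reading of the library example, assuming that Checkout Book and Show Due Date sit inside the region while the other two actions sit outside it. The function name `checkout_books` and the 21-day loan period are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def checkout_books(books: List[str], today: date,
                   loan_days: int = 21) -> List[Tuple[str, date]]:
    """Check out every book in the input collection and return (title, due date) pairs."""
    receipts = []
    # Expansion region: the body runs once per element of the input collection.
    for title in books:
        due = today + timedelta(days=loan_days)  # Checkout Book
        receipts.append((title, due))            # Show Due Date
    # After the region has consumed all elements, the flow continues
    # (Place Books in Bags in the slide's example).
    return receipts
```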
Slide 85
EXPANSION REGION Another example: Encoding Video Encode Video
Capture Video Extract Audio from Frame Encode Video Frame Save
Encoded Video Attach Audio to Frame
Slide 86
EXCEPTION HANDLERS An exception handler is an element that
specifies what to execute in case the specified exception occurs
during the execution of the protected node In Java, the try block
corresponds to the Protected Node and the catch block corresponds to
the Handler Body Node
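The same correspondence holds in most languages with structured exception handling. Here is a minimal Python sketch, where `try` plays the role of the protected node and `except` plays the role of the handler body; the function `parse_quantity` and its fallback value of 0 are hypothetical.

```python
def parse_quantity(text: str) -> int:
    """Convert user input to an integer quantity, falling back to 0."""
    try:
        # Protected node: the action that may raise the exception.
        return int(text)
    except ValueError:
        # Handler body: executed only if the specified exception occurs.
        return 0
```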
Slide 87
SECTION 3 BEHAVIORAL MODELING
Slide 88
TOPICS We will continue with the subject of Behavioral Modeling
Introduce the various components of UML state machines
Slide 89
ACTIVITY DIAGRAMS VS STATE MACHINES In Activity Diagrams
Vertices represent Actions Edges (arrows) represent the transition that
occurs at the completion of one action and before the start of
another one (control flow) Vertex representing an Action Arrow
implying transition from one action to another
Slide 90
ACTIVITY DIAGRAMS VS STATE MACHINES In State Machines Vertices
represent states of a process Edges (arrows) represent occurrences
of events Vertex representing a State Arrow representing an
event
Slide 91
UML STATE MACHINES Used to model the dynamic behaviour of a
process Can be used to model a high level behaviour of an entire
system Can be used to model the detailed behaviour of a single
object All other levels of detail in between these
extremes are also possible
Slide 92
UML STATE MACHINE EXAMPLE Example of a garage door state
machine (We will come back to this example later)
Slide 93
STATES Symbol for a state A system in a state will remain in it
until the occurrence of an event that will cause it to transition
to another one Being in a state means that a system will behave in
a predetermined way in response to a given event Symbols for the
initial and final states
Slide 94
STATES Numerous types of events can cause the system to
transition from one state to another In every state, the system
behaves in a different manner Names for states are usually chosen
as: Adjectives: open, closed, ready Present continuous verbs:
opening, closing, waiting
Slide 95
TRANSITIONS Transitions are represented with arrows
Slide 96
TRANSITIONS
Transitions represent a change of state in response to an event
Theoretically, a transition is supposed to occur instantaneously (it does not take time to execute)
A transition can have:
Trigger: causes the transition; can be an event or simply the passage of time
Guard: a condition that must evaluate to true for the transition to occur
Effect: an action that will be invoked directly on the system or the object being modeled (if we are modeling an object, the effect would correspond to a specific method)
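One way to make trigger, guard and effect concrete is a table-driven state machine. The sketch below is a minimal illustration, not a standard API: the `Transition` and `StateMachine` classes, the `dispatch` method and the card-reader example are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Transition:
    source: str                     # state the transition leaves
    trigger: str                    # event name that causes the transition
    target: str                     # state the transition enters
    guard: Callable[[Dict], bool]   # must return True for the transition to occur
    effect: Callable[[Dict], None]  # action invoked when the transition fires


class StateMachine:
    def __init__(self, initial: str, transitions: List[Transition]):
        self.state = initial
        self.transitions = transitions
        self.context: Dict = {"log": []}

    def dispatch(self, trigger: str) -> bool:
        """Fire the first enabled transition for this trigger; return True if one fired."""
        for t in self.transitions:
            if (t.source == self.state and t.trigger == trigger
                    and t.guard(self.context)):
                t.effect(self.context)
                self.state = t.target
                return True
        return False  # no enabled transition: the event is ignored


# Hypothetical example: a card reader that only advances when the card is recognized.
machine = StateMachine(
    "Idle",
    [
        Transition(
            source="Idle",
            trigger="cardInserted",
            target="WaitingForPin",
            guard=lambda ctx: ctx.get("card_recognized", False),
            effect=lambda ctx: ctx["log"].append("read card number"),
        )
    ],
)
```

Calling `machine.dispatch("cardInserted")` leaves the machine in `Idle` until `machine.context["card_recognized"]` is set to `True`, which is exactly the role of the guard.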
Slide 97
STATE ACTIONS An effect can also be associated with a state If
a destination state is associated with numerous incident
transitions (transitions arriving at that state), and every
transition defines the same effect: The effect can therefore be
associated with the state instead of the transitions (avoid
duplications) This can be achieved using an On Entry effect (we can
have multiple entry effects) We can also add one or more On Exit
effects
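The benefit of entry and exit effects shows up when several transitions lead into the same state. In the hypothetical alarm sketch below, two different events enter `Alarming`, yet the "start siren" effect is declared once on the state. The `Machine` class and all state, event and effect names are assumptions for illustration, and effects are simply recorded as strings in a log.

```python
from typing import Dict, List, Optional, Tuple


class Machine:
    def __init__(self, initial: str,
                 transitions: Dict[Tuple[str, str], str],
                 on_entry: Optional[Dict[str, List[str]]] = None,
                 on_exit: Optional[Dict[str, List[str]]] = None):
        self.state = initial
        self.transitions = transitions  # (state, event) -> target state
        self.on_entry = on_entry or {}  # state -> effects run when entering it
        self.on_exit = on_exit or {}    # state -> effects run when leaving it
        self.log: List[str] = []

    def fire(self, event: str) -> None:
        target = self.transitions.get((self.state, event))
        if target is None:
            return  # event not handled in the current state
        self.log.extend(self.on_exit.get(self.state, []))  # On Exit effects
        self.state = target
        self.log.extend(self.on_entry.get(target, []))     # On Entry effects


TRANSITIONS = {
    ("Armed", "motionDetected"): "Alarming",
    ("Armed", "doorOpened"): "Alarming",
    ("Alarming", "reset"): "Armed",
}
ON_ENTRY = {"Alarming": ["start siren"]}
ON_EXIT = {"Alarming": ["stop siren"]}
```

Without the entry effect, "start siren" would have to be repeated on both transitions into `Alarming`.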
Slide 98
SELF TRANSITION A state can also have self transitions These self
transitions are more useful when they have an effect associated with
them Timer events are commonly used with self transitions Below
is a typical example:
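A self transition with a timer trigger can be sketched as follows. This is an assumed example, not the one from the slide's figure: the state names `Running` and `Stopped`, the events `tick` and `stop`, and the function `run_timer` are hypothetical.

```python
from typing import Iterable, Tuple


def run_timer(events: Iterable[str]) -> Tuple[str, int]:
    """Process events and return the final state and the elapsed tick count."""
    state = "Running"
    elapsed = 0
    for event in events:
        if state == "Running" and event == "tick":
            # Self transition: the state does not change, only the effect runs.
            elapsed += 1
        elif state == "Running" and event == "stop":
            state = "Stopped"
        # Any other event is ignored in the current state.
    return state, elapsed
```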
Slide 99
COMING BACK TO OUR INITIAL EXAMPLE Example of a garage door
state machine
Slide 100
DECISIONS Just like in activity diagrams, we can use decision
nodes (although we usually call them decision pseudo-states)
Decision pseudo-states are represented with a diamond We always
have one input transition and multiple outputs The branch of
execution is decided by the guards associated with the transitions
coming out of the decision pseudo-state
Slide 101
DECISIONS
Slide 102
COMPOUND STATES A state machine can include several
sub-machines Below is an example of a sub-machine included in the
compound state Connected Connected Waiting ProcessingByte
receiveByte byteProcessed disconnect Disconnected connect
closeSession
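The Connected example can be sketched by representing the current state as a pair of (top-level state, substate). The wiring below is an assumption inferred from the labels in the diagram, since the arrows themselves are not available in this text, and the name `SessionClosed` for the state reached by closeSession is hypothetical.

```python
from typing import Optional, Tuple

State = Tuple[str, Optional[str]]  # (top-level state, substate or None)


def step(state: State, event: str) -> State:
    """Return the next state for one event; unhandled events leave the state unchanged."""
    top, sub = state
    if top == "Disconnected":
        if event == "connect":
            # Entering the compound state starts at its initial substate.
            return ("Connected", "Waiting")
        if event == "closeSession":
            return ("SessionClosed", None)
    elif top == "Connected":
        if event == "disconnect":
            # Handled by the compound state, whatever the active substate is.
            return ("Disconnected", None)
        if sub == "Waiting" and event == "receiveByte":
            return ("Connected", "ProcessingByte")
        if sub == "ProcessingByte" and event == "byteProcessed":
            return ("Connected", "Waiting")
    return state
```

The `disconnect` branch shows the main benefit of a compound state: one transition on the enclosing state replaces one transition per substate.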
Slide 103
COMPOUND STATES EXAMPLE
Slide 104
Same example, with an alternative notation The link symbol in
the Check Pin state indicates that the details of the sub-machine
associated with Check Pin are specified in another state
machine
Slide 105
ALTERNATIVE ENTRY POINTS Sometimes, in a sub-machine, we do not
want to start the execution from the initial state We want to start
the execution from a named alternative entry point
PerformActivity
Slide 106
ALTERNATIVE ENTRY POINTS Here's the same system, from a higher
level Transition from the Not Already Initialized state leads to the
standard initial state in the sub-machine Transition from the
Already Initialized state is connected to the named alternative
entry point Skip Initializing
Slide 107
ALTERNATIVE EXIT POINTS It is also possible to have alternative
exit points for a compound state Transition from Processing
Instructions state takes the regular exit Transition from the
Reading Instructions state takes an alternative named exit
point
Slide 111
ATM MACHINE EXAMPLE Validate PIN:
Slide 112
ATM MACHINE EXAMPLE Funds withdrawal:
Slide 113
SECTION 4 BEHAVIORAL MODELING
Slide 114
TOPICS We will continue to talk about UML State Machines We will
go through a complete example of a simple software construction
case study with emphasis on UML State Machines End this section
with some final words of wisdom!
Slide 115
LAST LECTURE We have talked about UML State Machines States and
transitions State effects Self Transition Decision pseudo-states
Compound states Alternative entry and exit points Today, we will
tackle more advanced UML State Machines Concepts
Slide 116
HISTORY STATES A state machine describes the dynamic aspects of
a process whose current behavior depends on its past A state
machine in effect specifies the legal ordering of states a process
may go through during its lifetime When a transition enters a
compound state, the action of the nested state machine starts over
again at its initial state Unless an alternative entry point is
specified There are times you'd like to model a process so that it
remembers the last substate that was active prior to leaving the
compound state
Slide 117
HISTORY STATES Simple washing machine state diagram: Power Cut
event: transition to the Power Off state Restore Power event:
transition to the active state before the power was cut off to
proceed in the cycle
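A history state amounts to remembering the last active substate when the compound state is left. The sketch below illustrates this for a washing machine, assuming a cycle of Washing, Rinsing and Spinning. Those substate names, the `phaseDone` event and the `WashingMachine` class are hypothetical, while the power events follow the slide.

```python
class WashingMachine:
    CYCLE = ["Washing", "Rinsing", "Spinning"]  # assumed substates of the running cycle

    def __init__(self):
        self.state = "Washing"
        self.history = None  # last active substate before the power was cut

    def handle(self, event: str) -> None:
        if event == "powerCut" and self.state in self.CYCLE:
            self.history = self.state  # remember where we were in the cycle
            self.state = "PowerOff"
        elif event == "restorePower" and self.state == "PowerOff":
            self.state = self.history  # resume via the history state
        elif event == "phaseDone" and self.state in self.CYCLE:
            i = self.CYCLE.index(self.state)
            self.state = self.CYCLE[i + 1] if i + 1 < len(self.CYCLE) else "Done"
```

Without the history state, restoring power would restart the cycle at `Washing` instead of resuming where it stopped.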
Slide 118
CONCURRENT REGIONS Sequential sub state machines are the most
common kind of sub machines In certain modeling situations,
concurrent sub machines might be needed (two or more sub state
machines executing in parallel) Brakes example:
Slide 119
CONCURRENT REGIONS Example of modeling system maintenance using
concurrent regions Idle Maintenance Testing devices Self diagnosing
Waiting Processing Command Testing Commanding command
commandProcessed [continue] commandProcessed [not continue]
maintain diagnosisCompleted testingCompleted shutDown
Slide 120
ORTHOGONAL REGIONS Concurrent Regions are also called
Orthogonal Regions These regions allow us to model an "and" relationship
between states (as opposed to the default "or" relationship)
This means that in a sub state machine, the system can be in
several states simultaneously Let us analyse this phenomenon using
an example of computer keyboard state machine
Slide 121
KEYBOARD EXAMPLE (1) Keyboard example without Orthogonal
Regions
Slide 122
KEYBOARD EXAMPLE (2) Keyboard example with Orthogonal
Regions
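The advantage of orthogonal regions can be seen by counting states. The sketch below assumes the keyboard has two independent regions, one for Caps Lock and one for Num Lock; the region names, key names and the `Keyboard` class are assumptions, since the figures are not available in this text.

```python
from itertools import product

CAPS_STATES = ("CapsLockOff", "CapsLockOn")
NUM_STATES = ("NumLockOff", "NumLockOn")


class Keyboard:
    """Two orthogonal regions: the keyboard is in one state of each at the same time."""

    def __init__(self):
        self.caps = "CapsLockOff"
        self.num = "NumLockOff"

    def press(self, key: str) -> None:
        # Each key only affects its own region.
        if key == "CAPS_LOCK":
            self.caps = "CapsLockOn" if self.caps == "CapsLockOff" else "CapsLockOff"
        elif key == "NUM_LOCK":
            self.num = "NumLockOn" if self.num == "NumLockOff" else "NumLockOff"

    def configuration(self):
        return (self.caps, self.num)


# Without orthogonal regions, every combination needs its own state.
FLAT_STATES = list(product(CAPS_STATES, NUM_STATES))
```

With two regions of two states each, the orthogonal model needs 2 + 2 states while the flat model needs 2 x 2; the gap grows quickly as regions are added.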
Slide 123
GARAGE DOOR CASE STUDY Background Company DOORS inc.
manufactures garage door components Nonetheless, they have been
struggling with the embedded software running on their automated
garage opener Motor Unit that they developed in house This is
causing them to lose business They decided to scrap the existing
software and hire a professional software company to deliver
bug-free software
Slide 124
CLIENT REQUIREMENTS Client (informal) requirements:
Requirement 1: When the garage door is closed, it must open whenever
the user presses on the button of the wall mounted door control or
the remote control
Requirement 2: When the garage door is open, it must close whenever
the user presses on the button of the wall mounted door control or
the remote control
Requirement 3: The garage door should not close on an obstacle
Requirement 4: There should be a way to leave the garage door half
open
Requirement 5: System should run a self diagnosis test before
performing any command (open or close) to make sure all components
are functional
Slide 125
CLIENT REQUIREMENTS Motor Unit (includes a microcontroller
where the software will be running) Wall Mounted Controller (a
remote controller is also supported) Sensor Unit(s) (detects
obstacles, when the door is fully open and when it is fully
closed)
Slide 126
USE CASE DIAGRAM [Use case diagram. Actor: Garage Door User. System:
Garage Door System. Use cases: Open Door, Close Door, Run Diagnosis.
Relationship shown: include]
Slide 127
RUN DIAGNOSIS USE CASE Use Case Name: Run Diagnosis Summary:
The system runs a self diagnosis procedure Actor: Garage door user
Pre-Condition: User has pressed the remote or wall mounted control
button Sequence: 1.Check if the sensor is operating correctly
2.Check if the motor unit is operating correctly 3.If all checks
are successful, system authorizes the command to be executed
Alternative Sequence: Step 3: One of the checks fails and therefore
the system does not authorize the execution of the command
Postcondition: Self diagnosis ensured that the system is
operational
Slide 128
OPEN DOOR USE CASE Use Case Name: Open Door Summary: Open the
garage door Actor: Garage door user Dependency: Include Run
Diagnosis use case Pre-Condition: Garage door system is operational
and ready to take a command Sequence: 1.User presses the remote or
wall mounted control button 2.Include Run Diagnosis use case 3.If
the door is currently closing or is already closed, system opens
the door Alternative Sequence: Step 3: If the door is open, system
closes door Step 3: If the door is currently opening, system stops
the door (leaving it half open) Postcondition: Garage door is
open
Slide 129
CLOSE DOOR USE CASE Use Case Name: Close Door Summary: Close
the garage door Actor: Garage door user Dependency: Include Run
Diagnosis use case Pre-Condition: Garage door system is operational
and ready to take a command Sequence: 1.User presses the remote or
wall mounted control button 2.Include Run Diagnosis use case 3.If
the door is currently open, system closes the door Alternative
Sequence: Step 3: If the door is currently closing or is already
closed, system opens the door Step 3: If the door is currently
opening, system stops the door (leaving it half open)
Postcondition: Garage door is closed
Slide 130
HIGH LEVEL BEHAVIORAL MODELING
Slide 131
HIGH LEVEL STRUCTURAL MODEL
Slide 132
REFINED STRUCTURAL MODEL
Slide 133
REFINE BEHAVIORAL MODEL MOTOR UNIT buttonPressed(),
obstacleDetected() [isFunctioning()] Open Closing Closed Opening
HalfOpen buttonPressed() buttonPressed() [isFunctioning()]
doorOpen() doorClosed() buttonPressed() [isFunctioning()] Running
WaitingForRepair buttonPressed(), [! isFunctioning()] Timer (180 s)
[! isFunctioning()] Timer (180 s) [isFunctioning()]
Slide 134
REFINE BEHAVIORAL MODEL SENSOR UNIT CheckingForObstacles
CheckingIfDoorOpen CheckingIfDoorClosed [!isObstacleDetected()]
[isObstacleDetected()] [!isDoorOpen()] [isDoorOpen()] Sleeping
[!isDoorClosed()] [isDoorClosed()] Time (20 ms)
SendingObstacleEvent SendingOpenDoorEvent
SendingDoorClosedEvent
Slide 135
DO NOT FALL ASLEEP YET!
Slide 136
CODING Whenever we are satisfied with the level of detail in
our behavioral models, we can proceed to coding Some of the code
can be generated directly by tools from the behavioral model Some
tweaking might be necessary (do not use the code blindly) Humans
are still the smartest programmers
Slide 137
EVENT GENERATOR CLASS
Slide 138
SENSOR CLASS
Slide 139
Sensor State machine Implementation
Slide 140
UMPLE ONLINE DEMO UMPLE is a modeling tool to enable what we
call Model-Oriented Programming This is what we do in this course
You can use it to create class diagrams (structural models) and
state machines (behavioral models) The tool was developed at the
University of Ottawa Online version can be found at:
http://cruise.eecs.uottawa.ca/umpleonline/ There is also an Eclipse
plugin for the tool
Slide 141
UMPLE CODE FOR MOTOR UNIT STATE MACHINE
class Motor {
  status {
    Running {
      Open { buttonPressed()[isFunctioning()] -> Closing; }
      Closing {
        buttonPressed()[isFunctioning()] -> Opening;
        obstacleDetected()[isFunctioning()] -> Opening;
        doorClosed() -> Closed;
      }
      Closed { buttonPressed()[isFunctioning()] -> Opening; }
      Opening {
        buttonPressed() -> HalfOpen;
        doorOpen() -> Open;
      }
      HalfOpen { buttonPressed() -> Opening; }
      buttonPressed()[!isFunctioning()] -> WaitingForRepair;
    }
    WaitingForRepair {
      timer()[!isFunctioning()] -> WaitingForRepair;
      timer()[isFunctioning()] -> Running;
    }
  }
}
Slide 142
MOTOR CLASS SNIPPETS Switching between high level states
Switching between nested states inside the Running compound
state
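The two kinds of switching can be sketched by hand in plain Java. This is only an illustration of the idea, assuming the transitions of the Umple listing for the motor unit; it is not the code that Umple generates. The class name MotorSketch, the enums and the boolean field functioning (standing in for isFunctioning()) are hypothetical, and keeping the previous sub-state when Running is re-entered is an assumption the slides do not specify.

```java
// Hand-written sketch of the Motor Unit state machine (NOT Umple output).
// All names here are illustrative assumptions.
class MotorSketch {
    enum Status { RUNNING, WAITING_FOR_REPAIR }            // high level states
    enum Door { OPEN, CLOSING, CLOSED, OPENING, HALF_OPEN } // nested states of Running

    Status status = Status.RUNNING;
    Door door = Door.OPEN;          // first nested state in the Umple listing
    boolean functioning = true;     // stands in for isFunctioning()

    void buttonPressed() {
        if (status != Status.RUNNING) return;
        // Unguarded nested transitions are tried first
        switch (door) {
            case OPENING:   door = Door.HALF_OPEN; return;
            case HALF_OPEN: door = Door.OPENING;   return;
            default: break;
        }
        // Transition defined on the compound state Running
        if (!functioning) { status = Status.WAITING_FOR_REPAIR; return; }
        // Guarded nested transitions: [isFunctioning()]
        switch (door) {
            case OPEN:    door = Door.CLOSING; break;
            case CLOSING: door = Door.OPENING; break;
            case CLOSED:  door = Door.OPENING; break;
            default: break;
        }
    }

    void obstacleDetected() {
        if (status == Status.RUNNING && door == Door.CLOSING && functioning) door = Door.OPENING;
    }

    void doorClosed() {
        if (status == Status.RUNNING && door == Door.CLOSING) door = Door.CLOSED;
    }

    void doorOpen() {
        if (status == Status.RUNNING && door == Door.OPENING) door = Door.OPEN;
    }

    void timer() {
        // Assumption: the previous nested state is kept when Running is re-entered
        if (status == Status.WAITING_FOR_REPAIR && functioning) status = Status.RUNNING;
    }
}
```

Each event is a method, and each method only reacts in the states where the model defines a transition, so there is no central loop polling the door.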
Slide 143
WHEN TO USE STATE MACHINES? When an object or a system
progresses through various stages of execution (states) The
behavior of the system differs from one stage to another When you
can identify clear events that change the status of the system They
are ideal for event-driven programming (fewer loops and branches,
more events generated and exchanged) Lots of events are being
exchanged between objects When using event-driven programming, make
sure you follow the Observable or Event Notifier patterns Both are
pretty simple (similar to what we have done for the garage door
example)
Slide 144
BEHAVIORAL OVER-MODELING Please model responsibly!! Do not get
carried away with modeling every single detail to the point where
you run behind schedule You sell code, not models
Slide 145
BEHAVIORAL OVER-MODELING Now, be careful, you do not want to
over-model Modern software development processes are all about
doing just enough modeling for a successful product Therefore,
start with a high level model of the behavior This model should
give a clear overview of some (not necessarily all) of the important
functionality of the system This would be similar to the first
garage door state machine we created
Slide 146
BEHAVIORAL OVER-MODELING Identify potentially complex areas that
require further understanding We minimize the risk if we understand
these components well before we start programming Model these
complex areas in more detail until you are satisfied that they are
well understood Use tools to generate code from your existing
models Do not rely blindly on tools (at least not yet!)
Slide 147
DESIGNING CLASSES WITH STATE DIAGRAMS Keep the state diagram
simple State diagrams can very quickly become extremely complex and
confusing At all times, you should follow the aesthetic rule: Less
is More If the state diagram gets too complex, consider splitting it
into smaller classes Think about compound states instead of a flat
design
Slide 148
EXAMPLE OF A CD PLAYER WITH A RADIO On Displaying Current Time
Displaying Alarm Time Display Alarm Timer (3 s) Playing Radio
Playing CD off Off On H Play CD Play Radio
Slide 149
MORE UML STATE MACHINES EXAMPLES Flight State Machine
Slide 150
MORE UML STATE MACHINES EXAMPLES Flight State Machine
Nested
Slide 151
SECTION 5 PETRI NETS THESE SLIDES ARE BASED ON LECTURE NOTES
FROM: DR. CHRIS LING (HTTP://WWW.CSSE.MONASH.EDU.AU/~SLING/) 151
SEG2106 Winter 2014 Hussein Al Osman
Slide 152
TOPICS Today we will discuss another type of state machine:
Petri nets (this will be just an introduction) This will be the
last behavioral modeling topic we cover We will start the next
section of the course next week
Slide 153
OK, LET'S START
Slide 154
INTRODUCTION First introduced by Carl Adam Petri in 1962. A
diagrammatic tool to model concurrency and synchronization in
systems They allow us to quickly simulate complex concurrent
behavior (which is faster than prototyping!) Fairly similar to UML
State machines that we have seen so far Used as a visual
communication aid to model the system behavior Based on strong
mathematical foundation
Slide 155
EXAMPLE: POS TERMINAL (UML STATE MACHINE) (POS = Point of Sale)
[State diagram. States: idle, d1, d2, d3, d4, OK pressed, Approved,
Rejected. Transition labels: 1 digit, OK, approve, Reject]
Slide 156
EXAMPLE: POS TERMINAL (PETRI NET) [Petri net. Places: Initial, d1,
d2, d3, d4, OK pressed, approved, Rejected!. Transitions: 1 digit,
OK, approve, Reject]
Slide 157
POS TERMINAL Scenario 1 (normal): the user enters all 4 digits and
presses OK. Scenario 2 (exceptional): the user enters only 3 digits
and presses OK.
Slide 158
EXAMPLE: POS SYSTEM (TOKEN GAMES) [Petri net. Places: Initial, d1,
d2, d3, d4, OK pressed, approved, Rejected!. Transitions: 1 digit,
OK, approve, Reject]
Slide 159
PETRI NET COMPONENTS The terms are a bit different from UML
state machines Petri nets consist of three types of components:
places (circles), transitions (rectangles) and arcs (arrows):
Places represent possible states of the system Transitions are
events or actions which cause the change of state (be careful,
transitions are no longer arrows here) Every arc simply connects a
place with a transition or a transition with a place.
Slide 160
CHANGE OF STATE A change of state is denoted by a movement of
token(s) (black dots) from place(s) to place(s) It is caused by the
firing of a transition. The firing represents an occurrence of the
event or an action taken The firing is subject to the input
conditions, denoted by token availability
Slide 161
CHANGE OF STATE A transition is firable or enabled when there
are sufficient tokens in its input places. After firing, tokens
will be transferred from the input places (old state) to the output
places, denoting the new state
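The firing rule can be sketched in a few lines of Java. This is a minimal illustration, assuming every arc has weight 1; the class PetriNetSketch, its methods and the small net built in restaurantDemo() are hypothetical names, loosely inspired by a waiter taking an order, and are not taken from any Petri net library.

```java
import java.util.*;

// Minimal Petri net sketch: places hold tokens, transitions move them.
// Assumption: every arc has weight 1. All names are illustrative.
class PetriNetSketch {
    // Marking: number of tokens currently in each place
    final Map<String, Integer> marking = new HashMap<>();
    // Input and output places of each transition
    final Map<String, List<String>> inputs = new HashMap<>();
    final Map<String, List<String>> outputs = new HashMap<>();

    void addTransition(String name, List<String> in, List<String> out) {
        inputs.put(name, in);
        outputs.put(name, out);
    }

    // A transition is enabled when every input place holds at least one token
    boolean isEnabled(String transition) {
        for (String place : inputs.get(transition))
            if (marking.getOrDefault(place, 0) < 1) return false;
        return true;
    }

    // Firing removes one token from each input place and adds one to each output place
    boolean fire(String transition) {
        if (!isEnabled(transition)) return false;
        for (String place : inputs.get(transition)) marking.merge(place, -1, Integer::sum);
        for (String place : outputs.get(transition)) marking.merge(place, 1, Integer::sum);
        return true;
    }

    // Hypothetical net: taking an order needs both a free waiter and a customer
    static PetriNetSketch restaurantDemo() {
        PetriNetSketch net = new PetriNetSketch();
        net.marking.put("waiterFree", 1);
        net.marking.put("customer", 1);
        net.addTransition("takeOrder",
                Arrays.asList("waiterFree", "customer"), Arrays.asList("orderTaken"));
        net.addTransition("tellKitchen",
                Arrays.asList("orderTaken"), Arrays.asList("waiterFree", "waiting"));
        return net;
    }
}
```

In the demo net, takeOrder can fire once; a second attempt is refused because the waiterFree place is empty until tellKitchen returns the token.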
Slide 162
EXAMPLE: VENDING MACHINE The machine dispenses two kinds of
snack bars: 20c and 15c Only two types of coins can be used: 10c
coins and 5c coins (ah the old days!!) The machine does not return
any change
Slide 163
EXAMPLE: VENDING MACHINE (UML STATE MACHINE) [State diagram.
States: 0 cent inserted, 5 cents inserted, 10 cents inserted, 15
cents inserted, 20 cents inserted. Transition labels: Deposit 5c,
Deposit 10c, Take 15c snack bar, Take 20c snack bar]
Slide 164
EXAMPLE: VENDING MACHINE (A PETRI NET) [Petri net. Places: 0c, 5c,
10c, 15c, 20c. Transitions: Deposit 5c, Deposit 10c, Take 15c bar,
Take 20c bar]
Slide 165
EXAMPLE: VENDING MACHINE (3 SCENARIOS) Scenario 1: Deposit 5c,
deposit 5c, deposit 5c, deposit 5c, take 20c snack bar. Scenario 2:
Deposit 10c, deposit 5c, take 15c snack bar. Scenario 3: Deposit
5c, deposit 10c, deposit 5c, take 20c snack bar.
Slide 166
EXAMPLE: VENDING MACHINE (TOKEN GAMES) [Petri net. Places: 0c, 5c,
10c, 15c, 20c. Transitions: Deposit 5c, Deposit 10c, Take 15c bar,
Take 20c bar]
Slide 167
MULTIPLE LOCAL STATES In the real world, events happen at the
same time A system may have many local states to form a global
state. There is a need to model concurrency and synchronization
Slide 168
EXAMPLE: IN A RESTAURANT (A PETRI NET) [Petri net. Labels: Waiter
free, Customer 1, Customer 2, Take order, Order taken, Tell kitchen,
wait, Serve food, eating]
Slide 169
EXAMPLE: IN A RESTAURANT (TWO SCENARIOS) Scenario 1: Waiter
1.Takes order from customer 1 2.Serves customer 1 3.Takes order
from customer 2 4.Serves customer 2 Scenario 2: Waiter 1.Takes
order from customer 1 2.Takes order from customer 2 3.Serves
customer 2 4.Serves customer 1
Slide 170
EXAMPLE: IN A RESTAURANT (SCENARIO 2) [Petri net. Labels: Waiter
free, Customer 1, Customer 2, Take order, Order taken, Tell kitchen,
wait, Serve food, eating]
Slide 171
EXAMPLE: IN A RESTAURANT (SCENARIO 1) [Petri net. Labels: Waiter
free, Customer 1, Customer 2, Take order, Order taken, Tell kitchen,
wait, Serve food, eating]
Slide 172
NET STRUCTURES A sequence of events/actions: e1 e2 e3 Concurrent
executions: e1 e2 e3 e4 e5
Slide 173
NET STRUCTURES Non-deterministic events - conflict, choice or
decision: A choice of either e1, e2 or e3, e4... e1 e2 e3 e4
Slide 174
NET STRUCTURES Synchronization e1
Slide 175
NET STRUCTURES Synchronization and Concurrency e1
Slide 176
ANOTHER EXAMPLE A producer-consumer system consists of: One
producer Two consumers One storage buffer With the following
conditions: The storage buffer may contain at most 5 items; The
producer sends 3 items in each production; At most one consumer is
able to access the storage buffer at one time; Each consumer
removes two items when accessing the storage buffer
Slide 177
A PRODUCER-CONSUMER SYSTEM [Petri net. Regions: Producer, Storage,
Consumers. Places: p1, p2, p3 (Storage), p4, p5, with labels ready,
idle, accepted, ready. Transitions: t1, t2, t3, t4, with labels
produce, send, accept, consume. Capacities shown: k=1, k=5, k=2.
Arc weights shown: 3, 2]
Slide 178
A PRODUCER-CONSUMER EXAMPLE In this Petri net, every place has
a capacity and every arc has a weight. This allows multiple tokens
to reside in a place to model more complex behavior.
Slide 179
SHORT BREAK? Are you here yet?
Slide 180
BEHAVIORAL PROPERTIES Reachability: Can we reach one particular
state from another? Boundedness: Will a storage place overflow?
Liveness: Will the system die in a particular state?
Slide 181
RECALLING THE VENDING MACHINE (TOKEN GAME) [Petri net. Places: 0c,
5c, 10c, 15c, 20c. Transitions: Deposit 5c, Deposit 10c, Take 15c
bar, Take 20c bar]
Slide 182
A MARKING IS A STATE... [Petri net of the vending machine with
places p1 to p5 and transitions t1 to t9]
M0 = (1,0,0,0,0) M1 = (0,1,0,0,0) M2 = (0,0,1,0,0)
M3 = (0,0,0,1,0) M4 = (0,0,0,0,1)
Initial marking: M0
REACHABILITY M2 is reachable from M1 and M4 is reachable from
M0. In fact, in the vending machine example, all markings are
reachable from every marking. A firing or occurrence sequence:
M0 M1 M2 M3 M0 M2 M4 (transition labels shown in the figure: t3,
t1, t5, t8, t2, t6)
Slide 185
BOUNDEDNESS A Petri net is said to be k-bounded or simply
bounded if the number of tokens in each place does not exceed a
finite number k for any marking reachable from M0. The Petri net
for the vending machine is 1-bounded.
Slide 186
LIVENESS A Petri net with initial marking M0 is live if, no
matter what marking has been reached from M0, it is possible to
ultimately fire any transition by progressing through some further
firing sequence. A live Petri net guarantees deadlock-free
operation, no matter what firing sequence is chosen.
Slide 187
LIVENESS The vending machine is live and the producer-consumer
system is also live. A transition is dead if it can never be fired
in any firing sequence.
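For small nets, reachability and boundedness can be checked by simply exploring every reachable marking. The sketch below is an illustration under stated assumptions: the class ReachabilitySketch and its methods are hypothetical names, the exploration is capped by a limit because an unbounded net has infinitely many markings, and the vending machine net is encoded as five places (0c, 5c, 10c, 15c, 20c) with one transition per deposit or take action, which is an interpretation of the slides rather than a copy of the figure. Note that hasDeadlock only looks for markings where nothing can fire; the absence of deadlock is necessary for liveness but does not prove it.

```java
import java.util.*;

// Exhaustive exploration of reachable markings (sketch, illustrative names).
class ReachabilitySketch {
    // pre[t][p]: tokens transition t takes from place p; post[t][p]: tokens it adds
    final int[][] pre, post;

    ReachabilitySketch(int[][] pre, int[][] post) { this.pre = pre; this.post = post; }

    // Assumed encoding of the vending machine: places 0c, 5c, 10c, 15c, 20c
    static ReachabilitySketch vendingMachine() {
        int[][] arcs = { {0, 1}, {1, 2}, {2, 3}, {3, 4},   // deposit 5c
                         {0, 2}, {1, 3}, {2, 4},           // deposit 10c
                         {3, 0}, {4, 0} };                 // take 15c bar, take 20c bar
        int[][] pre = new int[arcs.length][5], post = new int[arcs.length][5];
        for (int t = 0; t < arcs.length; t++) {
            pre[t][arcs[t][0]] = 1;
            post[t][arcs[t][1]] = 1;
        }
        return new ReachabilitySketch(pre, post);
    }

    boolean enabled(List<Integer> m, int t) {
        for (int p = 0; p < m.size(); p++) if (m.get(p) < pre[t][p]) return false;
        return true;
    }

    List<Integer> fire(List<Integer> m, int t) {
        List<Integer> next = new ArrayList<>(m);
        for (int p = 0; p < m.size(); p++) next.set(p, m.get(p) - pre[t][p] + post[t][p]);
        return next;
    }

    // Breadth-first search over markings, stopped once `limit` markings were seen
    Set<List<Integer>> reachable(List<Integer> m0, int limit) {
        Set<List<Integer>> seen = new LinkedHashSet<>();
        Deque<List<Integer>> queue = new ArrayDeque<>();
        seen.add(m0);
        queue.add(m0);
        while (!queue.isEmpty() && seen.size() < limit) {
            List<Integer> m = queue.remove();
            for (int t = 0; t < pre.length; t++) {
                if (!enabled(m, t)) continue;
                List<Integer> next = fire(m, t);
                if (seen.add(next)) queue.add(next);
            }
        }
        return seen;
    }

    // Largest token count found in any place of any explored marking
    int bound(Set<List<Integer>> markings) {
        int max = 0;
        for (List<Integer> m : markings) for (int v : m) max = Math.max(max, v);
        return max;
    }

    // True if some explored marking enables no transition at all
    boolean hasDeadlock(Set<List<Integer>> markings) {
        for (List<Integer> m : markings) {
            boolean any = false;
            for (int t = 0; t < pre.length; t++) any |= enabled(m, t);
            if (!any) return true;
        }
        return false;
    }
}
```

Starting from M0 = (1,0,0,0,0), the exploration of this encoding finds exactly the five markings M0 to M4, never more than one token per place, and no deadlock.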
Slide 188
AN EXAMPLE A bounded but non-live Petri net [Petri net with places
p1 to p4 and transitions t1 to t4]
M0 = (1,0,0,1) M1 = (0,1,0,1) M2 = (0,0,1,0) M3 = (0,0,0,1)
Slide 189
ANOTHER EXAMPLE An unbounded but live Petri net [Petri net with
places p1 to p5 and transitions t1 to t4]
M0 = (1, 0, 0, 0, 0) M1 = (0, 1, 1, 0, 0) M2 = (0, 0, 0, 1, 1)
M3 = (1, 1, 0, 0, 0) M4 = (0, 2, 1, 0, 0)
Slide 190
OTHER TYPES OF PETRI NETS Object-Oriented Petri nets Tokens can
either be instances of classes, or states of objects. Net structure
models the inner behaviour of objects.
Slide 191
AN O-O PETRI NET [Petri net. Regions: Producer, Consumer. Labels:
ready, produce, send, Storage, accept, accepted, consume, ready]
Producer: state: ProducerState; produce(): Item; send(i: Item): void
Consumer: state: ConsumerState; accept(i: Item): void;
consume(i: Item): void
Slide 192
PETRI NET REFERENCES Murata, T. (1989, April). Petri nets:
properties, analysis and applications. Proceedings of the IEEE,
77(4), 541-80. Peterson, J.L. (1981). Petri Net Theory and the
Modeling of Systems. Prentice-Hall. Reisig, W and G. Rozenberg
(eds) (1998). Lectures on Petri Nets 1: Basic Models.
Springer-Verlag. The World of Petri nets:
http://www.daimi.au.dk/PetriNets/
Slide 193
SECTION 6 INTRODUCTION TO COMPILERS
Slide 194
TOPICS Natural languages Lexemes or lexical entities Syntax and
semantics Computer languages Lexical analysis Syntax analysis
Semantic analysis Compilers Compilers basic requirements
Compilation process
Slide 195
NATURAL LANGUAGES BASICS In a (natural) language: A sentence is
a sequence of words A word (also called a lexeme or lexical unit)
is a sequence of characters (possibly a single one) The set of
characters used in a language is finite (known as the alphabet) The
set of possible sentences in a language is infinite A dictionary
lists all the words (lexemes) of a language The words are
classified into different lexical categories: verb, noun, pronoun,
preposition.
Slide 196
NATURAL LANGUAGES BASICS A grammar (also considered the set of
syntax rules) determines which sequences of words are well formed
Sequences must have a structure that obeys the grammatical rules
Well formed sentences usually have a meaning that humans
understand We are trying to teach our natural languages to machines
With mixed results!!
Slide 197
ANALYSIS OF SENTENCES Lexical Analysis: identification of words
made up of characters Words are classified into several categories:
articles, nouns, verbs, adjectives, prepositions, pronouns Syntax
analysis: rules for combining words to form sentences Analysis of
meaning: difficult to formalize Easily done by humans Gives
machines a hard time (although natural language processing is
evolving) Big research field for those interested in graduate
studies
Slide 198
COMPUTER LANGUAGE PROCESSING In computer (or programming)
languages, one speaks about a program (corresponding to a long
sentence or paragraph) Sequence of lexical units or lexemes Lexical
units are sequences of characters Lexical rules of the language
determine what the valid lexical units of the language are There
are various lexical categories: identifier, number, character
string, operator Lexical categories are also known as tokens
Slide 199
COMPUTER LANGUAGE PROCESSING Syntax rules of the language
determine what sequences of lexemes are well-formed programs
Meaning of a well-formed program is also called its semantics A
program can be well-formed, but its statements are nonsensical
Example: int x = 0; x = 1; x = 0; Syntactically, the above code is
valid, but what does it mean??
Slide 200
COMPUTER LANGUAGE PROCESSING Compilers should catch and
complain about lexical and syntax errors Compilers might complain
about common semantic errors:
public boolean test(int x) {
    boolean result;
    if (x > 100)
        result = true;
    return result;
}
Error message: The local variable result may not have been
initialized
Your coworkers or the client will complain about the rest!!
Slide 201
COMPILERS What is a compiler? Program that translates an
executable program in one language into an executable program in
another language We expect the program produced by the compiler to
be better, in some way, than the original What is an interpreter?
Program that reads an executable program and produces the results
of running that program We will focus on compilers in this course
(although many of the concepts apply to both)
Slide 202
BASIC REQUIREMENTS FOR COMPILERS Must-Dos: Produce correct code
(byte code in the case of Java) Run fast Output must run fast
Achieve a compile time proportional to the size of the program Work
well with debuggers (absolute must) Must-Haves: Good diagnostics
for lexical and syntax errors Support for cross language calls
(check out the Java Native Interface if you are interested)
Slide 203
ABSTRACT VIEW OF COMPILERS A compiler usually realizes the
translation in several steps; correspondingly, it contains several
components. Usually, a compiler includes (at least) separate
components for verifying the lexical and syntax rules:
Slide 204
COMPILATION PROCESS Source Program -> Lexical Analyser -> Syntax
Analyser -> Semantic Analyser -> Intermediate Code Generator -> Code
Optimizer -> Code Generator -> Machine Code
Slide 205
COMPILATION PROCESS More than one course is required to cover
the details of the various phases In this course, we will scratch
the surface We will focus on lexical and syntax analysis
Slide 206
SOME IMPORTANT DEFINITIONS These definitions, although sleep
inducing, are important in order to understand the concepts that
will be introduced in the next lectures So here we go
Slide 207
ALPHABET Recall from beginning of the lecture (or
kindergarten): an alphabet is the set of characters that can be
used to form a sentence Since mathematicians love fancy Greek
symbols, we will refer to an alphabet as Σ
Slide 208
ALPHABET Σ is an alphabet, or set of terminals It is a finite set
and consists of all the input characters or symbols that can be
arranged to form sentences in the language English: A to Z,
punctuation and space symbols Programming language: usually some
well-defined character set such as ASCII
Slide 209
STRINGS OF TERMINALS IN AN ALPHABET Σ = {a,b,c,d} Possible strings
of terminals from Σ include: aaa, aabbccdd, d, cba, abab,
ccccccccccacccc Although this is fun, I think you get the idea
Slide 210
FORMAL LANGUAGES Σ: alphabet, it is a finite set consisting of
all input characters or symbols Σ*: closure of the alphabet, the
set of all possible strings in Σ, including the empty string ε A
(formal) language is some specified subset of Σ*
Slide 211
SECTION 7 LEXICAL ANALYSIS
Slide 212
TOPICS The role of the lexical analyzer Specification of tokens
Finite state machines From regular expressions to an NFA
Slide 213
THE ROLE OF LEXICAL ANALYZER Lexical analyzer is the first
phase of a compiler Task: read input characters and produce a
sequence of tokens that the parser uses for syntax analysis Remove
white spaces
[Diagram: Source Program -> Lexical Analyser (scanner) -> token ->
Syntax Analyser (parser); the parser asks the scanner with "Get
next token"]
Slide 214
LEXICAL ANALYSIS There are several reasons for separating the
analysis phase of compiling into lexical analysis and syntax
analysis (parsing): Simpler (layered) design Compiler efficiency
Specialized tools have been designed to help automate the
construction of both separately
Slide 215
LEXEMES Lexeme: sequence of characters in the source program
that is matched by the pattern for a token A lexeme is a basic
lexical unit of a language Lexemes of a programming language
include its Identifiers: names of variables, methods, classes,
packages and interfaces Literals: fixed values (e.g. 1, 17.56,
0xFFE ) Operators: for Maths, Boolean and logical operations (e.g.
+, -, &&, | ) Special words: keywords (e.g. if, for, public
)
Slide 216
TOKENS, PATTERNS, LEXEMES Token: category of lexemes A pattern
is a rule describing the set of lexemes that can represent a
particular token in the source program
Slide 217
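EXAMPLES OF TOKENS double pi = 3.1416; The substring pi is a
lexeme for the token identifier.
Token: type. Sample lexemes: double.
Token: if.
Token: boolean_operator. Sample lexemes: <, <=, ==, >, >=. Informal
description of pattern: < or <= or == or > or >=.
Token: id. Sample lexemes: pi, count, d2. Informal description of
pattern: letter followed by letters and digits.
Token: literal. Sample lexemes: 3.1414, test. Informal description
of pattern: any alphanumeric string of characters.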
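To make the token/lexeme distinction concrete, here is a small scanner sketch in Java that splits the statement double pi = 3.1416; into (token, lexeme) pairs and skips white space. It is only an illustration: the class ScannerSketch, the token names operator and separator, and the tiny set of keywords and operators are assumptions made for this example, not part of the course material.

```java
import java.util.*;
import java.util.regex.*;

// Tiny scanner sketch: one regular expression with a named group per token.
// Token names and the supported lexemes are illustrative assumptions.
class ScannerSketch {
    private static final Pattern TOKEN = Pattern.compile(
          "\\s*(?:(?<type>(?:double|int|boolean)\\b)"
        + "|(?<id>[A-Za-z][A-Za-z0-9]*)"
        + "|(?<literal>\\d+(?:\\.\\d+)?)"
        + "|(?<operator>>=|<=|==|[<>=+\\-*/])"
        + "|(?<separator>;))");
    private static final String[] KINDS = { "type", "id", "literal", "operator", "separator" };

    // Returns entries of the form "token:lexeme"
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                // Only trailing white space left: done. Otherwise report a lexical error.
                if (source.substring(pos).trim().isEmpty()) break;
                throw new IllegalArgumentException("Lexical error at position " + pos);
            }
            for (String kind : KINDS)
                if (m.group(kind) != null) tokens.add(kind + ":" + m.group(kind));
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling ScannerSketch.tokenize("double pi = 3.1416;") classifies double as a type, pi as an id, = as an operator, 3.1416 as a literal and ; as a separator. The keyword alternative is listed before the identifier alternative so that double is not reported as an id.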
LEXICAL ERRORS Few errors are discernible at the lexical level
alone Lexical analyzer has a very localized view of a source
program Let some other phase of compiler handle any error
Slide 220
SPECIFICATION OF TOKENS We need a powerful notation to specify
the patterns for the tokens Regular expressions to the rescue!! In
the process of studying regular expressions, we will discuss:
Operations on languages, Regular definitions, Notational shorthands
Slide 221
RECALL: LANGUAGES Σ: alphabet, it is a finite set consisting of
all input characters or symbols Σ*: closure of the alphabet, the
set of all possible strings in Σ, including the empty string ε A
(formal) language is some specified subset of Σ*
Slide 222
OPERATIONS ON LANGUAGES
Slide 223
Non-mathematical format: Union between languages L and M: the
set of strings that belong to at least one of the two languages
Concatenation of languages L and M: the set of all strings of the
form st where s is a string from L and t is a string from M
Intersection between languages L and M: the set of all strings
which are contained in both languages Kleene closure (named after
Stephen Kleene): the set of all strings that are concatenations of
0 or more strings from the original language Positive closure : the
set of all strings that are concatenations of 1 or more strings
from the original language
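For finite languages, these operations map directly onto set operations. The Java sketch below is an illustration only; the class LanguageOpsSketch and its method names are hypothetical. Since the Kleene closure of a non-empty language is infinite, closureUpTo only collects concatenations of at most n strings.

```java
import java.util.*;

// Operations on finite languages represented as sets of strings (sketch).
class LanguageOpsSketch {
    static Set<String> union(Set<String> l, Set<String> m) {
        Set<String> result = new TreeSet<>(l);
        result.addAll(m);
        return result;
    }

    static Set<String> intersection(Set<String> l, Set<String> m) {
        Set<String> result = new TreeSet<>(l);
        result.retainAll(m);
        return result;
    }

    // All strings st where s is in L and t is in M
    static Set<String> concat(Set<String> l, Set<String> m) {
        Set<String> result = new TreeSet<>();
        for (String s : l) for (String t : m) result.add(s + t);
        return result;
    }

    // Finite approximation of the Kleene closure: concatenations of 0..n strings of L
    static Set<String> closureUpTo(Set<String> l, int n) {
        Set<String> power = new TreeSet<>(Collections.singleton("")); // L^0 = { empty string }
        Set<String> result = new TreeSet<>(power);
        for (int i = 1; i <= n; i++) {
            power = concat(power, l);   // L^i = L^(i-1) L
            result.addAll(power);
        }
        return result;
    }
}
```

With L = {a, b} and M = {b, c}, the union is {a, b, c}, the intersection is {b}, and the concatenation LM is {ab, ac, bb, bc}. The positive closure differs from the Kleene closure only in starting from one string instead of zero.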
Slide 224
REGULAR EXPRESSIONS A regular expression is a compact notation
for describing strings. In Java, an identifier is a letter followed
by zero or more letters or digits: letter (letter | digit)*
| : or
* : zero or more instances of
Slide 225
RULES ε is a regular expression that denotes {ε}, the set
containing the empty string If a is a symbol in Σ, then a is a
regular expression that denotes {a}, the set containing the string a
Suppose r and s are regular expressions denoting the languages L and
M, then (r)|(s) is a regular expression denoting L ∪ M. (r)(s) is
a regular expression denoting LM. (r)* is a regular expression
denoting (L)*.
Slide 226
PRECEDENCE CONVENTIONS The unary operator * has the highest
precedence and is left associative. Concatenation has the second
highest precedence and is left associative. | has the lowest
precedence and is left associative. Example: (a)|((b)*(c)) is
equivalent to a|b*c
Slide 227
EXAMPLE OF REGULAR EXPRESSIONS
Slide 228
PROPERTIES OF REGULAR EXPRESSION
Slide 229
REGULAR DEFINITIONS If Σ is an alphabet of basic symbols, then a
regular definition is a sequence of definitions of the form:
d1 -> r1
d2 -> r2
...
dn -> rn
Where each di is a distinct name, and each ri is a regular
expression over the symbols in Σ ∪ {d1, d2, ..., di-1}, i.e., the
basic symbols and the previously defined names.
Slide 230
EXAMPLE OF REGULAR DEFINITIONS
Slide 231
NOTATIONAL SHORTHANDS Certain constructs occur so frequently in
regular expressions that it is convenient to introduce notational
shorthands for them We have already seen some of these
shorthands: 1. One or more instances: a+ denotes the set of all
strings of one or more a's 2. Zero or more instances: a* denotes all
the strings of zero or more a's 3. Character classes: the notation
[abc], where a, b and c are alphabet symbols, denotes the regular
expression a | b | c 4. Abbreviated character classes: the notation
[a-z] denotes the regular expression a | b | ... | z
Slide 232
NOTATIONAL SHORTHANDS Using character classes, we can describe
identifiers as being strings described by the following regular
expression: [A-Za-z][A-Za-z0-9]*
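This pattern can be tried directly with Java's java.util.regex package. The sketch below is illustrative; the class name IdentifierRegexSketch is made up for the example. Keep in mind that the pattern is the simplified definition used in these slides: real Java identifiers may also contain the characters _ and $.

```java
import java.util.regex.Pattern;

// Checks a string against the simplified identifier pattern from the slides.
class IdentifierRegexSketch {
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // matches() requires the WHOLE string to fit the pattern
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }
}
```

For instance, count and d2 are accepted, while 2d (starts with a digit) and the empty string are rejected.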
Slide 233
FINITE STATE AUTOMATA Now that we have learned about regular
expressions How can we tell if a string (or lexeme) follows a
regular expression pattern or not? We will again use state
machines! This time, they are not UML state machines or Petri nets
We will call them: Finite Automata The program that executes such
state machines is called a Recognizer
Slide 234
SHORT BREAK
Slide 235
FINITE AUTOMATA A recognizer for a language is a program that
takes as input a string x and answers Yes if x is a lexeme of the
language, and No otherwise We compile a regular expression into a
recognizer by constructing a generalized transition diagram called
a finite automaton A finite automaton can be deterministic or
nondeterministic Nondeterministic means that more than one
transition out of a state may be possible on the same input
symbol
Slide 236
NONDETERMINISTIC FINITE AUTOMATA (NFA) A set of states S A set
of input symbols that belong to the alphabet Σ A set of transitions
that are triggered by the processing of a character A single state
s0 that is distinguished as the start (initial) state A set of
states F distinguished as accepting (final) states.
Slide 237
EXAMPLE OF AN NFA The following regular expression (a|b)*abb
Can be described using an NFA with the following diagram:
Slide 238
EXAMPLE OF AN NFA The previous diagram can be described using
the following table as well Remember the regular expression was:
(a|b)*abb
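An NFA can be simulated by tracking the set of all states it could currently be in. The Java sketch below assumes the usual four-state textbook NFA for (a|b)*abb, with start state 0 and accepting state 3 (state 0 loops on a and b, then a, b, b lead to states 1, 2 and 3); the diagram is not reproduced in this text, so the state numbering is an assumption. The class name NfaSketch is hypothetical.

```java
import java.util.*;

// Simulation of an NFA without epsilon-transitions (sketch).
// Assumed NFA for (a|b)*abb: 0 -a-> {0,1}, 0 -b-> {0}, 1 -b-> {2}, 2 -b-> {3}
class NfaSketch {
    // DELTA.get(state) maps an input symbol to the SET of possible next states
    static final List<Map<Character, Set<Integer>>> DELTA = new ArrayList<>();
    static {
        for (int i = 0; i < 4; i++) DELTA.add(new HashMap<>());
        DELTA.get(0).put('a', new HashSet<>(Arrays.asList(0, 1)));
        DELTA.get(0).put('b', new HashSet<>(Arrays.asList(0)));
        DELTA.get(1).put('b', new HashSet<>(Arrays.asList(2)));
        DELTA.get(2).put('b', new HashSet<>(Arrays.asList(3)));
    }

    static boolean accepts(String input) {
        Set<Integer> current = new HashSet<>(Arrays.asList(0));   // start state
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int s : current)
                next.addAll(DELTA.get(s).getOrDefault(c, Collections.emptySet()));
            current = next;
        }
        return current.contains(3);   // accepted if some path ends in the final state
    }
}
```

The string is accepted when at least one of the possible paths ends in an accepting state, which is exactly what nondeterminism means here.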
Slide 239
ANOTHER NFA EXAMPLE NFA accepting the following regular
expression: aa*|bb*
Slide 240
DETERMINISTIC FINITE AUTOMATA (DFA) A DFA is a special case of
an NFA in which No state has an ε-transition For each state s and
input symbol a, there is at most one edge labeled a leaving s
Slide 241
ANOTHER DFA EXAMPLE For the same regular expression we have
seen before (a|b)*abb
Slide 242
NFA VS DFA Always with the regular expression: (a|b)*abb NFA:
DFA:
Slide 243
EXAMPLE OF A DFA Recognizer for identifier:
Slide 244
TABLES FOR THE RECOGNIZER To change regular expression, we can
simply change tables
Slide 245
CODE FOR THE RECOGNIZER
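A table-driven recognizer for identifiers could look like the following Java sketch. It is an illustration only: the class DfaRecognizerSketch, the state numbering (0 = start, 1 = inside an identifier) and the three character classes are assumptions made for this example. The point is that the loop never changes; only the tables encode the regular expression.

```java
// Table-driven DFA recognizer for: letter (letter | digit)*  (sketch)
class DfaRecognizerSketch {
    // Character classes (table columns): 0 = letter, 1 = digit, 2 = anything else
    static int charClass(char c) {
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return 0;
        if (c >= '0' && c <= '9') return 1;
        return 2;
    }

    // Transition table (rows = states); -1 means "no transition" (error)
    static final int[][] NEXT = {
        { 1, -1, -1 },   // state 0: start, only a letter is allowed
        { 1,  1, -1 },   // state 1: in identifier, letters and digits are allowed
    };
    static final boolean[] ACCEPTING = { false, true };

    static boolean recognize(String s) {
        int state = 0;
        for (int i = 0; i < s.length() && state != -1; i++)
            state = NEXT[state][charClass(s.charAt(i))];
        return state != -1 && ACCEPTING[state];
    }
}
```

To recognize a different regular expression, one would replace NEXT, ACCEPTING and charClass and leave recognize untouched.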
Slide 246
SECTION 8 FINITE STATE AUTOMATA
Slide 247
TOPICS Algorithm to create NFAs from regular expressions
Algorithm to convert from NFA to DFA Algorithm to minimize DFA Many
examples.
Slide 248
CREATING DETERMINISTIC FINITE AUTOMATA (DFA) In order to create
a DFA, we have to perform the following: Create a Non-deterministic
Finite Automaton (NFA) out of the regular expression Convert the NFA
into a DFA
Slide 249
NFA CREATION RULES A | B AB A* 12 A B 3 23 41 A A B 23 45
16
Slide 250
NFA CREATION EXAMPLES x | yz According to precedence rules,
this is equivalent to: x | (yz) This has the same form as A | B:
And B can be represented as: Putting all together: 12 y z 3 16 A B
23 45 y z 1 x 23 46 5 7
Slide 251
NFA CREATION EXAMPLES (x | y)* We have seen A*: Therefore, (x |
y)*: A 8 1 27 x y 34 56
Slide 252
NFA CREATION EXAMPLES abb
Slide 253
NFA CREATION EXAMPLES a*bb 23 41 a 5 b b 6
Slide 254
NFA CREATION EXAMPLES (a|b)*bc 70 16 a b 23 45 cb 8 9 9
Slide 255
CONVERSION OF AN NFA INTO DFA Subset construction algorithm is
useful for simulating an NFA by a computer program In the
transition table of an NFA, each entry is a set of states In the
transition table of a DFA, each entry is just a single state.
General idea behind the NFA-to-DFA conversion: each DFA state
corresponds to a set of NFA states
Slide 256
SUBSET CONSTRUCTION ALGORITHM Algorithm: Subset Construction -
Used to construct a DFA from an NFA Input: An NFA N Output: A DFA D
accepting the same language
Slide 257
SUBSET CONSTRUCTION ALGORITHM Method: Let s be a state in N and
T be a set of states, and using the following operations:
Slide 258
SUBSET CONSTRUCTION (MAIN ALGORITHM)
Slide 259
SUBSET CONSTRUCTION (ε-CLOSURE COMPUTATION)
Slide 260
CONVERSION EXAMPLE Regular Expression: (x | y)*
Dstates={A,B,C}, where A = (1,2,3,5,8) B = (2,3,4,5,7,8)
C = (2,3,5,6,7,8)
[NFA diagram with states 1 to 8 and edges labelled x and y]
Transition table:
State  x  y
A      B  C
B      B  C
C      B  C
Slide 261
CONVERSION EXAMPLE Regular Expression: (x | y)*
[DFA diagram with states A, B, C and edges labelled x and y]
Transition table:
State  x  y
A      B  C
B      B  C
C      B  C
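The Dstates A, B and C of this example can be reproduced with a short subset construction sketch in Java. The NFA edges below are an assumption: they are the ε-edges and labelled edges that the NFA creation rules give for (x | y)* with states numbered 1 to 8, chosen so that the ε-closure of state 1 is (1,2,3,5,8). The class SubsetConstructionSketch and its method names are hypothetical.

```java
import java.util.*;

// Subset construction sketch: each DFA state is a set of NFA states.
class SubsetConstructionSketch {
    final Map<Integer, Set<Integer>> eps = new HashMap<>();                   // epsilon edges
    final Map<Integer, Map<Character, Set<Integer>>> delta = new HashMap<>(); // labelled edges

    void addEps(int from, int to) {
        eps.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
    }

    void addEdge(int from, char c, int to) {
        delta.computeIfAbsent(from, k -> new HashMap<>())
             .computeIfAbsent(c, k -> new TreeSet<>()).add(to);
    }

    // All states reachable from `states` using epsilon edges only
    Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new TreeSet<>(states);
        Deque<Integer> stack = new ArrayDeque<>(states);
        while (!stack.isEmpty()) {
            int s = stack.pop();
            for (int t : eps.getOrDefault(s, Collections.emptySet()))
                if (closure.add(t)) stack.push(t);
        }
        return closure;
    }

    // All states reachable from `states` by one edge labelled c
    Set<Integer> move(Set<Integer> states, char c) {
        Set<Integer> result = new TreeSet<>();
        for (int s : states)
            result.addAll(delta.getOrDefault(s, Collections.emptyMap())
                               .getOrDefault(c, Collections.emptySet()));
        return result;
    }

    // DFA transition table: DFA state -> input symbol -> DFA state
    Map<Set<Integer>, Map<Character, Set<Integer>>> toDfa(int start, char[] alphabet) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> unmarked = new ArrayDeque<>();
        Set<Integer> s0 = epsClosure(Collections.singleton(start));
        dfa.put(s0, new TreeMap<>());
        unmarked.add(s0);
        while (!unmarked.isEmpty()) {
            Set<Integer> t = unmarked.remove();
            for (char c : alphabet) {
                Set<Integer> u = epsClosure(move(t, c));
                if (u.isEmpty()) continue;                 // no transition on c
                if (!dfa.containsKey(u)) { dfa.put(u, new TreeMap<>()); unmarked.add(u); }
                dfa.get(t).put(c, u);
            }
        }
        return dfa;
    }

    // Assumed NFA for (x | y)*, states 1..8, start state 1
    static SubsetConstructionSketch xOrYStar() {
        SubsetConstructionSketch n = new SubsetConstructionSketch();
        n.addEps(1, 2); n.addEps(1, 8); n.addEps(2, 3); n.addEps(2, 5);
        n.addEdge(3, 'x', 4); n.addEdge(5, 'y', 6);
        n.addEps(4, 7); n.addEps(6, 7); n.addEps(7, 2); n.addEps(7, 8);
        return n;
    }
}
```

Running xOrYStar().toDfa(1, new char[] {'x', 'y'}) yields three DFA states, matching A, B and C, and every state goes to B on x and to C on y, as in the transition table.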
Slide 262
ANOTHER CONVERSION EXAMPLE Regular Expression: (a | b)*abb
Slide 263
ANOTHER CONVERSION EXAMPLE Regular Expression: (a | b)*abb
Slide 264
ANOTHER CONVERSION EXAMPLE Regular Expression: (a | b)*abb
Slide 265
MINIMIZING THE NUMBER OF STATES IN DFA Minimize the number of
states of a DFA by finding all groups of states that can be
distinguished by some input string Each group of states that cannot
be distinguished is then merged into a single state
Slide 266
MINIMIZING THE NUMBER OF STATES IN DFA Algorithm: Minimizing
the number of states of a DFA Input: A DFA D with a set of states S
Output: A DFA M accepting the same language as D yet having as few
states as possible
Slide 267
MINIMIZING THE NUMBER OF STATES IN DFA Method:
1. Construct an initial partition Π of the set of states with two
groups: The accepting states group All other states group
2. Partition Π to Πnew (using the procedure shown on the next slide)
3. If Πnew != Π, let Π = Πnew and repeat step (2). Otherwise, go to
step (4)
4. Choose one state in each group of the partition Π as the
representative of the group
5. Remove dead states
Slide 268
CONSTRUCT NEW PARTITION PROCEDURE
for each group G of Π do begin
    Partition G into subgroups such that two states s and t of G are in the same subgroup if and only if, for all input symbols a, states s and t have transitions on a to states in the same group of Π;
    /* at worst, a state will be in a subgroup by itself */
    Replace G in Πnew by the set of all subgroups formed
end
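The refinement loop can be sketched in Python as follows. The function name minimize and the sample DFA are assumptions: the transition table is the commonly used textbook DFA for (a | b)*abb with states A to E, which is not given in these slides. The sketch covers steps 1 to 3 only (it returns the final partition) and assumes the transition function is total.

```python
def minimize(states, alphabet, delta, accepting):
    """Refine the partition until it stops changing; return a list of groups."""
    partition = [g for g in (set(accepting), set(states) - set(accepting)) if g]
    while True:
        group_of = {s: i for i, g in enumerate(partition) for s in g}
        new_partition = []
        for group in partition:
            # Two states stay together only if, on every input symbol,
            # they move into the same group of the current partition.
            buckets = {}
            for s in sorted(group):
                key = tuple(group_of[delta[(s, a)]] for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
        # Refinement only ever splits groups, so an equal count means no change.
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition

# Assumed example DFA for (a | b)*abb, accepting state E.
DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

groups = minimize("ABCDE", "ab", DELTA, {"E"})
print(sorted(sorted(g) for g in groups))
```

For this DFA the states A and C end up in the same group, so the minimized automaton needs four states instead of five. Picking representatives and removing dead states (steps 4 and 5) are left out of the sketch.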
Slide 269
EXAMPLE OF DFA MINIMIZATION
[Figure: DFA with states A to F and transitions labelled a, b and c]
Successive partitions:
{A, B, C, D, E} {F}
{A} {B, C, D, E} {F}
{A} {B, C, D, E} {F}
Slide 270
EXAMPLE OF DFA MINIMIZATION
Minimized DFA, where:
1: A
2: B, C, D, E
3: F
[Figure: minimized DFA with states 1, 2 and 3 and transitions labelled a, b and c]
Slide 271
SECTION 9 PRACTICAL REGULAR EXPRESSIONS
Slide 272
TOPICS
Practical notations that are often used with regular expressions
A few practice exercises
Slide 273
PRACTICAL REGULAR EXPRESSIONS TRICKS
We will see practical regular expression tricks that are supported by most regex libraries.
Remember, regular expressions are not only used in the context of compilers. We often use them to extract information from text.
Example: imagine searching a log file that has been accumulating entries for the past two months for a particular error pattern. Without regular expressions, this would be a tedious job.
Sooner or later, when you work in industry, you will encounter such problems, and regular expressions will come in handy.
Slide 274
MATCHING DIGITS
To match a single digit, as we have seen before, we can use the following regular expression: [0-9]
Nonetheless, since matching a digit is a common operation, we can use the following notation: \d
The backslash is an escape character used to distinguish \d from the letter d.
Similarly, to match a non-digit character, we can use the notation: \D
Slide 275
ALPHANUMERIC CHARACTERS
To match an alphanumeric character, we can use the notation: [a-zA-Z0-9]
Or we can use the following shortcut: \w (in most regex libraries \w also matches the underscore, i.e. [a-zA-Z0-9_])
Similarly, we can represent any non-alphanumeric character as follows: \W
Slide 276
WILDCARD
A wildcard is defined to match any single character (letter, digit, whitespace, ...). In most regex libraries it does not match a newline by default.
It is represented by the . (dot) character.
Therefore, in order to match a literal dot, you have to use the escape character: \.
Slide 277
EXCLUSION
We have seen that [abc] is equivalent to (a | b | c).
But sometimes we want to match everything except a set of characters.
To achieve this, we can use the notation: [^abc]
This matches any single character other than a, b or c.
This notation can also be used with abbreviated character classes: [^a-z] matches any character other than a lowercase letter.
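The shortcuts for digits, alphanumeric characters, the wildcard and exclusion can be tried out with Python's standard re module. This is only a small sketch; the sample strings are made up for illustration.

```python
import re

text = "Room 42b, floor_3."

digits = re.findall(r"\d", text)                    # every single digit
non_digits = re.findall(r"\D", "a1b2")              # everything except digits
word_chars = re.findall(r"\w", "a_1-!")             # letters, digits and underscore
non_word = re.findall(r"\W", "a_1-!")               # the remaining characters
any_middle = re.findall(r"a.c", "abc a-c a.c ac")   # dot matches any one character
literal_dot = re.findall(r"a\.c", "abc a-c a.c ac") # escaped dot matches only "."
not_abc = re.findall(r"[^abc]", "cabbage")          # anything other than a, b, c

for name, value in [("digits", digits), ("non_digits", non_digits),
                    ("word_chars", word_chars), ("non_word", non_word),
                    ("any_middle", any_middle), ("literal_dot", literal_dot),
                    ("not_abc", not_abc)]:
    print(name, value)
```

Note that in Python \w accepts the underscore, which is why "_" appears among the word characters.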
Slide 278
REPETITIONS
How can we match a letter or a string that repeats several times in a row? E.g. ababab
So far, we have implemented repetitions through three mechanisms:
Concatenation: simply concatenate the string or character with itself (does not work if you do not know the exact number of repetitions)
Kleene star closure: to match letters or strings repeated 0 or more times
Positive closure: to match letters or strings repeated 1 or more times
Slide 279
REPETITIONS
We can also specify a range for how many times a letter or string can be repeated.
Example: to match strings made of the letter a repeated between 1 and 3 times, we can use the notation: a{1,3}
Therefore, a{1,3} matches the strings a, aa and aaa.
We can also specify an exact number of repetitions instead of a range: (ab){3} matches the string ababab. The parentheses are needed because a repetition operator applies only to the item immediately before it, so ab{3} matches abbb.
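The repetition operators can be checked with Python's re module. A minimal sketch follows; re.fullmatch is used so that the whole string has to match the pattern, and the helper name matches is an illustrative choice.

```python
import re

def matches(pattern, string):
    """True when the entire string matches the pattern."""
    return re.fullmatch(pattern, string) is not None

# Range of repetitions: between 1 and 3 copies of the letter a.
range_results = [matches(r"a{1,3}", s) for s in ("", "a", "aa", "aaa", "aaaa")]

# Exact count: the group (ab) repeated three times.
group_exact = matches(r"(ab){3}", "ababab")

# Without the group, {3} applies to the letter b only.
letter_exact = (matches(r"ab{3}", "abbb"), matches(r"ab{3}", "ababab"))

# Kleene star (0 or more) and positive closure (1 or more) on the empty string.
star_plus = (matches(r"(ab)*", ""), matches(r"(ab)+", ""))

print(range_results, group_exact, letter_exact, star_plus)
```

The third check shows why grouping matters: a repetition operator binds to the single preceding item, so a whole string has to be wrapped in parentheses before it can be repeated.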
Slide 280
OPTIONAL CHARACTERS
The concept of the optional character is somewhat similar to that of the Kleene star.
The star operator matches 0 or more instances of the operand.
The optional operator, denoted as ? (question mark), matches zero or one instance of the operand.
Example: the pattern ab?c will match either of the strings "abc" or "ac" because the b is considered optional.
Slide 281
WHITE SPACE
Often, we want to easily detect white space, either to remove it or to detect the beginning or end of words.
The most common forms of whitespace used with regular expressions are the space ␣, the tab \t, the new line \n and the carriage return \r.
The whitespace special character \s will match any of the specific whitespace characters above.
Similarly, you can match any non-whitespace character using the notation \S
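As a small illustration with Python's re module, \s can be used to split on or strip whitespace, and \S to pick out the runs of visible characters. The sample strings are made up for this sketch.

```python
import re

line = "Error,\tcomputer  will not\r\nshut down"

words = re.split(r"\s+", line)            # split on runs of whitespace
visible = re.findall(r"\S+", line)        # runs of non-whitespace characters
stripped = re.sub(r"\s", "", " a b\tc\n") # remove every whitespace character

print(words)
print(visible)
print(repr(stripped))
```

Here splitting on \s+ and collecting \S+ give the same list because the line neither starts nor ends with whitespace.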
Slide 282
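FEW EXERCISES
Given the sentence: Error, computer will not shut down
Provide a regular expression that will match all the words in the sentence.
Answer: \w+ (\w* also matches every word, but because the star allows zero repetitions it matches the empty string as well)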
Slide 283
FEW EXERCISES
Given the sentence: Error, computer will not shut down
Provide a regular expression that will match all the non-alphanumeric characters.
Answer: \W+ (\W* also works, but it additionally matches the empty string)
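The two sentence exercises can be tried with Python's re.findall. The sketch uses \w+ rather than \w* so that empty matches are not reported, and plain \W so that each non-alphanumeric character is listed separately; both choices are assumptions made for readable output.

```python
import re

sentence = "Error, computer will not shut down"

words = re.findall(r"\w+", sentence)     # maximal runs of word characters
non_alnum = re.findall(r"\W", sentence)  # each non-alphanumeric character

print(words)
print(non_alnum)
```

The second list contains the comma followed by the five spaces that separate the words.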
Slide 284
FEW EXERCISES
Given the log file:
[Sunday Feb. 2 2014] Program starting up
[Monday Feb. 3 2014] Entered initialization phase
[Tuesday Feb. 4 2014] Error 5: cannot open XML file
[Thursday Feb. 6 2014] Warning 5: response time is too slow
[Friday Feb. 7 2014] Error 9: major error occurred, system will shut down
Match any error or warning message that ends with the term shut down.
Answer: (Error|Warning).*(shut down)
Slide 285
FEW EXERCISES
Given the log file:
[Sunday Feb. 2 2014] Program starting up
[Monday Feb. 3 2014] Entered initialization phase
[Tuesday Feb. 4 2014] Error 5: cannot open XML file
[Thursday Feb. 6 2014] Warning 5: response time is too slow
[Friday Feb. 7 2014] Error 9: major error occurred, system will shut down
Match any Error or Warning logged between the 1st and the 6th of February.
Answer: \[\w* Feb\. [1-6] 2014\] (Error|Warning)
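Both log-file answers can be run against the sample log with Python's re module. This is a sketch: the log is searched line by line, and the variable names are illustrative.

```python
import re

LOG = """\
[Sunday Feb. 2 2014] Program starting up
[Monday Feb. 3 2014] Entered initialization phase
[Tuesday Feb. 4 2014] Error 5: cannot open XML file
[Thursday Feb. 6 2014] Warning 5: response time is too slow
[Friday Feb. 7 2014] Error 9: major error occurred, system will shut down
"""

shutdown = re.compile(r"(Error|Warning).*(shut down)")
early = re.compile(r"\[\w* Feb\. [1-6] 2014\] (Error|Warning)")

lines = LOG.splitlines()
shutdown_hits = [m.group(0) for line in lines if (m := shutdown.search(line))]
early_hits = [m.group(0) for line in lines if (m := early.search(line))]

print(shutdown_hits)
print(early_hits)
```

The first pattern reports only the message from Feb. 7, and the second reports the entries from Feb. 4 and Feb. 6. The entry from Feb. 7 is rejected by [1-6], and the first two entries are rejected because they are neither errors nor warnings.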
Slide 286
SECTION 10 INTRODUCTION TO SYNTAX ANALYSIS
Slide 287
TOPICS
Context free grammars
Derivations
Parse trees
Ambiguity
Top-down parsing
Left recursion
Slide 288
THE ROLE OF PARSER
Slide 289
CONTEXT FREE GRAMMARS
A Context Free Grammar (CFG) consists of:
Terminals
Nonterminals
A start symbol
Productions
A language that can be generated by a context-free grammar is said to be a context-free language.
Slide 290
CONTEXT FREE GRAMMARS
Terminals are the basic symbols from which strings are formed. These are the tokens that were produced by the Lexical Analyser.
Nonterminals are syntactic variables that denote sets of strings.
One nonterminal is distinguished as the start symbol.
The productions of a grammar specify the manner in which the terminals and nonterminals can be combined to form strings.
Slide 291
EXAMPLE OF GRAMMAR
The grammar with the following productions defines simple arithmetic expressions:
expr ::= expr op expr
expr ::= id
expr ::= num
op ::= +
op ::= -
op ::= *
op ::= /
In this grammar, the terminal symbols are num, id, +, -, * and /.
The nonterminal symbols are expr and op, and expr is the start symbol.
Slide 292
DERIVATIONS
expr ⇒ expr op expr is read "expr derives expr op expr".
expr ⇒ expr op expr ⇒ id op expr ⇒ id * expr ⇒ id * id is called a derivation of id*id from expr.
Slide 293
DERIVATIONS
If A ::= γ is a production and α and β are arbitrary strings of grammar symbols, we can say: αAβ ⇒ αγβ
If α1 ⇒ α2 ⇒ ... ⇒ αn, we say α1 derives αn.
Slide 294
DERIVATIONS
⇒ means derives in one step.
⇒* means derives in zero or more steps: if α ⇒* β and β ⇒ γ then α ⇒* γ.
⇒+ means derives in one or more steps.
If S ⇒* α, where α may contain nonterminals, then we say that α is a sentential form.
If α does not contain any nonterminals, we say that α is a sentence.
Slide 295
DERIVATIONS
G: grammar
S: start symbol
L(G): the language generated by G
Strings in L(G) may contain only terminal symbols of G.
A string of terminals w is said to be in L(G) if and only if S ⇒+ w. The string w is called a sentence of G.
A language that can be generated by a context-free grammar is said to be a context-free language.
If two grammars generate the same language, the grammars are said to be equivalent.
Slide 296
DERIVATIONS
We have already seen the following production rules:
expr ::= expr op expr | id | num
op ::= + | - | * | /
The string id+id is a sentence of the above grammar because
expr ⇒ expr + expr ⇒ id + expr ⇒ id + id
We write expr ⇒* id+id
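A derivation can be replayed mechanically. The Python sketch below stores the expression grammar as a dictionary and always rewrites the leftmost nonterminal; the names GRAMMAR and leftmost_step are illustrative, and the sequence of productions is chosen by hand. Unlike the shorthand on the slide, every single step is shown, including the one that replaces op by +.

```python
GRAMMAR = {
    "expr": [["expr", "op", "expr"], ["id"], ["num"]],
    "op": [["+"], ["-"], ["*"], ["/"]],
}

def leftmost_step(form, production):
    """Replace the leftmost nonterminal of `form` by `production`."""
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:
            if production not in GRAMMAR[symbol]:
                raise ValueError(f"{symbol} ::= {' '.join(production)} is not a production")
            return form[:i] + production + form[i + 1:]
    raise ValueError("the form is already a sentence")

form = ["expr"]
steps = [form]
for production in (["expr", "op", "expr"], ["id"], ["+"], ["id"]):
    form = leftmost_step(form, production)
    steps.append(form)

derivation = " => ".join(" ".join(f) for f in steps)
is_sentence = all(symbol not in GRAMMAR for symbol in form)
print(derivation)
```

Every intermediate list is a sentential form; the last one contains terminals only, so it is a sentence of the grammar.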
Slide 297
PARSE TREE
[Figure: parse tree with nodes expr, op, id and +]
This is called a leftmost derivation.
Slide 298
TWO PARSE TREES
Let us again consider the arithmetic expression grammar.
For the line of code: x+z*y (we are not considering the semicolon for now)
Grammar:
expr ::= expr op expr | id | num
op ::= + | - | * | /
x+z*y → Lexical Analyser → id+id*id → Syntax Analyser → parse tree
Slide 299
TWO PARSE TREES
Let us again consider the arithmetic expression grammar.
Grammar:
expr ::= expr op expr | id | num
op ::= + | - | * | /
The sentence id + id * id has two distinct leftmost derivations:
Derivation 1: expr ⇒ expr op expr ⇒ id op expr ⇒ id + expr ⇒ id + expr op expr ⇒ id + id op expr ⇒ id + id * expr ⇒ id + id * id
Derivation 2: expr ⇒ expr op expr ⇒ expr op expr op expr ⇒ id op expr op expr ⇒ id + expr op expr ⇒ id + id op expr ⇒ id + id * expr ⇒ id + id * id
Slide 300
TWO PARSE TREES
[Figure: the two parse trees for id + id * id]
The first tree is equivalent to: id+(id*id)
The second tree is equivalent to: (id+id)*id
Grammar:
expr ::= expr op expr | id | num
op ::= + | - | * | /
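The number of parse trees for a sentence of this grammar can be counted with a few lines of Python. In this sketch every operator position is tried as the root of the tree, which mirrors the production expr ::= expr op expr; the function name count_trees is illustrative.

```python
from functools import lru_cache

OPS = {"+", "-", "*", "/"}
OPERANDS = {"id", "num"}

@lru_cache(maxsize=None)
def count_trees(tokens):
    """Number of distinct parse trees deriving the token tuple from expr."""
    if len(tokens) == 1:
        return 1 if tokens[0] in OPERANDS else 0
    total = 0
    for i, tok in enumerate(tokens):
        if tok in OPS:
            # expr ::= expr op expr with the operator at position i as the root
            total += count_trees(tokens[:i]) * count_trees(tokens[i + 1:])
    return total

print(count_trees(("id", "+", "id")))
print(count_trees(("id", "+", "id", "*", "id")))
```

A count greater than one for some sentence is exactly what makes a grammar ambiguous. For id + id * id the count is two, one tree for each grouping.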
Slide 301
PRECEDENCE
The previous example highlights a problem in the grammar:
It does not enforce precedence.
It has no implied order of evaluation.
We can expand the production rules to add precedence.
Slide 302
APPLYING PRECEDENCE UPDATE
Grammar:
expr ::= expr + term | expr - term | term
term ::= term * factor | term / factor | factor
factor ::= num | id
The sentence id + id * id has only one leftmost derivation now:
expr ⇒ expr + term ⇒ term + term ⇒ factor + term ⇒ id + term ⇒ id + term * factor ⇒ id + factor * factor ⇒ id + id * factor ⇒ id + id * id
[Figure: the corresponding parse tree, in which id * id forms a term below the + node]
Slide 303
AMBIGUITY
A grammar that produces more than one parse tree for some sentence is said to be ambiguous.
Example: Consider the following statement:
[Figure: example statement]
It has two derivations.
It is a context free ambiguity.
Slide 304
AMBIGUITY
A grammar that produces more than one parse tree for some sentence is said to be ambiguous.
Slide 305
ELIMINATING AMBIGUITY
Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.
E.g. match each else with the closest unmatched then.
This is most likely the intention of the programmer.
Slide 306
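MAPPING THIS TO A JAVA EXAMPLE
In Java, the grammar rules are slightly different than in the previous example.
Below is a (very simplified) version of these rules:
<stmnt> ::= <matched> | <unmatched>
<matched> ::= if ( <expr> ) <matched> else <matched> | other stmnts
<unmatched> ::= if ( <expr> ) <stmnt> | if ( <expr> ) <matched> else <unmatched>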
Slide 307
MAPPING THIS TO A JAVA EXAMPLE
For the following piece of code:
if (x==0) if (y==0) z = 0; else z = 1;
After running the lexical analyser, we get the following list of tokens:
if ( id == num ) if ( id == num ) id = num ; else id = num ;
Slide 308
MAPPING THIS TO A JAVA EXAMPLE
Token input string: if ( id == num ) if ( id == num ) id = num ; else id = num ;
Building the parse tree:
[Figure: parse tree rooted at stmnt. The outer if is unmatched; the inner if ... else is matched, so the else attaches to the closest if]
Slide 309
MAPPING THIS TO A JAVA (ANOTHER) EXAMPLE
Token input string: if ( id == num ) else id = num ;
Building the parse tree:
[Figure: parse tree rooted at stmnt, which derives a matched statement of the form if ( expr ) matched else matched]
Slide 310
TOP DOWN PARSING
A top-down parser starts with the root of the parse tree, labelled with the start or goal symbol of the grammar.
To build a parse tree, we repeat the following steps until the leaves of the parse tree match the input string:
1. At a node labelled A, select a production A ::= α and construct the appropriate child for each symbol of α
2. When a terminal is added to the parse tree that does not match the input string, backtrack
3. Find the next nonterminal to be expanded
Slide 311
TOP DOWN PARSING
Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string.
Example:
Input string: cad
Grammar:
<S> ::= c <A> d
<A> ::= ab | a
We need to backtrack!
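The backtracking in this example can be sketched in Python. The nonterminal names S and A are assumptions (the names did not survive in the slide text); the grammar is taken to be S ::= c A d and A ::= a b | a. The generator tries the productions of a nonterminal in order, and moving on to the next production after a failed match is the backtracking step.

```python
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}

def derive(symbols, text, pos, trace):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        for production in GRAMMAR[head]:
            trace.append(f"{head} ::= {' '.join(production)}")
            for mid in derive(production, text, pos, trace):
                yield from derive(rest, text, mid, trace)
    elif pos < len(text) and text[pos] == head:
        # Terminal matches the next input character: consume it.
        yield from derive(rest, text, pos + 1, trace)

def accepts(text):
    """Return (accepted, list of productions tried in order)."""
    trace = []
    ok = any(end == len(text) for end in derive(["S"], text, 0, trace))
    return ok, trace

ok, trace = accepts("cad")
print(ok, trace)
```

For the input cad the trace shows A ::= a b being tried first; it fails when b is compared with d, and the parser then falls back to A ::= a. This naive scheme would recurse forever on a left-recursive grammar.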
Slide 312
EXPRESSION GRAMMAR
Recall our grammar for simple expressions:
[Figure: the expression grammar]
Consider the input string: id - num * id
Slide 313
EXAMPLE Reference Grammar
Slide 314
EXAMPLE
Another possible parse for id - num * id
If the parser makes the wrong choices, expansion does not terminate.
This is not a good property for a parser to have.
Parsers should terminate, eventually.
Slide 315
LEFT RECURSION
A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α.
Top-down parsers cannot handle left recursion in a grammar.
Slide 316
ELIMINATING LEFT RECURSION
Consider the grammar fragment:
foo ::= foo α | β
where α and β do not start with foo.
We can rewrite this as:
foo ::= β bar
bar ::= α bar | ε
where bar is a new nonterminal.
This fragment contains no left recursion.
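The rewrite for immediate left recursion, foo ::= foo α | β becoming foo ::= β bar and bar ::= α bar | ε, can be sketched as a small Python function. The function name, the list-of-lists representation of productions and the use of an empty list for ε are all assumptions made for illustration.

```python
def eliminate_left_recursion(name, productions, new_name):
    """Remove immediate left recursion from the productions of `name`.

    Each production is a list of symbols; [] stands for epsilon.
    """
    alphas = [p[1:] for p in productions if p and p[0] == name]
    betas = [p for p in productions if not p or p[0] != name]
    if not alphas:
        return {name: productions}  # nothing to do
    return {
        name: [beta + [new_name] for beta in betas],
        new_name: [alpha + [new_name] for alpha in alphas] + [[]],
    }

expr_rules = [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]
result = eliminate_left_recursion("expr", expr_rules, "expr'")
for head, bodies in result.items():
    print(head, "::=", " | ".join(" ".join(b) or "ε" for b in bodies))
```

Applied to the productions of expr from the precedence grammar, the sketch yields expr ::= term expr' and expr' ::= + term expr' | - term expr' | ε. It handles immediate left recursion only, not recursion that passes through several nonterminals.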
Slide 317
EXAMPLE
Our expression grammar contains two cases of left recursion.
Applying the transformation gives:
[Figure: the transformed grammar]
With this grammar, a top-down parser will:
Terminate (for sure)
Backtrack on some inputs
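To illustrate termination, here is a hedged Python sketch of a recursive-descent parser. It assumes the transformed grammar is expr ::= term expr', expr' ::= + term expr' | - term expr' | ε, term ::= factor term', term' ::= * factor term' | / factor term' | ε, factor ::= num | id, which is what the transformation gives for the precedence grammar; the names expr' and term' are assumptions. The tail nonterminals are written as loops, and the sketch picks productions by looking at the next token, so it is a simplified, non-backtracking variant.

```python
def parse(tokens):
    """Parse a token list and return a nested-tuple tree (op, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():  # factor ::= num | id
        tok = peek()
        if tok not in ("num", "id"):
            raise SyntaxError(f"expected num or id at position {pos}, got {tok!r}")
        advance()
        return tok

    def term():  # term ::= factor term'
        left = factor()
        while peek() in ("*", "/"):  # term' ::= * factor term' | / factor term' | ε
            op = peek()
            advance()
            left = (op, left, factor())
        return left

    def expr():  # expr ::= term expr'
        left = term()
        while peek() in ("+", "-"):  # expr' ::= + term expr' | - term expr' | ε
            op = peek()
            advance()
            left = (op, left, term())
        return left

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} at position {pos}")
    return tree

tree = parse(["id", "-", "num", "*", "id"])
print(tree)
```

Every call either consumes a token or returns, so the parser cannot expand forever. For id - num * id the multiplication ends up nested under the subtraction, which reflects the precedence built into the grammar.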
Slide 318
PREDICTIVE PARSERS
We saw that top-down parsers may need to backtrack when they select the wrong production.
This is where predictive parsers come in useful: they use lookahead to choose the right production and so avoid backtracking.
LL(1): left-to-right scan, leftmost derivation, 1-token lookahead
LR(1): left-to-right scan, rightmost derivation (in reverse), 1-token lookahead
Slide 319
SECTION 11 LL(1) PARSER
Slide 320
TOPICS
LL(1) Grammar
Eliminating Left Recursion
Left Factoring
FIRST and FOLLOW sets
Parsing tables
LL(1) parsing
Many examples