SIE 550 – Formal Languages Lecture for SIE 550 Matt Dube Doctoral Student – Spatial IGERT Fellow

Page 1:

SIE 550 – Formal Languages

Lecture for SIE 550

Matt Dube

Doctoral Student – Spatial IGERT Fellow

Page 2:

Languages

What do we think of when we think of languages?

BUILDING BLOCKS: letters, numbers, punctuation

BUILDINGS IN THE LANGUAGE: syllables, words, parts of speech, grammar, phrases, sentences

Page 3:

What are we going to do today?

• Not quite what you bargained for

• We are going to learn how to write in a foreign language that none of us can actually speak

• We will then use that language to set up the constructs that govern computers and databases

Page 4:

The Language Is….

KOREAN

Page 5:

Korean Words

말하다

This word translates to a one-word English physical action. Everyone take a guess by doing a physical action.

SPEAK

Page 6:

Now That You Understand Korean…

• (pause for raucous laughter)

• Korean appears to be a graphical language, like Chinese and quite unlike English

• Looks are very deceiving though

말하다

Page 7:

KOREAN ALPHABET

Korean is an alphabetic language!

Consonants Vowels

Page 8:

Consonants Vowels

말하다

But wait… Those characters aren’t in the alphabet!

Let’s take a closer look:

So what exactly is each character in the word “speak”?

Page 9:

Syllables

• Each character in written Korean is actually a syllable!

• Any patterns?

Page 10:

Consonants Vowels

말하다

Page 11:

Syllables

• Korean characters are syllables!

• Any patterns?

– Consonant followed by a vowel

– Consonant followed by a vowel followed by a consonant

• All characters in written Korean follow either of those two patterns

• A syllable is thus a composite of characters in a specified order (and, ironically enough, they all exist).

• Words, phrases, sentences, etc. are composites of syllables.

Page 12:

What About English?

NOT SO SIMPLE!

Page 13:

Example

“dog”

CUTE, CUDDLY, FURRY ANIMAL THAT BARKS

What then does “qeb” mean?

ABSOLUTELY NOTHING!

Page 14:

English Language

• Simple syllable rules do not work with our alphabet

• Thus some letter combinations are not valid

• Syllables are phonetically based

• Korean is a phonetically based language

• English is among the harder languages to learn to read and write

• Written Korean is among the easier ones

Page 15:

Why Did We Do That?

• Necessity

• Computers and databases need a “Korean” type of language as opposed to an English type of language

– Parsing

– Relevancy

• Korean is a formal language

Page 16:

Formal Languages

• Characters

– Letters, numbers, punctuation

– Building blocks

– We call these terminal symbols or axioms

• Uses of characters

– Syllables, words, phrases, grammar, sentences, etc.

– Buildings

– We call these non-terminal symbols or production rules or predicates

Page 17:

Where do we find them?

• Music

• Mathematics

• World Languages

• Art

• Computer Languages

• Almost Anywhere

Page 18:

Conventions

• The production rules form the syntax (grammar/spelling) of a language

• Valid combinations under the syntax are called well-formed formulas, or WFFs (sentences)

• Example:

– Jack in the box

– We should say: Jack is in the box.

JACK

BOX

IN

Page 19:

Forming a WFF

• Operators are needed for these languages:

– | = exclusive or

– ::= = is replaced by

– [ ] = optional (0 or 1)

– { } = optional (0 to many)

– “ ” = designators of terminal symbols

• Examples of situations for these?

– | : assigning a sign (+ or -)

– ::= : Fullname ::= First Last

– [ ] : area code

– { } : letters in a name
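
As a rough sketch of the example situations above, each operator has a close counterpart in Python's `re` module. The rule names, the phone-number layout, and the `matches` helper are assumptions made for illustration; they do not come from the lecture.

```python
import re

# sign ::= "+" | "-"                      (the | operator: one or the other)
SIGN = re.compile(r"[+-]")

# phone ::= [areacode "-"] digits "-" digits   (the [ ] operator: 0 or 1 times)
# The 3-3-4 digit layout is a hypothetical format chosen for this sketch.
PHONE = re.compile(r"(?:[0-9]{3}-)?[0-9]{3}-[0-9]{4}")

# name ::= letter {letter}                (the { } operator: 0 to many times)
NAME = re.compile(r"[A-Za-z][A-Za-z]*")


def matches(pattern, text):
    """Return True when the whole string is a well-formed formula for the rule."""
    return pattern.fullmatch(text) is not None


print(matches(SIGN, "+"))              # a single sign
print(matches(PHONE, "555-1234"))      # area code left out
print(matches(PHONE, "207-555-1234"))  # area code included
print(matches(NAME, "Matt"))           # one letter followed by more letters
```

The area code shows why [ ] is useful: the same rule accepts the number with or without it, but never with two.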

Page 20:

Programming Korean Syllables

• Start with what we are trying to create

Page 21:

Korean Syllables

• start ::= syllable

start “is replaced by” syllable

Page 22:

Programming Korean Syllables

• Start with what we are trying to create

• Establish the form of the creation

Page 23:

Korean Syllables

• start ::= syllable

• syllable ::= consonant vowel [consonant]

consonant followed by a vowel

another consonant if necessary

Page 24:

Programming Korean Syllables

• Start with what we are trying to create

• Establish the form of the creation

• Establish the terminal symbols

Page 25:

Korean Syllables

• start ::= syllable

• syllable ::= consonant vowel [consonant]

• consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”

can be substituted for consonant

• vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”

can be substituted for vowel

Page 26:

Let’s Try to Program Korean Syllables

• Start with what we are trying to create

• Establish the form of the creation

• Establish the terminal symbols

• Congratulations! We can now generate any Korean syllable mechanically

• Let’s test a few examples

Page 27:

• start ::= syllable

• syllable ::= consonant vowel [consonant]

• consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”

• vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”

Korean Syllables

A3A? 100? AAA? K9? M4? O7K? 3D? C3P0?
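
The test strings can also be checked mechanically. Below is a minimal sketch of a recognizer for the syllable rule, using the stand-in letters A to N for consonants and the digits 0 to 9 for vowels; the function name `is_syllable` is an assumption for illustration.

```python
CONSONANTS = set("ABCDEFGHIJKLMN")  # stand-ins for the 14 consonants
VOWELS = set("0123456789")          # stand-ins for the 10 vowels


def is_syllable(text):
    """Check text against: syllable ::= consonant vowel [consonant]"""
    if len(text) not in (2, 3):
        return False
    if text[0] not in CONSONANTS or text[1] not in VOWELS:
        return False
    # The third character is optional, but if present it must be a consonant.
    return len(text) == 2 or text[2] in CONSONANTS


for candidate in ["A3A", "100", "AAA", "K9", "M4", "O7K", "3D", "C3P0"]:
    print(candidate, is_syllable(candidate))
```

Under these rules only A3A, K9, and M4 come out as syllables. O7K fails because O is not one of the fourteen consonants, and C3P0 fails because a syllable is at most three characters long.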

Page 28:

Syllables aren’t enough though…

• We speak and write in words

• What do we need to do to make our program generate possible Korean words?

Page 29:

Korean Words

• start ::= word

• word ::= {syllable}

arbitrary number of syllables

What is wrong with this?

A word can have 0 syllables???

How can we deal with this?

Page 30:

Korean Words Revised

• start ::= word

• word ::= syllable {syllable}

one syllable

0 to n more possible

Now add the lines from syllable

Page 31:

Korean Words

• start ::= word

• word ::= syllable {syllable}

• syllable ::= consonant vowel [consonant]

• consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”

• vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
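
The complete word grammar can be run in both directions: generating words and recognizing them. The sketch below assumes the same stand-in characters as the rules above; `random_syllable`, `random_word`, `is_word`, and the `max_extra` cap on repetition are hypothetical names introduced for illustration.

```python
import random
import re

CONSONANTS = "ABCDEFGHIJKLMN"
VOWELS = "0123456789"

# word ::= syllable {syllable}, with syllable ::= consonant vowel [consonant]
WORD = re.compile(r"(?:[A-N][0-9][A-N]?)+")


def random_syllable(rng):
    """Expand the syllable rule once, choosing a terminal for each symbol."""
    syllable = rng.choice(CONSONANTS) + rng.choice(VOWELS)
    if rng.random() < 0.5:  # the [ ] part: include the final consonant or not
        syllable += rng.choice(CONSONANTS)
    return syllable


def random_word(rng, max_extra=3):
    """One required syllable plus 0 to max_extra more (a finite cap on { })."""
    count = 1 + rng.randint(0, max_extra)
    return "".join(random_syllable(rng) for _ in range(count))


def is_word(text):
    """Return True when text can be split into one or more valid syllables."""
    return WORD.fullmatch(text) is not None


rng = random.Random(550)
for _ in range(5):
    word = random_word(rng)
    print(word, is_word(word))
```

Every generated string is accepted by the recognizer, while the empty string is rejected because the first syllable is mandatory.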

Page 32:

The Problem with English…

DOG SHY BEE AXE

QEV

No matter what, we can’t define syllables or words such that we get all “words” as results (unless we code every word in)!

This goes to show that semantics are not accounted for in a formal language.

• start ::= syllable

• syllable ::= consonant vowel [consonant]

• consonant ::= “B” | “C” | “D” | “F” | “G” | “H” | “J” | “K” | “L” | “M” | “N” | “P” | “Q” | “R” | “S” | “T” | “V” | “W” | “X” | “Y” | “Z”

• vowel ::= “A” | “E” | “I” | “O” | “U”
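
Running the five example strings through this English version of the rule makes the mismatch concrete. This is a minimal sketch; `fits_syllable_rule` is a hypothetical helper name.

```python
CONSONANTS = set("BCDFGHJKLMNPQRSTVWXYZ")
VOWELS = set("AEIOU")


def fits_syllable_rule(text):
    """Check text against: syllable ::= consonant vowel [consonant]"""
    if len(text) == 2:
        shape = (CONSONANTS, VOWELS)
    elif len(text) == 3:
        shape = (CONSONANTS, VOWELS, CONSONANTS)
    else:
        return False
    # Each character must belong to the set required at its position.
    return all(ch in allowed for ch, allowed in zip(text, shape))


for candidate in ["DOG", "SHY", "BEE", "AXE", "QEV"]:
    print(candidate, fits_syllable_rule(candidate))
```

The rule accepts DOG and QEV and rejects SHY, BEE, and AXE. It therefore both admits a string with no meaning and misses real words, and it never consults meaning at all.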

Page 33:

Positive Integers

• Let’s construct a language for positive integers on the board.
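
The board work is not captured in these slides. One possible grammar, offered here as an assumption rather than as what was written in class, is integer ::= nonzero {digit}, with nonzero ::= “1” | … | “9” and digit ::= “0” | nonzero. A minimal recognizer for that assumed grammar might look like this:

```python
NONZERO = set("123456789")
DIGITS = NONZERO | {"0"}


def is_positive_integer(text):
    """Check text against the assumed rule: integer ::= nonzero {digit}"""
    if not text or text[0] not in NONZERO:
        return False  # empty string, leading zero, or a non-digit first character
    return all(ch in DIGITS for ch in text[1:])


for candidate in ["7", "120", "012", "0", ""]:
    print(repr(candidate), is_positive_integer(candidate))
```

Requiring the first symbol to be nonzero rules out both 0 itself and numbers with leading zeroes, using nothing beyond the operators already introduced.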

Page 34:

Why Formal Languages?

• Mathematical Induction

– How many syllables exist in Korean?

• 14 Consonants * 10 Vowels * (14 Consonants + 1 Blank) = 2,100

– How many syllables in English?

• Can’t tell without counting one by one
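
The Korean count can be confirmed by brute force, since the grammar lets us enumerate every syllable. The sketch below uses the stand-in characters from the earlier slides (A to N and 0 to 9); the variable names are assumptions for illustration.

```python
from itertools import product

CONSONANTS = "ABCDEFGHIJKLMN"  # 14 stand-in consonants
VOWELS = "0123456789"          # 10 stand-in vowels

# The optional final consonant has 14 choices plus "blank" (the empty string).
FINALS = list(CONSONANTS) + [""]

syllables = ["".join(parts) for parts in product(CONSONANTS, VOWELS, FINALS)]

print(len(syllables))       # number of syllables the grammar can produce
print(len(set(syllables)))  # same number, so no syllable was counted twice
```

Both lines print 2100, matching 14 * 10 * (14 + 1).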

Page 35:

Why Formal Languages?

• Parsing

– Conversion to other useful information

• Φ = “ph”

• 1/2 = ½
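
A small sketch of this kind of conversion, assuming a hypothetical `convert` function: once the input is recognized as a particular symbol sequence, it can be rewritten as something more useful. The pattern for 1/2 checks its neighbors so that digits belonging to a longer number are left alone.

```python
import re

# Match "1/2" only when it is not part of a longer number or fraction.
HALF = re.compile(r"(?<![0-9/])1/2(?![0-9/])")


def convert(text):
    """Rewrite recognized symbol sequences: 1/2 becomes ½, Φ becomes ph."""
    text = HALF.sub("½", text)
    return text.replace("Φ", "ph")


print(convert("1/2 cup"))
print(convert("11/20"))
print(convert("Φone"))
```

The second example is left unchanged, which is the point of parsing before converting: a blind search-and-replace would have altered the middle of 11/20.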

Page 36:

Why Formal Languages?

• Graffiti on Palm Pilots

– Allows for symbolic recognition

– Writing on a small object is easier than typing on one (texting on a cell phone)

Page 37:

Why Formal Languages?

• Sequential Operations on a Computer

– Selecting text

– Drawing

– Menu browsing

Page 38:

Why Formal Languages?

• Mechanical recognition of commands

– Spell checkers

– Proper commands at a DOS prompt

Page 39:

To think about

• Won’t be collected, but for your own exercise:

• Create a formal language that will output addition and subtraction questions for positive integers, for example 4+9-7=?

– Should be able to do an arbitrary number of calculations

– No leading zeroes for a number

• Discuss on Friday
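
As a starting point for that discussion, here is one possible answer, stated as an assumption and not as the intended solution: question ::= integer operator integer {operator integer} “=?”, operator ::= “+” | “-”, integer ::= nonzero {digit}. The sketch below checks candidate strings against that assumed grammar; `is_question` is a hypothetical name, and requiring at least one operator is a design choice of this sketch.

```python
import re

# integer ::= nonzero {digit}                -> [1-9][0-9]*
# operator ::= "+" | "-"                     -> [+-]
# question ::= integer operator integer {operator integer} "=?"
QUESTION = re.compile(r"[1-9][0-9]*(?:[+-][1-9][0-9]*)+=\?")


def is_question(text):
    """Return True when text is a well-formed question under the assumed grammar."""
    return QUESTION.fullmatch(text) is not None


for candidate in ["4+9-7=?", "10+200-3+4=?", "04+9=?", "4+=?", "4=?"]:
    print(candidate, is_question(candidate))
```

The first two strings are accepted. The third has a leading zero, the fourth is missing an operand, and the last contains no calculation at all, so all three are rejected.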