Ontologies for the Semantic Web: Time and Space Jerry R. Hobbs USC/ISI Marina del Rey, CA.
Doing More with LESS: Logic Embedded in SpreadSheets Andre Valente (KSVentures and USC/ISI) David...
-
date post
22-Dec-2015 -
Category
Documents
-
view
215 -
download
0
Transcript of Doing More with LESS: Logic Embedded in SpreadSheets Andre Valente (KSVentures and USC/ISI) David...
Doing More with LESS: Logic Embedded in SpreadSheets
Andre Valente (KSVentures and USC/ISI)David Van Brackle (ISX)Hans Chalupsky (KSVentures and USC/ISI)Gary Edwards (ISX)
LESS: Logic Embedded in SpreadSheets
Logic Embedded in SpreadSheets (LESS) will provide the user with a system which combines the power of logic-based knowledge representation and reasoning with the familiar and easily-mastered user interface paradigm of a spreadsheet
LESS was developed under an SBIR contract from DARPA Project Manager: David Gunning
Motivation Spreadsheets are Widespread but Error Prone
[Abraham & Erwig, 2003] 55 Million users (= programmers) of spreadsheets in
2005 Spreadsheets contain alarming rates of errors [Erwig
and Burnett 2002] Spreadsheets have their strengths…
Rapid model development Interactive “What If” feedback Scalable application of easily defined computational
logic … and weaknesses
Hard to express complex logical statements Data structure hides relationships
Logic spreadsheets can
Improve quality (less errors) Increase maintainability Support the construction of more
explicit and complex models Be a practical platform for knowledge
acquisition
Key Challenges Basic Perspective:
A pragmatic, scruffy, engineering approach Start with open mind – try things out, see what works
Usefulness – How to develop a useful tool, not just an academic exercise? Answer: a real hybrid Enrich the functionality of a spreadsheet while maintaining
on its core interaction paradigm Spreadsheet should not be a GUI for knowledge bases Logic should not be just a function library in the
spreadsheet Usability – How to get the syntax and the human interaction
right to make it usable Answer: four tiers Hide as much logic as needed Allows different trade-offs
The LESS Model
Tier 1: Logic functions available programmatically in formulas
Tier 2: Tabular-oriented Knowledge Integrated into
Spreadsheet GUI
Tier 3: Full Integration of
Spreadsheet and Logic System
Tier 4: App Templates
Tier 1: Logic functions available programmatically in formulas
PowerLoom functionality available to Spreadsheet functions as function calls
Lifting Mappings – pushing info into the KB
Query Mappings – extracting info from the KB
ASSERT( “Age”, $A2, B2 )
A B C D
1 Person Age Car Happy?
2 Fred 14
3 Joe 17 Ford Yes
4 Sam 33 Saab
IF( ASK( “Happy”, $A3 ), “Yes” )
Tier 2: Tabular-oriented Knowledge Integrated into Spreadsheet GUI
Instance Tables: A collection of instances of a given class will be represented as a special spreadsheet table
Creating a new Instance Table is equivalent to creating a Class in an Ontology
Rules are entered as text, much like Spreadsheet Functions
Tier 3: Full Integration of Spreadsheet and Logic System
Increased usability Hide away details of logic formulations Develop special GUIs integrated into the
spreadsheet Wizards Special cell editors TreeTable metaphors
Requires tighter integration of logic and spreadsheet data models
Tier 4: Application Templates Templatize working LESS applications Hide even more of logic to users Better deployment, packaging mechanisms
Domain Paks Templates captured by experts in a given field
and widely distributed Production tools (meta level) Mechanisms for connecting to existing data
(e.g. in DBMSs)
LESS Architecture
LESS Components LESS integrates:
PowerLoom, a powerful logic KR&R system Microsoft Excel, the most widely used
spreadsheet in the world LESS is implemented as an extension (add-
in) to Microsoft Excel Status: prototype implementation to be
demonstrated here LESS Excel add-in has 1.6Mb (includes
PowerLoom libraries) Implemented in C++ and Visual Basic
Excel Spreadsheet
PowerLoom
Push
Know
ledg
e In
to P
ow
erLo
om
Invoke
Pow
erLo
om
Functio
ns
Pull K
now
led
ge
From
Pow
erLo
om
Invoke
Spre
ad
sheet
Functio
ns
Push
Know
ledg
e T
o S
pre
adsh
eet
Component Interactions
LESS – a mapping layer between Excel and PowerLoom
PowerLoom PowerLoom is result of 20 years of research
and practical use of KR&R systems at USC/ISI
Powerful and expressive Unique “pragmatic” stance to tackle real
world problems Usability more important than theoretical
"neatness" Allow trade-offs between
expressivity/completeness A KR system for real-world applications
Used in HPKB, CoABS, RKF, EELD, etc.
PowerLoom Highlights Representation in fully-expressive first-order predicate logic KIF syntax with many extensions to standard FOL:
Type, set & cardinality relations (e.g., subset-of, range-cardinality) Second-order definitions via holds Classical negation and negation-by-failure, defaults
Frame-style definition language as syntactic sugar Defconcept, defrelation, deffunction, definstance, defrule
Incremental monotonic and non-monotonic updates Interleave definitions, assertions, retractions with retrieval and
inference Context mechanism separates name and assertion spaces with
inheritance Provides structuring mechanism, facilitates hypothetical reasoning
Several reasoning mechanisms Reasoning specialists architecture for “plug-ins” (e.g., time reasoner)
RDBMS interface based on lifting axioms
PowerLoom Status and Distribution Written in STELLA
Available in Lisp, C++ and Java Both STELLA and PowerLoom are now Open Source
Recent development for PowerLoom GNU or Mozilla licenses
Current release: PowerLoom 3.0.2.beta Distributed as Lisp, C++ and Java source More recently in STELLA source ~600 downloads world-wide ~400 subscribers to the mailing lists
Why Have a Logic Spreadsheet at All?
A Key Insight: Relational vs. Positional Referencing (1) Spreadsheets use positional referencing to refer to
information G5 = relative cell reference $G$5 = absolute cell reference A5:D5 = range of cells {A5 B5 C5 D5}
Positional Referencing is very intuitive in the beginning…
But it is highly error-prone More complicated than it seems at a first glance Spreadsheets try hard do update references when
changes are made (e.g., a line is inserted), but errors are frequently introduced
Also, restricted to two dimensions
A Key Insight: Relational vs. Positional Referencing (2) Logic uses relational referencing to refer
to information Meaning is conveyed by relations and assertions Information is referenced by open variables as
well as role in relations Example:
Logic function (hours ?person ?month ?task) Means the number of hours a ?person spends on
a ?task in a ?month N-ary relations and more powerful
proposition specification languages add power to spreadsheets
A Key Insight: Relational vs. Positional Referencing (3)
Key insight of LESS is to add relational referencing to spreadsheets
Mixed model allies best of both worlds Cells can still be used to refer to data
Cell Variables can be bound to contents of cells Logic propositions express relationship
between several information elements
LESS Features – Highlights
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
About Our Examples Three main examples:
Pilot training spreadsheet Budget spreadsheet Grading spreadsheet
All examples are real-life spreadsheets Pilot training example based on a spreadsheet used by ANG
149th Tactical Fighter Squadron Budget spreadsheet was used by a small business for
writing cost proposals for government contracts Grading spreadsheet was used to grade an actual course as
well as an exercise for a class in spreadsheets Sometimes simplified for presentation purposes
Syntax in these slides is simplified (we are still adding syntactic sugar)
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
Positional Referencing Looks Easier Than It Is (1)
For example, B6 contains the formula =SUM(B3:B5)
Simple, easy…
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Total 26 19 30 23
Hours
Positional Referencing Looks Easier Than It Is (2)
Even such a simple sheet generates errors when a new employee (line) is added!
Totals stayed the same but should have been updated
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Pedrita 10 7 9 9 7 Total 26 19 30 23
Hours
Positional Referencing Looks Easier Than It Is (3)
What is the problem? In positional referencing SUM(B3:B5)
really means “sum the three cells above me” This clearly is not what is intended
Intention of the user is “sum the Jan-05 hours for all persons in Task 1”
Relational referencing allows a user to say just that!
Relational Referencing in LESS Makes Relationship Explicit (1) With LESS, we want to express the relationships explicitly
(hours ?person ?month ?task) Means the number of hours a ?person spends on a ?task in a ?month
In order to add relational information to a spreadsheet, we need to add a lifting formula
One way is to add a lifting formula on top of each column: =ASSERTMULTI(“hours”, A3:A5, B2, $A$1) This expands into and performs a series of assertions, e.g.:
(hours A3 B2 A1) = (hours “Fred” Jan-05 “Task 1”) = 12 (hours A4 B2 A1) = (hours “Wilma” Jan-05 “Task 1”) = 4
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Total 26 19 30 23
Hours
Relational Referencing in LESS Makes Relationship Explicit (2) Once the information is lifted into PowerLoom, we
can easily express a cell as “sum the Jan-05 hours in Task 1 for all persons”
B6 contains the formula:=SUM (RETRIEVEALL (“hours ?person Jan-05 “Task 1”)”))
Notice mix of logic and spreadsheet functions
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Total 26 19 30 23
Hours
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
Multi-dimensional relations (1)
Problem: positional referencing only works when information has at most two dimensions (3-arity relations)
Intuitively easy to see: user cannot make a range with three dimensions
Multi-dimensional relations (2) Example: calculate totals
across tables Formula looks much less
intuitive, e.g. B24=B3+B10+B17 Can’t use ranges
Extremely hard to maintain and error prone! E.g., if a new task is added or
there are many tasks What you really want to say is
“sum the hours in Jan-05 for Fred in all tasks”
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Total 26 19 30 23 78 Task 29 Person Jan-05 Feb-05 Mar-05 Apr-05
10 Fred 4 7 4 7 11 Wilma 12 12 4 14 12 Barney 10 8 10 8 13 Total 26 27 18 29 1415 Task 316 Person Jan-05 Feb-05 Mar-05 Apr-0517 Fred 12 12 4 14 18 Wilma 4 4 7 3 19 Barney 10 10 8 6 20 Total 26 26 19 23 2122 Total23 Person Jan-05 Feb-05 Mar-05 Apr-0524 Fred 28 23 20 35 25 Wilma 20 23 26 20 26 Barney 30 26 21 20 27 Total 78 72 67 75
Hours
Hours
Hours
Hours
What happened? Task table can be seen as a model with two
dimensions – person and monthly hours BUT: Overall information model reflects
indexing in three dimensions – person, month and task
When information needs to be aggregated across more than two dimensions, the spreadsheet model suffers
Pivot tables can overcome some limitations, but only for some cases
LESS Supports Multi-Dimensional Relations Assuming the same lifting
described earlier, we can easily express “sum the hours in Jan-05 for Fred in all tasks” as:
=SUM (RETRIEVEALL ("(hours Fred Jan-05 ?task)”))
This can also easily be turned in to an Excel-like template formula by mixing cell variables:
=SUM (RETRIEVEALL ("(hours A24 B$23 ?task)”))
A B C D E1 Task 12 Person Jan-05 Feb-05 Mar-05 Apr-053 Fred 12 4 12 14 4 Wilma 4 7 15 3 5 Barney 10 8 3 6 6 Total 26 19 30 23 78 Task 29 Person Jan-05 Feb-05 Mar-05 Apr-05
10 Fred 4 7 4 7 11 Wilma 12 12 4 14 12 Barney 10 8 10 8 13 Total 26 27 18 29 1415 Task 316 Person Jan-05 Feb-05 Mar-05 Apr-0517 Fred 12 12 4 14 18 Wilma 4 4 7 3 19 Barney 10 10 8 6 20 Total 26 26 19 23 2122 Total23 Person Jan-05 Feb-05 Mar-05 Apr-0524 Fred 28 23 20 35 25 Wilma 20 23 26 20 26 Barney 30 26 21 20 27 Total 78 72 67 75
Hours
Hours
Hours
Hours
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
LESS Provides Better, Explicit Rules Spreadsheets encode business rules in formulas Several problems:
Rules are hidden from view – harder to audit, opaque Rules have to be repeated everywhere they are used
– higher chance for errors, difficult to maintain Another problem is that spreadsheet formulas are
very poor in expressing rules that are not strictly mathematical formulas No good mechanism for complex conditions
LESS allows a user to model explicit (business) rules Better methodology (trend towards explicit business
rules) Easier to debug and maintain Improves visibility, communication
Better, Explicit Rules - Example (1) In the example below, the formula to calculate the grade letter
(column H) requires nested conditionals: IF($H4<70,"D", IF($H4<75,"C-", IF($H4<78,"C", IF($H4<80,"C+",
IF($H4<85,"B-", IF($H4<88,"B", IF($H4<90,"B+", IF($H4<95,"A-","A"))))))))
Opaque, hard to read, easy to make mistakes “Rule” is repeated once per line – error prone, hard to maintain
A B C D E F G H1 Exam Final Letter2 Student 1 2 1 2 Grade Grade Grade3 Parker 56 84 89 93 90 81 B-4 May 95 92 79 83 95 90 A-5 Watson 89 87 98 78 80 86 B6 Jameson 66 95 99 69 69 79 C+7 Osborn 88 51 79 63 79 72 C-
Assignments Tests
Better, Explicit Rules – Example (2) In LESS we can define rules elsewhere and simply apply
them In the same example, the formula to calculate the grade
letter (column H) is a simple query:RETRIEVE("1 (letter-grade ", A3, " ?x)" ) )
Notice how the rules are defined once, applied several times
A B C D E F G H1 Exam Final Letter2 Student 1 2 1 2 Grade Grade Grade3 Parker 56 84 89 93 90 81 B-4 May 95 92 79 83 95 90 A-5 Watson 89 87 98 78 80 86 B6 Jameson 66 95 99 69 69 79 C+7 Osborn 88 51 79 63 79 72 C-
Assignments Tests
Better, Explicit Rules – Example (3)
In LESS rules can be defined separately and explicitly, and reused everywhere:
A B C D E F G11 Classification rules - Version 112 !=assert((=> (>= ?x 95) (letter-grade ?x "A")))13 !=assert((=> (and (< ?x 95) (>= ?x 90))(letter-grade ?x "A-")))14 !=assert((=> (and (< ?x 90) (>= ?x 88))(letter-grade ?x "B+")))15 !=assert((=> (and (< ?x 88) (>= ?x 85))(letter-grade ?x "B")))16 !=assert((=> (and (< ?x 85) (>= ?x 80))(letter-grade ?x "B-")))17 !=assert((=> (and (< ?x 80) (>= ?x 78))(letter-grade ?x "C+")))18 !=assert((=> (and (< ?x 78) (>= ?x 75))(letter-grade ?x "C")))19 !=assert((=> (and (< ?x 75) (>= ?x 70))(letter-grade ?x "C-")))20 !=assert((=> (< ?x 70) (letter-grade ?x "D")))
Better, Explicit Rules – Example (4)
Rules can also be parameterized on Excel data – a nice fit with Excel modeling style:
A B C D E F G H I J K L M N O P22 Classification rules - Version 223 =POWERLOOM( "(defconcept student)" )24 =POWERLOOM( "(deffunction final ((?x student :-> (?n number))" )25 Min Max Letter =POWERLOOM( "(deffunction letter-grade ((?x student)) :-> (?s string))" )26 95 100 A =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A26), P( "=<", "?g", C38 )), P("letter-grade","?x",LIT(B26)))27 90 94 A- =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A27), P( "=<", "?g", C39 )), P("letter-grade","?x",LIT(B27)))28 88 89 B+ =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A28), P( "=<", "?g", C40 )), P("letter-grade","?x",LIT(B28)))29 85 87 B =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A29), P( "=<", "?g", C41 )), P("letter-grade","?x",LIT(B29)))30 80 84 B- =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A30), P( "=<", "?g", C42 )), P("letter-grade","?x",LIT(B30)))31 78 79 C+ =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A31), P( "=<", "?g", C43 )), P("letter-grade","?x",LIT(B31)))32 75 77 C =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A32), P( "=<", "?g", C44 )), P("letter-grade","?x",LIT(B32)))33 70 74 C- =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A33), P( "=<", "?g", C45 )), P("letter-grade","?x",LIT(B33)))34 0 69 D =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A34), P( "=<", "?g", C46 )), P("letter-grade","?x",LIT(B34)))
Better, Explicit Rules – Example (5)
For ultimate elegance, we can use templates (Phase II):
A B C D E F G H I38 Min Max Letter39 95 100 A40 90 94 A-41 88 89 B+42 85 87 B43 80 84 B-44 78 79 C+45 75 77 C46 70 74 C-47 0 69 D4849 Template: =assert((=> (and (<= ?x ??P1) (>= ?x ??P2) ((letter-grade ?x ??P3)))50 =apply-template(B49,A39:C47)
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
Powerful Information Querying Spreadsheets have limitations on how to
summarize information Particularly if the information is symbolic
Basic problem is that it is hard to select information from other parts of the spreadsheet Worse if information is in multiple tables and/or
has more than three dimensions Special wizards such as Excel pivot tables
help overcome some of these deficiencies But they still have restricted expressivity
LESS Demo – Squadron Training Examples
Also – Tiers 2, 3
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying Knowledge-
Based Systems
New Possibilities Opened with LESS
One way to look at logic spreadsheets is for logic to add spice to spreadsheets Delivers incremental improvements (Only) user is spreadsheet user
We found that logic spreadsheets can be good for knowledge engineers as well
Two types of applications: Template-based knowledge acquisition Reasoning
Template-based knowledge acquisition (1) A common problem in knowledge acquisition is to acquire knowledge about
many similar objects KA approaches are hard to understand for SMEs
Require understanding of modeling decisions For instance, not reasonable to ask an SMEs to model (example from DARPA
ARPI and JFACC Programs):(defconcept F-15 :IS-PRIMITIVE Fighter-Aircraft :IMPLIES (:AND (:FILLED-BY NICKNAME "Eagle") (:FILLED-BY MADE-BY MCDONNELL-DOUGLAS) (:EXACTLY 2 ENGINES) (:ALL GUN M61A1) (:FILLED-BY IN-SERVICE 1972) (:FILLED-BY PRIMARY-FUEL JP-4) (:FILLED-BY ALTERNATE-FUEL JP-5 JP-8 JET-A
JET-A-1 JET-B)) :DEFAULTS (:AND (:FILLED-BY CEILING 65000ft) (:FILLED-BY COST 15000000) (:FILLED-BY CREW-SIZE 1)))
Template-based knowledge acquisition (2) An approach to deal with this problem is to have
“knowledge templates” Templates abstract the way information is used in the logic
formalism Middleware code (e.g. a Lisp macro) “routs” the information in
the template to its place in the knowledge base (e.g., slot filler, type, default value)
Example of template implemented as Lisp macro:(def-aircraft-c F15 :types fighter-attack-aircraft :nickname "Eagle" :made-by McDonnell-Douglas :engine-count 2 :primary-fuel JP-4 :alternate-fuel (JP-5 JP-8 Jet-A Jet-A-1 Jet-B))
Ideal division of labor: SME enters information easily Knowledge engineer designs templates to map into a KB
Template-based knowledge acquisition (3) Logic Spreadsheets are great for
implementing knowledge templates Table format is natural for most SMEs Spreadsheet is widely used (info may already be
in a spreadsheet!) Information also available for other uses
Example:
A C D F G H1 Aircraft Maker # Engines Primary Fuel Alternate Fuels …2 F-15 McDonnel Douglas 2 JP-4 JP-5 JP-8 Jet-A Jet-A-1 Jet-B …3 F-16 General Dynamics 1 JP-4 JP-5 JP-8 Jet-A Jet-A-1 Jet-B …4 … … … … … …
LESS Features – Highlights
Explicit relationships Multi-dimensional relations Better, Explicit (Business) Rules Powerful information querying Platform for Knowledge
Acquisition/Engineering Platform for deploying
Knowledge-Based Systems
Reasoning Spreadsheets can contain full-blown knowledge-based
systems Plays on availability of information already in spreadsheets
For instance, we can re-implement PROSPECT [Valente and Schacchi, 2002] in LESS Input was a list of (instances of) business process models KB has knowledge about critiquing process models Output is agenda of problems to evaluate Applied to critique accounting processes in large corporation
LESS implementation would: Use tables and templates to express input process models KB can be provided explicitly or as a domain pak Output agenda of problems in the process models can be
implemented as an instance table
Tier 2 Example
Create instance table
Creating instances