Four ways to represent computer executable rules
Cover Page
Uploaded June 24, 2011
Four Ways to Represent Computer-Executable Rules
Author: Jeffrey G. Long ([email protected])
Date: July 25, 2008
Forum: Talk presented at the InterSymp 2008 Conference, sponsored by the
International Institute for Advanced Studies in Systems Research and Cybernetics
(IIAS). Paper published in conference proceedings, available at
http://iias.info/pdf_general/Booklisting.pdf
Contents
Pages 1‐5: Preprint of Article
Pages 6‐26: Slides (but no text) for presentation
License
This work is licensed under the Creative Commons Attribution‐NonCommercial
3.0 Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by‐nc/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Four Ways to Represent Computer-Executable Rules
Jeffrey G. Long [email protected]
Abstract
Rules have long been used by society but have rarely been studied explicitly in their own right. They are increasingly recognized as interesting and useful abstractions. The recent trend towards business rules has brought the subject front-and-center in the business world, as have interests in work process re-engineering over the past twenty years. Rules for computerized applications currently are represented in three ways:
- as software instructions
- as production rules in the rulebase of an expert system
- as pairs of XML tags.
Each of these has its strengths and weaknesses. This paper discusses these approaches and briefly describes a proposed fourth approach, namely representing most rules in a relational DBMS. I view this as an exercise in notational engineering, i.e. examining alternative representations to select one that is “best” in some engineering sense.
Key Words: Business Rules; Software; Expert Systems; XML; Relational Databases
General Features of Rules
Any manner of representing rules must have several fundamental features, including:
- what kind of events can initiate a cascade of rule executions
- the sequence in which rules are to be inspected, if sequence matters (including loops)
- the various conditions under which each rule is to be inspected and/or fired
- what happens if no rule, one rule, or multiple rules are found that match selection criteria
- how to resolve conflicts if multiple actions are prescribed
- when and how to stop or complete a rule cascade.
To be a rule management system, such a system must also have metadata such as:
- who created or updated the rule, and when
- why the rule was created/updated
- by what device the rule was created or updated (manually, by import, by software, etc.)
- whether the rule can safely be changed without consulting others
- what kind of further “research” ought to be done regarding a rule, if any (e.g. are there questions about the rule? Might it be obsolete?).
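As a rough illustration, the management metadata listed above could travel with each rule as a small record. The sketch below is only one possible shape; the class name, field names, and sample values are hypothetical and are not taken from any particular rule management product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RuleMetadata:
    """Management metadata kept alongside a rule (field names are illustrative)."""
    updated_by: str           # who created or last updated the rule
    updated_on: date          # when it was created or last updated
    reason: str               # why the rule was created/updated
    source: str               # by what device: "manual", "import", "software", ...
    safe_to_change: bool      # can it be changed without consulting others?
    research_note: str = ""   # open questions, e.g. "might be obsolete"


def needs_review(meta: RuleMetadata) -> bool:
    """Flag a rule that has an open research note or must not be changed alone."""
    return bool(meta.research_note) or not meta.safe_to_change


# Hypothetical sample values.
meta = RuleMetadata("j.doe", date(2008, 7, 25), "new discount policy", "manual", True)
flagged = RuleMetadata("j.doe", date(2008, 7, 25), "legacy import", "import", True,
                       research_note="might be obsolete")
```

A report over such records would let a rule manager list, for example, every rule that still carries an open question.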
Software Rules
Software rules are implemented as lines of code in a computer language such as Java. Such rules are typically called “business logic” rather than “business rules,” and are specified in terms of one of four standard programming constructs:
- an ordered sequence of instructions
- loops, used to specify conditional re-iterations of rules
- If-Then-Else statements that select among two or more options
- Case statements that select among multiple options.
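To make these constructs concrete, here is a minimal sketch in Python (chosen for brevity; the text names Java only as one example language). It hard-codes a single rule, a 5% discount for a premium customer buying a regular product, using an ordered sequence, a loop, and an If-Then-Else. The function names and the order-line layout are assumptions made for illustration.

```python
def discount_rate(customer_category: str, product_category: str) -> float:
    """If-Then-Else form of the rule: premium customer + regular product -> 5%."""
    if customer_category == "Premium" and product_category == "Regular":
        return 0.05
    else:
        return 0.0


def order_total(lines, customer_category: str) -> float:
    """Ordered sequence plus a loop: apply the rule to each order line in turn.

    `lines` is a list of (product_category, price) pairs (hypothetical layout).
    """
    total = 0.0
    for product_category, price in lines:
        rate = discount_rate(customer_category, product_category)
        total += price * (1.0 - rate)
    return total


total = order_total([("Regular", 100.0), ("Special", 50.0)], "Premium")
```

Note that changing the discount, or adding a second customer category, means editing and redeploying this code, which is the maintenance cost of this representation.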
The result of executing a software rule is that either (a) internal or external data values are updated, or (b) program control goes to a portion of the program that is specified. From there, further rules are found and executed.
Because different situations often have similar but slightly different rules, parameters are often specified whereby the code reads the parameter (typically stored as a data element) and branches to another section of code based on the value of the parameter. This allows software designers to anticipate predictable differences in the way different users might want the system to work. An example of a parameter is the definition of a fiscal year-end month, so that accounting systems can handle the fact that any month may be the year-end of a fiscal year for a particular user.
The ability to specify rules as software provides a very fine-grained ability to represent complex and contingent rules. The downside of this is that there are always many such rules, typically thousands or more, and as a result there are thousands to millions of lines of code in a typical software application, or even a single software object. This large code corpus is difficult to comprehend and, since it must evolve with new rules, ensures significant life-cycle maintenance costs. As with any complex system, changing one part of the system may have unanticipated consequences for other parts. And since only programmers can update the code, there is always the risk of miscommunication between the subject experts and the programmers.
Production Rules
In expert systems there is an inference engine that knows only the rules of inference, a rulebase that specifies the rules (called productions), and an initial set of facts (the environment). Rules are triggered by facts, and any and all rules are selected that match the current environment. Those rules are added to an agenda, any conflicts are resolved (often via rule prioritization), and the remaining rules are fired.
The result of firing a rule is to make a change to the facts (asserting new facts or withdrawing existing facts), which may then cause other rules to be fired. This process continues until a specified end-point is reached, or until there are no more rules on the agenda.
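The match, resolve, and fire cycle just described can be sketched in a few lines of Python. This is a deliberately simplified toy, not how CLIPS, Jess, or any real inference engine works: facts are plain strings, rules only assert (never withdraw) facts, conflicts are resolved by a numeric priority, and one rule fires per cycle. The second rule, `notify-sales`, is invented for the example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    priority: int            # higher fires first (one possible conflict strategy)
    conditions: frozenset    # facts that must all be present
    asserts: frozenset       # facts added when the rule fires


def run(rules, initial_facts, max_cycles=100):
    """Repeat match -> resolve -> fire until the agenda is empty."""
    facts = set(initial_facts)
    fired = []
    for _ in range(max_cycles):
        # Match: rules whose conditions hold and that would still add a new fact.
        agenda = [r for r in rules
                  if r.conditions <= facts and not r.asserts <= facts]
        if not agenda:
            break  # nothing left on the agenda: the cascade is complete
        # Resolve conflicts by priority, then fire the winning rule.
        rule = max(agenda, key=lambda r: r.priority)
        facts |= rule.asserts
        fired.append(rule.name)
    return facts, fired


rules = [
    Rule("good-customer-discount", 10,
         frozenset({"customer is premium", "product is regular"}),
         frozenset({"price-discount is 5%"})),
    Rule("notify-sales", 1,                      # hypothetical follow-on rule
         frozenset({"price-discount is 5%"}),
         frozenset({"sales notified"})),
]
facts, fired = run(rules, {"customer is premium", "product is regular"})
```

Here the first rule's new fact is what makes the second rule eligible, which is the cascading behavior the text describes.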
Production rules are formulated in an If-Then (sometimes If-Then-Else) format. There can be an unlimited number of If-conditions, used to specify the specific environmental conditions under which the Then-action(s) will be taken, and an unlimited number of Then-actions. Rules are typically stored in a text file which is loaded into memory at runtime, as are the initial facts. The way rules are defined (formatted) has become important for rule interchange among different systems, and the Object Management Group (OMG) released in November 2007 a Beta version of its Production Rule Representation specification.
This approach has shed light on the kind of thinking that an expert seems to do, namely to look for salient features of a given environment, respond to those features with changes to the environment, and then respond to the changed environment. Its downside is that when the rulebase exceeds a few thousand rules the system may behave in an unexpected manner, for the rule interactions are hard to anticipate, and the order of rule execution is important. Another difficulty is that there are many (possibly thousands of) free-standing, independent rules to manage, even when the rules are grouped into rulesets. Yet future expert systems will need to manage not just thousands but hundreds of thousands, even millions, of rules.
XML Rules
Much work has been done in recent years towards the design and standardization of XML-based Rule Markup Languages. These are intended to make rules more easily maintainable by non-programmers; to serve the semantic web; and to define rules in a manner not tied to any particular vendor’s technology. A primary driver has been the increasing need to communicate and cooperate with numerous systems not only within an organization but now across organizations (e.g. to customers, vendors, regulatory agencies, etc.). This has led to an interest in externalizing certain rules outside of software so they may be more readily examined and changed.
The eXtensible Markup Language (XML) format has been widely adopted as a general framework for the specification of rules (e.g. RuleML, R2ML). XML tags are used to demark the beginning and the end of operators and relations to check for a particular rule; these may be nested and combined as necessary. Rules so demarked may then be searched for and read by multiple applications. There is a W3C Working Group dedicated to producing a Rule Interchange Format (RIF), and the OMG is working on a variety of important areas, and recently released version 1.0 of its Semantics of Business Vocabulary and Business Rules.
One difficulty of this approach is that those who maintain the rules are still left with an enormous number of free-standing, independent rules to manage. Integrity constraints are being developed, but there is still no referential integrity, such that an update can cascade to all places where an entity is referenced. Lastly, there is little query or reporting capability by which one can scan or update rules quickly and easily. These problems are similar to the problems encountered with the software representation of rules.
An example of a simple RuleML rule implementation to give a premium customer a 5% discount on any regular product is shown in Figure 1 below.
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
Figure 1: RuleML for a Price Discount Decision
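Because a rule like the one in Figure 1 is ordinary XML, any application with an XML parser can read it. The sketch below uses Python's standard `xml.etree.ElementTree` module to pull out the head (conclusion) and body (conditions) as simple tuples. The helper names `atom_to_tuple` and `read_rule` are made up for this illustration, and the code assumes exactly the tag layout shown in Figure 1 rather than the full RuleML schema.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
"""


def atom_to_tuple(atom):
    """Turn an <atom> element into (relation name, [argument texts])."""
    relation = atom.find("_opr/rel").text
    args = [child.text for child in atom if child.tag in ("var", "ind")]
    return relation, args


def read_rule(xml_text):
    """Return (head, body) for a rule shaped like Figure 1."""
    root = ET.fromstring(xml_text)
    head = atom_to_tuple(root.find("_head/atom"))
    body = [atom_to_tuple(a) for a in root.findall("_body/and/atom")]
    return head, body


head, body = read_rule(RULE_XML)
```

Reading one rule is easy; the difficulty the text raises is managing and querying a very large number of such free-standing documents.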
Ultra-Structure Rules
Since 1985 I’ve developed and used a fourth approach, called “Ultra-Structure”. This approach removes all business rules that might ever change from the software, leaving only the control logic for a “competency rule engine” as software. The rest of the rules are represented via relational tables; there are no data or facts in the system, only rules. Rules can be converted from their natural language form (e.g. a policy manual) into one or more rules having a canonical form consisting of:
- one or more “If” statements, defining conditions under which the rule should be inspected
- one or more “Then-Consider” statements, defining additional considerations (before deciding what to do next) and/or actions
- one or more metarule data fields specifying who set up the rule, why, whether it can safely be changed without consulting others, etc.
We can then categorize those rules into a small number of formats called “ruleforms” that are defined by their form and meaning, such that any logically possible rule pertaining to that application area (e.g. order processing) can be expressed in some table in the system. This has the profound effect of reducing the myriad numbers of known (and future unknown) rules to a manageably small number of tables, typically fewer than 100 for an enterprise system.
Lastly, we can implement each ruleform as a table. All rules having the same number of If-statements and similar meanings are grouped together into one table, with the If-statements (called factors) forming columns that constitute the primary key of the table (thereby guaranteeing the uniqueness of each rule). Other columns in the table (called considerations) represent the Then-Consider statements and the metadata about the rule. Thus, most business rules are represented not as software, and not as data in XML tags, but as records (relations) in a modern RDBMS.
Questioning decades of focus on software, under this approach software is seen as more of a problem than a solution, and the focus is on rules represented as relational data. By specifying business rules as records in a RDBMS, the only software that remains is control logic that knows nothing about the world except what tables to look at, in what order, and what to do based on rules selected for execution. Key benefits of this approach are that:
- the amount of software required is reduced by a factor of 10 to 100
- since this control logic is unlikely to change over time, the software and data structures stay remarkably stable even as the rules continue to evolve
- rules can evolve by simply changing data, without any software changes, so many kinds of changes can be implemented immediately
- subject experts and business managers can explain new rules to business analysts (not only programmers), who can then directly update the rules through the RDBMS.
The key benefits of using a relational database for storing such rules are that the RDBMS:
- provides access security and logging of changes
- provides utilities for querying and reporting on large numbers (millions) of rules
- guarantees referential integrity
- can easily handle millions of rules as necessary.
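A minimal sketch of the idea, using Python's built-in `sqlite3` module as a stand-in for an enterprise RDBMS: one ruleform is a table whose factor columns form the primary key, with consideration and metadata columns alongside. The table name, column names, and sample values are invented for illustration and are not taken from an actual Ultra-Structure system; in particular, returning 0.0 when no rule matches is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pricing_ruleform (
        customer_type TEXT NOT NULL,   -- factor
        product_type  TEXT NOT NULL,   -- factor
        discount_pct  REAL NOT NULL,   -- consideration
        updated_by    TEXT NOT NULL,   -- metadata
        reason        TEXT NOT NULL,   -- metadata
        PRIMARY KEY (customer_type, product_type)
    )
""")
conn.execute(
    "INSERT INTO pricing_ruleform VALUES (?, ?, ?, ?, ?)",
    ("Premium", "Regular", 5.0, "analyst", "premium customer policy"),
)


def discount_for(customer_type, product_type):
    """Control logic: it only knows which table to read, not any pricing policy."""
    row = conn.execute(
        "SELECT discount_pct FROM pricing_ruleform "
        "WHERE customer_type = ? AND product_type = ?",
        (customer_type, product_type),
    ).fetchone()
    return row[0] if row else 0.0   # assumed default when no rule matches


before = discount_for("Premium", "Regular")
# The rule evolves by changing data; discount_for() itself is untouched.
conn.execute(
    "UPDATE pricing_ruleform SET discount_pct = 7.5 "
    "WHERE customer_type = 'Premium' AND product_type = 'Regular'"
)
after = discount_for("Premium", "Regular")
```

Because the factors are the primary key, inserting a second row for the same customer type and product type is rejected by the database, which is how uniqueness of each rule is enforced without extra code.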
This approach is not presented as a perfect solution to the software bottleneck. Still to be addressed are (a) the need to determine when certain conditions that might arise have not been anticipated by any rule in the system, (b) the difficulty conventional programmers have with looking in two places (the “data” as well as the software) to understand the logic of a situation, and (c) the semantics of data such that each data element (such as “order date”) really means the same thing to all parties. The OMG is working to address this last issue with its new standard. We recently used this approach to create and install an enterprise system for a US$175M wholesale distributor.
References
Long, J., and Denning, D. (1995); Ultra-Structure: A design theory for complex systems and processes; Communications of the ACM Vol. 38, No. 1 (pp. 105-120)
Four Ways to Represent Computer-Executable Rules
Jeffrey G. Long [email protected]
IIAS Baden-Baden Conference, July 2008
Minimum Requirements of Rule Management
The sequence in which rules are to be inspected, if sequence matters (including loops)
The various conditions under which each rule is to be inspected and/or fired
What happens if no rule, one rule, or multiple rules are found that match selection criteria
How to resolve conflicts if multiple actions are prescribed
When and how to stop/end a rule cascade
Exceptions to rules are rules also.
Conventional Ways to Represent Rules
Software (e.g. Java, C#)
Production Rules (e.g. CLIPS, Jess)
XML (e.g. RuleML, JessML)
Natural languages
Mathematical functions
Chemical formulae
Music notation
Software Rules
If (premium customer) and (regular product)
– Then (discount is 5%)
– Else (discount is 0%)
Select Case (customer category)
– Case “Premium”
  Select Case (product category)
  – Case “Regular”
    discount = 5%
Features of Software as a Notational System
Many valid ways to express a given rule
– both a strength and a weakness, depending on programmer
Seemingly easy to change
– but many times changes create new and unexpected problems
The starting point, stopping point, and sequence of operations are defined wholly and explicitly by the programmer
Control is based on program structure; rules (lines of code) are data-insensitive and ordered
One missing bracket changes rule, can make it and entire system inoperable (unexecutable)
XML Rules
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
XML Rule Markup Features
Vendor-independent standard. Other rule standardization efforts include RIF, PRR, CL, SBVR; open source rules communities include jBoss Rules, Jess, Prova, OO jDrew, Mandarax, XSB, XQuery
Designed for use on Semantic Web – distributed, (partially) open, heterogeneous environments
One missing bracket changes rule, can make it unexecutable
Production Rules
(defrule MAIN::good-customer-discount
  (product is regular)
  (customer is premium)
  =>
  (assert (price-discount is 5%)))
Production Rule Features
The knowledge (rules) and the data (facts and instances) are separated, and the inference engine is used to apply the knowledge to the data
Rules are data-sensitive and unordered; control is based on data state
There are three phases: rule-matching, rule-selection, and rule-execution
There are limited choices during rule selection, depending on the inference engine used to resolve a conflict set
Real-World Rules are More Complex
Must be inspected from most specific circumstances (exceptions) to most general (whole classes)
Have multiple circumstances (3-10 “factors”)
Each factor has many possible values (5+)
Circumstances trigger further inspection of complex “considerations” (e.g. QOH)
After being selected, additional rules may need to determine final outcome (e.g. lowest price)
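One way to picture "most specific to most general" inspection is a lookup that tries the exact combination of factor values first and then progressively replaces factors with a wildcard. The Python sketch below is an assumption-laden illustration: the `ANY` wildcard, the two factors, and the price multipliers are all hypothetical, and a real system would have more factors and further considerations to apply after a rule is selected.

```python
from itertools import product as cartesian

ANY = "*"  # wildcard standing for a whole class of values (assumption of this sketch)

# (customer type, product type) -> price multiplier; values are hypothetical.
RULES = {
    ("Premium", "Regular"): 0.95,   # most specific: an exception
    ("Premium", ANY): 0.98,         # any product for premium customers
    (ANY, ANY): 1.00,               # most general: everyone, everything
}


def candidates(*factors):
    """List keys to try, ordered from fewest wildcards to most."""
    keys = list(cartesian(*[(value, ANY) for value in factors]))
    keys.sort(key=lambda key: key.count(ANY))  # stable: specific keys first
    return keys


def lookup(*factors):
    """Return the first (most specific) rule that matches, or None."""
    for key in candidates(*factors):
        if key in RULES:
            return key, RULES[key]
    return None


hit_exact = lookup("Premium", "Regular")
hit_class = lookup("Premium", "Special")
hit_default = lookup("Standard", "Special")
```

With three to ten factors the number of candidate keys grows quickly, which is part of why hand-written branching becomes hard to manage.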
But They Don’t Easily Handle Many Rules Having Multiple Factors and Multiple Values
[Figure: flowchart of an Order Entry pricing decision that branches on “Product Type = Regular?” and “Customer Type = Premium?” (Yes/No) to the outcomes Price = Price * 0.90, Price = Price * 0.95, and Price = Price * 1.00]
Additional Management Requirements
Who created or updated the rule, and when was last update
Why the rule was created/updated
By what device the rule was created or updated (manually, by import, by software, etc.)
Whether the rule can safely be changed by a person without consulting others
What kind of further “research” ought to be done regarding a rule, if any, e.g. are there questions about the rule? Might it be obsolete?
Merge Tools & Techniques of:
Information Management
– databases, industrial-strength platforms
Knowledge Management
– repository for knowledge of organization, both human-oriented and machine-oriented
Knowledge Engineering
– simulation of expert decision-making with continuous decision process improvement
Ultra-Structure Rules
Ultra-Structure Provides Rules with Place-Value
Existing Options
– freedom of expression means complex syntax
– semantics is assigned largely by syntax
– result is great freedom but low manageability
Ultra-Structure
– expression of rules is constrained by ruleforms
– semantics is assigned positionally
– result is adequate freedom plus high manageability
Ruleforms Define Place-Value Rule Semantics
Benefits
Rule-recognition not triggered by working memory state but by events; different events involve different rules
Able to define and manage more complex rules
– multiple factors and multiple values per factor address need for high number of possible permutations
– multiple considerations applied during rule-recognition
RDBMS permits better management of millions of rules
– using standard RDBMS tools, report-writers, etc.
– can be read and managed by subject experts
Can exchange tables of rules as data
Conclusion
The problems with rule management are primarily caused by how we represent rules
This is a classic notation/representation problem
Ultra-Structure uses a new abstraction (i.e. ruleforms) to provide a time-tested way of assigning meaning by column
References
J. Long, D. Denning (1995), “Ultra-Structure: A design theory for complex systems and processes”; Communications of the ACM Vol. 38, No. 1 (pp. 105-120)
H. Boley, S. Tabet, G. Wagner, “Design Rationale for RuleML: A Markup Language for Semantic Web Rules” at citeseer.ist.psu.edu/boley01design.html
CLIPS Reference Manual (3/28/2008)
Other Articles by JL
Long, J., "Automated Identification of Sensitive Information in Documents Using Ultra-Structure". In Proceedings of the 20th Annual ASEM Conference, American Society for Engineering Management (October 1999)
Long, J., "Editor's Note." In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
Long, J., "A new notation for representing business and other rules." In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (pp. 215-227) (1999)
Long, J., "How could the notation be the limitation?" In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
Writings by Others
Shostko, A., “Design of an automatic course-scheduling system using Ultra-Structure.” In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
Oh, Y., and Scotti, R., “Analysis and Design of a Database using Ultra-Structure Theory (UST) – Conversion of a Traditional Software System to One Based on UST,” Proceedings of the 20th Annual Conference, American Society for Engineering Management (1999)
Parmelee, M., “Design For Change: Ontology-Driven Knowledgebase Applications For Dynamic Biological Domains.” Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (November 2002)
Maier, C., CoRE576 : An Exploration of the Ultra-Structure Notational System for Systems Biology Research. Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (April 2006)