The role of the knowledge engineer. Knowledge acquisition.



Software development: conventional systems and KBS

You are probably familiar with a standard model of the software development life cycle. It is likely to be something like this:

Feasibility study → Analysis → Requirements definition → Design → Implementation → Testing → Maintenance & review

Software development: conventional systems and KBS

Knowledge-based systems require special approaches to systems analysis, especially to the collection of the data (or rather knowledge) on which the system is based.

We will discuss the ways in which this model needs to be modified to take account of these special features in lecture 8.

Knowledge Engineering

The term "knowledge engineering" is often used to mean the process of designing, building and installing an expert system or other knowledge-based system. In other words, the whole process of making a KBS, from beginning to end.

Knowledge Engineering

Some authors use the term to mean just the phase in which the knowledge base is built.

Building the knowledge base

Five processes can be identified:

1. Knowledge acquisition
2. Knowledge analysis & representation
3. Knowledge validation
4. Inference design
5. Explanation and justification

These are not stages that have to follow each other - some of them will run concurrently.

Knowledge Acquisition

Knowledge acquisition is:

The process of gathering the knowledge to stock the expert system's knowledge base.

Knowledge Acquisition

This has proved to be the most difficult component of the knowledge engineering process. It's become known as the 'knowledge acquisition bottleneck', and expert system projects are more likely to fail at this stage than any other.

This is the principal reason why expert systems have not become more widespread.

Knowledge Acquisition

Sources of knowledge: Documents - textbooks, journal articles, technical reports, records containing case histories, etc.

This will almost never be sufficient to provide the knowledge base for a real-world expert system.

The range of problems which a textbook examines and solves is always smaller than the range of problems that a human expert is master of.

Knowledge Acquisition

Sources of knowledge: Human experts

Knowledge Elicitation

The most important part of knowledge acquisition is knowledge elicitation - obtaining knowledge from a human expert (or human experts) for use in an expert system.

Knowledge elicitation is difficult. Hence the knowledge acquisition bottleneck mentioned above.

It is necessary to find out what the expert(s) know, and how they use their knowledge.

Knowledge Elicitation

Expert knowledge includes:

domain-related facts & principles;
problem-solving strategies;
meta-knowledge - for instance, knowledge about when to use a particular piece of knowledge;
explanations and justifications.

Knowledge Elicitation

The knowledge elicitation/analysis task involves finding at least one expert in the domain who:

is willing to provide his/her knowledge;
has the time to provide his/her knowledge;
is able to provide his/her knowledge.

Any or all of these are liable to prove difficult.

Knowledge Elicitation

The knowledge elicitation/analysis task involves repeated interviews with the expert(s), probably combined with other, non-interview, techniques.

Knowledge Elicitation - the compiled knowledge problem

One major obstacle to knowledge elicitation: experts cannot easily describe all they know about their subject.

They do not necessarily have much insight into the methods they use to solve problems.

Their knowledge is "compiled" (like a compiled computer program - fast & efficient, but unreadable).

Knowledge Elicitation - interview techniques

Some of the interview techniques used in knowledge elicitation:

Unstructured interview. A general discussion of the domain, designed to provide a list of topics and concepts.

Structured interview. Concerned with a particular concept within the domain - a particular problem-solving skill or small group of skills.

Knowledge Elicitation - interview techniques

Interview techniques: Problem-solving interview. The domain expert (DE) is provided with a real-life problem, of a kind that they deal with during their working life, and asked to solve it. As they do so, they are required to describe each step, and their reasons for doing what they do. The transcript of their verbal account is called a protocol.

Knowledge Elicitation - interview techniques

Interview techniques: Think-aloud interview. As above, but the DE merely imagines that they are solving the problem presented to them, rather than actually doing it. Once again, they describe the steps involved in solving the problem.

Knowledge Elicitation - interview techniques

Interview techniques: Critical incident analysis. The DE is asked to provide details of cases which were particularly difficult, or of special interest for some other reason. He/she describes how they were solved, and the lessons that were learnt.

Knowledge Elicitation - interview techniques

Interview techniques: Dialogue. The DE interacts with a client, in the way that they would during their normal work routine.

Knowledge Elicitation - interview techniques

Interview techniques: Review. The knowledge engineer (KE) and DE examine the record of an interview session together.

Knowledge Elicitation - non-interview techniques

Some of the non-interview techniques used in knowledge elicitation:

Sample lecture preparation. The DE prepares a lecture, and the KE analyses its content.

Knowledge Elicitation - non-interview techniques

Non-interview techniques: Concept sorting ("card sort"). The DE is presented with a series of cards, with the names of domain concepts written on them, spread out on a table top, and asked to arrange them into clusters, in such a way that the cards in each cluster have something important in common. Then the DE is asked to name the principles that he/she has used to form these clusters. This process can be repeated to produce a hierarchy of concepts, as in the sketch below.
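As a small, hypothetical illustration of that last step, the sketch below records two rounds of sorting and assembles them into a concept hierarchy. The domain (boiler faults) and every concept name are invented for illustration only.

# Hypothetical sketch: recording the results of a card sort as a concept
# hierarchy. The fault domain and all concept names are invented.

# First pass: the DE groups the concept cards into named clusters.
first_sort = {
    "ignition faults": ["no spark", "faulty electrode", "flame not detected"],
    "water-side faults": ["low pressure", "leaking valve", "airlock"],
}

# Second pass: the DE groups the cluster names themselves, giving a hierarchy.
second_sort = {
    "boiler faults": ["ignition faults", "water-side faults"],
}

def build_hierarchy(top, groupings):
    """Recursively expand a concept into the hierarchy implied by the sorts."""
    children = groupings.get(top, [])
    return {top: [build_hierarchy(c, groupings) for c in children]}

groupings = {**first_sort, **second_sort}
print(build_hierarchy("boiler faults", groupings))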

Knowledge Elicitation - non-interview techniques

Non-interview techniques:

Repertory grid (particularly the "laddered grid" technique).

Questionnaires. Especially useful when the knowledge is to be elicited from several different experts.

Knowledge Elicitation - interview techniques

It is standard practice to tape-record KE sessions.

For something like a problem-solving interview, one would wish to videotape it as well.

However, KEs should be aware of the costs this involves, in time and money - it can take as much as 15 hours of secretarial time to transcribe and edit a one-hour interview.

Knowledge analysis & representation

Simultaneously with the knowledge acquisition process, a knowledge analysis process takes place. The KE uses the data - the transcripts and protocols, etc - from the knowledge acquisition sessions to build a good model of the expertise that the DE is using to solve problems in the domain.

Knowledge analysis & representation

The raw data (taken from the DE) is converted into intermediate representations. These are structured representations of the knowledge, but not yet the sort of coded knowledge that can be put into the knowledge base.

This will improve the knowledge engineer's understanding of the subject;

Knowledge analysis & representation

This will probably provide knowledge in a form that can be shown to the DE, for criticism and correction;

This provides easily-accessible knowledge for future KEs to work from (knowledge archiving).

The intermediate representation is then converted into the knowledge representation formalism which is to be used in the KBS software.
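To make these two conversion steps concrete, here is a minimal, hypothetical sketch in Python: a sentence from an interview transcript is captured as a structured intermediate representation, and that structure is then emitted as an if-then rule in a made-up shell syntax. The medical content, attribute names and rule format are all invented for illustration.

# Hypothetical sketch of the two conversion steps. The protocol fragment,
# attribute names and rule syntax are invented for illustration.

# 1. Raw data: a sentence taken from a problem-solving interview transcript.
protocol_fragment = ("If the patient has a high temperature and a rash, "
                     "I usually suspect measles.")

# 2. Intermediate representation: structured, and readable by the DE,
#    but not yet in the formalism used by the KBS software.
intermediate = {
    "conditions": [("temperature", "high"), ("rash", "present")],
    "conclusion": ("diagnosis", "measles"),
    "certainty": "usually",            # still informal at this stage
    "source": "interview 3, line 42",  # provenance, for knowledge archiving
}

# 3. Target formalism: here, a production rule in a made-up shell syntax.
def to_rule(ir, name):
    conds = " AND ".join(f"{a} = {v}" for a, v in ir["conditions"])
    attr, val = ir["conclusion"]
    return f"RULE {name}: IF {conds} THEN {attr} = {val}"

print(to_rule(intermediate, "r1"))
# RULE r1: IF temperature = high AND rash = present THEN diagnosis = measles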

Knowledge validation

It is necessary to verify the knowledge against the knowledge source (the expert or document).

It is also necessary to validate the knowledge against known outcomes.

The objective is to produce knowledge of high integrity.
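As one concrete illustration of validating against known outcomes, the hypothetical sketch below runs a stand-in for the system's diagnostic reasoning over a set of recorded cases and reports how often its conclusion agrees with the documented outcome; the cases and the stand-in function are invented.

# Hypothetical validation sketch: compare the system's conclusions with
# known outcomes from recorded cases. Cases and diagnostic logic invented.

def system_conclusion(case):
    """Stand-in for running the knowledge base on one case."""
    if case["temperature"] == "high" and case["rash"] == "present":
        return "measles"
    return "unknown"

recorded_cases = [
    {"temperature": "high", "rash": "present", "outcome": "measles"},
    {"temperature": "high", "rash": "absent", "outcome": "influenza"},
    {"temperature": "normal", "rash": "present", "outcome": "allergy"},
]

agreements = sum(system_conclusion(c) == c["outcome"] for c in recorded_cases)
print(f"Agreement with known outcomes: {agreements}/{len(recorded_cases)}")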

Inference design

It may be necessary to design the software which will comprise the inference engine; or a particular shell may already have been specified.
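If no shell has been specified, even a simple inference engine must be written from scratch. The sketch below shows one common design, forward chaining: rules are repeatedly fired until no new facts can be derived. The rules and facts are invented for illustration.

# Minimal forward-chaining sketch with invented rules: each rule is a pair
# (set of condition facts, fact to conclude). Facts are plain strings.
rules = [
    ({"temperature high", "rash present"}, "possible measles"),
    ({"possible measles", "koplik spots"}, "measles confirmed"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                      # keep firing rules until nothing new
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({"temperature high", "rash present", "koplik spots"}, rules))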

Explanation and justification

An explanation facility, capable of explaining/justifying any of the reasoning and conclusions that the system produces, needs to be designed and programmed.
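One common way to support "how" explanations is to record, during inference, which rule derived each conclusion and from which facts, and to replay that record on request. The hypothetical sketch below extends the forward-chaining idea above with such a trace; again, the rules are invented.

# Hypothetical sketch of a "how" explanation: record which rule derived each
# fact during inference, then replay that record on request. Rules invented.
rules = {
    "r1": ({"temperature high", "rash present"}, "possible measles"),
    "r2": ({"possible measles", "koplik spots"}, "measles confirmed"),
}

def infer_with_trace(facts):
    facts, trace = set(facts), {}
    changed = True
    while changed:
        changed = False
        for name, (conds, concl) in rules.items():
            if conds <= facts and concl not in facts:
                facts.add(concl)
                trace[concl] = (name, conds)   # remember how it was derived
                changed = True
    return facts, trace

def explain(fact, trace):
    if fact not in trace:
        return f"'{fact}' was given by the user."
    name, conds = trace[fact]
    return f"'{fact}' was concluded by rule {name} from: " + ", ".join(sorted(conds))

facts, trace = infer_with_trace({"temperature high", "rash present", "koplik spots"})
print(explain("measles confirmed", trace))
print(explain("possible measles", trace))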

Computer-assisted knowledge elicitation

Since knowledge engineering skills, and hence knowledge engineers, are rare (see appendix), it would be desirable to automate the job.

i.e. to write an expert system to do knowledge engineering.

Computer-assisted knowledge elicitation

The state of the art in AI (especially in natural language processing) is not sufficiently advanced to permit fully-automated knowledge elicitation.

Computer-assisted knowledge elicitation

However, 'knowledge elicitation workbenches', or 'knowledge engineering environments', are commercially available, e.g. KEE, KnAcqTools, ETS, KRITON, AQUINAS; their principal use is to simplify the task of converting a protocol into frames, rules, etc., and inserting these structures into an expert system shell as soon as they are formulated.

Fully computerised knowledge acquisition

It might be thought that one could avoid using a domain expert altogether, by building a system that could extract knowledge, given facts about the domain.

This is the approach taken by machine learning systems:

"classic" machine learning systems such as ID3 (Quinlan, 1979) & AQ11 (Michalski & Chilausky, 1980) (a small induction sketch in this spirit is given after the list below);

Fully computerised knowledge acquisition

systems designed to provide knowledge for a particular system's knowledge base, e.g. META-DENDRAL, designed to discover rules for the rule-base in DENDRAL;

data mining systems; these do a similar job to classic machine learning systems, but work on a very large database of information;

sub-symbolic systems, i.e. neural nets and genetic algorithms. More about these in the last lecture in this course.
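To give a flavour of what a "classic" induction system such as ID3 does, the hypothetical sketch below builds a decision tree from attribute-value examples by choosing, at each node, the attribute with the highest information gain (ID3's selection criterion). The plant-disease examples are invented, and the sketch ignores refinements such as pruning and handling of noisy data.

# Hypothetical sketch in the spirit of ID3: induce a decision tree from
# attribute-value examples, choosing at each node the attribute with the
# highest information gain. The training examples are invented.
from collections import Counter
from math import log2

def entropy(examples):
    counts = Counter(e["class"] for e in examples)
    total = len(examples)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def best_attribute(examples, attributes):
    def remainder(attr):
        # Weighted entropy of the partition induced by attr;
        # minimising it maximises information gain.
        values = {e[attr] for e in examples}
        return sum(
            (len(sub) / len(examples)) * entropy(sub)
            for v in values
            for sub in [[e for e in examples if e[attr] == v]]
        )
    return min(attributes, key=remainder)

def id3(examples, attributes):
    classes = {e["class"] for e in examples}
    if len(classes) == 1:
        return classes.pop()                       # pure leaf
    if not attributes:
        return Counter(e["class"] for e in examples).most_common(1)[0][0]
    attr = best_attribute(examples, attributes)
    rest = [a for a in attributes if a != attr]
    return {attr: {v: id3([e for e in examples if e[attr] == v], rest)
                   for v in {e[attr] for e in examples}}}

# Invented plant-disease examples, loosely in the spirit of the AQ11 domain.
examples = [
    {"leaf spots": "yes", "stem rot": "no", "class": "leaf blight"},
    {"leaf spots": "yes", "stem rot": "yes", "class": "root rot"},
    {"leaf spots": "no", "stem rot": "yes", "class": "root rot"},
    {"leaf spots": "no", "stem rot": "no", "class": "healthy"},
]
print(id3(examples, ["leaf spots", "stem rot"]))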

Fully computerised knowledge acquisition

There are plenty of examples of machine learning systems producing formerly-unknown knowledge, and knowledge that was better than that of a domain expert.

Knowledge discovery

e.g. (1) META-DENDRAL produced rules about the behaviour of molecules in a mass spectroscope that were published in a chemistry journal as original contributions to the field;

Knowledge discovery

e.g. (2) AQ11 produced rules about how to diagnose diseases in soya bean plants.

AQ11's rules were correct 97% of the time. The domain expert's rules were correct 83% of the time; he abandoned his rules, and adopted AQ11's rules instead.


Fully computerised knowledge acquisition

This approach is particularly fruitful in 'knowledge-poor' domains, i.e. domains where not much expert knowledge is available.

However, it is a mistake to believe that one can do machine learning without a domain expert - at the very least, you need an expert to select the training examples, and to explain the domain terminology. You will probably also need them to identify the features of the examples which are likely to be relevant.