The implementation of medical guidelines which help contain rising medical costs can lead to an appreciable financial gain. In addition to addressing the need to reduce health care costs, the software also serves as a backup for the physician and pharmacist, who must make decisions in a reasonable amount of time but are presented with a large amount of patient information which might take hours to analyze. This backup role may have a direct effect on the patient's health. Finally, the process of computerizing the guidelines serves as a useful check on the guidelines, identifying situations (cases) the guidelines fail to address.
Expert systems developed in the early 1980's solved problems by chaining rules together and typically had a simple control structure and uniform representation of knowledge in the form of heuristic production rules. No account was taken of the inherent gradation or inexactness usually found in the real world and so the production rules frequently formed a complex hierarchy with exceptions to the rules, exceptions to the exceptions, and so forth, in an attempt to make black and white rules fit the boundaries of situations best described in shades of gray. Fuzzy sets were first proposed by Zadeh [18] in 1965, but fuzzy sets and fuzzy logic to this day remain controversial and misunderstood. While today's tools frequently provide mechanisms to track the certainty of conclusions, they do not always do so in a manner which makes sense.
The simplicity of control structure and knowledge representation in early expert systems also led to problems relating to knowledge acquisition, explanation, brittleness, and maintainability [2]. Even current expert systems languages and shells flexibly based on a combination of rules, frames, and logical inference do not distinguish between different types of reasoning. For example, one would expect that the task of designing a car would require significantly different reasoning strategies than the task of diagnosing a malfunction in a car. The tools and methodologies that the expert systems utilized are low level with respect to modeling the needed behavior. In essence, these systems resemble an assembly language for writing expert systems. While today's systems are useful, a perspective that directly addresses knowledge level control structures and the higher level issues of knowledge-based reasoning is needed for the next generation of AI development.
Expert systems have been applied in a variety of domains ([5], [11], and [16]). As mentioned in the introduction of the paper, our application domain is the computerization of Medical Practice Guidelines. Practice Guidelines are "proper indications for performing medical procedures, treatments and proper management of specific clinical problems" [17]. Guidelines are also cited as potential tools for reducing the costs of health care, for enhancing quality assurance, and for improving medical education.
Two areas of competence were selected for initial implementation and study. One area was delimited by symptom and the other by category of drug prescribed. Both were selected to provide significant financial impact, as the medications involved frequently cost many thousands of dollars per prescription. In both cases, there are established guidelines for treatment. The guidelines are written for the use of human specialists and their use rests on a combination of common-sense reasoning and educated expectations. The guidelines themselves are based on reasoning under uncertainty and thus include an estimate of the quality of the supporting evidence on which they are based.
The first segment of the project was concerned with prescriptions for febrile neutropenia (presumably) caused by infectious diseases. Febrile neutropenia consists of high fever combined with a very low number of white blood cells in the bloodstream. This particular symptom can be the result of a wide variety of disease organisms. The other segment of the project dealt with the prescription of CSFs, or Colony Stimulating Factors, for cancer patients being treated with chemotherapy [12]. CSFs act to stimulate the growth of colonies of white blood cells, and so are among the drugs which may be prescribed for febrile neutropenia due to infection. It is possible, then, that a patient undergoing chemotherapy may have an infectious disease resulting in febrile neutropenia and a CSF prescription, so there is a small overlap between the two cases. The two sets of guidelines are independent so that any such prescription must satisfy both sets of guidelines.
Not surprisingly, implementation of the guidelines involves some significant representational issues. These issues place extensive demands on the tools used for knowledge representation and reasoning. These demands are particularly strong in the case of the CSF-usage guidelines. Additional demands on the tool come from sources such as the need to access information required by the guidelines and the need to communicate effectively with the pharmacist.
Rather than dealing with known facts, the tool is required to deal with evidence of varying degrees of certainty, basing its conclusions on a preponderance of the evidence. Much of the knowledge expressed within the guidelines is naturally captured in terms of fuzzy sets[19], even where the definition of the fuzzy set is context dependent. At the same time, the effect of many of the guidelines is most naturally captured in terms of Bayesian reasoning. Bayesian reasoning is part of the standard conceptual toolbox of the medical and pharmaceutical experts in the project.
The tool must deal with multiple levels of representation, as the guideline's ontology ranges from simple measurements to the intent of human agents (requiring reasoning from effect to cause). The tool must handle complex temporal relationships dealing with both the interval and point-like aspects of time. The tool must also be capable of handling meta-knowledge for dealing with conflicts between different parts of the guidelines. It is desirable that the expression of knowledge within the tool be understandable by human experts. Finally, in order to be useful, the tool must be consistent with the task of efficiently accessing diverse data sources (heterogeneous data bases).
Later, we will provide brief examples of the way in which most of these demands are met; however, two matters require special attention. One of them concerns the relationship between statistically derived conclusions (Bayesian) and fuzzy reasoning; the other concerns the use of task control structures at the knowledge level.
Fuzzy sets are extremely useful in capturing categorizations performed by humans [7]. The guidelines abound with terms that are naturally understood as fuzzy sets. "Febrile neutropenia", "myelosuppressive regimens", "high-risk patients", "decreased immune function", "excessive dose reduction", "dose-intensive", and "modest decreases" are but a few examples of such terms. Reference is also made to fuzzy time intervals, as in "prolonged neutropenia". It would be effectively impossible to capture the knowledge embedded in the guidelines without a way of dealing with these fuzzy terms. It would be possible to define explicit boundaries, but this does not truly capture the knowledge. To use a non-medical example, we could say that being tall was evidence for being a good basketball player. We could define tall as 7' or above. Yet an individual who is 6' 11" also gives us reason for expecting a good basketball player, even if the evidence is not as strong as is the case for an individual 7' or above. Similarly, we could define febrile in terms of a threshold temperature, but a temperature a tenth of a degree below and a tenth of a degree above the threshold are not significantly different. The ability to capture imprecise terms must be a requirement of the tool.
The natural interpretation in terms of fuzzy sets is, however, not without problems. Fuzzy set theory and fuzzy reasoning rely on an analysis in terms of possibilities rather than probabilities[4]. There can be no theoretically sound method for freely converting between the two. Simply to provide a method of measuring the possibility an individual belongs to a set and then to use this possibility measurement as a probability (as can be done with some tools) is not meaningful. How, then, is a Bayesian reasoner to cope with fuzzy terms? We will return to this question.
The Knowledge Acquisition and Design Structuring (KADS) system, developed by researchers at the University of Amsterdam and by other European partners, attempts to generalize this process and to address the problems of model formulation head on. KADS is a methodology for the structured and systematic development of knowledge-based systems which aims to provide software engineering support for the knowledge-engineering process from the first exploratory interviews with experts to the development of an operational system. KADS is a manual, nonformal method which divides the development process into a suite of models that represent the transformation of knowledge from initial informal phases into an operational system. Knowledge is represented using four distinct layers that represent the different types of expertise that developers must represent within a knowledge-based system. Our tool is intended to support this approach by providing support for the representation of knowledge at each level. The four types of knowledge we distinguish are:
The inference layer describes the "canonical inferences" of the application area. The developer who uses KADS represents these inferences in terms of meta-classes and knowledge sources. These canonical inferences represent the low-level procedural meta-knowledge of the domain. The use of probabilistic or fuzzy reasoning belongs to the inference layer.
The third layer of a KADS model is the task layer. Here, the developer specifies the sequence in which the expert should invoke specific inference layer knowledge sources to achieve solutions to a given type of problem. The task layer consequently models the control knowledge required by the problem solver.
Finally, KADS allows for a strategy-layer with which developers can specify how the problem solver might wish to choose dynamically among alternative task-layer control sequences.
To return to our basketball example, we may have a normal distribution centered on 5' 5" for the general population and a skewed normal distribution centered on 6' 5" for tall individuals. Given these probability distributions, and taking the known percentage of individuals in the general population who are usefully tagged as tall as the prior probability of a specific individual being tagged as tall, we can then use the continuous case of Bayes' theorem to establish the probability that an individual of a given height is a member of the set of tagged individuals.
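The calculation just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the standard deviations, the 5% prior, the use of plain (unskewed) normal densities, and the function names are all hypothetical, and the general-population distribution is used as a stand-in for the individuals who are not tagged as tall.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_tagged_tall(height_in, prior_tall=0.05,
                     general_mean=65.0, general_sd=4.0,
                     tall_mean=77.0, tall_sd=3.0):
    """P(usefully tagged as tall | height), by Bayes' theorem on densities.

    All default parameters are assumed values chosen for illustration:
    65 in = 5' 5", 77 in = 6' 5"; the spreads and the prior are invented.
    """
    like_tall = normal_pdf(height_in, tall_mean, tall_sd)
    like_other = normal_pdf(height_in, general_mean, general_sd)
    numerator = prior_tall * like_tall
    return numerator / (numerator + (1.0 - prior_tall) * like_other)


if __name__ == "__main__":
    for label, inches in [("5' 5\"", 65), ("6' 11\"", 83), ("7' 0\"", 84)]:
        print(label, round(prob_tagged_tall(inches), 4))
```

Under these assumed parameters the membership probability rises smoothly with height, so an individual of 6' 11" receives a value only slightly below that of an individual of 7', rather than falling on the wrong side of a hard threshold.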
The probability, given certain characteristics, that an individual is a member of the tagged set is analogous, in a certain sense, to the possibility the individual is a member of a fuzzy set, but the probability, however derived, is an actual probability and can be treated as such. Specifically, if we have appropriate S and N values that let us adjust the probability of being a good basketball player based on being (or not being) a member of the set of people tagged as tall and for a given individual of certain height we have the probability that the individual is usefully tagged as tall, then adjusting the probability of being a good ballplayer is a straightforward process.
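One way to picture this adjustment is sketched below. It assumes that S and N act as odds multipliers (S applied when the individual is certainly in the tagged set, N when certainly not) and that intermediate membership probabilities are handled by interpolating between the two on a logarithmic scale. The interpolation formula, the function names, and the numeric values are assumptions made for illustration, not a description of the actual tool.

```python
def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def update(prior, p_tagged, s, n):
    """Adjust the prior probability of a hypothesis given uncertain evidence.

    p_tagged is the probability that the individual is usefully tagged
    (e.g. as tall). The effective odds multiplier is interpolated between
    N (p_tagged = 0) and S (p_tagged = 1) on a log scale; this particular
    interpolation is an assumption of the sketch.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not 0.0 <= p_tagged <= 1.0:
        raise ValueError("p_tagged must lie between 0 and 1")
    if s <= 0.0 or n <= 0.0:
        raise ValueError("S and N must be positive")
    multiplier = (s ** p_tagged) * (n ** (1.0 - p_tagged))
    return prob(odds(prior) * multiplier)


if __name__ == "__main__":
    # Hypothetical values: prior 0.2 of being a good player, S = 4, N = 0.5.
    for p in (0.0, 0.5, 1.0):
        print(p, round(update(0.2, p, s=4.0, n=0.5), 4))
```

With these illustrative values, certain membership doubles and redoubles the odds (S = 4), certain non-membership halves them (N = 0.5), and partial membership produces a result in between.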
One of the features of fuzzy set theory is that possibilities are not required to add to certainty. The analogous fact is that there is a useful dichotomy between the basis on which individuals are tagged and the actual set of individuals so tagged. Sets of tagged individuals may overlap or fail to tag any individual, even where the categories are both exhaustive and mutually exclusive. To return to our example, the world may be composed of short and tall individuals, but for an individual of precisely average height it may not be useful to tag the individual as either short or tall, because no useful conclusions can be drawn about such an individual. On the other hand, a case which occurs in the guidelines is the distinction between adult and child. For purposes of reasoning about allowable and required dosage levels, useful, but less than absolutely certain information, can be derived from the classification of a teenage boy with an age of 14 and average height and weight as a child AND as an adult. For this reason, the probability that such an individual is usefully tagged as a child and the probability that the individual is usefully tagged as an adult adds to more than 100%, even though child and adult are taken as mutually exclusive states.
A natural question might be, "If this approach simply mimics (more or less) the calculations based on fuzzy logic, why not simply use fuzzy logic as a basis of calculation?" Using an approach based on probabilities enables us to combine evidence from sources of information which are known to be dependent or independent in conformance with the well-understood laws of probability, thus obtaining sharper (but still dependable) discriminations than would be possible using fuzzy logic (which can be loosely interpreted as assuming that any two sources of information may be dependent) (see note 4).
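The difference can be made concrete with a small sketch. For two pieces of evidence with probabilities p and q, the laws of probability give p times q for the conjunction when the sources are independent; when nothing is known about their dependence, the conjunction can only be bounded between max(0, p + q - 1) and min(p, q). The familiar fuzzy "min" and-operator coincides with the upper end of that range. The function names below are hypothetical and chosen for this illustration only.

```python
def and_independent(p, q):
    """Conjunction of two independent pieces of evidence."""
    return p * q


def and_fuzzy_min(p, q):
    """The usual fuzzy and-operator; equals the upper probability bound."""
    return min(p, q)


def and_bounds(p, q):
    """Lower and upper bounds on P(A and B) when dependence is unknown."""
    return max(0.0, p + q - 1.0), min(p, q)


if __name__ == "__main__":
    p, q = 0.8, 0.5
    print("independent:", and_independent(p, q))
    print("fuzzy min:  ", and_fuzzy_min(p, q))
    print("bounds:     ", and_bounds(p, q))
```

When independence is actually known, using the product gives a single, more discriminating value instead of the most permissive end of the admissible range.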
The complexity of our application and the level of competence required demanded more than a small knowledge based system. Designing it as a pure first-generation rule based system would undoubtedly lead to the difficulties associated with this technology such as brittleness of the system, poor explanation capabilities, and difficulties in maintaining the knowledge base[3]. Instead we separate the domain modeling from the problem solving tasks.
We identify the following components for a task:
As stated in Section 1, it is desirable that the expression of knowledge within the tool be understandable by human experts. Among other things, this means that any rules or other representations of knowledge should be readable without a specialized knowledge of the tool (although general knowledge, such as a familiarity with Bayesian reasoning, can be assumed). In Section 2, the remaining requirements the application creates for the tool are divided into the four levels of knowledge that must be represented. They are:
The inference layer deals with the application of rules, Bayesian reasoning, and "fuzzy" sets (as represented by equivalence classes of sets of tagged individuals). Again, the ability to apply rules updating a database of facts is a primitive capability of CLIPS. Thus, the inference layer tools can concentrate on dealing with inexact knowledge.
The strategic and task layers are uniformly dealt with by modeling the task structure at the knowledge level. This might not be adequate if the strategic demands of the application were more severe, but in fact the application is such that the strategic demands, though existent, are relatively simple, flowing from a natural hierarchy of desired outcomes.
The suite of tools that support the application, then, is:
If febrile and neutropenic then Treatment = Therapeutic. S=23.4 N=.42
The table entry "%febrile(temperature)" indicates a function called febrile using a parameter temperature that returns the probability of the patient being usefully tagged as febrile based on his body temperature. The NLI finds this entry in the Structure Table that tells the NLI that febrile is a membership probability function with temperature as its input parameter. Then the NLI also gets other system information such as the module name to which the current rule belongs.
The NLI processes the input files one after another, maintaining a record of the modules to which the different rules belong. At the same time, it automatically numbers the rules for each module. It is required that the rules for each module be kept in one single file, though the rules for the total program may be spread across different files. This does not, however, prevent a file from holding more than one module. If the input file is of the form filename.inp, then the output file of CLIPS code will have the name filename.clp.
Another feature of the NLI is that it maintains an input data table which gives the function names and the function parameters. This table is an essential feature in that the interpreter uses this table to define the various CLIPS functions which would be needed to do the translation from natural language to CLIPS rules.
As CLIPS code is generated, the NLI needs to know the module name under which the rules are being kept. The following convention for automatic rule naming is adopted. Each rule is named with the concatenation of the module name followed by "rule" followed by a two digit number, starting from 01 and updated automatically after compiling each rule.
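The naming convention, together with the input-to-output file mapping described earlier, can be sketched in a few lines. This is only an illustration of the stated conventions; the class and function names and the example module names are hypothetical, and the NLI itself may be organized quite differently.

```python
from collections import defaultdict
from pathlib import Path


class RuleNamer:
    """Generates rule names of the form <module>rule<NN>, per module."""

    def __init__(self):
        # One independent counter per module, starting before 01.
        self._counters = defaultdict(int)

    def next_name(self, module):
        self._counters[module] += 1
        return f"{module}rule{self._counters[module]:02d}"


def output_path(input_path):
    """Map filename.inp to filename.clp."""
    return Path(input_path).with_suffix(".clp")


if __name__ == "__main__":
    namer = RuleNamer()
    print(namer.next_name("therapy"))   # first rule of a hypothetical module
    print(namer.next_name("therapy"))   # second rule of the same module
    print(namer.next_name("dosage"))    # numbering restarts per module
    print(output_path("neutropenia.inp"))
```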
The various control mechanisms currently represented are direct activate, call activate, and compete activate, as they are specified in Section 4. State information is maintained for each task, and each task is implemented as a CLIPS module.
Figure 6.1: The task structure of the Neutropenic guidelines

During the analysis of the neutropenic guidelines we identified 12 major tasks and 17 subtasks. The system collects all the relevant patient information, then determines the antibiotic therapy. Since a broad spectrum of drugs is required to be prescribed, determining the drugs and determining their appropriate dosage were identified as the two major tasks. Figure 6.1 shows the major tasks to be performed in accordance with the guidelines.
Each of the 12 major tasks can use subtasks to perform subgoals or make decisions as deemed necessary. For example, the Determine Antibiotic Therapy task decides upon the therapy to be used for the treatment of the patient. The four subtasks (Figure 6.2) will compete against each other and return confidence values to the main task, which will then decide the appropriate therapy to be used for treatment. The subtasks are:
Figure 6.2: Determining the Therapy
Dealing with this takes our full spectrum of tools. The object-oriented capabilities let us specify an ontology that meets our requirements. We have, for example, people, who have the property of having names, always first and last, with the options of middle names or initials, titles and the like. People divide into doctors, pharmacists, and patients. For our purposes, patients have most of the interesting properties; however, each prescription is a relation between exactly one doctor, one pharmacist, and one patient. For each prescription the doctor has an intent. In reality the doctor's intent may be complex, but we simply need to know whether or not the doctor's intent is to increase chemotherapy dose-intensity, which allows for a simple representation. Both the existence of unusually high dose-intensities and the absence of other rationales for prescribing CSFs are evidence of such motivations. Generally, the higher the dosage and the less likely that alternative rationales exist for the prescription, the stronger our reason for believing that the doctor has such an intent. Based on an internal table of normal dosage level by drug and/or regimen, we create a function that takes the dosage level and returns the probability (odds) that the particular dosage level is usefully tagged as high. Alternate reasons for prescribing CSFs will already be in probabilistic form, and so updating the odds using log-interpolation is straightforward. Since the evidence for high chemotherapy dose-intensity is known to be independent of that used in evaluating alternative motivations for CSF prescription, the independent form of the and-operator can be used.
The way in which problems of implementation are solved on each level can also be seen from the example of radiation therapy and CSF treatment. Here we use the object level to specify an ontology of events, which divide into duration events, whose start and stop date/time pairs are distinct, and instantaneous events, whose start and stop date/time pairs are identical. Both radiation therapy and the CSF treatment which results from a CSF prescription are duration events. Evaluating the interval of time between the end of radiation therapy and the onset of CSF treatment requires a probabilistic membership function which measures the likelihood that an interval is usefully tagged as sufficiently long. Combining this evidence with other evidence is, again, based on an approximation of Bayesian interpolation.
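A rough sketch of such an event ontology and interval test is given below. The class layout, the function names, the logistic shape of the membership function, and its midpoint of seven days are all assumptions introduced for illustration; they are not taken from the guidelines or from the tool.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """An event with start and stop date/time pairs."""
    name: str
    start: datetime
    stop: datetime

    def __post_init__(self):
        if self.stop < self.start:
            raise ValueError("stop precedes start")

    @property
    def is_instantaneous(self):
        # Instantaneous events have identical start and stop times.
        return self.start == self.stop


def gap_between(earlier, later):
    """Interval from the end of one event to the onset of another."""
    return later.start - earlier.stop


def prob_sufficiently_long(gap, midpoint_days=7.0, steepness=1.0):
    """Probability that a gap is usefully tagged as sufficiently long.

    A logistic curve with an assumed 7-day midpoint; purely hypothetical.
    """
    days = gap.total_seconds() / 86400.0
    return 1.0 / (1.0 + math.exp(-steepness * (days - midpoint_days)))


if __name__ == "__main__":
    radiation = Event("radiation therapy",
                      datetime(1995, 1, 1), datetime(1995, 1, 10))
    csf = Event("CSF treatment",
                datetime(1995, 1, 24), datetime(1995, 2, 3))
    gap = gap_between(radiation, csf)
    print(gap.days, round(prob_sufficiently_long(gap), 4))
```

The returned value is a probability and can therefore be combined with other evidence in the same way as any other membership probability.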
Suppose that these cases overlap? Since both of these examples give prohibitions on the use of CSFs there is no conflict. Other parts of the guideline specify when CSF usage is indicated. How can conflicts be resolved and unnecessary analysis avoided?
The meta-knowledge required to resolve conflicts between guidelines ranges from simple to complex. The relationships between the 11 major CSF guidelines and associated factors such as clinical research trials are loosely hierarchical and mostly implied. The guidelines for the use of CSFs in patients receiving concurrent chemotherapy and irradiation will normally override all other considerations. Other guideline conflicts require a distinction between the conditions under which Bayesian reasoning from available evidence may take place (i.e., is appropriate) and the conditions which influence this reasoning. The conditions which control when Bayesian reasoning can be applied are strategic (they select among alternative control sequences) and are known as strategic control facts. As an example, under the guidelines for secondary CSF administration, evidence for delay in chemotherapy caused by prolonged neutropenia is supporting evidence for the use of CSFs. The adjustment of the probability that the use of CSFs is justified, however, depends on first establishing the strategic control fact that the CSF administration takes place in the context of a secondary administration (of a specific chemotherapy), as opposed to primary administration or administration of CSFs for therapeutic purposes.
The guidelines for secondary CSF administration contain an example of context dependent "fuzzy" sets. In the absence of febrile neutropenia, prolonged neutropenia which causes excessive dose reduction in ongoing chemotherapy can be a supporting factor for the use of CSFs. The manner in which a dose reduction receives an "excessive" tag, however, depends on the context of where in the cycles of chemotherapy such dose reduction takes place. A dose reduction which, based on the amount of reduction for the specific drug, has a high probability of being in the set of excessive dose-reductions early in the treatment may have a very low probability of being in the set of excessive dose-reductions towards the end of chemotherapy.
[2] Jean-Marc David, Jean-Paul Krivine and Reid Simmons. Second generation expert systems: A step forward in knowledge engineering. In Jean-Marc David and Jean-Paul Krivine, editors, Second Generation Expert Systems. Springer-Verlag, New York, 1993.
[3] R. Davis. Where are we and where do we go from here? AI Magazine, 3:2:3-22, 1982.
[4] D. Dubois, J. Lang, and H. Prade. Fuzzy sets in approximate reasoning, Part 1: inference with possibility distributions. Fuzzy Sets and Systems, 4:141-202, 1991.
[5] R. Duda, J. Gaschnig, and P. Hart. Model design in the PROSPECTOR consultant system for mineral exploration. In D. Michie, editor, Expert Systems in the Microelectronic Age. Edinburgh University Press, 1980.
[6] M. Eschbach and J. Cunnyngham. The logic of fuzzy Bayesian influence. Presented at the International Fuzzy Systems Association Symposium of Fuzzy Information Processing in Artificial Intelligence and Operational Research, Cambridge, England, 1984.
[7] F. Esragh and E.H. Mamdani. A general approach to linguistic approximation. In E.H. Mamdani and B.R. Gaines, editors, Fuzzy Reasoning and Its Applications, Academic Press, London, 1981.
[8] R. Fikes and N. Nilsson. STRIPS: a new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2:189-208, 1971.
[9] J. Fox. Towards a reconciliation of fuzzy logic and standard logic. International Journal of Man-Machine Studies, 15:213-220, 1981.
[10] S. Haack. Do we need fuzzy logic? International Journal of Man-Machine Studies, 11:437-445, 1979.
[11] J. McDermott. R1: a rule-based configurer of computer systems. Artificial Intelligence, 19:1:39-88, 1982.
[12] L. L. Miller, et al., ASCO Ad Hoc Colony-Stimulating Factor Guideline Expert Panel. American Society of Clinical Oncology Recommendations for the Use of Hematopoietic Colony-Stimulating Factors: Evidence-Based, Clinical Practice Guidelines. Journal of Clinical Oncology, 12:11:2471-2508, 1994.
[13] Mark A. Musen. An overview of knowledge acquisition. In Jean-Marc David and Jean-Paul Krivine, editors, Second Generation Expert Systems. Springer-Verlag, New York, 1993.
[14] A. Newell and H. Simon. GPS: a program that simulates human thought. In Feigenbaum and Feldman, editors, Computers and Thought, 1963.
[15] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, Palo Alto, 1988.
[16] E. Shortliffe. Computer Based Medical Consultation: MYCIN. American Elsevier, 1976.
[17] S. H. Woolf. Practice guidelines: A new reality in medicine. I. Recent developments. Arch Intern Med, 150:1811-1818, 1990.
[18] L. A. Zadeh. Fuzzy sets. Information and Control, 8:338-353, 1965.
[19] L. A. Zadeh. A theory of approximate reasoning. In J. Hayes, D. Michie, and L. Mikulich, editors, Machine Intelligence 9. Halsted Press, New York, 1979.
1 Email: clifton@cs.uh.edu
2 Funded by a Texas Advanced Technology Grant (1/94 - 5/96).
3 See [5], [11], and [16] for examples.
4 Devotees of fuzzy logic may take legitimate umbrage with this simplified exposition. There is actually a strong and a weak or-operator and similarly two and-operators defined with respect to an exposition of fuzzy logic in terms of the Lukasiewicz implication. For comparison to our approach see [6].