Welcome to DELVAUX's Home Page


DELVAUX is an inductive learning environment for classification tasks that learns PROSPECTOR-style Bayesian rules from sets of examples. The objective of the DELVAUX project is to explore to what extent machine learning techniques can be used for rule generation in expert systems. DELVAUX uses a genetic algorithm to learn the best rule-set: the population consists of rule-sets, which generate offspring by exchanging rules, and fitter rule-sets produce offspring with a higher probability.

Pygmalion - Paul Delvaux (1939)



About the DELVAUX Project in Detail

 

Evolutionary Balance - James Rosenquist (1977)



Research Personnel



The Objective of the Project

Extracting rules from experts is complicated and time-consuming, especially if their decision-making relies on approximate reasoning techniques [2]. Consequently, it is attractive to explore whether machine learning techniques can facilitate this task.

The objective of the DELVAUX project is to explore to what extent machine learning techniques can be used for rule generation in expert systems. More specifically, the goal is to develop an inductive learning environment that learns Bayesian rules for classification tasks, such as classifying flowers or different kinds of glass.

DELVAUX uses a genetic algorithm to learn the best rule-set. The population consists of rule-sets, which generate offspring by exchanging rules, and fitter rule-sets produce offspring with a higher probability.
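The generational step described above can be sketched roughly as follows. This is a minimal illustration, not DELVAUX's actual implementation: the function names (select_parents, exchange_rules, next_generation), the fitness-proportional selection scheme, and the position-wise rule swap are all assumptions made for the example, since the text only states that rule-sets exchange rules and that fitter rule-sets reproduce with higher probability.

```python
import random
from typing import Any, Callable, List, Tuple

RuleSet = List[Any]  # a rule-set is modelled as a plain list of rules (assumption)


def select_parents(population: List[RuleSet],
                   fitness: Callable[[RuleSet], float],
                   rng: random.Random) -> Tuple[RuleSet, RuleSet]:
    """Pick two parents with probability proportional to fitness.

    Assumes fitness values are non-negative and not all zero.
    Sampling is with replacement, so the same rule-set may be picked twice.
    """
    weights = [fitness(rule_set) for rule_set in population]
    first, second = rng.choices(population, weights=weights, k=2)
    return first, second


def exchange_rules(a: RuleSet, b: RuleSet,
                   rng: random.Random) -> Tuple[RuleSet, RuleSet]:
    """Create two offspring by swapping rules position by position (hypothetical scheme)."""
    child_a, child_b = list(a), list(b)
    for i in range(min(len(a), len(b))):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def next_generation(population: List[RuleSet],
                    fitness: Callable[[RuleSet], float],
                    rng: random.Random) -> List[RuleSet]:
    """Build a new population of the same size from selected parents."""
    offspring: List[RuleSet] = []
    while len(offspring) < len(population):
        a, b = select_parents(population, fitness, rng)
        offspring.extend(exchange_rules(a, b, rng))
    return offspring[:len(population)]
```

In a setting like the one described here, the fitness function would plausibly be the fraction of training examples a rule-set classifies correctly, but that choice is likewise an assumption of this sketch.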


Bayesian Rules in DELVAUX

In this section, the semantics of the Bayesian classification rules that we try to learn are explained using an Iris-flower classification problem (a more detailed discussion of these topics is given in [1], [3], [5]): based on sepal length (SL), sepal width (SW), petal length (PL), and petal width (PW), it has to be decided whether a given Iris-flower is a Setosa, Versicolor, or Virginica.

A set of training examples is given to our learning environment. In the case of the Iris-flower classification problem, each example consists of the values of the four attributes and the class the example belongs to, as depicted in Figure 1 (e.g., the first example is a Setosa which has the value 0.3 for attribute SL, 0 for PL, 1.0 for SW, and 0.67 for PW). The objective of the DELVAUX learning environment is to take these training examples as input and to learn a Bayesian rule-set for the classification task. For example, the Bayesian rule-set in Figure 1 could be the result of the learning process that will be discussed in more detail in the next section.

 

Figure 1. Example of Training Data Set and Bayesian Rule-set



In general, rules used in our learning environment work with odds instead of probabilities, using the following conversion from probabilities to odds: O(H) := P(H) / (1 - P(H)). That is, a probability of 0.75 is converted to odds of 3 = 0.75 / 0.25. Initially, only the prior odds of being a particular flower are given; in our example we assume that O(Setosa)=1, O(Versicolor)=0.11, and O(Virginica)=0.67, that is, 50% of Iris-flowers are Setosa, 10% are Versicolor, and 40% are Virginica. These odds are then updated by the Bayesian rules in the light of new evidence (in our case, new evidence means interpreting the sepal length, sepal width, petal length, or petal width of a particular example), yielding the posterior odds for each class of Iris-flower. Finally, the class with the highest posterior odds is selected by the rule-set as the classification of the example. If the rule-set's classification agrees with the class of the example, we say that the rule-set classified the example correctly.
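The conversion between probabilities and odds can be checked with a few lines of Python. The function names prob_to_odds and odds_to_prob are chosen for this illustration only; the prior probabilities are the ones quoted in the text.

```python
def prob_to_odds(p: float) -> float:
    """Convert a probability to odds: O(H) = P(H) / (1 - P(H))."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must lie in [0, 1)")
    return p / (1.0 - p)


def odds_to_prob(o: float) -> float:
    """Inverse conversion: P(H) = O(H) / (1 + O(H))."""
    if o < 0.0:
        raise ValueError("odds must be non-negative")
    return o / (1.0 + o)


# Prior class probabilities taken from the text.
PRIOR_PROBS = {"Setosa": 0.5, "Versicolor": 0.1, "Virginica": 0.4}
PRIOR_ODDS = {cls: prob_to_odds(p) for cls, p in PRIOR_PROBS.items()}

if __name__ == "__main__":
    for cls, odds in PRIOR_ODDS.items():
        print(f"O({cls}) = {odds:.2f}")
```

Rounded to two decimals, the computed prior odds match the values 1, 0.11, and 0.67 given above.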

Let us explain how the rule-set depicted in Figure 1 classifies the first example (0.3, 0.0, 1.0, 0.67, Setosa). In Figure 1, S and N are odds-multipliers that measure the sufficiency and necessity of the left-hand-side condition for the right-hand-side conclusion. The multiplier S is used when the left-hand-side condition of the rule evaluates to true; N is used when it evaluates to false.
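The way S and N multipliers turn prior odds into a classification can be sketched as follows. The three rules below are hypothetical and are not the rules shown in Figure 1; their thresholds and multiplier values were invented for illustration. The sketch also assumes that each rule's condition is evaluated as strictly true or false and that the multipliers of all rules for a class are simply multiplied together, as is common for PROSPECTOR-style updating. The attribute values of the example follow the description of the first training example given earlier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Example = Dict[str, float]


@dataclass
class Rule:
    condition: Callable[[Example], bool]  # left-hand side
    conclusion: str                       # right-hand side (a class name)
    s: float                              # multiplier used when the condition is true
    n: float                              # multiplier used when the condition is false


def posterior_odds(rules: List[Rule], prior_odds: Dict[str, float],
                   example: Example) -> Dict[str, float]:
    """Multiply each class's prior odds by S or N for every rule concluding that class."""
    odds = dict(prior_odds)
    for rule in rules:
        multiplier = rule.s if rule.condition(example) else rule.n
        odds[rule.conclusion] *= multiplier
    return odds


def classify(rules: List[Rule], prior_odds: Dict[str, float],
             example: Example) -> str:
    """Return the class with the highest posterior odds."""
    odds = posterior_odds(rules, prior_odds, example)
    return max(odds, key=odds.get)


# Prior odds from the text.
PRIOR_ODDS = {"Setosa": 1.0, "Versicolor": 0.11, "Virginica": 0.67}

# Hypothetical rules, NOT the rule-set of Figure 1.
RULES = [
    Rule(lambda e: e["PL"] < 0.2, "Setosa", s=8.0, n=0.1),
    Rule(lambda e: e["PW"] > 0.5, "Virginica", s=2.0, n=0.5),
    Rule(lambda e: e["SW"] > 0.8, "Versicolor", s=0.5, n=1.5),
]

# First training example as described in the text.
FIRST_EXAMPLE = {"SL": 0.3, "PL": 0.0, "SW": 1.0, "PW": 0.67}

if __name__ == "__main__":
    print(posterior_odds(RULES, PRIOR_ODDS, FIRST_EXAMPLE))
    print(classify(RULES, PRIOR_ODDS, FIRST_EXAMPLE))
```

With these invented rules, all three conditions hold for the first example, so each class's prior odds are multiplied by the corresponding S value and Setosa ends up with the highest posterior odds.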