Problem Specification: Develop a adaptive learning
system that learns rule-sets / programs that have been evolved using evolutionary programming,
that process binary inputs from 8
binary input channels and take 4 different actions based on the inputs: 00, 01, 10, 11. Moreover, the classifier
might decide not to take any action at all, called "no action", or 'na' for short.
The classifier receives a reward / punishment
based on its output between 1 and 99 units. Training sequences are specified as a list of
input-payoff pairs; for example:
00001111 00=+04 01=+05 10=-98 11=-02
means that for the input 00001111 the outputs 00 and 01 are rewarded by
4 and 5 units; the outputs 10, 11 result in a punishment of
98 units, and 2 units respectively. If the classifier does not take any action, it receives a
reward of 0 for this action.
Project Philosopy: Each student only choses one approach of the two approaches outlined below. Students that use the same approach should share the same architecture and might share general software. However, each student implements his/her individual adaptive learning system!
Rule-based Approach: Design and implement a
classifier system that evolves a single rule-set (
Michigan approach) that consist of rules that are specified using
the following syntax:
"string of
length 8 consisting of 0,1 or *": "output (00,01,10, or 11)"
For example,
1*******:11
represents a rule that selects the
action 11 in the case that the input symbol starts with '1'.
Design a learning strategy, a conflict resolution strategy, a default action selection strategy, and a learning schedule of a classifier system of your own choice. A conflict resolution strategy decides which decision to make, if more than one action is found in the set of matching classifiers. A learning schedule determines when learning takes place. A default action selection strategy determines which action will be taken in the case that no classifier matches the input.
Genetic Programing Approach: Develop a genetic programming system that evolves a population of programs that have 8 input parameters (which are the values of the 8 input channel), and whose fitness is defined as the reward obtained for the last m training examples (where m is a parameter). First the architecture of the genetic progamming system has to be determined. The architecture determines the structure of program(s) evolved and how actions are selected by the classifier system. Second, a learning strategy and a learning schedule has to be chosen. The learning strategy determines how learning is accomplished: e.g. how programs are evolved, how their fitness is computed, what roles previously learned programs play in the learning process, etc. The learning schedule determines when learning takes place.
General Considerations: Be aware of the fact that your classfier system is only allowed to learn from past experience (violating this constraint will be severely penalized); that is, prescanning or preanalyzing the training file are not allowed in the project. Moreover, the classifier systems is not allowed to use knowledge concerning the feedback of actions that were not selected for the given input. This, however, does not prevent you from analyzing the feedback for taken actions for past inputs.
Report the results of running Benchmark1. Traning sequences in benchmarks consist of either 400 or 1200 examples; that is, each training sequence consists of 400/1200 input-payoff pairs. Test your system by giving the same training sequence twice (design your program in such a way that it can process the input file multiple times) to your system for each benchmark, and report the result of running your system for the benchmark. Also maintain a system bankaccount that keeps track of your system's performance which is initialized to 0 after seeing the 200th example (that is, the first 200 examples are ignored when evaluating learning performance). Report the value of this bankaccount after seeing the 300th, 600th, 900th, 1200th, 1500th,... example.
Write a report (approx. three single spaced pages) that outlines the architecture of your learning environment, describes the strategies and algorithms that were used). Additionally, prepare a second document that lists and summarizes the results of running your program for the first benchmark (run your system 4 times for the first benchmark, and report the results of the 4 runs). Your system will also be run for a second benchmark during your program demo. Be also prepared to explain the design and evolution of the program during the program demo.
Add a display option to your program that, everytime the program goes into learning mode, displays the current rule-set, parameters of individual classifiers belonging to the current rule-set (e.g. strength,...), your system's bankaccount, and visualizes the rule-set evolution process (at least in part). You can use any programming language to perform the project. Be also aware of the fact that time is limited, and the project has to be finished in about a month. These requirements have to be modified for the genetic progamming approach!