- Your initial experiment (just to test the neural-network software packages) is to show that a two-layer feed-forward Perceptron network (one input-unit layer, one output-unit layer, and one set of trainable weights between them) is not sufficient to compute the exclusive-OR function, that is, to separate (0, 1) and (1, 0) from (0, 0) and (1, 1). This network has two binary input units and one output unit that is trained to be positive for (0, 1) and (1, 0) and negative for (0, 0) and (1, 1). Then show that a three-layer network (add another layer of, say, two units between the input and output layers), which has two sets of adjustable weights, easily solves this problem.
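As a quick sanity check before training, the three-layer solution can even be wired by hand. The sketch below (an illustration, not the trained network the assignment asks for) uses two threshold hidden units that compute OR and AND of the inputs; the output unit fires when OR is on but AND is off, which is exactly the XOR cases.

```python
import numpy as np

def step(x):
    """Threshold unit: output 1 if net input >= 0, else 0."""
    return (x >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # the four input patterns

# Hidden layer: unit 1 computes OR (threshold 0.5), unit 2 computes AND
# (threshold 1.5); thresholds are encoded as negative biases.
W1 = np.array([[1, 1],
               [1, 1]])
b1 = np.array([-0.5, -1.5])

# Output unit: fires when OR is active and AND is not, i.e. exactly for XOR.
W2 = np.array([1, -2])
b2 = -0.5

h = step(X @ W1 + b1)
y = step(h @ W2 + b2)
print(y.tolist())  # [0, 1, 1, 0]
```

No single set of weights applied directly to X can produce this output, which is the point of the two-layer half of the experiment.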
- Your first real experiment is to create a simple model of the visual cortex by designing Hubel/Wiesel-like cells (see also the article *The Visual Cortex*) and to perform a character-recognition study that compares this model against a simple three-layer Perceptron. This experiment should be completed by the halfway point of the semester, and the results presented at our second in-class meeting. The model of the visual cortex is as follows:
- Input layer: a 20x20 retina of binary (0 or 1) units, a total of 400 units.
- Input patterns: initially use the subset of the uppercase alphabet that consists of only horizontal and vertical line segments, that is, the characters E, F, H, I, L, T. Let each character be represented by a 5x7 bit pattern as follows (* = 1, blank = 0):

    *****  *****  *   *    *    *      *****
    *      *      *   *    *    *        *
    *      *      *   *    *    *        *
    ****   ****   *****    *    *        *
    *      *      *   *    *    *        *
    *      *      *   *    *    *        *
    *****  *      *   *    *    *****    *

Allow a bit pattern to be placed at any position inside the 20x20 retina that is not adjacent to an edge; by shifting the 5x7 character pattern around the retina, there are 168 (12*14) possible non-retinal-edge positions for a character pattern.
- The second layer consists of Hubel/Wiesel-like line detectors for detecting horizontal and vertical lines. The horizontal line detectors are as follows (the vertical ones are the same but rotated 90 degrees):

     ---
    +++++
     ---

Each of these detector units has a threshold of 3. The input to a detector unit, when superimposed on a portion of the retina, is determined by adding the +'s and subtracting the -'s of the underlying retinal units. Then, if the input value is greater than or equal to the threshold, the detector output is activated (set to 1); otherwise it is inactive (set to 0). There are a total of 224 (14*16) horizontal and 224 vertical line detectors, one for each of the possible non-retinal-edge positions of a detector. Thus, there is a total of 448 second-layer units.
- The third layer is a hidden layer of 200 units.
- The output (fourth) layer consists of 6 units, one for each of the characters to be recognized.
- Input patterns for training and testing: For training, generate 40 random positions (of the possible 168) for each of the 6 characters, for a total of 240 (40*6) input patterns. For testing, similarly generate another 240 input patterns.
- Training: use the standard back-propagation algorithm to train the weights between layers 2 and 3 and between 3 and 4 to produce a 1 output for the output-layer unit corresponding to the input character.
- Testing: after the network is trained, run the test patterns to determine the recognition accuracy.
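The second-layer computation can be sketched directly. The snippet below is a hedged illustration: the detector kernel size (3x5) and the centering of the -'s relative to the +'s are my reading of the figure above, and the placement of the "L" bitmap is arbitrary. It places one character on the retina and evaluates a single horizontal line detector.

```python
import numpy as np

# 20x20 retina of binary units, initially blank.
retina = np.zeros((20, 20), dtype=int)

# 5x7 bit pattern for "L" (7 rows x 5 columns).
L_bitmap = np.array([[1, 0, 0, 0, 0]] * 6 + [[1, 1, 1, 1, 1]])
row, col = 6, 4                      # an arbitrary non-edge position
retina[row:row + 7, col:col + 5] = L_bitmap

# Horizontal line detector: +1 along the line, -1 above and below
# (assumed alignment: 3 -'s centered over and under 5 +'s).
kernel = np.array([[0, -1, -1, -1, 0],
                   [1,  1,  1,  1, 1],
                   [0, -1, -1, -1, 0]])

def detector_output(r, c, threshold=3):
    """Superimpose the kernel at (r, c), sum +'s minus -'s, and threshold."""
    patch = retina[r:r + 3, c:c + 5]
    return int((patch * kernel).sum() >= threshold)

on_line = detector_output(11, 4)   # + row covers the L's bottom stroke -> 1
off_line = detector_output(0, 10)  # empty region of the retina -> 0
```

Sweeping `detector_output` over all non-edge positions (and the 90-degree rotation of `kernel`) yields the 448 second-layer activations that feed the hidden layer.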

- Your second real experiment is to elaborate on the previous experiment as follows:
- Increase the alphabet to the full 26 uppercase letters, the first three as follows:

      *    ****    ***
     * *   *   *  *   *
    *   *  *   *  *
    *   *  ****   *
    *****  *   *  *
    *   *  *   *  *   *
    *   *  ****    ***

- The second layer of Hubel/Wiesel-like line detectors is doubled to include detectors at angles in increments of 45 degrees, that is, at angles of 0, 45, 90, and 135 degrees. We also add the following edge detectors:

    ---        +++
    +++  and   ---

These are oriented at the same angles as the line detectors. The edge detectors have a threshold of 2.
- The third layer is a hidden layer of 200 units.
- The output (fourth) layer is increased to 26 units, one for each of the characters to be recognized.
- The input patterns for training and testing are doubled to 480 each and generated randomly as above.
- Conduct experiments with both clean and noisy input. Random noise is easily added to the input-layer patterns: 2% noise (a random 8 of the 400 input units are flipped, changing a 1 to 0 or a 0 to 1), and likewise 5%, 10%, 15%, and 20% noise.
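The noise procedure above is easy to script. A minimal sketch (the function name and the RNG seed are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(pattern, fraction=0.02):
    """Return a copy of a flat binary pattern with `fraction` of its bits flipped."""
    noisy = pattern.copy()
    n_flip = round(fraction * pattern.size)            # 0.02 * 400 = 8 units
    idx = rng.choice(pattern.size, size=n_flip, replace=False)
    noisy[idx] ^= 1                                    # flip: 1 -> 0, 0 -> 1
    return noisy

clean = np.zeros(400, dtype=int)   # a blank 20x20 retina, flattened
noisy = add_noise(clean, 0.02)     # exactly 8 bits now differ from clean
```

Calling `add_noise` with fractions 0.05, 0.10, 0.15, and 0.20 gives the other noise levels; apply it to each test pattern before running it through the trained network.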

- Another experiment, a real-world classification problem, might be added later.