Human-Assisted Pattern Classification

Background - Research Issues

This section comes essentially verbatim from a National Science Foundation proposal [1].

We are interested in enhancing human-computer interaction in applications of pattern recognition where higher accuracy is required than is currently achievable by automated systems, but where there is enough time for a limited amount of human interaction. This topic has so far received only limited attention from the research community.

To make use of complementary human skill and automated methods in difficult segmentation and discrimination tasks, it is necessary to establish seamless communication between the two. The key to efficient interaction is the display of the automatically fitted model that allows the human to retain the initiative throughout the classification process. We believe that such interaction must be based on a visible model, because (1) high-dimensional feature space is incomprehensible to the human, and (2) the human is not necessarily familiar with the properties of the various classes, and cannot judge the adequacy of the current decision boundaries. This precludes efficient interaction with the feature-based classifier itself. Furthermore, human judgment of the adequacy of the machine-proposed prototypes, compared visually to the unknown object, is far superior to any classifier-generated confidence measure. In contrast to classification, deciding whether two pictures are likely to represent the same class does not require familiarity with all the classes.

We propose the protocol and research plan as follows:

  1. Rough machine segmentation of the unknown object, based on some visual model of the family of objects of interest, and on parameters extracted from "ground truth" segmentation of a set of training samples. The results of the segmentation are presented graphically by fitting the model to the object. This requires the formulation of a parametric model for each application of interest. It also requires statistical estimation of the model parameters from a set of already segmented samples.
  2. Human refinement of the object boundary -- as many times as necessary -- by means of an appropriate graphical interface. Rough segmentation suffices in many cases. More precise segmentation is necessary for machine classification only for more easily confused objects. Here, we must develop an efficient method of interactive refinement of an imperfect object boundary, based on our success with rapid segmentation of flowers in complex backgrounds. After each corrective action, it is desirable to have the machine attempt to improve the segmentation by taking advantage of an improved start state, as do CAVIAR [2] and IVS [3]. The method should also be applicable to efficient segmentation of the training samples for Item 1.
  3. Automated ranking of the classes in order of the most likely class assignment of the previously segmented unknown object. The classifier accepts automatically extracted feature vectors that may represent shape, color, texture, etc. This item needs a set of feature extraction algorithms, and a statistical classifier appropriate for the number of features and the number of training samples. Since both feature extraction and statistical classification have received much attention in the context of pattern recognition tasks of practical interest, we will use only off-the-shelf algorithms.
  4. Display of labeled exemplars of the top-ranked classes for human inspection. Provisions are made also for inspection of additional reference samples from any class. All statistical classification algorithms are capable of ranking the classes according to posterior probability. The display of Item 4 depends, however, on the complexity of the object and the size of the available canvas. In mobile systems with diminutive displays, provisions must be made for scrolling or paging the candidate images and the individual reference samples of each class. This item requires only familiarity with GUI design principles and tools.
  5. Final classification by selection of the exemplar most similar, according to human judgment, to the unknown object. We will also explore the possibility of extending this protocol to include interactive feature extraction. Because an exemplar of the correct class is brought into view sooner or later either by the automated classifier or by the operator scrolling through the display, Item 5 requires only a mouse or stylus click on an image of the chosen class.

Although the five steps are presented above in linear order, the operator may at any time return to refining the current segmentation instead of searching for additional exemplars in the ranked candidate list. Of course, in a well-designed system the correct class is usually near the head of the list. We have formalized the sequence of transitions of visual recognition states as a Finite State Machine.
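
The transitions described above can be sketched as a small finite state machine. This is only an illustration, not the formalization referred to in the proposal: the state names, the event names, and the TRANSITIONS table are assumptions chosen to mirror the five steps, with the operator allowed to go back from browsing exemplars to refining the segmentation.

```python
from enum import Enum, auto


class State(Enum):
    SEGMENT = auto()  # step 1: machine fits the model to the object
    REFINE = auto()   # step 2: human adjusts the object boundary
    RANK = auto()     # step 3: machine ranks the candidate classes
    BROWSE = auto()   # step 4: exemplars are displayed for inspection
    DONE = auto()     # step 5: operator has selected an exemplar


# Hypothetical event names: (current state, event) -> next state.
TRANSITIONS = {
    (State.SEGMENT, "model_fitted"): State.RANK,
    (State.REFINE, "boundary_adjusted"): State.SEGMENT,  # machine re-fits from the improved start
    (State.RANK, "ranking_ready"): State.BROWSE,
    (State.BROWSE, "refine"): State.REFINE,              # operator returns to segmentation
    (State.BROWSE, "select_exemplar"): State.DONE,
}


def run(events, start=State.SEGMENT):
    """Replay a sequence of events and return the list of visited states."""
    state = start
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state.name}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


# One session in which the operator refines the boundary once before choosing.
session = run([
    "model_fitted", "ranking_ready", "refine",
    "boundary_adjusted", "model_fitted", "ranking_ready", "select_exemplar",
])
```

Keeping the allowed transitions in a single table makes it easy to check that every path ends in a human selection, and that refinement always triggers a new machine fit and a new ranking.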

The combination of human and machine classification tends to yield a very low error rate. It is therefore advantageous to add each classified sample into the current reference set. In our experience, additional samples improve the estimates of segmentation and classification parameters even if a few of these samples are mislabeled in consequence of errors in classification.
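
As a rough sketch of how a classified sample might be folded back into the reference set, the example below ranks classes and then adds the confirmed sample. It assumes a nearest-class-mean ranking with Euclidean distance as a stand-in for the unspecified off-the-shelf statistical classifier; the ReferenceSet class, the two-dimensional feature vectors, and the class labels are all hypothetical.

```python
import math
from collections import defaultdict


class ReferenceSet:
    """Labeled feature vectors grouped by class (hypothetical structure)."""

    def __init__(self):
        self.samples = defaultdict(list)

    def add(self, label, features):
        """Store one labeled feature vector."""
        self.samples[label].append(list(features))

    def class_mean(self, label):
        """Component-wise mean of all vectors stored for a class."""
        vectors = self.samples[label]
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    def rank(self, features):
        """Return class labels ordered from nearest to farthest class mean."""
        distances = {
            label: math.dist(features, self.class_mean(label))
            for label in self.samples
        }
        return sorted(distances, key=distances.get)


refs = ReferenceSet()
refs.add("daisy", [0.9, 0.1])
refs.add("tulip", [0.1, 0.9])

unknown = [0.6, 0.4]
ranking = refs.rank(unknown)   # candidate order shown to the operator
confirmed = ranking[0]         # assume the operator confirms the first choice
refs.add(confirmed, unknown)   # the sample joins the reference set
```

After the update, the class mean of the confirmed class shifts toward the new sample, which is how later rankings would benefit from the larger reference set.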

[1] G. Nagy, C. Tappert, J. Zou, and S. Cha, "Combining human and machine capabilities for improved accuracy and speed in critical visual recognition tasks," NSF Proposal, 2003.
[2] G. Nagy and J. Zou, "Interactive Visual Pattern Recognition," Proc. Int. Conf. Pattern Recognition, vol. III, pp. 478-481, 2002.
[3] A. Evans, J. Sikorski, P. Thomas, J. Zou, G. Nagy, S.-H. Cha, and C. C. Tappert, "Interactive Visual System," Pace CSIS Tech. Rep., no. 196, 2003.


Students on this project will conduct experiments and will not be expected to develop software. The experiments will be conducted on the flower-recognition application called IVS (Interactive Visual System), which is being recovered by customer Amir Schur. It currently runs on a laptop PC and is being ported to an Android phone and possibly also an iPhone. The experiments will be conducted either on smartphones or via an Android Development Kit for Eclipse that has an Android simulator. Although not required, it would be good if one or more students on this project have an Android phone or iPhone.

Two experiments will be conducted:

  1. Preliminary experiment with the initial flower photos, to become familiar with the application and the experimental procedures.
  2. Final experiment with a large inventory of flower images.

The following is a list of sub-experiments to be conducted by the team for each of the two experiments:

  1. Manual identification of flowers by comparing photo images to flower guidebook photos
  2. Pure automatic identification of flowers using IVS without human assistance
  3. Interactive man-machine identification of flowers using each of the six types of human assistance:
    1. Determine the most dominant petal color. The user selects "Petal Color 1" and three boxes appear at the top of the screen. Click on the chosen color in the image and that color will appear in the first box. Click OK when satisfied with the selection, or CLEAR to restart the action.
    2. Determine the second most dominant petal color. The user selects "Petal Color 2" and continues as above.
    3. Determine the color of the stamen (center of the flower). The user selects "Color Stamen" and continues as above.
    4. Determine the number of petals. The user selects the "Petal Count" action and four boxes appear at the top of the screen. The first displays the petal count, which starts at zero. Click the "plus sign" box to increase the value, the "minus sign" box to decrease it, and the OK button when finished.
    5. Crop the flower. The user selects "Crop Area" and then selects a rectangle in the image to capture the vertical and horizontal bounds of the target flower.
    6. Outline a petal. The user selects "Petal Outline" and then captures the horizontal and vertical bounds of a target petal by pressing and tracing around a single typical petal. Click the OK button when satisfied with the outline, or the CLEAR button to restart the task.
    After all interactive steps are taken, click the IDENTIFY button to see the top three choices, the "Next 3" button to see the next three choices, and so on.

So that the human-subject experiments are not biased by task order, half of the subjects should start with the manual sub-experiment and then go on to the interactive one, and the other half should do the two in the reverse order. Also, the subjects should not be flower experts or familiar with flower classification.
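
One way to carry out the half-and-half split is sketched below. The function name, the order labels, and the use of a seeded random shuffle are assumptions made for illustration; any procedure that balances the two orders would serve.

```python
import random


def assign_orders(subject_ids, seed=0):
    """Randomly split subjects into two groups with opposite task orders.

    With an odd number of subjects, the interactive-first group gets one more.
    """
    shuffled = list(subject_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    half = len(shuffled) // 2
    orders = {}
    for index, subject in enumerate(shuffled):
        if index < half:
            orders[subject] = ("manual", "interactive")
        else:
            orders[subject] = ("interactive", "manual")
    return orders


# Hypothetical subject identifiers.
orders = assign_orders(["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8"])
manual_first = [s for s, order in orders.items() if order[0] == "manual"]
```

Shuffling before splitting keeps the order assignment independent of how the subjects happened to be recruited or listed.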

Two parameters will be assessed in each experiment: the average accuracy and the average time to make the identification decision. Accuracy will be reported cumulatively as a function of choice rank (first, second, ..., tenth): correct at the first choice, within the top two choices, within the top three choices, and so on. The manual sub-experiment might have only one choice per test photo.
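
The two measures can be computed as in the following sketch. The function names and the sample data are hypothetical; each entry of ranks is assumed to be the 1-based position of the correct class in the candidate list for one test photo, or None if the correct class was never reached.

```python
def cumulative_accuracy(ranks, max_choice=10):
    """Fraction of test photos whose correct class is within the top k choices,
    for k = 1 .. max_choice."""
    total = len(ranks)
    return [
        sum(1 for r in ranks if r is not None and r <= k) / total
        for k in range(1, max_choice + 1)
    ]


def mean_time(seconds):
    """Average time, in seconds, to reach an identification decision."""
    return sum(seconds) / len(seconds)


# Hypothetical results for five test photos.
ranks = [1, 2, 1, None, 3]
times = [12.0, 20.0, 9.5, 41.0, 17.5]

accuracy = cumulative_accuracy(ranks, max_choice=3)
average_seconds = mean_time(times)
```

For the manual sub-experiment, where there may be only one choice per photo, the same function applies with max_choice=1.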

The following are instructions for capturing additional flower images:

Fast Agile XP Deliverables

We will use the agile methodology, particularly Extreme Programming (XP), which involves small releases and fast turnarounds in roughly two-week iterations. Many of these deliverables can be done in parallel by different members or subsets of the team.

The following is the current list of deliverables (ordered by the date initiated, initiated date marked in bold red if programming involved, deliverable modifications marked in red, completion date and related comments marked in green, pseudo-code marked in blue):

  1. 2/1 Guidebook and additional photos:
  2. 2/1 Preliminary manual identification (this and the other preliminary experiments can be done in parallel with #1):
  3. 2/1 Preliminary Pure automatic identification:
  4. 2/1 Preliminary Interactive man-machine identification: