Interactive Sign/Face System

The goal of visual pattern recognition during the past fifty years has been the development of automated systems that rival or even surpass human accuracy, at higher speed and lower cost. Human interaction is considered, if at all, only to deal with "rejects" in the final step.

There are pronounced differences between human and machine cognitive abilities. Humans bring a rich set of contextual constraints and superior noise-filtering abilities to recognition, and they excel at gestalt tasks such as object-background separation. Computers, however, can store thousands of images and the associations between them, never forget a name or a label, and compute geometric moments and probability distributions.

These differences suggest that a system that combines human and machine abilities can, in some situations, outperform either one alone. This is the general goal of CAVIAR (Computer Assisted Visual InterActive Recognition).
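As a rough illustration of how such a combination might work, the Python sketch below lets the machine rank stored labels by feature distance and show only the closest few to a person, who makes the final choice. This is a hypothetical example, not the CAVIAR implementation: the function names `rank_candidates` and `interactive_recognize`, the `choose` callback, and the toy feature vectors are all assumptions made for illustration.

```python
import math

def rank_candidates(query, database, k=3):
    """Return the k labels whose stored feature vectors are closest to `query`."""
    scored = sorted(database.items(), key=lambda item: math.dist(query, item[1]))
    return [label for label, _ in scored[:k]]

def interactive_recognize(query, database, choose, k=3):
    """The machine proposes a short list; the human (`choose`) picks from it.

    `choose` receives the candidate list and returns one label,
    or None to reject every candidate.
    """
    candidates = rank_candidates(query, database, k)
    return choose(candidates)

# Toy, made-up feature vectors (hypothetical data, not from any real system).
database = {
    "stop sign": (1.0, 0.0),
    "yield sign": (0.8, 0.3),
    "shop sign": (0.0, 1.0),
    "exit sign": (0.1, 0.9),
}

# Stand-in for a person: accepts the machine's first suggestion.
def accept_first(candidates):
    return candidates[0] if candidates else None

result = interactive_recognize((0.9, 0.1), database, accept_first, k=2)
print(result)
```

In a real interactive system the `choose` step would be a person tapping one of the displayed candidates; the callback here only makes the sketch runnable without a user interface.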

Interesting research problems include:

Applications of CAVIAR are designed to recognize members of families of similar objects more accurately than machine vision and faster than most laypersons. CAVIAR draws on sequential pattern recognition, image databases, expert systems, pen computing, and digital camera technology. For a description of a rudimentary wild flower recognition system implemented on a laptop, which was the basis for last year's system, see the paper "Interactive Visual Pattern Recognition" by G. Nagy and J. Zou, Proc. Int. Conf. Pattern Recognition (ICPR), 2002. See also the paper produced by last year's team, "Interactive Visual System." This year we will work on one or both of two interesting applications: the recognition of signs (for example, shop or traffic signs in foreign languages such as Arabic or Chinese) and the recognition of human faces (for example, from an airport scene). For another related reference, see the CMU sign translation project demonstrated at ICPR 2002, which tries to solve a similar problem fully automatically.

This project concerns the development of the system on a handheld computer, together with the direct capture and processing of the associated photos. Last year we used the Sharp Zaurus, which takes a plug-in camera and runs Linux.