Keystroke Biometric System

Background

For general background information see Overview of Biometric Projects.

We have been exploring keystroke biometric applications. Keystroke biometric systems measure typing characteristics believed to be unique to an individual and difficult to duplicate. There is a commercial product, BioPassword, currently used for hardening passwords (short input) in existing computer security schemes. The keystroke biometric is one of the less-studied biometrics; researchers tend to collect their own data and no known studies have compared identification techniques on a common database. Nevertheless, the published literature is optimistic about the potential of keystroke dynamics to benefit computer system security and usability.

The keystroke biometric has several possible applications. One application is authentication (a binary accept/reject response: yes, you are the person you claim to be, or no, you are not). For example, password entry could be "hardened" by adding a keystroke authentication process as a second stage, following password matching, before allowing user entry. Thus, if the password is not entered in the normal keystroke pattern, the system could ask the user to reenter it. For example, a user on a particular occasion might be drinking a cup of coffee and entering the password uncharacteristically with one hand. The system could then reject the password, sending the user a message like, "Please reenter your password in your normal manner," and after, say, three tries, possibly reject the user entirely. On receiving the message, the user would likely put down the coffee cup and enter the password in his/her normal fashion in order to be accepted. Another use of such an authentication process is to authenticate students taking online tests by their keystroke patterns.
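The two-stage password check described above can be sketched as follows. This is only an illustration: the function names, the Euclidean distance measure, and the threshold value are assumptions made for the example, not the method used by BioPassword or by our system.

```python
from math import sqrt

MAX_TRIES = 3  # "say, three tries" in the description above


def distance(a, b):
    """Euclidean distance between two equal-length timing vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def authenticate(stored_password, enrolled_timing, attempts, threshold=50.0):
    """Return (accepted, messages) for a sequence of login attempts.

    Each attempt is a (typed_password, timing_vector) pair. The threshold
    and the distance measure are hypothetical, untuned choices.
    """
    messages = []
    for typed_password, timing in attempts[:MAX_TRIES]:
        # Stage 1: ordinary password matching.
        if typed_password != stored_password:
            messages.append("Incorrect password.")
            continue
        # Stage 2: compare the typing pattern with the enrolled pattern.
        if distance(timing, enrolled_timing) <= threshold:
            return True, messages
        messages.append("Please reenter your password in your normal manner.")
    # No attempt within the allowed number of tries passed both stages.
    return False, messages
```

In the coffee-cup scenario, the first attempt would carry the correct password with an unusual timing vector and draw the reentry message, and a second attempt typed normally would be accepted.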

A second application is to identify an individual from his/her keystroke pattern (one-of-n response). Suppose, for example, there has been a problem with the circulation of offensive emails from easily accessible desktops in a work environment. The security department wants to reduce this problem by collecting keystroke biometric data from all employees and developing a keystroke biometric identification system.
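A one-of-n decision can be sketched with a simple nearest-neighbor rule: the questioned sample is attributed to whichever enrolled subject has the closest enrollment sample. The nearest-neighbor rule, the Euclidean distance, and the names below are assumptions chosen for illustration; this page does not specify the classifier.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(enrolled, questioned):
    """Return the name of the subject whose enrollment sample is nearest.

    enrolled maps a subject name to a list of feature vectors;
    questioned is the feature vector of the sample to attribute.
    """
    best_name, best_dist = None, float("inf")
    for name, samples in enrolled.items():
        for sample in samples:
            d = euclidean(sample, questioned)
            if d < best_dist:
                best_name, best_dist = name, d
    return best_name
```

Note that a one-of-n system always returns one of the enrolled subjects, so it presumes the author of the questioned sample is among them.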

Over the last four years we have developed, in CSIS at Pace University, a keystroke biometric identification system (one-of-n response). We have presented experimental results at three external and several internal conferences. The next paragraph contains the abstract of our most recent conference paper; for the full paper see Keystroke Conference Paper (slides).

ABSTRACT: For long-text input of 650 keystrokes, a biometric system was developed for applications such as identifying perpetrators of inappropriate e-mail or fraudulent Internet activity. A Java applet collected raw keystroke data over the Internet, appropriate long-text-input features were extracted, and a pattern classifier made identification decisions. Experiments were conducted on a total of 118 subjects using two input modes - copy and free-text input - and two keyboard types - desktop and laptop keyboards. Results indicate that the keystroke biometric can accurately identify an individual who sends inappropriate email (free text) if sufficient enrollment samples are available and if the same type of keyboard is used to produce the enrollment and questioned input samples. For laptop keyboards we obtained 99.5% identification accuracy on 36 users, which decreased to 97.9% on a larger population of 47 users. For desktop keyboards we obtained 98.3% accuracy on 36 users, which decreased to 93.3% on a larger population of 93 users. Accuracy decreases significantly when subjects used different keyboard types or different input modes for enrollment and testing.
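To make the feature-extraction step in the abstract more concrete, here is a minimal sketch that reduces raw key press and release times to two summary features. Key durations and key-to-key transition times are commonly used in keystroke dynamics, but the event format and the two features below are assumptions for illustration and do not reproduce the feature set of our system.

```python
from statistics import mean


def extract_features(events):
    """Compute simple timing features from raw keystroke events.

    events is a list of (key, press_ms, release_ms) tuples ordered by
    press time (a hypothetical raw-data format).
    """
    if not events:
        raise ValueError("at least one keystroke event is required")
    # Duration: how long each key is held down.
    durations = [release - press for _, press, release in events]
    # Transition: release of one key to press of the next key.
    transitions = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return {
        "mean_duration": mean(durations),
        "mean_transition": mean(transitions) if transitions else 0.0,
    }
```

A transition time can be negative when the next key is pressed before the current one is released, which is common in fast typing.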

Project

Last semester we began authentication testing (accept/reject response) on the keystroke system, and we will continue this approach this semester; see Keystroke Technical Paper (fall 2007) and associated slides. Also see Authentication Technical Paper (fall 2007) and associated slides.

This semester the focus of the keystroke biometric work will be on longitudinal (over time) authentication experiments. One of the main criticisms of our earlier work has been that most of the data were collected over a short time interval, with some subjects entering all their data in one session, which tends to produce data that are overly consistent and not representative of how typing patterns vary over reasonable periods of time. To avoid this problem, most biometric studies gather data over periods of several weeks, months, or even years. Last semester we conducted some preliminary longitudinal experiments, collecting data (five samples in each of four quadrants per person) from four subjects (the team members) on three occasions at two-week intervals (see results in their technical paper). This semester you should work with your customer and subject matter expert to gather additional keystroke data sets (a data set from a subject contains five samples in each of the four quadrants) from

Then, run the data through the feature extractor to put it into the proper format, and pass it on to the authentication team for further processing to obtain performance results.

Midterm Checkpoint (our second classroom meeting).
For this checkpoint you should, as a minimum, have obtained two additional data sets (spaced at a two-week interval) from each team member (20 samples per data set, as indicated above). These data must also be run through the feature extraction program and output in the proper format, and the formatted feature data passed on to the authentication team for further processing to obtain performance results. Going beyond the minimum, data from an additional 5-20 new subjects would be good. Better would be to obtain additional data from last semester's four team members (interval of several months). Better yet would be to obtain additional data from 10-30 of Dr. Vallani's subjects (interval of about two years).
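Before passing data on, it may help to confirm that each data set is complete, that is, five samples in each of the four quadrants, 20 in all. The sketch below assumes the four quadrants are the combinations of input mode (copy, free) and keyboard type (desktop, laptop) from the abstract; the labels and the function name are hypothetical.

```python
from collections import Counter
from itertools import product

INPUT_MODES = ("copy", "free")      # assumed quadrant dimension
KEYBOARDS = ("desktop", "laptop")   # assumed quadrant dimension
SAMPLES_PER_QUADRANT = 5


def missing_samples(samples):
    """Return {quadrant: number of samples still needed}.

    samples is an iterable of (input_mode, keyboard) labels, one per
    collected sample. An empty result means the data set is complete.
    """
    counts = Counter(samples)
    return {
        quadrant: SAMPLES_PER_QUADRANT - counts[quadrant]
        for quadrant in product(INPUT_MODES, KEYBOARDS)
        if counts[quadrant] < SAMPLES_PER_QUADRANT
    }
```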