Keystroke Biometric System

We are exploring keystroke biometric applications. Keystroke biometric systems measure typing characteristics believed to be unique to an individual and difficult to duplicate. One commercial product, BioPassword, is currently used to harden passwords (short input) in existing computer security schemes. The keystroke biometric is one of the less-studied biometrics: researchers tend to collect their own data, and we know of no studies that compare identification techniques on a common database. Nevertheless, the published literature is optimistic about the potential of keystroke dynamics to benefit computer system security and usability.

The keystroke biometric has several possible applications. One is hardening password entry by adding a keystroke authentication step (accept/reject response) as a second stage after password matching, before the user is granted access. If the password is not typed in the user's normal keystroke pattern, the system could ask the user to reenter it. For example, a user might on a particular occasion be drinking a cup of coffee and typing the password uncharacteristically with one hand. The system could then reject the attempt with a message like, "Please reenter your password in your normal manner," and, after, say, three tries, possibly reject the user entirely. On receiving the message, the user would likely put down the coffee cup and enter the password in his/her normal fashion in order to be accepted.
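The two-stage check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature (one key-hold time per password character, in milliseconds), the distance measure, the 40 ms threshold, and the names `timing_distance`, `check_attempt`, and `login` are all hypothetical and are not taken from BioPassword or from our system.

```python
from statistics import mean

MAX_TRIES = 3  # "say, three tries" in the text above


def timing_distance(sample, template):
    """Mean absolute difference (ms) between two equal-length timing vectors."""
    if len(sample) != len(template):
        raise ValueError("timing vectors must have the same length")
    return mean(abs(s - t) for s, t in zip(sample, template))


def check_attempt(typed, timings, password, template, threshold_ms=40.0):
    """Classify one attempt as 'wrong_password', 'retry', or 'accept'."""
    # Stage 1: ordinary password matching.
    if typed != password:
        return "wrong_password"
    # Stage 2: keystroke authentication (assumed threshold on timing distance).
    if timing_distance(timings, template) > threshold_ms:
        return "retry"  # e.g. "Please reenter your password in your normal manner"
    return "accept"


def login(attempts, password, template, threshold_ms=40.0):
    """Accept if any of the first MAX_TRIES (typed, timings) attempts passes."""
    for typed, timings in attempts[:MAX_TRIES]:
        if check_attempt(typed, timings, password, template, threshold_ms) == "accept":
            return True
    return False  # user rejected entirely
```

In this sketch a one-handed entry with unusually long hold times produces a "retry" result, and a later entry typed in the normal manner is accepted, provided it arrives within the three tries.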

A second application is to identify an individual from his/her keystroke pattern. Suppose, for example, there has been a problem with the circulation of offensive emails from easily accessible desktops in a work environment. The security department wants to reduce this problem by collecting keystroke biometric data from all employees and developing a keystroke biometric identification system (one-of-n response). Given a questioned email typed at one of these desktops, such a system would attribute it to one of the n enrolled employees.
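A one-of-n decision can be illustrated with a simple nearest-neighbor rule. This is a sketch under assumptions: the choice of nearest neighbor with Euclidean distance, the two-element feature vectors, and the names `euclidean` and `identify` are illustrative only and should not be read as a description of any particular system's classifier.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(questioned, enrolled):
    """Return the enrolled user whose sample lies closest to the questioned one.

    enrolled maps a user name to a list of enrollment feature vectors.
    The answer is always one of the n enrolled users (one-of-n response).
    """
    best_user, best_dist = None, math.inf
    for user, samples in enrolled.items():
        for sample in samples:
            d = euclidean(questioned, sample)
            if d < best_dist:
                best_user, best_dist = user, d
    return best_user
```

Note that a one-of-n system of this kind always names some enrolled user; it has no "none of the above" response, which is what distinguishes it from the accept/reject authentication case.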

Over the last 2-3 years we have developed, in CSIS at Pace University, a keystroke biometric identification system (one-of-n response); for last year's project see the Projects page at CS616 - Spring 2006. We have presented experimental results at three external and several internal conferences. The next paragraph contains the abstract of our most recent conference paper; for the full paper see Keystroke Conference Paper (slides).

ABSTRACT: For long-text input of 650 keystrokes, a biometric system was developed for applications such as identifying perpetrators of inappropriate email or fraudulent Internet activity. A Java applet collected raw keystroke data over the Internet, appropriate long-text-input features were extracted, and a pattern classifier made identification decisions. Experiments were conducted on a total of 118 subjects using two input modes - copy and free-text input - and two keyboard types - desktop and laptop keyboards. Results indicate that the keystroke biometric can accurately identify an individual who sends inappropriate email (free text) if sufficient enrollment samples are available and if the same type of keyboard is used to produce the enrollment and questioned input samples. For laptop keyboards we obtained 99.5% identification accuracy on 36 users, which decreased to 97.9% on a larger population of 47 users. For desktop keyboards we obtained 98.3% accuracy on 36 users, which decreased to 93.3% on a larger population of 93 users. Accuracy decreases significantly when subjects use different keyboard types or different input modes for enrollment and testing.
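To give a feel for the feature-extraction step mentioned in the abstract, here is a minimal Python sketch. The event format (key, press time, release time in milliseconds), the two feature families (mean hold duration per key and mean press-to-press latency per key pair), and the name `extract_features` are assumptions made for illustration; the actual feature set is described in the conference paper.

```python
from collections import defaultdict
from statistics import mean


def extract_features(events):
    """Summarize raw keystroke events into simple timing features.

    events: list of (key, press_ms, release_ms) tuples in typing order.
    Returns two dicts: mean hold duration per key, and mean
    press-to-press latency per consecutive key pair.
    """
    durations = defaultdict(list)
    transitions = defaultdict(list)
    for key, press, release in events:
        durations[key].append(release - press)
    # Pair each event with its successor to measure key-to-key latency.
    for (k1, p1, _), (k2, p2, _) in zip(events, events[1:]):
        transitions[(k1, k2)].append(p2 - p1)
    return (
        {k: mean(v) for k, v in durations.items()},
        {k: mean(v) for k, v in transitions.items()},
    )
```

Long text input matters here because each key and key pair must occur often enough for its mean to be stable; with short input many entries would rest on a single observation.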

The next phase of the keystroke biometric effort will center on fallback, that is, on answering the question "What do you do if you have an incomplete or insufficient data set?" Two new fallback models have been developed, one based on touch-typing principles and the other on statistical analysis of the data. These models will be implemented and tested, and the results reported and analyzed in detail. Specifically, we will identify when fallback occurs, which path was taken, and which node/level contributed the most in terms of information gain.
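The general idea of fallback can be sketched as follows. This is a hypothetical illustration, not either of the two models: the three-sample threshold, the tiny key-to-finger hierarchy in `PARENT`, and the names `under` and `mean_with_fallback` are all assumptions. When a key has too few samples, the sketch pools samples from progressively broader groups and reports which level was used.

```python
from statistics import mean

MIN_SAMPLES = 3  # assumed minimum number of samples for a usable estimate

# Illustrative fragment of a touch-typing hierarchy: key -> finger -> all keys.
PARENT = {
    "e": "left_middle", "d": "left_middle", "c": "left_middle",
    "i": "right_middle", "k": "right_middle",
    "left_middle": "all_keys", "right_middle": "all_keys",
}


def under(key, node):
    """True if key equals node or lies below node in the hierarchy."""
    while key != node and key in PARENT:
        key = PARENT[key]
    return key == node


def mean_with_fallback(key, samples, min_samples=MIN_SAMPLES):
    """Return (mean duration, level used) for a key.

    samples maps each key to its observed durations. If the key has fewer
    than min_samples observations, fall back to its parent group, and so on.
    At the root, whatever data exists is used even if still below the minimum.
    """
    node = key
    pooled = list(samples.get(key, []))
    while len(pooled) < min_samples and node in PARENT:
        node = PARENT[node]
        pooled = [d for k, ds in samples.items() if under(k, node) for d in ds]
    if not pooled:
        raise ValueError("no data available at any level")
    return mean(pooled), node
```

Returning the level alongside the estimate is what makes it possible to record when fallback occurred and which path was taken.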

Beyond the keystroke biometric effort, the fallback models will be refined and developed into algorithms that can be applied to any setting where data are insufficient or missing. The broader goal is decision making with incomplete or imperfect information. For example, in marketing research, the age-old problem of "survey non-response," where a respondent omits some of his/her answers, is an excellent candidate for this work. Other areas where it might apply include national security, general operations management, attrition prediction in industries such as banking and telecom, and stock outages in industries such as retail.

Team members on this project will definitely get a publication that can appear on their resumes.