Keystroke Biometric
Intrusion Detection

Background

According to Wikipedia (January 2011), "Keystroke logging (often called keylogging) is the action of tracking (or logging) the keys struck on a keyboard, typically in a covert manner so that the person using the keyboard is unaware that their actions are being monitored." Parents often install keylogger software on the home computer so they can track what their kids do on the computer and particularly what websites they visit.

Some keylogger software records not only the sequence of keys struck but also their timing information, that is, when each key is pressed and when it is released. If this timing information is sufficiently accurate, it can be used for biometric purposes.
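To make the idea of timing information concrete, here is a minimal Python sketch of two timing measurements commonly used in keystroke dynamics: dwell time (how long a key is held down) and flight time (the gap between releasing one key and pressing the next). The KeyEvent record, its field names, and the sample timestamps are assumptions made for illustration. They are not the Fimbel keylogger's output format, and these two measurements are not necessarily the feature set PKBS uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """Hypothetical record of one keystroke (not the Fimbel keylogger format)."""
    key: str
    press_ms: float    # time the key was pressed, in milliseconds
    release_ms: float  # time the key was released, in milliseconds


def dwell_times(events: List[KeyEvent]) -> List[float]:
    """Dwell time: how long each key was held down."""
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events: List[KeyEvent]) -> List[float]:
    """Flight time: release of one key to press of the next.

    The value is negative when the next key is pressed before the
    previous one is released (overlapping keystrokes).
    """
    return [b.press_ms - a.release_ms for a, b in zip(events, events[1:])]


# Made-up timestamps for typing "the"; the "e" is pressed before "h" is released.
events = [
    KeyEvent("t", 0.0, 95.0),
    KeyEvent("h", 140.0, 230.0),
    KeyEvent("e", 210.0, 300.0),
]

print(dwell_times(events))
print(flight_times(events))
```

Measurements like these only reflect typing behavior if the logged timestamps are precise, which is why the accuracy of the keylogger's timing matters.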

Over the last seven or so years we have developed the powerful Pace University Keystroke Biometric System (PKBS). This system was developed for text input applications like online exams requiring, for example, short text answers to questions. This system requires users to key text into a Java applet to produce PKBS input files.

Recently we have gone beyond text input to determine the utility of PKBS for arbitrary types of keyboard input: text, spreadsheet, browser, etc. Initial work on this problem has been described in a Research Day 2011 paper and a Research Day 2012 paper. These projects used the keylogger developed by Eric Fimbel.

Our keystroke biometric system has recently made the news; see the NYT article and the Westchester Magazine article featuring last semester's student Logan Romm.

Project

This semester's project involves collecting keyboard input and running experiments -- no programming is involved. We will continue to use the Fimbel keylogger to obtain arbitrary keyboard input for keystroke biometric analysis.

The application of interest is intrusion detection. Intrusion detection, by which we mean the discovery that someone other than the authentic user is using a computer, has become of interest to various organizations, including the US Government. Given the scenario where an authentic user leaves a system unlocked and unattended, the question becomes how quickly and how accurately the unauthorized use of that computer can be detected. Our approach is to detect the intruder by analyzing the intruder's keystroke input, which would presumably differ substantially from that of the authorized user.
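As a rough illustration of the detection idea, the Python sketch below builds a profile from an authorized user's keystroke feature vectors and flags a new vector that deviates strongly from it. This is a simple z-score threshold chosen for illustration, not the classifier used by PKBS. The function names, the two-feature vectors, the sample values, and the threshold of 2.0 are all assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

Profile = List[Tuple[float, float]]  # (mean, standard deviation) per feature


def build_profile(samples: Sequence[Sequence[float]]) -> Profile:
    """Summarize the authorized user's samples, one (mean, stdev) per feature."""
    columns = list(zip(*samples))
    return [(mean(col), stdev(col)) for col in columns]


def anomaly_score(profile: Profile, vector: Sequence[float]) -> float:
    """Mean absolute z-score of a new feature vector against the profile."""
    scores = [
        abs(x - m) / s if s > 0 else 0.0
        for (m, s), x in zip(profile, vector)
    ]
    return mean(scores)


def is_intruder(profile: Profile, vector: Sequence[float],
                threshold: float = 2.0) -> bool:
    """Flag the input when it deviates from the profile by more than the threshold."""
    return anomaly_score(profile, vector) > threshold


# Made-up enrollment data: [average dwell ms, average flight ms] per typing sample.
enrolled = [[95.0, 45.0], [100.0, 50.0], [105.0, 55.0]]
profile = build_profile(enrolled)

print(is_intruder(profile, [101.0, 52.0]))  # close to the enrolled samples
print(is_intruder(profile, [140.0, 90.0]))  # far from the enrolled samples
```

The threshold controls the trade-off raised above: a lower value detects intruders on less evidence but rejects the authorized user more often, while a higher value does the opposite.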

The experiments will involve running data through PKBS, obtaining results, and analyzing the results.

This semester we will collect four types of keystroke/mouse input data (detailed instructions will be provided by Ned Bakelman):

  1. Text data: from applications like Microsoft Word and email
  2. Spreadsheet data: from applications like Microsoft Excel
  3. Browser data: from Microsoft Internet Explorer, Firefox, etc., using applications like Google
  4. Open data: typical computer keyboard input activity -- email, IM, Facebook, web activity, etc. (any of types 1-3 above, and try to capture a variety)

Code and Instructions

All code and instructions will be provided by the customer, Ned Bakelman.