Keystroke Biometric
Intrusion Detection

Background

According to Wikipedia (January 2011), "Keystroke logging (often called keylogging) is the action of tracking (or logging) the keys struck on a keyboard, typically in a covert manner so that the person using the keyboard is unaware that their actions are being monitored." Parents often install keylogger software on the home computer so they can track what their kids do on the computer and particularly what websites they visit.

Some keylogger software records not only the sequence of keys struck but also their timing information, that is, when each key is pressed and when it is released. If this timing information is sufficiently accurate, it can be used for biometric purposes.
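As an illustration of how press and release timestamps can become biometric measurements, the sketch below derives two timing quantities commonly used in keystroke dynamics: dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). The KeyEvent record, the millisecond units, and the sample values are assumptions made for this example; they are not the Fimbel keylogger's file format or the actual PKBS feature set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """Hypothetical record for one keystroke (not the Fimbel or PKBS format)."""
    key: str
    press_ms: float    # time the key was pressed, in milliseconds
    release_ms: float  # time the key was released, in milliseconds


def dwell_times(events: List[KeyEvent]) -> List[float]:
    """Time each key was held down (release minus press)."""
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events: List[KeyEvent]) -> List[float]:
    """Gap between releasing one key and pressing the next.

    A negative value means the next key was pressed before the
    previous one was released (overlapping keystrokes).
    """
    return [nxt.press_ms - cur.release_ms
            for cur, nxt in zip(events, events[1:])]


# Made-up sample: the user types "pac".
sample = [
    KeyEvent("p", 0.0, 95.0),
    KeyEvent("a", 140.0, 230.0),
    KeyEvent("c", 310.0, 390.0),
]

print(dwell_times(sample))   # [95.0, 90.0, 80.0]
print(flight_times(sample))  # [45.0, 80.0]
```

Statistics such as the mean and standard deviation of these values per key or key pair are one plausible kind of input for a classifier, which is why the accuracy of the recorded timestamps matters.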

Over the last seven or so years we have developed the Pace University Keystroke Biometric System (PKBS). The system was designed for text-input applications such as online exams that require, for example, short text answers to questions. Users key text into a Java applet, which produces the PKBS input files.

Recently we have gone beyond text input to determine the utility of PKBS for arbitrary types of keyboard input: text, spreadsheet, program execution, etc. Initial work on this problem is described in a Research Day 2011 paper from the Spring 2011 project and a technical paper from the Fall 2011 project. These projects used the keylogger developed by Eric Fimbel.

The application of interest is intrusion detection, by which we mean discovering that someone other than the authentic user is using a computer. This problem has become of interest to various organizations, including the US Government. Given a scenario in which an authentic user leaves his system unlocked and unattended, the question becomes how quickly and how accurately the unauthorized use of that computer can be detected. Our solution is to detect the intruder by analyzing his keystroke input, which would presumably differ substantially from that of the authorized user.

Project (no programming)

This project involves collecting different kinds of keyboard input samples and then running experiments on them for analysis. It will take time to learn how to use the system and its various components; however, the previous teams' documentation should help with this. We will continue to use the Fimbel keylogger to obtain arbitrary keyboard input for keystroke biometric analysis.

The experiments will involve running data through PKBS, obtaining results, and analyzing the results. This includes the following steps:

  1. Install the Fimbel keylogger on machines to collect the data files.
  2. Convert the Fimbel keylogger sample files into the PKBS input format.
  3. Prepare the training and testing files for input to PKBS. This involves running the Feature Extractor program to produce a feature file and then separating that file into training and testing files.
  4. Run the training and testing files through PKBS to obtain an output file.
  5. Run the PKBS output file through the BAS Calculator program to obtain the False Accept Rate (FAR), False Reject Rate (FRR), and overall performance.
  6. Run the BAS Calculator output through the ROC Curve Data Generator program to obtain Receiver Operating Characteristic (ROC) curves.
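To make the quantities in steps 5 and 6 concrete, the sketch below computes FAR and FRR at a given decision threshold and then sweeps the threshold to produce the points of an ROC curve. It is not the BAS Calculator or the ROC Curve Data Generator. It assumes, purely for illustration, that each test comparison yields a distance score (lower means more similar to the authentic user) and that a sample is accepted when its score is at or below the threshold. The function names and the sample scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) for one threshold.

    Assumption: scores are distances, so a sample is accepted
    when its score is <= threshold.
    """
    # FAR: fraction of impostor samples wrongly accepted.
    false_accepts = sum(1 for s in impostor if s <= threshold)
    # FRR: fraction of genuine samples wrongly rejected.
    false_rejects = sum(1 for s in genuine if s > threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


def roc_points(genuine: Sequence[float], impostor: Sequence[float],
               thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, FAR, FRR) for each threshold in the sweep."""
    return [(t, *far_frr(genuine, impostor, t)) for t in thresholds]


# Made-up distance scores for illustration only.
genuine_scores = [0.10, 0.20, 0.35, 0.60]   # authentic user's samples
impostor_scores = [0.30, 0.55, 0.70, 0.90]  # intruder samples

for t, far, frr in roc_points(genuine_scores, impostor_scores, [0.0, 0.4, 1.0]):
    print(f"threshold={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold accepts more samples, so FAR rises while FRR falls. Plotting one rate against the other across the sweep traces the trade-off that an ROC curve summarizes.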

This semester we will collect three types of keystroke/mouse input data (detailed instructions will be provided by Ned Bakelman):

  1. Spreadsheet data: from Microsoft Excel
  2. Browser data: from Microsoft Internet Explorer, Firefox, etc., using web applications such as Google
  3. Simulated intruder input: to be described by Ned Bakelman

Code and Instructions

All code and instructions will be provided by the customer, Ned Bakelman, and/or Vinnie Monaco.