Feature Data Format

Format of the Feature Data Files

The biometric feature data produced by the mouse movement, stylometry, and keystroke projects will be output as a file or a corresponding spreadsheet. Within the file, the fields of a record are comma-delimited and the items within a field are slash-delimited.

Feature Value Normalization

The following pseudo-code normalizes the feature values into the range 0 to 1.

for i = 1 to number_of_features
     min = feature_value (i,1) {initialize both to the first sample}
     max = feature_value (i,1)
     for j = 2 to number_of_samples {find min and max}
          if feature_value (i,j) < min then min = feature_value (i,j)
          if feature_value (i,j) > max then max = feature_value (i,j)
     end
     for j = 1 to number_of_samples {normalize}
          if max > min then feature_value (i,j) = (feature_value (i,j) - min) / (max - min)
          else feature_value (i,j) = 0 {constant feature; 0 is an assumed convention that avoids dividing by zero}
     end
end
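
As a rough illustration, the pseudo-code above can be sketched in Python. The function name normalize_features and the list-of-lists layout (one inner list per sample) are assumptions made for this example, not part of the project specification. A feature whose minimum equals its maximum would otherwise cause a division by zero; mapping it to 0.0 is likewise an assumed convention.

```python
def normalize_features(samples):
    """Min-max normalize each feature (column) into the range 0-1.

    samples[j][i] holds feature i of sample j (hypothetical layout).
    Returns a new list of lists; the input is left unchanged.
    """
    if not samples:
        return []
    n_features = len(samples[0])
    normalized = [list(row) for row in samples]
    for i in range(n_features):
        column = [row[i] for row in samples]
        lo, hi = min(column), max(column)
        span = hi - lo
        for j, row in enumerate(normalized):
            # Assumption: a constant feature (span == 0) is mapped to 0.0.
            row[i] = (column[j] - lo) / span if span else 0.0
    return normalized


raw = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
print(normalize_features(raw))  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

Taking the minimum and maximum directly from the data avoids relying on sentinel values that an unusually large measurement could exceed.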

Examples of Feature Data Files

Mouse movement biometric data example

Mouse movement biometric data example created September 2007
4
MaryJones/F/26, left-handed, Dell mouse, fixed 10-button sequence/used right hand, 2, 0.13668, 0.53375
MaryJones/F/26, left-handed, Dell mouse, fixed 10-button sequence/used right hand, 2, 0.14378, 0.56275
JohnSmith/M/27, right-handed, optical mouse, random 10-button sequence/used right hand, 2, 0.53628, 0.43865
JohnSmith/M/27, right-handed, optical mouse, random 10-button sequence/used right hand, 2, 0.43628, 0.53865

Stylometry biometric data example

Stylometry biometric data example created September 2007
6
MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.13668, 0.53375
MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.14378, 0.56275
JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.53628, 0.43865
JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.43628, 0.53865
ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.39734, 0.92862
ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.49924, 0.98861

Keystroke biometric data example

Keystroke biometric data example created September 2007
8
MaryJones/F/08-01-1981, left-handed, Dell laptop, copy task, 2, 0.13668, 0.53375
MaryJones/F/08-01-1981, left-handed, Dell laptop, copy task, 2, 0.14378, 0.56275
JohnSmith/M/06-01-1980, right-handed, Dell laptop, email task, 2, 0.53628, 0.43865
JohnSmith/M/06-01-1980, right-handed, Dell laptop, email task, 2, 0.43628, 0.53865
JohnSmith/M/04-21-1982, left-handed, Dell desktop, copy task, 2, 0.88321, 0.43464
JohnSmith/M/04-21-1982, left-handed, Dell desktop, copy task, 2, 0.78721, 0.33262
ChrisHill/F/02-04-1983, right-handed, Dell desktop, email task, 2, 0.39734, 0.92862
ChrisHill/F/02-04-1983, right-handed, Dell desktop, email task, 2, 0.49924, 0.98861
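
The three examples above appear to share one layout: a description line, a line giving the number of records, and then one record per line, with four descriptive fields followed by the feature count and the feature values. A minimal Python sketch of a reader for that layout follows. The layout is inferred from the examples only; the function name parse_feature_file, the n_descriptive parameter, and the dictionary keys are hypothetical. The "?" marker described in the notes is read as None.

```python
def parse_feature_file(text, n_descriptive=4):
    """Parse a feature data file laid out like the examples.

    Assumed layout (inferred from the examples, not a formal spec):
      line 1: free-text description
      line 2: number of records
      then one record per line: n_descriptive comma-delimited fields,
      the feature count, and that many feature values.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    description = lines[0]
    n_records = int(lines[1])
    records = []
    for line in lines[2:2 + n_records]:
        fields = [f.strip() for f in line.split(",")]
        n_features = int(fields[n_descriptive])
        values = fields[n_descriptive + 1:]
        if len(values) != n_features:
            raise ValueError(f"expected {n_features} feature values: {line!r}")
        records.append({
            # Each descriptive field is split into its slash-delimited items.
            "fields": [[item.strip() for item in f.split("/")]
                       for f in fields[:n_descriptive]],
            # "?" marks an unknown value and is stored as None.
            "features": [None if v == "?" else float(v) for v in values],
        })
    if len(records) != n_records:
        raise ValueError(f"expected {n_records} records, found {len(records)}")
    return description, records


sample = """Mouse movement biometric data example created September 2007
4
MaryJones/F/26, left-handed, Dell mouse, fixed 10-button sequence/used right hand, 2, 0.13668, 0.53375
MaryJones/F/26, left-handed, Dell mouse, fixed 10-button sequence/used right hand, 2, 0.14378, 0.56275
JohnSmith/M/27, right-handed, optical mouse, random 10-button sequence/used right hand, 2, 0.53628, 0.43865
JohnSmith/M/27, right-handed, optical mouse, random 10-button sequence/used right hand, 2, 0.43628, 0.53865
"""

description, records = parse_feature_file(sample)
print(len(records))               # 4
print(records[0]["fields"][0])    # ['MaryJones', 'F', '26']
print(records[0]["features"])     # [0.13668, 0.53375]
```

The sketch assumes that no field contains a comma of its own; a file with quoted or embedded commas would need a real CSV reader instead of a plain split.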

Notes on the Feature Data Files

1) A "?" is used to indicate a data item that is unknown, unavailable, or not relevant.
2) The mouse movement example has two classes: Mary Jones and John Smith. The stylometry example has three classes: Mary Jones, John Smith, and Chris Hill. The keystroke example has four classes: Mary Jones, John Smith born 1980, John Smith born 1982, and Chris Hill. This information is implicit in the data but not explicitly specified.
3) For simplicity, the biometric feature measurements above are shown with five decimal places; eight or ten decimal places are recommended for the actual project data.
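
The notes above can be illustrated with a short Python sketch. It recovers the implicit classes by treating the entire first field of each record (name, gender, and age or date of birth) as the class label, which is an assumption consistent with the keystroke example, where the two John Smiths differ only by birth date. It also writes feature values with eight decimal places and keeps "?" for unknown values. The function names class_labels and format_features are hypothetical.

```python
def class_labels(record_lines):
    """Return the distinct first fields, in order of first appearance.

    Assumption: the whole first comma-delimited field identifies the class.
    """
    seen = []
    for line in record_lines:
        subject = line.split(",", 1)[0].strip()
        if subject not in seen:
            seen.append(subject)
    return seen


def format_features(values, places=8):
    """Format feature values with a fixed number of decimal places.

    None stands for an unknown value and is written as "?".
    """
    return ", ".join("?" if v is None else f"{v:.{places}f}" for v in values)


keystroke_records = [
    "MaryJones/F/08-01-1981, left-handed, Dell laptop, copy task, 2, 0.13668, 0.53375",
    "JohnSmith/M/06-01-1980, right-handed, Dell laptop, email task, 2, 0.53628, 0.43865",
    "JohnSmith/M/06-01-1980, right-handed, Dell laptop, email task, 2, 0.43628, 0.53865",
    "JohnSmith/M/04-21-1982, left-handed, Dell desktop, copy task, 2, 0.88321, 0.43464",
    "ChrisHill/F/02-04-1983, right-handed, Dell desktop, email task, 2, 0.39734, 0.92862",
]

labels = class_labels(keystroke_records)
print(len(labels))                         # 4
print(labels[1], labels[2])                # JohnSmith/M/06-01-1980 JohnSmith/M/04-21-1982
print(format_features([0.13668, None]))    # 0.13668000, ?
```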