Nearest Neighbors

We use a subset of the "Iris Plants Database" dataset (provided by WEKA, contained in the "iris.arff" file).

Each plant record (i.e., example) is represented by the following 5 attributes:

  1. SepalLength: the sepal length in cm.
  2. SepalWidth: the sepal width in cm.
  3. PetalLength: the petal length in cm.
  4. PetalWidth: the petal width in cm.
  5. Class: the classification attribute, with the possible values {Iris-setosa, Iris-versicolor, Iris-virginica}.

We want to predict the class for each of the following plants:

Part 1 - Manual Computation

Apply the Nearest Neighbor learning algorithm to classify the three to-be-predicted plants (i.e., Plants #16-18), that is, to determine what kind of plant each one is.

Try three different values for the neighborhood size: k = 1, 3, and 5. Use one of the geometric distance functions (e.g., the Manhattan or the Euclidean distance).
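The manual computation can be cross-checked with a short script. The following is a minimal sketch in Python, not a required part of the assignment: the function names (manhattan, euclidean, knn_predict) are illustrative, and the training records in TRAIN are made-up placeholders, not the plants from this handout. Ties between classes are broken in favor of the class whose member is closest to the query, which is one possible convention among several.

```python
import math
from collections import Counter


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k, distance=euclidean):
    """Return the majority class among the k training examples closest to query.

    train is a list of (features, label) pairs. On a tie, the class that
    appears first among the sorted neighbors (i.e., the closest one) wins.
    """
    neighbors = sorted(train, key=lambda ex: distance(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Placeholder data for illustration only (NOT the plants from the assignment).
# Feature order: SepalLength, SepalWidth, PetalLength, PetalWidth.
TRAIN = [
    ((5.0, 3.4, 1.5, 0.2), "Iris-setosa"),
    ((4.8, 3.0, 1.4, 0.1), "Iris-setosa"),
    ((6.0, 2.8, 4.5, 1.4), "Iris-versicolor"),
    ((5.7, 2.9, 4.2, 1.3), "Iris-versicolor"),
    ((6.5, 3.0, 5.6, 2.0), "Iris-virginica"),
    ((6.8, 3.1, 5.9, 2.2), "Iris-virginica"),
]
QUERY = (5.9, 3.0, 4.4, 1.5)

if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, knn_predict(TRAIN, QUERY, k))
```

Passing distance=manhattan to knn_predict switches the distance function; replacing TRAIN and QUERY with the actual plant records lets you compare the script's answers with your hand calculation.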

  1. For k=1, convert the data for Plants #16-18 (together with their predicted classes) into the ARFF format, and save it in the "plants_test1.arff" file.
  2. For k=3, convert the data for Plants #16-18 (together with their predicted classes) into the ARFF format, and save it in the "plants_test2.arff" file.
  3. For k=5, convert the data for Plants #16-18 (together with their predicted classes) into the ARFF format, and save it in the "plants_test3.arff" file.
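An ARFF file consists of a header (an @relation line followed by one @attribute line per attribute) and an @data section with one comma-separated record per line. The sketch below builds such a file as text in Python. It rests on some assumptions: the attribute names are copied from the list in this handout and may need to be adjusted to match the header of "iris.arff" exactly, the relation name "plants_test" is arbitrary, and the helper names to_arff and save_arff are hypothetical. The example record is a placeholder, not one of Plants #16-18.

```python
# Attribute declarations, in the order used by this handout (assumed names;
# adjust them to match the header of the training file if they differ).
ATTRIBUTES = [
    ("SepalLength", "numeric"),
    ("SepalWidth", "numeric"),
    ("PetalLength", "numeric"),
    ("PetalWidth", "numeric"),
    ("Class", "{Iris-setosa,Iris-versicolor,Iris-virginica}"),
]


def to_arff(relation, rows):
    """Return ARFF text for rows of (sl, sw, pl, pw, predicted_class)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} {kind}" for name, kind in ATTRIBUTES]
    lines += ["", "@data"]
    for row in rows:
        # One record per line, values separated by commas.
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def save_arff(path, relation, rows):
    """Write the ARFF text to the given file path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_arff(relation, rows))


# Placeholder record for illustration only (NOT a plant from the assignment).
EXAMPLE_ROWS = [(5.9, 3.0, 4.4, 1.5, "Iris-versicolor")]

if __name__ == "__main__":
    print(to_arff("plants_test", EXAMPLE_ROWS))
```

Calling save_arff("plants_test1.arff", "plants_test", rows) with the three classified plants for k=1 would produce the first file; the same call with the k=3 and k=5 predictions and the other two file names would produce the remaining ones.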

Part 2 - Analysis with WEKA