Assignment References

Textbooks Available Online:
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, EMC Education Services 2015.
Big Data Now, O'Reilly Media 2012.
Pattern Classification, Duda, Hart, and Stork. Wiley 2000.
Pattern Recognition and Machine Learning, Bishop. Springer 2006.
The Elements of Statistical Learning, Hastie, Tibshirani, and Friedman. Springer 2011.
Data Mining (3rd Ed.), Witten, Frank, and Hall. Morgan Kaufmann 2011.

Other Books:
Introduction to Algorithms, Cormen, Leiserson, Rivest, and Stein. MIT Press 2009.
Big Data, Data Mining, and Machine Learning, Jared Dean. Wiley 2014.
Analytics in a Big Data World: The Essential Guide to Data Science and its Applications, Bart Baesens. Wiley 2014.
Data Smart: Using Data Science to Transform Information into Insight, John Foreman. Wiley 2013.
Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil and Rachel Schutt. O'Reilly 2013.

Assignment 1: Bayes Decision Theory vs kNN
Introduction to pattern classification and machine learning: Duda, Chapter 1
Bayes Decision Theory: Duda, Chapter 2
Non-parametric classification procedures, including kNN: Duda, Chapter 4
kNN procedure: Hastie, Chapter 2
Bayes' Theorem: New York Times Articles on Bayesian Statistics, Video 1, Video 2, Video 3
Normal Probability Distribution: Univariate Distribution, Multivariate Distribution
Covariance Matrix, Mahalanobis Distance
k-Nearest-Neighbor (kNN) algorithm, Video 1, Video 2, Video 3
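
For orientation, here is a minimal sketch of the kNN procedure named in the references above. It assumes Euclidean distance and a simple majority vote; the function name knn_predict and the toy data are illustrative only and do not come from the assignment or the textbooks.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest samples and count how often each label appears.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration: two well-separated classes.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

An odd k avoids tied votes when there are only two classes. Swapping math.dist for another metric, such as the Mahalanobis distance, changes which samples count as neighbors but not the voting step.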

Assignment 2: Linear Regression
Simple linear regression - Khan Academy
    Formula Derivation (4 parts)
    Examples (2 parts)
    R-squared or coefficient of determination (2 parts)
Linear regression calculator
General regression: Bishop, Chapter 1
General regression analysis via matrix pseudoinverse - Algorithms textbook
Linear Regression versus Principal Component Analysis
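
As a rough companion to the simple linear regression and R-squared references above, the sketch below fits a line by least squares and computes the coefficient of determination. The function names fit_line and r_squared and the sample data are assumptions made for illustration. The pseudoinverse approach listed above generalizes the same least-squares idea to several predictors.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Illustrative data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

The sketch assumes the x values are not all identical (otherwise sxx is zero) and the y values are not all identical (otherwise ss_tot is zero).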

Assignment 3: K-Means Clustering
Algorithm when seeds are samples
Algorithm when seeds are random points, not samples
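
The two seeding variants listed above differ only in how the initial centers are chosen. The sketch below is one possible reading, not the assignment's reference algorithm: init_seeds, kmeans, the from_samples flag, the bounding-box choice for random points, and the rule that an empty cluster keeps its old center are all assumptions made here for illustration.

```python
import math
import random


def init_seeds(samples, k, from_samples, rng):
    """Pick k initial centers, either from the samples or as random points."""
    if from_samples:
        # Variant 1: seeds are k distinct samples.
        return [tuple(s) for s in rng.sample(samples, k)]
    # Variant 2: seeds are random points inside the data's bounding box.
    dims = range(len(samples[0]))
    lows = [min(s[d] for s in samples) for d in dims]
    highs = [max(s[d] for s in samples) for d in dims]
    return [tuple(rng.uniform(lows[d], highs[d]) for d in dims) for _ in range(k)]


def kmeans(samples, k, from_samples=True, max_iter=100, seed=0):
    """Alternate assignment and center updates until the centers stop moving."""
    rng = random.Random(seed)
    centers = init_seeds(samples, k, from_samples, rng)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest center.
        clusters = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda i: math.dist(s, centers[i]))
            clusters[nearest].append(s)
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # assumption: empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Illustrative data: two groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(data, k=2, from_samples=True)
print(centers)
```

Seeding from samples guarantees that every center starts with at least one member, namely itself. Random points can leave a cluster empty after the first assignment step, which is why the sketch needs an explicit rule for that case.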