Authorship Verification of Social Media Content

Background

Authorship verification is often framed as the task of comparing multiple pieces of writing to determine whether they were produced by a single author, for example comparing Shakespeare's known works with writings suspected to be his. This project, however, deals with the type of problem where there is no closed candidate set, but rather one suspect, a purported author, and the challenge is to determine whether or not the suspect is the author. Traditional machine learning algorithms such as support vector machines, decision trees, and naïve Bayes have achieved high accuracy rates on this type of classification problem. However, newer algorithms in the natural language processing space, such as word2vec, may have the potential to improve accuracy rates even further.

Project Description

This project aims to explore the effectiveness of the word2vec algorithm for authorship verification on social media content (e.g., Facebook posts, tweets, and other microblog entries).
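
To make the one-suspect setting concrete, the sketch below shows one way word vectors could be used for verification. It is an illustration under stated assumptions, not the project's method. Each post is represented by the average of its word vectors, the suspect's known posts form a profile, and a questioned post is accepted if its cosine similarity to that profile reaches a threshold. The function names, the threshold of 0.8, and the two-dimensional TOY_VECTORS table are all hypothetical; in practice the vectors would come from a word2vec model trained on a suitable corpus.

```python
import math


def tokenize(text):
    # Lowercase whitespace tokenization; real social media text
    # (hashtags, emoji, URLs) would need more careful handling.
    return text.lower().split()


def average_vector(tokens, vectors):
    """Mean of the vectors for tokens present in `vectors`, or None if none are."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(known_posts, questioned_post, vectors, threshold=0.8):
    """Return (accepted, score) for a questioned post against one suspect.

    The threshold is an assumed value; it would have to be tuned on
    held-out data for any real use.
    """
    known_tokens = [t for post in known_posts for t in tokenize(post)]
    profile = average_vector(known_tokens, vectors)
    candidate = average_vector(tokenize(questioned_post), vectors)
    if profile is None or candidate is None:
        return False, 0.0
    score = cosine(profile, candidate)
    return score >= threshold, score


# Hypothetical hand-made vectors standing in for trained word2vec embeddings.
TOY_VECTORS = {
    "lol": [0.9, 0.1], "omg": [0.8, 0.2], "gonna": [0.85, 0.15],
    "therefore": [0.1, 0.9], "moreover": [0.2, 0.8], "hence": [0.15, 0.85],
}

known = ["lol gonna be late", "omg lol"]
same, score_same = verify(known, "omg gonna miss it", TOY_VECTORS)
other, score_other = verify(known, "therefore moreover hence", TOY_VECTORS)
print(same, other)
```

Averaging word vectors discards word order and is only one of several possible document representations. It is used here because it keeps the example short, not because it is known to work best for short social media posts.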

Project Deliverables

References

This project is a continuation of last semester's project; see the 2016 Fall Project Paper.