Linguistic Analysis of DPS Dissertations

Background

The Doctor of Professional Studies (DPS) in Computing at Pace University is a unique doctoral program that allows active IT professionals to earn a doctorate in three years through part-time study. About 70 DPS dissertations have been completed to date; see DPS dissertations.

Project

The goal of this project is to use DPS dissertations as training datasets so that documents can be classified in real time. A typical use case is a user searching the Internet and having each retrieved document "scored" for its relevance to the training datasets.

This project involves learning information extraction tools, such as LingPipe and Mallet, and applying them to the DPS dissertations. These are very interesting tools!
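
The scoring use case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's method: it does not use LingPipe or Mallet, it swaps in plain TF-IDF weighting with cosine similarity as the relevance measure, and the names tokenize, train, and score are hypothetical. The training texts would be the dissertation texts; any short strings used to try it out are placeholders.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(training_docs):
    """Build a model (hypothetical structure): smoothed IDF weights plus
    the TF-IDF centroid (average vector) of the training documents."""
    n_docs = len(training_docs)
    tokenized = [tokenize(doc) for doc in training_docs]

    # Document frequency: in how many training documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency, always positive.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}

    # Average the TF-IDF vectors of all training documents.
    centroid = Counter()
    for tokens in tokenized:
        for term, count in Counter(tokens).items():
            centroid[term] += count * idf[term] / n_docs

    return {
        "idf": idf,
        # Weight given to terms never seen in training (document frequency 0).
        "default_idf": math.log(1 + n_docs) + 1.0,
        "centroid": dict(centroid),
    }


def score(text, model):
    """Return the cosine similarity (0.0 to 1.0) between a document
    and the training centroid; higher means more relevant."""
    counts = Counter(tokenize(text))
    idf = model["idf"]
    centroid = model["centroid"]

    # Unseen terms still count toward the document's length, so a document
    # that is mostly off-topic is penalized.
    vec = {t: c * idf.get(t, model["default_idf"]) for t, c in counts.items()}

    dot = sum(w * centroid.get(t, 0.0) for t, w in vec.items())
    norm_doc = math.sqrt(sum(w * w for w in vec.values()))
    norm_cen = math.sqrt(sum(w * w for w in centroid.values()))
    if norm_doc == 0.0 or norm_cen == 0.0:
        return 0.0
    return dot / (norm_doc * norm_cen)
```

To use it, call train once on the list of dissertation texts, then call score on each document a search returns and rank the results by the returned value. A centroid is the simplest possible representation of the training set; a trained classifier from one of the tools named above would likely replace it in practice.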

Overview:

For additional information, see Rinaldo's document.

Fast Agile XP Deliverables

We will use the agile methodology, particularly Extreme Programming (XP), which involves small releases and fast turnarounds in roughly two-week iterations. Many of these deliverables can be done in parallel by different team members or subsets of the team.

The following is the current list of deliverables, ordered by date initiated. The initiation date is marked in bold red if programming is involved, deliverable modifications are marked in red, completion dates and related comments are marked in green, and pseudo-code is marked in blue:

  1. 2/1. Plan the semester's work with your customer, Rinaldo DiGiorgio.