STAT 527 Text Mining

Intensive investigation of text mining methodologies, including pattern matching with regular expressions, reformatting data, contingency tables, part-of-speech tagging, top-down parsing, probability and text sampling, the bag-of-words model and the effect of sample size. Extensive use of Perl and Perl modules to analyze text documents.

Credits

4

Prerequisite

STAT 521 or permission of the instructor.

General Education

Offered

  • Spring