Using sentence-level classification to predict sentiment at the document-level

Date

2012-05

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

This report explores various aspects of sentiment mining. The two research goals for the report were: (1) to determine useful methods in increasing recall of negative sentences and (2) to determine the best method for applying sentence level classification to the document level. The methods in this report were applied to the Movie Reviews corpus at both the document and sentence level. The basic approach was to first identify polar and neutral sentences within the text and then classify the polar sentences as either positive or negative. The Maximum Entropy classifier was used as the baseline system in which the application of further methods was explored. Part-of-speech tagging was used for its effectiveness to determine if its inclusion increased recall of negative sentences. It was also used to aid in the handling of negations within sentences at the sentence level. Smoothing was investigated and various metrics to describe the sentiment composition were explored to address goal (2). Negative recall was shown to increase with the adjustment of the classification threshold and was also seen to increase through the methods used to address goal (2). Overall, classifying at the sentence level using bigrams and a cutoff value of one was observed to result in the highest evaluation scores.

Description

text

Citation