Predicting Community Preference of Comments on the Social Web
Abstract
Large-scale socially-generated metadata is one of the key features driving the growth and success of the emerging Social Web. Recently there have been many research efforts to study the quality of this metadata (such as user-contributed tags, comments, and ratings) and its potential impact on new opportunities for intelligent information access. However, much existing research relies on quality assessments made by human experts external to a Social Web community. In the present study, we are interested in understanding how an online community itself perceives the relative quality of its own user-contributed content, which has important implications for the successful self-regulation and growth of the Social Web in the presence of increasing spam and a flood of Social Web metadata. We propose and evaluate a machine learning-based approach for ranking comments on the Social Web based on the community's expressed preferences, which can be used to promote high-quality comments and filter out low-quality comments. We study several factors impacting community preference, including the contributor's reputation and community activity level, as well as the complexity and richness of the comment. Through experiments, we find that the proposed approach results in significant improvement in ranking quality versus alternative approaches.