Typifying Wikipedia Articles

Date

2011-03-03

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Computer Science & Engineering

Abstract

In Wikipedia, each article represents an entity. Entity can have different types like person, country, school, science etc. Although Wikipedia encapsulates category information for each page, sometimes it is not sufficient to deduce the type of a page just from its categories. But, incorporating the clear type information in a Wikipedia page is very important for the users, as it will help them to explore the pages in more organized way. Hence, in my thesis, we explore different standard classification techniques, mainly Naïve Bayes and Support Vector Machines and experiment how these techniques can be made more effective for typifying Wikipedia articles by using different feature selection methods. We proposed a method where Wikipedia categories are used as features. Moreover, we combine features to build a meta classifier which outperforms the other standard methods. To compare our methods we calculate the accuracy of different methods and used well known data mining tool "WEKA".

Description

Keywords

Citation