Wikipedia Mining
Kotaro Nakayama
As a corpus for knowledge extraction, Wikipedia is notable not only for its scale but also for its dense link structure, well-structured categories, URL-based word sense disambiguation, and high-quality anchor texts. These characteristics make Wikipedia a promising corpus. Indeed, a considerable body of research on Wikipedia Mining has been conducted, including semantic relatedness measurement, bilingual dictionary construction, and ontology construction. In this paper, we take a comprehensive, panoramic view of Wikipedia as a Web corpus and introduce some practical research products and applications of Wikipedia Mining to demonstrate its capabilities.

