Discovery of Numerous Specific Topics via Term Co-occurrence Analysis
by Madani, O. and Yu, J.
in Conference on Information and Knowledge Management (CIKM), 2010.

We describe efficient techniques for construction of large term co-occurrence graphs, and investigate an application to the discovery of numerous fine-grained (specific) topics. A topic is a small dense subgraph discovered by a random walk initiated at a term (node) in the graph. We observe that the discovered topics are highly interpretable, and reveal the different meanings of terms in the corpus. We show the information-theoretic utility of the topics when they are used as features in supervised learning. Such features lead to consistent improvements in classification accuracy over the standard bag-of-words representation, even at high training proportions. We explain how a layered pyramidal view of the term distribution helps in understanding the algorithms and in visualizing and interpreting the topics.
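
To make the idea in the abstract concrete, here is a minimal Python sketch of a term co-occurrence graph and a short random walk started at a seed term. It is an illustration under stated assumptions, not the authors' algorithm: the document-level co-occurrence counts, the restart probability, the number of steps, the topic size, the toy corpus, and the function names `build_cooccurrence_graph` and `walk_topic` are all hypothetical choices made for this example. The walk is propagated as a probability distribution rather than sampled, so the result is deterministic.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Count, for each unordered term pair, the documents containing both.

    Assumption: co-occurrence is measured at the document level.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def walk_topic(graph, seed, steps=3, restart=0.3, size=4):
    """Propagate a short random walk with restart from `seed` and return
    the `size` terms holding the most probability mass.

    Assumption: `steps`, `restart`, and `size` are illustrative values.
    """
    prob = {seed: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[seed] += restart  # mass that jumps back to the seed term
        for term, p in prob.items():
            neighbors = graph.get(term, {})
            total = sum(neighbors.values())
            if total == 0:
                # Isolated term: the remaining mass stays where it is.
                nxt[term] += (1 - restart) * p
                continue
            for other, weight in neighbors.items():
                # Move in proportion to the co-occurrence weight.
                nxt[other] += (1 - restart) * p * weight / total
        prob = nxt
    ranked = sorted(prob.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]]


# Toy corpus (hypothetical) in which "jaguar" has two senses.
docs = [
    "jaguar car engine speed",
    "jaguar car dealer price",
    "jaguar cat jungle prey",
    "jaguar cat habitat jungle",
]
graph = build_cooccurrence_graph(docs)
topic = walk_topic(graph, "car")
```

In this toy corpus, a walk seeded at "car" keeps most of its mass on the automotive neighbors of "jaguar" rather than on the animal-related terms, which is the kind of sense separation the abstract describes.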
Cognitive Assistant that Learns and Organizes

As part of DARPA's Personalized Assistant that Learns (PAL) program, SRI and team members are working on developing a next-generation "Cognitive Agent that Learns and Organizes" (CALO).

| Name | Title |
|---|---|
| Madani, Omid | Senior Computer Scientist |
| Yu, Jiye | Computer Scientist |
