Distributional Features for Text Categorization
#1

Text categorization is the task of assigning predefined categories to natural language text. With the widely used bag-of-words representation, previous research usually assigns each word a value that indicates whether the word appears in the document concerned or how frequently it appears. Although these values are useful for text categorization, they do not fully express the abundant information contained in the document. This paper explores the effect of other types of values, which express the distribution of a word in the document. These novel values assigned to a word are called distributional features, and they include the compactness of the appearances of the word and the position of the first appearance of the word. The proposed distributional features are exploited by a tf-idf-style equation, and different features are combined using ensemble learning techniques. Experiments show that the distributional features are useful for text categorization. Compared with using the traditional term frequency values alone, including the distributional features requires only a little additional cost, while the categorization performance can be significantly improved. Further analysis shows that the distributional features are especially useful when documents are long and the writing style is casual.
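To make the two distributional features concrete, here is a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption for illustration: the function name `distributional_features` is hypothetical, the first-appearance value is taken as the relative position of a word's first occurrence, and compactness is approximated by the relative span between its first and last occurrence (a smaller span means more compact appearances). It does not reproduce the paper's tf-idf-style weighting or the ensemble step.

```python
from typing import Dict, List


def distributional_features(tokens: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute per-word term frequency plus two simple distributional values.

    Assumed definitions (not taken from the paper):
      first_position: index of the first occurrence divided by document length
      span: (last index - first index) divided by document length
    """
    n = len(tokens)
    if n == 0:
        return {}

    # Record every position at which each word occurs.
    positions: Dict[str, List[int]] = {}
    for i, tok in enumerate(tokens):
        positions.setdefault(tok, []).append(i)

    features: Dict[str, Dict[str, float]] = {}
    for word, pos in positions.items():
        features[word] = {
            "tf": float(len(pos)),              # traditional term frequency
            "first_position": pos[0] / n,       # how early the word first appears
            "span": (pos[-1] - pos[0]) / n,     # small span = compact appearances
        }
    return features


if __name__ == "__main__":
    doc = "the cat sat on the mat near the door".split()
    for word, vals in distributional_features(doc).items():
        print(word, vals)
```

Two words with the same term frequency can get different values here: a word that appears only in the opening lines has a small span and an early first position, while a word scattered through the whole document has a large span.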

#2
[u]Text Categorization[/u]

Foundations of Statistical Natural Language Processing


Task Description

Goal: given a classification scheme, the system decides which class(es) a document belongs to.
This is a mapping from the document space to the classification scheme.
The mapping can be 1 to 1 or 1 to many.
To build the mapping:
Observe the known samples classified in the scheme.
Summarize the features and create rules/formulas.
Decide the classes for new documents according to the rules.
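The three steps above can be sketched in a few lines of Python. The slides do not say which classifier is used, so this example swaps in a plain multinomial Naive Bayes with add-one smoothing over bag-of-words counts as one possible "rule/formula". The function names `train` and `classify` and the sample data are made up for illustration, and the sketch only covers the 1 to 1 case (one class per document).

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Observe labelled samples and summarize them as word counts per class.

    samples: list of (text, label) pairs.
    """
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab


def classify(model, text):
    """Decide the class of a new document with the learned counts."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    samples = [
        ("cheap pills buy now", "spam"),
        ("meeting agenda for monday", "work"),
        ("buy cheap watches", "spam"),
        ("project meeting notes", "work"),
    ]
    model = train(samples)
    print(classify(model, "buy cheap pills"))
```

For the 1 to many case, one common option is to train a separate yes/no classifier per class and return every class whose classifier says yes.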

#3
Check the links below for a PPT and PDFs on Distributional Features for Text Categorization:

http://cs.umass.edu/ ronb/papers/sigir.ppt
http://lamda.nju.edu.cn/xuexb/files/ecml...eature.pdf
http://ieexplore.ieeiel5/69/4358933/0458...er=4589210

#4

Hi friends,
I'm Rafi, a final-year Computer Science and Engineering student. I need a PPT or PDF file on Distributional Features for Text Categorization.