Four feature selection methods are discussed: document frequency (DF), mutual information (MI), the χ² test (CHI), and the correlation coefficient (CC). Their text categorization accuracy is compared using the K-nearest-neighbor algorithm.
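- Four feature selection methods are examined: document frequency (DF), mutual information (MI), the CHI statistic, and the CC statistic; they are then compared on classification accuracy in combination with the K-nearest-neighbor algorithm.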
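
As a rough illustration of how the four scores can be computed for a single term and category, the sketch below works from a 2x2 contingency table of document counts. The formulas are the definitions commonly used in the text-categorization literature and are an assumption here, since the sentence above does not state them. The function name `term_scores`, the count names `a`, `b`, `c`, `d`, and the example counts are all hypothetical.

```python
import math


def term_scores(a, b, c, d):
    """Score one term against one category from document counts.

    Assumed (hypothetical) count layout:
      a: documents in the category that contain the term
      b: documents outside the category that contain the term
      c: documents in the category that lack the term
      d: documents outside the category that lack the term
    """
    n = a + b + c + d

    # Document frequency: number of documents containing the term.
    df = a + b

    # Pointwise mutual information between term and category.
    # Undefined when the term never occurs in the category, so return -inf.
    mi = math.log(a * n / ((a + c) * (a + b))) if a > 0 else float("-inf")

    # Chi-square statistic and correlation coefficient share a denominator.
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        chi = cc = 0.0
    else:
        diff = a * d - c * b
        chi = n * diff ** 2 / denom
        # CC keeps the sign of the association; CC squared equals CHI.
        cc = math.sqrt(n) * diff / math.sqrt(denom)

    return {"DF": df, "MI": mi, "CHI": chi, "CC": cc}


# Made-up counts for illustration only.
scores = term_scores(30, 10, 20, 40)
```

In a full pipeline, terms would typically be ranked by one of these scores, the top-ranked terms kept as features, and a K-nearest-neighbor classifier trained on the reduced vectors so the methods can be compared on accuracy.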