Text Classification Using Word-Based PPM Models

dc.contributor.author BOBICEV, Victoria
dc.date.accessioned 2020-10-07T08:11:18Z
dc.date.available 2020-10-07T08:11:18Z
dc.date.issued 2006
dc.identifier.citation BOBICEV, Victoria. Text Classification Using Word-Based PPM Models. In: Computer Science Journal of Moldova. 2006, nr. 2(41), pp. 183-201. ISSN 1561-4042. en_US
dc.identifier.uri http://repository.utm.md/handle/5014/10493
dc.description.abstract Text classification is one of the most topical problems in natural language processing. This paper describes the application of word-based PPM (Prediction by Partial Matching) models to automatic content-based text classification. Our main idea is that words, and especially word combinations, are more relevant features for many text classification tasks. In most cases the keywords of a document are not single words but combinations of two or three words. The experiments demonstrated the applicability of word-based PPM models to content-based text classification. Although in some cases the entropy difference that determined the choice was rather small (several hundredths), most of the documents (up to 97%) were classified correctly. en_US
dc.language.iso en en_US
dc.publisher Institutul de Matematică şi Informatică al AŞM en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject text classifications en_US
dc.subject natural languages en_US
dc.title Text Classification Using Word-Based PPM Models en_US
dc.type Article en_US
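
The abstract describes choosing a document's class by comparing entropies under per-class word-based models. The sketch below illustrates that general idea only and is not the paper's implementation. It assumes a much simpler model than full PPM: a word bigram context that escapes to smoothed unigram counts, with a PPMC-style escape estimate and no exclusion. The names WordBigramModel and classify and the toy training texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


class WordBigramModel:
    """Simplified stand-in for a word-based PPM model (assumption:
    order-1 word context with an escape to order-0 counts)."""

    def __init__(self, text):
        words = text.lower().split()
        self.unigrams = Counter(words)
        self.bigrams = defaultdict(Counter)
        for prev, cur in zip(words, words[1:]):
            self.bigrams[prev][cur] += 1
        self.total = len(words)
        self.vocab = len(self.unigrams)

    def _p_unigram(self, word):
        # Add-one smoothing with one extra slot reserved for unseen words.
        return (self.unigrams[word] + 1) / (self.total + self.vocab + 1)

    def prob(self, prev, word):
        ctx = self.bigrams.get(prev)
        if not ctx:
            # Context never seen in training: use the unigram estimate.
            return self._p_unigram(word)
        n = sum(ctx.values())
        d = len(ctx)  # distinct continuations, used as the escape count
        if word in ctx:
            return ctx[word] / (n + d)
        # Escape to the lower order, weighted by the escape probability.
        return d / (n + d) * self._p_unigram(word)

    def cross_entropy(self, text):
        """Average bits per word needed to encode text with this model."""
        words = text.lower().split()
        if not words:
            return 0.0
        bits = 0.0
        prev = None
        for w in words:
            bits -= math.log2(self.prob(prev, w))
            prev = w
        return bits / len(words)


def classify(document, models):
    """Return the label whose model gives the lowest cross-entropy."""
    return min(models, key=lambda label: models[label].cross_entropy(document))


# Toy training data, invented for illustration only.
models = {
    "sports": WordBigramModel(
        "the team won the match the team scored a goal in the match"),
    "finance": WordBigramModel(
        "the bank raised the interest rate the bank cut the interest rate"),
}
print(classify("the team scored a goal", models))
```

Because the decision rests on the smallest per-word entropy, two classes with similar vocabulary can yield close scores, which matches the abstract's remark that the deciding difference was sometimes only several hundredths.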


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States.