dc.contributor.author | BOBICEV, Victoria | |
dc.contributor.author | POPESCU, Anatol | |
dc.contributor.author | ZIDRAŞCO, Tatiana | |
dc.date.accessioned | 2019-11-12T10:23:03Z | |
dc.date.available | 2019-11-12T10:23:03Z | |
dc.date.issued | 2005 | |
dc.identifier.citation | BOBICEV, Victoria, POPESCU, Anatol, ZIDRAŞCO, Tatiana. Statistical models of language and Zipf’s law. In: Microelectronics and Computer Science: proc. of the 4th intern. conf., September 15-17, 2005. Chişinău, 2005, vol. 2, pp. 133-136. ISBN 9975-66-038-X. | en_US |
dc.identifier.isbn | 9975-66-038-X | |
dc.identifier.uri | http://repository.utm.md/handle/5014/6693 | |
dc.description.abstract | Statistical models based on the words of a text have become widespread in recent years. Estimating the probability of words never encountered in the corpus is one of the subtasks of word probability estimation. Attempts to estimate the number of such unseen words with Zipf’s formula yield rather large values. In several experiments we observed that the number of words never encountered in the corpus is proportional to the number of words encountered only once and depends on the text vocabulary. If subsequent texts are of the same type as the corpus, the estimate of unseen words is fairly adequate. If subsequent texts differ from the corpus, however, the number of unseen words can either increase or decrease considerably. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Technical University of Moldova | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.subject | Zipf’s law | en_US |
dc.subject | statistical language modelling | en_US |
dc.subject | statistical models | en_US |
dc.subject | zero frequency | en_US |
dc.title | Statistical models of language and Zipf’s law | en_US |
dc.type | Article | en_US |