Automatic error correction in text

dc.contributor.author BOBICEV, Victoria
dc.contributor.author POPESCU, Anatol
dc.contributor.author ZIDRAŞCO, Tatiana
dc.date.accessioned 2019-11-12T12:00:43Z
dc.date.available 2019-11-12T12:00:43Z
dc.date.issued 2005
dc.identifier.citation BOBICEV, Victoria, POPESCU, Anatol, ZIDRAŞCO, Tatiana. Automatic error correction in text. In: Microelectronics and Computer Science: proc. of the 4th intern. conf., September 15-17, 2005. Chişinău, 2005, vol. 2, pp. 193-196. ISBN 9975-66-038-X. en_US
dc.identifier.isbn 9975-66-038-X
dc.identifier.uri http://repository.utm.md/handle/5014/6706
dc.description.abstract Letter sequence statistics are used to solve many text processing tasks. One of the most difficult of these tasks is automatic error correction in text. Errors appear in a text when it is either typed or scanned. Four types of errors are usually found in typed text: one missing letter, one extra letter, a transposition of two letters, and one wrong letter. Entirely different types of errors appear when a text is scanned. Generally speaking, error types vary and depend on document quality, font type, and the text recognition program. A dictionary is usually used to find erroneous words. Since many text elements, such as personal names, abbreviations, abridgements, and firm names, cannot be found in the dictionary and do not need to be corrected, a block that identifies these elements is necessary. We identified erroneous words by their entropy under a letter trigram statistical model. We found that almost all words with an entropy higher than 4.5 are erroneous. After the most frequent errors were analyzed, a confusion table was created to determine the correct word. The word with minimal entropy is considered to be correct. en_US
dc.language.iso en en_US
dc.publisher Technical University of Moldova en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject errors en_US
dc.subject automatic error correction en_US
dc.subject letter statistics en_US
dc.subject entropy en_US
dc.title Automatic error correction in text en_US
dc.type Article en_US
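
The approach summarized in the abstract (score each word with a letter trigram model, flag words whose entropy exceeds 4.5, then pick the lowest-entropy variant produced by a confusion table) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not define how entropy is computed, so the sketch assumes the average negative log2 probability per letter with add-one smoothing. The names TrigramModel, candidates, and correct, the tiny training list, and the confusion table entry are all hypothetical.

```python
import math
from collections import Counter


class TrigramModel:
    """Letter trigram model. Add-one smoothing is an assumption of this sketch."""

    def __init__(self, words):
        self.tri = Counter()  # counts of three-letter sequences
        self.bi = Counter()   # counts of the two-letter contexts
        self.alphabet = set()
        for w in words:
            padded = "^^" + w.lower() + "$"  # "^" pads the start, "$" marks the end
            self.alphabet.update(padded[2:])
            for i in range(2, len(padded)):
                self.tri[padded[i - 2:i + 1]] += 1
                self.bi[padded[i - 2:i]] += 1

    def entropy(self, word):
        """Average negative log2 probability per letter (assumed definition)."""
        padded = "^^" + word.lower() + "$"
        v = len(self.alphabet) + 1  # one extra slot for unseen symbols
        total = 0.0
        for i in range(2, len(padded)):
            p = (self.tri[padded[i - 2:i + 1]] + 1) / (self.bi[padded[i - 2:i]] + v)
            total -= math.log2(p)
        return total / (len(padded) - 2)


def candidates(word, confusion):
    """Yield the word itself plus variants with one confused substring replaced."""
    yield word
    for wrong, rights in confusion.items():
        start = word.find(wrong)
        while start != -1:
            for right in rights:
                yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


def correct(word, model, confusion, threshold=4.5):
    """Keep words at or below the threshold; otherwise return the
    lowest-entropy candidate. The 4.5 default is the value quoted in
    the abstract; its unit is assumed to match entropy() above."""
    if model.entropy(word) <= threshold:
        return word
    return min(candidates(word, confusion), key=model.entropy)


# Hypothetical toy data; a real model would be trained on a large corpus.
training_words = ["the", "then", "there", "these", "other", "mother", "father"]
model = TrigramModel(training_words)
confusion = {"b": ["h"]}  # hypothetical entry: "b" is often a misread "h"
```

With a corpus this small the 4.5 threshold is not meaningful, so a call such as correct("tbe", model, confusion, threshold=0.0) forces the candidate comparison; model.entropy("tbe") can be compared with model.entropy("the") to see how the ranking behaves.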


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States