

Evaluating Asymmetric N-Grams as Spell-Checking Mechanism


Typical approaches to string comparison mark two strings as either different or equal, without taking any similarity measure into account. Judging similarity is, however, required for spelling error correction, where we want to find the best match for a given word. In this paper we present a bi2quadro-grams method for spelling error correction. The proposed method uses different n-gram dimensions for the source (checked) word and the target (dictionary) word. Appropriate weights were introduced for different types of errors. This increases the quality and performance of the algorithm and makes the method dedicated to the task of spelling error correction. The results obtained so far suggest that the method is a viable solution, competitive with other currently used approaches. The paper presents the proposed method, the test suite, and the experimental results, followed by a short discussion.
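The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one possible reading of "different n-gram dimensions for the source and target words": each bigram of the checked word is looked up inside a 4-character window of the dictionary word at roughly the same position. The function names, the padding character, the window alignment, and the unweighted score are all assumptions made for illustration; the per-error-type weights mentioned in the abstract are not modelled.

```python
PAD = "#"  # hypothetical padding symbol, assumed not to occur in real words


def asymmetric_similarity(source: str, target: str) -> float:
    """Fraction of source bigrams found, in order, in the aligned target 4-gram.

    Illustrative assumption only, not the scoring rule from the paper.
    """
    if len(source) < 2:
        return 1.0 if source == target else 0.0
    # Pad the target so that every source position has a full 4-character window.
    padded = PAD + target + PAD * (len(source) + 2)
    total = len(source) - 1
    hits = 0
    for i in range(total):
        bigram = source[i:i + 2]
        window = padded[i:i + 4]  # target characters at positions i-1 .. i+2
        first = window.find(bigram[0])
        # The bigram matches if its two characters appear in order in the window.
        if first != -1 and bigram[1] in window[first + 1:]:
            hits += 1
    return hits / total


def best_match(source: str, dictionary: list[str]) -> str:
    """Return the dictionary word with the highest similarity to the source."""
    return max(dictionary, key=lambda word: asymmetric_similarity(source, word))


print(best_match("recieve", ["receive", "deceive", "relieve"]))  # receive
```

Because the window is wider than the bigram, a single inserted or deleted character still leaves most bigrams matched. Note that this sketch does not penalize dictionary words that are much longer than the checked word, which a practical scorer would likely need to do.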






Copyright (2018 IEEE)

Details

Conference activity
publication in a peer-reviewed collective work (including conference proceedings)
Published in:
2018 11th International Conference on Human System Interaction (HSI), pages 356-361
Publication year:
2018
Bibliographic description:
Boiński T. M., Zimnicki A., Kujawski J., Draszawka K.: Evaluating Asymmetric N-Grams as Spell-Checking Mechanism // 2018 11th International Conference on Human System Interaction (HSI), 2018, pp. 356-361
DOI (Digital Object Identifier): 10.1109/hsi.2018.8431345
Sources of funding:
  • Statutory activity
Politechnika Gdańska
