Crowdsourcing-Based Evaluation of Automatic References Between WordNet and Wikipedia - Publication - MOST Wiedzy

Crowdsourcing-Based Evaluation of Automatic References Between WordNet and Wikipedia

Abstract

The paper presents an approach to building references (also called mappings) between WordNet and Wikipedia. We propose four algorithms for the automatic construction of the references. Then, using an aggregation algorithm, we produce an initial set of mappings, which has been evaluated cooperatively. For that purpose, we implement a system that distributes evaluation tasks, which are then solved by the user community. To make the tasks more attractive, we embed them in a game. The results show that the initial mappings are of good quality and that the community has improved them further. As a result, we deliver a high-quality dataset of mappings between two lexical repositories, WordNet and Wikipedia, that can be used in a wide range of NLP tasks. We also show that the framework for collaborative validation can be used in other tasks that require human judgments.
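The abstract does not specify the aggregation algorithm or the rule used to interpret community answers. The sketch below is therefore only an illustration of the general pipeline, assuming simple majority voting at both stages. The function names, the thresholds `min_votes` and `min_answers`, and the synset identifiers are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def aggregate_mappings(algorithm_outputs, min_votes=2):
    """Combine the outputs of several mapping algorithms.

    Assumption (not from the paper): each algorithm returns a dict
    {synset_id: article_title}, and a mapping is kept when at least
    `min_votes` algorithms propose the same article for a synset.
    """
    votes = defaultdict(Counter)
    for output in algorithm_outputs:
        for synset, article in output.items():
            votes[synset][article] += 1

    mappings = {}
    for synset, counter in votes.items():
        article, count = counter.most_common(1)[0]
        if count >= min_votes:
            mappings[synset] = article
    return mappings


def apply_crowd_votes(mappings, crowd_votes, min_answers=3):
    """Split mappings into confirmed, rejected and undecided.

    Assumption (not from the paper): `crowd_votes` maps a synset to a
    list of yes/no answers (True means "the mapping is correct"), and a
    strict majority decides once `min_answers` answers are collected.
    """
    confirmed, rejected, undecided = {}, {}, {}
    for synset, article in mappings.items():
        answers = crowd_votes.get(synset, [])
        yes = sum(answers)
        no = len(answers) - yes
        if len(answers) < min_answers or yes == no:
            undecided[synset] = article
        elif yes > no:
            confirmed[synset] = article
        else:
            rejected[synset] = article
    return confirmed, rejected, undecided


# Illustrative data only: four hypothetical algorithm outputs.
outputs = [
    {"dog.n.01": "Dog", "bank.n.01": "Bank"},
    {"dog.n.01": "Dog", "bank.n.01": "Bank (geography)"},
    {"dog.n.01": "Dog (zodiac)", "bank.n.01": "Bank"},
    {"cat.n.01": "Cat"},
]
initial = aggregate_mappings(outputs)

# Illustrative community answers collected through the evaluation tasks.
answers = {
    "dog.n.01": [True, True, False],
    "bank.n.01": [False, False, True, False],
}
confirmed, rejected, undecided = apply_crowd_votes(initial, answers)
```

Majority voting is used here only because it is the simplest rule that matches the description. A real system would probably weight algorithms and players by their reliability.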

Citations

  • CrossRef: 4
  • Web of Science: 2
  • Scopus: 3

Full text

Downloaded 1032 times
Publication version
Accepted or Published Version
License
Copyright (World Scientific Publishing Company)

Details

Category:
Journal publication
Type:
article in a journal listed in JCR
Published in:
INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, vol. 29, issue 03, pages 317-344,
ISSN: 0218-1940
Language:
English
Publication year:
2019
Bibliographic description:
Szymański J., Boiński T.: Crowdsourcing-Based Evaluation of Automatic References Between WordNet and Wikipedia // INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING. - Vol. 29, iss. 03 (2019), pp. 317-344
DOI:
10.1142/s0218194019500141
Funding sources:
  • Statutory activity
Verified by:
Politechnika Gdańska

Viewed 125 times
