Search results for: named entity disambiguation
-
Elgold: gold standard, multi-genre dataset for named entity recognition and linking
Open Research Data: The dataset contains 276 multi-genre texts with marked named entities, which are linked to corresponding Wikipedia articles if available. Each entity was manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold intermediate: annotated raw
Open Research Data: The dataset contains a subset of texts from "Elgold intermediate: raw texts", with named entities marked and linked to corresponding Wikipedia articles. The texts were annotated by 31 participants during a 1.5-hour session.
-
Elgold partial: News
Open Research Data: The dataset contains 37 English texts scraped from news websites. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking...
-
Elgold intermediate: verified by the authors
Open Research Data: The dataset contains the texts from "Elgold intermediate: verified by verification team", additionally verified by the dataset authors but before the final validation step with the Elgold toolset.
-
Elgold intermediate: verified by verification team
Open Research Data: The dataset contains the texts from "Elgold intermediate: annotated raw", additionally verified by the five-person verification team. Nearly 25% of the mentions were corrected in some aspect.
-
Elgold partial: Scientific papers' abstracts
Open Research Data: The dataset contains 87 scientific papers' abstracts in English, randomly chosen from the following scientific disciplines: Biomedicine, Life Sciences, Mathematics, Medicine, Science, Humanities, Social Science.
-
Elgold partial: Amazon product reviews
Open Research Data: The dataset contains 34 Amazon product reviews in English. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: Automotive blogs
Open Research Data: The dataset contains 34 English texts scraped from automotive blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and...
-
Elgold partial: Movie reviews
Open Research Data: The dataset contains 37 English texts with movie reviews. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: Job offers
Open Research Data: The dataset contains 34 English texts scraped from web portals with job offers. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity...
-
Elgold partial: History blogs
Open Research Data: The dataset contains 13 texts from English history blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives
Publication: Long-term Web archives comprise Web documents gathered over longer time periods and can easily reach hundreds of terabytes in size. Semantic annotations such as named entities can facilitate intelligent access to the Web archive data. However, the annotation of the entire archive content on this scale is often infeasible. The most efficient way to access the documents within Web archives is provided through their URLs, which are...
-
OntoValidate: OntoNotes 5.0 NER validation dataset
Open Research Data: The OntoValidate dataset consists of 603 randomly chosen raw texts from the original OntoNotes 5.0 dataset (3637 raw texts in total).
-
Szymon Olewniczak mgr inż.
People: I've been a part of the Gdansk University of Technology since 2013, when I started my bachelor's degree in computer science at the Faculty of Electronics, Telecommunications and Informatics. After receiving my master's degree in 2019, I've been working as an assistant at the Department of Computer Architecture. Since 2024, I have also been the deputy head of my department. My research interests revolve around various NLP-related topics,...
-
Towards Facts Extraction From Texts in Polish Language
Publication: The Polish language differs from English in many ways. It has more complicated conjugation and declension. Because of that, automatic fact extraction from texts is difficult. In this paper we present basic differences between those languages. The paper presents an algorithm for extraction of facts from articles from Polish Wikipedia. The algorithm is based on 7 proposed fact schemes that are searched for in the analyzed text....
-
DBpedia and YAGO Based System for Answering Questions in Natural Language
Publication: In this paper we propose a method for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for the TREC conference) based on DBpedia and YAGO. Our method is based on generating dependency trees for the query. In the dependency tree we look for paths leading from the root to the named entity of interest. These paths (referenced further as fibers) are candidates for representation of actual user intention. The...
-
Named Property Graphs
Publication
-
Revitalized Mill Island in Bydgoszcz - the identity of the place created by the Brda River and its tributary named Młynowka
Publication: Mill Island in Bydgoszcz, Poland is an example of downtown public space where a meander of the Młynówka creates the identity of the area. Before 2005, the only outstanding local feature was the fact that its south and west ends resembled the Venetian canals. The way the other parts of Mill Island were managed was inappropriate for a downtown. A comprehensive revitalization programme is returning Mill Island public space to residents,...
-
Euroregion as an Entity Stimulating the Sustainable Development of the Cross-Border Market for Cultural Services in a City Divided by a Border
Publication
-
Stable nanoconjugates of transferrin with alloyed quaternary nanocrystals Ag–In–Zn–S as a biological entity for tumor recognition
Publication: One way to limit the negative effects of anti-tumor drugs on healthy cells is targeted therapy employing functionalized drug carriers. Here we present a biocompatible and stable nanoconjugate of transferrin anchored to Ag-In-Zn-S quantum dots modified with 11-mercaptoundecanoic acid (Tf-QD) as a drug carrier versus typical anticancer drug, doxorubicin. Detailed investigations of Tf-QD nanoconjugates without and with doxorubicin...
-
Sylwester Kaczmarek dr hab. inż.
People: Sylwester Kaczmarek received his M.Sc. in electronics engineering, Ph.D. and D.Sc. in switching and teletraffic science from the Gdansk University of Technology, Gdansk, Poland, in 1972, 1981 and 1994, respectively. His research interests include: IP QoS and GMPLS and SDN networks, switching, QoS routing, teletraffic, multimedia services and quality of services. Currently, his research is focused on developing and applicability...
-
ITL International Journal of Applied Linguistics (formerly named ITL Review of Applied Linguistics)
Journals
-
Annotating Words Using WordNet Semantic Glosses
Publication: An approach to word sense disambiguation (WSD) relying on the WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with the state-of-the-art WSD methods indicates that the use of WordNet relations...
-
Grzegorz Zieliński dr inż.
People: Author of over 100 scientific publications (both in Polish and English) in the field of service management, entity improvement, including medical entities. Scientific and research interests include areas related to the maturity and excellence of enterprises in various aspects of their activities. He participated in research projects of the National Science Center and projects implemented by international consortia under the European...
-
Towards semantic-rich word embeddings
Publication: In recent years, word embeddings have been shown to improve the performance in NLP tasks such as syntactic parsing or sentiment analysis. While useful, they are problematic in representing ambiguous words with multiple meanings, since they keep a single representation for each word in the vocabulary. Constructing separate embeddings for meanings of ambiguous words could be useful for solving the Word Sense Disambiguation (WSD)...
-
Implementation of Business Intelligence in an IT organization - the concept of an evaluation model
Publication: This paper presents the issue of assessing the validity and effectiveness of implementing a Business Intelligence system in an IT Support Organization. This entity provides IT services to external clients involving, in particular, the storage and processing of large amounts of data. The vast amount of realized projects and also incidents reported in connection with those projects prevented effective decisions from being made without...
-
Karol Daliga dr inż.
People: In 2005, he graduated from a mathematics and physics class and passed his secondary school-leaving examination at the 1st Secondary School named after Władysław Gebik, former Polish grammar school in Kwidzyn. In the years 2005-2010 he completed master's studies at the Faculty of Applied Physics and Mathematics of Gdańsk University of Technology, and in 2008-2012 he completed engineering studies at the Faculty of Civil and Environmental Engineering...
-
Wikipedia and WordNet integration based on words co-occurrences
Publication: The article presents a method for automatic integration of two lexical resources: the semantic dictionary WordNet and the electronic encyclopaedia Wikipedia. Our goal is to automatically add a semantic tag (a WordNet synset identifier) to the title of the Wikipedia article. We have analyzed several different approaches to this problem and implemented our own solution, based on word occurrences in synset descriptions and the article body....
-
MODEL FOR MEASUREMENT OF FLOW INSTALLATION TIME IN SDN SWITCH
Publication: SDN is the approach in telecommunication networks that separates control plane from data forwarding plane by specifying a single network entity as a controller that defines rules (called flows) of traffic forwarding for the switches connected to it. The time that is required for installation of these rules might be a hindrance for the overall performance of SDN network. In the paper, a model for testing and evaluating the influence...
-
Angelica Pegani mgr
People: A graduate of the Faculty of Management and Economics of the Gdańsk University of Technology. She completed postgraduate management studies and the Entrepreneurship Program at the Massachusetts Institute of Technology. She started PhD studies and wrote a doctoral thesis based on social sciences. She holds numerous certificates confirming her knowledge of the English language, including from the British Council and the University...
-
Linking music data in executable documents
Publication: This paper presents the application of Interactive Open Document Architecture (IODA) to music and video data. This architecture was designed to create multilayer documents which consist of many files. The paper shows the method of creating media documents on the basis of IODA. These kinds of documents were called IODA Media Documents (IMD). IMD have links that connect many different kinds of files containing music and video data....
-
Fast Approximate String Search for Wikification
Publication: The paper presents a novel method for fast approximate string search based on neural distance metrics embeddings. Our research is focused primarily on applying the proposed method for entity retrieval in the Wikification process, which is similar to edit distance-based similarity search on the typical dictionary. The proposed method has been compared with symmetric delete spelling correction algorithm and proven to be more efficient...
-
Elgold intermediate: raw texts
Open Research Data: The dataset contains raw texts scraped from various internet sources, which were used for creating the Elgold dataset.
-
Social networks as a context for small business? A new look at an enterprise in the context of a smallness and newness liability syndrome
Publication: In this paper we aim to propose and outline the key ingredients of a small enterprise's success, emerging from the social capital of small business owner-managers and their business networks. We employ the resource-based view of an organization as well as an embeddedness perspective, along with the new approach to transaction costs, to outline the pillars of an advantage of a small business entity. The analysis of survey data leads us to conclusion,...
-
Is it all about networking? Building a sustainable value of a small enterprise in Polish context
Publication: In this paper we aim to propose and outline the key ingredients of a small enterprise's success, emerging from the social capital of small business owner-managers and their business networks. We employ the resource-based view of an organization as well as an embeddedness perspective, along with the new approach to transaction costs, to outline the pillars of an advantage of a small business entity. The analysis of survey data leads us to conclusion,...
-
Using Decisional DNA to Enhance Industrial and Manufacturing Design: Conceptual Approach
Publication: In recent years, manufacturing organizations have been facing market changes such as the need for short product life cycles, technological advancement, intense pressure from competitors and the continuous customers’ expectation for high quality products at lower costs. In this scenario, knowledge and its associated engineering/management of every stage involved in the industrial design has become increasingly important for manufacturing...
-
Towards the 4th industrial revolution: networks, virtuality, experience based collective computational intelligence, and deep learning
Publication: Quo vadis, Intelligent Enterprise? Where are you going? The authors of this paper aim at providing some answers to this fascinating question addressing emerging challenges related to the concept of semantically enhanced knowledge-based cyber-physical systems – the fourth industrial revolution named Industry 4.0.
-
Costs of privatization of the banking sector in Poland in 1997-2000
Open Research Data: On June 14, 1996, a special law was passed on the merger and grouping of certain banks in the form of joint-stock companies. Pursuant to these regulations, the PeKaO S.A. banking group was established, which was the only entity of this type established in this way. Additionally, by the end of 1996, four out of nine regional banks were sold, i.e. Wielkopolski...
-
Hedging Strategies of Derivatives Instruments for Commodity Trading Entities
Publication: Hedging, as an outcome of risk management, raises several questions. The size of the hedge is one of them. Further questions concern whether a producer or manufacturer is willing to secure the entire exposure, when the hedging should start (now or later in the future), what the market outlook is (direction of the market, time of interest, magnitude of exposure), what would be the preferred instruments of...
-
Sensorless Control of Induction Machine Supplied by Current Source Inverter
Publication: The paper describes the voltage control technique of induction machines supplied by a current source inverter. The control system is based on proposed new multi-scalar variables, which are named “r.” The control system contains the output filter capacitor's model. In the sensorless control system the Z type backstepping speed observer was applied. The mathematical dependences are confirmed by simulation and experimental research.
-
Virtual touchpad - video-based multimodal interface
Publication: A new computer interface named Virtual-Touchpad (VTP) is presented. The Virtual-Touchpad provides a multimodal interface which enables controlling computer applications by hand gestures captured with a typical webcam. The video stream is processed in the software layer of the interface. Hitherto existing video-based interfaces analyzing frames of hand gestures are presented. Then, the hardware configuration and software features...
-
ArchBGal32cB 441Glu mutein gene analysis dataset
Open Research Data
-
List of public benefit organizations that in 2020 received 1% of the tax due for 2019
Open Research Data: The possibility of transferring 1% of personal income tax was introduced by the Act on Public Benefit and Volunteer Work in 2003, and specific provisions specifying who and how can transfer 1% of tax are included in the Personal Income Tax Act. In order to be able to accept 1% of income tax, first of all, the organization (or other authorized entity)...
-
Activated Sludge Process Development
Publication: This paper summarizes the most significant steps in the activated sludge process development and recognizes key contributors. Recognition of the roles of oxygen and living organisms was the first step (1882-1914). Ardern and Lockett (1914) named the accumulated solids "activated sludge". The process was rapidly accepted and applied in the period 1914-1930. The most dramatic changes in the activated sludge process understanding and...
-
XRD-TiO2 and SiO2
Open Research Data: Data contain results from XRD measurements of amorphous silica and TiO2 of anatase and rutile phases. The commercial TiO2 named P25, produced by Evonik, was also analyzed.
-
Potential of Polish R&D industry in the context of prototyping, design, development and control of a dedicated national satellite SAR system for marine ecosystem monitoring. Technical paper - preliminary study
Publication: Space technology is currently one of the most important elements in the advance of information societies and knowledge-based economies all over the world. The European Space Agency (ESA) is the focal point of European space activities, while the European Union provides strong financial support for the development of space technologies and applications in its flagship programs. In a domestic scope, the Polish Space Agency (POLSA)...
-
ECONOMICAL AND SAFE METHOD OF GRANULAR MATERIAL STORAGE IN SILOS IN OFFSHORE PORT TERMINALS
Publication: The article discusses issues related to the storage of granular materials in silos made of corrugated sheets and reinforced with vertical ribs. Advantages and disadvantages of these structures are named, and typical technological solutions used by the largest silo producers are presented. Moreover, basic assumptions of Eurocode 3 are discussed in the context of determining the buckling load capacity of a ribbed jacket. Alternative methods...
-
QoS Resource Reservation Mechanisms for Switched Optical Networks
Publication: The paper regards the problem of resource reservation mechanisms for Quality of Service support in switched optical networks. The authors propose modifications and extensions for resources reservation strategy algorithms with resources pools, link capacity threshold and adaptive advance reservation approach. They examine proposed solutions in Automatically Switched Optical Network with Generalized Multi-Protocol Label Switching...
-
New type T-Source inverter
Publication: This paper presents different topologies of voltage inverters with alternative input LC networks. The basic topology is known in the literature as a Z-source inverter (ZSI). Alternative passive networks were named by the authors as T-sources. T-source inverter has fewer reactive components in comparison to conventional Z-source inverter. The most significant advantage of the T-source inverter (TSI) is its use of a common voltage...