Filtry
wszystkich: 626
wybranych: 389
-
Katalog
Filtry wybranego katalogu
Wyniki wyszukiwania dla: NATURAL LANGUAGE PROCESSING, LARGE LANGUAGE MODELS, DATA MINING, QUANTUM PHYSICS
-
Extracting concepts from the software requirements specification using natural language processing
PublikacjaExtracting concepts from the software require¬ments is one of the first step on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....
-
Evaluation of ChatGPT Applicability to Learning Quantum Physics
PublikacjaChatGPT is an application that uses a large language model. Its purpose is to generate answers to various questions as well as provide information, help solve problems and participate in conversations on a wide range of topics. This application is also widely used by students for the purposes of learning or cheating (e.g., writing essays or programming codes). Therefore, in this contribution, we evaluate the ability of ChatGPT...
-
Big Data i 5V – nowe wyzwania w świecie danych (Big Data and 5V – New Challenges in the World of Data)
PublikacjaRodzaje danych, składające się na zbiory typu Big Data, to m.in. dane generowane przez użytkowników portali internetowych, dane opisujące transakcje dokonywane poprzez Internet, dane naukowe (biologiczne, astronomiczne, pomiary fizyczne itp.), dane generowane przez roboty w wyniku automatycznego przeszukiwania przez nie Internetu (Web mining, Web crawling), dane grafowe obrazujące powiązania pomiędzy stronami WWW itd. Zazwyczaj,...
-
Language Models in Speech Recognition
PublikacjaThis chapter describes language models used in speech recognition, It starts by indicating the role and the place of language models in speech recognition. Mesures used to compare language models follow. An overview of n-gram, syntactic, semantic, and neural models is given. It is accompanied by a list of popular software.
-
Music Data Processing and Mining in Large Databases for Active Media
PublikacjaThe aim of this paper was to investigate the problem of music data processing and mining in large databases. Tests were performed on a large data-base that included approximately 30000 audio files divided into 11 classes cor-responding to music genres with different cardinalities. Every audio file was de-scribed by a 173-element feature vector. To reduce the dimensionality of data the Principal Component Analysis (PCA) with variable...
-
Knowledge Base Suitable for Answering Questions in Natural Language
PublikacjaThis paper presents three knowledge bases widely used by researchers coping with natural language processing: OpenCyc, DBpedia and YAGO. They are characterized from the point of view of questions answering system. In this paper a short description of the aforementioned system implementation is also presented.
-
OrphaGPT: An Adapted Large Language Model for Orphan Diseases Classification
PublikacjaOrphan diseases (OD) represent a category of rare conditions that affect only a relatively small number of individuals. These conditions are often neglected in research due to the challenges posed by their scarcity, making medical advancements difficult. Then, the ever-evolving medical research and diagnosis landscape calls for more attention and innovative approaches to address the complex challenges of rare diseases and OD. Pre-trained...
-
Information Extraction from Polish Radiology Reports using Language Models
PublikacjaRadiology reports are vital elements of directing patient care. They are usually delivered in free text form, which makes them prone to errors, such as omission in reporting radiological findings and using difficult-to-comprehend mental shortcuts. Although structured reporting is the recommended method, its adoption continues to be limited. Radiologists find structured reports too limiting and burdensome. In this paper, we propose...
-
DBpedia and YAGO as Knowledge Base for Natural Language Based Question Answering—The Evaluation
PublikacjaThe idea of automatic question answering system has a very long history. Despite constant improvement of the systems asking questions in the natural language requires very complex solutions. In this paper the DBpedia and YAGO are evaluated as a knowledge bases for simple class 1 and 2 question answering system. For this purpose a question answering system was designed and implemented. The proposed solution and the knowledge bases...
-
Natural language dictionaries implemented as finite automata
PublikacjaRozdział przedstawia wykorzystanie automatów skończonych jako słowników języka naturalnego. Podane są podstawy teoretyczne. Omówione są zastosowania: realizacja doskonałej funkcji mieszającej, analizy i syntezy morfologicznej, poprawiania pisowni i dopisywania znaków diakrytycznych, wydobywanie informacji. Podano algorytmy tworzenia automatów oraz omówiono sposoby reprezentacji automatów z uwzględnieniem kompresji.
-
Fluent Editor and Controlled Natural Language in Ontology Development
Publikacja -
Semantic rules representation in controlled natural language in FluentEditor
PublikacjaThis paper presents a way of representation of semantic rules (SWRL) in controlled English in order to facilitate understanding the rules by humans interacting with a machine. This approach (implemented in FluentEditor) may be applied in many domains, where the understandability of the rules used to support a decision process is of great importance.
-
Finite automata for compact representation of language models in NLP
PublikacjaPrzedstawiona zostaje technika reprezentacji modeli języka w przetwarzaniu języka naturalnego wymagająca mało pamięci. Po krótkim omówieniu przyczyn poszukiwania oszczędnej reprezentacji takich modeli języka, pokazane jest, jak automaty skończone mogą być użyte w tym celu. Technika może być postrzegana jako zastosowanie i rozszerzenie doskonałej funkcji mieszającej z wykorzystaniem automatów skończonych. Pierwsze doświadczenia...
-
DBpedia and YAGO Based System for Answering Questions in Natural Language
PublikacjaIn this paper we propose a method for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for TREC conference) based on DBpedia and YAGO. Our method is based on generating dependency trees for the query. In the dependency tree we look for paths leading from the root to the named entity of interest. These paths (referenced further as fibers) are candidates for representation of actual user intention. The...
-
Visual Low-Code Language for Orchestrating Large-Scale Distributed Computing
Publikacja -
Evaluation of Multimedia Stream Processing Modeling Language from the Perspective of Cognitive Dimensions
PublikacjaW referacie zawarto opis zastosowania wymiarów poznawczych do oceny języka modelowania przetwarzania strumieni multimedialnych, nazwanego MSP-ML, w trakcie tworzenia tego języka. Poszczególne części referatu prezentują kontekst i motywacje oceny MSP-ML, metodę oceny, rezultaty oceny oraz porównanie rezultatów oceny z wynikami otrzymanymi za pomocą innych metod oceny języków modelowania wizualnego.
-
Using LSTM networks to predict engine condition on large scale data processing framework
PublikacjaAs the Internet of Things technology is developing rapidly, companies have an ability to observe the health of engine components and constructed systems through collecting signals from sensors. According to output of IoT sensors, companies can build systems to predict the conditions of components. Practically the components are required to be maintained or replaced before the end of life in performing their assigned task. Predicting...
-
An Adversarial Machine Learning Approach on Securing Large Language Model with Vigil, an Open-Source Initiative
PublikacjaSeveral security concerns and efforts to breach system security and prompt safety concerns have been brought to light as a result of the expanding use of LLMs. These vulnerabilities are evident and LLM models have been showing many signs of hallucination, repetitive content generation, and biases, which makes them vulnerable to malicious prompts that raise substantial concerns in regard to the dependability and efficiency of such...
-
Comparison of Language Models Trained on Written Texts and Speech Transcripts in the Context of Automatic Speech Recognition
Publikacja -
Path integrals formulations leading to propagator evaluation for coupled linear physics in large geometric models
PublikacjaReformulating linear physics using second kind Fredholm equations is very standard practice. One of the straightforward consequences is that the resulting integrals can be expanded (when the Neumann expansion converges) and probabilized, leading to path statistics and Monte Carlo estimations. An essential feature of these algorithms is that they also allow to estimate propagators for all types of sources, including initial conditions....
-
Data Acquisition and Processing for GeoAI Models to Support Sustainable Agricultural Practices
PublikacjaThere are growing opportunities to leverage new technologies and data sources to address global problems related to sustainability, climate change, and biodiversity loss. The emerging discipline of GeoAI resulting from the convergence of AI and Geospatial science (Geo-AI) is enabling the possibility to harness the increasingly available open Earth Observation data collected from different constellations of satellites and sensors...
-
Using LSTM networks to predict engine condition on large scale data processing framework
Publikacja -
Ontology-Aided Software Engineering
PublikacjaThis thesis is located between the fields of research on Artificial Intelligence (AI), Knowledge Representation and Reasoning (KRR), Computer-Aided Software Engineering (CASE) and Model Driven Engineering (MDE). The modern offspring of KRR - Description Logic (DL) [Baad03] is considered here as a formalization of the software engineering Methods & Tools. The bridge between the world of formal specification (governed by the mathematics)...
-
Towards facts extraction from text in Polish language
PublikacjaNatural Language Processing (NLP) finds many usages in different fields of endeavor. Many tools exists allowing analysis of English language. For Polish language the situation is different as the language itself is more complicated. In this paper we show differences between NLP of Polish and English language. Existing solutions are presented and TEAMS software for facts extraction is described. The paper shows also evaluation of...
-
Text-mining Similarity Approximation Operators for Opinion Mining in BI tools
PublikacjaThe concept of the Text-mining Similarity Approximation Operators for Opinion Mining as extensions to Natural Language Interface Database is defined. The new operators: “keywords of” dimension; subsetting operator “about C is q”; aggregation operator “by similar C” are proposed. These operators are based on the Latent Semantic Analysis and Social Network Analysis
-
Semantic OLAP with FluentEditor and Ontorion Semantic Excel Toolchain
PublikacjaSemantic technologies appear as a step on the way to creating systems capable of representing the physical world as real time computational processes. In this context, the paper presents a toolchain for an ontology based knowledge management system. It consists of the ontology editor, FluentEditor and the distributed knowledge representation system, Ontorion. FluentEditor is a comprehensive tool for editing and manipulating complex...
-
Evaluation of a company’s image on social media using the Net Sentiment Rate
PublikacjaVast amounts of new types of data are constantly being created as a result of dynamic digitization in all areas of our lives. One of the most important and valuable categories for business is data from social networks such as Facebook. Feedback resulting from the sharing of thoughts and emotions, expressed in comments on various products and services, is becoming the key factor on which modern business is based. This feedback is...
-
Question Answering System to Answer Questions About Technical Documentation
PublikacjaThis article ventures into the realm of specialized AI systems for question answering, with a specific focus on programming languages, using Rust as the case study. Our research harnesses the capabilities of BERT, a leading model in natural language processing, to explore its effectiveness in interpreting and responding to complex, domain-specific queries. We have developed a novel dataset, derived from Rust's detailed documentation,...
-
Agile Commerce in the light of Text Mining
PublikacjaThe survey conducted for this study reveals that more than 84% of respondents have never encountered the term “agile commerce” and do not understand its meaning. At the same time, they are active participants of this strategy. Using digital channels as customers more often than ever before, they have already been included in the agile philosophy. Based on the above, the purpose of the study is to analyse major text sets containing...
-
From Sequential to Parallel Implementation of NLP Using the Actor Model
PublikacjaThe article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...
-
A new library for construction of automata
PublikacjaWe present a new library of functions that construct minimal, acyclic, deterministic, finite-state automata in the same format as the author's fsa package, and also accepted by the author's fadd library of functions that use finite-state automata as dictionaries in natural language processing.
-
Previous Opinions is All You Need - Legal Information Retrieval System
PublikacjaWe present a system for retrieving the most relevant legal opinions to a given legal case or question. To this end, we checked several state-of-the-art neural language models. As a training and testing data, we use tens of thousands of legal cases as question-opinion pairs. Text data has been subjected to advanced pre-processing adapted to the specifics of the legal domain. We empirically chose the BERT-based HerBERT model to perform...
-
Quantifying inconsistencies in the Hamburg Sign Language Notation System
PublikacjaThe advent of machine learning (ML) has significantly advanced the recognition and translation of sign languages, bridging communication gaps for hearing-impaired communities. At the heart of these technologies is data labeling, crucial for training ML algorithms on a huge amount of consistently labeled data to achieve models that generalize well. The adoption of language-agnostic annotations is essential to connect different sign...
-
A Model-Driven Solution for Development of Multimedia Stream Processing Applications
PublikacjaThis paper presents results of action research related to model-driven solutions in the area of multimedia stream processing. The practical problem to be solved was the need to support application developers who make their multimedia stream processing applications in a supercomputer environment. The solution consists of a domain-specific visual language for composing complex services from simple services called Multimedia Stream...
-
Utilising AI Models to Analyse the Relationship between Battlefield Developments in the Russian-Ukrainian War and Fluctuations in Stock Market Values
PublikacjaThis study examines the impact of battlefield developments in the ongoing Russian–Ukrainian war, which to date has lasted over 1000 days, on the stock prices of defence corporations such as BAE Systems, Booz Allen Hamilton, Huntington Ingalls, and Rheinmetall AG. Stock prices were analysed alongside sentiment data extracted from news articles, and processed using machine learning models leveraging natural...
-
An efficient incremental DFA minimization algorithm
PublikacjaW tym artykule przedstawiamy nowy algorytm minimalizacji deterministycznego automatu skończonego. Algorytm jest przyrostowy - może być zatrzymany w dowolnym momencie, dając częściowo zminimalizowany automat. Wszystkie inne (znane) algorytmy minimalizacji dają wyniki pośrednie nieprzydatne dla częściowej minimalizacji. Ponieważ pierwszy algorytm jest łatwo zrozumiały ale mało wydajny, rozważamy trzy praktyczne, znaczące usprawnienia....
-
SMAQ - A Semantic Model for Analitical Queries
PublikacjaWhile the Self-Service Business Intelligence (BI) becomes an important part of organizational BI solutions there is a great need for new tools allowing to construct ad-hoc queries by users with various responsibilities and skills. The paper presents a Semantic Model for Analytical Queries – SMAQ allowing to construct queries by users familiar with business events and terms, but being unaware of database or data warehouse concepts...
-
Language material for English audiovisual speech recognition system developmen . Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
PublikacjaThe bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Badania empiryczne związane z ewolucją języków - wybrane zagadnienia
PublikacjaAlthough language evolution is an area in science yet to be developed, its foundations lay on empirical research. The aim of this article is to present three categories of ways to get empirical data on language evolution: observing language in laboratory, monitoring animal communication and analysing pidgins and creoles. The part of the paper about language in laboratory bases on English-language articles presenting the experiments...
-
Asking Data in a Controlled Way with Ask Data Anything NQL
PublikacjaWhile to collect data, it is necessary to store it, to understand its structure it is necessary to do data-mining. Business Intelligence (BI) enables us to make intelligent, data-driven decisions by the mean of a set of tools that allows the creation of a potentially unlimited number of machine-generated, data-driven reports, which are calculated by a machine as a response to queries specified by humans. Natural Query Languages...
-
Collaborative approach to WordNet and Wikipedia integration
PublikacjaIn this article we present a collaborative approach tocreating mappings between WordNet and Wikipedia. Wikipediaarticles have been first matched with WordNet synsets in anautomatic way. Then such associations have been evaluated andcomplemented in a collaborative way using a web application.We describe algorithms used for creating automatic mappingsas well as a system for their collaborative development. Theoutcome enables further...
-
Scoreboard Architectural Pattern and Integration of Emotion Recognition Results
PublikacjaThis paper proposes a new design pattern, named Scoreboard , dedicated for applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of Blackboard design pattern and comes under...
-
Geometric Algebra Model of Distributed Representations
PublikacjaFormalism based on GA is an alternative to distributed representation models developed so far-Smolensky's tensor product, Holographic Reduced Representations (HRR) and Binary Spatter Code (BSC). Convolutions are replaced by geometric products, interpretable in terms of geometry which seems to be the most natural language for visualization of higher concepts. This paper recalls the main ideas behind the GA model and investigates...
-
Review of the Complexity of Managing Big Data of the Internet of Things
PublikacjaTere is a growing awareness that the complexity of managing Big Data is one of the main challenges in the developing feld of the Internet of Tings (IoT). Complexity arises from several aspects of the Big Data life cycle, such as gathering data, storing them onto cloud servers, cleaning and integrating the data, a process involving the last advances in ontologies, such as Extensible Markup Language (XML) and Resource Description...
-
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
PublikacjaThe article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...
-
English Language Learning Employing Developments in Multimedia IS
PublikacjaIn the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...
-
Threat intelligence platform for the energy sector
PublikacjaIn recent years, critical infrastructures and power systems in particular have been subjected to sophisticated cyberthreats, including targeted attacks and advanced persistent threats. A promising response to this challenging situation is building up enhanced threat intelligence that interlinks information sharing and fine-grained situation awareness. In this paper a framework which integrates all levels of threat intelligence...
-
Exact-match Based Wikipedia-WordNet Integration
PublikacjaAbility to link between WordNet synsets and Wikipedia articles allows usage of those resources by computers during natural language processing. A lot of work was done in this field, however most of the approaches focus on similarity between Wikipedia articles and WordNet synsets rather than creation of perfect matches. In this paper we proposed a set of methods for automatic perfect matching generation. The proposed methods were...
-
Knowledge base views
PublikacjaThe paper introduces an extension to the NeeK language. In the current shape NeeK allows for selection of fragments of a given ontology. The selected part is automatically mapped to a database schema by Data Views implementation. Experience with a real system using Data Views has shown that the resulting database schema does not necessarily reflect the needs of the business logic of an application that uses a specific Data View....
-
Viability of decisional DNA in robotics
PublikacjaThe Decisional DNA is an artificial intelligence system that uses prior experiences to shape future decisions. Decisional DNA is written in the Set Of Experience Knowledge Structure (SOEKS) and is capable of capturing and reusing a broad range of data. Decisional DNA has been implemented in several fields including Alzheimer’s diagnosis, geothermal energy and smart TV. Decisional DNA is well suited to use in robotics due to the...