Search results for: LANGUAGE MODELS
-
Language Models in Speech Recognition
PublicationThis chapter describes language models used in speech recognition. It starts by indicating the role and the place of language models in speech recognition. Measures used to compare language models follow. An overview of n-gram, syntactic, semantic, and neural models is given. It is accompanied by a list of popular software.
-
Finite automata for compact representation of language models in NLP
PublicationA technique for representing language models in natural language processing that requires little memory is presented. After a brief discussion of the reasons for seeking a compact representation of such language models, it is shown how finite automata can be used for this purpose. The technique can be viewed as an application and extension of perfect hashing with finite automata. First experiments...
-
Information Extraction from Polish Radiology Reports using Language Models
PublicationRadiology reports are vital elements of directing patient care. They are usually delivered in free text form, which makes them prone to errors, such as omission in reporting radiological findings and using difficult-to-comprehend mental shortcuts. Although structured reporting is the recommended method, its adoption continues to be limited. Radiologists find structured reports too limiting and burdensome. In this paper, we propose...
-
Comparison of Language Models Trained on Written Texts and Speech Transcripts in the Context of Automatic Speech Recognition
Publication
-
Quantifying inconsistencies in the Hamburg Sign Language Notation System
PublicationThe advent of machine learning (ML) has significantly advanced the recognition and translation of sign languages, bridging communication gaps for hearing-impaired communities. At the heart of these technologies is data labeling, crucial for training ML algorithms on a huge amount of consistently labeled data to achieve models that generalize well. The adoption of language-agnostic annotations is essential to connect different sign...
-
DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING
PublicationThe algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...
-
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
PublicationThe article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...
-
A survey of automatic speech recognition deep models performance for Polish medical terms
PublicationAmong the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....
-
An Approach to Trust Case Development
PublicationIn the paper we present an approach to the architectural trust case development for DRIVE, the IT infrastructure supporting the processes of drugs distribution and application. The objectives of DRIVE included safer and cheaper drugs distribution and application. A trust case represents an argument supporting the trustworthiness of the system. It is decomposed into claims that postulate some trust related properties. Claims differ...
-
A Model-Driven Solution for Development of Multimedia Stream Processing Applications
PublicationThis paper presents results of action research related to model-driven solutions in the area of multimedia stream processing. The practical problem to be solved was the need to support application developers who make their multimedia stream processing applications in a supercomputer environment. The solution consists of a domain-specific visual language for composing complex services from simple services called Multimedia Stream...
-
Geometric Algebra Model of Distributed Representations
PublicationFormalism based on GA is an alternative to distributed representation models developed so far: Smolensky's tensor product, Holographic Reduced Representations (HRR) and Binary Spatter Code (BSC). Convolutions are replaced by geometric products, interpretable in terms of geometry which seems to be the most natural language for visualization of higher concepts. This paper recalls the main ideas behind the GA model and investigates...
-
Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning
PublicationText-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition models (ASR). Access to medical speech data is limited because it is sensitive data that is difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...
-
Modelling of the High Speed Multi-Pole Synchronous Generator for Application in More Electric Aircraft Power Systems
PublicationIn this paper different models of the synchronous generator are presented. The simulation results compared with the measurements are shown. Certain physical phenomena are included in the described models for the purpose of adequate analysis of the more electric aircraft power system. For different modelling levels, such as functional level or behavioural level, different physical phenomena have been included. Simulation results for...
-
Previous Opinions is All You Need - Legal Information Retrieval System
PublicationWe present a system for retrieving the most relevant legal opinions to a given legal case or question. To this end, we checked several state-of-the-art neural language models. As training and testing data, we use tens of thousands of legal cases as question-opinion pairs. Text data has been subjected to advanced pre-processing adapted to the specifics of the legal domain. We empirically chose the BERT-based HerBERT model to perform...
-
Using FreeFEM open software for modelling the vibrations of piezoelectric devices
PublicationModelling vibrations of piezoelectric transducers has been a topic discussed in the literature for many decades. The first models - so-called one-dimensional - describe the vibrations only near operating frequency and near its harmonics. Attempts to introduce two-dimensional models were related to the possibility of one transducer working at several frequencies, including both thickness vibrations and those resulting from the transducer...
-
The Algorithm of Modelling and Analysis of Latent Semantic Relations: Linear Algebra vs. Probabilistic Topic Models
PublicationThis paper presents the algorithm of modelling and analysis of Latent Semantic Relations inside the argumentative type of documents collection. The novelty of the algorithm consists in using a systematic approach: in the combination of the probabilistic Latent Dirichlet Allocation (LDA) and Linear Algebra based Latent Semantic Analysis (LSA) methods; in considering each document as a complex of topics, defined on the basis of separate...
-
SMAQ - A Semantic Model for Analitical Queries
PublicationWhile the Self-Service Business Intelligence (BI) becomes an important part of organizational BI solutions there is a great need for new tools allowing to construct ad-hoc queries by users with various responsibilities and skills. The paper presents a Semantic Model for Analytical Queries – SMAQ allowing to construct queries by users familiar with business events and terms, but being unaware of database or data warehouse concepts...
-
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
PublicationThe main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...
-
Prediction of fracture toughness in fibre-reinforced concrete, mortar, and rocks using various Machine learning techniques
PublicationMachine Learning (ML) method is widely used in engineering applications such as fracture mechanics. In this study, twenty different ML algorithms were employed and compared for the prediction of the fracture toughness and fracture load in modes I, II, and mixed-mode (I-II) of various materials, including fibre-reinforced concrete, cement mortar, sandstone, white travertine, marble, and granite. A set of 401 specimens of “Brazilian...
-
News that Moves the Market: DSEX-News Dataset for Forecasting DSE Using BERT
PublicationStock market is a complex and dynamic industry that has always presented challenges for stakeholders and investors due to its unpredictable nature. This unpredictability motivates the need for more accurate prediction models. Traditional prediction models have limitations in handling the dynamic nature of the stock market. Additionally, previous methods have used less relevant data, leading to suboptimal performance. This study...
-
Beyond Traditional Learning: The LLM Revolution in BPM Education at University
PublicationLarge Language Models (LLMs) significantly impact higher education, requiring changes in educational processes, especially in Business Process Management (BPM) practical exercises. The research aims to evaluate the effectiveness of LLMs in BPM education to determine if LLMs can supplement educators. The study involved 33 master’s degree students. Students’ works were manually evaluated and compared to LLM-generated responses. Results...
-
Modelling and simulation of GPU processing in the MERPSYS environment
PublicationIn this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...
-
An Analysis of Neural Word Representations for Wikipedia Articles Classification
PublicationOne of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...
-
Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation
PublicationIn this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...
-
Semantic modeling of contextual augmented reality environments
PublicationDespite significant progress in the field of augmented reality (AR), regarding both hardware and software, there is still a lack of universal models and methods that would enable building ubiquitous AR systems that could be used anywhere and anytime, covering different application areas. This dissertation describes a new approach to building AR systems, called the Contextual Augmented Reality Environment (CARE). The CARE approach...
-
Badania empiryczne związane z ewolucją języków - wybrane zagadnienia
PublicationAlthough language evolution is an area of science yet to be developed, its foundations lie in empirical research. The aim of this article is to present three categories of ways to get empirical data on language evolution: observing language in the laboratory, monitoring animal communication and analysing pidgins and creoles. The part of the paper about language in the laboratory is based on English-language articles presenting the experiments...
-
Towards facts extraction from text in Polish language
PublicationNatural Language Processing (NLP) finds many usages in different fields of endeavor. Many tools exist that allow analysis of the English language. For the Polish language the situation is different, as the language itself is more complicated. In this paper we show differences between NLP of the Polish and English languages. Existing solutions are presented and TEAMS software for facts extraction is described. The paper also shows evaluation of...
-
Ontology of the Design Pattern Language for Smart Cities Systems
PublicationThe paper presents the definition of the design pattern language of Smart Cities in the form of an ontology. Since the implementation of a Smart City system is difficult, expensive and closely linked with the problems concerning a given city, the knowledge acquired during a single implementation is extremely valuable. The language we defined supports the management of such knowledge as it allows for the expression of a solution...
-
Semantic OLAP with FluentEditor and Ontorion Semantic Excel Toolchain
PublicationSemantic technologies appear as a step on the way to creating systems capable of representing the physical world as real time computational processes. In this context, the paper presents a toolchain for an ontology based knowledge management system. It consists of the ontology editor, FluentEditor and the distributed knowledge representation system, Ontorion. FluentEditor is a comprehensive tool for editing and manipulating complex...
-
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
PublicationThe goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...
-
Ontology-Aided Software Engineering
PublicationThis thesis is located between the fields of research on Artificial Intelligence (AI), Knowledge Representation and Reasoning (KRR), Computer-Aided Software Engineering (CASE) and Model Driven Engineering (MDE). The modern offspring of KRR - Description Logic (DL) [Baad03] is considered here as a formalization of the software engineering Methods & Tools. The bridge between the world of formal specification (governed by the mathematics)...
-
Techno-economic evaluation of combined cycle gas turbine and a diabatic compressed air energy storage integration concept
PublicationMore and more operational flexibility is required from conventional power plants due to the increasing share of weather-dependent renewable energy sources (RES) generation in the power system. One way to increase power plant’s flexibility is integrating it with energy storage. The energy storage facility can be used to minimize ramping or shutdowns and therefore should lower overall generating costs and CO2 emissions. In this article,...
-
Language material for English audiovisual speech recognition system development. Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
PublicationThe bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Improving flexibility and performance of PVM applications by distributed partial evaluation
PublicationA new framework for developing both flexible and efficient PVM applications is described. We propose Architecture Templates Interface (ATI) that allows to control application granularity and parallelism. To ensure high application efficiency we extend partial evaluation strategy into domain of distributed applications obtaining Distributed Partial Evaluation (DPE). Both ATI and DPE were implemented using a new distributed programming...
-
Knowledge base views
PublicationThe paper introduces an extension to the NeeK language. In the current shape NeeK allows for selection of fragments of a given ontology. The selected part is automatically mapped to a database schema by Data Views implementation. Experience with a real system using Data Views has shown that the resulting database schema does not necessarily reflect the needs of the business logic of an application that uses a specific Data View....
-
Exploring the preferences of Polish EFL teachers towards the accents of English
PublicationThis language attitudes study investigates the preferences of EFL (English as a foreign language) teachers from Poland towards the accents of English they speak and teach. Despite the substantial amount of research on EFL learners, little has been done to investigate the impact of preferences of Polish teachers for different variations of English language on their...
-
The Principles of Model Building Concepts Which Are Applied to the Design Patterns for Smart Cities
PublicationThe involvement of citizens into decision-making processes is one of the main features of smart cities. Such commitment is reflected in the form of requirements towards the city, and the benefits which are expected from the city. Requirements and benefits are thus the primary language of communication between decision-makers and urban residents. To develop such a language, it becomes necessary to develop design patterns for Smart...
-
Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi
PublicationParallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator:...
-
S’attaquer à la suprématie du masculin sur le féminin : le français inclusif dans les publications des universités françaises dans les réseaux sociaux
PublicationThis paper aims to examine the use of inclusive French in the Internet publications of Paris universities on their social media. Three higher education institutions were selected: Paris Dauphine-PSL University, Gustave Eiffel University, and Sorbonne Paris North University. The publications were obtained from Facebook, Instagram, and LinkedIn. Firstly, the groups of people to whom the use of inclusive French referred...
-
Individual Resources and Intercultural Interactions
PublicationThe work environment in multinational corporations (MNCs) is specific and demanding including intercultural interactions with co-workers and clients and using a foreign language. Some individual resources can help in dealing with these circumstances. Individual resources refer to personal dispositions, competencies and prior experiences. With regard to previous studies, a caravan of personal resources, namely Psychological Capital...
-
Towards Facts Extraction From Texts in Polish Language
PublicationThe Polish language differs from English in many ways. It has more complicated conjugation and declension. Because of that, automatic facts extraction from texts is difficult. In this paper we present basic differences between those languages. The paper presents an algorithm for extraction of facts from articles from Polish Wikipedia. The algorithm is based on 7 proposed facts schemes that are searched for in the analyzed text....
-
CHALK & TALK OR SWIPE & SKYPE?
PublicationTechnology in the classroom is a matter of heated discussions in the field of education development, especially when multidisciplinary education goes along with language skills. Engineers’ education requires theoretical and practical knowledge. Moreover, dedicated computer skills become crucial for both young graduates and experienced educators on the labor market. Teaching online with or without using different Learning Management...
-
Extracting concepts from the software requirements specification using natural language processing
PublicationExtracting concepts from the software requirements is one of the first steps on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....
-
Ontology clustering by directions algorithm to expand ontology queries
PublicationThis paper concerns formulating ontology queries. It describes existing languages in which ontologies can be queried. It focuses on languages which are intended to be easily understood by users who are willing to retrieve information from ontologies. Such a language can be, for example, a type of controlled natural language (CNL). In this paper a novel algorithm called Ontology Clustering by Directions is presented. The algorithm...
-
Workflow patterns applicable to virtual knowledge-based organizations
PublicationWorkflow is a term specifying how to automate a business process, in whole or in part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules. Workflow is therefore directly applicable in virtual knowledge-based organizations, where information is exchanged via electronic documents. The literature presents a complete list of workflow control-flow...
-
Knowledge Base Suitable for Answering Questions in Natural Language
PublicationThis paper presents three knowledge bases widely used by researchers working on natural language processing: OpenCyc, DBpedia and YAGO. They are characterized from the point of view of a question answering system. A short description of the implementation of such a system is also presented.
-
A new library for construction of automata
PublicationWe present a new library of functions that construct minimal, acyclic, deterministic, finite-state automata in the same format as the author's fsa package, and also accepted by the author's fadd library of functions that use finite-state automata as dictionaries in natural language processing.
-
Learning design of a blended course in technical writing
PublicationBlending face-to-face classes with e-learning components can lead to a very successful outcome if the blend of approaches, methods, content, space, time, media and activities is carefully structured and approached from both the student’s and the tutor’s perspective. In order to blend synchronous and asynchronous e-learning activities with traditional ones, educators should make them inter-dependent and develop them according to...
-
Testing for conformance of parallel programming pattern languages
PublicationThis paper reports on the project being run by TUG and IMAG, aimed at reducing the volume of tests required to exercise parallel programming language compilers and libraries. The idea is to use the ISO STEP standard scheme for conformance testing of software products. A detailed example illustrating the ongoing work is presented.