Search results for: MODALITY CORPUS · ENGLISH LANGUAGE CORPUS · SPEECH RECOGNITION · AVSR

PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS

Publication

- Year 2015

The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...

Tensor Decomposition for Imagined Speech Discrimination in EEG

Publication

J. S. Garcia Salinas
L. Villaseñor-Pineda
C. A. Reyes-García
A. A. Torres-García

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2018

Most research on electroencephalogram (EEG)-based Brain-Computer Interfaces (BCI) focuses on motor imagery. In an attempt to improve the control of these interfaces, the use of language instead of movement has recently been explored, in the form of imagined speech. This work aims to discriminate imagined words in electroencephalogram signals. For this purpose, the analysis of multiple variables...

Full text to download in external service

DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING

Publication

- Rocznik Naukowy Wydziału Zarządzania w Ciechanowie - Year 2017

The algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...

Full text available to download

Secured wired BPL voice transmission system

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
J. Wandzio
P. Jedlikowski

- Scientific Journal of the Military University of Land Forces - Year 2020

Designing a secured voice transmission system is not a trivial task. Wired media, thanks to their reliability and resistance to mechanical damage, seem an ideal solution. The BPL (Broadband over Power Line) cable is resistant to electricity stoppage and partial damage of phase conductors, ensuring continuity of transmission in case of an emergency. It seems an appropriate tool for delivering critical data, mostly clear and understandable...

Full text available to download

Database of speech and facial expressions recorded with optimized face motion capture settings

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2019

The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Full text available to download

Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System

Publication

P. Falkowski-Gilski
G. Debita
M. Habrych
B. Miedziński
P. Jedlikowski
B. Polnik
J. Wandzio
X. Wang

- Year 2020

The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...

Full text to download in external service

Geometric Algebra Model of Distributed Representations

Publication

A. Patyk-Łońska

- Year 2010

Formalism based on GA is an alternative to distributed representation models developed so far: Smolensky's tensor product, Holographic Reduced Representations (HRR) and Binary Spatter Code (BSC). Convolutions are replaced by geometric products, interpretable in terms of geometry which seems to be the most natural language for visualization of higher concepts. This paper recalls the main ideas behind the GA model and investigates...

In search of the new: American volunteers’ opinions about their participation in the Teaching English in Poland (TEIP) Program

Publication

I. Nowakowska

- Year 2021

The Teaching English in Poland (TEIP) program relies on summer camps during which native English speakers, American volunteers, teach Polish children and adolescents using the language immersion method – during everyday activities, sports and art classes, and similar occasions. A vital aspect of the evaluation of the program is researching its impact on the young people; however, the opinions of the volunteers regarding their...

Full text to download in external service

The Innovative Faculty for Innovative Technologies

Publication

- Year 2013

A leaflet describing Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. Multimedia Systems Department described laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...

Full text to download in external service

Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech

Publication

D. Piotrowski
R. Korzeniowski
A. Falai
S. Cygert
K. Pokora
G. Tinchev
Z. Zhang
K. Yanagisawa

- Year 2023

In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...

Full text to download in external service

Speech Analytics Based on Machine Learning

Publication

- Year 2019

In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Full text to download in external service

Examining Feature Vector for Phoneme Recognition

Publication

G. Korvel
B. Kostek

- Year 2018

The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering

Publication

- IEEE Transactions on Audio Speech and Language Processing - Year 2015

This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...

Full text available to download

Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing

Publication

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

Full text available to download

Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling

Publication

S. Raczyński
E. Vincent
S. Sagayama

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text to download in external service

Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation

Publication

S. Raczyński
E. Vincent

- IEEE Transactions on Audio Speech and Language Processing - Year 2014

In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...

Full text to download in external service

Estimation of the short-term predictor parameters of speech under noisy conditions

Publication

M. Kuropatwinski
W. Kleijn
M. Kuropatwiński

- IEEE Transactions on Audio Speech and Language Processing - Year 2006

Full text to download in external service

Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization

Publication

- Year 2017

An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...

Reaktywny system oddziaływania ze środowiskiem oparty na inteligentnym systemie decyzyjnym

Publication

Z. Kowalczuk

- Year 2009

Cognitive processes occurring in the human mind, once mathematically modeled and expressed as algorithms, can be used to construct intelligent decision systems. Such systems have a wide range of applications. They can be found, among others, in various autonomous systems in computer science, automation and robotics: from an 'intelligent' guard, butler, etc., to a caretaker, a virtual companion...

Objectivization of phonological evaluation of speech elements by means of audio parametrization

Publication

- Year 2018

This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...

Difference in Perceived Speech Signal Quality Assessment Among Monolingual and Bilingual Teenage Students

Publication

P. Falkowski-Gilski

- Year 2021

The user perceived quality is a mixture of factors, including the background of an individual. The process of auditory perception is discussed in a wide variety of fields, ranging from engineering to medicine. Many studies examine the difference between musicians and non-musicians. Since musical training develops musical hearing and other various auditory capabilities, similar enhancements should be observable in case of bilingual...

Full text to download in external service

Quality Analysis of Audio-Video Transmission in an OFDM-Based Communication System

Publication

M. Zamłyńska
G. Debita
P. Falkowski-Gilski

- Year 2022

Application of a reliable audio-video communication system brings many advantages. With the spoken word we can exchange ideas, provide descriptive information, as well as aid another person. With the availability of visual information one can monitor the surroundings, working environment, etc. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission. Currently, orthogonal frequency...

Full text to download in external service

MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES

Publication

M. Piotrowska
G. Korvel
B. Kostek
T. Ciszewski
A. Czyżewski

- International Journal of Applied Mathematics and Computer Science - Year 2019

Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Full text available to download

Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów

Publication

- Year 2017

The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training

Publication

P. Rościszewski

- Procedia Computer Science - Year 2017

In the paper we tackle bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...

Full text available to download

Usability study of various biometric techniques in bank branches

Publication

- Year 2023

The purpose of the presented research was to evaluate the performance of the prepared biometric algorithms and obtain information on the opinions and preferences of their users in bank branches. The study aimed to determine users' attitudes towards particular modalities and preferences on how to use biometrics after the bank customers had practical experience with the operation of the prototype solutions. The research results...

Full text available to download

Automatic Emotion Recognition in Children with Autism: A Systematic Literature Review

Publication

A. Landowska
A. Karpus
T. Zawadzka
B. Robins
D. Erol Barkana
H. Kose
T. Zorcec
N. Cummins

- SENSORS - Year 2022

The automatic emotion recognition domain brings new methods and technologies that might be used to enhance therapy of children with autism. The paper aims at the exploration of methods and tools used to recognize emotions in children. It presents a literature review study that was performed using a systematic approach and PRISMA methodology for reporting quantitative and qualitative results. Diverse observation channels and modalities...

Full text available to download

Analysis of allophones based on audio signal recordings and parameterization

Publication

- Journal of the Acoustical Society of America - Year 2017

The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...

Full text to download in external service

Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

Publication

- Year 2017

In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text to download in external service

Automatic Watercraft Recognition and Identification on Water Areas Covered by Video Monitoring as Extension for Sea and River Traffic Supervision Systems

Publication

N. Wawrzyniak
A. Stateczny

- Polish Maritime Research - Year 2018

The article presents the watercraft recognition and identification system as an extension for the presently used visual water area monitoring systems, such as VTS (Vessel Traffic Service) or RIS (River Information Service). The watercraft identification systems (AIS - Automatic Identification Systems) which are presently used in both sea and inland navigation require purchase and installation of relatively expensive transceivers...

Full text to download in external service

Noise profiling for speech enhancement employing machine learning models

Publication

K. Kąkol
G. Korvel
B. Kostek

- Journal of the Acoustical Society of America - Year 2022

This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available to download

Integrating heterogeneous systems with high-dependability requirements by means of web services

Publication

M. Kania
W. Korłub
J. Krajewski

- Year 2012

Web services are commonly used on boundaries of heterogeneous components in Service Oriented Architecture (SOA) as they provide a universal communication channel not bound to any particular programming language or run-time platform. This paper describes how web services can be used to integrate heterogeneous systems which serve purposes requiring high dependability, reliability and availability. Examples of such systems include...

The role of EMG module in hybrid interface of prosthetic arm

Publication

- Year 2017

Nearly 10% of all upper limb amputations concern the whole arm. It affects the mobility and reduces the productivity of such a person. These two factors can be restored by using prosthetics. However, the complexity of human arm makes restoring its basic functions quite difficult. When the osseointegration and/or targeted muscle reinnervation (TMR) are not possible, different modalities can be used to control the prosthesis. In...

Full text to download in external service

Modeling the Customer’s Contextual Expectations Based on Latent Semantic Analysis Algorithms

Publication

- Year 2017

Nowadays, in the age of the Internet, access to open data opens up huge possibilities for information retrieval. More and more often we hear about the concept of open data, which means unrestricted access as well as reuse and analysis by external institutions, organizations and people. Such information can be freely processed, combined with other data (a so-called remix) and then published. More and more data are available in text...

Full text available to download

BPL-PLC Voice Communication System for the Oil and Mining Industry

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
G. Wiśniewski
B. Miedziński
P. Jedlikowski
A. Waniewska
J. Wandzio
B. Polnik

- ENERGIES - Year 2020

Application of high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant means of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...

Full text available to download

Glossary [Intellectual Output 1] Glossary as a method for reflection on complex research questions

Publication

- Year 2022

Globalization and digitization are strongly influencing the process of shaping the built environment. The latter is causing the new design tools to emerge faster than ever before in history, while the former is speeding up not only the development, but also the broad roll-out of more agile and interdisciplinary methodologies and work approaches. The design process is also becoming more and more inter- and trans-disciplinary. This...

Full text to download in external service

Linear revitalization - problems and challenges. Discursive article

Publication

A. Sas-Bojarska

- PROBLEMY ROZWOJU MIAST Kwartalnik Naukowy - Year 2017

The aim of the article, defined by the author as discursive, is to give the answer as to whether within ‘revitalization’ we should distinguish the notion of ‘linear revitalization’ – not yet defined in Polish and English-language literature. The author presents the thesis that we should do so by presenting the idea, its specific character and its role. This kind of action seems to have, in the author’s opinion, a positive influence...

Full text to download in external service

An Analysis of Neural Word Representations for Wikipedia Articles Classification

Publication

J. Szymański
N. Kawalec

- CYBERNETICS AND SYSTEMS - Year 2019

One of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...

Full text to download in external service

Separability Assessment of Selected Types of Vehicle-Associated Noise

Publication

- Advances in Intelligent Systems and Computing - Year 2016

Music Information Retrieval (MIR) area as well as development of speech and environmental information recognition techniques brought various tools intended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

Full text to download in external service

Contactless hearing aid designed for infants

Publication

- Archives of Acoustics - Year 2006

It is a well known fact that language development through home intervention for a hearing-impaired infant should start in the early months of a newborn baby's life. The aim of this paper is to present a concept of a contactless digital hearing aid designed especially for infants. In contrast to all typical wearable hearing aid solutions (ITC, ITE, BTE), the proposed device is mounted in the infant's bed with any parts of its set-up...

Full text available to download

Learning design of a blended course in technical writing

Publication

I. Mokwa-Tarnowska

- Beyond Philology: An International Journal of Linguistics, Literary Studies and English Language Teaching - Year 2013

Blending face-to-face classes with e-learning components can lead to a very successful outcome if the blend of approaches, methods, content, space, time, media and activities is carefully structured and approached from both the student’s and the tutor’s perspective. In order to blend synchronous and asynchronous e-learning activities with traditional ones, educators should make them inter-dependent and develop them according to...

Full text available to download

Circumlocutions with the noun peopo ‘people’ in Hawai’i Creole English

Publication

K. Radomyski

- Beyond Philology: An International Journal of Linguistics, Literary Studies and English Language Teaching - Year 2020

Full text to download in external service

MODERNIST, 1920S AND 1930S INDUSTRIAL ARCHITECTURE OF THE PORT OF GDYNIA - IN SEARCH OF AN AESTHETIC LANGUAGE FOR UTILITARIAN BUILDINGS OF THE POLISH GATEWAY TO THE WORLD

Publication

A. Orchowska

- Year 2016

The purpose of the article is to present the results of the research on the aspects of the Port of Gdynia modernist architecture aesthetics. Its construction was one of the two major projects carried out in the interwar period in Poland. In the course of analyses it has been attempted to answer the question whether an individual aesthetic language has been created in the 1920s and 1930s for the industrial architecture of the Polish...

Full text to download in external service

Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

Publication

K. Leckey
R. Neininger
W. Szpankowski

- Year 2013

Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

Information retrieval with semantic memory model

Publication

J. Szymański

- Cognitive Systems Research - Year 2011

Psycholinguistic theories of semantic memory form the basis of understanding of natural language concepts. These theories are used here as an inspiration for implementing a computational model of semantic memory in the form of semantic network. Combining this network with a vector-based object-relation-feature value representation of concepts that includes also weights for confidence and support, allows for recognition of concepts...

Full text to download in external service

The Bridge to Knowledge – Open Access to Scientific Research Results on Multidisciplinary Open System Transferring Knowledge Platform

Publication

- TASK Quarterly - Year 2017

The European policy of Open Access to scientific research is now one of the key issues discussed in public debates on the future development of scientific communication. The implementation of Open Access tools has significant impact on scientific and economic growth. On the one hand, Open Access accelerates disseminating new research findings and facilitates recognition of authors on a more global scale. On the other hand, Open...

Full text available to download

The Russian Federation in European Union Programmes

Publication

K. Gomółka

- Annales Universitatis Mariae-Curie Skłodowska, sectio K – Politologia - Year 2017

Since the early 1990s, the European Union has been supporting socio-economic transformations in the former Soviet Union states, including the Russian Federation. Initially, this assistance was provided in the framework of the TACIS Programme, offering long-term, non-repayable aid. In 1991–2006 Russia received EUR 2.7bn for the restructuring of the state enterprise sector, establishment of private companies, state administration...

Full text available to download
