Search results for: BIMODAL SPEECH RECOGNITION


Prof. Andrzej Stateczny, DSc, PhD, Eng.

People

Prof. Andrzej Stateczny, DSc, PhD, Eng., is a professor at Gdańsk University of Technology and president of Marine Technology Ltd. His research interests focus mainly on navigation, hydrography, and geoinformatics. His current research covers radar navigation, comparative navigation, hydrography, and artificial intelligence methods for image processing and multisensor data fusion. He has been the head or principal...
Metoda i algorytmy modyfikacji sygnału do celu wspomagania rozumienia mowy przez osoby z pogorszoną rozdzielczością czasową słuchu [Method and algorithms for signal modification to support speech understanding by people with reduced temporal resolution of hearing]
Publication
- A. Kupryjanow
- Year 2013
The subject of the research carried out in this dissertation is real-time Time Scale Modification (TSM) methods for speech signals and the assessment of their effect on speech understanding by people with reduced temporal resolution of hearing. Reduced hearing resolution is one of the symptoms associated with central auditory disorders (Central Auditory Processing Disorder, CAPD). In contrast...
Australian Pattern Recognition Society Conference

Conferences
International Conference on Frontiers of Handwriting Recognition

Conferences
International Conference on Image Analysis and Recognition

Conferences
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
Publication
- D. Korzekwa
- R. Barra-Chicote
- S. Zaporowski
- G. Beringer
- J. Lorenzo-Trueba
- A. Serafinowicz
- J. Droppo
- T. Drugman
- B. Kostek
- Year 2021
This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available for download in the portal
Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole [Applying speech slowing to improve speech understanding by children at school]
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2009
This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. OLA-based algorithms and the Phase Vocoder are described. The experimental part discusses the usability of these algorithms for real-time speech stretching.
IEEE Conference on Computer Vision and Pattern Recognition

Conferences
International Workshop on Pattern Recognition in Information Systems

Conferences
International Conference on Pattern Recognition Applications and Methods

Conferences
International Conference on Artificial Intelligence and Pattern Recognition

Conferences
IEEE International Conference on Document Analysis and Recognition

Conferences
Instantaneous complex frequency for pipeline pitch estimation
Publication
- M. Kaniewska
- Year 2010
In the paper a pipeline algorithm for estimating the pitch of a speech signal is proposed. The algorithm uses instantaneous complex frequencies (ICFs) estimated for four waveforms obtained by filtering the original speech signal through four bandpass complex Hilbert filters. The imaginary parts of the ICFs from each channel give four candidates for pitch estimates. The decision regarding the final estimate is made based on the real parts of...
Simultaneous determination of thermodynamic and kinetic parameters of aminopolycarbonate complexes of cobalt(II) and nickel(II) based on isothermal titration calorimetry data
Publication
- A. Tesmar
- D. Wyrzykowski
- E. Muñoz
- B. Pilarski
- J. Pranczk
- D. Jacewicz
- L. Chmurzyński
- JOURNAL OF MOLECULAR RECOGNITION - Year 2017
Full text available for download from an external service
Zinc(II) complexation by some biologically relevant pH buffers
Publication
- D. Wyrzykowski
- A. Tesmar
- D. Jacewicz
- J. Pranczk
- L. Chmurzyński
- JOURNAL OF MOLECULAR RECOGNITION - Year 2014
Full text available for download from an external service
Digital fingerprinting for color images based on the quaternion encryption scheme
Publication
- PATTERN RECOGNITION LETTERS - Year 2014
In this paper we present a new quaternion-based encryption technique for color images. In the proposed encryption method, images are written as quaternions and are rotated in a three-dimensional space around another quaternion, which is an encryption key. The encryption process uses the cipher block chaining (CBC) mode. Further, this paper shows that our encryption algorithm enables digital fingerprinting as an additional feature....

Full text available for download from an external service
Engineering Candida albicans glucosamine-6-phosphate synthase for efficient enzyme purification
Publication
- J. Czarnecka
- K. Kwiatkowska
- I. Gabriel
- M. Wojciechowski
- S. Milewski
- JOURNAL OF MOLECULAR RECOGNITION - Year 2012
Rationally designed muteins of Candida albicans glucosamine-6-phosphate synthase, an enzyme known as a promising target for antifungal chemotherapy, were constructed, overexpressed in Escherichia coli and purified to near homogeneity. To facilitate and optimize the purification of the enzyme, three recombinant versions containing internal oligoHis fragments were constructed: (i) by substituting residues 343 - 348...

Full text available for download from an external service
Bridging challenges of clinical decision support systems with a semantic approach. A case study on breast cancer
Publication
- E. Szczerbicki
- C. Sanin
- C. Toro
- PATTERN RECOGNITION LETTERS - Year 2013
The integration of Clinical Decision Support Systems (CDSS) in today's clinical environments has not yet been fully achieved. Although numerous approaches and technologies have been proposed since 1960, there are still open gaps that need to be bridged. In this work we present advances from the established state of the art, overcoming some of the most notorious reported difficulties in: (i) automating CDSS, (ii) clinical workflow...

Full text available for download from an external service
XVIII Międzynarodowe Sympozjum Inżynierii i Reżyserii Dźwięku [18th International Symposium on Sound Engineering and Tonmeistering]
Publication
- P. Falkowski-Gilski
- S. Brachmański
- A. Dobrucki
- M. Kin
- Year 2021
The subjective assessment of speech signals takes into account previous experiences and habits of an individual. Since the perception process deteriorates with age, differences should be noticeable among people from dissimilar age groups. In this work, we investigated the difference of speech quality assessment between high school students and university students. The study involved 60 participants, with 30 people in both the adolescents...

Full text available for download from an external service
Creating new voices using normalizing flows
Publication
- P. Biliński
- T. Merritt
- A. Ezzerg
- K. Pokora
- S. Cygert
- K. Yanagisawa
- R. Barra-Chicote
- D. Korzekwa
- Year 2022
Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...

Full text available for download in the portal
Human voice modification using instantaneous complex frequency
Publication
- M. Kaniewska
- Year 2010
The paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek [Training strategies for a neural pitch estimator using a synthetic vowel generator]
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2022
Many telecommunication applications involve processing or analyzing a speech signal, and a pitch estimator is often used there among the basic algorithms. The estimator considered in this work is based on a neural classifier that makes decisions using the instantaneous frequency and power determined in subbands of the analyzed speech signal. In this work we consider...

Full text available for download in the portal
Auditory-visual attention stimulator
Publication
- Year 2013
A new approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time-scale-modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, e.g. zooming and highlighting. In the experimental part of...

Full text available for download from an external service
International Conference on Advances in Pattern Recognition and Digital Techniques

Conferences
IEEE International Conference on Automatic Face and Gesture Recognition

Conferences
INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
Publication
- G. Korvel
- P. Treigys
- K. Kąkol
- B. Kostek
- International Journal of Applied Mathematics and Computer Science - Year 2023
The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...

Full text available for download in the portal
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
Publication
- Year 2018
In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect that are present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...

Full text available for download from an external service
Auditory Brainstem Responses recorded employing Audio ABR device
Research Data
open access
- P. Odya
- A. Czyżewski
The dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:
Pracujący w czasie rzeczywistym system detekcji gazów wykorzystujący przenośny komputer Raspberry PI oraz matrycę półprzewodnikowych czujników gazu [Real-time gas detection system using a portable Raspberry PI computer and an array of semiconductor gas sensors]
Publication
- Elektronika : konstrukcje, technologie, zastosowania - Year 2014
Gas-analyzing systems based on an array of partially selective gas sensors and pattern-recognition techniques are a potentially fast and low-cost alternative to other devices such as gas analysers. They make it possible to recognize the type and concentration of measured volatile compounds in their working environment. In this work we present the implementation of a gas recognition system, in which the signals from an...

Full text available for download from an external service
Enhanced Mechanical and Electromechanical Properties of Compositionally Complex Zirconia Zr1–x(Gd1/5Pr1/5Nd1/5Sm1/5Y1/5)xO2−δ Ceramics
Publication
- A. Kabir
- B. Lemieszek
- M. Varenik
- V. Buratto Tinti
- S. Molin
- I. Lubomirsky
- V. Esposito
- F. Kern
- ACS Applied Materials & Interfaces - Year 2024
Compositionally complex oxides (CCOs) or high-entropy oxides (HEOs) are new multi-element oxides with unexplored physical and functional properties. In this work, we report fluorite-structure-derived compositionally complex zirconia with composition Zr1–x(Gd1/5Pr1/5Nd1/5Sm1/5Y1/5)xO2−δ (x = 0.1 and 0.2) synthesized via a solid-state reaction route and sintered via hot pressing at 1350 °C. We explore the evolution of these oxides'...

Full text available for download from an external service
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
Publication
- M. Blok
- P. Drózda
- Archives of Acoustics - Year 2014
In this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...

Full text available for download in the portal
Interactions with recognized patients using smart glasses
Publication
- J. Rumiński
- M. Smiatacz
- A. Bujnowski
- A. Andrushevich
- M. Biallas
- R. Kistler
- Year 2015
Recently, different smart glasses solutions have been proposed on the market. The rapid development of this wearable technology has led to several research projects related to applications of smart glasses in healthcare. In this paper we propose a general architecture of a system enabling data integration for the recognized person. In the proposed system smart glasses integrate data obtained for the recognized patient from health...

Full text available for download from an external service
Prof. Haitham Abu-Rub - A Visit to Poland's Gdansk University of Technology
Publication
- J. Guziński
- IEEE Industrial Electronics Magazine - Year 2015
A report on the visit of Prof. Haitham Abu-Rub to Gdansk University of Technology (GUT), covering his speech on the Smart Grid Centre and a visit to the new smart grid laboratory of the GUT, the Laboratory for Innovative Power Technologies and Integration of Renewable Energy Sources (LINTE^2).

Full text available for download in the portal
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
Publication
- Year 2018
This paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...

Full text available for download from an external service
Investigation of educational processes with affective computing methods
Publication
- A. Landowska
- G. Brodny
- e-mentor - Year 2017
This paper concerns the monitoring of educational processes with the use of new technologies for the recognition of human emotions. This paper summarizes results from three experiments, aimed at the validation of applying emotion recognition to e-learning. An analysis of the experiments’ executions provides an evaluation of the emotion elicitation methods used to monitor learners. The comparison of affect recognition algorithms...

Full text available for download in the portal
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
Publication
- P. Filipowicz
- B. Kostek
- Applied Sciences-Basel - Year 2023
This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...

Full text available for download in the portal
Gesture-based computer control system
Publication
- Elektronika : konstrukcje, technologie, zastosowania - Year 2010
In the paper a system for controlling computer applications by hand gestures is presented. First, selected methods used for gesture recognition are described. The system hardware and a way of controlling a computer by gestures are described. The architecture of the software along with hand gesture recognition methods and algorithms used are presented. Examples of basic and complex gestures recognized by the system are given.

Full text available for download from an external service
Automatic Classification of Polish Sign Language Words
Publication
- T. Dziubich
- J. Szymański
- Przegląd Elektrotechniczny - Year 2014
In the article we present an approach to automatic recognition of hand gestures using the eGlove device. We present the research results of a system for detection and classification of static and dynamic words of Polish language. The results indicate that using the eGlove yields good recognition quality, which can be further improved with additional data sources such as RGB cameras.

Full text available for download in the portal
Comparative analysis of various transformation techniques for voiceless consonants modeling
Publication
- G. Korvel
- B. Kostek
- O. Kurasova
- International Journal of Computers Communications & Control - Year 2018
In this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....

Full text available for download in the portal
Modeling and Designing Acoustical Conditions of the Interior – Case Study
Publication
- Archives of Acoustics - Year 2016
The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...

Full text available for download in the portal
Automatic music set organization based on mood of music / Automatyczna organizacja bazy muzycznej na podstawie nastroju muzyki
Publication
- M. Piotrowska
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2017
This work is focused on an approach based on the emotional content of music and its automatic recognition. A vector of features describing emotional content of music was proposed. Additionally, a graphical model dedicated to the subjective evaluation of mood of music was created. A series of listening tests was carried out, and results were compared with automatic mood recognition employing SOM (Self Organizing Maps) and ANN (Artificial...

Full text available for download from an external service
Employing a biofeedback method based on hemispheric synchronization in effective learning
Publication
- Year 2012
In this paper an approach to build a brain computer-based hemispheric synchronization system is presented. The concept utilizes the wireless EEG signal registration and acquisition as well as advanced pre-processing methods. The influence of various filtration techniques of EOG artifacts on brain state recognition is examined. The emphasis is put on brain state recognition using band pass filtration for separation of individual...

Full text available for download from an external service
Prof. Krzysztof Goczyła, DSc, PhD, Eng.

People

Department of Software Engineering

Krzysztof Goczyła, a full professor at Gdańsk University of Technology, is a computer scientist specializing in software engineering, knowledge engineering, and databases. He graduated from the Faculty of Electronics of Gdańsk University of Technology in 1976 with an MSc in electronic engineering, specializing in automatic control. He has worked at Gdańsk University of Technology since 1976. At the Faculty of Electronics he received his doctorate in computer science in 1982 and his habilitation in 1999. In 2012...
Endoscopic Video Classification with the Consideration of Temporal Patterns
Publication
- Year 2012
The article describes a novel approach to automatic recognition and classification of diseases in endoscopic videos. Current directions of research in this field are discussed. Most presented methods focus on processing single frames and do not take into consideration the temporal relationship between continuous classifications. Existing approaches that consider the temporal structure of an incoming frame sequence are focused on...
Wykorzystanie sztucznych sieci neuronowych do wykrywania i rozpoznawania tablic rejestracyjnych na zdjęciach pojazdów [Using artificial neural networks to detect and recognize license plates in vehicle photographs]
Publication
- M. Huzarek
- T. A. Rutkowski
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2015
The article presents the concept of an algorithm for detecting and recognizing license plates (AWiRTR) in digital images of vehicles. Detection and localization of license plates, and the extraction of individual characters from the plate image, are carried out using basic image processing techniques (morphological transformations, edge detection) as well as basic statistical data of the detected objects...

Full text available for download in the portal
FEEDB: A multimodal database of facial expressions and emotions
Publication
- M. Szwoch
- Year 2013
In this paper a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...

Full text available for download from an external service
A video monitoring system using ontology-driven identification of threats
Publication
- P. Kaczmarek
- P. Zielonka
- Year 2009
In this paper, we present a video monitoring system that leverages image recognition and ontological reasoning about threats. In the solution, an image processing subsystem uses video recording of a monitored area and recognizes known concepts in scenes. Then, a reasoning subsystem uses an ontological description of security conditions and information from image recognition to check if a violation of a condition has occurred. If a threat...

Full text available for download from an external service
Towards New Mappings between Emotion Representation Models
Publication
- A. Landowska
- Applied Sciences-Basel - Year 2018
There are several models for representing emotions in affect-aware applications, and available emotion recognition solutions provide results using diverse emotion models. As multimodal fusion is beneficial in terms of both accuracy and reliability of emotion recognition, one of the challenges is mapping between the models of affect representation. This paper addresses this issue by: proposing a procedure to elaborate new mappings,...

Full text available for download in the portal
Playback detection using machine learning with spectrogram features approach
Publication
- J. Dembski
- J. Rumiński
- Year 2017
This paper presents a 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as the speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), HAAR wavelets with AdaBoost classifier and deep convolutional neural networks (CNN) were compared on different data partitions in respect...

Full text available for download in the portal
Endoscopic Videos Deinterlacing and On-Screen Text and Light Flashes Removal and Its Influence on Image Analysis Algorithms' Efficiency
Publication
- International Journal of Image Processing and Visual Communication - Year 2013
In this article, methods for deinterlacing endoscopic video images and for removing on-screen text and light flashes from them are discussed. The research is intended to improve disease recognition algorithms' performance. In the article, four configurations of deinterlacing methods and another four configurations of text and flashes removal methods are described and examined. The efficiency of endoscopic video analysis algorithms is measured...

Full text available for download from an external service
