Search results for: speech processing

Impact of the glazed roof on acoustics of historic interiors

Publication

A. Kulowski

- Year 2018

The paper discusses the adverse acoustic phenomena occurring in semi-open interiors (courtyards, yards) covered with a glass roof. Particularly harmful is reverberation noise, which degrades the utility of the resulting spaces. It drastically reduces speech intelligibility and causes loss of the natural sound of music, problems with the sound system, as well as disturbances in the...

Subjective and Objective Comparative Study of DAB+ Broadcast System

Publication

- Archives of Acoustics - Year 2017

Broadcasting services seek to optimize their use of bandwidth in order to maximize user’s quality of experience. They aim to transmit high-quality digital speech and music signals at the lowest bitrate. They intend to offer the best quality under available conditions. Due to bandwidth limitations, audio quality is in conflict with the number of transmitted radio programs. This paper analyzes whether the quality of real-time digital...

Full text available for download from the portal

Difference in Perceived Speech Signal Quality Assessment Among Monolingual and Bilingual Teenage Students

Publication

P. Falkowski-Gilski

- Year 2021

The user perceived quality is a mixture of factors, including the background of an individual. The process of auditory perception is discussed in a wide variety of fields, ranging from engineering to medicine. Many studies examine the difference between musicians and non-musicians. Since musical training develops musical hearing and other various auditory capabilities, similar enhancements should be observable in case of bilingual...

Full text available for download from an external service

Analysis of a caustic formed by a spherical reflector: Impact of a caustic on architectural acoustics

Publication

A. Kulowski

- APPLIED ACOUSTICS - Year 2020

Focusing sound in rooms intended for listening to music or speech is an acoustic defect. Design recommendations provide remedial steps to effectively prevent this. However, there is a category of objects of high historical or architectural value in which the sound focus correction is limited or even abandoned. This also applies to indoor or outdoor concert shells, installations for teaching and acoustic presentations, etc. The...

Full text available for download from the portal

Smartphone application supporting independent movement of the blind

Publication

- Year 2011

Improving the comfort of life of blind people is a problem of great importance. Neither a white cane nor a guide dog, although both very useful, can be considered a tool for achieving full independence in everyday movement around the city. On the market there are some navigation tools inspired by car navigation systems, but they have many flaws, ranging from positioning inaccuracies to high prices. The authors present their own solution...

Analysis of speech signal parameters in the context of their usefulness in the automatic assessment of singing expression quality [ANALIZA PARAMETRÓW SYGNAŁU MOWY W KONTEKŚCIE ICH PRZYDATNOŚCI W AUTOMATYCZNEJ OCENIE JAKOŚCI EKSPRESJI ŚPIEWU]

Publication

- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2019

The paper addresses the approach to parameterization in the classification of emotions in singing and compares it with the classification of emotions in speech. For this purpose, the RAVDESS database (Ryerson Audio-Visual Database of Emotional Speech and Song) of emotionally expressive speech and singing was used, containing recordings of professional actors presenting six different emotions. Mel-frequency cepstral coefficients (MFCC) and selected descriptors were then computed...

Full text available for download from the portal

Analyzing the relationship between sound, color, and emotion based on subjective and machine-learning approaches

Publication

- Year 2024

The aim of the research is to analyze the relationship between sound, color, and emotion. For this purpose, a survey application was prepared, enabling the assignment of a color to a given speaker’s/singer’s voice recordings. Subjective tests were then conducted, enabling the respondents to assign colors to voice/singing samples. In addition, a database of voice/singing recordings of people speaking in a natural way and with expressed...

Full text available for download from the portal

Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network

Publication

G. Korvel
P. Treigys
B. Kostek

- Journal of the Acoustical Society of America - Year 2021

The goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...

Full text available for download from the portal

Pursuing Analytically the Influence of Hearing Aid Use on Auditory Perception in Various Acoustic Situations

Publication

P. Szymański
T. Poremski
B. Kostek

- Vibrations in Physical Systems - Year 2022

The paper presents the development of a method for assessing auditory perception and the effectiveness of applying hearing aids for hard-of-hearing people during short-term (up to 7 days) and longer-term (up to 3 months) use. The method consists of a survey based on the APHAB questionnaire. Additional criteria such as the degree of hearing loss, technological level of hearing aids used, as well as the user experience are taken...

Full text available for download from the portal

Discovery of Stylistic Patterns in Business Process Textual Descriptions: IT Ticket Case

Publication

N. Rizun
A. Revina
V. Maister

- Year 2019

Growing IT complexity and related problems, which are reflected in IT tickets, create a need for new qualitative approaches. The goal is to automate the extraction of main topics described in tickets in order to provide high-quality support for IT process workers and enable smooth service delivery to the end user. The present paper proposes a method of knowledge extraction in the form of stylistic patterns in business...

Full text available for download from the portal

Analysis of allophones based on audio signal recordings and parameterization

Publication

- Journal of the Acoustical Society of America - Year 2017

The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...

Full text available for download from an external service

Separability Assessment of Selected Types of Vehicle-Associated Noise

Publication

- Advances in Intelligent Systems and Computing - Year 2016

The Music Information Retrieval (MIR) area, as well as the development of speech and environmental information recognition techniques, brought various tools intended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

Full text available for download from an external service

The influence of time of hearing aid use on auditory perception in various acoustic situations

Publication

P. Szymański
T. Poremski
B. Kostek

- Journal of the Acoustical Society of America - Year 2018

The assessment of sound perception in hearing aids, especially in the context of benefits that a prosthesis can bring, is a complex issue. The objective parameters of the hearing aids can easily be determined. These parameters, however, do not always have a direct and decisive influence on the subjective assessment of quality of the patient’s hearing while using a hearing aid. The paper presents the development of a method for...

Full text available for download from an external service

Automatic Emotion Recognition in Children with Autism: A Systematic Literature Review

Publication

A. Landowska
A. Karpus
T. Zawadzka
B. Robins
D. Erol Barkana
H. Kose
T. Zorcec
N. Cummins

- SENSORS - Year 2022

The automatic emotion recognition domain brings new methods and technologies that might be used to enhance therapy of children with autism. The paper aims at the exploration of methods and tools used to recognize emotions in children. It presents a literature review study that was performed using a systematic approach and PRISMA methodology for reporting quantitative and qualitative results. Diverse observation channels and modalities...

Full text available for download from the portal

Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

Publication

K. Leckey
R. Neininger
W. Szpankowski

- Year 2013

Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP address lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

Noise profiling for speech enhancement employing machine learning models

Publication

K. Kąkol
G. Korvel
B. Kostek

- Journal of the Acoustical Society of America - Year 2022

This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available for download from the portal

A palatal prosthesis from archaeological research in the St Francis of Assisi church in Cracow (Poland)

Publication

A. E. Spinek
M. Kurek
K. Demidziuk
M. Nowak
M. Śliwka-Kaszyńska
A. Drążkowska

- Journal of Archaeological Science-Reports - Year 2024

The hard palate is a septum that not only prevents food from entering between the oral and nasal cavity, but also plays an important role during breathing or speech. The presence of cavities within it negatively affects the comfort of life of people with this type of impairment. Hence, in the literature one can find examples of the use of hard palate prostheses to restore the separation between the nasal and oral cavity. During...

Full text available for download from an external service

Assessment of hearing in coma patients employing auditory brainstem response, electroencephalography, and eye-gaze-tracking

Publication

- Journal of the Acoustical Society of America - Year 2017

The results of the study conducted by Tagliaferri et al. in 12 European countries indicate that the rate of registered brain injury cases in Europe amounts to 150-300 per 100 000 people, with the European mean value of 235 cases per 100 000 people. The project presented in the paper assumes development of a combined metric of the state of patients remaining in coma by intelligent fusion of GCS (subjective Glasgow Coma Scale or its derivatives)...

Full text available for download from an external service

Voiceless Stop Consonant Modelling and Synthesis Framework Based on MISO Dynamic System

Publication

- Archives of Acoustics - Year 2017

A voiceless stop consonant phoneme modelling and synthesis framework based on a phoneme modelling in low-frequency range and high-frequency range separately is proposed. The phoneme signal is decomposed into the sums of simpler basic components and described as the output of a linear multiple-input and single-output (MISO) system. The impulse response of each channel is a third order quasi-polynomial. Using this framework, the...

Full text available for download from the portal

Contactless hearing aid designed for infants

Publication

- Archives of Acoustics - Year 2006

It is a well known fact that language development through home intervention for a hearing-impaired infant should start in the early months of a newborn baby's life. The aim of this paper is to present a concept of a contactless digital hearing aid designed especially for infants. In contrast to all typical wearable hearing aid solutions (ITC, ITE, BTE), the proposed device is mounted in the infant's bed with any parts of its set-up...

Full text available for download from the portal

Multimodal human-computer interfaces based on advanced video and audio analysis

Publication

- Year 2013

Multimodal interfaces development history is reviewed briefly in the introduction. Examples of applications of multimodal interfaces to education software and for the disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and the audio interface for speech stretching for hearing impaired and stuttering people. The Smart...

Full text available for download from an external service

Digital Transformation of Terrestrial Radio: An Analysis of Simulcasted Broadcasts in FM and DAB+ for a Smart and Successful Switchover

Publication

P. Falkowski-Gilski

- Applied Sciences-Basel - Year 2021

The process of digitizing radio is far from over. It is an important interdisciplinary aspect, involving Big Data and AI (Artificial Intelligence) when it comes to classifying and handling content, and an organizational challenge in the Industry 4.0 concept. There exist several methods for delivering audio signals, including terrestrial broadcasting and internet streaming. Among them, the DAB+ (Digital Audio Broadcasting plus)...

Full text available for download from the portal

BPL-PLC Voice Communication System for the Oil and Mining Industry

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
G. Wiśniewski
B. Miedziński
P. Jedlikowski
A. Waniewska
J. Wandzio
B. Polnik

- ENERGIES - Year 2020

Application of high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant means of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...

Full text available for download from the portal

Waveguide model of the hearing aid earmold system

Publication

- Year 2006

Background: The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing a physician to examine various earmold system configurations and choose the optimum one...

Full text available for download from an external service

Waveguide model of the hearing aid earmold system

Publication

- Diagnostic Pathology - Year 2006

Background: The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing a physician to examine various earmold system configurations and choose the optimum one...

Full text available for download from the portal

Multimodal learning application with interactive animated character. [Multimodalna aplikacja edukacyjna wykorzystująca interaktywną animowaną postać]

Publication

P. Szczuko

- Year 2006

The aim of this study is to design a computer application that may assist teachers and therapists in multimodal manner in their work with impaired or disabled children. The application can be operated in many different ways, giving to a child with special educational needs a possibility to learn and train many skills or treat speech disorders. The main stress in this research is on the creation of animated character that will serve...

Voice command recognition using hybrid genetic algorithm

Publication

- TASK Quarterly - Year 2010

Abstract: Speech recognition is a process of converting the acoustic signal into a set of words, whereas voice command recognition consists in the correct identification of voice commands, usually single words. Voice command recognition systems are widely used in the military, control systems, electronic devices, such as cellular phones, or by people with disabilities (e.g., for controlling a wheelchair or operating a computer...

Full text available for download from the portal

Three prophets: Solzhenitsyn, Friedman, Dugin. Part one: Solzhenitsyn [Trzej prorocy: Sołżenicyn, Friedman, Dugin. Część pierwsza: Sołżenicyn]

Publication

Z. Kaźmierczyk

- Year 2023

The article presents, against a biographical background, the work and prophetic thought of Aleksandr Solzhenitsyn. The analysis is based on the speech given when the author of Cancer Ward was awarded the Nobel Prize in Literature, and on his lecture on the state of Western civilization delivered at Harvard University, published in Polish as Zmierzch odwagi ("Twilight of Courage"). Solzhenitsyn's prophecies concerning the West are shown in the context of his work Jak odbudować Rosję? ("How to Rebuild Russia?"). In the article...

Full text available for download from an external service

New Applications of Multimodal Human-Computer Interfaces

Publication

A. Czyżewski

- Year 2012

Multimodal computer interfaces and examples of their applications to education software and for the disabled people are presented. The proposed interfaces include the interactive electronic whiteboard based on video image analysis, application for controlling computers with gestures and the audio interface for speech stretching for hearing impaired and stuttering people. Application of the eye-gaze tracking system to awareness...

Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks

Publication

- DIGITAL SIGNAL PROCESSING - Year 2016

This paper presents a method for improving users' quality of experience through processing of movie soundtracks. The dialogue clarity enhancement algorithms were introduced for detecting dialogue in movie soundtrack mixes and then for amplifying the dialogue components. The front channel signals (left, right, center) are analyzed in the frequency domain. The selected partials in the center channel signal, which yield high disparity...

Full text available for download from an external service

Decoding soundscape stimuli and their impact on ASMR studies

Publication

- International Journal of Electronics and Telecommunications - Year 2024

This paper focuses on extracting and understanding the acoustical features embedded in the soundscape used in ASMR (Autonomous Sensory Meridian Response) studies. To this aim, a dataset of the most common sound effects employed in ASMR studies is gathered, containing whispering stimuli but also sound effects such as tapping and scratching. Further, a comparative analytical survey is performed based on various acoustical features...

Full text available for download from an external service

Developing a Low SNR Resistant, Text Independent Speaker Recognition System for Intercom Solutions - A Case Study

Publication

- Year 2024

This article presents a case study on the development of a biometric voice verification system for an intercom solution, utilizing the DeepSpeaker neural network architecture. Despite the variety of solutions available in the literature, there is a noted lack of evaluations for "text-independent" systems under real conditions and with varying distances between the speaker and the microphone. This article aims to bridge this gap....

Full text available for download from the portal

Comparison of the Ability of Neural Network Model and Humans to Detect a Cloned Voice

Publication

- Electronics - Year 2023

The vulnerability of the speaker identity verification system to attacks using voice cloning was examined. The research project assumed creating a model for verifying the speaker’s identity based on voice biometrics and then testing its resistance to potential attacks using voice cloning. The Deep Speaker Neural Speaker Embedding System was trained, and the Real-Time Voice Cloning system was employed based on the SV2TTS, Tacotron,...

Full text available for download from the portal

Detection of dialogue in movie soundtrack for speech intelligibility enhancement

Publication

K. Łopatka

- Year 2014

A method for detecting dialogue in 5.1 movie soundtrack based on interchannel spectral disparity is presented. The front channel signals (left, right, center) are analyzed in the frequency domain. The selected partials in the center channel signal, which yield high disparity with left and right channels, are detected as dialogue. Subsequently, the dialogue frequency components are boosted to achieve increased dialogue intelligibility....

Full text available for download from an external service

Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers

Publication

- Year 2017

The purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
