Search results for: voice detection

Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition

Publication

- Year 2016

The problem of accurately differentiating between the speaker's utterance and the noise parts in a speech signal is considered. The influence of applying voice activity detection to speech signals on the accuracy of an automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...

Voice Multilateration System

Publication

- SENSORS - Year 2021

This paper presents an innovative method of locating airplanes, which uses only voice communication between an air traffic controller and the pilot of an aircraft. The proposed method is described in detail along with its practical implementation in the form of a technology demonstrator (proof of concept), included in the voice communication system (VCS). A complete analysis of the performance of the developed method is presented,...

Full text available to download

System for automatic singing voice recognition

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2008

The article presents a system for automatic recognition of singing voice quality and type. The database and the implemented parameters are presented. The decision algorithm is an artificial neural network. The trained decision system achieves an accuracy of about 90% in both recognition categories. In addition, statistical methods were used to show that the results of the system for automatic assessment of technical quality...

Secured wired BPL voice transmission system

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
J. Wandzio
P. Jedlikowski

- Scientific Journal of the Military University of Land Forces - Year 2020

Designing a secured voice transmission system is not a trivial task. Wired media, thanks to their reliability and resistance to mechanical damage, seem an ideal solution. The BPL (Broadband over Power Line) cable is resistant to electricity stoppage and partial damage of phase conductors, ensuring continuity of transmission in case of an emergency. It seems an appropriate tool for delivering critical data, mostly clear and understandable...

Full text available to download

Automatic classification of singing voice quality

Publication

- Year 2005

The article presents issues related to the automatic classification of the quality and type of singing voices. For this classification, a database of singing voices was created, in which recordings of vowels sung by various vocalists (both professionals and amateurs) at different pitches and loudness levels were parameterized. To reduce the dimensionality of the description vector, the Behrens-Fisher statistic was applied...

Voice command recognition using hybrid genetic algorithm

Publication

- TASK Quarterly - Year 2010

Abstract: Speech recognition is a process of converting the acoustic signal into a set of words, whereas voice command recognition consists in the correct identification of voice commands, usually single words. Voice command recognition systems are widely used in the military, control systems, electronic devices, such as cellular phones, or by people with disabilities (e.g., for controlling a wheelchair or operating a computer...

Full text available to download

Human voice modification using instantaneous complex frequency

Publication

M. Kaniewska

- Year 2010

The paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...

MEMS based voice message system for elevators

Publication

M. Kłosowski

- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2007

The article presents an implementation of a voice message system for elevators. The presented system has a unique feature: it does not need a connection to the elevator control system in order to operate. Powered by disposable or rechargeable batteries, it can be mounted in the elevator wall and requires only simple calibration. The system is based on MEMS accelerometers that measure acceleration loads in the elevator car. The article presents...

"Voice Maps" - system supporting navigation of the blind

Publication

- HYDROACOUSTICS - Year 2012

Paper presented at the SHA 2012 conference, Gołuń, 22-25 May 2012.

Full text available to download

Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets

Publication

- Year 2008

The aim of the research is automatic recognition of singing voices in terms of voice type and technical quality of singing. The article describes the voice database that was created, which contains voice samples of professional and amateur singers. It then describes parameters defined on the basis of biomechanical phenomena occurring in the vocal organ during singing. Based on the resulting parameter matrices, automatic classifiers were trained and compared...

BPL-PLC Voice Communication System for the Oil and Mining Industry

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
G. Wiśniewski
B. Miedziński
P. Jedlikowski
A. Waniewska
J. Wandzio
B. Polnik

- ENERGIES - Year 2020

Application of high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant means of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...

Full text available to download

Comparison of the Ability of Neural Network Model and Humans to Detect a Cloned Voice

Publication

- Electronics - Year 2023

The vulnerability of the speaker identity verification system to attacks using voice cloning was examined. The research project assumed creating a model for verifying the speaker’s identity based on voice biometrics and then testing its resistance to potential attacks using voice cloning. The Deep Speaker Neural Speaker Embedding System was trained, and the Real-Time Voice Cloning system was employed based on the SV2TTS, Tacotron,...

Full text available to download

Subjective Quality Evaluation of Underground BPL-PLC Voice Communication System

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
B. Polnik
J. Wandzio
P. Jedlikowski

- Year 2020

Designing a reliable voice transmission system is not a trivial task. Wired media, thanks to their resistance to mechanical damage, seem an ideal solution. The BPL-PLC (Broadband over Power Line – Power Line Communication) cable is resilient to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency situation, including paramedic rescue operations. These features...

Full text to download in external service

REAL-TIME VOICE QUALITY MONITORING TOOL FOR VOIP OVER IPV6 NETWORKS

Publication

- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2013

The primary aim of this paper is to present a new application which is at this moment the only open source real-time VoIP quality monitoring tool that supports IPv6 networks. The application can provide VoIP system administrators with up-to-date voice quality information at any time. Multiple quality scores that are automatically obtained throughout each call reflect the influence of variable packet losses and delays on voice...

Implementation Of The Innovative Radiolocalization System VCS-MLAT (Voice Communication System Multilateration)

Publication

- Year 2020

In the article the concept of the radiolocalization subsystem of the VHF communication for aviation VCS-MLAT (Voice Communication System – Multilateration) is presented. The distributed localization system can estimate the position of the aircraft using the audio signals from aircraft transmitters in the VHF band (118-136 MHz). This paper shows initial verification of the possibility to use voice airband communication to estimate...

Full text to download in external service

New approach for determining the QoS of MP3-coded voice signals in IP networks

Publication

T. Uhl
S. Paulsen
K. Nowicki

- EURASIP Journal on Audio Speech and Music Processing - Year 2017

Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

Full text available to download

Automatic singing voice recognition employing neural networks and rough sets

Publication

- Year 2007

The aim of the work described in the paper is automatic recognition of singing voices. For this purpose, a database of recordings of professional and amateur singing samples was created. The samples were parameterized using parameters proposed by the authors specifically for this purpose. The method of determining the parameters and their physical interpretation are presented in the paper. The parameters are fed into decision systems, classifiers based...

In Reference to Voice, Swallow and Airway Outcomes Following Tracheostomy for COVID-19

Publication

- LARYNGOSCOPE - Year 2021

Full text to download in external service

A low complexity double-talk detector based on the signal envelope

Publication

- SIGNAL PROCESSING - Year 2008

A new algorithm for double-talk detection, intended for use in the acoustic echo canceller for voice communication applications, is proposed. The communication system developed by the authors required the use of a double-talk detection algorithm with low complexity and good accuracy. The authors propose an approach to double-talk detection based on the signal envelopes. For each of three signals: the far-end speech, the microphone...

Full text available to download

Quality Evaluation of Voice Transmission Using BPL Communication System in MV Mine Cable Network

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
J. Wandzio
P. Jedlikowski

- Elektronika Ir Elektrotechnika - Year 2019

This article presents results of a quality evaluation study, considering voice transmission in a 6 kV medium voltage cable network using the BPL (Broadband over Power Line) communication system. The tests are carried out under real mining conditions for the selected power cable without voltage, earthed at both sides. Such a method of monitoring work conditions is of great importance, especially during a disaster. Power cables are...

Full text available to download

Voice Maps - portable, dedicated GIS for supporting street navigation and self-dependent movement of the blind

Publication

- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010

The concept and the prototype application of the system supporting the street navigation and independent, outdoor movement of the blind is presented. The system utilises the GIS database of geometric network of the pedestrian paths in the city and is capable of finding the route from the indicated source to destination. Subsequently, the system supports the movement of the blind along the found route. The information on the user's...

Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech

Publication

D. Piotrowski
R. Korzeniowski
A. Falai
S. Cygert
K. Pokora
G. Tinchev
Z. Zhang
K. Yanagisawa

- Year 2023

In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...

Full text to download in external service

502 - Diagnosis of dementia and post-diagnostic support – voice of people with dementia living in Poland

Publication

M. Maćkowiak
M. Ciułkowicz
M. Duda-Sikuła
D. Szcześniak
J. Rymaszewska

- International Psychogeriatrics - Year 2021

Full text to download in external service

Detection and localization of selected acoustic events in acoustic field for smart surveillance applications

Publication

- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2014

A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

Full text available to download

Enhanced voice user interface employing spatial filtration of signals from acoustic vector sensor

Publication

- Year 2015

Spatial filtration of sound is introduced to enhance speech recognition accuracy in noisy conditions. An acoustic vector sensor (AVS) is employed. The signals from the AVS probe are processed in order to attenuate the surrounding noise. As a result the signal to noise ratio is increased. An experiment is featured in which speech signals are disturbed by babble noise. The signals before and after spatial filtration are processed...

Full text to download in external service

Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine

Publication

P. Falkowski-Gilski
G. Debita

- Archives of Acoustics - Year 2023

In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Full text available to download

Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications

Publication

- Communications in Computer and Information Science - Year 2011

A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

Full text to download in external service

Improved method for real-time speech stretching

Publication

- Year 2012

An algorithm for real-time speech stretching is presented. It was designed to modify the input signal depending on its content and on its relation to the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and a SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

Full text to download in external service

Biofilm Growth Causes Damage to Silicone Voice Prostheses in Patients after Surgical Treatment of Locally Advanced Laryngeal Cancer

Publication

J. Spałek
P. Deptuła
M. Cieśluk
A. Strzelecka
D. Łysik
J. Mystkowska
T. Daniluk
G. Król
S. Góźdź
R. Bucki... and 2 others

- Pathogens - Year 2020

Full text to download in external service

Playback Attack Detection: The Search for the Ultimate Set of Antispoof Features

Publication

M. Smiatacz

- Year 2017

Automatic speaker verification systems are vulnerable to several kinds of spoofing attacks. Some of them can be quite simple – for example, the playback of an eavesdropped recording does not require any specialized equipment or knowledge, but may still pose a serious threat to a biometric identification module built into an e-banking application. In this paper we follow the recent approach and convert recordings to images, assuming...

Full text to download in external service

Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set

Publication

P. Filipowicz
B. Kostek

- Applied Sciences-Basel - Year 2023

This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...

Full text available to download

Analysis of the harmonic structure of the vowel /a/ taking into account the age and gender of the speaker

Publication

S. Drywa

- Year 2024

Sound waves are disturbances propagating through an elastic medium that, upon reaching the ear, elicit auditory sensations. Sounds generated by the surroundings can be captured by a transducer (microphone), which transforms them into an electrical signal. The signal from the microphone is then transmitted to a computer, where software allows for the extraction and analysis of individual tones. This process enables the description...

Full text to download in external service

Analyzing the relationship between sound, color, and emotion based on subjective and machine-learning approaches

Publication

- Year 2024

The aim of the research is to analyze the relationship between sound, color, and emotion. For this purpose, a survey application was prepared, enabling the assignment of a color to a given speaker’s/singer’s voice recordings. Subjective tests were then conducted, enabling the respondents to assign colors to voice/singing samples. In addition, a database of voice/singing recordings of people speaking in a natural way and with expressed...

Full text available to download

A non-uniform real-time speech time-scale stretching method

Publication

- Year 2011

An algorithm for non-uniform real-time speech stretching is presented. It combines the typical SOLA (Synchronous Overlap and Add) algorithm with vowel, consonant and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...

Automatic singing quality recognition employing artificial neural networks

Publication

P. Żwan

- Archives of Acoustics - Year 2008

The aim of the article is to demonstrate that the technical quality of singing voices can be assessed automatically. It briefly presents the singing voice database that was created and the implemented parameters. Using artificial neural networks, a decision system was designed that rates the technical quality of the voice on a five-point scale. Statistical methods were used to show that the results generated by this system...

Full text available to download

Creating new voices using normalizing flows

Publication

P. Biliński
T. Merritt
A. Ezzerg
K. Pokora
S. Cygert
K. Yanagisawa
R. Barra-Chicote
D. Korzekwa

- Year 2022

Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...

Full text available to download

Communication Platform for Evaluation of Transmitted Speech Quality

Publication

- Journal of Telecommunications and Information Technology - Year 2011

A designed and implemented voice communication system is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessment by listening tests or objective evaluation employing...

Full text available to download

Wearable system supporting navigation of the blind

Publication

- RED. ZAGR. ANGIELSKI - Year 2011

Improving the comfort of life of blind people is a problem of great importance. Fortunately, new technologies provide us with additional methods to improve the everyday life of the blind and visually impaired. The paper presents an experimental system made by researchers from the Department of Geoinformatics of Gdansk University of Technology, which is capable of finding the route from the indicated source to a chosen destination, using dedicated digital...

Full text available to download

A system for singing training

Publication

- Year 2007

The proposed system is aimed at vocal students and persons who want to improve their voice emission. The goal is not to substitute for a singing teacher but to provide a tool for automatic teaching of voice emission basics. In this way singers can develop their vocal skills and improve them. Through visual feedback a student can control and modify the vocal tract maxima (resonances) of a chosen vowel to match the resonances of the...

PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS

Publication

- Year 2015

The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...

Client-server Approach in the Navigation System for the Blind

Publication

- TransNav - The International Journal on Marine Navigation and Safety of Sea Transportation - Year 2013

The article presents the client‐server approach in the navigation system for the blind ‐ “Voice Maps”. The authors were among the main creators of the prototype and currently the commercialization phase is being finished. In the implemented prototype only exemplary, limited spatial data were used, therefore they could be stored and analysed (for path-finding process) in the mobile device’s memory without any difficulties. The...

Full text available to download

Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System

Publication

P. Falkowski-Gilski
G. Debita
M. Habrych
B. Miedziński
P. Jedlikowski
B. Polnik
J. Wandzio
X. Wang

- Year 2020

The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...

Full text to download in external service

Subjective and Objective Quality Evaluation Study of BPL-PLC Wired Medium

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
B. Polnik
J. Wandzio
P. Jedlikowski

- Elektronika Ir Elektrotechnika - Year 2020

This paper presents results of research on the effectiveness of bi-directional voice transmission in a 6 kV mine cable network using BPL-PLC (Broadband over Power Line - Power Line Communication) technology. It concerns both emergency cable state (supply outage with cable shorted at both ends) and loaded with distorted current waveforms. The narrowband (0.5 MHz–15 MHz) and broadband (two different modes, frequency range of 3 MHz–7.5...

Full text available to download

Speaker Recognition Using Convolutional Neural Network with Minimal Training Data for Smart Home Solutions

Publication

M. Wang
T. Sirlapu
A. Kwaśniewska
M. Szankin
M. Bartscherer
R. Nicolas

- Year 2018

With the technology advancements in the smart home sector, voice control and automation are key components that can make a real difference in people's lives. The voice recognition technology market continues to evolve rapidly as almost all smart home devices are providing speaker recognition capability today. However, most of them provide cloud-based solutions or use very deep Neural Networks for the speaker recognition task, which are...

Full text to download in external service

A survey of automatic speech recognition deep models performance for Polish medical terms

Publication

- Year 2023

Among the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....

Full text to download in external service

Transmitting Alarm Information in DAB+ Broadcasting System

Publication

P. Falkowski-Gilski

- Year 2018

The main goal of digital broadcasting is to deliver high-quality content with the lowest possible bitrate. This paper is focused on transmitting alarm information, such as emergency warning and alerting, in the DAB+ (Digital Audio Broadcasting plus) broadcasting system. These additional services should be available at the lowest possible bitrate, in order to provide a clear and understandable voice message to people. Furthermore, additional...

Methods of navigation in the mobile application supporting movement of the blind

Publication

- HYDROACOUSTICS - Year 2013

This paper presents conclusions from the last development phase of the “Voice Maps” project. The authors specify newly found options for navigation in an affordable system for the blind, with all their related aspects. The system uses dedicated geographical data, built-in smartphone GPS receivers and an external DGPS device in order to assist blind users in their everyday travelling. The authors also discuss new methods to improve positioning...

Full text available to download

Developing a Low SNR Resistant, Text Independent Speaker Recognition System for Intercom Solutions - A Case Study

Publication

- Year 2024

This article presents a case study on the development of a biometric voice verification system for an intercom solution, utilizing the DeepSpeaker neural network architecture. Despite the variety of solutions available in the literature, there is a noted lack of evaluations for "text-independent" systems under real conditions and with varying distances between the speaker and the microphone. This article aims to bridge this gap....

Full text available to download

The project IDENT: Multimodal biometric system for bank client identity verification

Publication

- Year 2017

Biometric identity verification methods are implemented inside a real banking environment comprising: dynamic handwritten signature verification, face recognition, bank client voice recognition and hand vein distribution verification. A secure communication system based on an intra-bank client-server architecture was designed for this purpose. Hitherto achieved progress within the project is reported in this paper with a focus...

Full text to download in external service

Smart Modeling of Maritime Vessels

Publication

- Journal of Shipping and Ocean Engineering - Year 2015

Currently, the market offers many visualization tools available to graphic designers, engineers, managers and academics working on maritime environments. The practice of visualization involves making and manipulating images that convey novel phenomena and ideas. Visual communication, together with virtual reality environments, is an emerging and rapidly evolving discipline. It brings great advantage over written word or voice alone,...

Full text available to download
