Search results for: AUDIO PROCESSING

New Applications of Multimodal Human-Computer Interfaces

Publication

A. Czyżewski

- Year 2012

Multimodal computer interfaces and examples of their applications to education software and for disabled people are presented. The proposed interfaces include an interactive electronic whiteboard based on video image analysis, an application for controlling computers with gestures, and an audio interface for speech stretching for hearing-impaired and stuttering people. Application of the eye-gaze tracking system to awareness...

Bimodal Emotion Recognition Based on Vocal and Facial Features

Publication

- Year 2023

Emotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...

Full text available for download in the portal

Further developments of parameterization methods of audio stream analysis for security purposes

Publication

- Year 2009

The paper presents an automatic sound recognition algorithm intended for application in an audiovisual security monitoring system. A distributed character of security systems does not allow for simultaneous observation of multiple multimedia streams, thus an automatic recognition algorithm must be introduced. In the paper, a module for the parameterization and automatic detection of audio events is described. The spectral analyses...

Bass Enhancement Settings in Portable Devices Based on Music Genre Recognition

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2015

The paper presents a novel approach to the Virtual Bass Synthesis (VBS) applied to mobile devices, called Smart VBS (SVBS). The proposed algorithm uses an intelligent, rule-based setting of bass synthesis parameters adjusted to the particular music genre. Harmonic generation is based on a nonlinear device (NLD) method with the intelligent controlling system adapting to the recognized music genre. To automatically classify music...

Full text available for download in the portal

Subjective and Objective Quality Evaluation Study of BPL-PLC Wired Medium

Publication

G. Debita
P. Falkowski-Gilski
M. Habrych
B. Miedziński
B. Polnik
J. Wandzio
P. Jedlikowski

- Elektronika Ir Elektrotechnika - Year 2020

This paper presents results of research on the effectiveness of bi-directional voice transmission in a 6 kV mine cable network using BPL-PLC (Broadband over Power Line - Power Line Communication) technology. It covers both the emergency cable state (supply outage with the cable shorted at both ends) and a cable loaded with distorted current waveforms. The narrowband (0.5 MHz–15 MHz) and broadband (two different modes, frequency range of 3 MHz–7.5...

Full text available for download in the portal

Study on CPU and RAM Resource Consumption of Mobile Devices using Streaming Services

Publication

- Year 2021

Streaming multimedia services have become very popular in recent years, due to the development of wireless networks. With the growing number of mobile devices worldwide, service providers offer dedicated applications that deliver on-demand audio and video content anytime and anywhere. The aim of this study was to compare different streaming services and investigate their impact on the CPU and RAM resources, with respect...

Full text available for download from an external service

Musical Instrument Identification Using Deep Learning Approach

Publication

- SENSORS - Year 2022

The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata...

Full text available for download in the portal

Architecture Design of a Networked Music Performance Platform for a Chamber Choir

Publication

- Communications in Computer and Information Science - Year 2022

This paper describes an architecture design process for a Networked Music Performance (NMP) platform for medium-sized conducted music ensembles, based on remote rehearsals of the Academic Choir of Gdańsk University of Technology. The issues of real-time remote communication, in-person music performance, and NMP are described. Three iterative steps defining and extending the architecture of the NMP platform with additional features to...

Full text available for download in the portal

Speech Analytics Based on Machine Learning

Publication

- Year 2019

In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Full text available for download from an external service

Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera

Publication

P. Bordoni
J. Kotus
P. Odya
F. Antonacci
B. Kostek

- Journal of the Acoustical Society of America - Year 2022

This paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...

Full text available for download from an external service

Automatic Breath Analysis System Using Convolutional Neural Networks

Publication

- Year 2022

Diseases related to the human respiratory system have always been a burden for the entire society. The situation has become particularly difficult now after the outbreak of the COVID-19 pandemic. Even now, however, it is not uncommon for people to consult their doctor too late, after the disease has developed. To protect patients from severe disease, it is recommended that any symptoms disturbing the respiratory system be detected...

Full text available for download from an external service

Automatic Breath Analysis System Using Convolutional Neural Networks

Publication

- Year 2022

Diseases related to the human respiratory system have always been a burden for the entire society. The situation has become particularly difficult now after the outbreak of the COVID-19 pandemic. Even now, however, it is common for people to consult their doctor too late, after the disease has developed. To protect patients from severe disease, it is recommended that any symptoms disturbing the respiratory system be detected as...

Full text available for download from an external service

TRANSPORT POSSIBILITY FOR MPEG-4/AVC- AND MPEG-2-ENCODED VIDEO DATA IN IPTV: A COMPARISON STUDY

Publication

T. Uhl
S. Paulsen
K. Nowicki

- Year 2013

IPTV (Television over IP) is a modern service with great potential to expand. It uses the IP transport platform that is already in worldwide operation. At the time of writing, two techniques are used to transport the video and audio data of IPTV: MPEG-2 TS and Native RTP. Both techniques influence quality of service (QoS) and quality of experience (QoE). This paper sets out to demonstrate...

Smart Virtual Bass Synthesis Algorithm Based on Music Genre Classification

Publication

- Year 2014

The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) algorithms applied to portable computers. The proposed algorithm employed automatic music genre recognition to determine the optimum parameters for the synthesis of additional frequencies. The synthesis was carried out using the non-linear device (NLD) and phase vocoder (PV) methods depending on the music excerpt genre. Classification of musical...

A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors

Publication

- SENSORS - Year 2020

In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...

Full text available for download in the portal

Enhancing voice biometric security: Evaluating neural network and human capabilities in detecting cloned voices

Publication

A. Czyżewski

- Journal of the Acoustical Society of America - Year 2024

This study assesses speaker verification efficacy in detecting cloned voices, particularly in safety-critical applications such as healthcare documentation and banking biometrics. It compares deeply trained neural networks like the DeepSpeaker with human listeners in recognizing these cloned voices, underlining the severe implications of voice cloning in these sectors. Cloned voices in healthcare could endanger patient safety by...

Full text available for download from an external service

Creating a Remote Choir Performance Recording Based on an Ambisonic Approach

Publication

- Applied Sciences-Basel - Year 2022

The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...

Full text available for download in the portal

Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor

Publication

A. Czyżewski

- Journal of the Acoustical Society of America - Year 2020

The purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...

Full text available for download from an external service

Objectivization of Audio-Visual Correlation analysis

Publication

- Archives of Acoustics - Year 2012

Simultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the "image proximity effect" or the "ventriloquism effect" in the literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The authors of this paper propose a methodology based on both subjective...

Full text available for download in the portal

Language material for English audiovisual speech recognition system development. Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej

Publication

A. Czyżewski
B. Kostek
T. Ciszewski
D. Majewicz

- Year 2013

The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...

Multimodal human-computer interfaces based on advanced video and audio analysis

Publication

- Year 2013

The development history of multimodal interfaces is reviewed briefly in the introduction. Examples of applications of multimodal interfaces to education software and for disabled people are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with mouth gestures, and an audio interface for speech stretching for hearing-impaired and stuttering people. The Smart...

Full text available for download from an external service

Buzz-based honeybee colony fingerprint

Publication

- COMPUTERS AND ELECTRONICS IN AGRICULTURE - Year 2021

Non-intrusive remote monitoring has applications in a variety of areas. In industrial surveillance, devices are capable of detecting anomalies that may threaten machine operation. Similarly, agricultural monitoring devices are used to supervise livestock or provide higher yields. Modern IoT devices are often coupled with Machine Learning models, which provide valuable insights into device operation. However, the data...

Full text available for download in the portal

Evaluation of aspiration problems in L2 English pronunciation employing machine learning

Publication

M. Piotrowska
A. Czyżewski
T. Ciszewski
G. Korvel
A. Kurowski
B. Kostek

- Journal of the Acoustical Society of America - Year 2021

The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...

Full text available for download in the portal

Audio Feature Analysis for Precise Vocalic Segments Classification in English

Publication

- Year 2020

An approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...

Full text available for download from an external service

Fully Automated AI-powered Contactless Cough Detection based on Pixel Value Dynamics Occurring within Facial Regions

Publication

M. Szankin
A. Kwaśniewska
N. Kowalczyk
J. Rumiński
R. Nicolas
D. Gamba

- Year 2021

Increased interest in non-contact evaluation of the health state has led to higher expectations for delivering automated and reliable solutions that can be conveniently used during daily activities. Although some solutions for cough detection exist, they suffer from a series of limitations. Some of them rely on gesture or body pose recognition, which might not be possible in cases of occlusions, closer camera distances or impediments...

Full text available for download from an external service

MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES

Publication

M. Piotrowska
G. Korvel
B. Kostek
T. Ciszewski
A. Czyżewski

- International Journal of Applied Mathematics and Computer Science - Year 2019

Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) algorithm, and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Full text available for download in the portal

Subjective quality evaluation of 8- and 10-bit MP4-coded video sequences from Netflix

Publication

P. Falkowski-Gilski
T. Uhl
P. B. Divakarachari

- Zeszyty Naukowe Akademii Morskiej w Szczecinie - Year 2024

Recently, many researchers have been intensively conducting quality of service (QoS), quality of experience (QoE), and user experience (UX) studies in the field of video analysis. This paper is intended to make a new, complementary contribution to this field. Currently, streaming platforms are key products in relation to delivering video content online. Most often, they include the MP4 video format, which is most widely utilized...

Full text available for download from an external service

Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions

Publication

K. Kąkol
G. Korvel
B. Kostek

- Year 2018

The aim of the work is to analyze the Lombard speech effect in recordings and then modify the speech signal in order to improve objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of Lombard speech, and in particular on the effect of increasing the fundamental frequency...

ZINTEGROWANY SYSTEM DOMOWEGO MONITORINGU PARAMETRÓW MEDYCZNYCH OSÓB STARSZYCH I CHORYCH (Integrated Home Monitoring System for Medical Parameters of Elderly and Ill People)

Publication

- Year 2019

The proposed solutions aim to support elderly and ill people so that they can live independently in their own homes for as long as possible, with an increased sense of security that comes from knowing they are monitored and will not be left without help in a sudden life-threatening emergency. At the same time, the system does not compromise their sense of privacy and intimacy, since neither video cameras nor continuous audio listening are used for monitoring. In addition, the collected information...
