Publications
total: 348
Catalog Publications
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
This paper compares Speech Transmission Index (STI) measurement results obtained by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
Methodology and technology for the polymodal allophonic speech transcription
A method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography, and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected low-level MPEG-7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing by professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Accidental wow defect evaluation using sinusoidal analysis enhanced by artificial neural networks
The article presents a method for determining the characteristic of parasitic frequency modulations (wow) present in archival sound recordings. The presented approach tracks changes in the sinusoidal components of the sound, which reflect the course of the wow. Sinusoidal analysis is used to extract tonal components from distorted sound recordings. Additionally, in order to increase...
-
Retrospecting Polish Audio Engineering Society Membership on 20th Anniversary of the Polish Section of the Audio Engineering Society
This article presents some key events concerning the founding of the Polish Section of the Audio Engineering Society. In addition, the history of the International Symposia on Sound Engineering and Mastering is outlined, and the papers contained in this issue are briefly reviewed.
-
Waveguide model of the hearing aid earmold system
Background: The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing the physician to examine various earmold system configurations and choose the optimum one...
-
Computer-Supported Polysensory Integration Technology for Educationally Handicapped Pupils
In this paper, a multimedia system providing technology for hearing and visual attention stimulation is briefly presented. The system aims to support the development of educationally handicapped pupils. The system is presented in terms of its configuration, architecture, and therapeutic exercise implementation issues. Results of pupils’ improvements after 8 weeks of training with the system are also provided. Training...
-
Measurements and visualization of sound field distribution around organ pipe
Measurements and visualization of the acoustic field around an organ pipe are presented. The sound intensity technique was applied for this purpose. Measurements were performed in a free field. The organ pipe was activated with a constant air flow, produced by an external compressor, aimed at obtaining long-term steady-state responses of the generated acoustic signal. Sound energy distribution was measured in a defined fixed grid of points...
-
Multimedia polysensory integration training system dedicated to children with educational difficulties
This paper aims at presenting a multimedia system providing polysensory training for pupils with educational difficulties. The particularly interesting aspect of the system lies in the sonic interaction with image projection, in which the generated sounds lead to stimulation of a particular part of the human brain. The system architecture, video processing methods, therapeutic exercises and guidelines for children’s interaction...
-
A concept of Signal Equalization Method Based on Music Genre and the Listener's Room Characteristics
A research study that investigates the influence of the room acoustics environment on the frequency characteristic of the audio signal playback is presented. First, a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the frequency response of the room, a system for room acoustics compensation based on an eight-band equalizer is proposed. The system settings depend on music genre. In...
-
Enhancement of computer character animation utilizing fuzzy rules
The chapter presents a new method for processing computer character animation. It uses fuzzy inference based on rules and membership functions obtained by analyzing the results of subjective animation quality tests. During processing, new motion phases are automatically added to the animation, which improves visual quality and changes the fluency and stylization of motion in an intended way. In the paper...
-
Multimodal system for diagnosis and polysensory stimulation of subjects with communication disorders
An experimental multimodal system, designed for polysensory diagnosis and stimulation of persons with impaired communication skills or even non-communicative subjects, is presented. The user interface includes an eye tracking device and EEG monitoring of the subject. Furthermore, the system consists of a device for objective hearing testing and an autostereoscopic projection system designed to stimulate subjects through their...
-
Multimodal Approach For Polysensory Stimulation And Diagnosis Of Subjects With Severe Communication Disorders
...is evaluated on 9 patients, data analysis methods are described, and experiments correlating the Glasgow Coma Scale with extracted features describing subjects' performance in therapeutic exercises exploiting EEG and an eye tracker are presented. Performance metrics are proposed, and k-means clusters are used to define concepts for mental states related to EEG and eye-tracking activity. Finally, it is shown that the strongest correlations...
-
Automatic Rhythm Retrieval from Musical Files
This paper presents a comparison of the effectiveness of two computational intelligence approaches applied to the task of retrieving rhythmic structure from musical files. The method proposed by the authors of this paper generates rhythmic levels first, and then uses these levels to compose rhythmic hypotheses. Three phases: creating periods, creating simplified hypotheses and creating full hypotheses are examined within this study....
-
Towards Audio Signal Equalization Based on Spectral Characteristics of a Listening Room and Music Content Reproduced
This study investigates the influence of room acoustics on the frequency characteristic of the audio signal playback. First, the concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on a designed equalizer is proposed. The system settings depend on the music genre, which is recognized automatically....
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
The aim of this paper is to present a system for automatically assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on the skull. The gathered data are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Knowledge representation of motor activity of patients with Parkinson’s disease
An approach to extracting a knowledge representation from the analysis of biomedical signals concerning the motor activity of Parkinson's disease patients is proposed in this paper. This is done using accelerometers attached to their bodies as well as video images of their hand movements. Experiments are carried out employing artificial neural networks and a support vector machine for the recognition of characteristic motor activity...
-
Evaluation of the separation algorithm performance employing ANNs
The aim of this chapter is to present a methodology for separating musical sounds without a priori information about the sounds contained in the musical mix. The work shows that a properly trained artificial neural network (ANN) can automatically and correctly classify the sounds contained in a mixed signal. The classification effectiveness of the ANN is comparable to the subjective assessment of experts.
-
A study on signal processing methods applied to hearing aids
This paper presents a short survey of current technology available in hearing aids, with a focus on the digital signal processing techniques used. First, factors influencing hearing aid effectiveness are introduced. Then, examples of present DSP methods and strategies are provided. Also, a description of current limitations of hearing aids and future trends of development are shown. Finally, the notion of computational auditory...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Low-Level Music Feature Vectors Embedded as Watermarks
In this paper, a method of embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
-
Musical Instrument Separation Applied to Music Genre Classification (Separacja instrumentów muzycznych w zastosowaniu do rozpoznawania gatunków muzycznych)
This paper first outlines issues related to music genre classification and gives a short description of algorithms used for musical instrument separation. Also, the paper presents a proposed optimization of the feature vectors used for music genre recognition. Then, the ability of decision algorithms to properly recognize music genres is discussed based on two databases. In addition, results are cited for another database with regard to...
-
Personalized avatar animation for virtual reality
The paper presents a method for creating a personalized animation of an avatar for virtual reality applications such as multiplayer online games. Animation is stored in a simplified version, containing only keyframes for important avatar poses. This version defines key movements, i.e. roughly describes the avatar's action. Animation is enriched by the user with new motion phases utilizing fuzzy descriptors. Various degrees of motion...
-
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...
-
Assessment of the Effectiveness of a Short-term Hearing Aid Use in Patients with Different Degrees of Hearing Loss
The study evaluates the effectiveness of the hearing aid fitting process in short-term use (7 days). The evaluation method consists of a survey based on the APHAB (Abbreviated Profile of Hearing Aid Benefit) questionnaire. Additional criteria such as the degree of hearing loss, the number of hours and days of hearing aid use, and the user’s experience were also taken into consideration. The outcomes of the benefit...
-
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...
-
Analysis of allophones based on audio signal recordings and parameterization
The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...
-
A Study in Experimental Methods of Human-Computer Communication for Patients After Severe Brain Injuries
Experimental research in the domain of multimedia technology applied to medical practice is discussed, employing a prototype of an integrated multimodal system to assist diagnosis and polysensory stimulation of patients after severe brain injury. The system being developed includes, among others, an eye gaze tracker and EEG monitoring of non-communicating patients after severe brain injuries. The proposed solutions are used for collecting...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
New method for personalization of avatar animation
The paper presents a method for creating a personalized animation of an avatar utilizing fuzzy inference. First, the user designs a prototype version of the animation, with keyframes only for important poses, roughly describing the action. Then, the animation is enriched with new motion phases calculated by the fuzzy inference system using descriptors given by the user. Various degrees of motion fluency and naturalness are possible to achieve....
-
Investigating Feature Spaces for Isolated Word Recognition
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
-
Improving Objective Speech Quality Indicators in Noise Conditions
This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, and manipulation of formants. A set of words and sentences in Polish, recorded in silence,...
-
Objectivization of phonological evaluation of speech elements by means of audio parametrization
This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
-
Modeling and Designing Acoustical Conditions of the Interior – Case Study
The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...
-
Loudness Scaling Test Based on Categorical Perception
The main goal of this research study is to create a method for loudness scaling based on categorical perception. Its main features, such as the way of testing, the calibration procedure for securing reliable results, and the use of natural test stimuli, are described in the paper and assessed against a procedure that uses 1/2-octave bands of noise (LGOB) for the loudness growth estimation. The Mann-Whitney U-test is employed...
-
Expert media approach to hearing aids fitting
The article presents the problem of hearing aid fitting. An expert system is presented that makes it possible to find hearing aid characteristics appropriate to the hearing impairment. The system is based on the rough set method and fuzzy logic.
-
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by the human hearing sense or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...
-
Visual Data Encryption for Privacy Enhancement in Surveillance Systems
In this paper, a methodology for employing reversible visual encryption of data is proposed. The developed algorithms are focused on privacy enhancement in distributed surveillance architectures. First, the motivation for the study and a short review of preexisting methods of privacy enhancement are presented. The algorithmic background, system architecture along with a solution for anonymization of sensitive regions of interest...
-
Building Knowledge for the Purpose of Lip Speech Identification
Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
Human-computer interaction approach applied to the multimedia system of polysensory integration
The paper presents an approach that utilizes human-computer interaction in the therapy of dyslexia and other sensory disorders. Bakker's neuropsychological concept of dyslexia along with therapy methods are reviewed in the context of the Multimedia System of Polysensory Integration, proposed at the Multimedia Systems Department of Gdansk Univ. of Technology. The system is presented along with the training...
-
Comparative analysis of spectral and cepstral feature extraction techniques for phoneme modelling
A phoneme parameter extraction framework based on spectral and cepstral parameters is proposed. Using this framework, the phoneme signal is divided into frames and a Hamming window is applied. Performance is evaluated for the recognition of Lithuanian vowel and semivowel phonemes. Different feature sets without noise as well as at different noise levels are considered. Two classical machine learning methods (Naive Bayes and Support...
-
Virtual Whiteboard: A gesture-controlled pen-free tool emulating school whiteboard
The paper presents the so-called Virtual Whiteboard, which may be an alternative to modern electronic whiteboards based on electronic pens and sensors. The presented tool enables the user to write, draw and handle whiteboard contents using his/her hands only. Additional equipment such as infrared diodes, infrared cameras or cyber gloves is not needed. The user's interaction with the Virtual Whiteboard computer...
-
A study on of music features derived from audio recordings examples – a quantitative analysis
The paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician whose style evolved over time. First, the origin and the background of the division of music genres are briefly presented. Then, several objective parameters of an audio signal are recalled that...
-
Tinnitus Therapy Based on High-Frequency Linearization Principles - Preliminary Results
The aim of this work is to present problems related to tinnitus symptoms, its pathogenesis, hypotheses on tinnitus causes, and therapy to reduce or mask the phantom noise. In addition, the hypothesis on the existence of parasitic quantization that accompanies hearing loss is recalled. Moreover, the paper describes a study carried out by the Authors with the application of high-frequency dither having specially formed...
-
Analysis of impact of lossy audio compression on the robustness of watermark embedded in the DWT domain for non-blind copyright protection
A methodology of non-blind watermarking of audio content is proposed. An outline of the audio copyright problem and the motivation for practical applications are discussed. The algorithmic theory pertaining to watermarking techniques is briefly introduced. The system architecture together with the workflows employed for embedding and extracting the watermarks are described. The implemented approach is described and the obtained results are reported....
-
Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks
This paper presents a method for improving users' quality of experience through processing of movie soundtracks. The dialogue clarity enhancement algorithms were introduced for detecting dialogue in movie soundtrack mixes and then for amplifying the dialogue components. The front channel signals (left, right, center) are analyzed in the frequency domain. The selected partials in the center channel signal, which yield high disparity...
-
Analyzing the Effectiveness of the Brain–Computer Interface for Task Discerning Based on Machine Learning
The aim of the study is to compare electroencephalographic (EEG) signal feature extraction methods in the context of the effectiveness of the classification of brain activities. For classification, electroencephalographic signals were obtained using an EEG device from 17 subjects in three mental states (relaxation, excitation, and solving a logical task). Blind source separation employing independent component analysis (ICA) was...
-
Listening to Live Music: Life beyond Music Recommendation Systems
This paper first presents a short review of music recommendation systems based on social collaborative filtering. A dictionary of terms related to music recommendation systems, such as music information retrieval (MIR), Query-by-Example (QBE), Query-by-Category (QBC), music content, music annotating, music tagging, bridging the semantic gap in music domain, etc. is introduced. Bases of music recommender systems are briefly presented,...
-
Online Sound Restoration for Digital Library Applications
A system for sound restoration was conceived and engineered with the following features: no special sound restoration software is needed for the user to perform audio restoration; the restoration process employs automatic reduction of noise, wow and impulse distortions performed in online mode; and the user needs no skills in digital signal processing. The principles of the created system and its features as well...