Publications
Filters
total: 890
Catalog Publications
Year 2017
-
An audio-visual corpus for multimodal automatic speech recognition
Publicationreview of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
Analysis of allophones based on audio signal recordings and parameterization
PublicationThe aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...
-
ANN for human pose estimation in low resolution depth images
PublicationThe paper presents an approach to localize human body joints in 3D coordinates based on a single low resolution depth image. First a framework to generate a database of 80k realistic depth images from a 3D body model is described. Then data preprocessing and normalization procedure, and DNN and MLP artificial neural networks architectures and training are presented. The robustness against camera distance and image noise is analysed....
-
Assessment of hearing in coma patients employing auditory brainstem response, electroencephalography, and eye-gaze-tracking
PublicationThe results of the study conducted by Tagliaferri et al. in 12 European countries indicate that the ratio of registered brain injury cases in Europe amounts to 150-300 per 100 000 people, with the European mean value of 235 cases per 100 000 people. The project presented in the paper assumes development of a combined metric of patients’ state remaining in coma by intelligent fusion of GCS (subjective Glasgow Coma Scale or its derivatives)...
-
Automatic music set organizatio based on mood of music / Automatyczna organizacja bazy muzycznej na podstawie nastroju muzyki
PublicationThis work is focused on an approach based on the emotional content of music and its automatic recognition. A vector of features describing emotional content of music was proposed. Additionally, a graphical model dedicated to the subjective evaluation of mood of music was created. A series of listening tests was carried out, and results were compared with automatic mood recognition employing SOM (Self Organizing Maps) and ANN (Artificial...
-
Badanie wierności brzmienia dźwięku instrumentów wirtualnych VST/TRTAS
PublicationTematem referatu jest subiektywne badanie wierności brzmienia instrumentów wirtualnych (VST/TRTAS) wykorzystujących próbkowanie dźwięków rzeczywistych instrumentów muzycznych. Na potrzeby przedstawionej pracy wybrano kilka utworów muzyki orkiestrowej z epoki romantyzmu i klasycyzmu, nagranych przy użyciu instrumentów akustycznych. Następnie zaaranżowano fragmenty tych utworów, wykorzystując do tego instrumenty wirtualne i efekty...
-
Building Knowledge for the Purpose of Lip Speech Identification
PublicationConsecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
Classifying type of vehicles on the basis of data extracted from audio signal characteristics
PublicationThe aim of this study is to find and optimize a feature vector for an automatic recognition of the type of vehicles, extracted form an audio signal. First, the influence of weather-based conditions of road surface on spectral characteristic of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...
-
Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers
PublicationThe purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
-
Comparison of selected electroencephalographic signal classification methods
PublicationA variety of methods exists for electroencephalographic (EEG) signals classification. In this paper, we briefly review selected methods developed for such a purpose. First, a short description of the EEG signal characteristics is shown. Then, a comparison between the selected EEG signal classification methods, based on the overview of research studies on this topic, is presented. Examples of methods included in the study are: Artificial...
-
Detection of the Incoming Sound Direction Employing MEMS Microphones and the DSP
PublicationA 3D acoustic vector sensor based on MEMS microphones and its application to road traffic monitoring is presented in the paper. The sensor is constructed from three pairs of digital MEMS microphones, mounted on the orthogonal axes. Signals obtained from the microphones are used to compute sound intensity vectors in each direction. With this data, it is possible to compute the horizontal and vertical angle of an incoming sound....
-
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
PublicationAn allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...
-
Evaluation of Face Detection Algorithms for the Bank Client Identity Verification
PublicationResults of investigation of face detection algorithms efficiency in the banking client visual verification system are presented. The video recordings were made in real conditions met in three bank operating outlets employing a miniature industrial USB camera. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide...
-
Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów
PublicationThe aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists in several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Handwritten signature verification system employing wireless biometric pen
PublicationThe handwritten signature verification system being a part of the developed multimodal biometric banking stand is presented. The hardware component of the solution is described with a focus on the signature acquisition and on verification procedures. The signature is acquired employing an accelerometer and a gyroscope built-in the biometric pen plus pressure sensors for the assessment of the proper pen grip and then the signature...
-
Intelligent equalizer solution employing music genre and the room characteristics analysis
PublicationThe paper presents an intelligent equalizer solution based on room acoustic conditions and music genre analysis. A series of acoustic characteristic measurements are performed for checking the concept proposed. White noise (reference signal) and audio excerpts belonging to six music genres are utilized as excitation signals in measurements. This results in registration of frequency responses of rooms and reverberation times. Signals...
-
Komputerowe oko świadomości
PublicationZnane metody badania osób w śpiączce nie dają odpowiedzi na pytanie jak funkcjonuje poznawczo osoba wybudzona ze śpiączki z obniżoną świadomością. Książka podsumowuje wyniki badań przybliżających odpowiedź na powyższe pytanie. Część prezentowanych badań była prowadzona przez autorów i współpracowników już wcześniej, z wykorzystaniem skonstruowanego urządzenia do śledzenia wzroku, zaś nowsze prezentowane badania dostarczyły wyników...
-
Measurement and visualization of sound intensity vector distribution in proximity of acoustic diffusers
PublicationIn this work, we would like to present analyses and visualizations of sound intensity distribution measured in proximity of an acoustic diffuser. Such distribution may be used for estimation of basic acoustic parameters of a diffuser. Measurement is performed with the use of a logarithmic sine sweep which allows for the analysis of waves scattered by the diffuser and rejecting the direct sound signal component. Pressure and sound...
-
METODA OCENY EFEKTYWNOŚCI KRÓTKOTERMINOWEGO STOSOWANIA APARATÓW SŁUCHOWYCH Z WYKORZYSTANIEM APLIKACJI INTERNETOWEJ
PublicationW pracy przedstawiono opracowanie metody oceny efektywności protezowania osób niedosłyszących aparatami słuchowymi. Metoda polega na badaniu ankietowym opartym na kwestionariuszu oceny APHAB uzupełnionym testem rozumienia słów jednosylabowych w polu swobodnym. Uwzględniono dodatkowe kryteria, takie jak: stopień ubytku słuchu, pomiar liczby dni i godzin korzystania z aparatów słuchowych oraz doświadczenia pacjenta. Metoda została...
-
Modelowanie procesów słuchowych w celu oceny ryzyka uszkodzeń słuchu
PublicationW rozdziale przedstawiono sposób modelowania fizjologicznych procesów zachodzących w systemie słuchowym. Opisano szczegółowo autorską metodę oceny szkodliwego oddziaływania hałasu na słuch. W rozdziale opisano również kocepcję psychoakustycznej dozymetrii hałasowej. Bazuje ona na wyznaczaniu wartości czasowego przesunięcia progu słyszenia w pasmach krytycznych słuchu. Uwzględnia również efekty wywołane przez refleks akustyczny....
-
Modified dynamic time warping method applied to handwritten signature authenticity verification
PublicationA signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method’s efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
Multimodal Approach For Polysensory Stimulation And Diagnosis Of Subjects With Severe Communication Disorders
Publicationis evaluated on 9 patients, data analysis methods are described, and experiments of correlating Glasgow Coma Scale with extracted features describing subjects performance in therapeutic exercises exploiting EEG and eyetracker are presented. Performance metrics are proposed, and k-means clusters used to define concepts for mental states related to EEG and eyetracking activity. Finally, it is shown that the strongest correlations...
-
Multimodal system for diagnosis and polysensory stimulation of subjects with communication disorders
PublicationAn experimental multimodal system, designed for polysensory diagnosis and stimulation of persons with impaired communication skills or even non-communicative subjects is presented. The user interface includes an eye tracking device and the EEG monitoring of the subject. Furthermore, the system consists of a device for objective hearing testing and an autostereoscopic projection system designed to stimulate subjects through their...
-
OCENA EFEKTYWNOŚCI KRÓTKOTERMINOWEGO ZASTOSOWANIA APARATÓW SŁUCHOWYCH Z WYKORZYSTANIEM APLIKACJI INTERNETOWEJ
PublicationW pracy przedstawiono opracowanie metody oceny efektywności protezowania osób niedosłyszących aparatami słuchowymi. Metoda polega na badaniu ankietowym opartym na kwestionariuszu oceny APHAB uzupełnionym testem rozumienia słów jednosylabowych w polu swobodnym. Uwzględniono dodatkowe kryteria, takie jak: stopień ubytku słuchu, po-miar liczby godzin i dni korzystania z aparatów słuchowych oraz doświadczenia pacjenta. Metoda została...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
PublicationPerformance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Pilot Testing of Developed Multimodal Biometric Identity Verification System
PublicationThe bank client identity verification system developed in the course of the IDENT project is presented. The total number of five biometric modalities including: dynamic signature proofing, voice recognition, face image verification, face contour extraction and hand blood vessels distribution comparison have been developed and studied. The experimental data were acquired employing multiple biometric sensors installed at engineered...
-
Real and imaginary motion classification based on rough set analysis of EEG signals for multimedia applications
PublicationRough set-based approach to the classification of EEG signals of real and imaginary motion is presented. The pre-processing and signal parametrization procedures are described, the rough set theory is briefly introduced, and several classification scenarios and parameters selection methods are proposed. Classification results are provided and discussed with their potential utilization for multimedia applications controlled by the...
-
Road surface roughness estimation employing integrated position and acceleration sensor
PublicationAssessment of a surface quality being an essential task for the authorities supervising the roads provides the subject of the paper. Information about riding quality of a pavement, important for drivers, both in terms of their comfort and safety is collected during experiments employing mobile sensors. The paper describes the use of a miniature position and acceleration sensor for evaluation of the roughness of the road surface....
-
Shape-Based Pose Estimation of Robotic Surgical Instruments
PublicationWe describe a detector of robotic instrument parts in image-guided surgery. The detector consists of a huge ensemble of scale-variant and pose-dedicated, rigid appearance templates. The templates, which are equipped with pose-related keypoints and segmentation masks, allow for explicit pose estimation and segmentation of multiple end-effectors as well as fine-grained non-maximum suppression. We train the templates by grouping examples...
-
Sound intensity distribution around organ pipe
PublicationThe aim of the paper was to compare acoustic field around the open and stopped organ pipes. The wooden organ pipe was located in the anechoic chamber and activated with a constant air flow, produced by an external air-compressor. Thus, long-term steady state response was possible to obtain. Multichannel acoustic vector sensor was used to measure the sound intensity distribution of radiated acoustic energy. Measurements have been...
-
Stworzenie stereofonicznej ścieżki dźwiękowej do filmu symulującej dźwięk wielokanałowy
PublicationCelem referatu pracy jest przedstawienie procesu tworzenia stereofonicznej ścieżki dźwiękowej do filmu, symulującej dźwięk wielokanałowy w odsłuchu słuchawkowym. Opracowana symulacja dźwięku wielokanałowego wykorzystuje filtrację HRTF (ang. Head-Related-Transfer-Function). W celu umożliwienia jednoczesnego odsłuchu kilku partii instrumentalnych składających się na ścieżkę dźwiękową stworzona została aplikacja wraz z graficznym...
-
Surgical tool tracking by on-line selection of structural correlation filters
PublicationIn visual tracking of surgical instruments, correlation filtering finds the best candidate with maximal correlation peak. However, most trackers only consider capturing target appearance but not target structure. In this paper we propose surgical instrument tracking approach that integrates prior knowledge related to rotation of both shaft and tool tips. To this end, we employ rigid parts mixtures model of an instrument. The rigidly...
-
SYMULACJA DŹWIĘKU PRZESTRZENNEGO W ŚCIEŻCE DŹWIĘKOWEJ W ODSŁUCHU BINAURALNYM
PublicationCelem pracy jest przedstawienie aplikacji umożliwiającej tworzenie stereofonicznej ścieżki dźwiękowej do filmu, symulującej dźwięk przestrzenny w odsłuchu słuchawkowym. Interfejs przygotowanej aplikacji pozwala użytkownikowi na wybór rozmieszczenia konkretnych partii instrumentalnych w odpowiednich miejscach w przestrzeni dźwiękowej oraz jednoczesny odsłuch wszystkich ścieżek wraz z przygotowanym materiałem filmowym. Symulacja...
-
The project IDENT: Multimodal biometric system for bank client identity verification
PublicationBiometric identity verification methods are implemented inside a real banking environment comprising: dynamic handwritten signature verification, face recognition, bank cli-ent voice recognition and hand vein distribution verification. A secure communication system based on an intra-bank client-server architecture was designed for this purpose. Hitherto achieved progress within the project is reported in this paper with a focus...
-
Tracking Moving Objects in Video Surveillance Systems with Kalman and Particle Filters – A Practical Approach
PublicationThis Chapter focuses on the first type of object tracking algorithms, namely on Kalman and particle filters. A theory of these algorithms may be found in many publications, there are also reports on implementation of these approaches to object tracking in video. However, developers of VCA systems still face two important problems. The first one is related to obtaining accurate measurements of positions and sizes of the tracked...
-
Traffic Noise Analysis Applied to Automatic Vehicle Counting and Classification
PublicationProblems related to determining traffic noise characteristics are discussed in the context of automatic dynamic noise analysis based on noise level measurements and traffic prediction models. The obtained analytical results provide the second goal of the study, namely automatic vehicle counting and classification. Several traffic prediction models are presented and compared to the results of in-situ noise level measurements. Synchronized...
-
Voiceless Stop Consonant Modelling and Synthesis Framework Based on MISO Dynamic System
PublicationA voiceless stop consonant phoneme modelling and synthesis framework based on a phoneme modelling in low-frequency range and high-frequency range separately is proposed. The phoneme signal is decomposed into the sums of simpler basic components and described as the output of a linear multiple-input and single-output (MISO) system. The impulse response of each channel is a third order quasi-polynomial. Using this framework, the...
-
Wspomaganie komunikacji w procesie neurorehabilitacji z wykorzystaniem śledzenia wzroku i analizy sygnałów EEG
PublicationW pracy przedstawiono charakterystykę systemu do wspomagania komunikacji w procesie neurorehabilitacji osób w stanie ograniczonej świadomości. Przygotowana aplikacja komputerowa wykorzystuje metodę śledzenia wzroku wspomaganą analizą sygnału EEG. W pracy podano genezę powstania systemu, scharakteryzowano zaimplementowane ćwiczenia oraz pozostałe funkcjonalności, a także zamieszczono wyniki wstępnych badań dokonanych w kilku polskich...
Year 2018
-
A comparative study of English viseme recognition methods and algorithm
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...
-
A comparative study of English viseme recognition methods and algorithms
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
PublicationThis paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
A Device for Measuring Auditory Brainstem Responses to Audio
PublicationStandard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...
-
A fast time-frequency multi-window analysis using a tuning directional kernel
PublicationIn this paper, a novel approach for time-frequency analysis and detection, based on the chirplet transform and dedicated to non-stationary as well as multi-component signals, is presented. Its main purpose is the estimation of spectral energy, instantaneous frequency (IF), spectral delay (SD), and chirp rate (CR) with a high time-frequency resolution (separation ability) achieved by adaptive fitting of the transform kernel. We...
-
A Stand for Measurement and Prediction of Scattering Properties of Diffusers
PublicationIn this paper we present a set of solutions which may be used for prototyping and simulation of acoustic scattering devices. A system proposed is capable of measuring sound field. Also a way to use an open source solution for simulation of scattering phenomena occurring in proximity of acoustic diffusers is shown. The result of our work are measurement procedure and a prototype of the simulation script based on FEniCS - an open source...
-
A Study on Audio Signal Processed by "Instant Mastering"
PublicationAn increasing amount of music produced in home- and project-studios results in development and growth of "automatic mastering services". The presented investigation explores changes introduced to audio signal by various online mastering platforms. A music set consisting of 10 songs produced in small facilities was processed by eight on-line automatic mastering services. Additionally, some laboratory-constructed signals were tested....
-
A study on of music features derived from audio recordings examples – a quantitative analysis
PublicationThe paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician, whose style evolved over time. Firstly, the origin and the background of the division of music genres were shortly presented. Then, several objective parameters of an audio signal were recalled that...
-
Adaptacja akustyczna pomieszczenia wykładowego - studium przypadku
PublicationW niniejszej pracy przedstawiono analizę rozkładu pola akustycznego sali wykładowej znajdującej się w budynku Wydziału Elektroniki i Telekomunikacji Politechniki Gdańskiej. Badania przeprowadzono metodą pomiarową oraz symulacyjną z wykorzystaniem programu Odeon. Wybór parametrów oceny akustyki wnętrz sugerowany jest wymaganiami stawianymi pomieszczeniom lekcyjnym z zaznaczeniem multimedialnego charakteru wykładów prowadzonych...
-
Akustyczna analiza parametrów ruchu drogowego z wykorzystaniem informacji o hałasie oraz uczenia maszynowego
PublicationCelem rozprawy było opracowanie akustycznej metody analizy parametrów ruchu drogowego. Zasada działania akustycznej analizy ruchu drogowego zapewnia pasywną metodę monitorowania natężenia ruchu. W pracy przedstawiono wybrane metody uczenia maszynowego w kontekście analizy dźwięku (ang.Machine Hearing). Przedstawiono metodologię klasyfikacji zdarzeń w ruchu drogowym z wykorzystaniem uczenia maszynowego. Przybliżono podstawowe...
-
An application of acoustic sensors for the monitoring of road traffic
PublicationAssessment of road traffic parameters for the developed intelligent speed limit setting decision system constitutes the subject addressed in the paper. Current traffic conditions providing vital data source for the calculation of the locally fitted speed limits are assessed employing an economical embedded platform placed at the roadside. The use of the developed platform employing a low-powered processing unit with a set of microphones,...
-
Analiza Nagrań Ruchu Drogowego w Kontekście Akustycznej Klasyfikacji Typu Pojazdu
PublicationCelem niniejszej pracy jest przeprowadzenie analizy sygnału fonicznego w kontekście klasyfikacji typu pojazdu. Część teoretyczna zawiera krytyczny przegląd systemów monitorowania ruchu drogowego, w szczególności systemów ITS (Intelginet Transport System). Część praktyczna przedstawia założenia dotyczące przygotowania bazy nagrań testowych, uwzględniających różne scenariusze ruchu drogowego. Zarejestrowane sesje nagraniowe przetworzono,...
-
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publicationconvolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
PublicationThe aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
Analysis of results of large-scale multimodal biometric identity verification experiment
PublicationAn analysis of a large set of biometric data obtained during the enrolment and the verification phase in an experimental biometric system installed in bank branches is presented. Subjective opinions of bank clients and of bank tellers were also surveyed concerning the studied biometric methods in order to discover and to explore relations emerging from the obtained multimodal dataset. First, data acquisition and identity verification...
-
Aparat słuchowy a alternatywne urządzenia poprawiające słyszenie
PublicationW opracowaniu dokonano przeglądu dostępnych prac dotyczących różnych rodzajów urządzeń poprawiających słyszenie, które w szczególnych przypadkach mogą być traktowane jako rozwiązania alternatywne w stosunku do klasycznych aparatów słuchowych. Praca zawiera dyskusję na temat nowego rodzaju aparatu słuchowego wstępnie zaprogramowanego, który może być dystrybuowany korespondencyjnie lub bezpośrednio potencjalnym użytkownikom. Ponadto...
-
Assessment of Therapeutic Progress After Acquired Brain Injury Employing Electroencephalography and Autoencoder Neural Networks
PublicationA method developed for parametrization of EEG signals gathered from participants with acquired brain injuries is shown. Signals were recorded during therapeutic session consisting of a series of computer assisted exercises. Data acquisition was performed in a neurorehabilitation center located in Poland. The presented method may be used for comparing the performance of subjects with acquired brain injuries (ABI) who are involved...
-
AUDIO SIGNAL EQUALIZATION BASED ON IMPULSE RESPONSE OF A LISTENING ROOM AND MUSIC CONTENT REPRODUCED
PublicationA research study presents investigations of the influence of the room acoustics on the frequency characteristic of the audio signal playback. First, a concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on an equalizer designed is proposed. The system settings depend on music genre recognized automatically....
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
PublicationIn this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, audio signal is analyzed indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video , i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
PublicationThe aim of this paper is to present a system for automatic assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on skull. Data gathered are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Automatic music genre classification based on musical instrument track separation / Automatyczna klasyfikacja gatunku muzycznego wykorzystująca algorytm separacji dźwięku instrumentó muzycznych
PublicationThe aim of this article is to investigate whether separating music tracks at the pre-processing phase and extending feature vector by parameters related to the specific musical instruments that are characteristic for the given musical genre allow for efficient automatic musical genre classification in case of database containing thousands of music excerpts and a dozen of genres. Results of extensive experiments show that the approach...
-
Badanie stanu nawierzchni drogowej z wykorzystaniem uczenia maszynowego
PublicationW artykule opisano budowę systemu informowania o stanie nawierzchni drogowej z wykorzystaniem metod cyfrowego przetwarzania obrazów oraz uczenia maszynowego. Efektem wykonanych prac badawczych jest eksperymentalna platforma, pozwalająca na rejestrację uszkodzeń na drogach, system do analizy, przetwarzania i klasyfikacji danych oraz webowa aplikacja użytkownika do przeglądu stanu nawierzchni w wybranej lokalizacji.
-
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
PublicationA method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...
-
Calibration of acoustic vector sensor based on MEMS microphones for DOA estimation
PublicationA procedure of calibration of a custom 3D acoustic vector sensor (AVS) for the purpose of direction of arrival (DoA) estimation, is presented and validated in the paper. AVS devices working on a p-p principle may be constructed from standard pressure sensors and a signal processing system. However, in order to ensure accurate DoA estimation, each sensor needs to be calibrated. The proposed algorithm divides the calibration process...
-
Classification of Music Genres by Means of Listening Tests and Decision Algorithms
PublicationThe paper compares the results of audio excerpt assignment to a music genre obtained in listening tests and classification by means of decision algorithms. A short review on music description employing music styles and genres is given. Then, assumptions of listening tests to be carried out along with an online survey for assigning audio samples to selected music genres are presented. A framework for music parametrization is created...
-
Closed-loop stimulation of temporal cortex rescues functional networks and improves memory
PublicationMemory failures are frustrating and often the result of ineffective encoding. One approach to improving memory outcomes is through direct modulation of brain activity with electrical stimulation. Previous efforts, however, have reported inconsistent effects when using open-loop stimulation and often target the hippocampus and medial temporal lobes. Here we use a closed-loop system to monitor and decode neural activity from direct...
-
CNN Architectures for Human Pose Estimation from a Very Low Resolution Depth Image
PublicationThe paper is dedicated to proposing and evaluating a number of convolutional neural network architectures for calculating a multiple regression on 3D coordinates of human body joints tracked in a single low resolution depth image. The main challenge was to obtain a high precision in case of a noisy and coarse scan of the body, as observed by a depth sensor from a large distance. The regression network was expected to reason about...
-
Comparative analysis of spectral and cepstral feature extraction techniques for phoneme modelling
PublicationPhoneme parameter extraction framework based on spectral and cepstral parameters is proposed. Using this framework, the phoneme signal is divided into frames and Hamming window is used. The performances are evaluated for recognition of Lithuanian vowel and semivowel phonemes. Different feature sets without noise as well as at different level of noise are considered. Two classical machine learning methods (Naive Bayes and Support...
-
Comparative analysis of various transformation techniques for voiceless consonants modeling
PublicationIn this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT) are performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....
-
Comparison of Classification Methods for EEG Signals of Real and Imaginary Motion
PublicationThe classification of EEG signals provides an important element of brain-computer interface (BCI) applications, underlying an efficient interaction between a human and a computer application. The BCI applications can be especially useful for people with disabilities. Numerous experiments aim at recognition of motion intent of left or right hand being useful for locked-in-state or paralyzed subjects in controlling computer applications....
-
Comparison of Methods for Real and Imaginary Motion Classification from EEG Signals
PublicationA method for feature extraction and results of classification of EEG signals obtained from performed and imagined motion are presented. A set of 615 features was obtained to serve for the recognition of type and laterality of motion using 8 different classifications approaches. A comparison of achieved classifiers accuracy is presented in the paper, and then conclusions and discussion are provided. Among applied algorithms the...
-
Counting and tracking vehicles using acoustic vector sensors
PublicationA method is presented for counting vehicles and for determining their movement direction by means of acoustic vector sensor application. The assumptions of the method employing spatial distribution of sound intensity determined with the help of an integrated 3D intensity probe are discussed. The intensity probe developed by the authors was used for the experiments. The mode of operation of the algorithm is presented in conjunction...
-
Determination of the Vehicles Speed Using Acoustic Vector Sensor
PublicationThe method for determining the speed of vehicles using acoustic vector sensor and sound intensity measurement technique was presented in the paper. First, the theoretical basis of the proposed method was explained. Next, the details of the developed algorithm of sound intensity processing both in time domain and in frequency domain were described. Optimization process of the method was also presented. Finally, the proposed measurement...
-
Economical methods for measuring road surface roughness
PublicationTwo low-cost methods of estimating the road surface condition are presented in the paper, the first one based on the use of accelerometers and the other on the analysis of images acquired from cameras installed in a vehicle. In the first method, miniature positioning and accelerometer sensors are used for evaluation of the road surface roughness. The device designed for installation in vehicles is composed of a GPS receiver and...
-
Editor's note and 2018 reviewers
PublicationPrzedmiotem pracy jest odniesienie do prac opublikowanych w 2018 roku, jak również do serii artykułów w ramach specjalnego wydania: Special Issue on Augmented and Participatory Sound and Music Interaction Using Semantic Audio.
-
Eksternalizacja w binauralnej ambisonicznej auralizacji źródeł kierunkowych
PublicationW artykule przedstawiono najważniejsze składniki procesu skutecznego renderowania trójwymiarowego obrazu dźwiękowego za pomocą słuchawek. W tym celu badany jest stopień oddziaływania poszczególnych czynników wpływających na eksternalizację dźwięku: śledzenie położenia głowy (ang. head tracking), indywidualne funkcje przenoszenia głowy (HRTF – Head Related Transfer Function, odnoszące się do matematycznej funkcji propagacji dźwięku...
-
Electrical Stimulation Modulates High Gamma Activity and Human Memory Performance
PublicationDirect electrical stimulation of the brain has emerged as a powerful treatment for multiple neurological diseases, and as a potential technique to enhance human cognition. Despite its application in a range of brain disorders, it remains unclear how stimulation of discrete brain areas affects memory performance and the underlying electrophysiological activities. Here, we investigated the effect of direct electrical stimulation...
-
Employing economical methods for pavement defects estimation
PublicationIt is a common practise that measurements of road surface conditions are made using professional and expensive apparatus. Typically a van or a truck equipped with a set of professional sensors i.e. laser scanners of surface is used, therefore the measurement update period is often quite long. Two alternative low-cost methods for estimating road pavement defects and failures were proposed and investigated by the authors. The first...
-
Eulerian motion magnification applied to structural health monitoring of wind turbines
PublicationSeveral types of defects may occur in wind turbines, as physical damage of blades or gearbox malfunction. A wind farm monitoring and damage prediction system is built to observe abnormal vibrations of elements of wind turbine: blades, nacelle, and tower. Contactless methods are developed which do not require turbine stopping. In this work, structural health monitoring of a wind turbine is evaluated using a conversion from the captured...
-
EVALUATION OF SOUND QUALITY FEATURES ON ENVIRONMENTAL NOISE EFFECTS – A CASE STUDY APPLIED TO ROAD TRAFFIC NOISE
PublicationThe paper shows a study on the relationship between noise measures and sound quality (SQ) features that are related to annoyance caused by the traffic noise. First, a methodology to perform analyses related to the traffic noise annoyance is described including references to parameters of the assessment of road noise sources. Next, the measurement setup, location and results are presented along with the derived sound quality features....
-
Examination of the factors influencing binaural rendering on headphones with the use of directivity patterns
PublicationThis paper presents a study on the influence of the directional sound sources with the use of the directivity patterns. This contribution also includes a comparison to the work done by Wendt et al., where several directivity pattern designs used to gradually control the auditory source distance in a room were showed. While the tests of Wendt et al. were done by auralizing source and room using a loudspeaker ring in an anechoic...
-
Examining Feature Vector for Phoneme Recognition
PublicationThe aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists in several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Improving the quality of speech in the conditions of noise and interference
PublicationThe aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...
-
In Memoriam Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering
PublicationBiography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.
-
INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY
PublicationIn recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...
-
Instrument detection and pose estimation with rigid part mixtures model in video-assisted surgeries
PublicationLocalizing instrument parts in video-assisted surgeries is an attractive and open computer vision problem. A working algorithm would immediately find applications in computer-aided interventions in the operating theater. Knowing the location of tool parts could help virtually augment visual faculty of surgeons, assess skills of novice surgeons, and increase autonomy of surgical robots. A surgical tool varies in appearance due to...
-
Investigating Feature Spaces for Isolated Word Recognition
PublicationMuch attention is given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the issue related to the investigation of the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and timefrequency signal representation...
-
Listening to Live Music: Life beyond Music Recommendation Systems
PublicationThis paper presents first a short review on music recommendation systems based on social collaborative filtering. A dictionary of terms related to music recommendation systems, such as music information retrieval (MIR), Query-by-Example (QBE), Query-by-Category (QBC), music content, music annotating, music tagging, bridging the semantic gap in music domain, etc. is introduced. Bases of music recommender systems are shortly presented,...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
PublicationThe purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Marking the Allophones Boundaries Based on the DTW Algorithm
PublicationThe paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking of allophones boundaries in continuous speech is a difficult issue due to the mutual influence of adjacent phonemes on each other. It is this neighborhood on the one hand that creates variants of phonemes that is allophones, and on the other hand it affects that the border...
-
Measurement of Latency in the Android Audio Path
PublicationThis paper provides a description of experimental investigations concerning comparison between the audio path characteristics of various Android versions. First, information about the changes in each system version in the context of latency caused by them is presented. Then, a measurement procedure employing available applications to measure latency is described comparing to results contained in the Internet. Finally, a comparison...
-
Metodyka tworzenia dynamicznych map hałasu w środowisku aglomeracji miejskiej z zastosowaniem gridu superkomputerowego
PublicationW rozprawie przedstawiono i zweryfikowano opracowaną przez autora metodę sporządzania aktualizowanych dynamicznie map hałasu. Oryginalnym podejściem jest zastosowanie potencjału gridu superkomputerowego jako środowiska do przeprowadzania obliczeń numerycznych w procesie modelowania źródeł i propagacji dźwięku. Dzięki temu możliwe stało się przeliczanie mapy hałasu obszaru wielkości dużego miasta w krótkich odstępach czasu. Autor...
-
Modelling of Objects Behaviour for Their Re-identification in Multi-camera Surveillance System Employing Particle Filters and Flow Graphs
PublicationAn extension of the re-identification method of modeling objects behavior in muti-camera surveillance systems, related to adding a particle filter to the decision-making algorithm is covered by the paper. A variety of tracking methods related to a single FOV (Field of Vision) are known, proven to be quite different for inter-camera tracking, especially in case of non-overlapping FOVs. The re-identification methods refer to the...
-
Objectivization of phonological evaluation of speech elements by means of audio parametrization
PublicationThis study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
-
Performance Analysis of Developed Multimodal Biometric Identity Verification System
PublicationThe bank client identity verification system developed in the course of the IDENT project is presented. The total number of five biometric modalities including: dynamic handwritten signature proofing, voice recognition, face image verification, face contour extraction and hand blood vessels distribution comparison have been developed and studied. The experimental data were acquired employing multiple biometric sensors installed...
-
Pomiary wartości opóźnień w torze audio urządzeń z systemem Android
PublicationPoniższy artykuł opisuje metody pomiarów wartości opóźnienia w torze fonicznym urządzeń pracujących na różnych wersjach systemu Android. W pierwszej części artykułu podano krótką charakterystykę środowiska Android w kontekście opóźnień w torze fonicznym. Następnie przedstawiono sposób pomiaru opóźnienia w torze fonicznym za pomocą aplikacji SuperPowered Latency oraz Dr. Rick O’Rang Loopback. W końcowej...
-
POPRAWA OBIEKTYWNYCH WSKAŹNIKÓW JAKOŚCI MOWY W WARUNKACH HAŁASU
PublicationCelem pracy jest modyfikacja sygnału mowy, aby uzyskać zwiększenie poprawy obiektywnych wskaźników jakości mowy po zmiksowaniu sygnału użytecznego z szumem bądź z sygnałem zakłócającym. Wykonane modyfikacje sygnału bazują na cechach mowy lombardzkiej, a w szczególności na efekcie podniesienia częstotliwości podstawowej F0. Sesja nagraniowa obejmowała zestawy słów i zdań w języku polskim, nagrane w warunkach ciszy, jak również w...
-
Potencjał wdrożeniowy systemu netBaltic - scenariusze wykorzystania i perspektywy dalszego rozwoju
PublicationPrzedstawiono potrzeby związane z wdrażaniem usług e-nawigacji . Dokonano też krótkiej analizy wymagań w zakresie pożądanych parametrów transmisyjnych systemów zdolnych do przenoszenia rosnącej ilości informacji wymienianych pomiędzy stacjami brzegowymi i statkami na morzu. Dokonano też krótkiego przeglądu szerokiego zakresu systemów wykorzystywanych na morzu. Sformułowano wnioski związane z potrzebą opracowania uniwersalnego...
-
Przykład zastosowania przetworników piezoelektrycznych do stworzenia elektronicznych padów na platformie sprzętowej Arduino
PublicationW pracy zaprezentowano autorskie urządzenie umożliwiające sterowania procesem wyzwalania dowolnych próbek dźwiękowych przy użyciu tak zwanych padów perkusyjnych w zewnętrznym samplerze. Pady stworzono za pomocą zestawu zabawkowej perkusji, przetworników piezoelektrycznych oraz specjalnie zaprogramowanej platformy sprzętowej Arduino.
-
Pupil size reflects successful encoding and recall of memory in humans
PublicationPupil responses are known to indicate brain processes involved in perception, attention and decision-making. They can provide an accessible biomarker of human memory performance and cognitive states in general. Here we investigated changes in the pupil size during encoding and recall of word lists. Consistent patterns in the pupil response were found across and within distinct phases of the free recall task. The pupil was most...
-
REJESTRACJA, PARAMETRYZACJA I KLASYFIKACJA ALOFONÓW Z WYKORZYSTANIEM BIMODALNOŚCI
PublicationPraca dotyczy rejestracji i parametryzacji alofonów w języku angielskim z wykorzystaniem dwóch modalności. W badaniach dokonano rejestracji wypowiedzi w języku angielskim mówców, których znajomość tego języka odpowiada poziomowi rodowitego mówcy. W kolejnym etapie wyodrębnione zostały alofony z nagrań fonicznych i odpowiadające im sygnały wizyjne. W procesie tworzenia wektorów cech wykorzystano odrębne systemy parametryzacji,...
-
Selection of Features for Multimodal Vocalic Segments Classification
PublicationEnglish speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...