Publikacje
Filtry
wszystkich: 891
Katalog Publikacji
Rok 2019
-
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
PublikacjaAutomatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and selforganizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
PublikacjaA method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Metoda i system adaptacyjnego sterowania parametrami algorytmu syntezy niskich częstotliwości dźwięków muzycznych
PublikacjaW ostatnich latach można zaobserwować bardzo wyraźny i systematyczny wzrost wykorzystywania urządzeń mobilnych jako środka do odtwarzania muzyki, czy odtwarzania filmów w dowolnych warunkach akustycznych. Ich użytkownicy oczekują przy tym jak najlepszych walorów brzmieniowych dźwięku. W niniejszej rozprawie zostały zaproponowane metody, mające na celu poprawę brzmienia urządzeń mobilnych w zakresie niskich częstotliwości i korekcji...
-
MULTIMODALNE POMIARY DRGAŃ STRUNY
PublikacjaW artykule zostały przedstawione badania drgań struny zrealizowane przy użyciu szybkich kamer wizyjnych, mikrofonu oraz akcelerometru. Obiektem badań były instrumenty muzyczne. Opisano zjawiska zachodzące w instrumencie podczas tworzenia się i wydobywania z niego dźwięku. Celem pracy było zbadanie różnic w wynikach otrzymanych poprzez pomiary wykonane z użyciem zróżnicowanych reprezentacji obrazowych i sygnałowych. Zaproponowano...
-
Music information retrieval—The impact of technology, crowdsourcing, big data, and the cloud in art.
PublikacjaThe exponential growth of computer processing power, cloud data storage, and crowdsourcing model of gathering data bring new possibilities to music information retrieval (mir) field. Mir is no longer music content retrieval only; the area also comprises the discovery of expressing feelings and emotions contained in music, incorporating other than hearing modalities for helping this issue, users’ profiling, merging music with social...
-
Music signal equalization in a changing environment
PublikacjaThe paper presents the concept of an automatic system for music signal correction, considering room frequency response and music genre being played. The proposed algorithm, based on the room frequency response, compensates acoustic conditions surrounding the sound source. Additionally, the compensation process considers the signal content by recognizing music genre. As part of the described research, a series of subjective tests...
-
New applications of sound and vision engineering
PublikacjaMultimedia, Sound & Vision Engineering are relatively new fields within the area of science and technology, but teaching and research in this area has been carried out at Gdansk University of Technology (Gdansk, Poland) for nearly 5 decades. Current project carried-out in the Multimedia Systems Department are in the scope of the paper.
-
Porównanie klasycznych i ewolucyjnych metod projektowanie dyfuzorów akustycznych w warunkach odsłuchowych symulowanych metodą FDTD
PublikacjaW niniejszym rozdzaiel przedstawione zostanie porównanie dwóch podejść do projektowania dyfuzorów akustycznych. Pierwsze z nich bazuje na klasycznych założeniach dotyczących wykorzystywania sekwewncji pseudolosowych. Drugie z nich wykorzystuje automatyczne podejście bazujące na wykorzystaniu algorytmów genetycznych. Porównanie takie pozwala na określenie zalet i wad podejśc klasycznych oraz podejśc bazujących na zastosowaniu algorytmu...
-
Post-comatose patients with minimal consciousness tend to preserve reading comprehension skills but neglect syntax and spelling
PublikacjaModern eye tracking technology provides a means for communication with patients suffering from disorders of consciousness (DoC) or remaining in locked-in-state. However, being able to use an eye tracker for controlling text-based contents by such patients requires preserved reading ability in the first place. To our knowledge, this aspect, although of great social importance, so far has seemed to be neglected. In the paper, we...
-
Projektowanie oraz implementacja cyfrowego multiefektu gitarowego z wykorzystaniem procesora sygnałowego
PublikacjaW artykule został przedstawiony proces projektowania i realizacji cyfrowego multiefektu gitarowego z wykorzystaniem procesora sygnałowegoTMS320C5535 firmy Texas Instruments, dla którego oprogramowanie napisano w języku C. Omówiono zasady działania oraz algorytmy wybranych efektów dźwiękowych, które zostały zaimplementowane w procesorze sygnałowym. Zaprojektowano również uniwersalny moduł wejściowy zawierający wzmacniacz z regulowanym...
-
Real and Virtual Instruments in Machine Learning – Training and Comparison of Classification Results
PublikacjaThe continuous growth of the computing power of processors, as well as the fact that computational clusters can be created from combined machines, allows for increasing the complexity of algorithms that can be trained. The process, however, requires expanding the basis of the training sets. One of the main obstacles in music classification is the lack of high-quality, real-life recording database for every instrument with a variety...
-
Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification
PublikacjaThe recordings were made with a fast video camera and with a microphone. Using fast cameras allowed for observation of the micro vibrations of the object structure. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. An idea was to use video to recover sound & vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals...
-
Relationship between album cover design and music genres.
PublikacjaThe aim of the study is to find out whether there exists a relationship between typographic, compositional and coloristic elements of the music album cover design and music contained in the album. The research study involves basic statistical analysis of the manually extracted data coming from the worldwide album covers. The samples represent 34 different music genres, coming from nine countries from around the world. There are...
-
Sound engineering as our commitment to its creators in Poland
PublikacjaSound engineering is an interdisciplinary and rapidly expanding domain. It covers many aspects, such as sound perception, studio and sound mastering technology, music information retrieval including content-based search systems and automatic music transcription frameworks, sound synthesis, sound restoration, electroacoustics, and other ones constituting multimedia technology. Moreover, machine learning methods applied to the topics...
-
Special techniques and future perspectives: Simultaneous macro- and micro-electrode recordings
PublikacjaThere are many approaches to studying the inner workings of the brain and its highly interconnected circuits. One can look at the global activity in different brain structures using non-invasive technologies like positron emission tomography (PET) or functional magnetic resonance imaging (fMRI), which measure physiological changes, e.g. in the glucose uptake or blood flow. These can be very effectively used to localize active patches...
-
Speech Analytics Based on Machine Learning
PublikacjaIn this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...
-
Style Transfer for Detecting Vehicles with Thermal Camera
PublikacjaIn this work we focus on nighttime vehicle detection for intelligent traffic monitoring from the thermal camera. To train a Convolutional Neural Network (CNN) detector we create a stylized version of COCO (Common Objects in Context) dataset using Style Transfer technique that imitates images obtained from thermal cameras. This new dataset is further used for fine-tuning of the model and as a result detection accuracy on images...
-
Subjective tests for gathering knowledge for applying color grading to video clips automatically
PublikacjaThe analysis of film music concerning caused emotions may allow for a more accurate adaptation of the color of the film in the context of color grading. Therefore, this paper aims to gather knowledge on the correlation between the applied color palette to a video clip, music associated with a particular shot, and emotions evoked. For that purpose, subjective tests are prepared in which several video clips are presented with or...
-
Subjective tests for gathering konwledge for applaying color grading to video clips automatically
PublikacjaThe analysis of film music concerning caused emotions may allow for a more accurate adaptation of the color of the film in the context of color grading. Therefore, this paper aims to gather knowledge on the correlation between the applied color palette to a video clip, music associated with a particular shot,and emotions evoked. For that purpose, subjective tests are prepared in which several video clips are presented with...
-
Unsupervised machine-learning classification of electrophysiologically active electrodes during human cognitive task performance
PublikacjaIdentification of active electrodes that record task-relevant neurophysiological activity is needed for clinical and industrial applications as well as for investigating brain functions. We developed an unsupervised, fully automated approach to classify active electrodes showing event-related intracranial EEG (iEEG) responses from 115 patients performing a free recall verbal memory task. Our approach employed new interpretable...
-
Validating data acquired with experimental multimodal biometric system installed in bank branches
PublikacjaAn experimental system was engineered and implemented in 100 copies inside a real banking environment comprising: dynamic handwritten signature verification, face recognition, bank client voice recognition and hand vein distribution verification. The main purpose of the presented research was to analyze questionnaire responses reflecting user opinions on: comfort, ergonomics, intuitiveness and other aspects of the biometric enrollment...
-
Variable length sliding models for banking clients face biometry
PublikacjaAn experiment was organized in 100 bank branches to acquire biometric samples from nearly 5000 clients including face images. A procedure for creating face verification models based on continuously expanding database of biometric samples is proposed, implemented, and tested. The presented model applies to circumstances where it is possible to collect and to take into account new biometric samples after each positive verification...
-
Vehicle detector training with minimal supervision
PublikacjaRecently many efficient object detectors based on convolutional neural networks (CNN) have been developed and they achieved impressive performance on many computer vision tasks. However, in order to achieve practical results, CNNs require really large annotated datasets for training. While many such databases are available, many of them can only be used for research purposes. Also some problems exist where such datasets are not...
-
Weryfikacja autentyczności kolorów na zdjęciach wykonanych w technice analogowej
PublikacjaW artykule opisano zagadnienie odróżniania historycznych fotografii pomiędzy oryginalnie kolorowe a koloryzowane. Rozważono problem doboru zdjęć pod względem technologii, w jakiej zostały wykonane. Następnie wykorzystując sieci neuronowe już w części wyuczone na innych zbiorach danych, sprawdzono ich efektywność w rozwiązywaniu badanego problemu. Rozważono wpływ rozmiaru obrazu podanego na wejściu, architektury zastosowanej sieci,...
-
Wind Turbines Modeling as the Tool for Developing Algorithms of Processing their Video Recordings
PublikacjaIn the real world, many factors exist disturbing observation of the examined phenomena and causing various noises and distortions in recorded signals. It very often makes it difficult or even impossible to optimize various signal processing algorithms, through finding appropriate parameters. In this paper, we show an application, that retrieves wind turbine rotor speed from recorded video. Next, we describe the process of reduction...
-
Wind Turbines Modeling as the Tool for Developing Algorithms of Processing their Video Recordings
PublikacjaIn the real world, many factors exist disturbing observation of the examined phenomena and causing various noises and distortions in recorded signals. It very often makes it difficult or even impossible to optimize various signal processing algorithms, through finding appropriate parameters. In this paper, we show an application, that retrieves wind turbine rotor speed from recorded video. Next, we describe the process of reduction...
-
Wpływ kolorystyki ujęć oraz ścieżki dźwiękowej na emocje widza - wstępne eksperymenty
PublikacjaBrak
-
Wykorzystanie sieci neuronowych do syntezy mowy wyrażającej emocje
PublikacjaW niniejszym artykule przedstawiono analizę rozwiązań do rozpoznawania emocji opratych na mowie i możliwości ich wykprzystania w syntezie mowy z emocjami stosując do tego celu sieci neuronowe. Wskazano również przydatnośc parametrów typowo stosowanych do rozpoznawania mowy w detekcji emocji w śpiewie i rozróżnianiu tych emocji w obu przypadkach. Przedstawiono aktualne rozwiązania dotyczące rozpoznawania emocji w mowie i metod syntezy...
Rok 2018
-
A comparative study of English viseme recognition methods and algorithm
PublikacjaAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...
-
A comparative study of English viseme recognition methods and algorithms
PublikacjaAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
PublikacjaThis paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
A Device for Measuring Auditory Brainstem Responses to Audio
PublikacjaStandard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...
-
A fast time-frequency multi-window analysis using a tuning directional kernel
PublikacjaIn this paper, a novel approach for time-frequency analysis and detection, based on the chirplet transform and dedicated to non-stationary as well as multi-component signals, is presented. Its main purpose is the estimation of spectral energy, instantaneous frequency (IF), spectral delay (SD), and chirp rate (CR) with a high time-frequency resolution (separation ability) achieved by adaptive fitting of the transform kernel. We...
-
A Stand for Measurement and Prediction of Scattering Properties of Diffusers
PublikacjaIn this paper we present a set of solutions which may be used for prototyping and simulation of acoustic scattering devices. A system proposed is capable of measuring sound field. Also a way to use an open source solution for simulation of scattering phenomena occurring in proximity of acoustic diffusers is shown. The result of our work are measurement procedure and a prototype of the simulation script based on FEniCS - an open source...
-
A Study on Audio Signal Processed by "Instant Mastering"
PublikacjaAn increasing amount of music produced in home- and project-studios results in development and growth of "automatic mastering services". The presented investigation explores changes introduced to audio signal by various online mastering platforms. A music set consisting of 10 songs produced in small facilities was processed by eight on-line automatic mastering services. Additionally, some laboratory-constructed signals were tested....
-
A study on of music features derived from audio recordings examples – a quantitative analysis
PublikacjaThe paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician, whose style evolved over time. Firstly, the origin and the background of the division of music genres were shortly presented. Then, several objective parameters of an audio signal were recalled that...
-
Adaptacja akustyczna pomieszczenia wykładowego - studium przypadku
PublikacjaW niniejszej pracy przedstawiono analizę rozkładu pola akustycznego sali wykładowej znajdującej się w budynku Wydziału Elektroniki i Telekomunikacji Politechniki Gdańskiej. Badania przeprowadzono metodą pomiarową oraz symulacyjną z wykorzystaniem programu Odeon. Wybór parametrów oceny akustyki wnętrz sugerowany jest wymaganiami stawianymi pomieszczeniom lekcyjnym z zaznaczeniem multimedialnego charakteru wykładów prowadzonych...
-
Akustyczna analiza parametrów ruchu drogowego z wykorzystaniem informacji o hałasie oraz uczenia maszynowego
PublikacjaCelem rozprawy było opracowanie akustycznej metody analizy parametrów ruchu drogowego. Zasada działania akustycznej analizy ruchu drogowego zapewnia pasywną metodę monitorowania natężenia ruchu. W pracy przedstawiono wybrane metody uczenia maszynowego w kontekście analizy dźwięku (ang.Machine Hearing). Przedstawiono metodologię klasyfikacji zdarzeń w ruchu drogowym z wykorzystaniem uczenia maszynowego. Przybliżono podstawowe...
-
An application of acoustic sensors for the monitoring of road traffic
PublikacjaAssessment of road traffic parameters for the developed intelligent speed limit setting decision system constitutes the subject addressed in the paper. Current traffic conditions providing vital data source for the calculation of the locally fitted speed limits are assessed employing an economical embedded platform placed at the roadside. The use of the developed platform employing a low-powered processing unit with a set of microphones,...
-
Analiza Nagrań Ruchu Drogowego w Kontekście Akustycznej Klasyfikacji Typu Pojazdu
PublikacjaCelem niniejszej pracy jest przeprowadzenie analizy sygnału fonicznego w kontekście klasyfikacji typu pojazdu. Część teoretyczna zawiera krytyczny przegląd systemów monitorowania ruchu drogowego, w szczególności systemów ITS (Intelginet Transport System). Część praktyczna przedstawia założenia dotyczące przygotowania bazy nagrań testowych, uwzględniających różne scenariusze ruchu drogowego. Zarejestrowane sesje nagraniowe przetworzono,...
-
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publikacjaconvolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
PublikacjaThe aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
Analysis of results of large-scale multimodal biometric identity verification experiment
PublikacjaAn analysis of a large set of biometric data obtained during the enrolment and the verification phase in an experimental biometric system installed in bank branches is presented. Subjective opinions of bank clients and of bank tellers were also surveyed concerning the studied biometric methods in order to discover and to explore relations emerging from the obtained multimodal dataset. First, data acquisition and identity verification...
-
Aparat słuchowy a alternatywne urządzenia poprawiające słyszenie
PublikacjaW opracowaniu dokonano przeglądu dostępnych prac dotyczących różnych rodzajów urządzeń poprawiających słyszenie, które w szczególnych przypadkach mogą być traktowane jako rozwiązania alternatywne w stosunku do klasycznych aparatów słuchowych. Praca zawiera dyskusję na temat nowego rodzaju aparatu słuchowego wstępnie zaprogramowanego, który może być dystrybuowany korespondencyjnie lub bezpośrednio potencjalnym użytkownikom. Ponadto...
-
Assessment of Therapeutic Progress After Acquired Brain Injury Employing Electroencephalography and Autoencoder Neural Networks
PublikacjaA method developed for parametrization of EEG signals gathered from participants with acquired brain injuries is shown. Signals were recorded during therapeutic session consisting of a series of computer assisted exercises. Data acquisition was performed in a neurorehabilitation center located in Poland. The presented method may be used for comparing the performance of subjects with acquired brain injuries (ABI) who are involved...
-
AUDIO SIGNAL EQUALIZATION BASED ON IMPULSE RESPONSE OF A LISTENING ROOM AND MUSIC CONTENT REPRODUCED
PublikacjaA research study presents investigations of the influence of the room acoustics on the frequency characteristic of the audio signal playback. First, a concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on an equalizer designed is proposed. The system settings depend on music genre recognized automatically....
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
PublikacjaIn this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, audio signal is analyzed indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video , i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
PublikacjaThe aim of this paper is to present a system for automatic assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on skull. Data gathered are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Automatic music genre classification based on musical instrument track separation / Automatyczna klasyfikacja gatunku muzycznego wykorzystująca algorytm separacji dźwięku instrumentó muzycznych
PublikacjaThe aim of this article is to investigate whether separating music tracks at the pre-processing phase and extending feature vector by parameters related to the specific musical instruments that are characteristic for the given musical genre allow for efficient automatic musical genre classification in case of database containing thousands of music excerpts and a dozen of genres. Results of extensive experiments show that the approach...
-
Badanie stanu nawierzchni drogowej z wykorzystaniem uczenia maszynowego
PublikacjaW artykule opisano budowę systemu informowania o stanie nawierzchni drogowej z wykorzystaniem metod cyfrowego przetwarzania obrazów oraz uczenia maszynowego. Efektem wykonanych prac badawczych jest eksperymentalna platforma, pozwalająca na rejestrację uszkodzeń na drogach, system do analizy, przetwarzania i klasyfikacji danych oraz webowa aplikacja użytkownika do przeglądu stanu nawierzchni w wybranej lokalizacji.